Population Genetics and Microevolutionary Theory [2 ed.] 1118504232, 9781118504239


English Pages 768 [761] Year 2021



Population Genetics and Microevolutionary Theory Second Edition

Alan R. Templeton Department of Biology Washington University St. Louis, MO, USA

This second edition first published 2021 © 2021 John Wiley & Sons, Inc. Edition History 2006 John Wiley & Sons, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions. The right of Alan R. Templeton to be identified as the author of this work has been asserted in accordance with law. Registered Office John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA Editorial Office 9600 Garsington Road, Oxford, OX4 2DQ, UK For details of our global editorial offices, customer services, and more information about Wiley products, visit us at www.wiley.com. Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats. Limit of Liability/Disclaimer of Warranty The contents of this work are intended to further general scientific research, understanding, and discussion only and are not intended and should not be relied upon as recommending or promoting scientific method, diagnosis, or treatment by physicians for any particular patient. In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of medicines, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each medicine, equipment, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. 
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials, or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. Library of Congress Cataloging-in-Publication Data Names: Templeton, Alan Robert, author. Title: Population genetics and microevolutionary theory / Alan R. Templeton. Description: Second edition. | Hoboken, NJ : Wiley-Blackwell, 2021. | Includes bibliographical references and index. Identifiers: LCCN 2021006543 (print) | LCCN 2021006544 (ebook) | ISBN 9781118504239 (hardback) | ISBN 9781118504369 (adobe pdf) | ISBN 9781118504345 (epub) Subjects: LCSH: Population genetics. 
| Evolution (Biology) Classification: LCC QH455 .T46 2021 (print) | LCC QH455 (ebook) | DDC 576.5/8–dc23 LC record available at https://lccn.loc.gov/2021006543 LC ebook record available at https://lccn.loc.gov/2021006544 Cover Design: Wiley Cover Image: Courtesy of Alan R. Templeton Set in 9.5/12.5pt STIXTwoText by SPi Global, Pondicherry, India 10 9 8 7 6 5 4 3 2 1

To Bonnie and to the Memory of Leon Blaustein

Contents

Preface
About the Companion Website

1 The Scope and Basic Premises of Population Genetics
    Basic Premises of Population Genetics
    DNA Can Replicate
    DNA Can Mutate and Recombine
    Phenotypes Emerge from the Interaction of DNA and Environment
    Population Genomics

Part 1 The Scope and Basic Premises of Population Genetics

2 Modeling Evolution and the Hardy–Weinberg Law
    How to Model Microevolution
    The Hardy–Weinberg Model
    An Example of the Hardy–Weinberg Law
    Importance of the Hardy–Weinberg Law
    Hardy–Weinberg for Two Loci
    Sources of Linkage Disequilibrium
    Some Implications of the Impact of Evolutionary History upon Disequilibrium

3 Systems of Mating
    Inbreeding
    Definitions of Inbreeding
    Assortative Mating
    A Simple Model of Assortative Mating
    The Creation of Linkage Disequilibrium by Assortative Mating
    Assortative Mating Versus Inbreeding
    Assortative Mating, Linkage Disequilibrium, and Admixture
    Disassortative Mating

4 Genetic Drift
    Basic Evolutionary Properties of Genetic Drift
    Founder and Bottleneck Effects
    Genetic Drift and Disequilibrium
    Genetic Drift, Disequilibrium, and System of Mating
    Effective Population Size
    Inbreeding Effective Size
    Variance Effective Size
    Some Contrasts Between Inbreeding and Variance Effective Sizes
    Estimating Effective Population Sizes

5 Genetic Drift in Large Populations and Coalescence
    Newly Arisen Mutations
    Neutral Alleles
    The Neutral Theory and Its Origins
    Critiques of the Neutral Theory
    The Coalescent
    The Basic Coalescent Process
    Coalescence with Mutation
    Genealogies, Gene Trees, and Haplotype Trees
    Coalescence and Species Trees
    Recombination and Coalescence

6 Gene Flow and Population Subdivision
    Gene Flow Between Two Local Populations
    The Balance of Gene Flow and Drift
    An Example of the Balance of Drift and Gene Flow
    Factors Influencing the Amount and Pattern of Gene Flow
    Dispersal
    Isolation by Distance and Resistance
    Total Effective Population Size in Subdivided Populations
    Multiple Modes of Inheritance and Population Structure
    Admixture
    Identifying Subpopulations and Population Structure
    A Final Warning

7 Population History
    Inferring Historical Effective Population Sizes
    Inferring Historical Gene Flow Patterns and Admixture Events
    Using Haplotype Trees to Study Population History
    Expected Patterns Under Isolation by Distance
    Expected Patterns Under Fragmentation
    Expected Patterns Under Range Expansion
    Multiple Patterns in Nested-Clade Analysis
    Integrating Haplotype Tree Inferences Across Loci or DNA Regions
    Model-Based Approaches to Phylogeographic Analysis
    Direct Studies over Space and Past Times
    Historical Population Genetics and Macroevolution

Part 2 Genotype and Phenotype

8 Basic Quantitative Genetic Definitions and Theory
    “Simple” Mendelian Phenotypes
    The Phenotype of Electrophoretic Mobility
    The Phenotype of Sickling
    The Phenotype of Sickle Cell Anemia
    The Phenotype of Malarial Resistance
    The Phenotype of Health (Viability)
    Nature Versus Nurture?
    The Fisherian Model of Quantitative Genetics
    Quantitative Genetic Measures Related to the Mean
    Quantitative Genetic Measures Related to the Variance

9 Quantitative Genetics: Unmeasured Genotypes
    Correlation Between Relatives
    The Distinction Between Heritability and Inheritance
    Response to Selection
    The Problem of Between-Population Differences in Mean Phenotype
    Controlled Crosses for the Analyses of Between Population Differences
    The Balance Between Mutation, Drift, and Gene Flow Upon Phenotypic Variance

10 Quantitative Genetics: Measured Genotypes
    Marker Association Studies
    Admixture Mapping
    Markers of Linkage
    Genome-Wide Association Studies
    Candidate Loci
    Candidate Loci and Genetic Architecture
    Pleiotropy
    Epistasis
    Pleiotropy and Epistasis
    Gene by Environment Interactions
    Genetic Architecture: Is the Whole Greater than the Sum of the Parts?

Part 3 Natural Selection and Adaptation

11 Natural Selection
    The Fundamental Equation of Natural Selection: Measured Genotypes
    Sickle-cell Anemia as an Example of Natural Selection
    Adaptation as a Polygenic Process
    The Fundamental Theorem of Natural Selection: Unmeasured Genotypes
    Some Implications of the Fundamental Equations of Natural Selection
    The Course of Adaptation and Natural Selection

12 Interactions of Natural Selection with Other Evolutionary Forces and the Detection of Natural Selection
    The Interaction of Natural Selection with Mutation
    The Interaction of Natural Selection with Mutation and System of Mating
    The Interaction of Natural Selection with Gene Flow
    The Interaction of Natural Selection with Genetic Drift
    The Interactions of Natural Selection, Genetic Drift, and Gene Flow
    Population Subdivision
    Genetic Architecture
    The Interactions of Natural Selection, Genetic Drift, and Mutation
    The Interactions of Natural Selection, Genetic Drift, Mutation, and Recombination
    Candidate Loci
    Quantitative Genetic Approaches to Detecting Selection
    The Neutralist/Selectionist Debate

13 Units and Targets of Selection
    The Unit of Selection
    Targets of Selection Below the Level of the Individual
    The Genome
    Gametes
    Somatic Cells
    Overview of Selection Below the Level of an Individual
    Targets of Selection Above the Level of the Individual
    Sexual Selection
    Fertility/Fecundity
    Competition and Cooperation
    Kin/Family Selection

14 Selection in Heterogeneous Environments
    Coarse-grained Spatial Heterogeneity
    Coarse-grained Temporal Heterogeneity
    Seasonal and Cyclical Variation
    Random or Frequent Temporal Variation
    Sporadic, Recurrent Environments
    Transitions to a New Long-term Environment
    Fine-grained Heterogeneity
    Coevolution

15 Selection in Age-Structured Populations
    Life History and Fitness
    The Evolution of Senescence
    Abnormal Abdomen: An Example of Selection in an Age-Structured Population
    Genetic Architecture and Units and Targets of Selection Below the Level of the Individual
    Genetic Architecture and Units of Selection at the Level of the Individual
    Phenotypes and Potential Targets of Selection at the Level of the Individual
    Natural Selection on the aa Supergene in a Spatially and Temporally Heterogeneous Environment
    Natural Selection on the Components of the aa Supergene
    Overview
    Comparative Analysis
    Reductionism
    Holism
    Monitoring Populations

Appendix A: Genetic Survey Techniques
Appendix B: Probability and Statistics
References
Index

Preface

Much has changed in population genetics since the first edition came out in 2006, and much has stayed the same. The scope, questions, and methods of population genetics have always been constrained by the techniques available for surveying genetic variation, and the DNA/genomics revolution continues to alter these constraints. Yet, the basic premises of population genetic theory and practice have remained the same. These core premises allow this second edition to grow smoothly out of the first edition, so the basic structure of this edition is identical to that of the first. However, the revolution in DNA technologies and genomics leads to a significant expansion of topics and scope. The concepts and methods from population genetics are now widely used in almost all fields of biology, from molecular biology to ecology, and in applied fields from genetic epidemiology to conservation biology.

This expanded applicability of population genetics is illustrated well by the final research project of my long-term collaborator and friend, Leon Blaustein, who sadly died all too young in 2020. By training and career, Leon was a freshwater population/community ecologist with a great concern for conservation applications. His final project focused on plasticity in aquatic resource use in the endangered species Salamandra infraimmaculata at its southernmost boundary in northern Israel. Most salamanders specialize in the type of aquatic habitat they use for their larval phase, but S. infraimmaculata can use permanent ponds and springs, permanent and seasonal streams, as well as rock pools that fill with rainwater for only a few months of the year. Leon assembled a diverse team of graduate students, post-docs, and collaborators to address this system from multiple directions.

Leon first sought to define the evolutionary and ecological context in which this plasticity is manifest. Leon and his group performed surveys of molecular genetic diversity, maximum entropy modeling of habitat data to determine optimal and sub-optimal areas, and phylogeographic analyses to uncover historical effects. His group performed mark/recapture studies to estimate adult population sizes in a diversity of habitats. These studies revealed that this southernmost boundary was highly heterogeneous, varying from areas of optimal habitat with high levels of genetic diversity and gene flow, to areas of marginal habitat that were strongly subdivided genetically, and to an area of optimal habitat that was an historical isolate with a severe reduction in genetic diversity due to a past founder event. Field and laboratory experiments on larval developmental and morphological plasticity as well as on their transcriptomes revealed that plasticity at both the organismal and transcriptome levels displayed both genetically based differences among these populations and direct individual responsiveness to environmental variation. Coupling climate projection models with the maximum entropy models revealed diverse conservation challenges in these different areas of the southernmost boundary. Hence, both the evolvability of plasticity and current individual environmental plasticity could play an important role in the survival of this endangered species.

Although Leon was not a population geneticist, his final project illustrates well the importance of using population genetics and genomics in other fields of biology. The increasing scope and relevance of population genetics to many fields of basic and applied biology also means an expanded audience that needs to be knowledgeable of population genetic theory and practice. This edition is written with this expanded audience in mind. Many examples are given from conservation biology, human genetics, and genetic epidemiology, yet the focus of this book remains on the basic microevolutionary mechanisms and how they interact to create evolutionary change. This book is intended to provide a solid understanding of the core concepts in population genetics both for those students primarily interested in evolutionary biology and genetics and for those students primarily interested in applying the tools of population genetics in other areas such as conservation biology, genetic epidemiology, and genomics. Without a solid foundation in population genetics, the analytical tools emerging from population genetics will frequently be misapplied and incorrect interpretations can be made. This book is designed to provide that foundation both for future population and evolutionary geneticists and for those who will be applying population genetic concepts and techniques to other areas.

I thank David Queller for reading over a draft of this edition. His many suggestions and corrections were highly valuable, and I greatly appreciate his efforts. David also used some of the drafts of this edition in the 2020 class on Population Genetics at Washington University. I thank the students of that class for their suggestions and corrections. Finally, I thank all the past students of my course on population genetics and my former graduate students and post-docs. They were the inspiration for this book in the first place, and they contributed valuable input to the first edition, much of which has carried over to the second edition.

About the Companion Website

This book is accompanied by a companion website: www.wiley.com/go/templeton/populationgenetics2

This website includes:

• PowerPoint slides of Figures from the book
• Problem and Answer sets

1 The Scope and Basic Premises of Population Genetics

Population genetics is concerned with the origin, amount, and distribution of genetic variation present in populations of organisms and the fate of this variation through space and time. The kinds of populations that will be the primary focus of this book are populations of sexually reproducing diploid organisms, and the fate of genetic variation in such populations will be examined at or below the species level. Variation in genes through space and time constitutes the fundamental basis of evolutionary change; indeed, in its most basic sense, evolution is the genetic transformation of reproducing populations over space and time. Population genetics is, therefore, at the very heart of evolutionary biology and can be thought of as the science of the mechanisms responsible for microevolution, evolution within species. Many of these mechanisms have a great impact on the origin of new species and on evolution above the species level (macroevolution). A few of the impacts of population genetics upon species and speciation will be discussed, but this is not the main focus of this book.

Basic Premises of Population Genetics

Microevolutionary mechanisms work upon genetic variability, so it is not surprising that the fundamental premises that underlie population genetic theory and practice all deal with various properties of DNA, the molecule that encodes genetic information in most organisms. (A few organisms use RNA as their genetic material, and the same properties apply to RNA in those cases.) Indeed, the theory of microevolutionary change stems from just three premises:

• DNA can replicate
• DNA can mutate and recombine
• Phenotypes emerge from the interaction of DNA and environment

The implications of each of these premises will now be examined.

DNA Can Replicate

Because DNA can replicate, a particular kind of gene (specific set of nucleotides) can be passed on from one generation to the next and can also come to exist as multiple copies in different individuals. Genes, therefore, have an existence in time and space that transcends the individuals that temporarily bear them. The biological existence of genes over space and time is the physical basis of evolution.

Population Genetics and Microevolutionary Theory, Second Edition. Alan R. Templeton. © 2021 John Wiley & Sons, Inc. Published 2021 by John Wiley & Sons, Inc. Companion website: www.wiley.com/go/templeton/populationgenetics2


The physical manifestation of a gene’s continuity over time and through space is a reproducing population of individuals. Individuals have no continuity over space or time; individuals are unique events that live and then die and cannot evolve. But the genes that an individual bears are potentially immortal through DNA replication. For this potential to be realized, the individuals must reproduce. Therefore, to observe evolution, it is essential to study a population of reproducing individuals. A reproducing population does have continuity over time as one generation of individuals is replaced by the next. A reproducing population generally consists of many individuals, and these individuals collectively have a distribution over space. Hence, a reproducing population has continuity over time and space and constitutes the physical reality of a gene’s continuity over time and space. Evolution is therefore possible only at the level of a reproducing population and not at the level of the individuals contained within the population.

The focus of population genetics must be upon reproducing populations to study microevolution. However, what is meant by a population is not fixed but can vary depending upon the questions being addressed. The population could be a local breeding group of individuals found in close geographic proximity, or it could be a collection of local breeding groups distributed over a landscape such that most individuals only have contact with other members of their local group but that, on occasion, there is some reproductive interchange among local groups. Alternatively, a population could be a group of individuals continuously distributed over a broad geographical area such that individuals at the extremes of the range are unlikely to ever come into contact. Sometimes, the population includes the entire species.

Within this hierarchy of populations found within species, much of population genetics focuses upon the local population or deme, a collection of interbreeding individuals of the same species that live in sufficient proximity that they share a system of mating. Systems of mating will be discussed in more detail in subsequent chapters, but, for now, the system of mating refers to the rules by which individuals pair for sexual reproduction. The individuals within a deme share a common system of mating. Because a deme is a breeding population, individuals are continually turning over as births and deaths occur, but the local population is a dynamic entity that can persist through time far longer than the individuals that temporarily comprise it. The local population, therefore, has the attributes that allow the physical manifestation of the genetic continuity over space and time that follows from the premise that DNA can replicate.

Because our primary interest is in genetic continuity, we will make a useful abstraction from the deme. Associated with every population of individuals is a corresponding population of genes called the gene pool, the set of genes collectively shared by the individuals of the population. An alternative, and often more useful, way of defining the gene pool is that the gene pool is the population of potential gametes produced by all the individuals of the population. Gametes are the bridges between the generations, so defining a gene pool as a population of potential gametes emphasizes the genetic continuity over time that provides the physical basis for evolution. For empirical studies, the first definition is primarily used; for theory, the second definition is preferred.

The gene pool associated with a population is described by measuring the numbers and frequencies of the various types of genes or gene combinations in the pool. Evolution is defined as a change through time of the frequencies of various types of genes or gene combinations in the gene pool. This definition is not intended to be an all-encompassing definition of evolution. Rather, it is a narrow and focused definition of evolution that is useful in much of population genetics precisely because of its narrowness. This will therefore be our primary definition of evolution in this book.
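The definition of evolution as a change in gene-pool frequencies can be made concrete with a small numerical sketch. The Python fragment below is only an illustration under stated assumptions: the function name `allele_frequencies`, the allele labels `A` and `a`, and the genotype counts for the two generations are all hypothetical and do not come from the text.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return allele frequencies in the gene pool of a diploid sample.

    `genotypes` is a list of 2-tuples of allele labels, one tuple per
    individual, so each individual contributes two gene copies.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

# Hypothetical samples of 100 individuals from one deme in two generations.
generation_0 = [("A", "A")] * 30 + [("A", "a")] * 50 + [("a", "a")] * 20
generation_1 = [("A", "A")] * 40 + [("A", "a")] * 40 + [("a", "a")] * 20

p0 = allele_frequencies(generation_0)["A"]  # (2*30 + 50) / 200
p1 = allele_frequencies(generation_1)["A"]  # (2*40 + 40) / 200

# Under the narrow definition used here, a nonzero change in allele
# frequency between generations counts as evolution of the gene pool.
delta_p = p1 - p0
print(f"p0 = {p0:.2f}, p1 = {p1:.2f}, change = {delta_p:+.2f}")
```

Note that the quantity being tracked belongs to the sample as a whole; no single tuple in either list has a frequency, which mirrors the point that populations, not individuals, evolve.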


Since only a local population at the minimum can have a gene pool, only populations can evolve under this definition of evolution, not individuals. Therefore, evolution is an emergent property of reproducing populations of individuals that is not manifested in the individuals themselves. However, there can be higher order assemblages of local populations that can evolve. In many cases, we will consider collections of several local populations that are interconnected by dispersal and reproduction, up to and including the entire species. However, an entire species in some cases could be just a single deme or it could be a collection of many demes with limited reproductive interchange. A species is therefore not a convenient unit of study in population genetics because species status itself does not define the reproductive status that is so critical in population genetic theory. We will always need to specify the type and level of reproducing population that is relevant for the questions being addressed.

DNA Can Mutate and Recombine

Evolution requires change, and change can only occur when alternatives exist. If DNA replication were always 100% accurate, there would be no evolution. A necessary prerequisite for evolution is genetic diversity. The ultimate source of this genetic diversity is mutation. There are many forms of mutation, such as single nucleotide substitutions, insertions, deletions, transpositions, and duplications. For now, our only concern is that these mutational processes create diversity in the population of genes present in a gene pool. Because of mutation, alternative copies of the same homologous region of DNA in a gene pool will show different states.

Mutation occurs at the molecular level. Although many environmental agents can influence the rate and type of mutation, one of the central tenets of Darwinian evolution is that mutations are random with respect to the needs of the organism in coping with its environment. There have been many experiments addressing this tenet, but one of the more elegant and convincing is replica plating, first used by Joshua and Esther Lederberg (1952) (Figure 1.1). Replica plating and other experiments provide empirical proof that mutation, occurring on DNA at the molecular level, is not being directed to produce a particular phenotypic consequence at the level of an individual interacting with its environment. Therefore, we will regard mutations as being random with respect to the organism’s needs in coping with its environment (although, as we will see soon, mutation is highly nonrandom in other respects at the molecular level).

[Figure 1.1 diagram. Panel labels: Plate 1, bacteria grown in absence of streptomycin; bacterial plate pressed on velvet; bacterial colonies imprinted on velvet; sterile plate pressed on imprinted velvet; Plate 2, sterile plate with streptomycin, on which only one colony grows; each bacterial colony on Plate 1 is isolated and tested for growth on a plate with streptomycin, and only one colony grows.]

Figure 1.1 Replica plating. A suspension of bacterial cells is spread upon a Petri dish (plate 1) such that each individual bacterium should be well separated from all others. Each bacterium then grows into a colony of genetically identical individuals. Next, a circular block covered with velvet is pressed onto the surface of plate 1. Some bacteria from each colony stick to the velvet, so a duplicate of the original plate is made when the velvet is pressed onto the surface of a second Petri dish (plate 2), called the replica plate. The medium on the replica plate contains streptomycin, an antibiotic that kills most bacteria from the original strain. In the example illustrated, only one bacterial colony on the replica plate can grow on streptomycin, and its position on plate 2 identifies it as the descendant of a particular colony on plate 1. Each bacterial colony on plate 1 is then tested for growth on a plate with the antibiotic streptomycin. If mutations were random and streptomycin simply selected preexisting mutations rather than inducing them, then the colonies on plate 1 that occupied the positions associated with resistant colonies on plate 2 should also show resistance, even though these colonies had not yet been exposed to streptomycin. As shown, this was indeed the case.

Mutation creates allelic diversity. Alleles are alternative forms of a gene. In some cases, genetic surveys focus on a region of DNA that may not be a gene in a classical sense; it may be a DNA region much larger or smaller than a gene, or a noncoding region. We will use the term haplotype to refer to an alternative form (specific nucleotide sequence) among the homologous copies of a defined DNA region, whether a gene or not.

The allelic or haplotypic diversity created by mutation can be greatly amplified by the genetic mechanisms of recombination and diploidy. In much of genetics, recombination refers to meiotic crossing over, but we use the term recombination in a broader sense as any genetic mechanism that can create new combinations of alleles or haplotypes. This definition of recombination encompasses the meiotic events of both independent assortment and of crossing over and also includes gene conversion and any non-meiotic events that create new gene combinations that can be passed on through a gamete to the next generation. Sexual reproduction and diploidy can also be thought of as mechanisms that create new combinations of genes.

As an illustration of the genetic diversity that can be generated by the joint effects of mutation and recombination, consider the MHC complex (major histocompatibility complex, also known in humans as HLA, human leukocyte antigen) of about 200 genes on the same chromosome. Table 1.1 shows the number of alleles found at 26 of these loci as of July 2018 in human populations in an HLA database (Robinson et al. 2015). As can be seen, mutational changes at these loci have generated from 1 to 5212 alleles/locus with a total of 18 690 alleles over all 26 loci. However, these loci can and do recombine. Hence, recombination has the potential of combining these 18 690 alleles into $7.48 \times 10^{42}$ distinct gamete types (obtained by multiplying the allele numbers at each locus). Sexual reproduction has the potential of bringing together all pairs of these gamete types in a diploid individual, resulting in $2.07 \times 10^{79}$ genotypes. And this is only from 26 loci in one small region of one chromosome of the human genome! Given that there are only about $7.7 \times 10^{9}$ humans in the world at the beginning of 2019, everyone in the world (with the exception of identical twins) will have a unique HLA genotype when these 26 loci are considered simultaneously as there are many more possible genotypes than human individuals.

Table 1.1 Numbers of alleles known in 2018 at 26 loci within the human MHC (HLA) region.

Locus        Number of Alleles
HLA-1 A      4340
HLA-1 B      5212
HLA-1 C      3930
HLA-1 E      27
HLA-1 F      31
HLA-1 G      61
DRA          7
DRB1         2268
DRB2         1
DRB3         171
DRB4         80
DRB5         61
DRB6         3
DRB7         2
DRB8         1
DRB9         6
DQA1         95
DQB1         1257
DOA          12
DOB          13
DMA          7
DMB          13
DPA1         67
DPA2         5
DPB1         1014
DPB2         6
Total        18 690

Source: Database described in Robinson et al. (2015) as updated on July 2018.

But, of course, humans differ at many more loci than just these 26. As of 2015, 88 million genetic variants were known in the human genome (The 1000 Genomes Project Consortium 2015). Assuming that most of these are biallelic, each polymorphic nucleotide defines three genotypes, so collectively the number of possible genotypes defined by these known polymorphic sites is $3^{88{,}000{,}000} \approx 10^{41{,}986{,}670}$ genotypes. To put this number into perspective, the number of electrons in the known universe is $10^{81}$ (https://www.quora.com/Approximately-how-many-electrons-are-there-in-the-known-Universe), a number far smaller than the number of potential genotypes that are possible in humanity just with known genetic variation. Hence, mutation and recombination can generate truly astronomical levels of genetic variation.
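The arithmetic behind these figures can be checked with a short Python sketch (it needs Python 3.8 or later for `math.prod`). The allele counts are transcribed from Table 1.1; the variable names are illustrative. One point is an assumption rather than something the text states: the genotype figure is reproduced here by counting unordered allele pairs at each locus, n(n + 1)/2, and multiplying across loci. That counting rule is inferred from the published number.

```python
import math

# Allele counts per locus, transcribed from Table 1.1.
HLA_ALLELE_COUNTS = {
    "HLA-1 A": 4340, "HLA-1 B": 5212, "HLA-1 C": 3930, "HLA-1 E": 27,
    "HLA-1 F": 31, "HLA-1 G": 61, "DRA": 7, "DRB1": 2268, "DRB2": 1,
    "DRB3": 171, "DRB4": 80, "DRB5": 61, "DRB6": 3, "DRB7": 2,
    "DRB8": 1, "DRB9": 6, "DQA1": 95, "DQB1": 1257, "DOA": 12,
    "DOB": 13, "DMA": 7, "DMB": 13, "DPA1": 67, "DPA2": 5,
    "DPB1": 1014, "DPB2": 6,
}

counts = list(HLA_ALLELE_COUNTS.values())

# Total number of alleles over all 26 loci.
total_alleles = sum(counts)

# Gamete types: one allele chosen at each locus, so multiply the counts.
gamete_types = math.prod(counts)

# Genotypes (assumed counting rule): unordered pairs of alleles at each
# locus, n(n + 1)/2, multiplied across loci.
genotypes = math.prod(n * (n + 1) // 2 for n in counts)

# 88 million biallelic sites with three genotypes each: 3**88_000_000 is
# too large to print usefully, so work with its base-10 logarithm instead.
log10_site_genotypes = 88_000_000 * math.log10(3)

print(f"loci: {len(counts)}, alleles: {total_alleles}")
print(f"gamete types: {gamete_types:.2e}")
print(f"genotypes:    {genotypes:.2e}")
print(f"3^88,000,000 is about 10^{int(log10_site_genotypes):,}")
```

Python integers have arbitrary precision, so the two products are exact; only the formatted output is rounded.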


The distinction between mutation and recombination is often blurred because recombination can occur within a gene and thereby create new alleles or haplotypes. For example, 71 individuals from three human populations were sequenced for a 9.7 kb region within the lipoprotein lipase (LPL) locus (Nickerson et al. 1998). This represents just about a third of this one locus. In all, 88 variable sites were discovered, and 69 of these sites were used to define 88 distinct haplotypes or alleles. These 88 haplotypes arose from at least 69 mutational events (a minimum of one mutation for each of the 69 variable nucleotide sites) coupled with about 30 recombination and gene conversion events (Templeton et al. 2000a). Thus, intragenic recombination and mutation have together generated 88 haplotypes as inferred using only a subset of the known variable sites in just a third of a single gene in a sample of 142 chromosomes. These 88 haplotypes in turn define 3916 possible genotypes – a number considerably larger than the sample size of 71 people. Studies such as those mentioned above make it clear that mutation and recombination can generate large amounts of genetic diversity at particular loci or chromosomal regions, but they do not address the question of how much genetic variation is present within species in general. How much genetic variation is present in natural populations was one of the defining questions of population genetics up until the mid-1960s. Before then, most of the techniques used to define genes required genetic variation to exist. For example, many of the early important discoveries in Mendelian genetics were made in the laboratory of Thomas Hunt Morgan during the first few decades of the twentieth century. This laboratory used morphological variation in the fruit fly Drosophila melanogaster as its source of material to study. 
Among the genes identified in this laboratory was the locus that codes for an enzyme in eye pigment biosynthesis known as vermillion in Drosophila. Morgan and his students could only identify vermillion as a genetic locus because they found a mutant that coded for a defective enzyme, thereby producing a fly with bright red eyes. If a gene existed with no allelic diversity at all, it could not even be identified as a locus with the techniques used in Morgan’s laboratory. Hence, all observable loci had at least two alleles in these studies (the “wildtype” and “mutant” alleles in Morgan’s terminology). As a result, even the simple question of how many loci have more than one allele could not be answered directly. This situation changed dramatically in the mid-1960s with the first applications of molecular genetic surveys (first on proteins, later on the DNA directly; see Appendix A, which gives a brief survey of the molecular techniques used to measure genetic variation). These new molecular techniques allowed genes to be defined biochemically and irrespective of whether or not they had allelic variation. The initial studies (Harris 1966; Johnson et al. 1966; Lewontin and Hubby 1966), using techniques that could only detect mutations causing amino acid changes in protein coding loci (and only a subset of all amino acid changes at that), revealed that about a third of all protein coding loci were polymorphic (i.e. a locus with two or more alleles such that the most common allele has a frequency of less than 0.95 in the gene pool) in a variety of species. As our genetic survey techniques acquired greater resolution (Appendix A), this figure has only gone up. These genetic surveys have made it clear that many species, including our own, have literally astronomically large amounts of genetic variation. 
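Two small calculations from the preceding discussion can be written out explicitly: the number of diploid genotypes that a set of haplotypes defines, and the frequency criterion used to call a locus polymorphic. This is a minimal sketch; the function names and the example frequencies are hypothetical, and only the 0.95 threshold and the LPL haplotype count come from the text.

```python
def genotype_count(n_alleles):
    """Number of unordered pairs (with repeats) of n alleles: n(n + 1)/2."""
    return n_alleles * (n_alleles + 1) // 2


def is_polymorphic(allele_frequencies, threshold=0.95):
    """True if the locus has two or more alleles and the most common
    allele has a frequency below the threshold (0.95 in the text)."""
    return len(allele_frequencies) >= 2 and max(allele_frequencies) < threshold


# 88 LPL haplotypes define 3916 possible genotypes.
print(genotype_count(88))

# Hypothetical allele frequencies at two loci.
print(is_polymorphic([0.90, 0.10]))   # most common allele below 0.95
print(is_polymorphic([0.97, 0.03]))   # most common allele above 0.95
```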
The chapters in Section 1 of this book will examine how premises 1 and 2 combine to explain great complexity at the population level in terms of the amount of genetic variation and its distribution in individuals, within demes, among demes, and over space and time. Because it is now clear that many species have vast amounts of genetic variation, the field of population genetics has become less concerned with the amount of genetic variation and more concerned with its phenotypic and evolutionary significance. This shift in emphasis leads directly into our third and final premise.


Phenotypes Emerge from the Interaction of DNA and Environment

A phenotype is a measurable trait of an individual (or as we will see later, it can be generalized to other units of biological organization). In Morgan’s day, genes could only be identified through their phenotypic effects. The gene was often named for its phenotypic effect in a highly inbred laboratory strain maintained under controlled environmental conditions. This method of identifying genes led to a simple-minded equating of genes with phenotypes that still plagues us today. Almost daily, one reads about “the gene for coronary artery disease,” “the gene for thrill seeking,” etc. Equating genes with phenotypes is reinforced by metaphors appearing in many textbooks and science museums to the effect that DNA is the “blueprint” of life. However, DNA is not a blueprint for anything; that is not how genetic information is encoded or processed. For example, the human brain contains about 10^11 neurons and 10^15 neuronal connections (Coveney and Highfield 1995). Does the DNA provide a blueprint for these 10^15 connections? The answer is an obvious “NO.” There are only about 3 billion base pairs in the human genome. Even if every base pair coded for a bit of information, there is insufficient information storage capacity in the human genome by several orders of magnitude to provide a blueprint for the neuronal connections of the human brain. DNA does not provide phenotypic blueprints; instead, the information encoded in DNA controls dynamic processes (such as axonal growth patterns and signal responses) that always occur in an environmental context. There is no doubt that environmental influences have an impact on the number and pattern of neuronal connections that develop in mammalian brains in general. It is this interaction of genetic information with environmental variables through developmental processes that yields phenotypes (such as the precise pattern of neuronal connections of a person’s brain).
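The storage argument above is a back-of-the-envelope comparison, and it can be made explicit. The sketch below assumes, generously, that a blueprint would need only one bit per neuronal connection, and it tries both one bit per base pair (as in the text) and two bits per base pair (the maximum for a four-letter alphabet). The variable names are illustrative.

```python
import math

BASE_PAIRS = 3e9      # approximate size of the human genome (from the text)
CONNECTIONS = 1e15    # approximate number of neuronal connections (from the text)


def shortfall_orders_of_magnitude(bits_per_base_pair, bits_per_connection=1):
    """log10 of (bits needed for a wiring blueprint) / (bits the genome holds).

    Assumption: one bit per connection is a deliberate underestimate of
    what specifying a connection would actually require.
    """
    needed = CONNECTIONS * bits_per_connection
    available = BASE_PAIRS * bits_per_base_pair
    return math.log10(needed / available)


for bits in (1, 2):
    gap = shortfall_orders_of_magnitude(bits)
    print(f"{bits} bit(s) per base pair: short by about 10^{gap:.1f}")
```

Under either coding assumption the genome falls short by more than five orders of magnitude, which is the sense of "several orders of magnitude" in the argument.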
Genes should never be equated to phenotypes. Phenotypes emerge from genetically influenced dynamic processes whose outcome depends upon environmental context. In this book, phenotypes are always regarded as arising from an interaction of genotype with environment. The marine worm Bonellia (Figure 1.2) provides an example of this interaction (Gilbert and Sarkar 2000). The free-swimming larval forms of these worms are sexually undifferentiated. If a larva settles alone on the normal mud substrate, it becomes a female with a long (about 15 cm) tube connecting a proboscis to a more rounded part of the body that contains the uterus. On the other hand, larvae are attracted to females, and if a larva can find a female, it differentiates into a male that exists as a ciliated microparasite inside the female. The body forms are so different they were initially thought to be totally different creatures. Hence, the same genotype, depending upon environmental context, can yield two drastically different body types.

The interaction between genotype and environment in producing phenotype is critical for understanding the evolutionary significance of genetic variability, so the chapters in Section 2 will be devoted to an exploration of the premise that phenotypes emerge from a genotype-by-environment interaction. As a prelude to why the interaction of genotype and environment is so critical to evolution, consider the following phenotypes that an organism can display:

• Being alive versus being dead: the phenotype of viability (the ability of the individual to survive in the environment)
• Given being alive, having mated versus not having mated: the phenotype of mating success (the ability of a living individual to find a mate in the environment)
• Given being alive and mated, the number of offspring produced: the phenotype of fertility or fecundity (the number of offspring the mated, living individual can produce in the environment).


Figure 1.2 Sexes in Bonellia. The female has a walnut-sized body that is usually buried in the mud with a protruding proboscis. The male is a ciliated microorganism that lives inside the female. Source: Adapted from Figure 3.18 from Genetics, 3rd Edition, by Peter J. Russell. Copyright © 1992 by Peter J. Russell. Reprinted by permission of Pearson Education, Inc.
[Figure labels: Proboscis, Mouth, Uterus.]

The three phenotypes given above play an important role in microevolutionary theory because, collectively, these phenotypes determine the chances of an individual passing on its DNA in the context of the environment. Reproductive fitness is the collective phenotype produced by combining these three components required for passing on DNA into a single measure. Fitness will be discussed in detail in Section 3 of this book. Reproductive fitness turns premise 1 (DNA can replicate) into reality. DNA is not truly self-replicating. DNA can only replicate in the context of an individual surviving in an environment, mating in that environment, and producing offspring in that environment. Hence, the phenotype of reproductive fitness unites premise 3 (phenotypes are gene-by-environment interactions) with premise 1 (DNA can replicate). This unification of premises implies that the probability of DNA replication is determined by how the genotype interacts with the environment. In a population of genetically diverse individuals (arising from premise 2 that DNA can mutate and recombine), it is possible that some genotypes will interact with the environment to produce more or fewer acts of DNA replication than other genotypes. Hence, the environment influences the relative chances for various genotypes of replicating their DNA. As we will see in Section 3 of this book, this influence of the environment (premise 3) upon DNA replication (premise 1) in genetically variable populations (premise 2) is the basis for natural selection and one of the major emergent features of microevolution: adaptation to the environment. Adaptation refers to attributes and traits displayed by organisms that aid them in living and reproducing in specific environments. Adaptation is one of the more dramatic features of evolution, and, indeed, it was the main focus of the theories of Darwin and Wallace. 
Adaptation can only be understood in terms of a three-way interaction among all of the central premises of population genetics (Figure 1.3).

Figure 1.3 The integration of the three fundamental premises of population genetic theory through the phenotype of reproductive fitness.
[Figure labels: Premise I: DNA Can Replicate; Premise II: DNA Can Mutate & Recombine; Premise III: Phenotypes Emerge from the Interaction of DNA and Environment; Genetically Variable Population of Individuals; Environment; Heritable Variation in the Phenotype of Reproductive Fitness.]

This book uses these three premises in a progressive fashion: Section 1 utilizes premises 1 (DNA can replicate) and 2 (DNA can mutate and recombine), which are molecular in focus,
to explain the amount and pattern of genetic variation under the assumption that the variation has no phenotypic significance on reproductive fitness. Section 2 focuses upon premise 3 (phenotypes emerge from the interaction of DNA and environment) and considers what happens when genetic variation does influence phenotype. Finally, Section 3 considers the emergent evolutionary properties that arise from the interactions of all three premises (Figure 1.3) and specifically focuses upon adaptation through natural selection. In this manner, we hope to achieve a thorough and integrated theory of microevolutionary processes.
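The three fitness components listed earlier (viability, mating success given survival, and fecundity given survival and mating) are defined conditionally, so one simple way to combine them into a single measure is to multiply them, which gives the expected number of offspring per individual. The sketch below is an illustration under that assumption and is not the formal treatment of fitness given in Section 3; the genotype labels and numerical values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitnessComponents:
    viability: float       # probability of being alive
    mating_success: float  # probability of mating, given alive
    fecundity: float       # expected offspring, given alive and mated


def reproductive_fitness(c):
    """Expected offspring per individual: the product of the three
    conditional components (an individual that dies or fails to mate
    contributes zero offspring)."""
    return c.viability * c.mating_success * c.fecundity


def relative_fitness(by_genotype):
    """Scale each genotype's fitness by the largest value in the set."""
    absolute = {g: reproductive_fitness(c) for g, c in by_genotype.items()}
    best = max(absolute.values())
    if best == 0:
        raise ValueError("no genotype leaves any offspring")
    return {g: w / best for g, w in absolute.items()}


# Hypothetical genotypes in one environment; values are made up.
example = {
    "AA": FitnessComponents(viability=0.8, mating_success=0.5, fecundity=4.0),
    "Aa": FitnessComponents(viability=0.8, mating_success=0.5, fecundity=3.0),
    "aa": FitnessComponents(viability=0.4, mating_success=0.5, fecundity=4.0),
}
print(relative_fitness(example))
```

Because every component is measured in a particular environment, changing the environment means changing the numbers in `example`, which is one way to picture premise 3 feeding into premise 1.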

Population Genomics

Microevolutionary processes depend upon the existence of genetic variation in the gene pool, so it is not surprising that the questions asked and approaches used in population genetic research are heavily influenced by the techniques available for surveying genetic variation in a population. The importance of the technique used to survey variation was already mentioned above with respect to one of the earliest and most fundamental questions of population genetics – how much variation exists in a gene pool? As the field of molecular genetics matured into genomics, there has been explosive growth in the ability to address ever more fundamental questions in unprecedented detail in population genetics. The application of genomic techniques to address population genetic questions is referred to as population genomics.

The focus of this book is upon eukaryotic, sexually reproducing diploid species. Such species have multiple genomes. A genome is a complete set of the chromosomes that are normally passed on through a gamete. Because eukaryotes arose from a symbiotic fusion of two or three different Precambrian lineages of prokaryotes, eukaryotic cells typically carry two or three different genomes with separate prokaryotic origins (Koonin 2010).

Figure 1.4 A photomicrograph of the nuclear genome of Drosophila busckii as visualized through polytene chromosomes. Polytene chromosomes are formed in some tissues from endoreduplication – repeated rounds of DNA (deoxyribonucleic acid) replication without nuclear division. Homologous strands sit side by side, allowing a high-resolution visualization of the nuclear chromosomes. Source: Andrew Syred/Science Source.

The first is the nuclear genome, which evolved
from an Archaebacteria ancestor. This genome in modern eukaryotes generally consists of several linear and distinct DNA molecules complexed with a number of proteins and other chemicals to form distinct chromosomes. The nuclear genome refers to a haploid set of these chromosomes. Figure 1.4 shows a photomicrograph of a nuclear genome of the fruit fly Drosophila busckii. Fruit flies, like humans, have an XY sex chromosome system, so some genomes bear an X chromosome, and an alternative nuclear genome bears a Y chromosome and no X chromosome. Y chromosomes are often much smaller in many species, including Drosophila and humans. In some species, sex is determined by factors other than chromosomes, such as environmental factors as illustrated in Figure 1.2, in which case there is only one type of nuclear genome. In diploid species, cells carry two nuclear genomes: one derived from the female gamete and one derived from the male gamete. The mode of inheritance of the autosomal portion of the genome is bisexual and sexually balanced, but the sex chromosomes display a different pattern of inheritance. For example, in an XY system of sex chromosomes, such as those found in humans and Drosophila, the X chromosome is inherited bisexually, but females are diploid for the X chromosome whereas males are haploid, resulting in haplo–diploid inheritance. The Y chromosome is unisexual in inheritance, being passed on from father to son, and is haploid in males. During the Precambrian, certain eubacterial lineages evolved the capacity to utilize oxygen for aerobic metabolism after the earlier evolution of prokaryotic photosynthesis. Photosynthesis resulted in the release of large amounts of oxygen into Earth’s ancient reducing atmosphere.


Oxygen was initially a deadly pollutant for most life forms, but some prokaryotic lineages adapted to this new environmental factor through the evolution of aerobic metabolism that allowed oxygen to be used as an efficient means of extracting energy from sugars. Some of these aerobic bacteria were ingested by Archaebacteria but not digested, resulting ultimately in a symbiotic relationship in which the aerobic symbionts evolved into the organelle now called mitochondria. This horizontal transfer between these two bacterial lineages resulted in the eukaryotic cell with both nuclear and mitochondrial genomes. The mitochondria retained a more bacteria-like genome consisting of a single, circular piece of DNA (Figure 1.5). Over the course of evolution, many of the genes needed for aerobic metabolism were transferred from the mitochondrial genome into the nuclear genome, but other genes essential for aerobic metabolism are retained in the mitochondrial genome. Mitochondrial DNA is often inherited only through females in bisexual animals and displays a clonal, non-recombining haploid pattern of inheritance rather than a bisexual, recombining mode.
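The different modes of inheritance described for autosomes, X and Y chromosomes, and mitochondrial DNA imply different numbers of transmissible gene copies in the same set of individuals. The sketch below counts those copies under stated assumptions: an XY sex-determination system, strictly maternal inheritance of mitochondrial DNA (the text notes this is common, not universal), and one effective mitochondrial lineage per female. The function name and the population sizes are hypothetical.

```python
def transmissible_copies(n_females, n_males):
    """Gene copies per locus that can be passed to the next generation.

    Assumptions: XY system (females XX, males XY); mitochondrial DNA
    transmitted only by females and treated as one haploid lineage
    per female.
    """
    return {
        "autosome": 2 * n_females + 2 * n_males,    # diploid in both sexes
        "X chromosome": 2 * n_females + n_males,    # haplo-diploid
        "Y chromosome": n_males,                    # haploid, father to son
        "mitochondrial DNA": n_females,             # haploid, maternal
    }


copies = transmissible_copies(n_females=100, n_males=100)
for genome, n in copies.items():
    ratio = n / copies["autosome"]
    print(f"{genome:18s} {n:4d} copies  ({ratio:.2f} of autosomal)")
```

With an equal sex ratio the X chromosome has three quarters, and the Y chromosome and mitochondrial DNA one quarter, of the autosomal copy number, which is one reason joint surveys of these DNA types respond differently to the same demographic history.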

Figure 1.5 The human mitochondrial genome. The names and positions of the various genes and control elements are indicated on the circular genome.
[Figure labels: Control Region; 12S rRNA, 16S rRNA; ND1, ND2, ND3, ND4, ND4L, ND5, ND6; COI, COII, COIII; ATP6, ATP8; Cyt b; tRNA genes Phe, Val, Leu (two), Ile, Gln, Met, Trp, Ala, Asn, Cys, Tyr, Ser (two), Asp, Lys, Gly, Arg, His, Thr, Pro.]


Eukaryotic plants acquired another genome, the chloroplast genome, through endosymbiosis with a photosynthetic cyanobacteria lineage. The chloroplast genome confers upon plants the ability of photosynthesis. The chloroplast genome, like the mitochondrial genome, consists of circular DNA, although it is typically larger and contains more genes than the mitochondrial genome. Like the mitochondrial genome, the chloroplast genome typically displays unisexual inheritance that is effectively haploid. Maternal inheritance is the most common pattern, but some groups of plants display biparental and paternal inheritance (Birky 2008). The fact that autosomes, sex chromosomes, and organelle DNAs often display distinct patterns of inheritance has made genetic surveys of two or more of these types of DNA a powerful tool in population genetics and genomics. As we will see in later chapters, such joint surveys allow the separation of the role of the sexes, coverage of longer periods of time in reconstructing the past, and the exploration of the balance of many different evolutionary forces acting in a reproducing population. The diversity in modes of inheritance has thereby made premise 1, DNA can replicate, more powerful than ever in the genomic era. Genomes are the physical–chemical structures in which the processes of mutation and recombination are carried out. Although mutation is random with respect to the needs of the organism, as pointed out above and in Figure 1.1, mutation is highly nonrandom at the genomic level, a fact often ignored in the population genetic literature. One common class of mutations is when a single nucleotide mutates to another nucleotide state. 
It is critical to note that such single nucleotide substitutions are chemical processes, and, as such, the probability and type of substitution a given single nucleotide is likely to experience are influenced by the physical–chemical environment created by the nucleotides surrounding the target nucleotide (Sung et al. 2015; Zhu et al. 2017; Suárez-Villagrán et al. 2018). In other words, single site mutations can be regarded as phenotypes that depend in part upon the environmental context as determined by their nucleotide neighbors. Hence, single nucleotide mutagenesis is actually a multi-nucleotide process that creates mutational hotspots in the genome, sometimes for just a single nucleotide within a multi-nucleotide mutagenic motif, but sometimes for several nucleotides in the motif leading to not only a hotspot but closely spaced mutational clusters (Schrider et al. 2011; Besenbacher et al. 2016). Templeton et al. (2000a) explored the impact of mutagenic motifs upon the nucleotide variation in a 9.7 kb segment of the LPL gene in 71 individuals. They discovered that about half of the nucleotide sites showing variation were associated with just three multi-nucleotide motifs, with CG dinucleotides being the most likely to show variation (Table 1.2). CG dinucleotides are hypermutagenic when the cytosine is methylated, making a C to T transition highly likely, and mutability can be further enhanced by the 5′ nucleotide adjacent to a CG pair (Baele et al. 2008).

Table 1.2 The distribution of polymorphic nucleotide sites in a 9.7 kb region of the human Lipoprotein Lipase gene over nucleotides associated with three known mutagenic motifs and all remaining nucleotide positions.

Motif                                                  Number of Nucleotides in Motif   Number of Polymorphic Nucleotides   Percent Polymorphic
CG                                                     198                              19                                  9.6%
Polymerase α Arrest Sites With Motif TG(A/G)(A/G)GA    264                              8                                   3.0%
Mononucleotide Runs ≥5 Nucleotides                     456                              15                                  3.3%
All Other Sites                                        8777                             46                                  0.5%

Source: Modified from Templeton et al. (2000a).

Mutational hotspots associated with DNA sequence motifs are not limited just to single nucleotide substitutions but are also found for indel (insertion/deletion) and copy-number mutations (Kvikstad and Duret 2014; Press et al. 2019).

Methylated CG dinucleotide mutations illustrate that nonrandom mutagenesis not only creates mutational hotspots in the genome but also increases the probability of mutational homoplasy in which two independent mutational events at the same nucleotide site create the same nucleotide substitution. Such homoplasy strikes directly at a fundamental concept in population genetics – the distinction between identity-by-descent versus identity-by-state. Identity-by-descent occurs when two homologous copies of a DNA region are identical in nucleotide state because no mutations occurred in either DNA lineage, tracing back to their common ancestral DNA molecule (by definition, homologous copies are descended from a common ancestral molecule). In contrast, identity-by-state occurs when two homologous copies of a DNA region are identical in nucleotide state regardless of their mutational history. As will be seen in later chapters, much of population genetic theory has been formulated in terms of identity-by-descent, and this theory can be most easily applied to data when identity-by-state is equated to identity-by-descent. Homoplasy undermines this assumption by allowing two identical DNA copies to have had mutations in one or both lineages from the ancestral molecule.

Despite the overwhelming evidence that single nucleotide mutagenesis is a multi-nucleotide process, many population genetic models and computer simulations treat each nucleotide or indel as independent mutational units that do not depend upon the sequence context.
For example, one of the most commonly used models of mutation in population genetics is the infinite sites model that assumes that every mutation occurs at a different mutational site because the probability of mutation at any given site is small and there are a large number of potential sites. The infinite sites model eliminates the possibility of multiple hits and homoplasy and always results in identity-by-state being the same as identity-by-descent. More complicated models of mutation are possible that allow multiple hits and homoplasy, as will be given later in this book. Computer programs such as ModelTest and jModelTest (Posada 2008) assess the best fitting mutational model from DNA sequence data over a large number of mutational models. Mutational hotspots can be incorporated into some of these models by allowing rate variation across sites. However, none of the models in ModelTest incorporate the impact of multi-nucleotide sequence context into the mutational model.

However, just because a model makes unrealistic biological assumptions does not mean that it is useless. In the next chapter, we will present the Hardy–Weinberg model, a basic model of evolution that makes many unrealistic assumptions yet has turned out to be extremely useful and yields results that accurately describe many populations. The question is, therefore, not if the assumptions of a model are unrealistic but rather how robust are the model’s predictions to violations of its assumptions. Unfortunately, because of the analytical difficulty and computational complexity of multisite mutational models, there have only been a few examinations of robustness to deviations from the simpler mutational models.
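To see why mutational hotspots matter for the infinite sites assumption, it helps to compare the chance that two independent mutations strike the same site when all sites are equally mutable with the chance when mutability is concentrated in motifs. The sketch below uses the site counts from Table 1.2 and makes a strong simplifying assumption that is not from the text: the fraction of polymorphic sites in each motif class is treated as a proxy for that class's relative per-site mutation rate. The function names are illustrative, and using 88 mutations simply borrows the number of variable sites in the LPL region.

```python
from math import comb

# (number of sites in class, number of polymorphic sites), from Table 1.2.
MOTIF_CLASSES = {
    "CG": (198, 19),
    "Polymerase alpha arrest sites": (264, 8),
    "Mononucleotide runs >= 5": (456, 15),
    "All other sites": (8777, 46),
}


def percent_polymorphic(n_sites, n_polymorphic):
    """Percentage of sites in a class that are polymorphic."""
    return round(100 * n_polymorphic / n_sites, 1)


def same_site_probability(classes, uniform=False):
    """Probability that two independent mutations hit the same site."""
    total_sites = sum(n for n, _ in classes.values())
    if uniform:
        return 1.0 / total_sites          # every site equally mutable
    # Assumption: per-site rate in a class is proportional to k / n.
    total_weight = sum(k for _, k in classes.values())
    return sum(k * k / n for n, k in classes.values()) / total_weight ** 2


def expected_colliding_pairs(n_mutations, p_same):
    """Expected number of mutation pairs that share a site."""
    return comb(n_mutations, 2) * p_same


p_uniform = same_site_probability(MOTIF_CLASSES, uniform=True)
p_hotspot = same_site_probability(MOTIF_CLASSES)
print(f"same-site probability, uniform:  {p_uniform:.2e}")
print(f"same-site probability, hotspots: {p_hotspot:.2e}")
print(f"ratio: {p_hotspot / p_uniform:.2f}")
print(f"expected colliding pairs among 88 mutations: "
      f"{expected_colliding_pairs(88, p_hotspot):.2f}")
```

Under these assumptions the same-site probability is about 3.5 times higher with hotspots than with uniform mutability. The infinite sites model corresponds to the limit in which this probability is zero, so the sketch only indicates the direction of the departure, not its true size.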
The few studies that have been performed indicate that directly modeling multisite context results in much better fits to sequence data than independent-site models, and the failure to incorporate multisite context produces biases and false positives in phylogenetic inference and detecting natural selection (Siepel and Haussler 2004; Baele et al. 2008; Lawrie et al. 2011; Bérard and Guéguen 2012; Chachick and Tanay 2012; Bloom 2014; Kvikstad and Duret 2014). Many of the studies mentioned above involve long evolutionary time scales, and a frequent justification for the infinite sites model is that multiple hits and homoplasy are not likely on the shorter time scales relevant for most population genetic studies. However, this argument does not seem to be borne out by genomic surveys. For example, multiple hits and parallel changes minimally occurred at 51% of the 88 single nucleotide polymorphisms (SNPs) in the LPL gene in humans


(Templeton et al. 2000a), and extensive homoplasy has been observed in other regions of the human genome (e.g. Fullerton et al. 2000). Hence, the very genetic variation that is the main focus of population genetics is greatly influenced by mutational processes that violate the infinite sites model. These results are worrisome because many of the analyses and computer simulations used in population genetics depend upon the infinite sites model or at least independent sites models of mutation; yet, the robustness of the predictions has rarely been examined. A few studies have examined the question of robustness to the mutational model, and many predictions are not robust. For example, inferences concerning recombination (Templeton et al. 2000a), gene flow (Strasburg and Rieseberg 2010), population structure/demographic history (Cutter et al. 2012), and natural selection (Baele et al. 2008; Cutter et al. 2012) can all be greatly affected by violations of commonly used simple mutational models. Given these results, a more complete examination of robustness to simple mutational models is greatly needed in population genetics. Recombination is another major source of genetic variation, and it too is often concentrated into recombinational hotspots that are associated with specific DNA sequence motifs (Singh et al. 2013; de Massy 2014; Singhal et al. 2015). Moreover, recombination hotspots are frequently hotspots for mutation and gene conversion (Pratto et al. 2014; Arbel-Eden and Simchen 2019; Halldorsson et al. 2019), a process related to recombination that places a small segment from one chromosome onto a homologous chromosome and thereby can create a new haplotype. For example, 30 statistically significant recombination and gene conversion events (after correction for false positives) were detected in the 9.7 kb region in the human LPL gene. All these events mapped to a small region associated with a microsatellite sequence in the 6th intron (Templeton et al. 
2000a), as shown in Figure 1.6, with not a single significant recombination or gene conversion event detected anywhere else in this 9.7 kb region. Just as mutational hotspots can create false or misleading signals for evolutionary forces such as natural selection, so can recombination and gene conversion hotspots (Bolívar et al. 2019).

Figure 1.6 Recombination and gene conversion events in a 9.7 kb interval of the LPL gene. The horizontal line is a map of the region, showing the positions of exons (E4 through E9) in thick lines and introns in thin lines. The number of statistically significant recombination or gene conversion events that could have occurred in an interval defined by adjacent single nucleotide polymorphism pairs is plotted. Source: Based on data in Templeton et al. (2000a).
[Figure axes: vertical axis, Number of Events (0 to 15); horizontal axis, map of the region with exons E4 through E9.]

The fact that DNA sequence influences both the local rates of mutation and recombination at the genomic level creates an interesting twist to premise 2 in population genetics, namely, local mutation and recombination can be regarded as a genomic level, inherited phenotype that is potentially responsive to natural selection. Martincorena and Luscombe (2013) investigated the possibility that populations might evolve lower mutation rates at genomic positions where most mutations are
deleterious and increased mutation rates where many mutations are advantageous. They found that natural selection could result in targeted hyper- and hypomutation, with the conditions favoring hypermutation being more restrictive. Livnat (2013, 2017) has also proposed models of natural selection interacting with nonrandom mutation and recombination at the genome level to produce adaptive evolutionary change. At first glance, the ability of natural selection to adjust local mutation and recombination rates may seem to contradict our earlier conclusion that mutations are random with respect to the environmental needs of the organism, as illustrated by the replica plating experiment shown in Figure 1.1. However, there is nothing Lamarckian about the selective models of Martincorena and Luscombe (2013) and Livnat (2013, 2017) that deal with the adjustment of mutation rates through ordinary natural selection on DNA sequences and not with the production of a specific adaptive mutation induced by a specific environmental need (the Lamarckian model). These models do show that nonrandom mutation and recombination at the genomic level can play an important role in adaptive response (Figure 1.3) and that the details underlying premise 2 are themselves subject to adaptive evolutionary change. Genomic science has also had a profound impact on our understanding of premise 3 that phenotypes arise out of the interaction of genotypes responding to the environment. The genes in any of the genomes constitute biological information but not function unless they are expressed. Whether a gene codes for a protein or for a functional RNA molecule (Bonasio and Shiekhattar 2014; Lyu et al. 2014), the first step in going from stored information to biological functionality is the transcription of the DNA into RNA by RNA polymerases. 
The transcriptome refers to the fraction of the genome that is actually being transcribed, and the transcriptome varies over developmental stages, across cell types, across individuals, and across many environmental factors. There is not a single transcriptome for a species or even an individual, but many, and all, are highly context dependent. The portions of a genome that are being transcribed at a given time and cell type are influenced by many factors, including the three-dimensional structure of the genome in terms of the relative positioning of sequences on the same chromosome and between chromosomes (Bouwman and de Laat 2015; Bonev and Cavalli 2016), chromatin remodeling (Chambers et al. 2013), and DNA sequences such as promoters, enhancers, silencers, response elements, and insulators. These DNA sequences often have binding domains that when bound to a protein or some other molecule (called transcription factors) can activate or deactivate the transcriptional function of the promoters, enhancers, silencers, etc. These binding molecules themselves are often induced by various environmental factors. Environmental factors can also influence how the transcribed RNA is processed, edited, and translated into proteins and how these proteins are sometimes modified post-translationally. Hence, the environment can play a direct role in how the information in the DNA influences phenotypes. Many binding domains are found in transposable elements (e.g. Cui et al. 2011), segments of DNA that have the ability to move or make copies of themselves at different locations in the genome. Transposable elements, also known as transposons, often constitute a sizable portion of the nuclear genome in eukaryotes, going up to 90% of the maize genome (SanMiguel et al. 1996). 
When transposons bearing binding domains transpose to new locations within a genome, there is the potential for bringing a new gene into an established regulatory network, thereby inducing genetic variation in the coordination of gene-network transcription regulation. Moreover, many binding domains can bind several different transcription factors, and even a single mutation in a binding domain can greatly alter the transcription factors that can influence transcription in the region influenced by the domain (Payne and Wagner 2014). This ability of mutation to change the array of transcription factors that will bind a specific domain and the movement of such domains by transposons can generate much variation in transcriptional regulatory patterns. This

variation in turn confers a high degree of evolvability (the ability to bring forth novel adaptations) upon the population. Evolvability is also enhanced by the nonrandom nature of mutation and recombination in the models of Livnat (2013, 2017). Another important aspect of the transcriptome is epigenetics, the development and maintenance of heritable states of gene expression patterns that do not directly depend on the DNA sequence and that are typically reversible, often in response to environmental cues (Bonasio et al. 2010). Epigenetic states are frequently controlled by chemical modifications of histones in the chromatin and/or the methylation of cytosines at CG dinucleotides (Rivera and Ren 2013; Won et al. 2013). These chemically modified epigenetic states are stable yet reversible and can be copied during the process of mitosis and thereby persist over multiple cell generations and can sometimes be passed on across generations (Boskovic and Rando 2018; Zhang et al. 2018). These chemical modifications are controlled by enzymes and noncoding RNAs (Fatica and Bozzoni 2014), which in turn are coded for in genes. Genetic variation in these genes influences epigenetic phenomena (Klironomos et al. 2013; Kilpinen et al. 2013), so the epigenome, including its sensitivity to environmental cues, is itself a phenotype that is influenced by underlying genetic variation, and hence subject to change by natural selection as shown in Figure 1.3. As illustrated by the discussion above, the DNA in the genome is not a static blueprint of life but rather is a dynamic entity that is constantly changing in response to the environment to produce phenotypes and whose dynamic attributes are subject to evolutionary change, including adaptation. Hence, genomic studies strongly reinforce the validity of premise 3 in population genetics and its integration with the other premises, as shown in Figure 1.3.


Part 1 The Scope and Basic Premises of Population Genetics


2 Modeling Evolution and the Hardy–Weinberg Law

Throughout this book, we will construct models of reproducing populations to investigate how various factors can cause evolutionary changes. In this chapter, we will construct some simple models of an isolated local population. These models eliminate many possible features in order to focus our inference upon one or a few potential microevolutionary factors. The models will also provide insights that have been historically important to the acceptance of the neo-Darwinian theory of evolution at the beginning of the twentieth century and are of increasing importance to the application of genetics to human health and other contemporary problems in the twenty-first century.

How to Model Microevolution

Given our definition that evolution is a change over time in the frequency of alleles or allele combinations in the gene pool, any model of evolution must include at the minimum the passing of genetic material from one generation to the next. Hence, our fundamental time unit will be the transition between two consecutive generations at comparable stages. We can then examine the frequencies of alleles or allelic combinations in the parental versus offspring generation to infer whether or not evolution has occurred. All such trans-generational models of microevolution have to make assumptions about three major mechanisms:

• Mechanisms of producing gametes
• Mechanisms of uniting gametes
• Mechanisms of developing phenotypes

In order to specify how gametes are produced, we have to specify the genetic architecture. Genetic architecture is the number of loci and their genomic positions, the number of alleles per locus, the mutation rates, and the mode and rules of inheritance of the genetic elements. For example, the first model we will develop assumes a genetic architecture of a single autosomal locus with two alleles with no mutation. The genetic architecture provides the information needed to specify how gametes are produced. For a single-locus, two-allele autosomal model with no mutation, we need only to use Mendel’s first law of inheritance (the law of equal segregation of the two alleles in an individual heterozygous at an autosomal locus) to specify how genotypes produce gametes. Other single-locus genetic architectures can display different modes of inheritance, including X-linked loci (with a haplo–diploid, sex-linked mode of inheritance in organisms with an XY sex-determining system), Y chromosomal loci (with a haploid, unisexual paternal mode of inheritance in organisms with an XY sex-determining system), or mitochondrial DNA and

Population Genetics and Microevolutionary Theory, Second Edition. Alan R. Templeton. © 2021 John Wiley & Sons, Inc. Published 2021 by John Wiley & Sons, Inc. Companion website: www.wiley.com/go/templeton/populationgenetics2


chloroplast DNA (with a haploid, unisexual mode of inheritance in most eukaryotes). We can also examine genetic architectures that depend upon more than one locus, in which case mixed modes of inheritance are possible and in which Mendel’s second law (independent assortment) and/or recombination frequencies of linked loci may enter into the rules by which gametes are produced. We can even have deviations from the standard rules of inheritance. For example, we may specify that a locus is subject to deviations from Mendel’s first law of 50 : 50 segregation in the production of gametes from heterozygotes. In a multi-locus model, we may specify that unequal crossing over can occur, thereby producing variation in the number of genes transmitted to the gametes. The assumptions about genetic architecture that we make obviously limit the types of evolutionary processes that we can model. Hence, the specification of genetic architecture is a critical first step in any model of microevolution. Because our focus is upon sexually reproducing diploid organisms, the transition from one generation to the next involves not only the production of gametes but also the pairing of gametes to form new diploid zygotes. Hence, we need to specify the mechanisms or rules by which gametes are paired together in the reproducing population. Population structure refers to the mechanisms of uniting gametes and includes:

• the system of mating of the population;
• the size of the population;
• the presence, amount, and pattern of genetic exchange with other populations; and
• the age structure of the individuals within the population.

All of these factors can have an impact on which gametes are likely to be paired and transmitted to the next generation through newly formed zygotes. As with genetic architecture, we can make assumptions about population structure that vary from the simple to the complex, depending upon the types of phenomena we wish to examine for evolutionary impact. The system of mating can be simply a random pairing of individuals or can be influenced by degrees of biological relatedness or other factors. We can choose to ignore the impact of population size by assuming size to be infinite, or we can examine small populations in which the population size has a major impact on the probability of two gametes being united in a zygote. We can model a single deme in which all uniting gametes come from that deme, or we can allow gametes from outside the deme to enter at some specified rate or probability, which in turn could be a function of geographical distance, ecological barriers, etc. We can assume discrete generations in which all individuals are born at the same time and then reproduce at the same time followed by complete reproductive senescence or death. Alternatively, we can assume that individuals can reproduce at many times throughout their life, can mate with individuals of different ages, and offspring can coexist with their parents. Until we specify these parameters of population structure, we cannot model microevolution because the uniting of gametes is a necessary step in the transmission of genes from one generation to the next in sexually reproducing organisms. In most species, the zygote that results from uniting gametes is not capable of immediate reproduction, but rather must grow, develop, survive, and mature reproductively. All of this takes place in an environment or suite of environments. 
From premise 3 in Chapter 1 (phenotypes arise from gene-by-environment interactions), we know that actual DNA replication depends upon the fitness phenotypes of the individuals bearing the DNA (Figure 1.3). Hence, we also need to specify phenotypic development: the mechanisms that describe how zygotes acquire phenotypes in the context of the environment. Assumptions can range from the simple (the genetic architecture has no impact on phenotype under any of the environments encountered by individuals in the


population) to the complex (phenotypes are dynamic entities constantly changing as the external environment changes and/or as the individual ages). All models of microevolution must make assumptions about the mechanisms of producing gametes, uniting gametes, and developing phenotypes. Without such assumptions, it is impossible to specify the genetic transition from one generation to the next. Quite often, models are presented that do not explicitly state the assumptions being made about all three mechanisms. This does not mean that assumptions are not being made; rather, they are being made in an implicit fashion. Throughout this book, an effort is made to state explicitly the assumptions being made about all three of these critical components of transferring DNA from one generation to the next in a reproducing population. We will do this now for our first and simplest model of evolution, commonly called the Hardy–Weinberg model.

The Hardy–Weinberg Model

One of the simplest models of population genetics is the Hardy–Weinberg model, named after two individuals who independently developed this model in 1908 (Hardy 1908; Weinberg 1908). Although this model makes several simplifying assumptions that are unrealistic, it has still proven to be useful in describing many population genetic attributes and will serve as a useful base model in the development of more realistic models of microevolution. Hardy was an English mathematician, and his development of the model is mathematically simpler but yields less biological insight than the more detailed model of Weinberg, a German physician. Both derivations will be presented here because each has advantages over the other for particular problems that will be addressed later in this book. Both derivations start with a common set of assumptions, as shown in Table 2.1. We now discuss each of the assumptions given in that table. Concerning the mechanisms of producing gametes, both men assumed a single autosomal locus with two alleles and with no mutation. Meiosis was assumed to be completely normal and regular, so that Mendel’s first law of equal segregation could predict the gametes produced by any genotype. There are also no maternal or paternal effects of any sort, so it makes no difference which parent contributes a gamete bearing a specific allele. Concerning the mechanisms of uniting gametes, both men assumed a single population that has no genetic contact with any other populations, that is, an isolated population. Within this closed

Table 2.1

The assumptions of the Hardy–Weinberg model.

Mechanisms of producing gametes (genetic architecture)
    One autosomal locus
    Two alleles
    No mutation
    Mendel’s First Law

Mechanisms of uniting gametes (population structure)
    System of mating      Random
    Size of population    Infinite
    Genetic exchange      None (one isolated population)
    Age structure         None (discrete generations)

Mechanisms of developing phenotypes
    All genotypes have identical phenotypes with respect to their ability for replicating their DNA


population, Hardy assumed that the individuals are monoecious (each individual is both a male and a female) and self-compatible; Weinberg allowed the sexes to be separate but assumed that the sex of the individual has no impact on any aspect of inheritance or genetic architecture. The system of mating in both derivations is known as random mating; that is, the probability of two genotypes being mates is simply the product of the frequencies of the two genotypes in the population. Note that random mating is defined solely in terms of the genotypes at the locus of interest; there is no implication in this assumption that mating is random for any other locus or set of loci or for any phenotypes not associated with the locus of interest. For example, humans do not mate at random for a number of phenotypes (gender, skin color, height, birth place, etc.), but as long as the genetic variation at the locus of interest has no impact on any of these phenotypes, the assumption of random mating can still hold. Hence, random mating is an assumption that is specific to the genetic architecture of interest and that does not necessarily generalize to other genetic systems found in the same organisms. Concerning the other aspects of population structure, both derivations make the assumption that the population is of infinite size, thereby eliminating any possible effects of finite population size upon the probability of uniting gametes. Both men ignored the effects of age structure by assuming discrete, non-overlapping generations. Finally, concerning the mechanisms of developing phenotypes, nothing was explicitly assumed but implicitly both derivations require that under the range of environments in which the individuals of the population are living and reproducing, there is no phenotypic variation for viability, mating success, and fertility. In terms of their ability to replicate DNA, all genotypes have identical phenotypes. 
This means that all genotypes have the same reproductive fitness, so there is no natural selection in this model. To examine the population genetic implications of these assumptions upon a reproducing population, we need to go through a complete generation transition. In both derivations, we will start with a population of reproductively mature adults. The essence of this model (and many others in population genetics) is to follow the fate of genes from this population of adults through producing gametes, mating to unite gametes (zygote production) and then zygotic development to the adults of the next generation. We will then examine the gene pools associated with these two generations of adults to see if any evolution has occurred. Because we are dealing with a single autosomal locus with two alleles (say A and a) and no additional mutation, adult individuals are of three possible genotypes: AA, Aa, and aa. We will characterize the adult population by their genotypes and the frequencies of these genotypes in the total population (see Figure 2.1). Let these three genotype frequencies be GAA, GAa, and Gaa, where the subscript indicates the genotype associated with each frequency. Because these three genotypes represent a mutually exclusive and exhaustive set of possible genotypes, these three genotype frequencies define a probability distribution over the genotypes found in the adult population (see Appendix B for a discussion of probability distributions). This means that GAA + GAa + Gaa = 1. This probability distribution of genotype frequencies represents our fundamental description of the adult population. At this point, the derivations of Hardy and of Weinberg diverge. We will first follow Hardy’s and then return to Weinberg’s. The population of adult individuals can produce gametes. As discussed in Chapter 1, the population of potential gametes produced by these individuals defines the gene pool (Figure 2.1). 
Because of our assumptions about genetic architecture and no mutation, all we need is Mendel’s first law to predict the frequencies of the various haploid genotypes (gametes) found in the gene pool from the frequencies of the diploid adult genotypes. Two and only two haploid gametic types are possible: A and a. The frequencies of these gametes (which for a one locus model are called allele frequencies) also define a probability distribution over the gamete types

[Figure 2.1: flow diagram. Adult population (AA, Aa, aa at frequencies GAA, GAa, Gaa) → gene pool via Mendel’s first law (A at p = GAA + ½GAa; a at q = Gaa + ½GAa) → zygotic population via random mating (G′AA = p², G′Aa = 2pq, G′aa = q²) → adult population of the next generation, with no effect of phenotype on viability, mating success, or fertility (G′AA = p², G′Aa = 2pq, G′aa = q²) → next-generation gene pool via Mendel’s first law (p′ = p² + ½(2pq) = p; q′ = ½(2pq) + q² = q).]

Figure 2.1 Derivation of the Hardy–Weinberg Law for a single autosomal locus with two alleles, A and a. In going from adults to gametes, solid arrows represent Mendelian transition probabilities for homozygotes, and dashed arrows represent Mendelian transition probabilities for heterozygotes. In going from gametes to zygotes, solid arrows represent gametes bearing the A allele, and dashed arrows represent gametes bearing the a allele.

found in the gene pool. This probability distribution of gamete frequencies represents our fundamental description of the gene pool. We will let p = the frequency of gametes bearing the A allele in the gene pool and q = the frequency of gametes bearing the a allele in the gene pool. Because p and q define a probability distribution over the gene pool, p + q = 1, or q = 1 − p. Hence, we need only one number, say p, to completely characterize the gene pool in this model. A critical question is: can we predict the allele (gamete) frequencies from the genotype frequencies? Under our assumptions of the mechanisms for producing gametes, the answer is “yes,” and all we need to use is Mendel’s first law of equal segregation. Under Mendel’s law with no mutation, the probability of an AA genotype producing an A gamete is 1, and the probability of an AA genotype producing an a gamete is 0. Similarly, the probability of an aa genotype producing an A gamete is 0, and the probability of an aa genotype producing an a gamete is 1 under standard Mendelian inheritance. Finally, Mendel’s first law predicts that the probability of an Aa genotype producing an A gamete is 1/2, and the probability


of an Aa genotype producing an a gamete is 1/2. These Mendelian probabilities are transition probabilities that describe how one goes from adult genotypes to gamete types. Hence, the transition from the adult population to the gene pool is determined completely by these transmission probabilities (our mathematical descriptor of the mechanisms of producing gametes). As can be seen from Figure 2.1, these transition probabilities from diploidy to haploidy allow us to predict the gene pool state completely from the adult population genotype state. In particular, all we have to do is multiply each transmission probability by the frequency of the genotype with which it is associated and then sum over all genotypes for each gamete type. Thus, 1 × GAA is the frequency of A gametes coming from AA individuals, ½ × GAa is the frequency of A gametes coming from Aa individuals, and 0 × Gaa = 0 is the frequency of A gametes coming from aa individuals. Hence, the total frequency of the A allele in the gene pool is 1 × GAA + ½ × GAa + 0 × Gaa = GAA + ½GAa = p. Similarly, the frequency of the a allele in the gene pool is 0 × GAA + ½ × GAa + 1 × Gaa = Gaa + ½GAa = q = 1 − [GAA + ½GAa] = 1 − p (see Figure 2.1). Note that the Mendelian transmission probabilities (the 0’s, 1’s, and ½’s used above) and the genotype frequencies (the G’s) completely determine the allele frequencies in the gene pool. In general, gamete frequencies can always be calculated from genotype frequencies given a knowledge of the mechanisms of producing gametes. Letting gj be the frequency of gamete type j in the gene pool (either an allele for a single-locus genetic architecture or a multi-allelic gamete for a multi-locus genetic architecture), the general formula for calculating a gamete frequency is:

$$g_j = \sum_{k \,\in\, \text{genotypes}} \Pr(\text{genotype } k \text{ producing gamete } j) \times (\text{frequency of genotype } k) \qquad (2.1)$$

where “genotype k” is simply a specific genotype possible under the assumed genetic architecture. The equations previously used to calculate p and q are special cases of Eq. (2.1) for a single autosomal locus with two alleles. This equation makes it clear that two types of information are needed to calculate gamete frequencies:

• information about the mechanisms of producing gametes that determine the probability of a specific genotype producing a specific gamete type, and
• the genotype frequencies of the population of interest.
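The two kinds of information in Eq. (2.1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names TRANSMISSION and gamete_frequencies are hypothetical, and the adult genotype frequencies are made-up values chosen for the example, not data from the text. The transmission table encodes Mendel’s first law for one autosomal locus with two alleles and no mutation.

```python
from fractions import Fraction

# Transmission probabilities P(gamete j | genotype k) under Mendel's first law
# for one autosomal locus with alleles A and a and no mutation.
TRANSMISSION = {
    "AA": {"A": Fraction(1), "a": Fraction(0)},
    "Aa": {"A": Fraction(1, 2), "a": Fraction(1, 2)},
    "aa": {"A": Fraction(0), "a": Fraction(1)},
}


def gamete_frequencies(genotype_freqs, transmission=TRANSMISSION):
    """Eq. (2.1): g_j = sum over genotypes k of P(j | k) * frequency of k."""
    gametes = {}
    for genotype, freq in genotype_freqs.items():
        for gamete, prob in transmission[genotype].items():
            gametes[gamete] = gametes.get(gamete, 0) + prob * freq
    return gametes


# Hypothetical adult genotype frequencies (illustrative values only).
adults = {"AA": Fraction(3, 10), "Aa": Fraction(5, 10), "aa": Fraction(2, 10)}
pool = gamete_frequencies(adults)
print(pool["A"], pool["a"])  # p and q for this hypothetical population
```

Passing a different transmission table (for example, one with unequal segregation in heterozygotes) would change the mechanisms of producing gametes without touching the genotype frequencies, which mirrors the separation of the two inputs in Eq. (2.1).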

It is always possible to calculate the gamete frequencies from the genotype frequencies given a knowledge of the mechanisms of producing gametes. Is it also possible to calculate the genotype frequencies from the gamete frequencies given a knowledge of the mechanisms of producing gametes? The answer is “no.” To see this, consider a population of adults consisting only of Aa individuals (Figure 2.2a). In this population, GAA = 0, GAa = 1, and Gaa = 0. Hence, p = GAA + ½ × GAa = 0 + ½ × 1 = 0.5. Now, consider a population with GAA = 0.25, GAa = 0.5, and Gaa = 0.25 (Figure 2.2b). For this population, p = GAA + ½ × GAa = 0.25 + ½ × (0.5) = 0.5. Now consider the population shown in Figure 2.2c in which GAA = 0.5, GAa = 0, and Gaa = 0.5. In this population, p = GAA + ½ × GAa = 0.5 + ½ × (0) = 0.5. Hence, three very different populations of adults all give rise to identical gene pools! This shows that there is no one-to-one mapping between genotype frequencies and gamete frequencies. Although gamete frequencies can always be calculated from genotype frequencies given a knowledge of the rules of inheritance, genotype frequencies are not uniquely determined by gamete frequencies and the rules of inheritance. Obviously, we need additional information to predict genotype frequencies from gamete frequencies. This is where population structure comes in.
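The many-to-one relationship just described can be checked numerically. The sketch below uses the three adult populations of Figure 2.2; the function name allele_freq_A is a hypothetical helper introduced only for this illustration.

```python
def allele_freq_A(g_AA, g_Aa, g_aa):
    """p = G_AA + (1/2) G_Aa for one autosomal locus with two alleles."""
    if abs(g_AA + g_Aa + g_aa - 1.0) > 1e-12:
        raise ValueError("genotype frequencies must sum to 1")
    return g_AA + 0.5 * g_Aa


# The three adult populations of Figure 2.2, as (G_AA, G_Aa, G_aa).
populations = {
    "a": (0.0, 1.0, 0.0),
    "b": (0.25, 0.5, 0.25),
    "c": (0.5, 0.0, 0.5),
}

# Each population maps to the same allele frequency, so p alone
# cannot tell us which genotype frequencies produced it.
p_values = {label: allele_freq_A(*g) for label, g in populations.items()}
print(p_values)
```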

[Figure 2.2: three panels. (a) Adult population with GAa = 1; (b) adult population with GAA = 0.25, GAa = 0.5, and Gaa = 0.25; (c) adult population with GAA = 0.5 and Gaa = 0.5. In each panel, Mendel’s first law yields the same gene pool: p = GAA + ½GAa = 0.5 and q = Gaa + ½GAa = 0.5.]

Figure 2.2 Different adult populations (a, b, and c) sharing a common gene pool.

Hardy and Weinberg made assumptions about population structure that remove as potential evolutionary factors genetic contact with other populations, population size, and age structure. All that is left in their simplified model is system of mating. Under Hardy’s formulation, random mating means that two gametes are randomly and independently drawn from the gene pool and united to form a zygote. By a random draw, Hardy meant that the probability of a gamete being drawn is the same as its frequency in the gene pool. Hence, if the proportion of the gametes bearing the A allele is p, then the probability of choosing a gamete with an A allele is p. Similarly, the probability of drawing an a gamete is q. Individuals are monoecious in Hardy’s model, and every individual contributes equally to both male and female gametes. Hence, although the second


Table 2.2 The multiplication of allele frequencies to yield zygotic genotypic frequencies under the Hardy–Weinberg model of random mating.

                                       Male gametes
                                       A (frequency p)      a (frequency q)
Female gametes    A (frequency p)      AA: p × p = p²       Aa: p × q = pq
                  a (frequency q)      aA: q × p = qp       aa: q × q = q²

Summed frequencies in zygotes:
    AA: G′AA = p²
    Aa: G′Aa = pq + qp = 2pq
    aa: G′aa = q²

Note: The zygotic genotype frequencies are indicated by G′k and the allele frequencies by p and q.

gamete drawn from the gene pool must be from the opposite sex of the first, all individuals are still equally likely to be the source of the second gamete. Moreover, Hardy regarded the number of gametes that could be produced by an individual as effectively infinite, so that drawing the first gamete from the gene pool has no effect upon drawing the second. The assumption of random mating also stipulates that this second gamete is drawn independently from the gene pool, which means that the probabilities are identical on the second draw and that the joint probability of both gametes is simply the product of their respective allele frequencies. Table 2.2 shows how these gamete frequencies are multiplied to yield zygotic genotype frequencies. Note that, in calculating the frequency of the Aa genotype, there are two ways of creating a heterozygous zygote: the A allele could come from the paternal parent and a from the maternal, or vice versa. The Hardy–Weinberg assumptions imply that parental origin of an allele has no effect. Hence, the two types of heterozygotes, each with frequency pq, are pooled together into a single Aa class with frequency 2pq. As the zygotes develop and mature into adults capable of contributing genes to the next generation, there is no change in their relative frequencies because of the implicit assumption of no phenotypic variation in viability, mating success, or fertility. This means that all genotypes are assigned a reproductive fitness of 1 so that there are no fitness differences (Figure 2.1). Hence, Hardy showed that the genotype frequencies of the next generation could be predicted from allele frequencies given knowledge of the system of mating. From Figure 2.1 or Table 2.2, these predicted genotype frequencies are:

• G′AA = p²,
• G′Aa = 2pq, and
• G′aa = q².

This array of genotype frequencies is known as the Hardy–Weinberg law. We did not make any assumptions in this derivation about the initial genotypic frequencies, GAA, etc. The initial adult population does not have to have Hardy–Weinberg genotype frequencies for the zygotes to have Hardy–Weinberg frequencies; all that is required is random mating of the adults

Table 2.3 Weinberg’s derivation of the Hardy–Weinberg genotype frequencies.

                                                Mendelian probabilities of offspring (zygotes)
Mating pair    Frequency of mating pair         AA       Aa       aa
AA × AA        GAA × GAA = GAA²                 1        0        0
AA × Aa        GAA × GAa = GAAGAa               ½        ½        0
Aa × AA        GAa × GAA = GAAGAa               ½        ½        0
AA × aa        GAA × Gaa = GAAGaa               0        1        0
aa × AA        Gaa × GAA = GAAGaa               0        1        0
Aa × Aa        GAa × GAa = GAa²                 ¼        ½        ¼
Aa × aa        GAa × Gaa = GAaGaa               0        ½        ½
aa × Aa        Gaa × GAa = GAaGaa               0        ½        ½
aa × aa        Gaa × Gaa = Gaa²                 0        0        1
Total offspring                                 G′AA     G′Aa     G′aa

Summing zygotes over all mating types:
    G′AA = GAA² + ½[2GAAGAa] + ¼GAa² = [GAA + ½GAa]² = p²
    G′Aa = ½[2GAAGAa] + 2GAAGaa + ½GAa² + ½[2GAaGaa] = 2[GAA + ½GAa][Gaa + ½GAa] = 2pq
    G′aa = ¼GAa² + ½[2GAaGaa] + Gaa² = [Gaa + ½GAa]² = q²

Note: Female genotypes are indicated first in the mating pair, male genotypes second.

regardless of their genotype frequencies. Hence, it takes only one generation of random mating to achieve Hardy–Weinberg genotype frequencies regardless of the starting genotype frequencies. Weinberg’s derivation differed from Hardy’s at the point of modeling uniting gametes. To Weinberg, random mating meant that the probability of two genotypes being involved in a mating event was simply the product of their respective genotype frequencies. Given a mating, offspring genotypes would be produced according to standard Mendelian probabilities. Hence, in Weinberg’s derivation, the mechanisms of producing gametes and the mechanisms of gametic union are utilized in an integrated fashion, as shown in Table 2.3. Note that this table makes an additional assumption not needed under the monoecious version of Hardy, namely, that the genotype frequencies are identical in both sexes. With this additional assumption, the end result of Weinberg’s derivation is the same as Hardy’s: the zygotic genotype frequencies (and hence the adult genotype frequencies of the next generation under the assumptions made here) are again given by G′AA = p², G′Aa = 2pq, and G′aa = q². We now address the important question of whether or not microevolution has occurred in this model, that is, are the allele frequencies in the offspring generation different or the same as the allele frequencies of the parent generation. Given that the adults of the offspring generation have the genotype frequencies G′AA = p², G′Aa = 2pq, and G′aa = q², the allele frequencies in the pool of gametes they produce (say p’ for A and q’ for a) are calculated from Eq. (2.1) as

$$p' = p^2 + \tfrac{1}{2}(2pq) = p^2 + pq = p(p + q) = p \qquad (2.2)$$

and q’ = q (also shown in Figure 2.1). The allele frequencies p and p’ are measured at comparable stages in two successive generations (here at the stage of producing gametes), and this contrast allows us to see if evolution has occurred. Because p = p’, by definition, there has been no evolution.


Hence, the Hardy–Weinberg model predicts that allele frequencies are stable over time and that no evolution is occurring under this set of assumptions. Because of this stability over time, Hardy–Weinberg genotype frequencies are often called the Hardy–Weinberg equilibrium. As noted earlier, it takes only one generation of random mating to achieve Hardy–Weinberg frequencies, and, once achieved, the population will remain in this state until one or more assumptions of the Hardy–Weinberg model are violated.
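Both claims, that one generation of random mating suffices and that allele frequencies do not change, can be checked with a short Python sketch that follows Weinberg’s mating-pair approach. The function names next_generation and freq_A, and the starting population G0, are assumptions introduced for this illustration. G0 is deliberately chosen to be far from Hardy–Weinberg proportions (no heterozygotes at all), and exact fractions are used so that the equalities can be tested without rounding.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)
# Probability that each genotype transmits an A gamete (Mendel's first law).
P_A = {"AA": Fraction(1), "Aa": HALF, "aa": Fraction(0)}


def next_generation(G):
    """Zygotic genotype frequencies after one round of random mating.

    Each ordered mating pair (female, male) has frequency G[female] * G[male],
    and offspring genotypes follow standard Mendelian probabilities.
    """
    zygotes = {"AA": Fraction(0), "Aa": Fraction(0), "aa": Fraction(0)}
    for female, male in product(G, repeat=2):
        pair_freq = G[female] * G[male]
        a_f, a_m = P_A[female], P_A[male]  # chance each parent transmits A
        zygotes["AA"] += pair_freq * a_f * a_m
        zygotes["Aa"] += pair_freq * (a_f * (1 - a_m) + (1 - a_f) * a_m)
        zygotes["aa"] += pair_freq * (1 - a_f) * (1 - a_m)
    return zygotes


def freq_A(G):
    """Allele frequency of A: p = G_AA + (1/2) G_Aa."""
    return G["AA"] + HALF * G["Aa"]


# Hypothetical starting adults, not in Hardy-Weinberg proportions.
G0 = {"AA": Fraction(1, 2), "Aa": Fraction(0), "aa": Fraction(1, 2)}
p = freq_A(G0)
q = 1 - p

G1 = next_generation(G0)  # should equal p^2, 2pq, q^2
G2 = next_generation(G1)  # should be unchanged from G1
print(G1, freq_A(G1))
```

The sketch keeps the assumption noted for Table 2.3 that genotype frequencies are identical in both sexes, since a single dictionary G supplies both the female and the male frequencies.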

An Example of the Hardy–Weinberg Law

As an illustration of the application of this model, consider a human population of Pueblo Indians scored for genetic variation at the autosomal blood group locus MN (Figure 2.3). This locus has two common alleles in most human populations, the M allele and the N allele. Genetic variation at this locus determines your MN blood group type, with a very simple genotype to phenotype

[Figure 2.3: flow diagram. Pueblo Indian population: 83 MM, 46 MN, 11 NN (GMM = 83/140 = 0.593, GMN = 46/140 = 0.329, GNN = 11/140 = 0.079) → gene pool via Mendel’s first law (M at p = 0.593 + ½(0.329) = 0.757; N at q = ½(0.329) + 0.079 = 0.243) → zygotic population via random mating (G′MM = p² = 0.573, G′MN = 2pq = 0.368, G′NN = q² = 0.059) → adult population of the next generation, with no effect of phenotype on viability, mating success, or fertility (G′MM = 0.573, G′MN = 0.368, G′NN = 0.059).]

Figure 2.3 Application of the Hardy–Weinberg model to a sample of Pueblo Indians scored for their genotypes at the autosomal MN blood group locus.

Modeling Evolution and the Hardy–Weinberg Law

mapping: MM genotypes have blood group M, MN genotypes have blood group MN, and NN genotypes have blood group N. Hence, it is easy to characterize the genotypes of all individuals in a population by determining their MN blood group type. Figure 2.3 shows the number of individuals with each of the possible genotypes at this locus in a sample of 140 Pueblo Indians (Boyd 1950). The first step in analyzing a population is to convert the genotype numbers into genotype frequencies by dividing the number of individuals of a given genotype by the total sample size. For example, 83 Pueblo Indians had the MM genotype out of the total sample of 140, so the frequency of the MM genotype in that sample is 83/140 = 0.593. Figure 2.3 then shows how the allele frequencies are calculated in the pool of potential gametes, yielding p (the frequency of M in this case) = 0.757 and q = 0.243. We can also apply the other definition of gene pool to this sample; namely, the gene pool is the population of genes collectively shared by all the individuals. Since this is a diploid locus, the 140 Pueblo Indians collectively share 280 copies of genes at the MN locus. The 166 copies found in the 83 MM homozygotes are all M, and half of the 92 copies found in the 46 MN heterozygotes are M. Hence, the total number of M alleles in this sample of 280 genes is 166 + ½92 = 212. The frequency of the M allele is therefore 212/280 = 0.757. As this shows, either way of conceptualizing the gene pool leads to the same answer. Continuing with Figure 2.3, we can see that the zygotic frequencies should be 0.573 for MM, 0.368 for MN, and 0.059 for NN if this population were randomly mating. Recall that random mating in this case simply means that the individuals are choosing mates at random with respect to their MN blood group types; it does not mean that mating is random for every trait! 
For example, this population is evenly split between males and females, so the frequency of the female genotype XX (where X designates the human X chromosome) is 0.5 and the frequency of the male genotype XY is 0.5 (where Y designates the human Y chromosome). Because sex is determined by the X and Y chromosomes as wholes and these chromosomes do not normally recombine, we effectively can treat gender as determined by a single locus with two alleles, X and Y. The frequency of X gametes in the Pueblo Indian gene pool is 0.5 + ½(0.5) = 0.75 and the frequency of Y gametes is ½(0.5) = 0.25. Therefore, we would expect the Hardy–Weinberg genotype frequencies of:

• $G_{XX} = (0.75)^2 = 0.5625$,
• $G_{XY} = 2(0.75)(0.25) = 0.375$, and
• $G_{YY} = (0.25)^2 = 0.0625$.

Obviously, this population is not at Hardy–Weinberg equilibrium for the X and Y chromosomes, and the reason is straightforward: mating is not random for these genetic elements. Instead, the only cross that can yield offspring is XX × XY, a gross deviation from the Hardy–Weinberg model portrayed in Table 2.3. Because of this highly non-random system of mating, the X and Y chromosomes can never achieve Hardy–Weinberg frequencies. Hence, systems of mating can be locus specific, and Hardy–Weinberg frequencies apply only to loci that have a random system of mating. Other genetic systems found in the same individuals in the same population may deviate from Hardy–Weinberg because mating is not random for that genetic system.

Recall that when the assumptions of Hardy–Weinberg are met, the population goes to Hardy–Weinberg genotype frequencies in a single generation and remains at those frequencies. Hence, if the Pueblo Indian population had been randomly mating for the MN blood groups in the past and if the other assumptions of Hardy–Weinberg are at least approximately true, we would expect the adult genotype frequencies of the next generation shown in Figure 2.3 to hold for the current adult population as well. This observation provides a basis for testing the hypothesis that this, or any population, has Hardy–Weinberg frequencies. The statistical details and a worked example of such a test are provided in Box 2.1.
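The MN calculations described above (genotype counts to allele frequencies by either definition of the gene pool, then to expected random-mating frequencies) can be sketched as follows. The function names are illustrative assumptions; the counts are those reported for the Pueblo Indian sample.

```python
def allele_freq_two_ways(n_MM, n_MN, n_NN):
    """Frequency of M computed from genotype frequencies and by gene counting."""
    n = n_MM + n_MN + n_NN
    # Eq. (2.1): homozygote frequency plus half the heterozygote frequency
    p_from_freqs = n_MM / n + 0.5 * (n_MN / n)
    # Gene counting: M copies among the 2n gene copies at a diploid locus
    p_from_counts = (2 * n_MM + n_MN) / (2 * n)
    return p_from_freqs, p_from_counts


def hardy_weinberg_expected(p):
    """Expected genotype frequencies under random mating."""
    q = 1.0 - p
    return {"MM": p * p, "MN": 2 * p * q, "NN": q * q}


# Sample of 140 Pueblo Indians: 83 MM, 46 MN, 11 NN
p1, p2 = allele_freq_two_ways(83, 46, 11)
expected = hardy_weinberg_expected(p2)
print(round(p1, 3), round(p2, 3))
print({genotype: round(freq, 3) for genotype, freq in expected.items()})
```

Both routes give the same allele frequency, as the text notes, because they are algebraically identical.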


Box 2.1 Testing to See if a Population Is in Single-Locus Hardy–Weinberg

We first estimate the allele frequencies using either Eq. (2.1) or the gene counting method and then calculate the expected Hardy–Weinberg genotype frequencies. These steps have already been done for the Pueblo Indians, as shown in Figure 2.3. Next, we convert the expected Hardy–Weinberg genotype frequencies into expected genotype numbers by multiplying each frequency by the total sample size, which is 140 in this case. For example, the expected number of MM homozygotes under Hardy–Weinberg for the Pueblo Indian sample is (0.573) × 140 = 80.22. Similarly, the expected numbers of MN and NN genotypes are 51.52 and 8.26, respectively. Now, we can calculate a standard goodness-of-fit chi-square statistic (see Appendix B):

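$$\sum_{i \in \text{Genotypes}} \frac{[\mathrm{Obs}(i) - \mathrm{Exp}(i)]^2}{\mathrm{Exp}(i)} = \frac{(83 - 80.22)^2}{80.22} + \frac{(46 - 51.52)^2}{51.52} + \frac{(11 - 8.26)^2}{8.26} = 1.59 \tag{2.3}$$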

where Obs(i) is the observed number of individuals with genotype i and Exp(i) is the expected number of individuals with genotype i under Hardy–Weinberg (in this case, i can be MM, MN, or NN). If the null hypothesis of Hardy–Weinberg is true, we expect the statistic calculated in Eq. (2.3) to have a value such that there is a high probability of the statistic having a value that large or larger when in fact the population is at Hardy–Weinberg. To calculate this probability, we need the degrees of freedom associated with the chi-square statistic. In general, the degrees of freedom are the number of categories being tested (three genotype categories in this case) minus 1 minus the number of independent parameters that had to be estimated from the data being tested to generate the expected numbers. In order to generate the Hardy–Weinberg expected values, we first had to estimate the allele frequencies of M and N from the data being tested. However, recall that q = 1 − p, so that once we know p we automatically know q. This means that the data are used to estimate only one independent parameter (the parameter p). Therefore, the degrees of freedom are 3 − 1 − 1 = 1.

We can now look up the value of 1.59 with 1 degree of freedom in a chi-square table or statistical calculator and find that the probability of getting a value of 1.59 or larger if the null hypothesis of Hardy–Weinberg were true is 0.21. Generally, such probabilities have to be less than 0.05 before the null hypothesis is rejected. Hence, we fail to reject the null hypothesis of Hardy–Weinberg for this sample of Pueblo Indians scored for the MN locus. It would have been simpler to say that the Pueblo Indian population is in Hardy–Weinberg, but we have not actually demonstrated this. Our sample is relatively small, and perhaps with more extensive sampling, we would reject Hardy–Weinberg. Hence, all that we have really demonstrated is that we fail to reject Hardy–Weinberg for our current sample.
Statistical tests never prove that a null hypothesis is true; the test either rejects or fails to reject the null hypothesis.
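The test in Box 2.1 can be sketched in a few lines. This is an illustrative sketch rather than the book's own procedure: the function name is an assumption, the expected numbers are computed without intermediate rounding, and the tail probability uses the identity that, for 1 degree of freedom only, the chi-square upper-tail probability equals erfc(√(χ²/2)).

```python
import math


def hardy_weinberg_chi_square(n_MM, n_MN, n_NN):
    """Goodness-of-fit test of Hardy-Weinberg for one locus with two alleles.

    Returns (chi_square, p_value) with 3 - 1 - 1 = 1 degree of freedom.
    """
    observed = (n_MM, n_MN, n_NN)
    n = sum(observed)
    # Estimate the single independent parameter p by gene counting
    p = (2 * n_MM + n_MN) / (2 * n)
    q = 1.0 - p
    # Expected genotype numbers under Hardy-Weinberg
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variate with exactly 1 df
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


chi_square, p_value = hardy_weinberg_chi_square(83, 46, 11)
print(round(chi_square, 2), round(p_value, 2))
```

For tests with more than 1 degree of freedom (for example, loci with more than two alleles), the erfc shortcut no longer applies and a general chi-square distribution function would be needed.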

Importance of the Hardy–Weinberg Law

At first, the Hardy–Weinberg law may seem a relatively minor, even trivial, accomplishment. Nevertheless, this simple model played an important role in the development of both genetics and evolution in the early part of the twentieth century. Mendelian genetics had been rediscovered at the start of the twentieth century, but many did not accept it. One of the early proponents of Mendelian genetics was R.C. Punnett (of “Punnett square” fame). Punnett made a presentation


at a scientific meeting in which he argued that the trait of brachydactyly (short fingers) was inherited as a Mendelian dominant trait in humans. Udny Yule, a member of the audience, raised the objection that one would expect a 3 : 1 ratio of people with brachydactyly to those without if the Mendelian model were true, and this clearly was not the case. Punnett suspected that there was an error in this argument, but he could not come up with a response at the meeting. Later, Punnett explained the problem to his mathematician friend, G.H. Hardy, who immediately proceeded to derive his version of the Hardy–Weinberg law. Hardy’s derivation made it clear that Yule had confused the family Mendelian ratio of 3 : 1 (which was for the offspring of a specific mating between two heterozygotes for the dominant trait) with the frequency in a population. Suppose in our earlier derivation, the A allele is dominant over a for some trait, then the Hardy–Weinberg law predicts that the ratio of frequencies of those with the dominant trait to those with the recessive trait in a random mating population should be $(p^2 + 2pq) : q^2$. There is no constraint upon this ratio to be 3 : 1 or any of the other family ratios expected under Mendelian inheritance. Rather, this population ratio can vary continuously as p varies from 0 to 1.

The predicted ratio of individuals with dominant to recessive traits also provided a method for predicting the frequency of carriers for genetic disease. Many genetic diseases in humans are recessive, so now let a be a recessive disease allele. Only two phenotypic categories could be observed in these early Mendelian studies: the dominant phenotype, associated with the genotypes AA and Aa, and the recessive, associated with the genotype aa. Thus, there was seemingly no way to predict how many people were carriers (Aa) as they could not be distinguished phenotypically from the AA homozygotes.
However, if we assume that Hardy–Weinberg is true, then the frequency of individuals affected with the genetic disease (which is observable) is $q^2$. Hence, we can estimate $q$ in this case as:

$$q = \sqrt{G_{aa}} \tag{2.4}$$

Given q, the frequency of carriers of the genetic disease can be estimated as 2(1−q)q. Note, in this case, we cannot actually test the population for Hardy–Weinberg because we only have two observable categories and we have estimated one parameter from the data to be tested (Eq. 2.3). Therefore, the degrees of freedom are 2–1−1 = 0. Zero degrees of freedom means we have insufficient information in the data to test the model (Appendix B). Equation (2.4) should never be used when all genotypic classes are observable because it is valid only in the special case of Hardy–Weinberg genotype frequencies. In contrast, Eq. (2.1) makes no assumptions about Hardy–Weinberg and is true for any set of genotype frequencies. Therefore, when all genotypic classes are observable, Eq. (2.1) should be used instead of Eq. (2.4) because Eq. (2.1) will always give you the right answer, whereas Eq. (2.4) will only give the right answer in a specific special case. Nevertheless, Eq. (2.4) played an important role throughout much of the twentieth century in genetic counseling in predicting heterozygous carrier frequencies for autosomal recessive genetic diseases when all genotypic classes were not observable. The Hardy–Weinberg law also predicts no evolution, that is, the allele frequencies remain constant over time. At first, this may also seem to be a rather uninteresting result, but this observation was critical for the acceptance of the Darwin–Wallace concept of natural selection. The publication of Darwin’s book The Origin of Species in 1859 strongly established the concept of descent with modification within biology. However, Darwin’s (and Wallace’s) explanation for the origin of adaptations via natural selection was less universally accepted. Darwin felt that the Scottish engineer Fleeming Jenkin raised one of the most serious objections to the theory of natural selection in 1867. At this time, the dominant idea of inheritance was that of “blending inheritance” in which


the traits of the father and mother are blended together much as mixing two different colors of paint together results in a new color that represents equal amounts of the original colors. Jenkin pointed out that half of the heritable variation would be lost every generation under blending inheritance; hence, a population should quickly become homogeneous. Recall from Chapter 1 that heritable variation is a necessary prerequisite for all evolution, so evolution itself would grind to a halt unless mutation replenished this loss at the same rate. Darwin and Wallace had based their theory of natural selection upon the tenet that mutation creates new variation at random with respect to the needs of the organism in coping with its environment. It seemed implausible that half of the genetic material could mutate at random every generation and the organisms still survive. Hence, Jenkin’s argument seemed to imply that either genetic variation would quickly vanish and all evolution halt or that natural selection required levels of mutation that would result in extinction. This problem even led Darwin in his 1868 book on The Variation of Animals and Plants Under Domestication to speculate that mutation might be directed by the environment. By the beginning of the twentieth century, many neo-Lamarckian ideas based upon directed mutations were popular alternatives to natural selection of random mutations.

Jenkin’s argument was finally put to rest by the Hardy–Weinberg law. The Hardy–Weinberg model, by ignoring many potential evolutionary forces (Table 2.1), focuses our attention upon the potential evolutionary impact of Mendelian inheritance alone. By demonstrating that Mendelian inheritance results in a population with a constant allele frequency, it was evident that Mendelian genetic variation is not rapidly lost from a population. Indeed, under the strict assumptions of Hardy–Weinberg, genetic variation persists indefinitely.
Thus, even though the Hardy–Weinberg model is one of no evolution, this model was critical for the acceptance of natural selection as a plausible mechanism of evolutionary change under Mendelian inheritance.

In general, this book is concerned about evolutionary change. In modeling evolution, Hardy–Weinberg is a useful null model of evolutionary stasis. Indeed, much of the rest of this book is devoted to relaxing one or more of the assumptions of the original Hardy–Weinberg model and seeing whether or not evolution can result. In this sense, Hardy–Weinberg serves as a valuable springboard for the investigation of many forces of evolutionary change. In the remainder of this chapter, we consider just one slight deviation from the original Hardy–Weinberg model, and we will investigate the evolutionary implications of this slight change.
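As a worked illustration of the carrier-frequency estimate from Eq. (2.4) discussed earlier in this section, the sketch below estimates q as the square root of the observed frequency of affected (aa) individuals and then the carrier frequency as 2(1 − q)q. The function name and the disease incidence of 1 in 10,000 are hypothetical choices made for illustration, and the estimate is valid only if Hardy–Weinberg genotype frequencies are assumed.

```python
import math


def carrier_frequency(affected_freq):
    """Estimate (q, carrier frequency) for an autosomal recessive disease.

    affected_freq is the observed frequency of aa individuals; the result
    assumes Hardy-Weinberg genotype frequencies, as Eq. (2.4) requires.
    """
    if not 0.0 <= affected_freq <= 1.0:
        raise ValueError("affected_freq must be a frequency between 0 and 1")
    q = math.sqrt(affected_freq)   # Eq. (2.4)
    carriers = 2.0 * (1.0 - q) * q  # Hardy-Weinberg heterozygote frequency
    return q, carriers


# Hypothetical incidence: 1 affected individual per 10,000
q, carriers = carrier_frequency(1 / 10_000)
print(q, carriers)
```

With these hypothetical numbers, carriers are roughly 200 times more frequent than affected individuals, which illustrates why the estimate mattered for genetic counseling.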

Hardy–Weinberg for Two Loci

The original Hardy–Weinberg model assumed a genetic architecture of one autosomal locus with two alleles. We will now consider a slightly more complicated genetic architecture of two autosomal loci, each with two alleles (say A and a at locus 1 and B and b at locus 2). Otherwise, we will retain all other assumptions of the original Hardy–Weinberg model. However, there is one new assumption. Recall from Chapter 1 that our second premise is that DNA can mutate and recombine. We will retain the Hardy–Weinberg assumption of no mutation, but we will allow recombination (either independent assortment if the two loci are on different autosomes or crossing over if they are on the same autosome). Because our main interest is on whether or not evolutionary change occurs, we will start with the gene pool and go to the next generation’s gene pool (Figure 2.4), rather than going from adult population to adult population as in Figures 2.1 and 2.3. Given two loci with two alleles each and the possibility of recombination between them, a total of four gamete types are possible (AB, Ab, aB,

[Figure 2.4: flow diagram. Gene Pool: gametes AB, Ab, aB, ab with frequencies gAB, gAb, gaB, gab. Mechanisms of Uniting Gametes (Random Mating) lead to the Zygotic/adult Population: AB/AB (gAB²), AB/Ab (2gABgAb), AB/aB (2gABgaB), AB/ab (2gABgab), Ab/Ab (gAb²), Ab/aB (2gAbgaB), Ab/ab (2gAbgab), aB/aB (gaB²), aB/ab (2gaBgab), ab/ab (gab²). Mechanisms of Producing Gametes (Mendel’s First Law and Recombination) lead to the Gene Pool of Next Generation: gametes AB, Ab, aB, ab with frequencies g′AB, g′Ab, g′aB, g′ab.]

Figure 2.4 Derivation of the Hardy–Weinberg Law for two autosomal loci with two alleles each: A and a at locus 1 and B and b at locus 2. In going from gametes to zygotes, solid arrows represent gametes bearing the AB alleles and are assigned the weight gAB, dashed arrows represent gametes bearing the Ab alleles and are assigned the weight gAb, gray arrows represent gametes bearing the aB alleles and are assigned the weight gaB, and dotted arrows represent gametes bearing the ab alleles and are assigned the weight gab. In going from adults to gametes, solid arrows represent Mendelian transition probabilities of 1 for homozygotes, dashed arrows represent Mendelian transition probabilities of ½ for single heterozygotes, dotted arrows represent non-recombinant Mendelian transition probabilities of ½(1 − r) (where r is the recombination frequency between loci 1 and 2) for double heterozygotes, and gray arrows represent recombinant Mendelian transition probabilities of ½r for double heterozygotes.

and ab). The gene pool is characterized by four gamete frequencies (Figure 2.4), symbolized by $g_{xy}$ where x indicates the allele at locus 1 and y indicates the allele at locus 2. Just as p and q sum to one, these four gamete frequencies also sum to one because they define a probability distribution over the gene pool. The transition from this gene pool to the zygotes is governed by the same population structure (rules of uniting gametes) as given in the single-locus Hardy–Weinberg model. In particular, the assumption of random mating means that gametes are drawn independently from the gene pool, with the probability of any given gamete type being equal to its frequency. The probability of any particular genotype is simply the product of its gamete frequencies, just as in the single-locus Hardy–Weinberg model. In Figure 2.4, we are not keeping track of the paternal or maternal origins of any gamete, so both types of heterozygotes are always pooled, and, therefore, the product of the gamete frequencies for heterozygous genotypes is multiplied by two. For example, the frequency of the genotype AB/Ab is $2g_{AB}g_{Ab}$. Note that there are two types of double heterozygotes, AB/ab (the cis double heterozygote with a random mating frequency of $2g_{AB}g_{ab}$) and Ab/aB (the trans double heterozygote with a random mating frequency of $2g_{Ab}g_{aB}$). Although the cis and trans double heterozygotes share the double heterozygous genotype, completely different gamete types produce the cis and trans double heterozygosity. As we will soon see, the cis and trans double heterozygous genotypes contribute to the gene pool in different ways. Hence, we will keep the cis and trans double heterozygote classes separate.
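The rule for uniting gametes just described can be sketched as below. This is a minimal illustration, not the author's code: the function name and the gamete frequencies are hypothetical. Genotypes are written as unordered pairs of gametes, so the cis (AB/ab) and trans (Ab/aB) double heterozygotes remain separate classes.

```python
from itertools import combinations_with_replacement

GAMETES = ("AB", "Ab", "aB", "ab")


def two_locus_genotype_freqs(g):
    """Random-mating genotype frequencies from the four gamete frequencies.

    g maps each gamete type to its frequency. Homozygous pairs get g_x**2;
    pairs of different gametes get 2*g_x*g_y because parental origin is pooled.
    """
    if abs(sum(g[x] for x in GAMETES) - 1.0) > 1e-9:
        raise ValueError("gamete frequencies must sum to 1")
    freqs = {}
    for x, y in combinations_with_replacement(GAMETES, 2):
        product = g[x] * g[y]
        freqs[f"{x}/{y}"] = product if x == y else 2.0 * product
    return freqs


# Hypothetical gamete frequencies
g = {"AB": 0.4, "Ab": 0.1, "aB": 0.2, "ab": 0.3}
genotypes = two_locus_genotype_freqs(g)
for name, freq in genotypes.items():
    print(name, round(freq, 4))
```

Enumerating unordered pairs of four gamete types yields ten genotype classes whose frequencies sum to one.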


The rules for uniting gametes in the two-locus model are the same as for the single-locus model, the only difference being that there are now 10 genotypic combinations. As with the single-locus model, if we know the gamete frequencies and know that the mating is at random (along with the other population structure assumptions of Hardy–Weinberg), we can predict the zygotic genotype frequencies. If we further assume that there are no phenotypic differences that affect viability, mating success, or fertility, we can also predict the next generation’s adult genotype frequencies from the gamete frequencies. The similarities to the single-locus model end when we advance to the transition from the next generation’s adult population to the gene pool of the next generation (Figure 2.4). At this point, some new rules are encountered in producing gametes that did not exist at all in the single-locus model (Figure 2.1). As before, homozygous genotypes can only produce gametes bearing the alleles for which they are homozygous (this comes from the assumptions of normal meiosis and no mutations). As before, genotypes heterozygous for just one locus produce two gamete types, with equal frequency as stipulated by Mendel’s first law. However, genotypes that are heterozygous for both loci can produce all four gamete types, and the probabilities are determined by a combination of Mendel’s first law and recombination (Mendel’s second law of independent assortment if the loci are on different chromosomes, or the recombination frequency if on the same chromosome). Hence, the transition from genotypes to gametes requires a new parameter, the recombination frequency r, which is ½ if the loci are on different chromosomes, and 0 ≤ r ≤ ½ if the loci are on the same chromosome. The addition of recombination produces some qualitative differences with the single-locus model. 
First, in the single-locus model, an individual could only pass on gametes of the same types that the individual inherited from its parents. But note from Figure 2.4 that the cis double heterozygote AB/ab, which inherited the cis AB and ab gamete types from its parents, can produce not only the cis gamete types, each with probability ½(1−r), but can also produce the trans gamete types Ab and aB, each with probability ½r. Similarly, the trans double heterozygote can produce both cis and trans gamete types (Figure 2.4). Thus, recombination allows the double heterozygotes to produce gamete types that they themselves did not inherit from their parents. This effect of recombination is found only in the double heterozygote class, but this does not mean that recombination only occurs in double heterozygotes. Consider, for example, the single heterozygote AB/Ab. If no recombination occurs in meiosis, this genotype will produce the gamete types AB and Ab with equal frequency. Hence, the total probability of gamete type AB with no recombination is ½(1−r), and similarly it is ½(1−r) for Ab. Now consider a meiotic event in which recombination did occur. Such a recombinant meiosis also produces the gamete types AB and Ab with equal frequency, that is, with probability ½r for each. However, in the recombinant AB gamete, the A allele that is combined with the B allele originally came from the Ab gamete that the AB/Ab individual inherited from one of its parents. Hence, recombination has occurred, but because we do not distinguish among copies of the A alleles, we see no observable genetic impact. Hence, the total probability of an AB gamete, regardless of the source of the A allele, is ½(1−r) + ½r = ½, and the total probability of an Ab gamete, regardless of the source of the A allele, is ½(1−r) + ½r = ½. Thus, recombination could be occurring in all genotypes, but it is observable only in double heterozygotes. 
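The gamete-production rules described above can be written as one small function. In this sketch (the function name and the value r = 0.1 are illustrative assumptions), every genotype is treated as undergoing recombination with probability r; the recombinant products differ from the parental gametes only in double heterozygotes, which mirrors the argument in the text.

```python
def gamete_output(gamete1, gamete2, r):
    """Gamete probabilities produced by the two-locus genotype gamete1/gamete2.

    Each gamete is a two-character string: allele at locus 1, then locus 2.
    r is the recombination frequency between the loci (0 <= r <= 0.5).
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("r must lie between 0 and 0.5")
    a1, b1 = gamete1
    a2, b2 = gamete2
    out = {}
    # Non-recombinant meioses transmit the inherited gametes unchanged
    for gamete in (a1 + b1, a2 + b2):
        out[gamete] = out.get(gamete, 0.0) + 0.5 * (1.0 - r)
    # Recombinant meioses swap the locus-2 alleles between the two gametes
    for gamete in (a1 + b2, a2 + b1):
        out[gamete] = out.get(gamete, 0.0) + 0.5 * r
    return out


print(gamete_output("AB", "ab", 0.1))  # cis double heterozygote
print(gamete_output("Ab", "aB", 0.1))  # trans double heterozygote
print(gamete_output("AB", "Ab", 0.1))  # single heterozygote
print(gamete_output("AB", "AB", 0.1))  # homozygote
```

For the single heterozygote the recombinant and non-recombinant contributions add up to ½ for each gamete type, so recombination leaves no observable trace there.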
The qualitative difference from the single-locus model that causes some genotypes to produce gamete types that they themselves did not inherit leads to yet another qualitative difference: the two definitions of gene pool given in Chapter 1 are no longer equivalent. If we define the gene pool as the shared genes of all the adult individuals, we obtain the gamete frequencies from the pool of gametes produced by their parents (the $g_{xy}$ values in Figure 2.4). On the other hand, if we define the gene pool as the population of potential gametes produced by all the adult individuals, the effects of


recombination enter and we obtain the $g'_{xy}$ values in Figure 2.4. To avoid any further confusion on this point, the phrase “gene pool” in this book will always refer to the population of potential gametes unless otherwise stated. The general population genetic literature often does not make this distinction because in the standard single-locus Hardy–Weinberg model it is not important. Quite frequently, there is a time difference of one generation among the models of various authors depending upon which definition of gene pool they use (usually implicitly). Therefore, readers have to be careful in interpreting what various authors mean by “gene pool” when dealing with multilocus models or other models in which these two definitions may diverge.

The most important qualitative difference from the single-locus model involves the potential for evolution. As seen before, the single-locus Hardy–Weinberg model goes to equilibrium in a single generation of random mating and then stays at the equilibrium, resulting in no evolution. To see if this is the case for the two-locus model, we now use Eq. (2.1) to calculate the gamete frequency of the AB gamete using the weights implied by the arrows in Figure 2.4 going from adults to gametes:

$$\begin{aligned}
g'_{AB} &= 1 \times g_{AB}^2 + \tfrac{1}{2}(2g_{AB}g_{Ab}) + \tfrac{1}{2}(2g_{AB}g_{aB}) + \tfrac{1}{2}(1 - r)(2g_{AB}g_{ab}) + \tfrac{1}{2}r(2g_{Ab}g_{aB}) \\
&= g_{AB}[g_{AB} + g_{Ab} + g_{aB} + (1 - r)g_{ab}] + r g_{Ab}g_{aB} \\
&= g_{AB}[g_{AB} + g_{Ab} + g_{aB} + g_{ab}] + r g_{Ab}g_{aB} - r g_{AB}g_{ab} \\
&= g_{AB} + r(g_{Ab}g_{aB} - g_{AB}g_{ab}) \\
&= g_{AB} - rD
\end{aligned} \tag{2.5}$$

where $D = g_{AB}g_{ab} - g_{Ab}g_{aB}$. The parameter D is commonly known as linkage disequilibrium. However, because it can exist for pairs of loci on different chromosomes that are not linked at all, a more accurate but more cumbersome term is gametic phase imbalance. Because the term linkage disequilibrium dominates the literature, we will use it throughout the book, but with the caveat that it can be applied to unlinked loci. Similarly, the other three gamete types can be obtained from Eq. (2.1) as:

$$\begin{aligned}
g'_{Ab} &= 1 \times g_{Ab}^2 + \tfrac{1}{2}(2g_{AB}g_{Ab}) + \tfrac{1}{2}(2g_{Ab}g_{ab}) + \tfrac{1}{2}(1 - r)(2g_{Ab}g_{aB}) + \tfrac{1}{2}r(2g_{AB}g_{ab}) = g_{Ab} + rD \\
g'_{aB} &= 1 \times g_{aB}^2 + \tfrac{1}{2}(2g_{AB}g_{aB}) + \tfrac{1}{2}(2g_{aB}g_{ab}) + \tfrac{1}{2}(1 - r)(2g_{Ab}g_{aB}) + \tfrac{1}{2}r(2g_{AB}g_{ab}) = g_{aB} + rD \\
g'_{ab} &= 1 \times g_{ab}^2 + \tfrac{1}{2}(2g_{Ab}g_{ab}) + \tfrac{1}{2}(2g_{aB}g_{ab}) + \tfrac{1}{2}(1 - r)(2g_{AB}g_{ab}) + \tfrac{1}{2}r(2g_{Ab}g_{aB}) = g_{ab} - rD
\end{aligned} \tag{2.6}$$
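Equations (2.5) and (2.6) can be checked against a direct enumeration of matings and meioses. The sketch below is illustrative only (function names and gamete frequencies are assumptions): one function applies the closed forms g ∓ rD, the other sums over all ordered pairs of uniting gametes and applies the transition probabilities ½(1 − r) and ½r.

```python
from itertools import product

GAMETES = ("AB", "Ab", "aB", "ab")


def next_gene_pool_closed_form(g, r):
    """Next-generation gamete frequencies from Eqs. (2.5) and (2.6)."""
    d = g["AB"] * g["ab"] - g["Ab"] * g["aB"]  # linkage disequilibrium D
    return {"AB": g["AB"] - r * d, "Ab": g["Ab"] + r * d,
            "aB": g["aB"] + r * d, "ab": g["ab"] - r * d}


def next_gene_pool_by_enumeration(g, r):
    """Same quantity obtained by summing over every pair of uniting gametes."""
    nxt = dict.fromkeys(GAMETES, 0.0)
    # Ordered pairs (x, y) each occur with probability g[x] * g[y]; this is
    # equivalent to unordered genotypes with the factor of 2 for heterozygotes.
    for x, y in product(GAMETES, repeat=2):
        pair_freq = g[x] * g[y]
        # Non-recombinant gametes
        nxt[x] += pair_freq * 0.5 * (1.0 - r)
        nxt[y] += pair_freq * 0.5 * (1.0 - r)
        # Recombinant gametes: locus-2 alleles exchanged
        nxt[x[0] + y[1]] += pair_freq * 0.5 * r
        nxt[y[0] + x[1]] += pair_freq * 0.5 * r
    return nxt


g0 = {"AB": 0.4, "Ab": 0.1, "aB": 0.2, "ab": 0.3}  # hypothetical gene pool
print(next_gene_pool_closed_form(g0, 0.5))
print(next_gene_pool_by_enumeration(g0, 0.5))
```

The two routes agree for any r between 0 and ½, and the single-locus allele frequencies (for example gAB + gAb) are the same before and after, so only the gamete frequencies change.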

At this point, we can now address our primary question: is evolution occurring? Recall our definition from Chapter 1 of evolution as a change in the frequencies of various types of genes or gene combinations in the gene pool. As is evident from Eqs. (2.5) and (2.6), as long as $r > 0$ (that is, some recombination is occurring) and $D \neq 0$ (there is some linkage disequilibrium), then $g_{xy} \neq g'_{xy}$: evolution is occurring! Thus, a seemingly minor change from one to two loci results in a major qualitative change of population-level attributes.

No evolution occurs in this model if $r = 0$. In that case, the two-locus model is equivalent to a single-locus model with four possible alleles. Thus, some multi-locus systems can be treated as if they were a single locus as long as there is no recombination. On the other hand, recombination can sometimes occur within a single gene. As mentioned in Chapter 1, the genetic variation within a 9.7 kb segment of the Lipoprotein Lipase (LPL) gene in humans was shaped in part by 30 recombination and gene conversion events (Templeton et al. 2000a). Thus, in some cases, the evolutionary potential created by recombination must be considered even at the single-locus level. In the case of LPL, we are looking at two or more different polymorphic nucleotide sites within the same gene and


not, technically speaking, at different loci. However, the qualitative evolutionary potential is still the same as long as the polymorphic sites under examination can recombine, regardless of whether those sites are single nucleotides within a gene or between traditional loci.

No evolution also occurs in this model if $D = 0$. D will equal zero when the two-locus gamete frequencies are the product of their respective single-locus allele frequencies. To see this, let $p_A$ be the frequency of the A allele at locus 1 and $p_B$ the frequency of the B allele at locus 2. These single-locus allele frequencies are related to the two-locus gamete frequencies by:

$$p_A = g_{AB} + g_{Ab}, \qquad p_B = g_{AB} + g_{aB} \tag{2.7}$$

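Now, consider the product of the A and B allele frequencies:

$$\begin{aligned}
p_A p_B &= (g_{AB} + g_{Ab})(g_{AB} + g_{aB}) \\
&= g_{AB}^2 + g_{AB}g_{aB} + g_{AB}g_{Ab} + g_{Ab}g_{aB} \\
&= g_{AB}(g_{AB} + g_{aB} + g_{Ab}) + g_{Ab}g_{aB} \\
&= g_{AB}(1 - g_{ab}) + g_{Ab}g_{aB} \\
&= g_{AB} - g_{AB}g_{ab} + g_{Ab}g_{aB} \\
&= g_{AB} - D
\end{aligned} \tag{2.8}$$

Solving Eq. (2.8) for D yields

$$D = g_{AB} - p_A p_B \tag{2.9}$$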

and similar equations can be derived in terms of the other three gamete frequencies. Equation (2.9) suggests another biological interpretation of D; it is the deviation of the two-locus gamete frequencies from the product of the respective single-locus allele frequencies. Equation (2.9) also makes it clear that D will be 0 when the two-locus gamete frequency is given by the product of the respective single-locus allele frequencies. This can also be seen by evaluating the original formula for linkage disequilibrium under the assumption that the two-locus gamete frequencies are given by the product of their respective allele frequencies: $D = g_{AB}g_{ab} - g_{Ab}g_{aB} = (p_A p_B)(p_a p_b) - (p_A p_b)(p_a p_B) = p_A p_B p_a p_b - p_A p_b p_a p_B = 0$. The two-locus gamete frequencies will be products of the single-locus allele frequencies when knowing what allele is present at one locus in a gamete does not alter the probabilities of the alleles at the second locus, that is, the probabilities of the alleles at the second locus are simply their respective allele frequencies regardless of what allele occurs at the first locus. When $D \neq 0$, knowing which allele a gamete bears at one locus does influence the probabilities of the alleles at the second locus. In statistical terms, $D = 0$ means that there is no association in the population between variation at locus 1 with variation at locus 2. When $D = 0$, Eqs. (2.5) and (2.6) show that the gamete frequencies (and hence the genotype frequencies) are constant, just as they were in the single-locus Hardy–Weinberg model. Thus, when $D = 0$, the population is at a non-evolving equilibrium, given the other standard Hardy–Weinberg assumptions.

We can now understand why D is called disequilibrium. When D is not zero and there is recombination, the population is evolving and is not at a two-locus Hardy–Weinberg equilibrium. The larger D is in magnitude, the greater this deviation from two-locus equilibrium.
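The two expressions for D, the original definition and Eq. (2.9), can be compared numerically. The sketch below uses hypothetical gamete frequencies and illustrative function names; it also builds a gene pool in which gamete frequencies are products of allele frequencies to show that D is then zero.

```python
def allele_freqs(g):
    """Single-locus allele frequencies (p_A, p_B) from Eq. (2.7)."""
    return g["AB"] + g["Ab"], g["AB"] + g["aB"]


def d_from_gametes(g):
    """Linkage disequilibrium as g_AB*g_ab - g_Ab*g_aB."""
    return g["AB"] * g["ab"] - g["Ab"] * g["aB"]


def d_from_alleles(g):
    """Linkage disequilibrium as g_AB - p_A*p_B, Eq. (2.9)."""
    p_a_big, p_b_big = allele_freqs(g)
    return g["AB"] - p_a_big * p_b_big


def equilibrium_gene_pool(p_a_big, p_b_big):
    """Gamete frequencies given by products of allele frequencies."""
    return {"AB": p_a_big * p_b_big, "Ab": p_a_big * (1 - p_b_big),
            "aB": (1 - p_a_big) * p_b_big, "ab": (1 - p_a_big) * (1 - p_b_big)}


g = {"AB": 0.4, "Ab": 0.1, "aB": 0.2, "ab": 0.3}  # hypothetical gene pool
print(d_from_gametes(g), d_from_alleles(g))
print(d_from_gametes(equilibrium_gene_pool(*allele_freqs(g))))
```

The equilibrium gene pool has the same allele frequencies as the original one but no association between loci.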
Evolution occurs when $r > 0$ and $D \neq 0$, and we now examine the evolutionary process induced by linkage disequilibrium in more detail. From Figure 2.4 or Eqs. (2.5) and (2.6), we see that linkage disequilibrium in the original gene pool ($g_{AB}g_{ab} - g_{Ab}g_{aB}$) influences the next generation’s gene pool. Similarly, the linkage disequilibrium in the next generation’s gene pool will influence the


subsequent generation’s gene pool. The linkage disequilibrium in the next generation’s gene pool in Figure 2.4 is:

$$D_1 = g'_{AB}g'_{ab} - g'_{aB}g'_{Ab} = (g_{AB} - rD)(g_{ab} - rD) - (g_{aB} + rD)(g_{Ab} + rD) = D(1 - r) \tag{2.10}$$

Using Eq. (2.10) recursively, we can see that $D_2$ (the linkage disequilibrium in the gene pool two generations removed from the original gene pool) is $D(1 - r)^2$. In general, if we start with some initial linkage disequilibrium, say $D_0$, then $D_t$, the linkage disequilibrium after t generations of random mating, is:

$$D_t = D_0 (1 - r)^t \tag{2.11}$$
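Equation (2.11) is easy to evaluate numerically. The following sketch (function names are illustrative assumptions) computes the disequilibrium remaining after t generations and, by simple iteration, the number of generations needed for the remaining fraction of the initial disequilibrium to fall to a given level.

```python
def ld_after(d0, r, t):
    """Linkage disequilibrium after t generations of random mating, Eq. (2.11)."""
    return d0 * (1.0 - r) ** t


def generations_to_fraction(r, fraction):
    """Smallest whole number of generations t with (1 - r)**t <= fraction."""
    if not 0.0 < r <= 0.5:
        raise ValueError("r must satisfy 0 < r <= 0.5")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must satisfy 0 < fraction <= 1")
    t = 0
    remaining = 1.0
    while remaining > fraction:
        remaining *= 1.0 - r
        t += 1
    return t


# Unlinked loci (r = 1/2): fraction of D0 left after five generations
fraction_left = ld_after(1.0, 0.5, 5)
print(fraction_left)
# Tightly linked loci (r = 0.01): generations needed to reach that same fraction
print(generations_to_fraction(0.01, fraction_left))
```

Iterating rather than taking logarithms avoids rounding surprises when the target fraction is hit exactly, as it is for r = ½.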

Equation (2.11) reveals that the evolution induced by linkage disequilibrium is both gradual and directional, as illustrated in Figure 2.5. Because $0 < r \le ½$, the quantity $(1 - r)^t$ goes to zero as the number of generations (t) gets large. Hence, the direction of evolution is to dissipate linkage disequilibrium and to move closer and closer to a two-locus Hardy–Weinberg equilibrium in which the two-locus gamete frequencies are the products of the constituent single-locus allele frequencies. The approach to this equilibrium is gradual, proceeding at an exponential rate determined by $(1 - r)$. The larger the value of r (i.e. the more recombination occurs), the faster is the approach to equilibrium with $D = 0$ (Figure 2.5). Note, however, that even for loci on different chromosomes that sort independently ($r = ½$), equilibrium is not attained instantly (Figure 2.5), in great contrast to the single-locus Hardy–Weinberg model. However, the approach to linkage equilibrium is quite rapid with unlinked loci. For example, after just five generations of random mating, only a little more than 3% of the original disequilibrium for unlinked loci remains (from Eq. 2.11). However, for linked loci with r small, the linkage disequilibrium can persist and affect gene pool evolution for many, many generations. For example, for two loci with $r = 0.01$ (1% recombination), it takes 345 generations to

[Figure 2.5: plot of $D_t$ (vertical axis) against t, the number of generations (horizontal axis, up to 50), with one curve for each of r = 0.001, 0.01, 0.05, 0.1, and 0.5.]

Figure 2.5 The decay of linkage disequilibrium with time in generations as a function of different recombination rates, r, starting with an initial value of $D_0$ = 0.25.

reduce the initial linkage disequilibrium to the level achieved in just five generations for unlinked loci. During this approach to linkage equilibrium, the two single-locus systems that contribute to the two-locus genetic architecture will be at single-locus Hardy–Weinberg in just one generation of random mating, but the multi-locus system will be in disequilibrium and evolving (given initial linkage disequilibrium). Therefore, recombination and linkage disequilibrium are sufficient conditions for evolution in a multi-locus system.

One disadvantage of D as a measure of linkage disequilibrium is that it does not have a fixed range, and the range is strongly influenced by the single-locus allele frequencies. This makes comparisons across pairs of loci difficult to interpret. For example, suppose one pair of loci has a D of 0.2 and another pair a D of 0.1. If both pairs have identical single-locus allele frequencies, we can safely say that the first pair shows a greater degree of association of alleles across loci than the second pair. However, if the single-locus allele frequencies are quite different between these two pairs of loci, even this simple inference is not possible. To avoid this problem, consider an alternative measure of linkage disequilibrium known as the normalized linkage disequilibrium, D′, which is the linkage disequilibrium divided by its theoretical maximum absolute value. Because two-locus gamete frequencies cannot be negative nor be greater than the corresponding single-locus allele frequencies (Eq. 2.7), we have that

$$0 \le g_{AB} \le \min(p_A,\; p_B) \tag{2.12}$$

Solving Eq. (2.9) for $g_{AB}$ and substituting the result into inequality (2.12) yields

$$-p_A p_B \le D \le \min(p_A - p_A p_B,\; p_B - p_A p_B) = \min\big(p_A(1 - p_B),\; p_B(1 - p_A)\big)$$

or

$$-p_A p_B \le D \le \min(p_A p_b,\; p_a p_B) \tag{2.13}$$

As noted above, equations similar to 2.9 can be derived with respect to the other gamete frequencies, such as $g_{ab}$, so D also satisfies the inequality

$$-p_a p_b \le D \le \min(p_A p_b,\; p_a p_B) \tag{2.14}$$

Thus, we have

$$D' = \begin{cases} \dfrac{D}{\min(p_A p_B,\; p_a p_b)}, & D < 0 \\[2ex] \dfrac{D}{\min(p_A p_b,\; p_a p_B)}, & D > 0 \end{cases} \tag{2.15}$$

D′ has a range of values from −1 to +1 for all pairs of loci irrespective of the allele frequencies at the component loci (although D itself is still dependent on allele frequencies). Both D and D′ can be either positive or negative, but the sign depends upon the allele labels, an arbitrary decision with no biological significance. An easy way of avoiding this problem is to use the absolute values of these linkage disequilibrium measures: |D| or |D′|. Another commonly used alternative is:

$$r^2 = \frac{D^2}{p_A\, p_a\, p_B\, p_b} \tag{2.16}$$

The r² measure always falls in the range of 0 to 1. However, unlike |D′|, the actual range of r² is constrained by the single-locus allele frequencies and in general does not reach 1. Consequently, comparisons of r² across different pairs of loci are still affected by unequal allele frequencies.
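The three scalar measures defined in Eqs. (2.15) and (2.16) can be computed directly from the four gamete frequencies. The sketch below is only an illustration: the function name `ld_measures`, the argument order, and the convention of returning 0 when a measure is undefined (a monomorphic locus) are assumptions made for this example, as are the gamete frequencies in the usage lines.

```python
def ld_measures(g_AB: float, g_Ab: float, g_aB: float, g_ab: float):
    """Return (D, D_prime, r_squared) for one pair of biallelic loci."""
    # Single-locus allele frequencies are the marginal sums of gamete frequencies.
    p_A = g_AB + g_Ab
    p_a = g_aB + g_ab
    p_B = g_AB + g_aB
    p_b = g_Ab + g_ab

    # Linkage disequilibrium: coupling product minus repulsion product.
    D = g_AB * g_ab - g_Ab * g_aB

    # Normalizing constant from Eq. (2.15); it depends on the sign of D.
    if D < 0:
        d_max = min(p_A * p_B, p_a * p_b)
    else:
        d_max = min(p_A * p_b, p_a * p_B)
    # Assumption for this sketch: report 0 when the measure is undefined.
    D_prime = D / d_max if d_max > 0 else 0.0

    # Eq. (2.16).
    denom = p_A * p_a * p_B * p_b
    r_squared = D * D / denom if denom > 0 else 0.0
    return D, D_prime, r_squared


if __name__ == "__main__":
    # Illustrative gene pool in which one of the four gamete types is absent.
    D, D_prime, r_squared = ld_measures(g_AB=0.5, g_Ab=0.0, g_aB=0.49, g_ab=0.01)
    print(f"D = {D:.4f}, D' = {D_prime:.2f}, r^2 = {r_squared:.4f}")
```

In this illustrative gene pool D and r² are both small while D′ is at its maximum, which shows how differently the three scalars can summarize the same set of gamete frequencies.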


All of the measures of linkage disequilibrium given above have one common feature: they take a vector of gamete frequencies, {$g_{AB}$, $g_{ab}$, $g_{Ab}$, $g_{aB}$}, and convert it into a scalar, that is, a single number. Inevitably, there is a loss of information in going from the vector of gamete frequencies to a scalar number. A class of vector measures of linkage disequilibrium that avoids this loss of information is the custom correlation coefficient (CCC) that itself is a vector whose elements measure the degree of association between each possible allelic combination. The general form of a CCC element for the allelic pair (i,j) is (Climer et al. 2014b):

$$CCC_{ij} = g_{ij}\, ff_i\, ff_j \tag{2.17}$$

where $g_{ij}$ is the frequency of the gamete bearing allele i at the first locus and allele j at the second locus in the gene pool, $ff_i$ is a frequency factor correction for the frequency of allele i, and $ff_j$ is a frequency factor correction for the frequency of allele j. Various frequency factor corrections can be used depending upon the desired use of this vector measure of linkage disequilibrium, so CCC is a class of association measures. Moreover, note from Eq. (2.17) that CCC measures the association in the gene pool between two alleles at different loci or sites, whereas all the scalars measure the association in the gene pool between two loci or sites.

Sources of Linkage Disequilibrium

Given that some initial linkage disequilibrium is necessary before recombination can act as an evolutionary force in a random mating population, it is important to understand what factors can create an initial disequilibrium. Many factors can create linkage disequilibrium, including:

• mutation,
• non-random mating,
• finite population size,
• gene flow, and
• natural selection.

Note that this list of factors that can generate linkage disequilibrium corresponds to the very same factors that are assumed not to occur in the simple Hardy–Weinberg model (Table 2.1). All of these factors will be considered in this book, but, for now, we focus only upon the first: mutation. The impact of mutation is most easily seen by D′ or CCC when the frequency factor is set as the reciprocal of the allele frequency. With these frequency factors, the elements in the CCC vector are simply a gamete frequency divided by the product of frequencies of the two alleles carried by that gamete. As pointed out earlier, this product is the expected gamete frequency when there is no linkage disequilibrium. Hence, the elements in this special case of CCC are a direct measure of the extent of deviation from the no linkage disequilibrium case. The utility of D′ and CCC for measuring linkage disequilibrium induced by mutation is shown in Table 2.4. Starting with a population that has genetic variation only at the A locus, with two alleles A and a, each with frequency 0.5, and no variation at the nearby B locus, with all copies of the B gene being of allelic state B ($p_B$ = 1), a mutation at the B locus is assumed to occur resulting in a new allele, b. This initial mutational event must occur either in an A-bearing gamete or in an a-bearing gamete, and, in Table 2.4, we assume that this initial mutation occurred in an a-bearing gamete under the infinite sites model, thereby producing a third gamete type, ab. As will be shown in Chapters 4 and 5, the frequency of this gamete type in the early generations after mutation is strongly influenced by random processes even in a large population. Table 2.4 supposes that the frequency of the ab gamete, $g_{ab}$, is 0.01 shortly after mutation and before any recombination has

Table 2.4 Gamete frequencies (g’s) and the values of several measures of linkage disequilibrium (D, D′, r², and CCC).

| t | $g_{AB}$ | $g_{aB}$ | $g_{Ab}$ | $g_{ab}$ | D | D′ | r² | CCC element AB | CCC element aB | CCC element Ab | CCC element ab |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.5 | 0.49 | 0 | 0.01 | 0.005 | 1 | 0.010 | 1.0101 | 0.9898 | 0 | 2 |
| 2 | 0.5 | 0.49 | 0.0002 | 0.0095 | 0.005 | 0.90 | 0.008 | 1.0086 | 0.9903 | 0.0975 | 1.9025 |
| 5 | 0.5 | 0.49 | 0.0011 | 0.0089 | 0.004 | 0.77 | 0.006 | 1.0067 | 0.9911 | 0.2262 | 1.7738 |
| 10 | 0.5 | 0.49 | 0.0020 | 0.0080 | 0.003 | 0.60 | 0.004 | 1.0040 | 0.9919 | 0.4012 | 1.5987 |

Note: Time (t) is measured in generations after a new mutation creates a new allele, b, at the B locus on a background of the a allele at the A locus, creating the initial array of gamete frequencies given at t = 0. A recombination rate of 0.05 was assumed to calculate the subsequent linkage disequilibrium measures.

occurred (generation 0). We also assume that the frequency of aB gametes has been reduced to 0.49 by this generation, so that all four gamete frequencies still sum to 1. Note from Table 2.4 that the fourth potential type of gamete, Ab, does not exist in this gene pool because, by assumption, the B to b mutation occurred on an a-bearing gamete, the infinite sites model assumes no more mutations to b can occur, and there has been no opportunity for recombination yet. Note that both D and r² are very small, but D′ is 1, indicating that linkage disequilibrium is at its maximal magnitude after mutation. This is true in generation 0 regardless of whether the b mutation occurs on an a or A background. Hence, the very act of mutation creates maximal linkage disequilibrium, so multilocus genetic systems always begin with linkage disequilibrium at the time of mutation.

The CCC elements give even more insight into the impact of mutation on this two-locus system. With the chosen frequency factors, a CCC element value of 1 indicates that the gamete is at its equilibrium value of no linkage disequilibrium, a value greater than one indicates that the two alleles are found more frequently together in the gene pool than expected by chance alone, and values less than one indicate that the two alleles are found together less frequently than by chance alone. Note that the elements associated with the two gamete types that existed prior to this mutation, AB and aB, both have CCC elements close to one, indicating that there is almost no association between the alleles in these two pairs of gamete types. In contrast, the gamete type bearing the mutant, ab, has a CCC element of 2. Hence, the association between the a and b alleles that is created by the act of mutation is the sole association driving D′ = 1. Mutation does not create associations between all alleles, but only between the new mutant allele and the pre-existing alleles at other loci that happened to be in the same gamete at the time of mutation. Hence, CCC makes it clear that the linkage disequilibrium is marking the historical genetic background of the origin of the new mutation.

Using Eqs. (2.5) and (2.6) recursively with a recombination rate of 0.05, Table 2.4 shows that this initial linkage disequilibrium decays gradually with time (that is, it goes closer to 0 for D, D′, and r² and goes closer to 1 for the CCC elements). Note also that CCC not only marks this historical origin of the mutation by a value much greater than 1 but also marks the new recombinant gamete by a CCC value much less than 1. Once again, there is much more information in CCC than in D′. If recombination were even rarer, this decay would occur more slowly. As a consequence, the linkage disequilibrium created by the act of mutation and subsequent recombination can sometimes persist for long periods of time after the original mutational event. This observation leads to an important conclusion about evolution: the current state of a population’s gene pool and its ongoing evolution is influenced by its past history. The past cannot be ignored in understanding the present and predicting the future for biological systems subject to evolutionary change.
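The recursive calculation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used to build Table 2.4: the one-generation update is taken from the form shown in Eq. (2.10), in which the AB and ab gametes each lose rD and the Ab and aB gametes each gain rD, and the function names `next_generation` and `ccc_reciprocal` are hypothetical. The CCC elements use the reciprocal allele frequencies as frequency factors, so each element is a gamete frequency divided by the product of its two allele frequencies; the sketch assumes both loci are polymorphic so that no division by zero occurs.

```python
def next_generation(g: dict, r: float) -> dict:
    """One generation of random mating with recombination rate r (form of Eq. 2.10)."""
    D = g["AB"] * g["ab"] - g["Ab"] * g["aB"]
    return {
        "AB": g["AB"] - r * D,
        "ab": g["ab"] - r * D,
        "Ab": g["Ab"] + r * D,
        "aB": g["aB"] + r * D,
    }


def ccc_reciprocal(g: dict) -> dict:
    """CCC elements with frequency factors set to reciprocal allele frequencies."""
    p = {
        "A": g["AB"] + g["Ab"],
        "a": g["aB"] + g["ab"],
        "B": g["AB"] + g["aB"],
        "b": g["Ab"] + g["ab"],
    }
    # Each gamete label is two characters: first-locus allele, second-locus allele.
    return {gamete: freq / (p[gamete[0]] * p[gamete[1]]) for gamete, freq in g.items()}


if __name__ == "__main__":
    # Gene pool just after the b mutation arises on an a background.
    g = {"AB": 0.5, "aB": 0.49, "Ab": 0.0, "ab": 0.01}
    r = 0.05
    for t in range(11):
        if t in (0, 2, 5, 10):
            ccc = ccc_reciprocal(g)
            print(f"t = {t:2d}: CCC(ab) = {ccc['ab']:.4f}, CCC(Ab) = {ccc['Ab']:.4f}")
        g = next_generation(g, r)
```

Tracking the ab and Ab elements side by side shows the mutational background (ab) declining toward 1 from above while the recombinant gamete (Ab) rises toward 1 from below.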

Some Implications of the Impact of Evolutionary History upon Disequilibrium

The impact of the past as measured by $D_0$ in Eq. (2.11) upon the present in multi-locus systems can be either a boon or a bane, depending upon the question being addressed. One boon is that patterns of linkage disequilibrium contain much information about the recombinational history of different regions of the genome. Because linkage disequilibrium is sensitive to the accumulation of recombination events over generations (Eq. 2.11, Figure 2.5), we expect that genomic regions with high rates of recombination will display little disequilibrium within such regions, while genomic regions with no to little recombination will display high levels of linkage disequilibrium because the initial conditions after a mutation event are always maximal disequilibrium, as shown by Table 2.4. Local regions in the genome that show extensive linkage disequilibrium are called LD blocks or haplotype blocks. In genomes that have recombinational hotspots, we expect LD blocks to be found primarily in the regions outside or between the hotspots. An example of this is given in Figure 2.6, which shows the pairwise linkage disequilibrium observed in the SNP markers found in the 9.7 kb region of the LPL gene discussed in Chapter 1 (Templeton et al. 2000a). Recall from Chapter 1 and Figure 1.6 that a recombination hotspot exists in the 6th intron in this region. As expected, we see much significant linkage disequilibrium in the two regions flanking this hotspot but little within the hotspot.

One test to quantify the amount of recombination in a genomic region is the four-gamete test. As shown in Table 2.4, when a mutation creates a new SNP, there are originally only three gamete types in disequilibrium between the new mutant and previously existing biallelic SNPs. Recombination is needed to produce the fourth gamete type, as shown in Table 2.4, but only under the infinite sites model of mutation.
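The logic of the four-gamete test can be sketched as follows. This is a minimal illustration, not a published implementation: haplotypes are assumed to be equal-length strings over the two characters "0" and "1" (one character per biallelic site), and the function name `four_gamete_pairs` and the toy data are hypothetical. A flagged pair of sites implies a recombination event only if the infinite sites model holds.

```python
from itertools import combinations


def four_gamete_pairs(haplotypes):
    """Return the pairs of site indices at which all four gamete types are present.

    Assumes every site is biallelic and coded "0" or "1", so a pair of sites
    can show at most four distinct two-site gametes.
    """
    n_sites = len(haplotypes[0])
    flagged = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            flagged.append((i, j))
    return flagged


if __name__ == "__main__":
    # Toy sample of four haplotypes scored at three sites (hypothetical data).
    sample = ["000", "010", "011", "101"]
    print(four_gamete_pairs(sample))
```

In this toy sample only the second and third sites show all four gamete types, so only that pair would be flagged.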
If mutation can create homoplasy, the b allele shown in Table 2.4 could have arisen on the A background by a second mutational event, thereby creating the fourth gamete type without recombination. For example, Templeton et al. (2000a) estimated the number of recombination events in human mtDNA using the four-gamete test, which inferred 413 recombination events uniformly distributed across the mtDNA genome. However, as mentioned in Chapter 1, there is no recombination in human mtDNA, so all of these inferred recombination events were artifacts of assuming the infinite sites model for the four-gamete test. The four-gamete test should never be used to estimate recombination rates given its great sensitivity to deviations from the infinite sites model. A potential bane of the strong impact of history upon linkage disequilibrium occurs within those regions with no to little recombination. Within such genomic regions, the pattern and magnitude of linkage disequilibrium are determined mostly by evolutionary history, both the history of mutational origins creating linkage disequilibrium on a specific genetic background and the history of subsequent mutational homoplasies reducing linkage disequilibrium without recombination. As a consequence, the pattern of linkage disequilibrium among markers within the LD block has no to little correlation with the physical positions of the markers (Templeton 1999a). In contrast, on a larger scale in the genome, we expect linkage disequilibrium to decrease with increasing physical distance (Eq. 2.11, Figure 2.5). This inverse relationship between linkage disequilibrium and physical distance within the genome is often used to map the position of genomic regions associated with phenotypic variation (this will be discussed in Chapter 10). 
However, if there are LD blocks within the mapped region, the physical position of the causative mutation (which is generally not used as a marker in the mapping study) may show little correlation with the physical

[Figure 2.6: triangular matrix of pairwise comparisons between SNP sites (axes Site i and Site j), with exons E4 to E9 marked along the diagonal. Each cell is classed as significant linkage disequilibrium, no significant linkage disequilibrium, or insufficient sample size to test significance.]

Figure 2.6 Linkage disequilibrium between all pairs of SNP markers in a 9.7 kb region of the LPL gene. The diagonal line below the matrix indicates the position of the SNPs in the exons and introns of the sequenced region. The thick gray line indicates the position of the recombination hotspot illustrated in Figure 1.6, and the gray polygon encloses the part of the linkage disequilibrium matrix bounded by this hotspot. Source: Templeton et al. (2000a). © 2000, Elsevier.

position of the marker SNP with the strongest phenotypic association due to evolutionary history dominating over physical positioning – a feature often not appreciated in genomic mapping studies. For example, Kopp et al. (2008) mapped an association with end-stage kidney disease in humans to a small region of Chromosome 22 that contained two genes, MYH9 and ApoL1. The strongest disease associations were found with SNPs in the MYH9 gene, which codes for a structural protein found in the kidney. Kopp et al. (2008) therefore concluded that MYH9 was the “major-effect risk gene” for end-stage kidney disease. However, Tzur et al. (2010), part of a group that had independently mapped the association into the same genomic region, examined the pattern of linkage disequilibrium in this region in more detail and used haplotypes and additional population studies. They concluded that two missense mutations in the ApoL1 gene were in fact the causative mutations, and subsequent studies in kidney cell tissue culture and animal models confirmed their conclusion (Anderson et al. 2015; Olabisi et al. 2016). The occurrence of two independent missense mutations in the ApoL1 gene that increased risk for end-stage kidney disease complicated the


pattern of association between the disease and single SNP markers, and it required the more detailed population genetic studies as executed by Tzur et al. (2010) to unravel the true risk associations in this genomic region. This illustrates that linkage disequilibrium can be a boon in mapping studies but a bane in identifying the causative mutations or genes.

Tzur et al. (2010) used both haplotypes and haplotype trees (to be discussed in Chapter 5) to help unravel the complex situation in the MYH9/ApoL1 region. This illustrates another potential boon of the strong impact of history upon linkage disequilibrium that occurs in those regions with no to little recombination. In such regions, extended haplotypes can persist for long periods of evolutionary time, and evolutionary trees of haplotypes can often be estimated. This can not only help in deciphering phenotypic associations (Chapter 10) but also allow population geneticists to get a more reliable view of the mutational history that led to the present-day genetic variation, and hence give us a window into the past. For example, one important mutation in human genetics is the sickle-cell mutation in the sixth codon of the autosomal locus that codes for the β chain of adult hemoglobin. We will look at the phenotypic and adaptive significance of this mutation later in this book. For now, we focus on the linkage disequilibrium patterns of this relatively new mutation in the human gene pool with some genetic variation in surrounding loci. Figure 2.7 shows the genetic state of some of these surrounding loci on chromosomes that contain the $\beta^S$ allele. As will be detailed later in this book, the $\beta^S$ allele only recently became common in specific, geographically restricted human populations and globally is a rare allele. In contrast, the restriction site polymorphisms (see Appendix A) at nearby loci and in non-coding regions between these loci are more widespread and common in human populations.
As will be discussed in Chapter 7, this pattern implies that the $\beta^S$ allele is more recent in origin than the other genetic variants shown in Figure 2.7. As expected for a relatively new mutation in a DNA region showing only low levels of recombination, the $\beta^S$ allele shows extensive linkage disequilibrium with these restriction site polymorphisms. However, the $\beta^S$ allele is found on not just one, but at least five distinct haplotype backgrounds as defined by multiple restriction site polymorphisms (Figure 2.7). This implies that

[Figure 2.7: map of the hemoglobin β chain region, drawn 5′ to 3′ on a scale of 60 to 0 kb, showing the loci ε, Gγ, Aγ, ψβ, δ, and β and the restriction sites scored. Below the map, the presence (+) or absence (−) of each restriction site is given for five $\beta^S$ haplotypes: Cameroon, Senegal, Benin, Bantu, and Indian.]

Figure 2.7 Multi-locus genetic backgrounds containing $\beta^S$ alleles at the hemoglobin β chain locus. Restriction site polymorphisms in or near several hemoglobin chain loci (β, ε, δ, Gγ, and Aγ) and the pseudogene ψβ are indicated, with “+” meaning that the indicated restriction enzyme cuts the site on that chromosomal type, and “−” meaning that it does not cut. Source: Data from Lapoumeroulie et al. (1992) and Oner et al. (1992).

the mutation at the sixth codon that defines the $\beta^S$ allele occurred multiple times in recent human evolution, at least four times in Africa and once in Asia (Lapoumeroulie et al. 1992; Oner et al. 1992). Hence, by looking at patterns of linkage disequilibrium, we can make inferences about the mutational history of this particular allele, including the existence of multiple hits and homoplasy at a single site (in this case, producing multiple origins of the $\beta^S$ allele). Haplotype analysis often allows us to distinguish identity-by-descent from identity-by-state, a critical distinction in many population genetic analyses. Haplotype trees augment this ability (Chapter 5). As will be shown in Chapter 7, techniques exist for extracting much historical information when linkage disequilibrium or haplotype trees are present, information that can be used to test a number of hypotheses about the history of the locus under study and of populations and the existence of violations of the infinite sites model of mutation.

As shown in this chapter, the Hardy–Weinberg law, a seemingly simple model, nevertheless leads to many important insights about the evolutionary process. This model played an important role in the establishment of Mendelian genetics and natural selection during the first half of the twentieth century. The two-locus version of this law and its implications are currently playing a critical role in medical and evolutionary genetics in the twenty-first century. The difference in the potential for evolutionary change between the one-locus versus two-locus versions shows that we must be cautious in generalizing inferences from our reductionistic models. It is therefore critical to examine what happens when some of the other assumptions of the original Hardy–Weinberg model are altered or relaxed. In the next chapter, we will focus upon one of these critical assumptions: system of mating.


3 Systems of Mating

As defined in Chapter 1, a deme is a collection of interbreeding individuals of the same species that live in sufficient proximity that they share a common system of mating – the rules by which pairs of gametes are chosen from the local gene pool to be united into a zygote. Sufficient proximity depends upon the geographical range of the group of individuals and their ability to disperse and interbreed across this range. These geographical factors will be dealt with in Chapters 6 and 7. Here, we simply note that, depending upon the geographical scale involved and the individuals’ dispersal and mating abilities, a deme may correspond to the population of the entire species or to a subpopulation restricted to a small local region within the species’ range. A deme is not defined by geography but rather by a shared system of mating. The Hardy–Weinberg model assumes one particular system of mating – random mating – but many other systems of mating exist. Moreover, as shown in Chapter 2, it is possible for different loci or complexes of loci within the same deme to have different systems of mating. It is therefore more accurate to say that a deme shares a common set of systems of mating. The purpose of this chapter is to investigate some alternatives to random mating and their evolutionary consequences.

Inbreeding

In its most basic sense, inbreeding is mating between biological relatives. Two individuals are related if among the ancestors of the first individual are one or more ancestors of the second individual. Because of shared common ancestors, the two individuals could share genes at a locus that are identical copies of a single ancestral gene (via premise 1; DNA can replicate). Such identical copies due to shared ancestry are said to be identical-by-descent, as pointed out in Chapter 1. In contrast, the same allele can arise more than once due to recurrent mutation (Chapter 1). Identical copies of a gene due to recurrent mutation from different ancestral genes are said to be identical-by-state, and genes that are identical-by-descent are also identical-by-state. Virtually all individuals within most species are related to all other individuals if you go far enough back in time. For example, computer simulations using reasonable assumptions about humanity’s demographic history indicate that all humans living today share at least one common ancestor who lived sometime between 55 CE (Common Era) and 1415 BCE (Before the Common Era) (Rohde et al. 2004). Thus, all humans are biological relatives if we could trace our ancestry back a few thousand years. In practice, we often know pedigree relationships only for a few generations into the past. Given our ignorance about long-term pedigrees, how do we decide who is a relative

Population Genetics and Microevolutionary Theory, Second Edition. Alan R. Templeton. © 2021 John Wiley & Sons, Inc. Published 2021 by John Wiley & Sons, Inc. Companion website: www.wiley.com/go/templeton/populationgenetics2


and who is not? The solution to this practical problem is to regard some particular generation or set of individuals as the reference population whose members are regarded as unrelated. We assume that we can ascertain the biological relatedness of any two individuals in the current population by going back to, but not beyond, the individuals in that reference population. By assumption, all the genes in this reference population are regarded as not being identical-by-descent. If two identical genes today are traced back to different genes in the reference population, this pair of genes is regarded as being identical-by-state and not by descent. The assumption that we can ignore common ancestors from the remote past is not only a practical necessity in pedigree analyses, but it is also defensible because of the phenomenon of ghost ancestors, individuals who are common genealogical ancestors of a focal individual or group of individuals, but who have contributed no genetic material to that individual or group (Gravel and Steel 2015). A ghost ancestor who is a genealogical ancestor of an entire population or species but has no existing genetic contribution to that population or species is called a superghost. Such ghosts appear because of three random processes. First, consider the probability that a randomly chosen autosomal nucleotide of a focal individual comes from a particular ancestor, as predicted from Mendel’s first law of segregation. This probability is ½ for a parent of the individual, ¼ for a grandparent, and $(1/2)^t$ for an ancestor t generations ago. This probability rapidly diminishes and becomes vanishingly small for large t. Second, nucleotides are not segregating independently, rather they are arranged into chromosomes, and within chromosomes, recombination breaks up an ancestral chromosome into smaller and smaller segments in a probabilistic manner as t increases (Hanson 1959).
The infinite sites model of mutation regards the genome as infinitely divisible, but this is not true, particularly when recombination is not uniform across the genome. As a result, we are always dealing with a finite number of ancestral chromosome segments, both in a focal individual and even in a group of individuals that share that common ancestor. Third, all populations are finite, and as we will see in Chapters 4 and 5, this results in a random sampling of the products of meiosis at the population level that inevitably causes the random loss of some ancestral genetic material, with that genetic material having low frequencies in the population facing the highest chance of loss. Putting all these random forces together ensures that the genetic contribution of a remote ancestor of an individual or even a population is likely to be lost. As a result, ghost and superghost ancestors rapidly accumulate in populations as t increases. Indeed, it is likely that the common genealogical ancestor of all of humanity in the simulations of Rohde et al. (2004) is a ghost ancestor for many people living today, and could even be a superghost. Ignoring remote generations becomes less and less important for increasing t in identifying genetic ancestors as opposed to genealogical ancestors. There are several alternative ways of measuring inbreeding within the basic concept of mating between relatives. Many of these alternatives are incompatible with one another because they focus on measuring different biological phenomena that are associated with matings between relatives. Unfortunately, all of these alternative ways of measuring “inbreeding” are typically called “inbreeding coefficients” in the population genetic literature. This lack of verbal distinctions between different biological concepts has resulted in confusion and misunderstanding.
Jacquard (1975) tried to clarify this confusion in an excellent article entitled “Inbreeding: one word, several meanings,” but the many meanings of the word “inbreeding coefficient” are still rarely specified in much of the population genetic literature. The responsibility for making the distinctions among the several distinct and mutually incompatible “inbreeding coefficients” therefore often falls upon the reader. Consequently, it is important to be knowledgeable of the more common concepts of inbreeding, which we now examine.

Definitions of Inbreeding

Pedigree Inbreeding

When two biological relatives mate, the resulting offspring could be homozygous for an allele through identity-by-descent. In other words, the gene at a particular autosomal locus being passed on by the father could be identical to the homologous gene being passed on by the mother because both genes are identical copies of a single piece of DNA found in a common ancestor. The amount of inbreeding in this case is measured by F (the first of many “inbreeding coefficients”). F is the probability that the offspring is homozygous due to identity-by-descent at a randomly chosen autosomal locus or nucleotide site. Offspring for whom F > 0 (that is, offspring with a finite chance of being homozygous at a locus through identity-by-descent) is said to be inbred. Because F is a probability, it can range in value from 0 (no chance for any identity-by-descent) to 1 (all autosomal loci are identical-by-descent with certainty). F can be calculated for an individual by applying Mendel’s first law of 50 : 50 segregation to the pedigree of that individual. As an example, consider the pedigree in Figure 3.1 which shows an offspring produced by a mating between two half-siblings. For pedigree data, the reference population is simply the set of individuals for which no further pedigree information exists. In Figure 3.1, the reference population consists of individual A and the two males with whom she mated. These three individuals are assumed to be unrelated (all are in the reference generation), and any alleles they carry, even if identical, are not considered to be identical-by-descent but rather to be identical-by-state. In Figure 3.1, there is only one shared ancestor (A) common to both the mother (C) and the father (B). Assuming that the common ancestor herself has no inbreeding in the pedigree sense, then her two alleles at an autosomal locus cannot be identical-by-descent by assumption and are indicated by A and a (they may be identical-by-state). 
The probability that the common ancestor (A) passes on the A allele to her son (B) is 1/2 from Mendel’s first law, and, likewise, the probability that she passes on the A allele to her daughter (C) is 1/2. Both the son (B) and daughter (C) also received an allele at this locus from their fathers, who are not common ancestors and cannot contribute to identity-by-descent. Therefore, the only way for the offspring (D) to be identical-by-descent for this locus is for both the father (B) and the mother (C) to pass on the allele they inherited from their common ancestor (A), and each of these gamete transmissions also has a probability of 1/2 under Mendel’s first law (Figure 3.1). Because the four segregation probabilities shown in Figure 3.1 are all independent, the probability that all four occurred as shown is $(1/2)^4$ = 1/16 = probability that individual (D) is homozygous by descent for allele A. The common ancestor (A) also had a second allele a, and the probability that individual (D) is homozygous by descent for allele a is likewise 1/16. Hence, the total probability of individual (D) being identical-by-descent at this locus is 1/16 + 1/16 = 1/8 since the event of D being AA is mutually exclusive from the event of D being aa. By definition, the pedigree inbreeding coefficient for individual (D) is therefore F = 1/8. The calculation of F can become much more difficult when there are many common ancestors and ways of being identical-by-descent and when the common ancestors themselves are inbred in the pedigree sense. However, the basic principles are the same: nothing more than Mendel’s first law is applied to the pedigree to calculate the pedigree inbreeding coefficient F. For example, consider the case of two full-siblings mating to produce an inbred offspring (Figure 3.2). In this case, the inbred offspring can be homozygous by descent for an allele from its grandmother or grandfather (Figure 3.2).
Since an individual can be homozygous by descent for an allele from one and only one of the common maternal/paternal ancestors, identity-by-descent for an allele from the grandmother is mutually exclusive from identity-by-descent for an allele from the grandfather. Hence, the total probability of identity-by-descent, regardless of which common ancestor provided the


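[Figure 3.1 diagram: the full pedigree (left) and the pedigree simplified by excluding individuals who cannot contribute to identity by descent (right). Each of the four transmissions of allele A (A to B, A to C, B to D, and C to D) has probability 1/2, so Probability(D = AA) = (1/2)^4 = 1/16 and Probability(D = AA or D = aa) = 1/16 + 1/16 = 1/8.]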

Figure 3.1 A mating between two half-siblings (individuals B and C) who share a common mother (individual A, who is heterozygous Aa) to produce an inbred offspring (individual D). The left side of the figure portrays the pedigree in the standard format of human genetics, where squares denote males, circles females, horizontal lines connecting a male and female denote a mating, and vertical lines coming off from the horizontal mating lines indicate the offspring. The right side of the figure shows how this pedigree is simplified for the purposes of calculating the inbreeding coefficient F by deleting all individuals from the pedigree who are not common ancestors of the offspring of interest (individual D in this case). Shading in the pedigree on the left indicates the deleted individuals. The Mendelian probabilities associated with transmitting the A allele are indicated in the simplified pedigree.

allele, is the sum of the identity probabilities associated with the grandmother and grandfather, each of which is 1/8 (Figure 3.1). Hence, F = 1/8 + 1/8 = 1/4 for the offspring of two full-siblings. Of course, some pedigrees have many more common ancestors and pathways of potential identity-by-descent, making the calculation of F more difficult than the simple examples shown in Figures 3.1 and 3.2. The algorithms used to make these calculations for more complicated pedigrees were worked out many centuries ago by the Roman Catholic Church. Dispensations had to be granted for incestuous marriages before the Church could recognize them. Therefore, priests needed to work out the degree of inbreeding that would occur in the offspring from such a marriage in order to distinguish degrees of consanguinity that are dispensable from those that are not (Cavalli-Sforza and Bodmer 1971). Today, many computer programs use these same algorithms to calculate F. It is critical to note that the pedigree inbreeding coefficient F is applied to a particular individual coming from a specified union with a specified pedigree. F is therefore an individual concept and not a population concept at all. Indeed, a single population often consists of individuals showing great variation in their F’s. For example, a captive herd of Speke’s gazelle (Gazella spekei) was established at the St. Louis Zoo between 1969 and 1972 from one male and three females imported from Africa (Templeton and Read 1983, 1984, 1994). Assuming that these four imported animals are unrelated (that is, the four founding animals constitute the reference population), their initial offspring would all have F = 0. However, because there was only one male in the original herd, the most distant relationship among captive-bred animals is that of a half-sibling (all the initial captive-bred offspring must share the same father).
As a consequence, once the initial founders had died or were too old to breed, the least inbred mating possible among the captive born animals


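[Figure 3.2 diagram: the full-sibling pedigree (two common ancestors, heterozygous Aa and A′a′; their offspring B and C; and the inbred offspring D) is simplified by splitting it into two mutually exclusive loops that can contribute to identity by descent, one through each common ancestor. Every transmission within a loop has probability 1/2. D is A′A′ (or a′a′) through one loop OR AA (or aa) through the other.]

Figure 3.2 Inbreeding associated with a mating of two full-siblings.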

[Figure 3.3 plot: inbreeding coefficient F (vertical axis, 0.00 to 0.35) for each animal ID (horizontal axis), grouped into males (M13, M20, M30, M31, M32, MB2) and females (F4, F7, F12, F14, F22, F36, F37, F39, F40, F41, FB1, FB2, FB5).]

Figure 3.3 The pedigree inbreeding coefficients for all individuals from a captive herd of Speke’s gazelle. Source: Data from Templeton and Read (1984).

would be between half-siblings, with F = 1/8 = 0.125 (Figure 3.1). Moreover, in the initial decade of captive breeding, some father–daughter matings and other highly consanguineous matings occurred as well, resulting in a herd by 1979 (then split between zoos in St. Louis and Texas) that consisted of 19 individuals with a broad spread of individual F’s ranging from 0 to 0.3125 (Figure 3.3). Recall from Chapter 2 that the system of mating used in the Hardy–Weinberg Law is a population concept applied to the level of a deme and to a particular locus. Random mating as a concept is meaningless for specific individuals within a deme. Figure 3.3 illustrates that F refers to individuals, not the deme. Hence, pedigree inbreeding (the one most people think of when they encounter the word “inbreeding”) does not – indeed, cannot – measure the system of mating of a deme. This means that F cannot be used to look for deviations from the Hardy–Weinberg assumption. However, this does not mean that pedigree inbreeding has no population genetic or evolutionary implications.
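The half-sibling and full-sibling calculations of Figures 3.1 and 3.2 can be reproduced with a short recursive sketch. It is only an illustration, not the algorithm used by any particular pedigree program. The function names (`kinship`, `inbreeding`, `make_order`), the labels for the unnamed founders, and the dictionary layout are all hypothetical. The sketch assumes that founders are unrelated and non-inbred (the reference population) and that every individual is listed after its parents.

```python
from fractions import Fraction

def make_order(pedigree):
    """Record listing order; assumes parents are listed before their offspring."""
    return {name: i for i, name in enumerate(pedigree)}

def kinship(pedigree, order, a, b):
    """Probability that a random allele from a and one from b are identical-by-descent."""
    if a is None or b is None:          # unknown parent of a founder: no shared ancestry
        return Fraction(0)
    if a == b:                          # two draws from the same individual
        return Fraction(1, 2) * (1 + inbreeding(pedigree, order, a))
    if order[a] < order[b]:             # always recurse through the later-listed individual
        a, b = b, a
    sire, dam = pedigree[a]
    return Fraction(1, 2) * (kinship(pedigree, order, sire, b)
                             + kinship(pedigree, order, dam, b))

def inbreeding(pedigree, order, x):
    """Pedigree inbreeding coefficient F of x = kinship of x's parents."""
    sire, dam = pedigree[x]
    return kinship(pedigree, order, sire, dam)

# Figure 3.1: half-siblings B and C share the mother A (S1, S2 are hypothetical sire labels).
half_sib = {"A": (None, None), "S1": (None, None), "S2": (None, None),
            "B": ("S1", "A"), "C": ("S2", "A"), "D": ("B", "C")}
# Figure 3.2: full-siblings B and C share both parents (GF, GM are hypothetical labels).
full_sib = {"GF": (None, None), "GM": (None, None),
            "B": ("GF", "GM"), "C": ("GF", "GM"), "D": ("B", "C")}

F_half = inbreeding(half_sib, make_order(half_sib), "D")   # 1/8
F_full = inbreeding(full_sib, make_order(full_sib), "D")   # 1/4
print(F_half, F_full)
```

Exact fractions are used so that the results can be compared directly with the 1/8 and 1/4 derived in the text.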


One of the most important evolutionary implications of pedigree inbreeding (F) is that it displays strong interactions with rare, recessive alleles and epistatic gene complexes. Consider, first, a model in which a recessive allele is lethal when homozygous. Let B = the sum over all loci of the probability that a gamete drawn from the gene pool bears a recessive lethal allele at a particular locus. Because B is a sum of probabilities of non-mutually exclusive events, B can be greater than one. Indeed, the simplest biological interpretation of B is that it is the average number of lethal alleles over all loci borne by a gamete in the gene pool. When pedigree inbreeding occurs, then BF = the rate of occurrence of both gametes bearing lethal alleles that are identical-by-descent, thereby resulting in the death of the inbred individual. Of course, an individual can die from many causes, and not just due to identity-by-descent for a lethal allele. The only way for an individual to live is (i) not to be identical-by-descent for a lethal allele AND (ii) not to die from something else, either genetic or environmental. Under the assumption that B is a small number, the number of times an inbred individual will be identical-by-descent for a lethal allele will follow a distribution known as the Poisson distribution (Appendix B). The only way for the individual not to die of identity-by-descent for a lethal gene is to have exactly 0 lethal genes that are identical-by-descent and therefore homozygous. This probability equals $e^{-BF}$ under the Poisson distribution. Let −A be the natural logarithm of the probability of not dying from any cause other than being homozygous for a lethal recessive allele that is identical-by-descent. Then, the probability of not dying from something else is $e^{-A}$. To be alive, both events must be true, so the probability of being alive is $e^{-BF} e^{-A} = e^{-(A + BF)}$ (see Appendix A for the attributes of probability measures).
Therefore, we have the expected mathematical relationship that:

$$
\ln\left[\text{Probability of an inbred individual with } F \text{ being alive}\right] = -A - BF \tag{3.1}
$$
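As a numerical illustration of Eq. (3.1), the sketch below multiplies the Poisson probability of carrying zero identical-by-descent lethal alleles, $e^{-BF}$, by the probability $e^{-A}$ of escaping all other causes of death. The function name is hypothetical. The values A = 0.42 and B = 3.75 are the one-year estimates reported for the Speke’s gazelle herd in Figure 3.4 and are used here only as convenient inputs.

```python
import math

def survival_probability(A, B, F):
    """Probability of being alive under the model behind Eq. (3.1)."""
    p_no_lethal_ibd = math.exp(-B * F)   # Poisson probability of zero IBD lethal alleles
    p_no_other_death = math.exp(-A)      # escape from every other cause of death
    return p_no_lethal_ibd * p_no_other_death

for F in (0.0, 0.125, 0.25):
    p = survival_probability(0.42, 3.75, F)
    # ln(survival) falls linearly in F with slope -B and intercept -A
    print(f"F = {F:.3f}  survival = {p:.3f}  ln(survival) = {math.log(p):.4f}")
```

Because the two events are treated as independent, their probabilities multiply, which is why the logarithm of viability is linear in F.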

Note that Eq. (3.1) predicts that viability (the probability of being alive at a given age) should decrease with increasing inbreeding (as measured by F). This is an example of inbreeding depression, the reduction of a beneficial trait (such as viability or birth weight) with increasing levels of pedigree inbreeding. Inbreeding depression does not always occur with pedigree inbreeding, nor is it necessarily associated with any of the other definitions of inbreeding. However, inbreeding depression is a common phenomenon in animals and plants (Ralls et al. 1988; Holsinger 1991; Nietlisbach et al. 2019), so we need to examine the application of Eq. (3.1). One complication of applying Eq. (3.1) is that any one individual is either dead or alive, so the realized probability for any one individual is either 0 or 1 regardless of F. However, Eq. (3.1) predicts a linear relationship between the natural logarithm of the probability of being alive and F, so, in the model, this probability can take on intermediate values between 0 and 1. Although F is defined for an individual, Eq. (3.1) cannot be meaningfully applied to an individual. We must therefore extend the concept of pedigree inbreeding up to the level of a deme before we can make use of Eq. (3.1). To illustrate how to do this, consider the 1979 population of Speke’s gazelle whose individual F’s are portrayed in Figure 3.3. As can be seen, several animals have identical levels of pedigree inbreeding: seven animals share an F = 0, five animals share an F = 0.125, and four share an F = 0.25. Although any one animal is either dead or alive at a given age, the proportion of animals alive at a given age in a cohort that shares a common level of pedigree inbreeding varies between 0 (everyone in the cohort is dead) and 1 (everyone in the cohort is alive). 
Hence, the probability of an inbred individual with a specific F being alive at a given age is estimated by the proportion of the cohort sharing a specific F that are alive at the given age. Complications can arise due to small sample sizes within certain cohorts, but small sample size corrections can be used to deal with these difficulties (Templeton and Read 1998). Equation (3.1) is now implemented by doing a regression (Appendix B) of the natural logarithm of the cohort viability at a given age against the various F’s

[Figure 3.4 plot: percent survival (vertical axis, 25 to 100) against inbreeding coefficient F (horizontal axis, 0 to 5/16 in steps of 1/16), with two fitted regression lines. 30-day survival: A = −ln(0.79) = 0.23, B = −slope = 2.62. One-year survival: A = −ln(0.66) = 0.42, B = −slope = 3.75.]

Figure 3.4 Inbreeding depression in a captive herd of Speke’s gazelle. Source: Modified from Templeton and Read (1984).

associated with different cohorts to estimate A and B. For example, for the Speke’s gazelle herd up to 1982, a regression of the natural logarithm of survivorship up to 30 days after birth upon F yields A = 0.23 and B = 2.62, and survivorship up to one year (the approximate age of sexual maturity in this species) yields A = 0.42 and B = 3.75 (Figure 3.4). This means that the average gamete from this population behaved as if it bore 3.75 alleles that would kill before one year of age any animal homozygous for such an allele. In general, death can arise from several other genetic causes under inbreeding besides homozygosity for a recessive, lethal allele. For example, homozygosity for an allele may lower viability but may not necessarily be absolutely lethal. Nevertheless, homozygosity for such deleterious alleles could reduce the average survivorship for a cohort of animals sharing a common F. Alternatively, some homozygous combinations of alleles at different loci may interact to reduce viability through epistasis. For example, knock-out (complete loss of function) mutations were induced for virtually all of the 6200 genes in the yeast (Saccharomyces cerevisiae) genome. Yeast can exist in a haploid phase that genetically mimics the state of F = 1 for every locus, so the effects of these knock-out mutants could be studied in the equivalent of a homozygous state. Given the compact nature of the yeast genome, it was anticipated that most of these knock-outs would have lethal consequences, that is, they would behave as recessive, lethal alleles. Surprisingly, more than 80% of these knock-out mutations were not lethal and seemed “nonessential” (Tong et al. 2001). However, when yeast strains were constructed that bore pairs of mutants from this “nonessential” class, extensive lethality emerged from their interactions (Tong et al. 2001, 2004). 
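The regression of log cohort viability on F that yields A and B can be sketched in a few lines. The cohort survival values below are synthetic: they are generated from Eq. (3.1) with A = 0.42 and B = 3.75 rather than taken from the gazelle records, so the fit simply recovers those inputs. The function name and the unweighted least-squares fit are illustrative assumptions, and the sketch omits the small-sample corrections cited in the text.

```python
import math

def fit_lethal_equivalents(F_values, survival):
    """Unweighted least-squares fit of ln(survival) = -A - B*F; returns (A, B)."""
    y = [math.log(s) for s in survival]
    n = len(F_values)
    mean_F = sum(F_values) / n
    mean_y = sum(y) / n
    sxy = sum((F - mean_F) * (yi - mean_y) for F, yi in zip(F_values, y))
    sxx = sum((F - mean_F) ** 2 for F in F_values)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_F
    return -intercept, -slope            # A = -intercept, B = -slope

cohort_F = [0.0, 0.125, 0.25]            # inbreeding levels shared by cohorts
# Synthetic cohort survival proportions, NOT observed data.
cohort_survival = [math.exp(-0.42 - 3.75 * F) for F in cohort_F]

A_hat, B_hat = fit_lethal_equivalents(cohort_F, cohort_survival)
print(f"A = {A_hat:.2f}, B = {B_hat:.2f}")
```

With real data each cohort proportion would carry sampling error, so a weighted fit or the cited corrections would be more appropriate than this unweighted version.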
Similarly, a detailed analysis of the genetic causes of the inbreeding depression found in the captive population of Speke’s gazelle revealed that epistasis between loci was a significant contributor to the observed B (Templeton and Read, 1984; Templeton, 2002a). The yeast experiments and the results obtained with the Speke’s gazelle make it clear that B should be regarded as the number of “lethal equivalents” rather than the number of actual lethal alleles. The phrase “lethal equivalents” emphasizes that we really do not know the genetic architecture underlying inbreeding depression from these regression


analyses, but lethal equivalents do allow us to measure the severity of inbreeding depression in a variety of populations using the standard reference model of Eq. (3.1). Because each diploid animal results from the union of two gametes and, by definition, the only animals that survive are those not homozygous for any lethal equivalents, a living animal is expected to bear about 2B lethal equivalents in heterozygous condition. In the original non-inbred population of Speke’s gazelles, the average number of lethal equivalents for one-year survivorship borne by the founding animals of this herd is therefore 7.5 lethal equivalents per animal, and many bird and mammalian species are bearers of multiple potentially lethal genetic diseases or gene combinations (Ralls et al. 1988; Nietlisbach et al. 2019). Indeed, there is a large potential for inbreeding depression and other deleterious genetic effects in most human populations when pedigree inbreeding does occur. For example, cousin matings represent only 0.05% of matings in the United States (Neel et al. 1949), but 18–24% of albinos and 27–53% of Tay-Sachs cases (a lethal genetic disease) in the United States come from cousin matings (both of these are autosomal recessive traits, with the recessive allele being rare). This same pattern is true for many other recessive genetic diseases. Hence, even small amounts of pedigree inbreeding in a population that is either randomly mating or even avoiding system-of-mating inbreeding (to be discussed next) can increase the incidence of some types of genetic disease by orders of magnitude in the pedigree-inbred subset of the population. At first glance, it may seem that the concept of pedigree inbreeding and F is useful only for those few species or populations such as humans and captive-bred populations for which pedigree information can be obtained. 
The vast majority of species do not have pedigree information available, but, in the genomics era, F can still be estimated even in the absence of pedigree information. The simplest of these genomic estimators of F are single-point estimates that are based on some measure of homozygosity at a large number of genetic markers (typically SNPs) scattered throughout the genome (Kardos et al. 2016). These single-point estimates tend to do poorly when compared to F’s from pedigreed populations or simulations (Gazal et al. 2014). This is not surprising because the single-point estimators ignore homoplasy even though our knowledge of mutagenesis at the molecular level indicates that homoplasy can be quite common at single sites (Chapter 1). Hence, identity-by-state is frequently not identity-by-descent for SNPs, although identity-by-descent is frequently assumed in programs to estimate F from SNP data. The solution to the problem of homoplasy was indicated in Chapter 2: use haplotypes or genomic regions that include many polymorphic sites to infer identity-by-descent for the entire haplotype or region even though individual sites within the region may display homoplasy (recall Figure 2.7 and the homoplasious sickle-cell allele). Multi-locus measures have been proposed for estimating F, with one of the most common measures being runs of homozygosity (ROH, segments of the genome that are homozygous for multiple polymorphic markers) that estimate F as the ratio of the physical length of the genome that is in ROHs to the total physical length of the genome. To minimize the danger of homoplasy, some minimal number of polymorphic markers or length is often set, although the lengths of ROHs themselves contain information about identity-by-descent because the distribution of the lengths should be strongly affected by the number of generations at which common ancestors lived (Hanson 1959). 
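The ROH estimator just described can be sketched for a single toy chromosome. Everything here is a simplifying assumption made for illustration: genotypes are coded as 0, 1, or 2 copies of one allele (so 1 means heterozygous), a run is broken by any heterozygous marker, and the function names and thresholds are hypothetical. Practical ROH software must also handle missing data, genotyping error, and marker density, none of which is modeled here.

```python
def find_roh(positions, genotypes, min_snps, min_length):
    """Return (start, end) positions of runs of consecutive homozygous markers."""
    segments = []
    start = last = None
    count = 0
    # A trailing heterozygous sentinel flushes the final run.
    for pos, g in list(zip(positions, genotypes)) + [(None, 1)]:
        if g != 1:                       # homozygous marker extends (or opens) a run
            if start is None:
                start, count = pos, 0
            count += 1
            last = pos
        else:                            # heterozygous marker closes any open run
            if (start is not None and count >= min_snps
                    and last - start >= min_length):
                segments.append((start, last))
            start = None
    return segments

def f_roh(segments, genome_length):
    """F estimated as the fraction of the genome's physical length lying in ROHs."""
    return sum(end - start for start, end in segments) / genome_length

# Toy data: ten markers spaced 100 units apart on a "genome" of length 1000.
positions = [0, 100, 200, 300, 400, 500, 600, 700, 800, 900]
genotypes = [0, 0, 0, 0, 1, 2, 2, 1, 0, 0]
segments = find_roh(positions, genotypes, min_snps=3, min_length=200)
print(segments, f_roh(segments, genome_length=1000))
```

Raising `min_snps` or `min_length` discards short runs, which are the ones most likely to reflect homoplasy or very remote common ancestors.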
Consequently, adjusting the minimal length of an ROH can be used to examine recent versus more remote common ancestors (Kardos et al. 2016). ROH estimators gave the best results in estimating F in the study of Gazal et al. (2014) that compared many methods, and marker-based methods often give better estimates of F than pedigree data (Kardos et al. 2016), most likely because of the incompleteness of most pedigrees. Although pedigree data typically do not exist for many of the populations studied by population geneticists, marker-based approaches can not only estimate F for an individual but can also


estimate kinship between individuals. There are many measures of kinship, but one common one is the coefficient of kinship between two individuals, that is, the probability that a randomly chosen allele at an autosomal locus or nucleotide site in one individual is identical-by-descent to a randomly chosen allele at the same locus or site in the second individual. The coefficient of kinship is easily calculated for a pair of individuals from pedigree data by creating a hypothetical offspring between the pair and then calculating the F of this hypothetical offspring using the same algorithms for calculating F in an inbred individual. Since this hypothetical offspring is not real, one can calculate the coefficient of kinship between two individuals of the same sex, a deceased individual and a living one, etc. For example, the coefficient of kinship between two half-siblings (either of the same sex or different) is 1/8 from Figure 3.1, and the coefficient of kinship between two full-siblings is 1/4 from Figure 3.2. Both single-site and multi-site marker estimators of kinship exist. The multi-site estimators are no longer ROHs, but rather are IBD segments (identity-by-descent segments), which are runs of identical markers or haplotypes found in the genomes of the two individuals being compared. Ramstetter et al. (2017) found that all methods did well in detecting first- and second-degree relatives in a population of 2485 individuals with pedigrees going back six generations, but only IBD segment methods could identify seventh-degree relatives within an accuracy of one degree of relatedness. Hence, it is now possible to get partial pedigree and parentage information even in wild populations. As we will see in later chapters, just having known parent/offspring and sibling relationships is useful in many population genetic studies. Genomic marker studies have also allowed inbreeding depression to be documented in wild populations (Kardos et al. 2016; Nietlisbach et al.
2019). In particular, the genomic simulations given in Nietlisbach et al. (2019) reveal that estimates of F based on ROHs yield the best, unbiased estimates of inbreeding depression. Huisman et al. (2016) reported stronger inbreeding depression in a wild population of red deer (Cervus elaphus) in Scotland using a genomic estimate of F versus a pedigree estimate of F. Part of this was due to a larger sample size for the genomic estimator, but even when the analysis was limited to the same individuals with both genomic and pedigree estimates, the inbreeding depression was stronger with the genomic-based F’s. This was probably due to incomplete pedigree information, particularly for common ancestors from several generations ago. Such results indicate that genomic-based estimates of F and kinship are not just substitutes for missing pedigree data, but actually produce more biologically accurate estimates of F and kinship than is usually possible with most pedigree data sets that usually extend only a few generations back in time. Because of inbreeding depression, “inbreeding” – regardless of the exact definition being used – is often viewed as something deleterious for a population. The idea that “inbreeding” is deleterious has raised many concerns for endangered species, as such species often are reduced to small sizes, which as we have seen leads to pedigree inbreeding. Studies on pedigree inbreeding depression, such as those performed for Speke’s gazelle, demonstrate that these concerns are real, and much of applied conservation genetics focuses on dealing with inbreeding in its various senses and consequences. However, is “inbreeding” always deleterious? The answer appears to be no. For example, many higher plants have extensive self-mating, the most extreme form of inbreeding, and this inbreeding can be adaptive under many conditions (Holsinger 1991). 
Moreover, inbreeding depression itself can evolve and can be significantly diminished in just a few generations (Templeton and Read 1984, 1998). To understand the ultimate reason why inbreeding is not always deleterious, we must turn our attention from inbreeding at the level of an individual to inbreeding at the level of a deme’s system of mating.

Inbreeding as a Deviation from Random Mating Expectations

To obtain a system of mating measure of inbreeding at the deme level, we must examine deviations from Hardy–Weinberg genotype frequencies that are due to non-random mating. First, recall the


Table 3.1 The multiplication of allele frequencies coupled with a deviation from the resulting products as measured by λ to yield zygotic genotypic frequencies under a system of mating that allows deviation from random mating.

| Female gametes (allele, frequency) | Male gamete A (frequency p) | Male gamete a (frequency q) | Marginal allele frequencies in the deme |
|---|---|---|---|
| A (p) | AA: $p^2 + \lambda$ | Aa: $pq - \lambda$ | $(p^2 + \lambda) + (pq - \lambda) = p^2 + pq = p(p + q) = p$ |
| a (q) | aA: $qp - \lambda$ | aa: $q^2 + \lambda$ | $(qp - \lambda) + (q^2 + \lambda) = qp + q^2 = q(p + q) = q$ |
| Marginal allele frequencies in the deme | $(p^2 + \lambda) + (qp - \lambda) = p^2 + qp = p(p + q) = p$ | $(pq - \lambda) + (q^2 + \lambda) = pq + q^2 = q(p + q) = q$ | |

Summed frequencies in zygotes:
AA: $G'_{AA} = p^2 + \lambda$
Aa: $G'_{Aa} = (pq - \lambda) + (qp - \lambda) = 2pq - 2\lambda$
aa: $G'_{aa} = q^2 + \lambda$

Note: The zygotic genotype frequencies are indicated by $G'_k$.

random mating model for the simple one-locus, two-allele (A and a) model shown in Table 2.2 in the previous chapter. Note that in Table 2.2, the genotype frequencies are obtained by multiplying the allele frequencies associated with the male and female gametes. Now, suppose that gametes are put together in such a way that there is a deviation from the product rule of Hardy–Weinberg in producing genotype frequencies, but that the marginal allele frequencies remain the same. Let λ be this deviation parameter from the simple product of the gamete frequencies, as shown in Table 3.1. Note from Table 3.1 that λ only affects the genotype frequencies and not the gamete frequencies. This is because λ is designed to measure how gametes come together to form genotypes for a given set of gamete frequencies. Also, note that λ is applied to a deme and measures deviations from Hardy–Weinberg genotype frequencies in that deme. In contrast, F is defined for individuals, not demes, and measures the probability of identity-by-descent for that individual, and not the system of mating of the deme as a whole. Biologically, λ is quite different from F. λ is also quite different from F mathematically. Recall that F is a probability, and like all probabilities, it is defined only between 0 and 1 inclusively. In contrast, as shown in Box 3.1, λ is the covariance (see Appendix B) between uniting gametes. A covariance is proportional to the correlation coefficient (Appendix B), can take on both positive and negative values, and is mathematically non-comparable to a probability such as F. If λ > 0, there is a positive correlation between uniting gametes in excess of random mating expectations. This means that the alleles borne by the uniting gametes are more likely to share the same allelic state (with no distinction between identity-by-descent versus identity-by-state, unlike with F) than expected under random mating. 
If λ < 0, there is a negative correlation between uniting gametes, and the alleles borne by the uniting gametes are less likely to share the same allelic state than expected under random mating. Random


Box 3.1 The Statistical Meaning of λ

In order to show that λ is the covariance among uniting gametes, we must first define a random variable assigned to the gametes. In our simple genetic model, the gametes bear only one of two possible alleles, A and a. Let x be a random variable that indicates the allele borne by a male gamete such that x = 1 if the male gamete bears an A allele, and x = 0 if the male gamete bears an a allele. Similarly, let y be a random variable that indicates the allele borne by a female gamete such that y = 1 if the female gamete bears an A allele, and y = 0 if the female gamete bears an a allele. Let p = the frequency of A bearing gametes in the gene pool. Because we are dealing with an autosomal locus, p is the frequency of A for both male and female gametes. Using these definitions and the standard formulas for means, variances, and covariances (Appendix B), we have:

$$
\begin{aligned}
\text{Mean}(x) &= \mu_x = 1 \times p + 0 \times q = p \\
\text{Mean}(y) &= \mu_y = 1 \times p + 0 \times q = p \\
\text{Variance}(x) &= \sigma_x^2 = (1 - \mu_x)^2 \times p + (0 - \mu_x)^2 \times q = (1 - p)^2 p + (-p)^2 q = pq \\
\text{Variance}(y) &= \sigma_y^2 = pq \\
\text{Covariance}(x, y) &= (1 - \mu_x)(1 - \mu_y)(p^2 + \lambda) + (1 - \mu_x)(0 - \mu_y)(2pq - 2\lambda) + (0 - \mu_x)(0 - \mu_y)(q^2 + \lambda) \\
&= q^2 (p^2 + \lambda) - pq(2pq - 2\lambda) + p^2 (q^2 + \lambda) \\
&= \lambda (q^2 + 2pq + p^2) \\
&= \lambda
\end{aligned}
$$

Hence, λ is the covariance between uniting gametes under a system of mating that produces the genotype frequencies given in Table 3.1. Because covariances do not have a standardized range whereas correlations do, it is usually more convenient to measure the non-random associations between uniting gametes through their correlation coefficient rather than their covariance. The correlation coefficient is (Appendix B):

$$
\rho_{x,y} = \frac{\text{Covariance}(x, y)}{\sqrt{\sigma_x^2 \sigma_y^2}} = \frac{\lambda}{pq}
$$

mating occurs when there is no correlation between uniting gametes (λ = 0). The correlation coefficient between uniting gametes is λ/(pq) (see Box 3.1). The correlation coefficient (Appendix B) has a standardized range of −1 to +1 inclusively, in contrast to the covariance that has no standardized range. Hence, it is more convenient to measure deviations from Hardy–Weinberg at the deme level in terms of the correlation of uniting gametes as opposed to the covariance of uniting gametes. Accordingly, we define the “inbreeding coefficient” to be f ≡ λ/(pq) = the correlation of uniting gametes within the deme. From Table 3.1, we can now see that the genotype frequencies that emerge from this system of mating can be expressed as:

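$$
\begin{aligned}
G'_{AA} &= p^2 + \lambda = p^2 + pq\,\frac{\lambda}{pq} = p^2 + pqf \\
G'_{Aa} &= 2pq - 2\lambda = 2pq - 2pq\,\frac{\lambda}{pq} = 2pq - 2pqf = 2pq(1 - f) \\
G'_{aa} &= q^2 + \lambda = q^2 + pq\,\frac{\lambda}{pq} = q^2 + pqf
\end{aligned}
\tag{3.2}
$$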
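A small numerical check can make Eqs. (3.2) and the covariance argument of Box 3.1 concrete. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the inputs p = 1/4 and f = −1/3 are arbitrary values chosen so that all three genotype frequencies stay non-negative (not every combination of p and f is feasible).

```python
from fractions import Fraction

def genotype_frequencies(p, f):
    """Zygotic genotype frequencies of Eqs. (3.2), using lambda = f*p*q."""
    q = 1 - p
    lam = f * p * q
    return {"AA": p * p + lam, "Aa": 2 * p * q - 2 * lam, "aa": q * q + lam}

def allele_frequency_A(G):
    """Frequency of A among the zygotes: all of AA plus half of Aa."""
    return G["AA"] + G["Aa"] / 2

def gamete_covariance(p, G):
    """Covariance of the indicator variables x and y defined in Box 3.1."""
    q = 1 - p
    # (1 - p) = q is the deviation for an A gamete; (0 - p) = -p for an a gamete.
    return q * q * G["AA"] + q * (-p) * G["Aa"] + p * p * G["aa"]

p, f = Fraction(1, 4), Fraction(-1, 3)
G = genotype_frequencies(p, f)
cov = gamete_covariance(p, G)
print(G)                          # AA = 0, Aa = 1/2, aa = 1/2
print(allele_frequency_A(G))      # 1/4, unchanged from p
print(cov, cov / (p * (1 - p)))   # lambda = -1/16 and correlation f = -1/3
```

Exact fractions are used so that the identities hold without rounding error. The unchanged allele frequency illustrates that λ (and hence f) redistributes alleles among genotypes without altering the gene pool.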

Because f is a correlation coefficient, it can take on both positive and negative values (as well as zero, the random mating case). Generally, when f is positive, the system of mating of the deme is


described as one of inbreeding, and when f is negative, the system of mating of the deme is described as one of avoidance of inbreeding. However, regardless of whether f is positive or negative, f is called the “inbreeding coefficient.” Although “inbreeding” as measured by f alters the genotype frequencies from Hardy–Weinberg (Eqs. 3.2), it does not cause any change in allele frequency. The frequency of the A allele in the final generation in Table 3.1 is:

$$
p' = 1 \times (p^2 + pqf) + \tfrac{1}{2}\left[2pq(1 - f)\right] = p^2 + pqf + pq(1 - f) = p^2 + pqf + pq - pqf = p^2 + pq = p(p + q) = p
$$

Because the allele frequencies are not changing over time in Table 3.1, inbreeding as measured by f is not an evolutionary force by itself at the single locus level (that is, system-of-mating inbreeding alone does not change the frequencies of alleles in the gene pool). Another interpretation of “f” is suggested by Eqs. (3.2): in addition to f being a correlation coefficient, f is also a direct measure of the deviation of heterozygote genotype frequencies from Hardy–Weinberg expectations. Note that the frequency of heterozygotes in Eq. (3.2) is 2pq(1 − f), and recall from Chapter 2 that the expected frequency of heterozygotes under Hardy–Weinberg is 2pq. Hence, an alternative mathematical definition of f is:

$$
f = 1 - \frac{\text{Observed Frequency of Heterozygotes in the Deme}}{\text{Expected Frequency of Heterozygotes Under Hardy–Weinberg}} \tag{3.3}
$$
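Eq. (3.3) can be applied directly to genotype counts at a single locus, as in the sketch below. The function name and the counts are hypothetical. The first set of counts was chosen only because it gives an observed heterozygosity of 0.500 and an expected heterozygosity of 0.375; it is not a real data set.

```python
from fractions import Fraction

def system_of_mating_f(n_AA, n_Aa, n_aa):
    """Estimate f = 1 - (observed heterozygosity / Hardy-Weinberg expectation)."""
    n = n_AA + n_Aa + n_aa
    p = Fraction(2 * n_AA + n_Aa, 2 * n)    # frequency of allele A
    q = 1 - p
    observed_het = Fraction(n_Aa, n)
    expected_het = 2 * p * q                 # zero (division error) if the locus is monomorphic
    return 1 - observed_het / expected_het

f_excess = system_of_mating_f(0, 8, 8)       # heterozygote excess: f = -1/3
f_random = system_of_mating_f(4, 8, 4)       # exact Hardy-Weinberg proportions: f = 0
print(f_excess, f_random)
```

A negative result indicates more heterozygotes than expected under random mating (avoidance of inbreeding in the system-of-mating sense), and a positive result indicates a heterozygote deficiency.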

From Eq. (3.3), we can see that a positive correlation between uniting gametes leads to a heterozygote deficiency in the deme (typically called an inbreeding system of mating), no correlation yields Hardy–Weinberg frequencies (random mating), and a negative correlation (typically called avoidance of inbreeding) yields an excess of heterozygotes in the deme. In most of the population genetic literature, both f and F are called inbreeding coefficients and are often assigned the same mathematical symbol (typically f). That will not be the case in this book. F, which will be called pedigree inbreeding, refers to a specific individual, measures that individual’s probability of identity-by-descent for a randomly chosen autosomal locus, and ranges from 0 to 1. In contrast, f will be called system-of-mating inbreeding, refers to a deme, measures deviations from Hardy–Weinberg genotype frequencies, and ranges from −1 to +1 (Table 3.2).

Table 3.2 A contrast between F, the pedigree inbreeding coefficient, and f, the system-of-mating inbreeding coefficient.

| Property | F | f |
|---|---|---|
| Data Used to Calculate | Pedigree Data for Specific Individuals | Genotype Frequency Data for a Specific Locus and Deme |
| Type of Mathematical Measure | Probability | Correlation Coefficient |
| Range of Values | 0 ≤ F ≤ 1 | −1 ≤ f ≤ 1 |
| Biological Level of Applicability | Individual | Deme |
| Biological Meaning | The Expected Chance of Identity-by-Descent at a Randomly Chosen Autosomal Locus for a Specific Individual Caused by the Biological Relatedness of the Individual’s Parents | The System of Mating of a Deme Measured as Deviations From Random-Mating Genotype Frequency Expectations |

Because f and F are both called inbreeding coefficients and are frequently assigned the same symbol in much of the

literature, it is not surprising that these two extremely different definitions of “inbreeding” have often been confused. We will illustrate the difference between these two “inbreeding coefficients” by returning to the example of the captive herd of Speke’s gazelle. Recall that the captive herd of the Speke’s gazelle was founded at the St. Louis Zoo with one male and three females between 1969 and 1972. Because there was only one male, all animals born in this herd were biological relatives. Under the assumption that the four founding animals were unrelated (our reference population), all of these original founders and the offspring between them have F = 0, that is, these individuals were not “inbred.” By 1982, these older animals had all died off and all animals in the herd had F > 0. Given that all animals bred in captivity had to be at least half-siblings of one another (there was only one founding male), this “inbred” state of the descendants of the original founders and their offspring was inevitable regardless of system of mating. The average F in 1982 was 0.149 relative to the founder reference population, making this captive herd one of the most highly “inbred” populations of large mammals known. An isozyme survey (Appendix A) was also performed on these same animals in 1982. For example, at the polymorphic general protein (GP) locus, the observed heterozygosity was 0.500, but the expected heterozygosity under random mating was 0.375. Hence, for this locus, f = −0.333. Several other polymorphic isozyme loci were scored, all yielding f < 0, with the average f over all loci being −0.291. This highly negative f indicates a strong avoidance of system-of-mating inbreeding. We now have what appears to be a contradiction, at least for those who confuse f and F. This herd of gazelles is simultaneously one of the most highly inbred (pedigree-sense F) populations of large mammals known, and it also is strongly avoiding inbreeding (system of mating-sense f). 
There is no paradox here except verbally; the two types of “inbreeding” and “inbreeding coefficients” are measuring completely different biological attributes. The negative f indicates that the breeders of this managed herd were avoiding inbreeding in a system-of-mating sense within the severe constraints of this herd of close biological relatives. If inbreeding were being avoided at the level of system of mating, then why did every individual in the herd have an F > 0? Keep in mind that “random mating” means that females and males are paired together at random regardless of their biological relationship. In any finite population, there is always a nonzero probability of two related individuals being paired as mates under random mating. The smaller the population, the more likely it is to have biological relatives mate “at random.” Hence, random mating (f = 0) implies some matings among biological relatives that will yield F > 0 in any finite population. Indeed, even avoidance of inbreeding (f < 0) can still result in matings among biological relatives in a finite population. For example, many human cultures (but not all) have incest taboos that often extend up to first cousins. Assuming a stable-sized population of N adults with an average and variance of two offspring per family (the number of offspring being Poisson), then f = −1/(N − 10) when relatives up to and including first cousins are excluded as mates but mating is otherwise random (Jacquard 1974). Note that as N increases, f approaches 0. This means that although incest taboos are common in human societies, the Hardy–Weinberg law fits very well for most loci within most large human demes. However, some human demes are small. If N = 50 (a small local human population, but still found in some hunter/gathering societies), then f = −0.025 under this non-random system of mating of excluding close relatives as mates.
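The dependence of f on deme size under this incest-taboo model can be tabulated directly from the formula f = −1/(N − 10) quoted above. The function name and the particular values of N other than 50 are illustrative choices, not values from the text.

```python
def f_first_cousin_exclusion(n_adults: int) -> float:
    """f = -1 / (N - 10): mates up to first cousins excluded, mating otherwise random."""
    if n_adults <= 10:
        raise ValueError("the formula is only meaningful for N > 10")
    return -1.0 / (n_adults - 10)


# N = 50 is the example from the text; the larger sizes are hypothetical.
for n in (50, 500, 5000, 50000):
    print(n, round(f_first_cousin_exclusion(n), 5))
```

The first line of output corresponds to the text's value of f = −0.025 for N = 50, and the later lines show how quickly f approaches 0 as N grows.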
Nevertheless, such small human populations typically contain many “inbred” (F > 0) individuals despite their incest taboos in choosing mates (f < 0). Consider, for example, a set of religious colonies in the upper Great Plains of North America that are descendants of a small group of Anabaptist Protestants who originally emigrated from the Tyrolean Alps (Steinberg et al. 1966). There has been very little immigration into these religious colonies from other human populations after they were established, so these colonies represent a genetic isolate. Internally, their system of mating is one of strong avoidance of mating between close relatives as incest is considered to be a sin in their religion. Despite this strong avoidance of pedigree inbreeding, the average F for one isolated sub-sect was 0.0255. This makes this population one of the more highly inbred (F > 0) human populations known despite a system of mating that strongly avoids inbreeding (f < 0). The reason for this seeming contradiction is that these colonies were founded by relatively few individuals, so virtually everyone in the colony by the mid-twentieth century was related to everyone else. Hence, the pedigree inbreeding is due to the small population size at the time the colonies were founded, and not due to the system of mating (such “founder effects” will be discussed in more detail in the next chapter). Indeed, if these individuals truly mated at random, then the average F under random mating would be 0.0311, a value considerably larger than the observed average F of 0.0255. As this population reveals, avoidance of inbreeding in the system-of-mating sense does not necessarily result in no pedigree inbreeding (F = 0) but rather lowers the level of pedigree inbreeding from what would have occurred under random mating.

The strong avoidance of inbreeding in this human population also results in large deviations from Hardy–Weinberg expectations. For example, a sample from this population scored for the MN blood group had 1083 individuals with genotype MM, 1220 with MN, and 260 with NN. Using the test given in Chapter 2, the resulting chi-square is 9.68 with one degree of freedom, which is significant at the 0.002 level. Hence, unlike most other human populations (see Chapter 2 for examples), this religious colony does not have Hardy–Weinberg genotype frequencies for the MN locus. Instead, there is a significant excess of heterozygotes (only 1149 are expected under random mating, versus the 1220 that were observed). Using Eq. (3.2), this results in f = −0.0615. Thus, this religious colony started from a small number of founders is highly inbred in the pedigree sense (F = 0.0255), even though the population is strongly avoiding inbreeding in the system-of-mating sense (f = −0.0615). The two “inbreeding coefficients” F and f are most definitely not the same either mathematically or biologically.

Figure 3.5 The average pedigree inbreeding coefficient for the human population on Tristan da Cunha as a function of decade of birth. [Plot: average F (vertical axis, 0.00 to 0.05) against decade of birth (horizontal axis, 1820 to 1950).]

Table 3.3 The first eight marriages between biological relatives on Tristan da Cunha, showing the date of the marriage, the number of available women of marriageable ageᵃ, and the number of the available women who were not related to the groom.

| Marriage Between Relatives | Date of Marriage | Number of Available Women | Number of Non-relatives |
|---|---|---|---|
| 1 | 1854 | 7 | 3 |
| 2 | 1856 | 9 | 2 |
| 3 | 1871 | 1 | 0 |
| 4 | 1876 | 1 | 0 |
| 5 | 1884 | 7 | 1 |
| 6 | 1888 | 8 | 0 |
| 7 | 1893 | 3 | 0 |
| 8 | 1898 | 1 | 0 |

ᵃ 16 years and over, single, and not a sister of the groom.

Another human example is given in Figure 3.5 (Roberts 1967) that illustrates how small founding population size can result in pedigree inbreeding despite strong avoidance of system-of-mating inbreeding. Twenty people colonized the remote Atlantic island of Tristan da Cunha in the early 1800s, with a few more migrants coming later (more details will be given in the next chapter). Despite a strong incest taboo among these Christian colonists and a system of mating characterized by f < 0, individuals with pedigree inbreeding (F > 0) began to be born by the 1850s (under the assumption that all colonists and migrants were unrelated), with more and more extreme pedigree inbreeding occurring as time passed by (Figure 3.5). How could this human population, so strongly avoiding system-of-mating inbreeding under their taboo against incest, become a population with one of the highest levels of average pedigree inbreeding known in humanity? Table 3.3 gives the answer. The first marriage between biological relatives that resulted in an inbred offspring occurred in 1854. At the time of this marriage, there were only seven women of marriageable age (16 years or older) who were available (single and not a sister of the groom), and only three of them were not related to the man involved in this union. The number of women of marriageable age that were not relatives of the grooms in subsequent years quickly went down to 0 (Table 3.3). Hence, for those wishing to remain on the island and marry, there was no choice but to marry a relative, although more distant relatives were chosen compared to random mating expectations (hence, f < 0).

The potential discrepancy between f and F gets more extreme as the population size gets smaller. Because the founder population of the Speke’s gazelle herd was just four individuals, the effect of finite size on f versus F was much larger than on the human populations living in the North American religious colonies or on Tristan da Cunha.
Indeed, despite extreme avoidance of system-of-mating inbreeding, pairing the least related gazelles still meant that most matings were between half-siblings (yielding offspring with F = 0.125), the most distantly related animals that existed in the population once the original founders and their offspring had died. Thus, even strong avoidance of “inbreeding” in terms of system-of-mating can result in many “inbred” individuals in the pedigree sense. The breeding program for the Speke’s gazelle has sometimes mistakenly been called a program of “deliberate inbreeding.” The system of mating was under the control of the breeders, so the only “deliberate” choice, as shown by f, is a strong avoidance of inbreeding. The accumulation of high levels of F in the individuals that constitute this population is not “deliberate” but rather is the inevitable consequence of the small founding population size of this herd. Because F is a probability, F has to be greater than or equal to zero. It is therefore mathematically impossible to measure avoidance of inbreeding with F, so the fact that F > 0 for every animal in the herd tells one nothing about the system of mating. Indeed, the average F of a population is often due more to its finite size than to its system of mating, as illustrated both by the Speke’s gazelle herd and the Tristan da Cunha human population. Consequently, the average value of F for a deme, F̄, is not used as a measure of system of mating, but rather of another evolutionary force called genetic drift. We will discuss this biological application of F̄ in more detail in Chapter 4. (Note that F̄ in most of the population genetic literature is also simply called the “inbreeding coefficient” and is usually given the same symbol as F and f; readers beware!)

F and f are quite distinct biologically and mathematically (Table 3.2), but neither is an evolutionary force by itself. We already saw that f ≠ 0 does not cause any change in allele frequency (Table 3.1), and F cannot be an evolutionary force because it is defined for a specific individual and not a deme or population. Recall from Chapter 1 that only populations evolve, not individuals, so F cannot by definition be used to describe evolutionary change (F̄, being a population average, is a population-level parameter and therefore can be used to describe evolutionary change, as we will see in Chapter 4). Nevertheless, both F and f have important population consequences that can affect the course of evolution when coupled with other evolutionary agents. We already discussed one important evolutionary consequence associated with F in the previous section, inbreeding depression in a population. We now examine some of the population and evolutionary consequences of system-of-mating inbreeding, f. Even small deviations from Hardy–Weinberg as measured by f can have major impacts on the genotype frequencies found in local populations.
For example, consider the impact of system-of-mating inbreeding (f > 0) on the incidence of a rare, recessive autosomal trait, that is, a trait expressed only in individuals who are homozygous for a particular allele. This genetic category is of considerable interest because many genetic diseases in humans and other species are associated with a recessive allele that is rare in the deme. We can immediately see the impact of f on the frequency of such a recessive trait by looking at Table 3.1, now regarding a as the recessive allele. The frequency of individuals with the recessive phenotype is, from Table 3.1 or from Eqs. (3.2), $q^2 + pqf$. Suppose first that q = 0.001 (most genetic disease alleles are rare) and f = 0 (random mating). Then, the frequency of affected individuals is $(0.001)^2 = 0.000001$, or 1 in 1,000,000. Now, let f = 0.01, a seemingly minor deviation from random mating. Then, $q^2 + pqf = 0.000001 + (0.999)(0.001)(0.01) = 0.000011$. Thus, a 1% inbreeding level causes an 11-fold increase in the incidence of the recessive trait in the population. Even small deviations from random mating cause large changes in genotype frequencies when rare alleles are involved. As we shall see in Chapter 12 of this book, this change in genotype frequencies means that rare, recessive, deleterious alleles are subject to stronger natural selection in an inbreeding population (f > 0) than in a random-mating (f = 0) population. When an inbreeding system of mating persists for many generations, this greater exposure to selection means that recessive, deleterious alleles can be reduced to lower frequencies by natural selection than they would have been in a random-mating population. Similar considerations lead to the prediction of reduced numbers of lethal equivalents when system-of-mating inbreeding persists for many generations, as was demonstrated in the captive population of Speke’s gazelles (Templeton and Read 1984, 1998).
As will be shown in detail in Chapter 12, there is a strong evolutionary interaction between f and F that is mediated by natural selection. For now, it is important to keep in mind that inbreeding in either the pedigree sense or the system-of-mating sense is deleterious in some contexts but not in others.
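The effect of a small f on a rare recessive trait, worked through above with q = 0.001, can be checked numerically. This sketch uses the homozygote frequency q² + pqf from the text; the function and variable names are illustrative.

```python
def recessive_incidence(q: float, f: float) -> float:
    """Frequency of recessive homozygotes, q^2 + p*q*f, with p = 1 - q."""
    p = 1.0 - q
    return q * q + p * q * f


q = 0.001                                      # rare recessive allele (text example)
random_mating = recessive_incidence(q, 0.0)    # f = 0
inbreeding = recessive_incidence(q, 0.01)      # f = 0.01
fold_increase = inbreeding / random_mating

print(f"{random_mating:.6f}")    # 0.000001
print(f"{inbreeding:.6f}")       # 0.000011
print(round(fold_increase))      # 11
```

Because the term pqf is linear in q while q² is quadratic, the proportional effect of a given f grows as the allele becomes rarer.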

Assortative Mating

System-of-mating inbreeding (f) represents a deviation from random mating in which the biological relatedness among individuals affects their probability of becoming mates. Individuals can also have their probability of mating influenced by the traits or phenotypes displayed by potential mates. One such deviation from random mating based on individual phenotype is called assortative mating. Under assortative mating, individuals with similar phenotypes are more likely to mate than expected under random pairing in the population. This results in a positive correlation between the trait values for mating pairs in the population. Such assortative mating is common in animals for a large variety of traits (Jiang et al. 2013). Although assortative mating can arise from an individual’s preference to mate with another individual with a similar phenotype, assortative mating can arise from many factors other than mate choice. Consider treehoppers (a type of insect) in the genus Enchenopa (Wood and Keese 1990). These treehoppers can feed on a number of different host plants, and the treehoppers that fed and developed upon a particular type of host plant preferentially mate with other treehoppers that fed and developed upon the same kind of host plant. Does this mean that treehoppers prefer as mates other individuals that fed on the same type of host plant? Not at all. Rather, it was found that the various types of host plants affect the rate at which individuals become sexually mature, such that males and females that came from the same type of host plant tended to become sexually mature at the same time. Because female treehoppers are receptive for only a short period of time after achieving maturity and because males die rapidly after maturity, most of the individuals available as potential mates at any given time are those that feed on the same type of host plant.
To see if the assortative mating was due to the impact of host plant on developmental time rather than individual mating preferences, individuals from different host plants were experimentally manipulated to become mature at the same time. No assortative mating by host plant occurred under those conditions. Hence, the assortative mating in this case arose from the effects of host plants on developmental times and did not involve any mate choice or preference on the part of individual treehoppers. The treehopper example illustrates the fact that assortative mating is simply a positive phenotypic correlation among mating individuals at the level of the deme. Such a correlation may arise from individual-level mating preferences but can also arise from many other factors that have nothing to do with individual-level mating preferences. Regardless of its cause, we need to consider the evolutionary consequences of assortative mating for a phenotype as such mating is common. To do so, we need to develop a deme-level model of assortative mating.
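A toy simulation can make the point concrete that a positive correlation between mates does not require any mate preference. In the sketch below, every female simply mates with the male whose maturation day is closest to her own, and host plant is never consulted. The maturation-day means, the standard deviation, the cohort size, and the nearest-in-time pairing rule are all assumptions invented for illustration; none of them is taken from Wood and Keese (1990).

```python
import random


def phi_correlation(pairs):
    """Pearson correlation between the 0/1 host labels of female and male mates."""
    n = len(pairs)
    mean_f = sum(f for f, _ in pairs) / n
    mean_m = sum(m for _, m in pairs) / n
    cov = sum((f - mean_f) * (m - mean_m) for f, m in pairs) / n
    var_f = sum((f - mean_f) ** 2 for f, _ in pairs) / n
    var_m = sum((m - mean_m) ** 2 for _, m in pairs) / n
    return cov / (var_f * var_m) ** 0.5


def simulate(mean_day_by_host, n_per_host=300, sd=2.0, seed=1):
    """Return the host-plant correlation between mates under time-limited mating."""
    rng = random.Random(seed)

    def cohort():
        # Each individual is (host label, day of sexual maturity).
        return [(host, rng.gauss(mean_day_by_host[host], sd))
                for host in (0, 1) for _ in range(n_per_host)]

    females, males = cohort(), cohort()
    pairs = []
    for f_host, f_day in females:
        # Mate = male maturing closest in time; host plant plays no role in the choice.
        m_host, _ = min(males, key=lambda male: abs(male[1] - f_day))
        pairs.append((f_host, m_host))
    return phi_correlation(pairs)


# Hosts that shift maturation time (hypothetical means of day 10 versus day 20).
r_natural = simulate({0: 10.0, 1: 20.0})
# Experimentally synchronized maturation (same hypothetical mean for both hosts).
r_synchronized = simulate({0: 15.0, 1: 15.0})
print(round(r_natural, 2), round(r_synchronized, 2))
```

With these made-up parameters, the correlation by host plant is strongly positive when the hosts shift maturation time and is close to zero when maturation is synchronized, mirroring the qualitative outcome of the manipulation described above.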

A Simple Model of Assortative Mating

To investigate the evolutionary consequences of assortative mating for a genetic system, we first consider a simple model of an autosomal locus with two alleles (A and a) such that there is a 1 : 1 genotype–phenotype relationship with each genotype having a distinct phenotype. If the mating is 100% assortative for these phenotypes (that is, individuals only mate with other individuals with identical phenotypes), we obtain the generation-to-generation transition, as shown in Figure 3.6. Note that the genotype frequencies in the initial population in Figure 3.6 are not necessarily in Hardy–Weinberg equilibrium. However, regardless of whether or not we start in Hardy–Weinberg, the genotype frequencies will change each generation as long as there are some heterozygotes. In particular, the frequency of heterozygotes is halved each generation.

Figure 3.6 A model of 100% assortative mating for a phenotype determined by a single autosomal locus with two alleles and with all three possible genotypes having distinct phenotypes. G’s refer to genotype frequencies and T’s to phenotype frequencies. [Flow diagram: the zygotic population (AA, Aa, aa at frequencies $G_{AA}$, $G_{Aa}$, $G_{aa}$) maps with probability 1 onto the phenotypes of the adult population ($T_{AA} = G_{AA}$, $T_{Aa} = G_{Aa}$, $T_{aa} = G_{aa}$); assortative mating gives the mated population ($T_{AA} \times T_{AA}$ = AA × AA, $T_{Aa} \times T_{Aa}$ = Aa × Aa, $T_{aa} \times T_{aa}$ = aa × aa, at frequencies $G_{AA}$, $G_{Aa}$, $G_{aa}$); Mendel’s first law then gives the zygotic population of the next generation: $G'_{AA} = G_{AA} + \tfrac{1}{4}G_{Aa}$, $G'_{Aa} = \tfrac{1}{2}G_{Aa}$, $G'_{aa} = G_{aa} + \tfrac{1}{4}G_{Aa}$.]

Thus, this deviation from random mating causes a major qualitative difference with the original Hardy–Weinberg model given in Chapter 2: genotype frequencies do not immediately go to equilibrium. Rather, because the heterozygote genotype frequency is halved every generation, the deme gradually approaches a genotypic equilibrium in which the population is entirely homozygous. At this equilibrium, the initial heterozygote frequency is evenly split between the two homozygous genotypes (from Mendel’s first law) yielding an equilibrium array of genotype frequencies of

\[
\begin{aligned}
G_{AA}(\text{equilibrium}) &= G_{AA} + \tfrac{1}{2}G_{Aa} \\
G_{Aa}(\text{equilibrium}) &= 0 \\
G_{aa}(\text{equilibrium}) &= G_{aa} + \tfrac{1}{2}G_{Aa}
\end{aligned}
\tag{3.4}
\]
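The approach to the equilibrium in Eq. (3.4) can be followed by iterating the one-generation transition of Figure 3.6. In this minimal sketch the starting genotype frequencies are hypothetical values chosen only for illustration (and deliberately not in Hardy–Weinberg proportions); the function names are likewise illustrative.

```python
def next_generation(g_AA, g_Aa, g_aa):
    """One generation of 100% assortative mating (transition of Figure 3.6)."""
    return g_AA + g_Aa / 4.0, g_Aa / 2.0, g_aa + g_Aa / 4.0


def iterate(start, generations):
    """Return the list of genotype-frequency triples, including the start."""
    history = [start]
    for _ in range(generations):
        history.append(next_generation(*history[-1]))
    return history


start = (0.20, 0.50, 0.30)   # hypothetical initial G_AA, G_Aa, G_aa
history = iterate(start, 20)

# Equilibrium predicted by Eq. (3.4): the initial heterozygotes split evenly.
equilibrium = (start[0] + start[1] / 2.0, 0.0, start[2] + start[1] / 2.0)

for generation in (0, 1, 2, 5, 20):
    print(generation, tuple(round(x, 6) for x in history[generation]))
print("Eq. (3.4):", equilibrium)
```

The heterozygote frequency falls geometrically (0.5, 0.25, 0.125, and so on), so the homozygote frequencies converge on the Eq. (3.4) values within a few tens of generations.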

The “inbreeding coefficient” f measures deviations from Hardy–Weinberg genotype frequencies, so f can also measure the impact of assortative mating on the population. Suppose the initial zygotic genotype frequencies shown in Figure 3.6 were in Hardy–Weinberg equilibrium. Then, $G_{Aa} = 2pq$ and f = 0. After one generation of assortative mating, we have that $G'_{Aa} = \tfrac{1}{2}(2pq)$ from Figure 3.6. Hence, $1 - f = \tfrac{1}{2}(2pq)/(2pq) = \tfrac{1}{2}$, so f = 1/2. As this population continues to mate assortatively, the heterozygote deficiency relative to Hardy–Weinberg becomes more extreme until, at equilibrium, there are no heterozygotes. At this equilibrium, f = 1. Note that if one were only examining this locus at a single generation, it would be indistinguishable from system-of-mating inbreeding because of an inbreeding-like deficiency of heterozygotes relative to Hardy–Weinberg expectations. This example shows that calling f an “inbreeding coefficient” can sometimes be misleading. A non-zero value of f can arise in situations where there is no inbreeding (system-of-mating sense) at all. Jacquard (1975) recommended that f be called the “coefficient of deviation from random mating” because f measures deviations from random mating genotype frequencies, regardless of the biological cause of that deviation. This accurate, but somewhat cumbersome, name for f has not become generally adopted in the population genetic literature, so, once again, readers and students of this literature must always keep in mind that the “inbreeding coefficient” f may have nothing to do with system-of-mating inbreeding in particular cases.

As shown above, assortative mating causes deviations from Hardy–Weinberg genotype frequencies. Does assortative mating also cause evolution, that is, does assortative mating alter gamete frequencies over time? Recall, the general definition of the frequency of the A allele is $p = 1 \times G_{AA} + \tfrac{1}{2} \times G_{Aa}$. After one generation of assortative mating, we see from Figure 3.6 that the frequency of A will now be $1 \times [1 \times G_{AA} + \tfrac{1}{4} \times G_{Aa}] + \tfrac{1}{2} \times [\tfrac{1}{2} \times G_{Aa}] = 1 \times G_{AA} + \tfrac{1}{2} \times G_{Aa} = p$. From Eqs. (3.4), we see at equilibrium that the frequency of A will be $1 \times [1 \times G_{AA} + \tfrac{1}{2} \times G_{Aa}] + \tfrac{1}{2} \times [0] = 1 \times G_{AA} + \tfrac{1}{2} \times G_{Aa} = p$. Hence, just like inbreeding, assortative mating by itself does not alter allele frequencies and does not cause evolutionary change at the single locus level. Once again, assortative mating mimics the effects of system-of-mating inbreeding.

Assortative mating also mimics the effects of pedigree inbreeding with respect to increasing the phenotypic frequencies of recessive traits. As an example, consider the phenotype of profound, early-onset deafness in humans. Such deafness can be caused by disease, accidents, and genes. Genetic causes explain 68% of these cases, with some 115 genes and mtDNA variants implicated in early-onset deafness (Morton and Nance 2006; http://deafnessvariationdatabase.org). Most of the alleles associated with deafness behave as autosomal recessives, and most such alleles are extremely rare. The one exception is the GJB2 locus that encodes the gap-junction protein connexin-26 (Morton and Nance 2006), with a loss-of-function allele named 35delG with a frequency, q, of about 0.01 in U.S. and European populations (Green et al. 1999; Storm et al. 1999). Under random mating, we expect $q^2 = 0.0001$, so about 1 in 10,000 births should yield a deaf child due to homozygosity for this allele. The actual incidence of deafness due to this recessive allele is 3–5 in 10,000 births, far in excess of Hardy–Weinberg expectations. The reason is that there is strong assortative mating for deafness, with the phenotypic assortment rate being over 80% in the United States, 92% in England, and 94% in Northern Ireland (Aoki and Feldman 1994). Early-onset deaf children are often sent to special schools or classes and hence socialize mostly with one another. Also, they communicate best with one another.
As a result, there is a strong tendency to marry within, which in turn results in increased homozygosity for alleles yielding deafness beyond random expectations. The impact of assortative mating for deafness upon the frequency of homozygotes at the GJB2 locus is reduced by the fact that deafness can be caused by many other loci and non-genetic factors. In general, the impact of assortative mating in yielding an “f” is proportional to both the phenotypic correlation between mates (which is high for deaf people) and the correlation between genotype and phenotype. This latter correlation is reduced when the phenotype has many distinct genetic and non-genetic causes, as is the case for deafness. As a consequence, many of the deaf people who marry are deaf for different reasons and not necessarily because they are both homozygous for the same alleles associated with deafness. Hence, identical phenotypes of a married deaf couple do not imply identical genotypes or even a genetic cause at all, which reduces the impact of assortative mating upon the frequency of homozygotes at any specific locus that contributes to deafness.

For the GJB2 locus in particular, only about 1/4 of the people with early-onset deafness are homozygous for the 35delG recessive allele at this locus. Therefore, of the couples for which both individuals are deaf, we would expect only $(1/4)^2 = 1/16$ to both be homozygous for 35delG and therefore produce only deaf children. Since all the other deaf-associated alleles at other loci are extremely rare and many people are deaf for non-genetic reasons, we would normally expect that the only category of deaf couples likely to have deaf children are these marriages between two 35delG homozygotes. Consequently, we would expect only 1 out of 16 marriages between two deaf individuals to result in deaf children. The actual figure is closer to 1 in 6 (Koehn et al. 1990). The reason for this discrepancy is that there is another attribute of assortative mating that leads to increased homozygosity for the 35delG allele at the GJB2 locus even when one or both parents are deaf for a reason other than homozygosity for the 35delG allele at GJB2. How can this be? To answer this question, we must look at the impact of assortative mating upon a multi-locus genetic architecture.
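The numbers quoted for the GJB2 example can be laid side by side with exact fractions. The first two quantities come directly from the text. The last step is only a back-of-envelope illustration: it asks what single value of f in the expression q² + pqf would reproduce the observed incidence of 3 to 5 per 10,000 births, which treats the whole excess as if one coefficient at this locus captured it. That simplification is an assumption made here, not a calculation from the source.

```python
from fractions import Fraction

q = Fraction(1, 100)                  # frequency of 35delG (about 0.01 in the text)
expected_incidence = q ** 2           # random-mating expectation, q^2

share_homozygous = Fraction(1, 4)     # deaf individuals who are 35delG homozygotes
both_homozygous = share_homozygous ** 2   # deaf couples with both mates homozygous


def apparent_f(incidence, q):
    """Solve incidence = q^2 + p*q*f for f (illustrative only)."""
    p = 1 - q
    return (incidence - q ** 2) / (p * q)


print(expected_incidence)   # 1/10000
print(both_homozygous)      # 1/16
for per_10000 in (3, 5):
    f_value = apparent_f(Fraction(per_10000, 10000), q)
    print(per_10000, f_value, round(float(f_value), 4))
```

Under this simplification, an apparent f of only a few percent would be enough to account for the three- to five-fold excess over the Hardy–Weinberg expectation.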

The Creation of Linkage Disequilibrium by Assortative Mating

We saw above with the single locus model that assortative mating does not alter allele frequencies and is therefore not an evolutionary force at the single locus level. This conclusion is drastically altered as soon as we go to models of assortative mating for phenotypes influenced by two or more loci. Table 3.4 presents a 100% assortative mating model for a two-locus genetic architecture. In this model, two autosomal loci exist, each with two alleles (A and a at one locus, and B and b at the second), with r being the recombination frequency between them. We further assume that each capital letter allele contributes +1 to the phenotype, and each small letter allele contributes 0. To obtain the total phenotype of an individual genotype, we add up the phenotypic contributions of each allele over this pair of loci. Hence, the phenotype of the genotype AB/AB is +4, etc., as shown in Table 3.4. Mating is 100% assortative in this model in the sense that individuals mate only with other individuals that have identical phenotypes. There are 10 two-locus genotypes (Table 3.4), which can be paired together in 55 distinct ways under random mating. However, only 18 of these 55 mating types are allowed under 100% assortative mating, as shown in Table 3.4.

Table 3.4 presents what is known as a transition matrix in mathematics. The rows in such a transition matrix are probabilities that describe all the possible offspring outcomes from each mating type, that is, the transition from mating type to offspring genotype. We have already encountered such a transition matrix in Chapter 2 in the derivation of Weinberg’s version of the Hardy–Weinberg law (Table 2.2). Such transition matrices can be used to predict the changes in genotype frequencies over many generations.
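The counts of 10 genotypes, 55 mating types, and 18 assortative mating types quoted above can be confirmed by brute-force enumeration. The sketch below encodes each two-locus genotype as an unordered pair of gametes and scores the phenotype as the number of capital-letter alleles, following the additive model described in the text; the variable and function names are illustrative.

```python
from itertools import combinations_with_replacement

GAMETES = ("AB", "Ab", "aB", "ab")


def phenotype(genotype):
    """Additive phenotype: +1 for every capital-letter allele in the genotype."""
    return sum(allele.isupper() for gamete in genotype for allele in gamete)


# Unordered pairs of gametes; AB/ab and Ab/aB are kept as distinct genotypes.
genotypes = list(combinations_with_replacement(GAMETES, 2))

# Unordered pairs of genotypes are the possible mating types under random mating.
mating_types = list(combinations_with_replacement(genotypes, 2))

# Under 100% assortative mating both mates must share the same phenotype.
assortative = [(g1, g2) for g1, g2 in mating_types
               if phenotype(g1) == phenotype(g2)]

print(len(genotypes))      # 10
print(len(mating_types))   # 55
print(len(assortative))    # 18
```

Grouping the 18 allowed mating types by phenotype gives 1, 3, 10, 3, and 1 mating types for phenotypes +4, +3, +2, +1, and 0, respectively.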
Sometimes, the mathematics needed to make such predictions can appear quite complicated, but all we need to do here is to note a few features of the transition matrix given in Table 3.4 to make some important evolutionary insights into assortative mating. Note that the probabilities in each row sum to one, so each row represents a probability distribution over a mutually exclusive and exhaustive set (see Appendix B). However, note that 5 of the 18 rows have only one entry, a probability of “1” for a particular offspring genotypic class. The first row in Table 3.4 is one such example. This row describes the predicted offspring frequencies from the mating AB/AB × AB/AB. Under our model of inheritance (which assumes no mutations), all offspring from this mating type are AB/AB, and that is reflected in Table 3.4 by the probability of 1 appearing in the AB/AB column. Note that the offspring genotype in this case is the same as the genotypes of both parents. Hence, once a zygote is conceived with the genotype AB/AB, the resulting adult and all of its descendants can only mate exclusively with other AB/AB individuals – the only individuals with the +4 phenotype – and will only produce AB/AB offspring. In mathematical jargon, having the genotype of AB/AB is called an “absorbing state” because once a zygote enters that genotypic state, all of its descendants will also be in that state. Similarly, the very last row of Table 3.4 also defines an absorbing state, consisting of the mating type ab/ab × ab/ab.

Table 3.4 Hundred percent assortative mating for a two-locus, two-allele genetic architecture with additive phenotypic effects. Columns AB/AB through ab/ab are the offspring genotypes.

| Mating type | Mate phenotype | AB/AB | AB/Ab | AB/aB | AB/ab | Ab/aB | Ab/Ab | aB/aB | Ab/ab | aB/ab | ab/ab |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AB/AB × AB/AB | 4 | 1 |  |  |  |  |  |  |  |  |  |
| AB/Ab × AB/Ab | 3 | 1/4 | 1/2 |  |  |  | 1/4 |  |  |  |  |
| AB/Ab × AB/aB | 3 | 1/4 | 1/4 | 1/4 |  | 1/4 |  |  |  |  |  |
| AB/aB × AB/aB | 3 | 1/4 |  | 1/2 |  |  |  | 1/4 |  |  |  |
| AB/ab × AB/ab | 2 | (1−r)²/4 | r(1−r)/2 | r(1−r)/2 | (1−r)²/2 | r²/2 | r²/4 | r²/4 | r(1−r)/2 | r(1−r)/2 | (1−r)²/4 |
| AB/ab × Ab/aB | 2 | r(1−r)/4 | [(1−r)² + r²]/4 | [(1−r)² + r²]/4 | r(1−r)/2 | r(1−r)/2 | r(1−r)/4 | r(1−r)/4 | [(1−r)² + r²]/4 | [(1−r)² + r²]/4 | r(1−r)/4 |
| AB/ab × Ab/Ab | 2 |  | (1−r)/2 |  |  | r/2 | r/2 |  | (1−r)/2 |  |  |
| AB/ab × aB/aB | 2 |  |  | (1−r)/2 |  | r/2 |  | r/2 |  | (1−r)/2 |  |
| Ab/aB × Ab/aB | 2 | r²/4 | r(1−r)/2 | r(1−r)/2 | r²/2 | (1−r)²/2 | (1−r)²/4 | (1−r)²/4 | r(1−r)/2 | r(1−r)/2 | r²/4 |
| Ab/aB × Ab/Ab | 2 |  | r/2 |  |  | (1−r)/2 | (1−r)/2 |  | r/2 |  |  |
| Ab/aB × aB/aB | 2 |  |  | r/2 |  | (1−r)/2 |  | (1−r)/2 |  | r/2 |  |
| Ab/Ab × Ab/Ab | 2 |  |  |  |  |  | 1 |  |  |  |  |
| Ab/Ab × aB/aB | 2 |  |  |  |  | 1 |  |  |  |  |  |
| aB/aB × aB/aB | 2 |  |  |  |  |  |  | 1 |  |  |  |
| Ab/ab × Ab/ab | 1 |  |  |  |  |  | 1/4 |  | 1/2 |  | 1/4 |
| Ab/ab × aB/ab | 1 |  |  |  |  | 1/4 |  |  | 1/4 | 1/4 | 1/4 |
| aB/ab × aB/ab | 1 |  |  |  |  |  |  | 1/4 |  | 1/2 | 1/4 |
| ab/ab × ab/ab | 0 |  |  |  |  |  |  |  |  |  | 1 |

An examination of Table 3.4 reveals three other rows with a single entry of “1.” At first glance, these may seem to be the same situation as described in the preceding paragraph. However, one of these rows with a “1” in the +2 phenotypic category is associated with the mating type Ab/Ab × aB/aB (Table 3.4) producing with probability one the offspring genotype of Ab/aB. Note that the offspring from this mating type has a genotype that is different from both parents. As a consequence, the offspring of this mating type do not remain in the parental mating type class, but rather the offspring can engage in matings associated with other rows. By looking at the rows in Table 3.4 that involve the offspring genotype Ab/aB, four are found and none of these rows are associated with an entry of a single “1.” Therefore, the mating type Ab/Ab × aB/aB is not an absorbing state. The descendants of the offspring in this class are expected to go to many other mating type classes in future generations. The remaining two rows with a single entry of “1” are associated with matings of identical homozygous genotypes that produce offspring with the same genotype, just like the two absorbing states we identified above.
One of these remaining two rows is the mating type Ab/Ab × Ab/Ab, which produces offspring of genotype Ab/Ab. Note from Table 3.4 that individuals with the genotype Ab/Ab have a +2 phenotype and can therefore mate with any other individuals with a +2 phenotype (those individuals with genotypes Ab/Ab, AB/ab, Ab/aB, or aB/aB). Hence, not all offspring of Ab/Ab × Ab/Ab parents will necessarily remain within this mating type class the next generation. If they mate with any other +2 genotype, they can produce a wide variety of offspring types. Hence, the mating type Ab/Ab × Ab/Ab is not necessarily an absorbing state (and similarly the mating type aB/aB × aB/aB). Nevertheless, these mating types have the potential for being absorbing if, somehow, they were the only genotype in the +2 phenotypic category. To see if this potential could ever be realized involves mathematics beyond the scope of this book, but interested readers should see Ghai (1973). A verbal description of Ghai’s mathematical results is given in the next paragraph.

Note that two of the genotypes with the +2 phenotype are double heterozygotes, which become increasingly rare as assortative mating proceeds (recall that assortative mating decreases heterozygote genotype frequencies in general). As time goes by, the +2 phenotypic class increasingly consists of just the homozygous Ab/Ab and aB/aB genotypes. Whenever matings occur between the Ab/Ab and aB/aB genotypes, the offspring are removed from the genotypic classes of the parents, as noted above. Consequently, it is impossible to have a stable equilibrium with both the Ab/Ab and aB/aB genotypes persisting in the population. However, if one of the Ab/Ab and aB/aB genotypes is rarer than the other, a greater proportion of the individuals of the rarer genotype will mate with the other +2 genotype.
The result of these complex dynamics is that at most one of the Ab/Ab and aB/aB genotypes has the potential for becoming an absorbing state as the population moves toward an equilibrium, and the winner is the one bearing the alleles that were more frequent in the initial population.

The other mating types in Table 3.4 define “transient states.” For all the rows that do not have a single entry of “1,” there is always a non-zero value for one or more of the two to three absorbing states (assuming r > 0; if r = 0, then this model becomes effectively a single locus model). Thus, every generation, some of the transient mating types will produce progeny that enter an absorbing state, and once there, they are stuck in that state forever under the assumptions of the model. The mathematical consequence of this is straightforward: the genotype frequencies of AB/AB and ab/ab can only increase with time while the genotype frequencies of all other genotypes can only decrease, with the possible exception of either Ab/Ab or aB/aB, but not both. This process will continue until all genotypes are in absorbing states. This results in three possible equilibrium populations. The initial frequencies of the A and B alleles determine to which of these three possible states the population evolves, as shown in Table 3.5.

Table 3.5 The equilibrium populations under a two-locus model of 100% assortative mating.

              Initial Gene Pool
Genotypes     pA = pB     pA < pB     pA > pB
AB/AB         pA          pA          pB
Ab/Ab         0           0           pA − pB
aB/aB         0           pB − pA     0
ab/ab         pb          pb          pa

Suppose we started with an initial population with a gene pool of gAB, gAb, gaB, and gab, and therefore initial allele frequencies of pA = gAB + gAb and pB = gAB + gaB. In general, we can see from Table 3.5 that the population will evolve to a gene pool with a different set of gamete frequencies than the initial gene pool. Hence, assortative mating by itself causes evolutionary change by creating linkage disequilibrium. Note that linkage disequilibrium D exists (D = pApb or papB) in the equilibrium assortative mating population because there are only two or three gamete types at equilibrium, depending upon the initial conditions. Even if we had started with all four possible gamete types with no linkage disequilibrium, assortative mating in this case would have generated linkage disequilibrium. Table 3.4 represents an extreme version of assortative mating under a simplistic genotype–phenotype model. However, it does capture some general consequences of assortative mating under less extreme conditions:

• First, as we saw with the single locus models, assortative mating increases the frequency of homozygotes at the expense of heterozygotes.
• Second, multiple equilibria exist, and the evolution of the population is determined by the state of its initial gene pool. This also is a general feature of multi-locus assortative mating models and illustrates that historical factors are a determinant of the course of evolution. In evolution, the present is constrained by the past. As we will see throughout this book, models that are more complex than the original Hardy–Weinberg model often have multiple possible evolutionary outcomes in which historical factors play a dominant role.
• Third, assortative mating can create and maintain linkage disequilibrium. Note that in Table 3.5, two gamete types usually dominate the predicted equilibrium gene pool: AB and ab. Both the A and B contribute +1 to the phenotype, and both a and b contribute 0 to the phenotype. Hence, the disequilibrium created by assortative mating in this case places alleles together in gametes that have similar phenotypic effects. This is also a general feature of the evolutionary impact of assortative mating: assortative mating causes gametes to bear alleles at different loci that cause similar phenotypic effects.
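As a minimal sketch, the mapping from an initial gene pool to the equilibrium genotype frequencies of Table 3.5 can be written as a short Python function. The function and argument names (`assortative_equilibrium`, `equilibrium_D`, `g_AB`, and so on) are illustrative choices, not notation from the text. The code only encodes the three columns of Table 3.5; it does not simulate the approach to equilibrium.

```python
def assortative_equilibrium(g_AB, g_Ab, g_aB, g_ab):
    """Equilibrium genotype frequencies read off Table 3.5.

    Arguments are the initial gamete frequencies (hypothetical names).
    """
    total = g_AB + g_Ab + g_aB + g_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("gamete frequencies must sum to 1")
    p_A = g_AB + g_Ab  # initial frequency of allele A
    p_B = g_AB + g_aB  # initial frequency of allele B
    return {
        "AB/AB": min(p_A, p_B),        # pA when pA <= pB, otherwise pB
        "Ab/Ab": max(p_A - p_B, 0.0),  # non-zero only when pA > pB
        "aB/aB": max(p_B - p_A, 0.0),  # non-zero only when pA < pB
        "ab/ab": 1.0 - max(p_A, p_B),  # pb when pA <= pB, otherwise pa
    }


def equilibrium_D(eq):
    """Linkage disequilibrium of the equilibrium gene pool.

    Every individual is homozygous at equilibrium, so gamete frequencies
    equal the corresponding homozygous genotype frequencies.
    """
    return eq["AB/AB"] * eq["ab/ab"] - eq["Ab/Ab"] * eq["aB/aB"]


if __name__ == "__main__":
    # Initial gene pool with D = 0, pA = 0.6 and pB = 0.3 (illustrative values).
    eq = assortative_equilibrium(0.18, 0.42, 0.12, 0.28)
    print(eq, equilibrium_D(eq))
```

With these illustrative starting values (pA = 0.6, pB = 0.3, no initial disequilibrium), the sketch gives approximately AB/AB = 0.3, Ab/Ab = 0.3, aB/aB = 0, and ab/ab = 0.4, so that D = papB = 0.12 at equilibrium even though D was 0 initially.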

As the above features reveal, our earlier conclusion that assortative mating is not an evolutionary force must be abandoned as soon as we leave the realm of one-locus models. Indeed, at the multi-locus level, assortative mating is an extremely powerful microevolutionary force. Also, note that assortative mating in the extreme model given in Table 3.4 splits the original population into genetic subsets (AB/AB and ab/ab, and, sometimes, Ab/Ab or aB/aB but never both) that are reproductively isolated from one another. Because of its ability to split a population into genetically differentiated and isolated subsets, assortative mating is also considered to be a powerful force not only in microevolution but in the origin of species as well.

The assortative mating for deafness is less extreme than that given in Table 3.4, and, moreover, the genotype–phenotype relationship is quite different. Nevertheless, assortative mating for deafness will also create linkage disequilibrium in which bearers of a deaf allele at one locus tend to also be bearers of deaf alleles at other loci (Aoki and Feldman 1994). This disequilibrium also augments the incidence of genetic deafness in human populations. For example, let us return to the GJB2 locus. As shown earlier, we expect only about 1/16 of the deaf couples to have deaf children homozygous for 35delG when the GJB2 locus is considered by itself. Given the extreme rarity of alleles for deafness at all the other loci, we would normally dismiss the possibility that the remaining 15/16 of the deaf couples would have any substantial risk of having deaf children. Thus, we would normally expect only about 1 in 16 children of deaf couples to be deaf. But as noted earlier, the actual figure is closer to 1 in 6 (Koehn et al. 1990). What is going on here?

The answer is linkage disequilibrium. For example, Vona et al. (2014) sequenced deaf and control individuals for 80 of the genes known to be associated with deafness. Fifty-two percent of the deaf individuals were deaf due to a genetic cause. The deaf individuals as a whole bore an average of 3.7 deleterious variants in the 80 genes, whereas the controls bore an average of 1.4, and even the deaf individuals who were deaf for non-genetic reasons had a significant enrichment of deleterious alleles in these 80 genes. Hence, the assortative mating for deafness placed together in the same deaf individuals multiple alleles that were extremely rare in the general population; that is, assortative mating has changed the human gene pool by creating linkage disequilibrium among loci contributing to deafness. For example, a child who is deaf because of homozygosity for a recessive allele at a locus other than GJB2 may still be a heterozygous carrier for the 35delG allele because of past assortative mating. If such a deaf individual marries a 35delG homozygote, half of their children will be deaf even though both parents are deaf because of homozygosity of recessive alleles at different loci. Overall, assortative mating greatly increases the chances that marriages among deaf people will yield deaf children at the multi-locus level. The linkage disequilibrium that is induced by assortative mating, along with the increased homozygosity also induced by assortative mating, explains why autosomal recessive deafness occurs far more frequently in humans than expected under random mating.

Assortative Mating Versus Inbreeding

As seen above, both assortative mating and inbreeding increase homozygosity and both can be measured by f. Despite these similarities, there are important differences between assortative mating and inbreeding. One of the critical differences between these two systems of non-random mating is that assortative mating by itself is a powerful force for evolutionary change at the multi-locus level, creating linkage disequilibrium among alleles having similar phenotypic effects. As we noted earlier, inbreeding is not an evolutionary force by itself at the single locus level, but how about at the multi-locus level? Table 3.6 presents the most extreme form of inbreeding possible in a two-locus model – complete selfing. Under 100% selfing, an individual can only mate with itself (many plant species and a few animals have this system of mating), so for a two-locus model with 10 possible genotypes, there are only 10 possible mating types (Table 3.6).

Table 3.6 Hundred percent selfing for a two-loci, two-allele genetic architecture. Each row gives a mating type followed by its offspring genotypes and their probabilities; offspring genotypes not listed have probability 0, and r is the recombination frequency.

Mating Type        Offspring Genotypes (Probability)
AB/AB × AB/AB      AB/AB: 1
AB/Ab × AB/Ab      AB/AB: 1/4; AB/Ab: 1/2; Ab/Ab: 1/4
AB/aB × AB/aB      AB/AB: 1/4; AB/aB: 1/2; aB/aB: 1/4
AB/ab × AB/ab      AB/AB: (1/4)(1−r)²; AB/Ab: (1/2)(1−r)r; AB/aB: (1/2)(1−r)r; AB/ab: (1/2)(1−r)²; Ab/aB: (1/2)r²; Ab/Ab: (1/4)r²; aB/aB: (1/4)r²; Ab/ab: (1/2)(1−r)r; aB/ab: (1/2)(1−r)r; ab/ab: (1/4)(1−r)²
Ab/aB × Ab/aB      AB/AB: (1/4)r²; AB/Ab: (1/2)(1−r)r; AB/aB: (1/2)(1−r)r; AB/ab: (1/2)r²; Ab/aB: (1/2)(1−r)²; Ab/Ab: (1/4)(1−r)²; aB/aB: (1/4)(1−r)²; Ab/ab: (1/2)(1−r)r; aB/ab: (1/2)(1−r)r; ab/ab: (1/4)r²
Ab/Ab × Ab/Ab      Ab/Ab: 1
aB/aB × aB/aB      aB/aB: 1
Ab/ab × Ab/ab      Ab/Ab: 1/4; Ab/ab: 1/2; ab/ab: 1/4
aB/ab × aB/ab      aB/aB: 1/4; aB/ab: 1/2; ab/ab: 1/4
ab/ab × ab/ab      ab/ab: 1

In examining Table 3.6 for absorbing versus transient states, you should discover that this model of 100% selfing has four universal absorbing states in contrast to the two observed with the model of 100% assortative mating given in Table 3.4. The four absorbing states correspond to selfing of the four homozygous genotypes: AB/AB, Ab/Ab, aB/aB, and ab/ab. At equilibrium, these are the only genotypes present in the population, and, obviously, the frequencies of the gametes AB, Ab, aB, and ab correspond to the respective homozygous genotype frequencies. Hence, we already see a big difference between 100% selfing and 100% assortative mating; all four gamete types can exist under selfing, whereas only two or three can exist at equilibrium under the 100% assortative mating model given in Table 3.4. Also, recall that if the initial population had no disequilibrium (D = 0), 100% assortative mating would create linkage disequilibrium (Table 3.5). In contrast, detailed mathematical analysis of the transition matrix defined by Table 3.6 reveals that no disequilibrium is created by 100% selfing (Karlin 1969); the gamete frequencies remain unchanged when there is no initial disequilibrium, so even this extreme form of inbreeding is not an evolutionary force by itself at the two-locus level in this situation. This is a general difference between inbreeding and assortative mating: assortative mating actively generates linkage disequilibrium, whereas inbreeding does not. However, this does not mean that inbreeding has no effect on multi-locus evolution. Consider what happens when the initial population starts with some linkage disequilibrium.
We saw in Chapter 2 that under random mating, the disequilibrium dissipates according to the equation $D_t = D_0(1 - r)^t$, where $D_0$ is the initial disequilibrium, r is the recombination frequency, and $D_t$ is the amount of disequilibrium remaining after t generations of random mating. In deriving this equation, we had noted that disequilibrium is dissipated only in the offspring of double heterozygotes because recombination is effective in creating new recombinant gamete types only in double heterozygotes. Because inbreeding reduces the frequency of heterozygotes in general, it also reduces the frequency of double heterozygotes in particular. Therefore, the opportunity for recombination to dissipate disequilibrium is reduced under inbreeding simply because there are fewer double heterozygotes. In the extreme case of 100% selfing (Table 3.6), the equilibrium population consists only of homozygotes, so no dissipation of linkage disequilibrium is possible as the population nears genotypic equilibrium. Therefore, when starting with an initial population with some linkage disequilibrium, there is a dynamic race between recombination dissipating linkage disequilibrium through the increasingly rare double heterozygotes and the approach to the equilibrium population in which no further dissipation is possible. Under 100% selfing, the approach to homozygosity is so rapid that not all of the initial linkage disequilibrium is dissipated, meaning that some linkage disequilibrium can persist indefinitely under 100% selfing (Karlin 1969). Under models of less extreme inbreeding, linkage disequilibrium is eventually reduced to 0, just as it is under random mating, but at a reduced rate relative to its decay under random mating. Consequently, by itself, inbreeding, in all but its most extreme form of 100% selfing, is not ultimately an evolutionary force at either the single locus or two-locus level.
This is in great contrast to assortative mating, which is a powerful multi-locus evolutionary force.

There is another major difference between assortative mating and inbreeding at the multi-locus level. Assortative mating affects the genotype frequencies only of those loci that contribute to the phenotype that affects mating and other loci that are in linkage disequilibrium with them. In contrast, inbreeding (in the system of mating sense) is based on choosing mates by pedigree relationship; hence, all loci are affected by inbreeding. Recall that inbreeding increases the incidence of all genetic diseases associated with rare, autosomal recessive alleles. In contrast, children from marriages between deaf people have no increased risk for genetic diseases other than deafness and diseases associated with deafness through pleiotropy. Hence, assortative mating is locus specific whereas system-of-mating inbreeding alters the genotype frequencies at all loci.

The fact that different genetic elements in the same populations can display different systems of mating is illustrated by studies on the fly Sciara ocellaris (Perondini et al. 1983). This species shows much chromosomal variability involving the structural modification of single bands of the polytene chromosomes. These band variants define what behaves as a single locus, two-allele genetic architecture. When the genotypes obtained from mated pairs were examined, five of these polymorphic band systems fit well to a random mating model, seven showed assortative mating, and three deviated from random mating in other ways. Note that the deviations from random mating cannot be attributed to inbreeding in this case because not all loci show a positive f, as would be expected under inbreeding. What, then, is the system of mating for these flies? This question is unanswerable because we must first define the genetic system of interest. These flies are randomly mating for some polymorphisms, and assortatively mating for others. There is no such thing as the system of mating for a population; rather, there is only the system of mating for a specific genetic architecture within the population.
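The behavior of linkage disequilibrium under 100% selfing described in this section can be explored numerically. The following Python sketch iterates complete selfing for the two-locus model. Rather than transcribing Table 3.6, it derives each offspring distribution from the gametes that a genotype produces for a given recombination frequency r, which yields the same kind of transition probabilities. All function names are hypothetical, and the starting zygotes are assumed to be formed by random union of gametes, an assumption made here purely for illustration.

```python
from itertools import product

HAPLOTYPES = ["AB", "Ab", "aB", "ab"]


def gamete_output(genotype, r):
    """Gamete probabilities produced by one genotype (a pair of haplotypes)."""
    h1, h2 = genotype
    # Recombinant haplotypes; they differ from the parental ones only in
    # double heterozygotes.
    rec1, rec2 = h1[0] + h2[1], h2[0] + h1[1]
    out = {h: 0.0 for h in HAPLOTYPES}
    out[h1] += (1 - r) / 2
    out[h2] += (1 - r) / 2
    out[rec1] += r / 2
    out[rec2] += r / 2
    return out


def random_union(pool):
    """Zygote frequencies from random union of gametes (assumed starting point)."""
    geno = {}
    for (g1, p1), (g2, p2) in product(pool.items(), repeat=2):
        key = tuple(sorted((g1, g2), key=HAPLOTYPES.index))
        geno[key] = geno.get(key, 0.0) + p1 * p2
    return geno


def self_once(geno_freqs, r):
    """One generation of 100% selfing: both gametes come from the same parent."""
    new = {}
    for genotype, freq in geno_freqs.items():
        gam = gamete_output(genotype, r)
        for (g1, p1), (g2, p2) in product(gam.items(), repeat=2):
            child = tuple(sorted((g1, g2), key=HAPLOTYPES.index))
            new[child] = new.get(child, 0.0) + freq * p1 * p2
    return new


def gamete_pool(geno_freqs, r):
    """Gamete frequencies produced by the whole population."""
    pool = {h: 0.0 for h in HAPLOTYPES}
    for genotype, freq in geno_freqs.items():
        for h, p in gamete_output(genotype, r).items():
            pool[h] += freq * p
    return pool


def disequilibrium(pool):
    return pool["AB"] * pool["ab"] - pool["Ab"] * pool["aB"]


def selfing_limit(pool0, r, generations=200):
    """Gamete pool after many generations of selfing, starting from pool0."""
    geno = random_union(pool0)
    for _ in range(generations):
        geno = self_once(geno, r)
    return gamete_pool(geno, r)
```

Starting this sketch from gamete frequencies with D = 0 returns the same gamete frequencies after many generations, whereas a start with D ≠ 0 retains part of its disequilibrium. Under the stated starting assumption, the retained fraction in this sketch is 1/(1 + 2r), which follows from summing the geometric decline of the two double-heterozygote classes. That closed form is offered only as a check on the code, not as a result quoted from Karlin (1969).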

Assortative Mating, Linkage Disequilibrium, and Admixture

As pointed out above, assortative mating directly affects the genotype and gamete frequencies of the loci that contribute to the phenotype for which assortative mating is occurring and of any loci in linkage disequilibrium with them. Generally, this means that assortative mating does not have a genome-wide effect in contrast to inbreeding as a system of mating. However, if there is extensive disequilibrium among loci throughout the genome, assortative mating can potentially have a genome-wide impact. Indeed, this circumstance is not a particularly rare one. To see why, we need to consider another source of linkage disequilibrium. In Chapter 2, we saw that mutation will create linkage disequilibrium. Now, we consider another factor that can create linkage disequilibrium, but unlike mutation, this factor can create massive amounts of disequilibrium between loci scattered all over the genome – even loci on different chromosomes. This evolutionary factor is admixture.

Admixture occurs when two or more genetically distinct subpopulations are mixed together and begin interbreeding. As will be detailed in Chapters 6 and 7, many species have local populations that have little to no genetic interchange with one another for many generations, causing those subpopulations to become genetically differentiated from one another (that is, they acquire different allele frequencies at many loci). Such reproductive isolation is often temporary, and events can occur that bring such differentiated subpopulations back together in the same area, allowing them to interbreed. This mixing together of previously differentiated subpopulations induces linkage disequilibrium in the admixed population even if no disequilibrium existed in either of the original subpopulations.
For example, suppose a species consists of two subpopulations that have different allele frequencies at two loci, say the A locus (with alleles A and a) and the B locus (with alleles B and b). Let p1 be the frequency of A in subpopulation 1, and p2 be its frequency in subpopulation 2. Because we defined these populations to be genetically differentiated at the A locus, we know that p1 ≠ p2. Similarly, let k1 and k2 be the frequencies of the B allele in the two subpopulations, once again with k1 ≠ k2. We also assume that there is no linkage disequilibrium between the A and the B loci within either subpopulation. Finally, suppose these two subpopulations are brought together to form a new, admixed population such that a fraction m of the genes is derived from subpopulation 1 and 1 − m is derived from subpopulation 2 in this new admixed population. Then, in the first generation in which these two subpopulations are mixed together, the gamete frequencies in the newly created hybrid population are:

$$
\begin{aligned}
g_{AB} &= m p_1 k_1 + (1 - m) p_2 k_2 \\
g_{Ab} &= m p_1 (1 - k_1) + (1 - m) p_2 (1 - k_2) \\
g_{aB} &= m (1 - p_1) k_1 + (1 - m)(1 - p_2) k_2 \\
g_{ab} &= m (1 - p_1)(1 - k_1) + (1 - m)(1 - p_2)(1 - k_2)
\end{aligned}
\tag{3.5}
$$

From these gamete frequencies, we can calculate the linkage disequilibrium (see Eq. 2.4) in this newly admixed population to be (after some algebra):

$$
D_{\text{admixture}} = m (1 - m)(p_1 - p_2)(k_1 - k_2) \tag{3.6}
$$
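The algebra connecting the admixed gamete frequencies to the disequilibrium formula can be checked numerically. The sketch below, with hypothetical function names, computes the four gamete frequencies of the admixed population, evaluates D directly as gAB·gab − gAb·gaB, and compares it with m(1 − m)(p1 − p2)(k1 − k2). The example values p1 = 0.1, k1 = 0.3, p2 = 0.9, k2 = 0.7, and m = 0.5 are read off the gamete frequencies shown in Figure 3.7.

```python
def admixed_gametes(m, p1, k1, p2, k2):
    """Gamete frequencies in the first generation of admixture.

    m is the fraction of genes from subpopulation 1; p and k are the
    frequencies of alleles A and B, with no disequilibrium assumed within
    either subpopulation.
    """
    return {
        "AB": m * p1 * k1 + (1 - m) * p2 * k2,
        "Ab": m * p1 * (1 - k1) + (1 - m) * p2 * (1 - k2),
        "aB": m * (1 - p1) * k1 + (1 - m) * (1 - p2) * k2,
        "ab": m * (1 - p1) * (1 - k1) + (1 - m) * (1 - p2) * (1 - k2),
    }


def linkage_disequilibrium(g):
    """D computed directly from the four gamete frequencies."""
    return g["AB"] * g["ab"] - g["Ab"] * g["aB"]


def d_admixture(m, p1, k1, p2, k2):
    """Closed-form disequilibrium created by admixture."""
    return m * (1 - m) * (p1 - p2) * (k1 - k2)


if __name__ == "__main__":
    gametes = admixed_gametes(0.5, 0.1, 0.3, 0.9, 0.7)
    print(gametes)
    print(linkage_disequilibrium(gametes), d_admixture(0.5, 0.1, 0.3, 0.9, 0.7))
```

For the Figure 3.7 values, both routes give D of about 0.08, with combined gamete frequencies of about 0.33, 0.17, 0.17, and 0.33.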

Note that as long as the two original subpopulations are genetically differentiated at the A and B loci (p1 ≠ p2 and k1 ≠ k2), then Dadmixture ≠ 0. An example of this is shown in Figure 3.7. Thus, hybridization or admixture between two subpopulations creates linkage disequilibrium between all pairs of loci that had different allele frequencies in the original subpopulations.

Figure 3.7 The creation of linkage disequilibrium (D) by admixture between two populations with differentiated gene pools. Gamete frequencies and D shown in the figure:

                                   AB      Ab      aB      ab      D
Subpopulation 1                    0.03    0.07    0.27    0.63    (0.03)(0.63) − (0.07)(0.27) = 0
Subpopulation 2                    0.63    0.27    0.07    0.03    (0.63)(0.03) − (0.27)(0.07) = 0
Combined Population (50:50 Mix)    0.33    0.17    0.17    0.33    (0.33)(0.33) − (0.17)(0.17) = 0.08

Now suppose that locus A influences a trait for which there is assortative mating. Assortative mating would alter the genotype frequencies at the A locus within both subpopulations, but assortative mating would have no impact on the B locus genotype frequencies within either subpopulation because there is no linkage disequilibrium within either subpopulation. However, once admixture occurs, there is now disequilibrium between the A and the B loci. Now, assortative mating on the A trait would also have an impact on the B locus genotype frequencies in the admixed population. One aspect of this impact is to increase homozygosity above random mating expectations in the admixed population. Recall from Chapter 2 that a single generation of random mating is sufficient to establish one-locus Hardy–Weinberg genotype frequencies. However, when two or more previously differentiated subpopulations are mixed together with assortative mating on a trait that differs in initial frequency among the subpopulations, Hardy–Weinberg genotype frequencies are established only slowly even at loci not affecting the trait causing assortment. This gradual approach to Hardy–Weinberg genotype frequencies occurs as the linkage disequilibrium caused by admixture between these loci and the loci influencing the assorting trait gradually breaks down.
A second impact stems directly from the first; decreased frequencies of heterozygotes reduce the dissipation of linkage disequilibrium. Hence, multi-locus Hardy–Weinberg equilibrium is also delayed relative to random mating expectations, and this prolongs the time it takes to achieve single-locus Hardy–Weinberg genotype frequencies. A third impact is that the mixed subpopulations do not fuse immediately, but rather the total population is stratified into genetically differentiated subcomponents that reflect the original historical subpopulations.

As an example, consider the colonization of North America following the voyages of Columbus. This colonization brought together European, Native American, and sub-Saharan African human subpopulations that had allele frequency differences at many loci. Once these subpopulations were brought together in the New World, admixture began. However, there was assortative mating for skin color that was in turn associated with socio-economic status. The skin color differences between Europeans and sub-Saharan Africans are due to about 13 loci (Pośpiech et al. 2014). There is no evidence that humans tend to mate non-randomly for most blood group loci, but Europeans and Africans show many allele frequency differences for such loci. As a result, there is strong disequilibrium between the skin color loci and blood group loci in the U.S. population as a whole. Assortative mating for skin color has therefore led to the persistence of allele frequency differences at blood group loci in modern-day U.S. European Americans and African Americans despite centuries of admixture. Hence, the U.S. population in toto is not a Hardy–Weinberg population. When regarded as a single entity, the U.S. population has too many homozygotes for almost all loci and extensive linkage disequilibrium throughout the genome. A similar situation due to assortative mating exists in Latin America (Norris et al. 2019).
What is even more remarkable about admixed populations is that even assortative mating for a non-genetic trait can maintain allele frequency differences for many generations. All that is needed is for a phenotype causing assortative mating to be associated with the different historical subpopulations. For example, for reasons to be discussed in the next chapter, people of the Amish religion in the United States have different allele frequencies at many loci from the surrounding non-Amish populations due to the manner in which their initial populations were founded a couple of centuries ago. There is assortative mating by religion in the United States, and this has led to the persistence of these initial genetic differences between the Amish and their non-Amish neighbors over these centuries. Even fruit flies can learn mating preferences that can be passed on through cultural inheritance (Danchin et al. 2018), resulting in the potential for cultural assortative mating. All of these examples reveal that assortative mating, particularly at the multi-locus level, is a powerful evolutionary force. We now turn our attention to the opposite of assortative mating – preferential mating of individuals with dissimilar phenotypes – to see if this system of mating is likewise a powerful evolutionary force.

Disassortative Mating

Disassortative mating is the preferential mating of individuals with dissimilar phenotypes. This means that there is a negative correlation between the phenotypes of mating individuals. For example, the Major Histocompatibility Complex (MHC), mentioned in Chapter 1, is found not only in humans but in mice as well. In mice, genetic variation in MHC induces odor differences. There is disassortative mating at this gene complex in mice that is due to olfactory discrimination of potential mates (Potts and Wakeland 1993). Raccoons (Procyon lotor) also use olfaction for individual recognition, and Santos et al. (2017) have shown that females perform MHC disassortative mate choice. Interestingly, males build social coalitions that defend territories and monopolize females, and this male–male social choice is assortative for MHC. Disassortative mating for MHC is not limited to mammals but has also been found in the seabird, Leach’s storm-petrel (Oceanodroma leucorhoa) (Hoover et al. 2018).

Figure 3.8 A model of 100% disassortative mating for a phenotype determined by a single autosomal locus with two alleles and with all three possible genotypes having distinct phenotypes. G’s refer to genotype frequencies and T’s to phenotype frequencies. SUM = GAA × GAa + GAA × Gaa + GAa × Gaa and is needed to standardize the mating probabilities to sum to one. The stages shown in the figure are:

Zygotic Population: AA, Aa, and aa, with frequencies GAA, GAa, and Gaa.
Mechanisms of Producing Phenotypes: each genotype produces its own phenotype with probability 1.
Phenotypes of adult population: TAA = GAA, TAa = GAa, and Taa = Gaa.
Mechanisms of Uniting Gametes (Disassortative Mating), giving the Mated Population: TAA × TAa = AA × Aa with frequency GAAGAa/SUM; TAA × Taa = AA × aa with frequency GAAGaa/SUM; TAa × Taa = Aa × aa with frequency GAaGaa/SUM.
Mechanisms of Zygotic Production (Mendel’s First Law): AA × Aa yields 1/2 AA and 1/2 Aa; AA × aa yields only Aa; Aa × aa yields 1/2 Aa and 1/2 aa.
Zygotic Population of Next Generation:

$$
G'_{AA} = \frac{\tfrac{1}{2} G_{AA} G_{Aa}}{\mathrm{SUM}}, \qquad
G'_{Aa} = \frac{G_{AA} G_{aa} + \tfrac{1}{2} G_{AA} G_{Aa} + \tfrac{1}{2} G_{Aa} G_{aa}}{\mathrm{SUM}}, \qquad
G'_{aa} = \frac{\tfrac{1}{2} G_{Aa} G_{aa}}{\mathrm{SUM}}
$$

To see the impact of disassortative mating, consider the simple one-locus, two-allele model with a one-to-one genotype–phenotype mapping and 100% disassortative mating given in Figure 3.8. As with assortative mating, the genotype frequencies in general will change over the generations in this model of disassortative mating. For example, if we start out at Hardy–Weinberg frequencies with p = 0.25, then the initial random mating heterozygote frequency of 0.375 is altered by a single generation of disassortative mating to 0.565. Note that disassortative mating induces a heterozygote excess with respect to Hardy–Weinberg expectations, leading to f < 0 – exactly the opposite of assortative mating. This is exactly what was observed in the disassortative mating in raccoons and the storm-petrel – such mating resulted in high MHC heterozygosity and genetic diversity in the offspring (Santos et al. 2017; Hoover et al. 2018). Moreover, unlike assortative mating, the allele frequencies in general are changing. In the example given above, the allele frequency changed from 0.25 to 0.326 in a single generation. Therefore,
disassortative mating can be a powerful evolutionary force even at the single locus level. Moreover, disassortative mating is a powerful factor in maintaining genetic polymorphisms. In the example given in Figure 3.8, we can see that if we start at p = 0.25, then p would increase to 0.326 in a single generation of disassortative mating. Similarly, using the equations shown in Figure 3.8, we can see that if we start with p = 0.75, then p would decrease to 0.674 in a single generation of disassortative mating. Thus, disassortative mating is pushing the allele frequencies to an intermediate value in this case. Indeed, we can keep iterating the disassortative mating model given in Figure 3.8 to find that it quickly stabilizes at GAA = 0.4175, GAa = 0.5361, and Gaa = 0.0464, yielding an equilibrium allele frequency of A of 0.6856 and an equilibrium f of 1 − GAa/(2pq) = −0.2435 (note that once again f is not measuring the avoidance of system-of-mating inbreeding in this case, but rather the impact of disassortative mating). This is a true equilibrium, but different initial conditions will go to different equilibria, although all are polymorphic. In general, disassortative mating results in stable equilibrium populations with intermediate allele frequencies and f < 0.

The reason for this intermediate stability is that any individual with a rare phenotype has an inherent mating advantage under a disassortative mating system: the rarer you are, the more individuals in the deme have a phenotype dissimilar to yours and hence “prefer” you as a mate. “Prefer” is in quotation marks because, just as with assortative mating, disassortative mating does not always result from an active choice of mates but can arise from other causes. For example, many plants show disassortative mating for gametophytic self-sterility loci (S loci) (Vekemans and Slatkin 1994).
These loci have alleles such that a grain of pollen bearing a particular allele at such an S locus can only successfully fertilize an ovum from a plant not having that same allele. This results in strong disassortative mating at S loci. Because rare phenotypes have such a large mating advantage in disassortative mating systems, new mutations associated with novel phenotypes have a tendency to increase in frequency. As a result, loci with disassortative mating systems tend to be polymorphic not just for two alleles, but for many alleles. We already saw in Chapter 1 that the MHC complex in humans has enormous levels of genetic variation, and the same is true for the homologous MHC complexes in other species, such as mice. The disassortative mating at these complexes is thought to be one of the contributors (but not the sole contributor) to these high levels of polymorphism. Similarly, 45 alleles at a self-sterility locus were found in a rare plant species (Oenothera organensis) with a total population size of only 500–1000 individuals (Wright 1969). Moreover, molecular analyses of S alleles indicate that they have persisted as polymorphisms for millions of years, even remaining polymorphic through speciation events (Ioerger et al. 1991; Vekemans and Slatkin 1994). The same is true for the MHC complex. For example, some of the alleles at human MHC-DRB loci mentioned in Chapter 1 have persisted as distinct allelic lineages since the time of the split of the old world monkeys from apes (over 25 million years ago) (Zhu et al. 1991; Stevens et al. 2013). This means that you can be heterozygous for two alleles at an MHC locus such that the allele you inherited from your father is evolutionarily more closely related to an allele from a pigtail macaque (Macaca nemestrina – an old-world monkey) than it is to the allele you inherited from your mother. 
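The polymorphism-maintaining behavior of the single-locus model in Figure 3.8 can be reproduced with a few lines of Python. This is a minimal sketch of the recurrence equations given in that figure; the function names are hypothetical, and the starting populations are assumed to be in Hardy–Weinberg proportions, as in the worked numbers quoted earlier in this section.

```python
def hardy_weinberg(p):
    """Genotype frequencies (AA, Aa, aa) at Hardy-Weinberg proportions."""
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


def disassortative_step(G_AA, G_Aa, G_aa):
    """One generation of the 100% disassortative mating model of Figure 3.8."""
    # Only matings between unlike genotypes occur; SUM rescales their
    # probabilities so that they add to one.
    total = G_AA * G_Aa + G_AA * G_aa + G_Aa * G_aa
    if total == 0:
        raise ValueError("at least two genotypes must be present for mating")
    new_AA = 0.5 * G_AA * G_Aa / total
    new_Aa = (G_AA * G_aa + 0.5 * G_AA * G_Aa + 0.5 * G_Aa * G_aa) / total
    new_aa = 0.5 * G_Aa * G_aa / total
    return new_AA, new_Aa, new_aa


def allele_freq_A(G_AA, G_Aa, G_aa):
    """Frequency of the A allele."""
    return G_AA + 0.5 * G_Aa


def f_statistic(G_AA, G_Aa, G_aa):
    """f = 1 - GAa / (2pq), the deviation from Hardy-Weinberg heterozygosity."""
    p = allele_freq_A(G_AA, G_Aa, G_aa)
    return 1.0 - G_Aa / (2.0 * p * (1.0 - p))


if __name__ == "__main__":
    for start in (0.25, 0.75):
        genotypes = hardy_weinberg(start)
        for generation in range(1, 11):
            genotypes = disassortative_step(*genotypes)
            print(start, generation,
                  [round(g, 4) for g in genotypes],
                  round(allele_freq_A(*genotypes), 4),
                  round(f_statistic(*genotypes), 4))
```

A single step from p = 0.25 gives a heterozygote frequency of about 0.565 and p of about 0.326, and a single step from p = 0.75 gives p of about 0.674, matching the values quoted above. Printing further generations shows how the two starting points move toward intermediate allele frequencies with f < 0.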
Hence, disassortative mating is a strong evolutionary force even at the single locus level that tends to maintain stable polymorphisms for long periods of time. As noted above, disassortative mating also causes deviations from Hardy–Weinberg genotype frequencies and, in particular, causes an excess of heterozygosity. In this sense, disassortative mating “mimics” avoidance of system-of-mating inbreeding (f < 0), but as with assortative mating, disassortative mating is locus specific and does not have a general impact over all loci as does avoidance of inbreeding. For example, the Speke’s gazelle population surveyed in 1982 had a significant negative f for every locus examined and the f’s were statistically indistinguishable across loci
(Templeton and Read 1994). This indicates that the deviation from Hardy–Weinberg in this population was due to avoidance of inbreeding. In contrast, disassortative mating, just as we saw previously with assortative mating, is expected to affect only the loci contributing to the phenotype leading to disassortative mating and any other loci that are in linkage disequilibrium with the phenotypically relevant loci. As we saw earlier for assortative mating, there is no such thing as the system of mating for a deme in general; some systems of mating are specific to particular genetic systems and can be overlaid upon systems of mating that have more generalized effects, such as inbreeding or avoidance of inbreeding. Hence, disassortative mating has very different evolutionary impacts than avoidance of inbreeding, so these two systems of mating should not be equated. We saw earlier how assortative mating can split a deme into genetically isolated subpopulations by creating linkage disequilibrium. Although disassortative mating also tends to place together on the same gamete alleles with opposite phenotypic effects, it is not nearly as effective as assortative mating in generating and maintaining linkage disequilibrium. We saw earlier that the rate at which disequilibrium breaks down depends upon the frequency of double heterozygotes. Because disassortative mating results in heterozygote excesses, it accentuates the effectiveness of recombination in breaking down linkage disequilibrium. Moreover, because dissimilar individuals preferentially mate, there is no possibility of subdividing a deme into genetic/phenotypic subdivisions that are even partially reproductively isolated. Hence, in contrast to assortative mating, disassortative mating prevents demes from genetically subdividing. We also saw earlier that assortative mating can cause the persistence of genetic differences between historically differentiated demes long after contact has been established between the demes. 
In contrast, disassortative mating rapidly destroys genetic differences between historical subpopulations that come together and mate disassortatively – even for non-genetic traits. For example, the Makiritare and Yanomama Indians lived contiguously in South America prior to 1875 but apparently did not interbreed much (Chagnon et al. 1970). As a consequence, most villages of these two tribes had significant genetic differentiation at many loci. Indeed, at several loci, the Makiritare had alleles that were not even present in the Yanomama gene pool. This situation began to change in one area when the Makiritare made contact with Europeans and acquired steel tools. The Yanomama, being more in the interior, did not have contact with non-Native Americans until the 1950s, and even in the 1970s, most Yanomama still had no outside contact. Hence, the Yanomama depended upon the Makiritare for steel tools. The Makiritare demanded sexual access to Yanomama women in exchange for the tools, siring many children who were raised as Yanomama. This also caused much hatred. One group of Yanomama eventually moved away to an area called Borabuk, but before they left, they ambushed the Makiritare and abducted many Makiritare women, who once captive had an average of 7.3 children as compared to 3.8 children per Yanomama woman. Because of this history, there were effectively two generations in which most offspring in the Borabuk Yanomama were actually Yanomama/Makiritare hybrids. This extreme disassortative mating between Yanomama and Makiritare, although not based upon any genetic traits, has led to the current Borabuk Yanomama being virtually identical genetically to the Makiritare, although, culturally, they are still Yanomama (Chagnon et al. 1970). As we have seen in this chapter, systems of mating can be potent evolutionary forces, both by themselves and in interactions with other evolutionary factors. 
In subsequent chapters, we will examine additional interactions between system of mating and other evolutionary forces.


4 Genetic Drift

In deriving the Hardy–Weinberg Law in Chapter 2, we assumed that the population size was infinite. In some derivations of the Hardy–Weinberg Law, this assumption is not stated explicitly, but it enters implicitly by the act of equating allele and genotype probabilities to allele and genotype frequencies. The allele frequency is simply the number of alleles of a given type divided by twice the total population size for an autosomal locus in a diploid species. Likewise, the genotype frequency is simply the number of individuals with a specified genotype divided by the total population size. But in deriving the Hardy–Weinberg Law in Chapter 2, we actually calculated allele and genotype probabilities. For example, we calculated the probability of drawing an A allele from the gene pool as p and then stated that this is also the frequency of the A allele in the next generation. This stems from the common definition that the probability of an event is the frequency of the event in an infinite number of trials (Appendix A). But what happens when the population is finite in size so there is not an infinite number of trials? For example, suppose that a population has a gene pool with two alleles, say H and T, such that the probability of drawing either allele is 0.5 (i.e. p = q = 1/2). Now, suppose that 2N (a finite number) gametes are drawn from this gene pool to form the next generation of N diploid adults. Will the frequency of H and T be 0.5 in this finite population? You can simulate this situation. For example, let N = 5 (corresponding to a sample of 10 gametes), and place 10 coins in a box, shake the box, and count the number of heads (i.e. allele “H”). Suppose that after doing this experiment one time, six heads were observed. Hence, the frequency of the H “allele” in this simulation was 6/10 = 0.6, which is not the same as 0.5, the probability of H. 
Figure 4.1a shows the results of doing this coin flip simulation 20 times, and you are strongly encouraged to do this experiment yourself. As you can see from Figure 4.1a or from your own simulations, when population size is finite, the frequency of an allele in the next generation is often not the same as the probability of drawing that allele from the gene pool. Given our definition of evolution as a change in allele frequency, we note that this random sampling error can induce evolution. Genetic drift is the random change in allele frequency due to sampling error in a finite population. Genetic drift is an evolutionary force that can alter the genetic make-up of a population’s gene pool through time and shows that the Hardy–Weinberg “equilibrium” and its predicted stability of allele frequencies do not hold exactly for any finite population. The purpose of this chapter is to investigate the evolutionary properties and significance of genetic drift.
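The coin box experiment can also be run on a computer. The following is a minimal sketch, assuming that each of the 2N gametes is an independent draw with probability p of being H; the function names, the seed, and the text histogram are illustrative choices and are not part of the original experiment.

```python
import random
from collections import Counter


def sample_allele_count(two_n, p, rng):
    """Draw 2N gametes from a gene pool where P(H) = p; return the number of H alleles."""
    return sum(1 for _ in range(two_n) if rng.random() < p)


def coin_box_experiment(n_diploid, p=0.5, replicates=20, seed=1):
    """Repeat the one-generation sampling experiment and tally the outcomes.

    Returns a Counter mapping 'number of H alleles' to 'number of replicates'.
    The seed is an arbitrary choice that only makes the run repeatable.
    """
    rng = random.Random(seed)
    two_n = 2 * n_diploid
    return Counter(sample_allele_count(two_n, p, rng) for _ in range(replicates))


if __name__ == "__main__":
    n = 5  # N = 5 diploid individuals, so 2N = 10 "coins"
    tally = coin_box_experiment(n)
    for h in range(2 * n + 1):
        # One '#' per replicate that produced h copies of the H allele
        print(f"{h:2d} H alleles (frequency {h / (2 * n):.1f}): {'#' * tally[h]}")
```

Changing the seed gives a different set of 20 outcomes, just as shaking the box again would.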

Population Genetics and Microevolutionary Theory, Second Edition. Alan R. Templeton. © 2021 John Wiley & Sons, Inc. Published 2021 by John Wiley & Sons, Inc. Companion website: www.wiley.com/go/templeton/populationgenetics2

[Figure 4.1 image: two histograms, panel (a) N = 5 and panel (b) N = 10. x-axis: Number of H Alleles in Sample (Frequency of H Alleles in Sample); y-axis: Number of Simulations With Given Outcome.]

Figure 4.1 (a) The number of heads (the H “allele”) observed after shaking a box with 10 coins. The experiment was repeated 20 times. (b) The number of heads (the H “allele”) observed after shaking a box with 20 coins. The experiment was repeated 20 times.

Basic Evolutionary Properties of Genetic Drift

The coin flip simulation shown in Figure 4.1a illustrates that finite population size can induce random changes in allele frequencies due to sampling error. But what exactly is the relationship between genetic drift and finite population size? To answer this question, repeat the coin flip experiment but now use 20 coins for the simulation of a diploid population of N = 10 and 2N = 20 gametes (double our previous population size). Figure 4.1b shows the results of such a simulation repeated 20 times. As can be seen from Figure 4.1b, there are random deviations from 0.5, so genetic drift is still operating in this larger but still finite population. By comparing Figures 4.1a and b, an important property of genetic drift is revealed: the simulated frequencies are more tightly clustered around 0.5 when 2N = 20 (Figure 4.1b) than when 2N = 10 (Figure 4.1a). This means that, on the average, the observed allele frequencies deviate less from the expected allele probability when the population size is larger. Thus, the amount of evolutionary change associated with random


sampling error is inversely related to population size. The larger the population, the less the allele frequency will change on the average. Hence, genetic drift is most powerful as an evolutionary force when N is small. In the coin box experiments, the outcome was about equally likely to deviate above and below 0.5 (Figures 4.1a and b). Hence, for a large number of identical populations, the overall allele frequency remains 0.5, although in any individual population, it is quite likely that the allele frequency will change from 0.5. The fact that deviations are equally likely above and below 0.5 simply means that there is no direction to genetic drift. Although we can see that finite population size is likely to alter allele frequencies due to sampling error, we cannot predict the precise outcome or even the direction of the change in any specific population. The coin box simulations given in Figure 4.1 only simulate one generation of genetic drift starting with an initial allele frequency of 0.5. The coin box simulations do not simulate the impact of drift over multiple generations because the probability of a coin flip producing an H allele remains unchanged at 0.5. However, suppose drift caused the allele frequency to change from 0.5 to 0.6 in one particular population. How about the next generation? Is it equally likely to be above or below 0.5, as it was in the first generation and will always be in our coin flip simulations? The answer is no, drift at one generation is always centered around the allele frequency of the previous generation, and allele frequencies in more ancient generations are irrelevant. Thus, after the allele frequency drifts to 0.6 from 0.5, the probability of drawing an H allele is now 0.6 and sampling error in the second generation is centered around 0.6 and not 0.5. 
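Both points made above can be checked exactly rather than by simulation. The sketch below assumes that the number of H alleles among 2N gametes is binomially distributed, which is the same sampling model as the coin box; under that assumption the standard deviation of the new allele frequency is sqrt(p(1 − p)/(2N)), roughly 0.16 for 2N = 10 and 0.11 for 2N = 20 when p = 0.5. The function names are hypothetical.

```python
from math import comb, sqrt


def binomial_pmf(k, two_n, p):
    """Probability of drawing exactly k copies of H in 2N independent gamete draws."""
    return comb(two_n, k) * p**k * (1 - p) ** (two_n - k)


def frequency_sd(two_n, p):
    """Standard deviation of the next generation's allele frequency under binomial sampling."""
    return sqrt(p * (1 - p) / two_n)


def prob_above_and_below_half(two_n, p):
    """Return (P[frequency > 0.5], P[frequency < 0.5]) in the next generation."""
    # Integer comparisons (2k versus 2N) avoid floating-point ties at exactly 0.5
    above = sum(binomial_pmf(k, two_n, p) for k in range(two_n + 1) if 2 * k > two_n)
    below = sum(binomial_pmf(k, two_n, p) for k in range(two_n + 1) if 2 * k < two_n)
    return above, below


if __name__ == "__main__":
    for two_n in (10, 20):
        print(f"2N = {two_n}: SD of allele frequency = {frequency_sd(two_n, 0.5):.3f}")
    for p in (0.5, 0.6):
        above, below = prob_above_and_below_half(10, p)
        print(f"p = {p}: P(above 0.5) = {above:.3f}, P(below 0.5) = {below:.3f}")
```

With p = 0.5 the two probabilities are equal, whereas with p = 0.6 the probability of remaining above 0.5 exceeds the probability of falling below it, because sampling is now centered on 0.6.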
This in turn means that after two generations of drift and given that the first generation experienced a deviation above 0.5, it is no longer true that deviations will be equally likely above and below 0.5. Once the population drifted to a frequency of 0.6, the next generation’s allele frequency is more likely to stay above 0.5. Under genetic drift, there is no tendency to return to ancestral allele frequencies. With each passing generation, it becomes more and more likely to deviate from the initial conditions. The action of drift over several generations can be simulated using a computer in which each generation drifts around the allele frequency of the previous generation. Figure 4.2 shows the results of 20 replicates of simulated drift in diploid populations of size 10 (2N = 20) over multiple generations, and Figure 4.3 shows the results in populations of size 25 (2N = 50). In both cases, the initial allele frequency starts at 0.5, but, with increasing generation number, more and more of the populations deviate from 0.5, and by larger amounts. As can be seen by contrasting Figure 4.2 (N = 10) with Figure 4.3 (N = 25), the smaller population size tends to have more radical changes in allele frequency in a given amount of time, as was shown in Figure 4.1 for one generation. However, Figure 4.3 shows that even with the larger population size of 25, substantial changes have occurred by generation 10. In general, we expect to obtain larger and larger deviations from the initial conditions with increasing generation time. Figures 4.2 and 4.3 show that N determines the rate of change caused by drift and that even large populations can be affected by drift if given enough time. The evolutionary changes in allele frequencies caused by genetic drift accumulate with time. Also, note in these simulations (particularly for N = 10) that eventually populations tend to go to allele frequencies of 0 (loss of the allele) or 1 (fixation of the allele). 
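A multi-generation simulation of the kind just described can be sketched as follows. This is not the program used to produce Figures 4.2 and 4.3; it is a minimal illustration that assumes discrete generations, constant N, and independent draws of 2N gametes centered on the previous generation's allele frequency. The function names and the seed are hypothetical.

```python
import random


def drift_trajectory(n_diploid, p0=0.5, generations=10, rng=None):
    """Follow one population's allele frequency through successive generations of drift."""
    rng = rng or random.Random()
    two_n = 2 * n_diploid
    p = p0
    path = [p]
    for _ in range(generations):
        # Sampling is centered on the current frequency p, not on the initial p0
        count = sum(1 for _ in range(two_n) if rng.random() < p)
        p = count / two_n
        path.append(p)
    return path


def replicate_final_frequencies(n_diploid, replicates=20, generations=10, seed=42):
    """Allele frequency in each replicate population after the given number of generations."""
    rng = random.Random(seed)
    return [drift_trajectory(n_diploid, 0.5, generations, rng)[-1] for _ in range(replicates)]


if __name__ == "__main__":
    for n in (10, 25):
        finals = sorted(replicate_final_frequencies(n))
        print(f"N = {n}: frequencies after 10 generations: {finals}")
```

Loss and fixation are absorbing in this sketch without any special handling: once p is 0 no H gamete can be drawn, and once p is 1 every gamete drawn is H.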
Genetic drift, like any other evolutionary force, can only operate as an evolutionary force when there is genetic variability. Hence, as long as p is not equal to 0 or 1, drift will cause changes in allele frequency. However, once an allele is lost or fixed, genetic drift can no longer cause allele frequency changes (all evolution requires genetic variation). Once lost or fixed, the allele stays lost or fixed, barring new mutations or the reintroduction of allelic variation by genetic interchange with an outside population. Genetic drift is like a room with flypaper on all the walls. The walls represent loss and fixation, and

[Figure 4.2 image: four histograms (Generations 1, 2, 5, and 10). x-axis: Frequency of H Alleles in a Simulated Population (0.0 to 1.0); y-axis: Number of Simulations With a Given Allele Frequency.]

Figure 4.2 Results of simulating 20 replicates of a finite population of size 10 (2N = 20) for 10 generations starting from an initial gene pool of p = 1/2. The distribution of allele frequencies is shown after 1, 2, 5, and 10 generations of genetic drift.

sooner or later (depending upon population size, which in this analogy is directly related to the size of the room), the fly (allele frequency) will hit a wall and be “stuck.” Genetic drift causes a loss of genetic variation within a finite population. In Figures 4.1–4.3, we simulated several replicates of the initial population. Now suppose that several subpopulations are established from a common ancestral population such that they are all genetically isolated from one another (that is, no gametes are exchanged between the subpopulations). Population subdivision into isolated demes is called fragmentation. Figures 4.1–4.3 can

[Figure 4.3 image: four histograms (Generations 1, 2, 5, and 10). x-axis: Frequency of H Alleles in a Simulated Population (0.0 to 1.0); y-axis: Number of Simulations With a Given Allele Frequency.]

Figure 4.3 Results of simulating 20 replicates of a finite population of size 25 (2N = 50) for 10 generations starting from an initial gene pool of p = 1/2. The distribution of allele frequencies is shown after 1, 2, 5, and 10 generations of genetic drift.

therefore also be regarded as simulations of population fragmentation of a common ancestral population such that the fragmented subpopulations are all of equal size. Note that the ancestral gene pool is the same (p = 0.5) in all the populations simulated. Therefore, these same figures allow us to examine the role of genetic drift upon fragmented populations. Now, we shift our focus from the evolution within each fragmented deme to the evolution of changes between subpopulations. Because of the genetic isolation under fragmentation, drift will operate independently in each subpopulation. Because of the randomness of the evolutionary direction of drift, it is unlikely that all the independent subpopulations will evolve in the same direction. This is shown in Figures 4.1–4.3 by regarding the replicate elements of the histograms as isolated subpopulations. The spread of these histograms around the initial allele frequency shows that different subpopulations evolve


away from 0.5 in different directions and magnitudes. Thus, although each subpopulation began with the same allele frequency, they now have many different allele frequencies. Genetic drift causes an increase of allele frequency differences among finite subpopulations. All of these properties of genetic drift have been demonstrated empirically by Buri (1956), as shown in Figure 4.4. He initiated 107 populations of 8 males and 8 females of the fruit fly Drosophila melanogaster, all with two eye color alleles (bw and bw75) at equal frequency. He then followed the

[Figure 4.4 image: one histogram per generation; x-axis of each histogram: Number of bw75 Alleles (0 to 32). The fixation counts printed beside the histograms are:]

Generation   Number of Populations Fixed for bw   Number of Populations Fixed for bw75
 1            0                                    0
 2            0                                    0
 3            0                                    0
 4            0                                    1
 5            0                                    2
 6            1                                    3
 7            3                                    3
 8            5                                    5
 9            6                                    6
10            7                                    8
11           11                                   10
12           12                                   17
13           12                                   18
14           14                                   21
15           18                                   23
16           23                                   25
17           26                                   26
18           27                                   28
19           30                                   28

Figure 4.4 Allele frequency distributions in 107 replicate populations of Drosophila melanogaster, each of size 16 and with discrete generations. Source: Buri (1956). © 1956, The Society for the Study of Evolution.


evolutionary fate of these replicate populations for 19 generations. Note the following from his experimental results shown in Figure 4.4:

• When allele frequencies are averaged over all 107 populations, there is almost no change from the initial allele frequencies of 0.5. Drift has no direction.

• The chances of any particular population deviating from 0.5 and the magnitude of that deviation increase with each generation. Evolutionary change via drift accumulates with time.

• With increasing time, more and more populations become fixed for one allele or the other. By generation 19, over half of the populations had lost their genetic variation at this locus. Ultimately, all populations are expected to become fixed. Drift causes the loss of genetic variability within a population.

• As alleles are lost by drift, it is obvious that many copies of the remaining allele have to be identical-by-descent. For example, the original gene pool had 16 bw alleles in it, but those populations that are fixed for the bw allele have 32 copies of that allele. Moreover, some of these copies of the bw allele found in a fixed subpopulation are descended via DNA replication from just one of the original bw copies found in the initial population. When two or more copies of a gene are of the same allelic state and descended via DNA replication from a single common gene in some initial reference population, the genes are said to be identical-by-descent, as noted in Chapter 3. In the subpopulations fixed for the bw allele in Buri’s experiments (and similarly for those fixed for bw75), many individuals will be homozygous for alleles that are identical-by-descent. This means that the copy of the gene the individual received from its mother is identical-by-descent to the copy it received from its father. As fixation proceeds, homozygosity from identity-by-descent tends to increase with each succeeding generation subject to drift. Drift causes the average probability of identity-by-descent to increase within a population.

• All populations started out with identical gene pools, but with time, the populations deviate not only from the ancestral condition but from each other as well. For example, at generation 19, 30 populations are fixed for bw and 28 for bw75. These populations no longer share any alleles at this locus, even though they were derived from genetically identical ancestral populations. Drift causes an increase of genetic differences between populations.
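The properties listed above can be explored with an idealized simulation laid out like Buri's experiment: 107 replicate populations of 16 diploid individuals, each starting at an allele frequency of 0.5 and followed for 19 generations. The sketch assumes ideal random sampling of 2N = 32 gametes every generation, so its fixation counts should not be expected to reproduce the fly data in Figure 4.4. The function names, summary keys, and seed are hypothetical.

```python
import random
from statistics import mean, pvariance


def simulate_replicates(n_pops=107, n_diploid=16, generations=19, p0=0.5, seed=7):
    """Return a list (one entry per generation, starting with generation 0)
    of the allele counts in every replicate population."""
    rng = random.Random(seed)
    two_n = 2 * n_diploid
    counts = [round(p0 * two_n)] * n_pops
    history = [list(counts)]
    for _ in range(generations):
        # Each population drifts independently around its own current frequency
        counts = [sum(1 for _ in range(two_n) if rng.random() < c / two_n) for c in counts]
        history.append(list(counts))
    return history


def summarize(counts, two_n):
    """Average frequency, variance among populations, and numbers lost or fixed."""
    freqs = [c / two_n for c in counts]
    return {
        "mean_freq": mean(freqs),                    # stays near p0: drift has no direction
        "variance": pvariance(freqs),                # grows: populations diverge
        "lost": sum(c == 0 for c in counts),         # populations that lost the allele
        "fixed": sum(c == two_n for c in counts),    # populations fixed for the allele
    }


if __name__ == "__main__":
    history = simulate_replicates()
    for generation in (0, 1, 5, 10, 19):
        print(generation, summarize(history[generation], 32))
```

In a typical run the mean frequency stays close to 0.5 while the variance among populations and the numbers of lost and fixed populations rise over the generations.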

Founder and Bottleneck Effects

As shown in the previous section, genetic drift causes its most dramatic and rapid changes in small populations. However, even a population that is large most of the time but has an occasional generation of very small size can experience pronounced evolutionary changes due to drift in the generation(s) of small size. If the population size grows rapidly after a generation of small size, the increased population size tends to decrease the force of subsequent drift, thereby freezing in the drift effects that occurred when the population was small. These features are illustrated via computer simulation in Figure 4.5. Figure 4.5a shows four replicate simulations of genetic drift in populations of size 1000, over 100 generations, with an initial allele frequency of 0.5. Figure 4.5b shows parallel simulations, but with just one difference: at generation 20, the population size was reduced to 4 individuals and then immediately restored to 1000 at generation 21. In contrasting Figure 4.5a with 4.5b, the striking difference is the radical change in allele frequency that occurs in each population during the transition from generation 20 to 21, reflecting drift during the generation of small size. However, there is relatively little subsequent change from the allele frequencies that existed at

[Figure 4.5 image: two line plots, panels (a) and (b). x-axis: Generation (0 to 100); y-axis: Allele Frequency (0 to 1).]

Figure 4.5 A computer simulation of genetic drift in four replicate populations starting with an initial allele frequency of 0.5 over a period of 100 generations. In panel (a), the population size is kept constant at 1000 individuals every generation. In panel (b), the same replicates are repeated until generation 20, at which point the population size is reduced to 4 individuals. The population size then rebounds to 1000 individuals at generation 21 and remains at 1000 for the remainder of the simulation to simulate a bottleneck effect.

generation 21. Thus, the pronounced evolutionary changes induced by the single generation of small population size are “frozen in” by subsequent population growth and have a profound and continuing impact on the gene pool long after the population has grown large. These computer simulations show that genetic drift can cause major evolutionary change in a population that normally has a large population size as long as either:



• the population was derived from a small number of founding individuals drawn from a large ancestral population (founder effect), or

• the population went through one or more generations of small size followed by subsequent population growth (bottleneck effect).
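A bottleneck of the kind simulated in Figure 4.5b can be sketched by letting the population size vary from one generation to the next. This is not the program behind Figure 4.5; it is a minimal illustration that assumes binomial sampling of 2N gametes, where N is the size of the generation being formed. Under that indexing convention, which is an assumption of this sketch, the large jump in allele frequency appears at the bottleneck generation itself. The function names and seeds are hypothetical.

```python
import random


def bottleneck_schedule(generations=100, normal_size=1000,
                        bottleneck_size=4, bottleneck_generation=20):
    """Diploid population size for generations 1..generations, with one small generation."""
    return [bottleneck_size if g == bottleneck_generation else normal_size
            for g in range(1, generations + 1)]


def drift_with_size_schedule(sizes, p0=0.5, seed=None):
    """Allele frequency trajectory when generation t is formed by 2 * sizes[t - 1] gametes."""
    rng = random.Random(seed)
    p = p0
    path = [p]
    for n_diploid in sizes:
        two_n = 2 * n_diploid
        count = sum(1 for _ in range(two_n) if rng.random() < p)
        p = count / two_n
        path.append(p)
    return path


if __name__ == "__main__":
    constant = [1000] * 100
    bottleneck = bottleneck_schedule()
    for seed in range(4):  # four replicates, as in Figure 4.5
        a = drift_with_size_schedule(constant, seed=seed)
        b = drift_with_size_schedule(bottleneck, seed=seed)
        print(f"replicate {seed}: constant N ends at {a[-1]:.3f}; "
              f"bottleneck: gen 19 = {b[19]:.3f}, gen 20 = {b[20]:.3f}, end = {b[-1]:.3f}")
```

Replacing the first entries of the schedule with a small number mimics a founder event rather than a bottleneck.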

We will now consider some examples of founder and bottleneck effects. There are many biological contexts in which a founder event can arise. For example, there is much evidence that individuals of Hawaiian Drosophila (fruit flies) are on rare occasions blown to a new island on which the species was previously absent (Carson and Templeton 1984). Because this is such a rare event, it would usually involve only a single female. Most Drosophila females typically have had multiple matings and can store sperm for long periods of time. A single female being blown from one island to another would often therefore carry over the genetic material from two or three males. Hence, a founder size of four or less is realistic in such cases. (Single males could also be blown to a new island, but no population could be established in such circumstances.) If the inseminated female found herself on an island for which the ecological niche to which she was adapted was unoccupied, the population size could easily rebound by one or two orders of magnitude in a single generation, resulting in a situation not unlike that shown in Figure 4.5b. Founder events are also common in humans. One example of both a founder effect and a bottleneck effect is given by Roberts (1967, 1968). Tristan da Cunha is an isolated island in the Atlantic Ocean. With the exile of Napoleon on the remote island of St. Helena, the British decided to establish a military garrison in 1816 on the neighboring though still distant island of Tristan da Cunha. In 1817, the British Admiralty decided that Tristan da Cunha was of no importance to Napoleon’s security, so the garrison was withdrawn. A Scots corporal, William Glass, asked and received permission to remain on the island with his wife, infant son, and newborn daughter. A few others decided to remain and were joined later by additional men and women, some by choice and some due to shipwrecks. Altogether, there were 20 initial founders. 
The population size grew to 270 by 1961, mostly due to reproduction but with a few additional immigrants. The growth of this population from 1816 to 1960 is shown in Figure 4.6. Because there is complete pedigree information over the entire colony history, the gene pool can be reconstructed at any time as the percentage of genes in the total population derived from a particular founding individual (Figure 4.7). This method of portraying the gene pool can be related to our standard method of characterizing the gene pool through allele frequencies by regarding each founder as homozygous for a unique allele at a hypothetical locus. Then, the proportion of the genes derived from a particular founder represents the allele frequency at the hypothetical locus of that founder’s unique allele in the total gene pool. The top histogram in Figure 4.7 shows the gene pool composition in 1855 and 1857. Note from the population size graph in Figure 4.6 that a large drop in population size occurred between those years. This was caused by the death in 1853 of William Glass, the original founder. Following his death, 25 of his descendants left for America in 1856. This bottleneck was also accentuated by the arrival of a missionary minister in 1851. This minister soon disliked the island, preaching that its only fit inhabitants were “the wild birds of the ocean.” Under his influence, 45 other islanders left with him, thereby reducing the population size from 103 at the end of 1855 to 33 in March 1857. Note that in going from 1855 to 1857, the gene pool composition changes substantially; the relative contributions of some individuals show sharp decreases (founders 1 and 2) whereas others show sharp increases (founders 3, 4, 9, 10, 11, and 17). Moreover, the genetic contributions of many individuals are completely lost during this bottleneck (founders 6, 7, 12, 13, 14, 15, 16, 19, and 20). Thus, the gene pool is quite different and less diverse after the first bottleneck.
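The idea of describing a gene pool by founder contributions can be illustrated with a small calculation. The sketch below is not Roberts' procedure; it assumes an autosomal locus, for which an individual's expected proportion of genes from each founder is the average of the proportions of its two parents, and it uses a tiny hypothetical pedigree (F1 to F4 are founders) rather than the Tristan da Cunha records.

```python
def founder_contributions(pedigree):
    """Expected proportion of each individual's genes that trace to each founder.

    pedigree maps an individual's ID to (mother, father), or to None for a founder.
    """
    memo = {}

    def contrib(ind):
        if ind in memo:
            return memo[ind]
        parents = pedigree[ind]
        if parents is None:
            result = {ind: 1.0}  # a founder carries only its own genes
        else:
            result = {}
            for parent in parents:
                # Each parent passes on half of the offspring's autosomal genes
                for founder, share in contrib(parent).items():
                    result[founder] = result.get(founder, 0.0) + 0.5 * share
        memo[ind] = result
        return result

    return {ind: contrib(ind) for ind in pedigree}


def gene_pool(pedigree, living):
    """Proportion of the living population's gene pool derived from each founder."""
    contribs = founder_contributions(pedigree)
    pool = {}
    for ind in living:
        for founder, share in contribs[ind].items():
            pool[founder] = pool.get(founder, 0.0) + share / len(living)
    return pool


# Hypothetical pedigree used only for illustration
EXAMPLE_PEDIGREE = {
    "F1": None, "F2": None, "F3": None, "F4": None,
    "C1": ("F1", "F2"), "C2": ("F1", "F2"), "C3": ("F3", "F4"),
    "G1": ("C1", "C3"),
}

if __name__ == "__main__":
    print("before emigration:", gene_pool(EXAMPLE_PEDIGREE, ["C1", "C2", "C3", "G1"]))
    # A "bottleneck": C3 and G1 leave, and the genes of founders F3 and F4 leave with them
    print("after emigration: ", gene_pool(EXAMPLE_PEDIGREE, ["C1", "C2"]))
```

In this toy example the departure of two individuals removes founders F3 and F4 from the gene pool entirely and raises the shares of F1 and F2, which is the kind of change the histograms in Figure 4.7 display.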

[Figure 4.6 image: line plot. x-axis: Year (1820 to 1960); y-axis: Population Size (20 to 260).]

Figure 4.6 Population size of Tristan da Cunha on December 31 of each year from 1816 to 1960. Source: Roberts (1968). © 1968, Springer Nature.

Figure 4.6 reveals that the population grew steadily between 1857 and 1884. With the exception of a few new immigrant individuals (founders 21–26), the basic shape of the gene pool histograms changes very little in those 27 years (the second histogram from the top in Figure 4.7). In particular, note that there is much less change in these 27 years than in the 2 years between 1855 and 1857. Hence, the changes induced by the first bottleneck were “frozen in” by subsequent population growth. Figure 4.6 shows that a second, less drastic bottleneck occurred between 1884 and 1891. The island has no natural harbor, so the islanders had to row out in small boats to trade with passing vessels. In 1884, a boat manned by 15 adult males sank beneath the waves with the resulting death of everyone on board, making Tristan da Cunha the “Island of Widows.” Only four adult men were left on the island, two very aged, leading many of the widows and their offspring to leave the island. This reduced the population size from 106 in 1884 to 59 in 1891. The third histogram in Figure 4.7 shows the impact of this second bottleneck on the island’s gene pool. As with the first bottleneck, some individual contributions went up substantially (founders 3, 4, and 22), others went down (founders 9 and 10), and many were lost altogether (founders 21, 24, 25, and 26). After this second bottleneck, there was another phase of steady population growth (Figure 4.6). The shapes of the gene pool histograms change little from 1891 to 1961 during this phase of increased population growth (the bottom histogram in Figure 4.7, which excludes the impact of a few additional immigrants). Once again, this shows how subsequent population growth freezes in the changes induced by drift during the bottleneck. As discussed in Chapter 3, the founder and bottleneck effects on Tristan da Cunha also led to pedigree inbreeding, despite a system of mating of avoidance of inbreeding (see Figure 3.5). 
This is yet another effect of genetic drift: finite population size leads to an increase in the mean inbreeding coefficient (the average probability of uniting gametes bearing alleles identical-by-descent) with time. Each bottleneck accentuates this accumulation of F because the number of founders

[Figure 4.7 image: four paired histograms contrasting 1855 with 1857, 1857 with 1884, 1884 with 1891, and 1891 with 1961 (minus immigrants). x-axis: Founder (1 to 20 in the first panel, 1 to 26 in the others); y-axis: Percent Contribution to Gene Pool.]

Figure 4.7 Gene pool changes over time in the Tristan da Cunha population. The gene pool is estimated from pedigree data as the proportion of the total gene pool that is derived from a particular founder (who are indicated by numbers on the x-axis). Each histogram contrasts the gene pool at two times, as indicated by the legends.

contributing to the gene pool goes down after each bottleneck event, making it more likely that the surviving individuals must share a common ancestor. Thus, founder and bottleneck effects usually increase pedigree inbreeding.


Genetic Drift and Disequilibrium

Just as drift causes changes in allele frequencies, it also changes multi-locus gamete frequencies. Genetic drift tends to create linkage disequilibrium (LD) and associations between loci by chance alone (Hill and Robertson 1968). As we consider more and more loci simultaneously, we subdivide any finite gene pool into more and more gametic categories, thereby tending to make any one particular gamete type rarer. Sampling error is a strong force of evolutionary change for any gamete type that is rare in a gene pool, so, in general, genetic drift is a more powerful force for altering gamete frequencies at the multi-locus level than allele frequencies at the single locus level. The increased sensitivity of LD to drift compared to allele frequencies is illustrated by a study of 34 X-linked microsatellite loci in the United Kingdom and in 10 regions in Scotland (Vitart et al. 2005) – the urban region of Edinburgh and nine rural regions. The rural regions had smaller population sizes (but still in the tens of thousands or above) and less gene flow. There was little overall differentiation among these subpopulations in allele frequencies, but these areas showed large differences in the amount of linkage disequilibrium. Because several X-linked loci were available, one convenient measure of overall disequilibrium is the map distance in centiMorgans (cM) at which the LD on the X-chromosome is half the difference between its maximum and minimum values, called the LD-half distance. These are plotted in Figure 4.8. The large populations in the United Kingdom and in Edinburgh had no overall LD by this measure, but several of the rural regions did have significant overall disequilibrium. Thus, differentiation could be observed at the level of LD even though it was absent in terms of single locus allele frequencies. Founder and bottleneck effects are particularly effective in creating LD and chance associations. 
If the loci are closely linked, the specific associations created by a founder or bottleneck episode can persist for many generations. For example, LD occurs in a human population living in southern Italy west of the Apennine mountain range (Filosa et al. 1993). A 3 Mb telomeric region of the

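[Figure 4.8 image: LD-half distance estimates with confidence intervals, one row per population. x-axis: LD half Distance (cM), 0.00 to 1.50. Rows (2001 census size): UK; Edinburgh (448,624); Galloway (147,765); Argyll (91,306); Angus (108,400); Borders (106,764); Grampian (313,811); Shetland (21,968); Orkney (19,245); W. Ross/Skye (13,500); Lewis (20,473).]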

Figure 4.8 Linkage disequilibrium (LD) half distances in centiMorgans (cM) in 10 Scottish regions and unrelated UK subjects on the basis of 34 X-linked microsatellite markers. The estimates of the LD-half distances are indicated by blackened circles, and the lines indicate the 90% confidence intervals. The number in parentheses is the 2001 census size for the 10 Scottish regions. Source: Vitart et al. (2005). © 2005, Elsevier.


human X chromosome contains the genes for the enzyme glucose-6-phosphate dehydrogenase (G6PD) and red/green color vision. Nearly 400 distinct mutants are known at the G6PD locus that result in a deficiency of G6PD activity, which in turn can cause hemolytic anemia in individuals hemizygous or homozygous for a deficient allele. This population west of the Apennines has a unique deficiency allele (Med 1), indicating both a founder effect and the relative genetic isolation of this area. Most remarkably, all Med 1 G6PD-deficient males also had red/green color blindness (which is controlled by a small complex of tightly linked genes). Interestingly, on the nearby island of Sardinia, there is also remarkable homogeneity for G6PD-deficient alleles (Frigerio et al. 1994), consistent with a founder effect most likely due to Phoenician contact with the island in the fifth century BCE (Filippi et al. 1977). But in contrast to the southern Italian population, the Sardinian population has a nearly complete absence of color blindness in G6PD-deficient males. Hence, both of these populations influenced by founder effects display significant LD between G6PD and red/green color blindness, but in opposite directions! Such is the randomness in associations created by genetic drift. Indeed, Hill and Robertson (1968) showed that the standard LD measure D (Chapter 2), which is signed, has an expected value of 0 under a broad range of conditions, indicating that LD itself has no direction under drift, as we saw at the one-locus level. Hill and Robertson (1968) therefore argued that r² (Eq. 2.16), which eliminates the effect of the sign of D, is a better measure of LD induced by genetic drift. Indeed, they showed that r² induced by genetic drift depends upon the product of population size with the recombination rate over a broad range of conditions. This is not surprising when we recall that the numerator of r² is D² (Eq. 2.16) and that D has an expected value of 0. Hence, the expected value of r² is proportional to the variance of D, and just as at the single-locus level, genetic drift increases the variance of two-locus gamete frequencies in a manner inversely proportional to N.
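The point that drift-induced D has no expected direction, while its square grows as the population shrinks, can be illustrated with a small simulation. The sketch below is not taken from Hill and Robertson (1968). It assumes a single generation in which 2N gametes are sampled from a gene pool that starts in linkage equilibrium, and the function names, allele frequencies, replicate count, and seed are all illustrative choices.

```python
import random

def sample_D(N, pA, pB, rng):
    """Draw 2N gametes from a gene pool with D = 0 and return D in the sample."""
    # Gamete frequencies AB, Ab, aB, ab under linkage equilibrium
    freqs = [pA * pB, pA * (1 - pB), (1 - pA) * pB, (1 - pA) * (1 - pB)]
    gametes = rng.choices(["AB", "Ab", "aB", "ab"], weights=freqs, k=2 * N)
    n = len(gametes)
    g_AB = gametes.count("AB") / n
    g_Ab = gametes.count("Ab") / n
    g_aB = gametes.count("aB") / n
    g_ab = gametes.count("ab") / n
    # D = g_AB * g_ab - g_Ab * g_aB
    return g_AB * g_ab - g_Ab * g_aB

def summarize(N, pA=0.5, pB=0.5, reps=5000, seed=1):
    """Mean of D and mean of D squared across replicate founder samples."""
    rng = random.Random(seed)
    Ds = [sample_D(N, pA, pB, rng) for _ in range(reps)]
    mean_D = sum(Ds) / reps
    mean_D2 = sum(d * d for d in Ds) / reps
    return mean_D, mean_D2

if __name__ == "__main__":
    for N in (10, 100):
        mean_D, mean_D2 = summarize(N)
        print(f"N={N:4d}  mean D={mean_D:+.5f}  mean D^2={mean_D2:.6f}")
```

In runs of this kind, the mean of D should hover near zero for both population sizes, whereas the mean of D² should be roughly an order of magnitude larger for N = 10 than for N = 100.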

Genetic Drift, Disequilibrium, and System of Mating

The patterns of LD (linkage disequilibrium) found within a population can be strongly influenced by the interaction between genetic drift and assortative mating. Recall the two-locus model of assortative mating given in Table 3.4 in Chapter 3. This simple model, and multi-locus assortative mating models in general, has multiple evolutionary equilibria with different patterns of linkage disequilibrium, as can be calculated from the equilibrium gamete frequencies shown in Table 3.5. In the deterministic model given in Chapter 3, the equilibrium to which a population evolves is determined only by the initial conditions of the gene pool. For the particular model shown in Table 3.4, the evolutionary outcome depended upon the relative magnitudes of the allele frequencies of the A and B alleles (Table 3.5). When the populations are regarded as being of finite size, genetic drift can also strongly influence the evolutionary outcome. This is particularly true when the initial population starts near a boundary condition that separates different potential evolutionary outcomes. For example, suppose that we had a finite population whose system of mating and genetic architecture was described by Table 3.4 and that had a gene pool with pA ≈ pB. Because genetic drift alters allele frequencies, it would be virtually impossible to predict which of the three evolutionary outcomes shown in Table 3.5 would actually evolve. Chance events in the initial few generations that made one allele more common than another would be amplified by the evolutionary force of assortative mating in the later generations. Only when the gene pool has been brought sufficiently close to one of the equilibrium states can assortative mating exert strong evolutionary pressures to make it extremely unlikely that drift would take the population


into the domain of a different equilibrium. In general, when multiple equilibria exist in a population genetic model, genetic drift can play a major role in determining which evolutionary trajectory is realized.

Disassortative mating can also strongly interact with drift-induced linkage disequilibrium, particularly after founder or bottleneck effects. For example, the fruit fly D. melanogaster has a pheromone system leading to strong disassortative mating that is genetically controlled by just a handful of loci scattered across the genome (Averhoff and Richardson 1974, 1976). As seen in Chapter 3, disassortative mating is expected to maintain high heterozygosity at these pheromone loci and at loci in disequilibrium with them. However, as we also saw in Chapter 3, disassortative mating rapidly destroys linkage disequilibrium. Therefore, in a large population, disassortative mating at these pheromone loci is not expected to have much effect at other loci. D. melanogaster has only a few chromosomes (an X chromosome, two major autosomes, and a very small autosome), no recombination in males, and low levels of recombination within these chromosomes in females. As a consequence of these genomic and recombinational features, virtually every locus in the entire genome will be induced to have LD with at least one of the pheromone loci when a severe bottleneck or founder event occurs in D. melanogaster. Hence, disassortative mating at the pheromone loci will also effectively cause disassortative mating at all loci for a few generations after the bottleneck or founder event. Although this disequilibrium will rapidly dissipate, this combination of disassortative mating and temporary disequilibrium ensures that very little overall genetic variation will be lost due to drift as a consequence of the bottleneck or founder event, provided that the small population size does not persist for many generations. As a result, D.
melanogaster is buffered against severe losses of allelic variation under temporary population bottlenecks. In contrast, other species of Drosophila (e.g. D. pseudoobscura, Powell and Morton 1979) do not have this pheromone mating system, and most species of Drosophila have much more recombination per unit of DNA in their genomes than D. melanogaster. As a result, the evolutionary impact of founder or bottleneck effects can vary tremendously from one species to the next, even within the same genus. This heterogeneity in evolutionary response to founder effects has indeed been empirically demonstrated in Drosophila and other organisms (Templeton 1999c, 2008a, 2014) and serves to remind us that evolutionary outcomes are not determined by a single evolutionary force such as genetic drift, but rather arise from an interaction of multiple evolutionary factors (in this case, an interaction between drift, recombination, and system of mating). Hence, there is really no such thing as the founder effect; rather, there are many types of founder effects with diverse evolutionary impacts depending upon their context. Treating all founder events as the same phenomenon is an example of unjustifiable unidimensional thinking. In the previous chapter, we noted that there were strong interactions between assortative and disassortative mating systems when populations with distinct allele frequencies are brought together and begin to interbreed. We saw earlier in this chapter that isolated subpopulations will always diverge from one another in terms of allele frequencies due to genetic drift. Divergence at single locus allele frequencies was sufficient to generate LD under admixture, as shown in Chapter 3. 
Now, we see that isolated subpopulations will not only diverge in terms of allele frequencies; they will also diverge in terms of multi-locus associations within each of their gene pools (as illustrated by the opposite associations of G6PD deficiency alleles and color blindness in southern Italy versus Sardinia). The LD induced by genetic drift accentuates the evolutionary impact of admixture because it makes the initial gene pools of the subpopulations even more divergent from one another. Hence, if two or more subpopulations have become genetically differentiated from one another due to drift and then are brought back together, assortative mating on any trait (whether genetic or not) that is correlated with the historic subpopulations will have a major impact on


preserving both their allele frequency differences and their internal patterns of LD long after genetic contact between the subpopulations has been reestablished. The evolutionary interaction between assortative mating and drift-induced disequilibrium has major implications for studies on associations between genetic markers and disease in countries such as the United States in which several genetically differentiated populations have been placed together with patterns of assortative mating that are correlated with the historical origins of the populations. Pooling all the populations together for studying clinical associations is a poor strategy because the extensive LD induced by admixture (Chapter 3, Eq. 3.6) can create spurious associations between markers scattered throughout the entire genome with the disease, making disequilibrium useless for mapping causative loci. For example, hypertension (high blood pressure) is more common in African Americans than in European Americans and is associated with differences in psychosocial stress between these two groups (Dressler et al. 2005). These two American subpopulations also differ in allele frequencies at several loci that determine blood group antigens (Workman 1973). For example, the Ro allele at the Rh blood group locus has a frequency between 0.4 and 0.6 in various African American populations, whereas this same allele has a frequency of less than 0.03 in Americans of English ancestry (Workman 1973). Therefore, if one pooled African and European Americans together in a study on hypertension, one would find a strong association of hypertension with the Ro allele, even though there is no evidence that the Rh locus is directly related to hypertension. One must also keep in mind that genetic drift causes disequilibrium to be random in direction, and therefore the association of a disease with a particular marker allele has no universal validity. 
Such association studies should be used only to determine the genomic location of disease-influencing genes, and the marker alleles should never be used to make disease predictions in other human populations that may not have the same pattern of linkage disequilibrium. For example, by performing separate analyses of association in the southern Italian and Sardinian populations (as was actually done), strong associations would be found between color blindness and G6PD deficiency, thereby implying that the marker locus of color blindness is closely linked to a gene influencing the disease G6PD deficiency (as indeed it is). However, suppose we had only studied the southern Italian population west of the Apennines. In that population, we would have found that color blind males were more likely to have G6PD deficiency because of that population’s disequilibrium. If we then generalized from this study and incorrectly claimed that color blindness is a genetic marker indicating a predisposition to G6PD deficiency, what would happen when we applied this “knowledge” to the people living in Sardinia? There, we would be telling the people with low risk for G6PD deficiency (color blind males) that they had high risk! Because the disequilibrium induced by drift is random in direction and admixture in humans is incomplete due to assortative mating for all sorts of traits (Chapter 3), disease/marker association studies in humans should not be generalized beyond the actual populations studied. This is an important caution in this age in which “genes for disease X” are announced almost daily, yet almost all of these “genes” are actually identified through association studies with no direct proof of cause and effect.
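The way pooling creates a spurious marker/disease association, as in the hypertension and Rh example, can be sketched numerically. The code below assumes two groups in which marker and trait are completely independent within each group, pools them, and computes the resulting covariance between marker and trait. It also checks that covariance against the standard two-group mixture expression m(1 − m)(p1 − p2)(q1 − q2). The marker frequencies loosely follow the values quoted in the text, but the trait frequencies and the mixing proportion are hypothetical numbers chosen only for illustration, not data from any study.

```python
def pooled_association(m, marker1, trait1, marker2, trait2):
    """Covariance between marker and trait after pooling two groups.

    m is the proportion of the pooled sample drawn from group 1.
    Marker and trait are assumed independent within each group.
    """
    # Joint frequency of (marker, trait) within each group under independence
    joint1 = marker1 * trait1
    joint2 = marker2 * trait2
    pooled_joint = m * joint1 + (1 - m) * joint2
    pooled_marker = m * marker1 + (1 - m) * marker2
    pooled_trait = m * trait1 + (1 - m) * trait2
    return pooled_joint - pooled_marker * pooled_trait

def mixture_formula(m, marker1, trait1, marker2, trait2):
    """Closed-form disequilibrium created purely by mixing two groups."""
    return m * (1 - m) * (marker1 - marker2) * (trait1 - trait2)

if __name__ == "__main__":
    # Hypothetical inputs: marker frequencies 0.5 and 0.03,
    # trait frequencies 0.40 and 0.25, equal mixing (m = 0.5)
    args = (0.5, 0.5, 0.40, 0.03, 0.25)
    print("pooled association:", pooled_association(*args))
    print("mixture formula:   ", mixture_formula(*args))
```

Under these assumptions the pooled association is positive even though no association exists inside either group, and it vanishes whenever the two groups share the same marker frequency or the same trait frequency.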

Effective Population Size

As seen above, finite population size has many important evolutionary consequences: increasing the average amount of identity-by-descent, increasing the variance of allele frequencies through time and across populations, causing the loss or fixation of alleles, and generating linkage


disequilibrium. As also shown by the above examples and simulations, the rate at which these effects occur is roughly inversely proportional to population size. In an idealized population, we can derive a precise quantitative relationship between these evolutionary effects of genetic drift and population size. As with Hardy–Weinberg, the idealized model for studying genetic drift contains many assumptions that are biologically unrealistic but that make the mathematics more tractable. In particular, our idealized case assumes:

• a diploid population of hermaphroditic, self-compatible organisms,
• a finite population size of N breeding adults, with no fluctuation in population size from generation to generation,
• random mating,
• complete genetic isolation (no contact with any other population),
• discrete generations with no age structure,
• all individuals contribute the same number of gametes on the average to the next generation (no natural selection), and
• the sampling variation in the number of gametes contributed to the next generation is given by a Poisson probability distribution (Appendix B).
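The last assumption in the list can be checked numerically. In the sketch below, each of the 2N successful gametes picks its parent uniformly at random from N adults, so the number contributed by any one adult is binomial with mean 2, which is close to a Poisson distribution with mean 2 when N is large. The function names, the value of N, and the seed are illustrative assumptions rather than anything specified in the text.

```python
import random
from collections import Counter

def gamete_counts(N, rng):
    """Gametes contributed by each of N adults when 2N gametes pick parents at random."""
    parents = [rng.randrange(N) for _ in range(2 * N)]
    tally = Counter(parents)
    return [tally.get(i, 0) for i in range(N)]

def mean_and_variance(values):
    """Population mean and variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var

if __name__ == "__main__":
    rng = random.Random(42)
    counts = gamete_counts(10000, rng)
    mean, var = mean_and_variance(counts)
    zero_fraction = counts.count(0) / len(counts)
    # A Poisson distribution with mean 2 has variance 2 and P(0) = exp(-2), about 0.135
    print(f"mean={mean:.3f} variance={var:.3f} fraction with zero gametes={zero_fraction:.3f}")
```

The fraction of adults leaving no gametes at all is the quantity that most directly accentuates drift, which is why the choice of distribution matters.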

The last assumption is often implicit rather than explicit, and at first may seem hard to understand, but it is critical for modeling drift. Because the population size is constant (N) and diploid, each individual must on the average pass on two gametes to the next generation. However, by chance alone, some individuals may successfully pass on 0, 1, 2, 3, or more gametes to the next generation of N adults. Obviously, to the extent that some individuals pass on fewer than average gametes (particularly zero gametes), the effects of sampling error (genetic drift) are going to be accentuated. Hence, it is necessary to make a specific assumption about how uneven successful reproduction is going to be among the adults, and the Poisson model is a mathematically convenient choice. Under these idealized conditions, it is possible to define accurately how finite population size influences the rate of change in F (the average probability of identity-by-descent in the population), variance of allele frequencies, or other genetic features of the population.

Of course, real populations deviate from one or more of the assumptions we have made in our idealized case. Effective population size allows us to measure the strength of genetic drift as an evolutionary force in these non-ideal situations. Suppose, for example, we are examining how the average probability of identity-by-descent is accumulating in a finite population such as the Tristan da Cunha population in Figure 3.5. This human population violates every one of the assumptions that we made for our idealized population. But using the initial founders as our reference point (the F’s in Figure 3.5 were calculated under the assumption of no identity-by-descent in the original founders), we can calculate the average generation times and then observe what F is at a particular generation.
We can then ask the question, what value of N do we need to use in our idealized model to yield the same value of F after the same number of generations starting from the same initial condition of F = 0? Whatever value of N that is needed to accomplish this feat is the effective size. The advantage of an effective size is that it allows us to directly measure the strength of genetic drift as an evolutionary force across all real populations using a common reference. This is important because different real populations could violate different assumptions to different degrees. Without a common reference, it would be difficult to make assessments about the role of drift in these populations. For example, suppose there is a founder event of 20 individuals in a population of animals with an equal number of separate sexes, a violation of our idealized reference population. Suppose that otherwise this


population is exactly like our idealized population. Now suppose there is another founder population of 40 self-compatible plants with a 50 : 50 mix of random mating and selfing, but otherwise this plant population is exactly like our idealized population. In which population is genetic drift stronger in increasing F? By determining the effective population size with respect to F, we can answer this question. Hence, effective population sizes are measures of the strength of genetic drift in influencing some population genetic feature of interest with respect to a common reference standard. What cannot be emphasized enough is that there is no such thing as the effective size of a population. Just as “inbreeding” is one word with several meanings in population genetics, so is “effective population size.” As indicated above, effective sizes are calculated with respect to some initial reference population and over a specific time period. Alternatively, the initial population can be regarded as being so distant in the past such that the initial state is no longer relevant and effective size is calculated with respect to a long-term equilibrium state. Regardless, the same population today can yield very different effective sizes depending upon what initial reference time is used. Moreover, when real populations deviate from the idealized reference population, the various genetic features used to monitor the impact of drift (F, allele frequency variance, rate of loss or fixation, etc.) can be affected quite differently. This variation in response to deviations from the idealized population means that each genetic feature being monitored requires its own effective size. Therefore, if several genetic features are being monitored, and/or different initial conditions and time frames are being considered, a single real population will require many different effective sizes to describe the effects of drift. 
It is therefore critical to describe the initial population, the time frame, and the genetic feature of interest before determining an effective size. Without these descriptors, the concept of an effective size is biologically meaningless. In this book, we will primarily focus on two commonly used effective population sizes. They are:

• Nef = the inbreeding effective size, which is used to describe the average accumulation of identity-by-descent (F) in a population via genetic drift, and
• Nev = the variance effective size, which is used to describe the variance across generations that is induced in allele frequency via genetic drift, or, alternatively, the variance in allele frequency across replicate subpopulations induced by genetic drift.

Inbreeding Effective Size

When the genetic feature of interest is F (the average probability of identity-by-descent in uniting gametes for an autosomal locus in the population), the strength of genetic drift is measured by the inbreeding effective size. First, consider our idealized reference population of constant size N. We start with an initial generation in which we regard all N individuals as being totally unrelated and non-inbred [i.e. F(0) = 0, where F(i) denotes the average probability of identity-by-descent at generation i]. Suppose that these N individuals produce N offspring for the next generation (generation 1). This requires that 2N gametes be drawn from the gene pool. Because the individuals in generation 0 are regarded as unrelated and non-inbred, any two uniting gametes that come from different individuals at generation 0 cannot be identical-by-descent. However, because the individuals are by assumption hermaphroditic and self-compatible, it is possible for both of the gametes involved in a fertilization event to come from the same individual in generation 0. Suppose that one gamete has already been drawn from the gene pool. What is the chance that the second gamete to be united with it is drawn from the same parental individual? Because we have assumed random mating, any particular individual is equally likely to mate with any of the N individuals in the


population (including itself). Hence, the chance of drawing two gametes from the same individual is simply 1/N. Given that a self-mating has occurred, the probability that both gametes bear copies of the same gene found in the parental selfer is 1/2 due to Mendelian segregation. Hence, the probability of identity-by-descent in the population is the probability of a self-mating times the probability that both gametes bear copies of the same gene, that is,

$$\text{Probability of Identity by Descent in Generation 1} = \frac{1}{N} \times \frac{1}{2} = \frac{1}{2N} = F(1) \tag{4.1}$$

We can immediately measure the force of genetic drift upon F(1) in this idealized population: F increases in proportion to the inverse of 2N, the number of gametes sampled from the gene pool. This means that the average probability of identity-by-descent increases by 1/(2N) due to genetic drift in this generation. Now consider the second generation. First, an individual at generation 2 could have been produced by a selfing event among the parents from generation 1. The contribution to the average identity-by-descent at generation 2 from selfing at generation 1 is once again simply 1/(2N). This also means that the probability that a pair of uniting gametes at generation 2 is not identical-by-descent due to selfing at generation 1 is 1 − 1/(2N). In addition, even if gametes are not identical-by-descent due to selfing of generation 1 parents, the gametes can still be identical-by-descent due to inbreeding in the previous generation. The probability that two gametes chosen at random from generation 1 are identical-by-descent due to previous inbreeding is F(1). Note that this is the probability that two gametes bear identical copies of the gene derived from a single individual in the initial generation 0. Hence, the total probability of identity-by-descent at generation 2 is:

$$F(2) = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right) F(1) \tag{4.2}$$

Equation (4.2) states that the probability of identity-by-descent at generation 2 is due in part to new inbreeding due to selfing induced by drift among the parents at generation 1 [the 1/(2N) term on the right side of Eq. 4.2] plus identity due to uniting gametes being derived from the same grandparent at the initial generation [the F(1) term] weighted by the probability that these gametes are not already identical due to selfing [the 1 − 1/(2N) term]. There was nothing about the derivation of Eq. (4.2) that limits it just to generation 2, so, in general, we can write:

$$F(t) = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right) F(t-1) \tag{4.3}$$
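The recursion in Eq. (4.3) can be checked against a direct simulation of the idealized population. In the sketch below, every gene copy in generation 0 receives a unique label, each offspring is formed from two gametes whose parents are chosen at random (selfing allowed), and each gamete carries one of its parent's two copies with equal probability. An individual counts as identical-by-descent when its two copies share a label. The function names, the values N = 10 and t = 5, and the replicate count are illustrative assumptions.

```python
import random

def simulate_F(N, generations, reps, seed=0):
    """Average fraction of individuals with two identical-by-descent gene copies."""
    rng = random.Random(seed)
    ibd_total = 0
    for _ in range(reps):
        # Generation 0: every gene copy has a unique label, so F(0) = 0
        pop = [(2 * i, 2 * i + 1) for i in range(N)]
        for _ in range(generations):
            new_pop = []
            for _ in range(N):
                # Each gamete picks a random parent (possibly the same one twice),
                # then one of that parent's two copies (Mendelian segregation)
                g1 = rng.choice(pop[rng.randrange(N)])
                g2 = rng.choice(pop[rng.randrange(N)])
                new_pop.append((g1, g2))
            pop = new_pop
        ibd_total += sum(1 for a, b in pop if a == b)
    return ibd_total / (reps * N)

def recursion_F(N, generations):
    """Iterate F(t) = 1/(2N) + (1 - 1/(2N)) F(t-1) from F(0) = 0."""
    F = 0.0
    for _ in range(generations):
        F = 1 / (2 * N) + (1 - 1 / (2 * N)) * F
    return F

if __name__ == "__main__":
    N, t = 10, 5
    print("simulated F:", simulate_F(N, t, reps=2000, seed=1))
    print("recursion F:", recursion_F(N, t))
```

The simulated average should fall close to the recursion's value, with the gap shrinking as the number of replicate populations increases.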

At this point, the mathematics becomes simpler if we focus upon “heterozygosity” rather than F. Here, “heterozygosity” is not the observed heterozygosity nor the expected heterozygosity under random mating; rather, it is the average over all individuals of the probability that an individual receives two genes from its parents at an autosomal locus that are not identical-by-descent relative to the initial reference generation (the individuals at generation 0 in this model). Thus, individuals who are homozygous for genes that are identical-by-state relative to generation 0 but are not identical-by-descent due to inbreeding subsequent to the initial generation are regarded as “heterozygotes.” For example, consider the experiments of Buri shown in Figure 4.4. The initial generation consists of a gene pool with 16 bw alleles and 16 bw75 alleles. By specifying this initial generation as the reference generation for calculating F in subsequent generations, we regard all 16 copies of the bw allele as different “alleles” even though they are identical in state, and likewise all 16 copies of


the bw75 allele are regarded as distinct “alleles.” Because of this convention, if a fly in a later generation is homozygous for bw but the two bw alleles that the fly is bearing trace to different copies of the 16 bw alleles present in the initial generation, this fly is regarded as being “heterozygous.” On the other hand, if the two bw alleles trace back to the same bw allele present in the initial generation, then the fly is regarded as being “homozygous.” Hence, in measuring the impact of drift through an inbreeding effective size, population geneticists make a strict distinction between identity-by-descent relative to the reference generation and identity-by-state. This sometimes creates confusion, because this definition of “heterozygosity” is typically not distinguished verbally from other concepts of heterozygosity (as you can see, many words in population genetics have several meanings). Mathematically, “heterozygosity” in this context is simply the average probability that two uniting gametes in the population are not identical-by-descent relative to the initial reference generation, that is,

$$H(t) = 1 - F(t) \tag{4.4}$$

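Substituting Eq. (4.3) into the F term of Eq. (4.4), we have

$$\begin{aligned} H(t) &= 1 - \frac{1}{2N} - \left(1 - \frac{1}{2N}\right) F(t-1) \\ &= \left(1 - \frac{1}{2N}\right)\left[1 - F(t-1)\right] \\ &= \left(1 - \frac{1}{2N}\right) H(t-1) \end{aligned} \tag{4.5}$$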

Because our initial reference generation has, by definition, F(0) = 0, then H(0) = 1. Hence, by using Eq. (4.5) recursively, we have:

$$H(1) = 1 - \frac{1}{2N}, \quad H(2) = \left(1 - \frac{1}{2N}\right)^{2}, \quad \ldots, \quad H(t) = \left(1 - \frac{1}{2N}\right)^{t} \tag{4.6}$$

We now substitute Eq. (4.4) into (4.6) to express everything in terms of F:

$$1 - F(t) = \left(1 - \frac{1}{2N}\right)^{t} \tag{4.7}$$

Solving Eq. (4.7) for F yields

$$F(t) = 1 - \left(1 - \frac{1}{2N}\right)^{t} \tag{4.8}$$
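As a quick numerical check on the algebra leading to Eq. (4.8), the sketch below evaluates the closed form and compares it with repeated application of the one-generation heterozygosity recursion of Eq. (4.5). The function names and the particular values of N and t are illustrative choices.

```python
def F_closed_form(N, t):
    """Eq. (4.8): F(t) = 1 - (1 - 1/(2N))**t."""
    return 1 - (1 - 1 / (2 * N)) ** t

def H_by_recursion(N, t):
    """Apply Eq. (4.5), H(t) = (1 - 1/(2N)) H(t-1), t times starting from H(0) = 1."""
    H = 1.0
    for _ in range(t):
        H *= 1 - 1 / (2 * N)
    return H

if __name__ == "__main__":
    for N in (10, 100, 1000):
        for t in (1, 10, 100):
            F = F_closed_form(N, t)
            # Eq. (4.4) says H(t) = 1 - F(t), so both routes should agree
            assert abs((1 - F) - H_by_recursion(N, t)) < 1e-12
            print(f"N={N:5d}  t={t:4d}  F(t)={F:.4f}")
```

Tabulating F(t) this way makes the inverse dependence on N concrete: for a fixed number of generations, a tenfold larger idealized population accumulates identity-by-descent far more slowly.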

Equation (4.8) is a simple mathematical function of N and generation time, t. However, what happens when the actual population deviates from any or all of the idealized assumptions? No matter what these deviations may be, after t generations, there will be some realized level of inbreeding, F(t), in the actual population. Then, the inbreeding effective size (Nef) of the actual population is defined as that number which makes the following equation true:

$$F(t) = 1 - \left(1 - \frac{1}{2N_{ef}}\right)^{t} \tag{4.9}$$


Solving for Nef in Eq. (4.9), the inbreeding effective size is defined as:

$$N_{ef} = \frac{1}{2\left\{1 - \left[1 - F(t)\right]^{1/t}\right\}} \tag{4.10}$$

Equations (4.9) and (4.10) make it clear that the inbreeding effective size is the number needed to make the rate of accumulation of identity-by-descent due to drift equal to the rate found in an idealized population of size Nef. Note that Nef is determined exclusively from F(t) and t; it has no direct dependence upon the actual population size. With complete pedigree data on a population, it is possible to calculate F for every individual, and therefore F(t) as well. Under these circumstances, it is possible to determine the inbreeding effective size directly from Eq. (4.10) without the use of secondary equations or approximations. An example of such a population was given in Chapter 3, the captive population of Speke’s gazelle founded by one male and three females between 1969 and 1972. The average probability of identity-by-descent in the Speke’s gazelle breeding herd in 1979, regarding the four founders as unrelated and noninbred (and hence the reference generation), was 0.1283. The average number of generations separating these animals from the founders was 1.7. Substituting t = 1.7 and F(1.7) = 0.1283 into Eq. (4.10), the inbreeding effective size of the 1979 herd is 6.4, a number which is considerably less than the 1979 census size of 19 breeding animals. This low size is attributable to the founder effect: although there were 19 breeding animals available in 1979, their genes were derived from only four founders and therefore they accumulated inbreeding at a very fast rate, as indicated by the low inbreeding effective size. In 1979, a new management program was instituted that included avoidance of breeding between close biological relatives (Templeton and Read 1983). The first generation bred from the 19 animals available in 1979 consisted of 15 offspring with an average probability of identity-by-descent of 0.1490. Using this value for F(t) and augmenting t by 1 to yield t = 2.7, Eq. (4.10) yields an inbreeding effective size of 8.6.
Note that the inbreeding effective size is still smaller than the census size of 15 in this offspring generation, once again showing the persistent effects of the initial founding event. However, note that the inbreeding effective size increased (6.4 to 8.6) even though the census size decreased (19 to 15) when our inference is confined to the animals bred under the new program. This increase in inbreeding effective size reflects the impact of the avoidance of inbreeding (in a system of mating sense), another deviation from our idealized assumption of random mating. In order to focus more directly upon the genetic impact of the Templeton and Read (1983) breeding program that was initiated in 1979, we can now regard the 19 animals available in 1979 as the reference generation. However, there is one complication here. Because these gazelles have separate sexes, selfing is impossible. Selfing was the fundamental source of new inbreeding in the derivations of Eqs. (4.1) and (4.2), but, with separate sexes, the fundamental source of new inbreeding is the sharing of a grandparent rather than a single parent (selfing). This delays the effects of inbreeding by one generation. Hence, if we regarded the 19 animals as completely unrelated and noninbred, their offspring would also be regarded as non-inbred under any system of mating with separate sexes. Therefore, we will make the reference generation the parents of the 19 animals available in 1979 and not just the four original founders from 1969 to 1972. This allows the possibility of inbreeding in the offspring of these 19 animals but reduces the impact of the original founder event and allows us to see more clearly the impact of the breeding program that was initiated in 1979. Using the parents of the 19 animals available in 1979 as the reference generation, the average probability of identity-by-descent of the first generation born under the new breeding program is 0.0207. 
Since the grandparents of the 15 animals born under the new breeding program are now the


reference generation, t = 2, so Eq. (4.10) yields an inbreeding effective size of 48.1 for the first generation of Speke’s gazelle born under the new breeding program. Note that the inbreeding effective size is much greater than the census size of 15. This observation reveals yet another widespread fallacy: that effective population sizes have to be smaller than census sizes. This statement, widespread in much of the conservation biology literature, is not true. Always remember that inbreeding effective sizes are determined solely by F and t with no direct dependency upon census size. Since the idealized reference population has random mating (f = 0), a breeding program such as that designed for the Speke’s gazelle that has strong avoidance of inbreeding (recall from Chapter 3 that f = −0.291 under the system of mating established under the new breeding program) will greatly reduce the rate at which average pedigree inbreeding (F) accumulates, thereby greatly augmenting the inbreeding effective size. Hence, effective sizes can be much larger or much smaller than the census size, depending upon the type of deviations that are occurring from the idealized reference population. Note, also, that we now have two very different inbreeding effective sizes (8.6 and 48.1) for the same 15 animals. The first size tells us how rapidly inbreeding has accumulated since the initial founder event; the latter size focuses on the impact of the new breeding program. Hence, both numbers give valuable information. Unfortunately, it is commonplace in the population genetic literature to refer to “the” effective population size as if there were only one effective size for a population. As the Speke’s gazelle example shows, there can be many different effective sizes for the same population as a function of different (and meaningful) reference generations. As we will now see, there are different genetic types of effective size as well.
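The three Speke's gazelle calculations can be reproduced with a few lines of code. The sketch below implements Eq. (4.10) directly and applies it to the values of F(t) and t quoted in the text. The function name and the input validation are illustrative additions.

```python
def inbreeding_effective_size(F_t, t):
    """Eq. (4.10): N_ef = 1 / (2 * (1 - (1 - F(t))**(1/t)))."""
    if not 0 < F_t < 1 or t <= 0:
        raise ValueError("requires 0 < F(t) < 1 and t > 0")
    return 1 / (2 * (1 - (1 - F_t) ** (1 / t)))

if __name__ == "__main__":
    scenarios = [
        ("1979 herd, founders as reference", 0.1283, 1.7),
        ("first managed generation, founders as reference", 0.1490, 2.7),
        ("first managed generation, parents of 1979 herd as reference", 0.0207, 2),
    ]
    for label, F_t, t in scenarios:
        print(f"{label}: N_ef = {inbreeding_effective_size(F_t, t):.1f}")
```

Because the function takes only F(t) and t as arguments, it makes explicit that census size never enters the calculation, and that changing the reference generation changes both inputs and therefore the answer.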

Variance Effective Size

Genetic drift causes random deviations from the allele frequency of the previous generation. Drift also causes variation in allele frequency across replicate subpopulations (e.g. Figure 4.4). These random deviations in allele frequency either across generations or across replicate subpopulations are commonly measured by the variance in allele frequency. To see how the variance of allele frequency can be used to measure the strength of genetic drift, we will consider evolution induced by genetic drift at a single autosomal locus with two alleles, A and a, in a population of size N that satisfies all of our idealized assumptions. Exactly 2N gametes are sampled out of the gene pool each generation in the idealized population. The idealized set of assumptions means that the sample of A and a alleles due to drift follows the binomial distribution (Appendix B), the same type of distribution simulated in Figures 4.1 through 4.3. Let p be the original frequency of A, and let x be the number (not frequency) of As in the finite sample of 2N gametes. Then, with an idealized population of size N, the binomial probability that x = X (where X is a specific realized value of the number of A alleles in the sampled gametes from the gene pool) is:

$$\text{Probability}(x = X) = \binom{2N}{X} p^{X} q^{2N-X} \tag{4.11}$$

where q = 1 − p. The mean or expected value of x in the binomial is 2Np and the variance of x is 2Npq (Appendix B). The allele frequency of A in the next generation is x/(2N), so:

$$\text{Expected Allele Frequency} = E\left(\frac{x}{2N}\right) = \frac{E(x)}{2N} = \frac{2Np}{2N} = p \tag{4.12}$$

The fact that the expected allele frequency does not change from one generation to the next reflects the fact that drift has no direction. Because there is no direction to drift, it is impossible to predict


whether deviations caused by drift will be above or below the previous generation’s allele frequency. Although no change is expected on the average, we know from the previous computer simulations (Figures 4.1 through 4.3) and examples (Figure 4.4) that any particular population is likely to experience an altered allele frequency. The expected squared deviation from the original allele frequency (that is, the variance in allele frequency) measures this tendency of drift to alter the allele frequency in any particular population. This variance is:

$$\text{Variance in Allele Frequency} = \operatorname{Var}\left(\frac{x}{2N}\right) = \frac{\operatorname{Var}(x)}{(2N)^{2}} = \frac{2Npq}{(2N)^{2}} = \frac{pq}{2N} \tag{4.13}$$
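As a quick numerical check of Eqs. (4.12) and (4.13), the following minimal sketch sums the binomial probabilities of Eq. (4.11) exactly rather than simulating drift. The function name and the values N = 10 and p = 0.3 are illustrative assumptions, not taken from the text.

```python
from math import comb

def drift_moments(n_individuals, p):
    """Exact mean and variance of the next-generation allele frequency
    when 2N gametes are sampled binomially, as in Eq. (4.11)."""
    two_n = 2 * n_individuals
    q = 1.0 - p
    # Binomial probability of drawing X copies of allele A, for X = 0..2N
    pmf = [comb(two_n, x) * p**x * q**(two_n - x) for x in range(two_n + 1)]
    mean = sum(prob * (x / two_n) for x, prob in enumerate(pmf))
    var = sum(prob * (x / two_n - mean) ** 2 for x, prob in enumerate(pmf))
    return mean, var

# Hypothetical example values: N = 10 individuals, initial frequency p = 0.3
mean, var = drift_moments(n_individuals=10, p=0.3)
print("mean allele frequency:", mean)           # compare with p
print("variance:", var)                          # compare with pq/(2N)
print("pq/(2N):", 0.3 * 0.7 / (2 * 10))
```

Up to floating-point rounding, the computed mean should match p and the computed variance should match pq/(2N).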

When populations deviate from one or more of our idealized set of assumptions, the variance effective size of one generation relative to the previous generation is defined to be the number, Nev, that makes the following equation true:

$$\text{Variance in Allele Frequency} = \frac{pq}{2N_{ev}} \tag{4.14}$$

The variance in Eq. (4.14) can be interpreted either as the expected square deviation of the allele frequency in a particular population from its initial value of p or as the variance in allele frequencies across identical replicate populations all starting with an initial allele frequency of p. As can be seen from Figure 4.4, the allele frequencies in replicate populations become increasingly spread out with increasing time. Mathematically, this means that the variance in allele frequency increases with time under drift, either in terms of variance across replicate populations or the expected variance within a single population. When the reference generation is more than one generation in the past, we must first determine how the variance of allele frequency changes over multiple generations. This determination is done in Box 4.1. From that box, we see that:

$$\text{Variance in Allele Frequency after } t \text{ generations} = \operatorname{Var}(p_t) = pq\left[1 - \left(1 - \frac{1}{2N}\right)^{t}\right] \tag{4.15}$$

where p is the allele frequency at generation 0, q = 1 − p, t is the number of generations from the initial reference population, and N is the number of individuals in the idealized population undergoing drift (or the size of each subpopulation when dealing with isolated replicates). As t increases in Eq. (4.15), the variance in allele frequency also increases. This confirms our earlier observation that drift accumulates with time and that deviations from the initial conditions become more and more likely. Also, note that as t goes to infinity, the variance in allele frequency goes to pq. In terms of an experimental set-up like Buri’s (Figure 4.4), the variance of pq is obtained when all populations have become fixed for either the A or a alleles, such that a proportion p of the populations are fixed for the A allele (and therefore have an allele frequency of 1 for A) and q are fixed for the a allele (and therefore have an allele frequency of 0 for A). Under these conditions, the average allele frequency of A over all replicate populations is p(1) + q(0) = p, and the variance of allele frequency is:

$$\operatorname{Var}(\text{allele frequency}) = E\left[\left(\frac{x}{2N} - p\right)^{2}\right] = p(1 - p)^{2} + q(0 - p)^{2} = pq(q + p) = pq \tag{4.16}$$

Hence, Eq. (4.15) also tells us that eventually drift causes all initial genetic variation to become lost or fixed, the only way to achieve the maximum theoretical variance of pq (Eq. 4.16). Equation (4.15) also provides us with our primary definition of the variance effective size, namely, if the actual variance in allele frequency after t generations is $\sigma_t^2$, then the variance effective size


Box 4.1 Allele frequency variance under genetic drift

Let $x_i$ be the number of A alleles in an idealized finite population of size N at generation i. We will first consider the impact of two successive generations of drift upon the variance of the allele frequency at generation i, $p_i = x_i/(2N)$. Because genetic drift has no direction, we know that the expected value of $p_i$ is p for every generation i, where p is the frequency of the A allele in the initial reference population (generation 0). By definition, the variance of $p_i$, $\sigma_i^2$, is:

$$\sigma_i^2 = E\left[(p_i - p)^2\right]$$

This variance can be expressed in terms of the allele frequencies of two successive generations as:

$$\begin{aligned}
\sigma_i^2 &= E\left[(p_i - p_{i-1} + p_{i-1} - p)^2\right] \\
&= E\left[(p_i - p_{i-1})^2\right] + 2E\left[(p_i - p_{i-1})(p_{i-1} - p)\right] + E\left[(p_{i-1} - p)^2\right] \\
&= E\left[(p_i - p_{i-1})^2\right] + 2E\left[(p_i - p_{i-1})(p_{i-1} - p)\right] + \sigma_{i-1}^2
\end{aligned}$$

The mathematical assumptions made about our idealized population give the drift process the statistical property of being a Markov process. This means that the probabilities in going from generation i−1 to generation i are conditionally independent of all previous generations once the outcome at generation i−1 is given. This in turn means that we can separate the expectation operator into two components: $E = E_{i-1}E_{i|i-1}$, where $E_{i-1}$ is the expectation at generation i−1, and $E_{i|i-1}$ is the expectation at generation i given the outcome at generation i−1. Using this Markovian property, the variance at generation i can be expressed as:

$$\begin{aligned}
\sigma_i^2 &= E_{i-1}\left\{E_{i|i-1}\left[(p_i - p_{i-1})^2\right]\right\} + 2E_{i-1}\left\{(p_{i-1} - p)\,E_{i|i-1}\left[p_i - p_{i-1}\right]\right\} + \sigma_{i-1}^2 \\
&= E_{i-1}\left[\frac{p_{i-1}(1 - p_{i-1})}{2N}\right] + 2E_{i-1}\left\{(p_{i-1} - p)\,E_{i|i-1}\left[p_i - p_{i-1}\right]\right\} + \sigma_{i-1}^2 \\
&= \frac{1}{2N}E_{i-1}\left[p_{i-1} - p^2 - (p_{i-1} - p)^2\right] + 0 + \sigma_{i-1}^2 \\
&= \frac{1}{2N}\left\{p - p^2 - E_{i-1}\left[(p_{i-1} - p)^2\right]\right\} + \sigma_{i-1}^2 \\
&= \frac{pq}{2N} + \sigma_{i-1}^2\left(1 - \frac{1}{2N}\right)
\end{aligned}$$

The first term uses the single-generation binomial variance of Eq. (4.13) with $p_{i-1}$ as the starting frequency, and the cross term vanishes because, given the outcome at generation i−1, the expected change $E_{i|i-1}[p_i - p_{i-1}]$ is zero (drift has no direction).

We already saw that $\sigma_1^2 = pq/(2N)$, so plugging this initial condition into the above equation, we have:

$$\sigma_2^2 = \frac{pq}{2N} + \frac{pq}{2N}\left(1 - \frac{1}{2N}\right) = pq\left[1 - \left(1 - \frac{1}{2N}\right)^{2}\right]$$

By recursion, we have

$$\sigma_i^2 = pq\left[1 - \left(1 - \frac{1}{2N}\right)^{i}\right]$$
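The recursion at the end of Box 4.1 can be checked numerically against the closed form of Eq. (4.15). The sketch below is only an illustration: the function names and the values p = 0.5 and N = 16 are assumptions chosen for the example.

```python
def variance_by_recursion(p, n_individuals, generations):
    """Iterate sigma_i^2 = pq/(2N) + sigma_{i-1}^2 * (1 - 1/(2N)),
    starting from sigma_0^2 = 0 at the reference generation."""
    pq = p * (1.0 - p)
    two_n = 2 * n_individuals
    sigma2 = 0.0
    history = []
    for _ in range(generations):
        sigma2 = pq / two_n + sigma2 * (1.0 - 1.0 / two_n)
        history.append(sigma2)  # history[i - 1] holds sigma_i^2
    return history

def variance_closed_form(p, n_individuals, t):
    """Closed form of Eq. (4.15): pq * [1 - (1 - 1/(2N))^t]."""
    pq = p * (1.0 - p)
    return pq * (1.0 - (1.0 - 1.0 / (2 * n_individuals)) ** t)

# Hypothetical example values
history = variance_by_recursion(p=0.5, n_individuals=16, generations=200)
for t in (1, 2, 10, 50, 200):
    print(t, history[t - 1], variance_closed_form(0.5, 16, t))
```

The two columns should agree to rounding error, and both should rise toward the ceiling pq as t grows.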


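of generation t relative to generation 0 is defined to be the number, Nev, that makes the following equation true:

$$\sigma_t^2 = pq\left[1 - \left(1 - \frac{1}{2N_{ev}}\right)^{t}\right] \tag{4.17}$$

that is,

$$N_{ev} = \frac{1}{2\left[1 - \left(1 - \dfrac{\sigma_t^2}{pq}\right)^{1/t}\right]} \tag{4.18}$$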
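Equation (4.18) is simple to evaluate directly. The following sketch is a hypothetical helper (the function name and input checks are assumptions) applied to the ratio σ²/pq = 0.135 and t = 2.7 that are reported for the Speke’s gazelle herd later in this section.

```python
def variance_effective_size(var_ratio, t):
    """Eq. (4.18): var_ratio is sigma_t^2 / (pq), and t is the number of
    generations separating the population from its reference generation."""
    if not 0.0 < var_ratio < 1.0:
        raise ValueError("sigma_t^2 / pq must lie strictly between 0 and 1")
    if t <= 0:
        raise ValueError("t must be positive")
    return 1.0 / (2.0 * (1.0 - (1.0 - var_ratio) ** (1.0 / t)))

# Values quoted in the text for the 15 gazelles relative to the four founders
nev_gazelle = variance_effective_size(var_ratio=0.135, t=2.7)
print(round(nev_gazelle, 1))  # approximately 9.6, as in the text
```

The function depends only on the variance ratio and the number of generations, not on census size, which matches the point that effective size is defined by the genetic feature and the reference generation.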

The variance effective size measures how rapidly allele frequencies are likely to change and/or how rapidly isolated subpopulations diverge from one another under genetic drift. As with inbreeding effective size, variance effective size is defined solely in terms of the genetic feature of interest (in this case, the variance of allele frequency), the reference population and time and not in terms of census size. The variance effective size also measures how rapidly a population loses genetic variation under drift as measured relative to the maximum variance in allele frequency (Eq. 4.16). With regard to this last biological interpretation, there is another effective size called the eigenvalue effective size that directly measures the rate at which alleles become fixed or lost. Because of its mathematical complexity, we will not deal with the eigenvalue effective size in any detail in this book. We simply note that the eigenvalue and variance effective sizes both tend to have similar values because they are measuring the same biological phenomenon (loss of variation due to drift) although through different measures. The existence of the eigenvalue effective size also serves as a reminder that many types of effective sizes exist. Once again, there is no such thing as “the” effective size; there are many different effective sizes, each with its own unique biological meaning. Effective size is defined with respect to a genetic feature of interest and a reference generation. As the genetic feature of interest changes, and as the reference generation changes, the effective size changes. Hence, a population can be characterized by several different “effective sizes” simultaneously for a variety of genetic parameters and reference generations. To show this, consider once again the captive herd of Speke’s gazelle. 
There are no replicate populations of this herd, but because we know the entire pedigree, we can simulate drift at a hypothetical autosomal locus keeping the pedigree structure constant in order to create the analog of multiple replicate populations (MacCleur et al. 1986). Using the original four founding animals as the reference generation, the variance across 10 000 simulated replications of the actual pedigree structure resulted in σ 2/pq = 0.135 for the census population of 15 animals born under the Templeton and Read (1984) breeding program. Recall that these animals are 2.7 generations from the original founding animals, so t = 2.7. Using these values in Eq. (4.18), the variance effective size of these 15 animals relative to the original founders is 9.6. Recall that the inbreeding effective size for these same 15 animals relative to the original founders was 8.6. Thus, the variance effective size is 12% larger than the inbreeding effective size for these particular animals relative to the reference founding generation. There is no a priori reason for these two effective sizes to be equal, and the difference between inbreeding and variance effective size in this case means that these animals accumulated pedigree inbreeding faster than they lost genetic variation, primarily due to the system of mating that existed prior to the initiation of the Templeton and Read (1983, 1984) program. We can eliminate the impact of the system of mating that existed prior to the Templeton and Read (1983, 1984) program by using as the reference population the parents of the 15 animals from the first generation of the breeding program. Using the exact pedigree structure and


computer-simulated replicates, the increase in variance in that single generation implies a variance effective size of 20.1. As with the inbreeding effective size calculated earlier, note that the variance effective size in this case is also larger than the census size of 15. Thus, the breeding program is preserving genetic variation in this population at a rate above that of the census size. Once again, the common wisdom that effective sizes are always smaller than census sizes is violated in this case for both variance and inbreeding effective sizes. However, although the variance effective size is only modestly larger than the census size, recall that the inbreeding effective size for these same 15 animals was 48.1, that is, the inbreeding effective size is now 239% of the variance effective size for the same population of gazelles. When the four founding animals were the reference generation, the variance effective size was larger than inbreeding effective size (9.6 versus 8.6), but now it is much smaller (20.1 versus 48.1). This illustrates the critical importance of the reference generation for both inbreeding and variance effective sizes. Moreover, the large discrepancy between inbreeding and variance effective sizes under the Templeton and Read (1983, 1984) breeding program serves as a caution that these two types of effective sizes are biologically extremely distinct and that the idea of “the” effective size is meaningless and misleading. The difference between the inbreeding and variance effective number for the Speke’s gazelle relates to a widespread misconception that inbreeding causes a loss of allelic diversity. This misconception stems from observations like that made with the Speke’s gazelle herd; this herd is becoming inbred in an identity-by-descent sense and is simultaneously losing allelic variation. 
The effective size calculations show, however, that these two genetic processes are occurring at different rates, particularly after the initiation of the Templeton and Read (1983) breeding program. Inbreeding, in the system of mating sense (f), has no direct impact on gamete frequencies, and hence does not promote or retard the loss of genetic variation by itself (Chapter 3). How about average pedigree inbreeding (F)? Consider an infinitely large selfing population. Such a population will have a high value of F but no loss of allelic variation due to drift. Thus, inbreeding, as measured by either f or F, has no direct impact on losing allelic variation. What is really going on is a correlation between inbreeding and loss of allelic variation; genetic drift causes a reduction in genetic variability, as measured by Nev, and genetic drift increases inbreeding in the sense of average probability of identity-by-descent, as measured by Nef, leading to a negative correlation between genetic variation and inbreeding in many finite populations. Hence, genetic drift, not inbreeding, is the actual cause of loss of allelic variation in a finite population. This distinction between cause and correlation is critical because correlated relationships can be violated in particular instances. For example, population subdivision (to be discussed in Chapter 6) can under some circumstances decrease the rate of loss of genetic variation but increase the rate of accumulation of identity-by-descent; that is, subdivided populations too can be more inbred but have higher levels of overall allelic diversity than a single panmictic population of size equal to the sum of all of the subpopulations. Hence, in some realistic biological situations, factors that increase pedigree inbreeding result in a greater retention of allelic diversity. Hence, inbreeding does not cause a loss of allelic diversity in finite populations; rather, the loss of allelic diversity is due to finite size itself. 
In general, inbreeding effective sizes are primarily sensitive to the number of parents and their reproductive characteristics (or other generations even more remote in the past, depending upon the reference generation), whereas variance effective size is primarily sensitive to the number of offspring and their attributes. As an example, the simulations of the bottleneck effect shown in Figure 4.5b dealt with populations that were ideal except for fluctuating population size. In particular, N = 1000, up to and including generation 20, but population size dropped to 4 in generation 21, and then went back up to 1000 in generation 22. Suppose we start with generation 20 as our


reference generation. What are the inbreeding and variance effective sizes of generation 21? First, we calculate inbreeding effective size. Since gametes are randomly sampled from all 1000 parents to produce the 4 individuals of generation 21, the probability that 2 uniting gametes are derived from the same parent is simply 1/1000. Because there is no inbreeding in the parents (by definition, since we made it the reference generation), the probability that two such gametes involved in a selfing event are identical-by-descent is 1/2. Hence, F(21) = 1/2000 which means that the inbreeding effective size of generation 21 is 1000 even though the census size is 4. Now, consider the variance effective size. Since only 8 gametes are sampled, the variance of p in generation 21 is pq/8, which implies a variance effective size of 4 at generation 21 using generation 20 as the reference generation. Hence, the same four individuals at generation 21 have two very different “effective sizes” (1000 versus 4) for the genetic parameters of identity-by-descent versus variance of allele frequencies relative to generation 20. In non-ideal populations (that is, all real populations), inbreeding and variance effective sizes are generally different. Hence, the phrase “the effective size of the population” is meaningless unless the genetic feature of interest and the reference generation are both specified. We will now examine the differences between inbreeding and variance effective sizes in more detail.
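The arithmetic of this bottleneck example can be reproduced in a few lines. The sketch assumes that both effective sizes are obtained by solving (1 − 1/(2Ne))^t = 1 − ratio for Ne, where the ratio is F(t) for the inbreeding effective size and σ²/pq for the variance effective size (the form of Eq. (4.18)); the helper name is hypothetical.

```python
def effective_size_from_ratio(ratio, t):
    """Solve (1 - 1/(2 Ne))^t = 1 - ratio for Ne.
    ratio is F(t) for inbreeding size, or sigma_t^2/(pq) for variance size."""
    return 1.0 / (2.0 * (1.0 - (1.0 - ratio) ** (1.0 / t)))

parents = 1000    # generation 20, the reference generation
offspring = 4     # generation 21, the bottleneck

# Probability two uniting gametes come from the same parent (1/1000),
# times the probability they are then identical-by-descent (1/2)
f_21 = (1.0 / parents) * 0.5

# Variance of p after sampling 2 * 4 = 8 gametes, expressed as sigma^2 / pq
var_ratio_21 = 1.0 / (2 * offspring)

nef = effective_size_from_ratio(f_21, t=1)
nev = effective_size_from_ratio(var_ratio_21, t=1)
print("inbreeding effective size:", nef)   # about 1000
print("variance effective size:", nev)     # about 4
```

The same four individuals therefore yield two very different numbers depending on which genetic feature is being tracked.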

Some Contrasts Between Inbreeding and Variance Effective Sizes

Population geneticists have derived several equations that relate inbreeding and variance effective sizes to actual population size under a variety of deviations from the idealized assumptions. For example, suppose the population size is not constant but fluctuates from generation to generation, with all other idealized assumptions holding true. In analogy to Eq. (4.6), the “heterozygosities” at various generations are given by:

$$\begin{aligned}
H(1) &= 1 - \frac{1}{2N(0)} \\
H(2) &= \left[1 - \frac{1}{2N(0)}\right] \times \left[1 - \frac{1}{2N(1)}\right] \\
&\;\;\vdots \\
H(t) &= 1 - F(t) = \prod_{i=1}^{t}\left[1 - \frac{1}{2N(i-1)}\right]
\end{aligned} \tag{4.19}$$

rather than $[1 - 1/(2N)]^{t}$ for H(t) as in Eq. (4.6). Using Eq. (4.7), we have

$$1 - F(t) = \prod_{i=1}^{t}\left[1 - \frac{1}{2N(i-1)}\right] = \left[1 - \frac{1}{2N_{ef}}\right]^{t} \tag{4.20}$$

So,

$$t \times \ln\left[1 - \frac{1}{2N_{ef}}\right] = \sum_{i=1}^{t}\ln\left[1 - \frac{1}{2N(i-1)}\right] \tag{4.21}$$

where ln is the natural logarithm. If all the N’s are large numbers, then using a Taylor’s series expansion from calculus, ln[1 − 1/(2N)] ≈ −1/(2N). Under the assumption that all N’s are large, the minus signs on the two sides cancel and an approximation to Eq. (4.21) is:

$$\frac{t}{2N_{ef}} = \sum_{i=1}^{t}\frac{1}{2N(i-1)} \tag{4.22}$$

Solving Eq. (4.22) for Nef, an approximation to the inbreeding effective size is (Crow and Kimura 1970):

$$N_{ef} = \frac{t}{\dfrac{1}{N(0)} + \dfrac{1}{N(1)} + \cdots + \dfrac{1}{N(t-1)}} \tag{4.23}$$

that is, the inbreeding effective size is approximately the harmonic mean of the population sizes of the previous generations going back to the initial reference generation. Harmonic means are very sensitive to low values. As a consequence, a single generation of low population size can allow drift to have a major impact on the accumulation of identity-by-descent, as we have already seen for founder and bottleneck effects (e.g. Figures 3.4, 4.5, and 4.7). Now, consider the impact of fluctuating population size in an otherwise idealized population upon variance effective size. The equation analogous to Eq. (4.17) now becomes:

$$\sigma_t^2 = pq\left\{1 - \prod_{i=1}^{t}\left[1 - \frac{1}{2N(i)}\right]\right\} \tag{4.24}$$

In a proof similar to that given for inbreeding effective size, the variance effective size is approximately given by:

$$N_{ev} = \frac{t}{\dfrac{1}{N(1)} + \dfrac{1}{N(2)} + \cdots + \dfrac{1}{N(t)}} \tag{4.25}$$
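To make the index shift between Eqs. (4.23) and (4.25) concrete, the sketch below computes both harmonic-mean approximations for a hypothetical population founded by 4 individuals that then stays at 100, and compares them with the exact values obtained from the products in Eqs. (4.20) and (4.24). The population sizes and function name are illustrative assumptions.

```python
from math import prod
from statistics import harmonic_mean

def exact_effective_size(sizes):
    """Solve prod(1 - 1/(2 N_i)) = (1 - 1/(2 Ne))^t for Ne,
    where t is the number of generations in `sizes`."""
    t = len(sizes)
    retained = prod(1.0 - 1.0 / (2.0 * n) for n in sizes)
    return 1.0 / (2.0 * (1.0 - retained ** (1.0 / t)))

# Hypothetical sizes N(0), N(1), N(2), N(3): a founder generation of 4
sizes = [4, 100, 100, 100]

parents = sizes[:-1]    # N(0) .. N(t-1), the generations used in Eq. (4.23)
offspring = sizes[1:]   # N(1) .. N(t),   the generations used in Eq. (4.25)

nef_approx = harmonic_mean(parents)
nev_approx = harmonic_mean(offspring)
nef_exact = exact_effective_size(parents)
nev_exact = exact_effective_size(offspring)

print("Nef: harmonic mean", nef_approx, "exact", nef_exact)
print("Nev: harmonic mean", nev_approx, "exact", nev_exact)
```

Because the founder generation enters only the parental list, the inbreeding effective size is pulled far below the variance effective size. The harmonic mean is also least accurate when a very small N is present, since the large-N Taylor approximation is then poor for that generation.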

As with the inbreeding size given earlier, the variance effective size is approximated by the harmonic mean of the population sizes. However, there is one critical difference between Eqs. (4.23) and (4.25): in Eq. (4.23), the inbreeding effective size is a function of the population sizes from generation 0 to generation t-1; in Eq. (4.25), the variance effective size is a function of the population sizes from generation 1 to generation t. This shift in the generational indices between 4.23 and 4.25 reflects the greater dependence of inbreeding effective size upon the parents (generations 0 through t−1) versus the greater dependence of variance effective size upon the offspring (generations 1 through t). Whether or not this shift makes much of a difference depends upon the sizes at generation 0 and t. For example, in the Speke’s gazelle captive population, a natural reference generation would be the four founding animals because we had no detailed information about any previous generation. As we already saw, the founder effect of this generation of four animals had a pronounced effect on the inbreeding effective sizes of subsequent generations. However, the reference founder generation of 4 (t = 0) would not directly affect the variance effective size of this population, resulting in variance effective sizes that are larger than inbreeding effective sizes for this herd when the four founders are the reference generation, as we have previously seen. Consider now another single deviation from our idealized situation – a deviation from random mating. As we saw earlier with the Speke’s gazelle, the inbreeding effective size of the first generation of 15 animals bred under the Templeton and Read (1983, 1984) breeding program relative to their grandparents was 48.1, whereas the variance effective size was 20.1. 
This great excess of inbreeding effective size over census size and variance effective size was primarily due to the system of mating characterized by extreme avoidance of inbreeding (recall from Chapter 3 that f = −0.291


under the Templeton and Read program). Avoidance of inbreeding had a strong and direct impact upon the rate of accumulation of identity-by-descent, but had only a modest impact upon variance effective size. Likewise, we would expect a system of mating that promoted inbreeding as a deviation from Hardy–Weinberg to decrease inbreeding effective size relative to variance effective size. To see this analytically, we will use one of the mathematical interpretations of f used in the population genetic literature (e.g. Li 1955). Suppose that a population has a system of mating that promotes inbreeding such that a fraction f of individuals is completely inbred (that is, they are homozygous through identity-by-descent), and the remainder of the population (1 − f) reproduces through random mating. This mixture results in the same genotype frequencies as other interpretations of f (Eq. 3.2) but is easier to work with analytically in investigating interactions with drift (although this interpretation of f is defined only for f ≥ 0). First, consider the impact of f upon the rate of accumulation of F in an otherwise ideal population of constant size N. The analog of Eq. (4.3) is now:

$$F(t) = f + (1 - f)\left[\frac{1}{2N} + \left(1 - \frac{1}{2N}\right)F(t-1)\right] \tag{4.26}$$

The first term in the right-hand side of Eq. (4.26) represents the direct impact of the inbreeding system of mating upon identity-by-descent, and the part in the brackets is identical to Eq. (4.3) and represents the impact of genetic drift in the random mating fraction (1 − f). Just as we did for Eq. (4.3), we can use Eq. (4.26) to predict the average identity-by-descent at generation t in terms of an initial reference generation for which we set F(0) = 0:

$$F(t) = 1 - \left[(1 - f)\left(1 - \frac{1}{2N}\right)\right]^{t} \tag{4.27}$$

Substituting Eq. (4.27) into Eq. (4.10), we have that inbreeding effective size under this inbreeding system of mating characterized by f is:

$$N_{ef} = \frac{N}{1 + f(2N - 1)} \tag{4.28}$$

Now consider the impact of f upon the variance of allele frequency given an initial frequency of p. The random mating fraction of the population has the standard variance of pq/(2N). However, the inbred fraction is totally homozygous through identity-by-descent, so individuals from this portion can only pass on one type of gamete per individual. Hence, the variance in the inbred fraction is pq/N. Therefore, the total variance in allele frequency induced by drift is:

$$\text{Variance in Allele Frequency} = (1 - f)\frac{pq}{2N} + f\,\frac{pq}{N} = \frac{pq(1 + f)}{2N} \tag{4.29}$$

The variance effective size under this deviation from random mating is therefore (see Li 1955 for more details):

$$N_{ev} = \frac{N}{1 + f} \tag{4.30}$$
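Equations (4.28) and (4.30) are easy to tabulate side by side. The sketch below uses hypothetical function names and f = 0.1 with a handful of illustrative census sizes to show how differently the two effective sizes respond as N grows.

```python
def inbreeding_effective_size(n, f):
    """Eq. (4.28): Nef = N / (1 + f (2N - 1))."""
    return n / (1.0 + f * (2.0 * n - 1.0))

def variance_effective_size(n, f):
    """Eq. (4.30): Nev = N / (1 + f)."""
    return n / (1.0 + f)

f = 0.1  # inbreeding system of mating, as in the text's example
for n in (2, 10, 50, 200, 10_000):
    nef = inbreeding_effective_size(n, f)
    nev = variance_effective_size(n, f)
    print(f"N = {n:>6}  Nef = {nef:8.3f}  Nev = {nev:10.3f}")

# As N grows, Nef approaches 1 / (2 f), whereas Nev keeps growing linearly
print("limit of Nef for large N:", 1.0 / (2.0 * f))
```

With f = 0 both functions return N, which is a convenient sanity check on the formulas.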

Note that both Eqs. (4.28) and (4.30) yield effective sizes equal to N when f = 0. This makes sense because we are assuming that the populations are ideal with the only exception of system of mating. When f = 0, the populations are completely ideal, and hence, there is no distinction between an effective size of any type and the census size when all idealized conditions are satisfied.

[Figure 4.9 plot: curves Nef and Nev; vertical axis, Effective Population Size (0 to 200); horizontal axis, Population Size N (0 to 200).]

Figure 4.9 The inbreeding effective size and variance effective size as a function of population size N. The population is assumed to satisfy all the idealized assumptions except for random mating. The population is assumed to have an inbreeding system of mating such that f = 0.1.

When f > 0, however, these two types of effective size are not generally the same. To illustrate the difference between Eqs. (4.28) and (4.30), consider the special case when f = 0.1 and N varies from 2 to 200, as plotted in Figure 4.9. As can be seen from this figure, there is a uniform reduction in the variance effective size relative to the census size for all values of N such that Nev is a linear function of N. In great contrast, Nef quickly levels off close to a value of 5. Why this difference that becomes more extreme with increasing N? The reason is that the system of mating, regardless of population size, is augmenting F by 0.1 per generation in this case. An ideal random mating population of size 5 would also create 0.1 new identity-by-descent every generation [1/(2N) = 0.1 when N = 5]. Hence, as the census size of this nonrandomly mating population becomes increasingly large, the impact of genetic drift becomes increasingly small such that the total rate of accumulation of average identity-by-descent converges to the rate exclusively due to nonrandom mating. This results in an asymptotic inbreeding effective size of 5. Equations (4.28) and (4.30) and Figure 4.9 reinforce our earlier conclusion that inbreeding and variance effective sizes can be influenced in extremely different ways by biologically plausible and common deviations from our set of idealized assumptions. There are many other equations in the population genetic literature that are either special cases or approximations to Eqs. (4.10) and (4.18) that focus upon the impact of specific deviations or sets of deviations from our idealized set of assumptions. For example, Crow and Kimura (1970) give equations for several special cases of inbreeding and variance effective sizes. 
We will only consider one: when the offspring distribution no longer has a Poisson distribution with a mean and variance of two offspring per individual (the number required for a stable population size in a sexually reproducing diploid species) but rather has a mean of k offspring per individual with a variance of v. Let a population of N individuals fulfill all of the other idealized assumptions. Then, the inbreeding and variance effective sizes of this population of N individuals relative to the previous generation of N0 individuals (note that N = N0k/2) are (Crow and Kimura 1970):


$$N_{ef} = \frac{2N - 1}{k - 1 + \dfrac{v}{k}\left(1 - \dfrac{k}{2N}\right)}, \qquad N_{ev} = \frac{2N}{1 - f_0 + (1 + f_0)\dfrac{v}{k}} \tag{4.31}$$

where f0 is the deviation from Hardy–Weinberg genotype frequencies in the parental generation. Given that our idealized assumptions include random mating, one may wonder why f0 is appearing. The reason is that finite population size alone induces deviations from Hardy–Weinberg expected frequencies (we already saw this in testing for Hardy–Weinberg genotype frequencies with a chi-square statistic with finite samples; even with random mating, a finite sample is rarely exactly at the Hardy–Weinberg expectations). For an idealized monoecious population, such as the one we have here, the expected value of f0 induced by genetic drift is −1/(2N0 − 1) = −1/(4N/k − 1) (Crow and Kimura 1970). We will use this expected value of f0 in all evaluations of Eq. (4.31). One simple way of deviating from the idealized set of assumptions is to have k be a number other than 2. If k is less than 2, the population size is declining with time; if k > 2, the population size is increasing. To focus upon the impact of population decline or growth upon the effective sizes, we retain the assumption of a Poisson offspring distribution (v = k) and plot Eq. (4.31) for the special cases of k = 4 and k = 1 over the range of N from 2 to 200 (Figure 4.10). As can be seen, population increase or decline has very little impact on the variance effective size relative to the previous generation, with Nev ≈ N in both cases. In contrast, the inbreeding effective size is very sensitive to population growth or decline. When the population is declining, the inbreeding effective size is larger than N, and when the population is increasing, it is smaller than N. 
This difference in the behavior of these two effective sizes reflects the observation made earlier that inbreeding effective size is primarily sensitive to the parental attributes and numbers (which are less than N when the population is growing, and more than N when the population is declining), whereas the variance effective size is primarily sensitive to the offspring attributes and numbers (which is N by definition in both cases).
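The behavior described for growing and declining populations can be reproduced with a short sketch. The two formulas below are a reading of Eq. (4.31) as Nef = (2N − 1)/[k − 1 + (v/k)(1 − k/(2N))] and Nev = 2N/[1 − f0 + (1 + f0)v/k], with f0 set to its expected value −1/(4N/k − 1). The function names are hypothetical, and the formulas should be checked against Crow and Kimura (1970) before any serious use.

```python
def expected_f0(n, k):
    """Expected Hardy-Weinberg deviation in the parents: -1 / (4N/k - 1)."""
    return -1.0 / (4.0 * n / k - 1.0)

def inbreeding_effective_size(n, k, v):
    """Nef from the reading of Eq. (4.31) given in the lead-in."""
    return (2.0 * n - 1.0) / (k - 1.0 + (v / k) * (1.0 - k / (2.0 * n)))

def variance_effective_size(n, k, v, f0=None):
    """Nev from the reading of Eq. (4.31); f0 defaults to its expected value."""
    if f0 is None:
        f0 = expected_f0(n, k)
    return 2.0 * n / (1.0 - f0 + (1.0 + f0) * v / k)

n = 100  # offspring generation size
for k in (1.0, 2.0, 4.0):          # declining, stable, growing
    v = k                           # Poisson offspring distribution
    n_parents = 2.0 * n / k         # N0, from N = N0 * k / 2
    print(f"k = {k}: parents = {n_parents:.0f}, "
          f"Nef = {inbreeding_effective_size(n, k, v):.1f}, "
          f"Nev = {variance_effective_size(n, k, v):.1f}")
```

Under the Poisson assumption this reading gives an inbreeding effective size equal to the number of parents and a variance effective size equal to the number of offspring, in line with the contrast drawn in the text.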

[Figure 4.10 plot: curves labeled "Nef, k = 1", "Nev, k = 1, 4", and "Nef, k = 4"; vertical axis, Effective Population Size (0 to 200); horizontal axis, Population Size N (0 to 200).]

Figure 4.10 The inbreeding effective size (thick lines) and variance effective sizes (thin lines) as a function of population size N. The population is assumed to satisfy all the idealized assumptions except that it is either increasing with k = 4 (solid lines) or decreasing with k = 1 (dashed lines). The variance effective sizes for k = 4 and for k = 1 are virtually identical and appear as a single thin solid line.


We now focus on the impact of deviating from the Poisson assumption by plotting the effective sizes as a function of v, the variance in number offspring, for populations with k = 4 (Figure 4.11) and k = 1 (Figure 4.12) when N = 100. Figures 4.11 and 4.12 reveal that both types of effective size decrease as v increases, that is, the larger the variance in offspring number, the lower both types of effective population size. A contrast of Figures 4.11 and 4.12 reveals an interaction between the demographic parameters of k and v with Nef and Nev. When the population size is increasing, Nev is more sensitive to changes in v than is Nef (Figure 4.11), but the opposite is true when population size is declining (Figure 4.12). One consequence of this interaction is that as v gets larger, the difference between these two types of effective size becomes smaller (for example, contrast the results at v = 1 with v = 10 in Figures 4.11 and 4.12). These two figures reinforce our conclusion that inbreeding and variance effective size are distinct biological measures that should never be equated, and that each is affected in complex but distinct fashions when deviations from our idealized set of assumptions occur. Despite the clear implications of the equations and figures given in this section, the population genetic literature typically ignores the distinctions between inbreeding and variance (and other) effective sizes. For example, Wang (2016) compared several estimators of effective size. Sometimes, Wang (2016) distinguished between inbreeding and variance effective sizes, but often did not. The justification for this was the opinion that all effective sizes are the same “for an isolated or incompletely subdivided population of constant size…” (Wang 2016, p. 4694). This constant population size argument is widespread in much of the recent literature on effective size (e.g. the isolated, constant size argument also appears in Ryman et al. 2019). 
This claim is patently false and has been known to be false for many decades. For example, consider Figure 4.9 that is based on equations from Li (1955). All the assumptions of an idealized population are held in this figure, including constant population size and isolation, with just one exception: instead of random mating, there

[Figure 4.11 plot: curves Nef and Nev, with a straight line marked v = k = 4; vertical axis, Effective Population Size (0 to 200); horizontal axis, v, Variance of Number of Offspring Per Individual (0 to 10).]

Figure 4.11 The relationship between inbreeding (thick line) and variance (thin line) effective sizes with the variance in the number of offspring per individual in an otherwise idealized population of size 100 relative to the previous generation and with an average number of four offspring per parental individual. The intersections of the effective size curves with the straight line marked v = k = 4 show the effective sizes obtained under the ideal assumption of a Poisson offspring distribution.


[Figure 4.12 plot: curves Nef and Nev, with a straight line marked v = k = 1; vertical axis, Effective Population Size (0 to 200); horizontal axis, v, Variance of Number of Offspring Per Individual (0 to 10).]

Figure 4.12 The relationship between inbreeding (thick line) and variance (thin line) effective sizes with the variance in the number of offspring per individual in an otherwise idealized population of size 100 relative to the previous generation and with an average number of one offspring per parental individual. The intersections of the effective size curves with the straight line marked v = k = 1 show the effective sizes obtained under the ideal assumption of a Poisson offspring distribution.

is system of mating inbreeding with f = 0.1. According to Wang (2016) or Ryman et al. (2019), the variance and inbreeding effective sizes should be identical under these conditions, but they are obviously not and extreme differences between these two effective sizes are possible for these constant size, isolated populations (Figure 4.9). It is never safe to assume that inbreeding and variance effective sizes will ever be the same given any deviation from the idealized population assumptions. Even if we were to assume Wang (2016) or Ryman et al. (2019) were correct that constant size alone collapses all effective sizes into a single effective size, the constant size assumption is almost never true for natural populations. This is particularly true for the area of conservation biology, where effective sizes are frequently used in making management decisions. Almost all populations of concern to conservation biologists are either declining (threatened populations) or increasing (recovering populations), so variance and inbreeding sizes should never be equated in the area of conservation biology, yet they frequently are (Templeton 2011). The dangers of this equation are shown for African rhinoceros populations, which are of great concern to many conservation biologists. Table 4.1 shows the estimated inbreeding and variance effective sizes of several rhinoceros populations (Braude and Templeton 2009). Note that the census size is never equal to either effective sizes for any of these rhinoceros populations. In the Southern White Rhinoceros population, the census size is much larger than either of the effective sizes. This is consistent with the common expectation in much of the population genetic literature that effective sizes will almost always be much smaller than census sizes (e.g. Jamieson and Allendorf 2012, who advocate the “rule of thumb” that the effective size is one-tenth the census size). 
This common expectation is patently false, as shown by both the Black and Northern White Rhinoceros populations in which both types of effective sizes are larger than the census size, and substantially so in the case of the Black Rhinoceros. As we will see in Chapter 6, when calculating effective sizes in subdivided populations, effective sizes can be orders of magnitude larger than census sizes. The idea that effective

Table 4.1 Estimated wild African rhinoceros effective population sizes.

| Population | Census Size, 1997 | Inbreeding Effective Size | Variance Effective Size |
| --- | --- | --- | --- |
| Black Rhinoceros (Diceros bicornis) | 2600 | 18 840 | 4189 |
| Southern White Rhinoceros (Ceratotherium simum simum) | 8440 | 106 | 240 |
| Northern White Rhinoceros (Ceratotherium simum cottoni) | 23 | 69 | 41 |

Source: Braude and Templeton (2009).

sizes are almost always smaller than census sizes is a myth that has no theoretical nor observational justification. The other major lesson to be taken from Table 4.1 is that inbreeding effective sizes are not equal to variance effective sizes, and often substantially so. Moreover, as clearly shown in Table 4.1, sometimes, the inbreeding effective size is larger than the variance effective size, and, sometimes, the opposite is true. There is no such thing as the effective size for any of these populations. The distinction between variance and inbreeding effective sizes is particularly important in understanding the “50/500 rule” that is often invoked in the management of endangered species. Franklin (1980) introduced this rule for making management decisions about endangered species. The 50/500 rule stipulates that a population needs to be founded from at least 50 individuals, and increased as rapidly as possible to a population size of at least 500. Franklin (1980) makes it clear that both the 50 and 500 are effective sizes and not census sizes. He utilizes only the symbol Ne for effective size throughout his paper and never identifies in equations or text which effective size he is using (one parenthetical phrase does mention variance effective size, but no use of that fact is made in any equation or recommendation). The “50” part of the rule clearly is an inbreeding effective size. Franklin (1980) in this part of the paper is concerned about inbreeding depression and the rate at which pedigree inbreeding accumulates in a finite population. Recall from Eqs. (4.1) and (4.3) that new inbreeding accumulates at a rate of 1/(2Nef). Franklin then notes (p. 
140) that “Animal breeders accept inbreeding coefficients as high as a one percent increase per generation (that is, Ne = 50) in domestic animals without great concern.” As pointed out earlier in our discussion on inbreeding depression, different species respond to pedigree inbreeding very differently, and even a single population, such as the captive herd of Speke’s gazelle, can dramatically change its level of inbreeding depression in just a few generations. Hence, we already know that there is no universal rule about tolerance of pedigree inbreeding, but this diversity is simply ignored by this rule that is based on the unreferenced opinion(s) of an unspecified number of anonymous breeders of domestic animals. Yet, the fate of some endangered species has hinged on this number of 50 that is based only on anecdotal support from anonymous animal breeders. Indeed, the Speke’s gazelle breeding program of Templeton and Read (1983) was a response to a potential management decision to not waste any more resources on this population because it obviously did not satisfy the “50” part of the rule and therefore was doomed to extinction anyway, at least in the eyes of some bureaucrats. The “500” part of the rule refers to the balance between genetic drift causing a loss of genetic variation versus mutation restoring genetic variation. The “500” is clearly a variance effective size. The value Nev = 500 rests on “meager information” (Franklin 1980, p. 142) arising from a single experiment on the drift/mutation balance for abdominal bristle number in homozygous lines


of Drosophila. Such a balance depends upon the number of loci contributing to the phenotype of interest, the genetic architecture (epistasis, pleiotropy, etc.), and the locus-specific mutation rates in the particular lines in the particular species being studied. Moreover, as we will see in Chapter 6, population subdivision can alter this balance by orders of magnitude. There is no basis for the generality of this recommendation, so the 500 part of the rule is as insubstantial as the 50 part of the rule. Nevertheless, the “wasted resources” argument has been used to argue against conservation of endangered species on the basis of the “500” number as well as the “50” number (Jamieson and Allendorf 2012).
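Returning to the “50” part of the rule, the arithmetic behind the one percent figure can be checked directly. The sketch below assumes the standard idealized recursion in which average pedigree inbreeding after t generations is F(t) = 1 − (1 − 1/(2Nef))^t, starting from F(0) = 0 and with Nef held constant; the function names are illustrative only.

```python
def inbreeding_rate(nef):
    """Per-generation rate at which new inbreeding accumulates, 1/(2*Nef)."""
    return 1.0 / (2.0 * nef)


def accumulated_f(nef, t):
    """Average F after t generations, assuming F(0) = 0 and a constant Nef."""
    return 1.0 - (1.0 - inbreeding_rate(nef)) ** t


# Nef = 50 corresponds to the "one percent increase per generation"
# quoted from Franklin (1980).
print(inbreeding_rate(50))
# Inbreeding accumulated over 10 generations at that rate.
print(accumulated_f(50, 10))
```

The calculation only restates the rate; it says nothing about how much inbreeding a given species can tolerate, which is the point at issue in the text.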

Estimating Effective Population Sizes

When complete demographic information exists, particularly through time, it is possible to calculate effective sizes directly, as was done in the case of the captive herd of Speke’s gazelle presented earlier in this chapter. However, very few populations have such complete information, so alternatives have been developed. There are three major ways of estimating the effective sizes of a local deme: incomplete demographic information, a single genetic sampling of the deme, and two or more temporal samples of genetic variation within the deme.

Estimating effective sizes from incomplete demographic information utilizes the many equations in the population literature for effective size that examine the impact of one or more deviations from the idealized set of assumptions, such as Eqs. (4.23), (4.25), (4.28), (4.30), and (4.31). These equations and many others can be used to estimate various effective population sizes by the measurement of specific ecological and demographic variables (such as k or v, or the harmonic means of census sizes). Although this patchwork approach breaks the problem of estimating effective population size into more easily managed chunks, one should never forget that equations such as (4.23), (4.25), (4.28), (4.30), and (4.31) are only special cases or approximations to an effective size that include the impact of some ecological and demographic factors but ignore others. Given the strong interactions that exist among the factors influencing effective sizes, an estimation procedure that ignores some factors may often be unreliable. For example, suppose we measured a population’s census size to be 100 and estimated the demographic parameter k to be 1.
All unmeasured factors are generally assumed to correspond to the ideal case, so these two demographic measurements would cause us to estimate the inbreeding effective size to be 200 and the variance effective size to be 100 (the intersection with the v = k = 1 line in Figure 4.12). However, suppose that the unmeasured demographic parameter v was in reality 10. In that case, both effective sizes would be about 25 (Figure 4.12). Hence, by ignoring the variance in offspring number in this case, we would have seriously overestimated both types of effective sizes and would have erroneously concluded that the inbreeding effective size was much larger than the variance effective size. Because of the strong interactions among factors, the failure to estimate even a single demographic parameter can seriously affect the entire estimation procedure.

Another difficulty in this estimation approach is that much of the population genetic literature presenting such equations simply refers to “the effective population size” and does not specify if the equations they present apply to inbreeding, variance, or some other effective size measure – once again, the fallacy of the effective population size. Consequently, it is not always clear which equation should be used. For example, should Eq. (4.28) or (4.30) be used to correct for an inbreeding system of mating? The answer obviously depends upon whether one wants to estimate the inbreeding effective size or the variance effective size, but confusion often reigns when neither the paper presenting such equations identifies what type of effective size is being studied nor the investigators using the equations know which type of effective size they want to estimate. Further complicating


the use of such equations is that many of them look nearly identical superficially, such as Eqs. (4.23) and (4.25), yet can sometimes yield very different results. Moreover, often more than one demographic parameter is estimated, and equations are combined together to yield an estimate of effective sizes. For example, suppose an inbreeding population (f > 0) is followed over several generations and N is recorded each generation. One could correct the N for every generation for inbreeding through either Eq. (4.28) or (4.30), and then correct for fluctuations in N across generations by using the Ns corrected for f in Eqs. (4.23) and (4.25). Only two combinations of these equations are biologically defensible: Eq. (4.28) coupled with Eq. (4.23) to estimate inbreeding effective size, and Eq. (4.30) coupled with Eq. (4.25) to estimate variance effective size. Any other combination is biologically meaningless and does not represent any sort of effective size. Hence, the demographic approach should only be used when the investigator is clear about what type of effective size is to be estimated and is sure that every equation used to correct for a specific deviation from the ideal applies to that same type of effective size. This turns out not to be an easy task given the state of the literature, which rarely identifies the type of effective size being modeled.

A second approach to estimating effective sizes is through genetic surveys. This approach in turn can be subdivided into two basic strategies: single sample surveys and temporal surveys that monitor a population over time. Wang (2016) reviewed the single sample estimators and compared their properties through computer simulations. The simplest single sample estimator is based on heterozygosity excess, HE, and this is an estimator of variance effective size.
Just as genetic drift induces changes in allele frequency between generations (Figures 4.1–4.4), it will also induce differences in allele frequencies between males and females within a generation in any finite population with separate sexes. For example, suppose that the frequency of the A allele at an autosomal locus is pf in females and pm in males, with the overall frequency of the A allele being, assuming a 50 : 50 sex ratio, p = (pf + pm)/2. Under the idealized assumptions of Hardy–Weinberg, with no genetic drift (in this case meaning that pf = pm = p) and random mating, the frequency of the AA homozygotes is p², as shown in Chapter 2. However, in this finite population with potentially different allele frequencies in males and females due to genetic drift, the expected frequency of AA under random mating is pf pm. The difference is pf pm − p² = −(pf − pm)²/4. Note that under idealized Hardy–Weinberg, males and females should have the same allele frequencies, so this difference is zero. However, under drift, this difference will be negative whenever pf and pm differ, that is, there is a deficiency in homozygote frequency relative to Hardy–Weinberg equilibrium, and likewise an excess of heterozygote frequency. The (pf − pm)² term is expected to increase as variance effective size decreases, and this property is used to estimate the variance effective size under the HE method. Obviously, this method is applicable only to populations with separate sexes and that are randomly mating for almost all loci. Even slight deviations in system of mating f from zero can seriously undercut this estimator. Even worse, deviations from random mating (f ≠ 0) are often a function of population size itself. Recall from Chapter 3 that f = −1/(N − 10) where N is the ideal population size when matings between relatives up to and including first cousins are avoided, but otherwise mating is random.
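The homozygote deficiency identity used by the HE method can be verified numerically. The following is a minimal sketch with made-up allele frequencies; the function name and the example values are illustrative and are not taken from any data set.

```python
def homozygote_deficit(pf, pm):
    """Difference between the AA frequency expected when female and male
    gametes unite at random (pf * pm) and the Hardy-Weinberg value p**2,
    where p = (pf + pm) / 2 under a 50:50 sex ratio."""
    p = (pf + pm) / 2.0
    return pf * pm - p ** 2


# Hypothetical female/male allele frequencies with increasing divergence.
for pf, pm in [(0.5, 0.5), (0.55, 0.45), (0.7, 0.3)]:
    deficit = homozygote_deficit(pf, pm)
    closed_form = -((pf - pm) ** 2) / 4.0
    # The two expressions agree up to floating-point rounding.
    assert abs(deficit - closed_form) < 1e-12
    print(pf, pm, deficit)
```

The deficit is zero only when the two sexes share the same allele frequency, and its magnitude grows with the square of the difference between them.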
When N is much greater than 10, these deviations from Hardy–Weinberg are trivial, but when N is small they can be substantial. Many species have mechanisms for avoidance of mating with close relatives, so the HE method becomes unreliable in small populations because mating is significantly nonrandom. Moreover, genetic drift also induces LD (Figure 4.8), which means that even a handful of loci displaying assortative or disassortative mating will influence more and more linked loci through LD as population size gets smaller and thereby induce deviations from Hardy–Weinberg equilibrium. This further undermines the HE method for small populations. Finally, even when all of its idealized assumptions are satisfied, the HE method


is less accurate than alternative single sample methods (Wang 2016). Hence, the HE method is a poor choice for estimating variance effective size.

The molecular coancestry (MC) method (Nomura 2008) is based on the average coefficient of kinship (Chapter 3, also called coancestry) between non-sibling pairs (which does not require pedigree information, but rather the exclusion of pairs with a coefficient of kinship not significantly different from ¼). This average coefficient of kinship should increase at a rate inversely proportional to Nef in an isolated population and, hence, provides an estimator for the inbreeding effective size. This estimator was imprecise and biased in the simulations of Wang (2016), but these simulations and the worked example given were directed at microsatellite genetic surveys that involve relatively few loci (tens to hundreds) with multiple alleles per locus and much homoplasy. As noted in Chapter 3, such genetic surveys result in poor estimates of both the coefficient of kinship and the pedigree inbreeding coefficient. In contrast, Browning and Browning (2015) used IBD (identity-by-descent) segments (Chapter 3) to exclude half and full siblings and parent–offspring pairs to estimate the inbreeding effective size and checked its accuracy with simulated SNP data. Unlike Wang (2016), Browning and Browning (2015) found this method to be accurate when IBD segments were used, even when population sizes were not constant. Moreover, recall that IBD segment length decreases with time (Chapter 3), so there is also information about past effective sizes in the segment lengths. Depending upon the density of the marker data, Browning and Browning (2015) showed that their MC method can infer inbreeding effective sizes over time from 4 to 200 generations ago, albeit with increasing error. For example, Kardos et al.
(2017) used IBD segments to estimate the inbreeding effective sizes for the Baltic collared flycatcher (Ficedula albicollis) for the past 150 generations (Figure 4.13). As can be seen from the 95% confidence intervals (CIs) in Figure 4.13, the estimators are very precise going back to about 30 generations and then become less and less reliable with increasing generations into the past. Kardos et al. (2017) also show that ROH (runs of homozygosity) (Chapter 3) in individuals can also be informative about inbreeding effective size, which is not surprising given that ROH and IBD segments are both measuring identity-by-descent in the genome (Chapter 3). When genomic data are available to identify IBD segments or ROH’s, the MC method provides an accurate method of estimating the inbreeding effective size and its changes through time.

Figure 4.13 Estimates of the inbreeding effective size over time (thick, solid line) based on pairwise IBD segments in the Baltic collared flycatcher population. The dashed lines enclose the 95% confidence interval for Nev. Source: Modified from Figure 11 in Kardos et al. (2017).

[Figure 4.13 plot: Nev (thousands) (y-axis, 0–40) against Generations back (x-axis, 0–150).]

Whereas the MC method excludes siblings, Wang (2009) estimated the inbreeding effective size based on the frequency of full- and half-sibling dyads in the sample (the sibling frequency, SF, method). This is a flexible method that performed well in the simulations of Wang (2016), but it is computationally demanding. Like all methods dependent upon estimating coefficients of kinship, it performs best when a large number of markers are used. Moreover, full-siblings are easier to detect than half-siblings, so its performance is poorer when both males and females are polygamous such that half-siblings are much more frequent than full-siblings. This sensitivity can be reduced by using a prior distribution (Appendix B) on sibship frequencies (Wang 2016).

A commonly used single sample estimator of variance effective size is based on the fact that genetic drift induces linkage disequilibrium, as shown in Figure 4.8. Hill (1981) derived an LD estimator of the variance effective size from the theoretical dependency of LD upon both genetic drift and recombination rate. Hill’s method has subsequently been improved by a number of authors to reduce various biases, as reviewed in Corbin et al. (2012). An equation that incorporates many of these corrections is (based on combining equations from Barbato et al. 2015 and Corbin et al. 2012):

$$N_{ev} = \frac{1}{4f(c)}\left(\frac{1}{E\left[r^2_{adj}\right]} - a\right), \quad \text{where } r^2_{adj} = r^2 - \frac{1}{bn} \tag{4.32}$$

where c is the recombination frequency between the pair of markers, f(c) is a function of the recombination frequency that is typically equated to a classical mapping function such as the Haldane (1919) mapping function, $r^2$ is the LD between the marker pair as measured by Eq. (2.16), $r^2_{adj}$ is the LD adjusted for sample size, n, where b = 2 if the phase in double heterozygotes is known and b = 1 if the phase is unknown, and a is a correction for mutation (more on this shortly). Because the expected or average value of $r^2_{adj}$ is used, marker pairs with similar distances (unlinked or, if linked, at similar physical or ideally recombinational distances in the genome) are pooled together in a bin (Barbato et al. 2015).

One difficulty with this method is the dependency upon the recombination rate, which is often not known. One way of avoiding this problem is to examine the LD between only unlinked markers since the recombination rate is known in that case. This is a frequently used strategy when small numbers of markers are surveyed, but, with genomic surveys, this approach would exclude much of the potential information in the data set. Moreover, simulations reveal that the LD estimator becomes increasingly inaccurate as more linked markers are included when assuming only unlinked markers in the estimation procedure (Wang 2016), so it is important to treat linked markers as linked. This distinction can be biologically useful, as unlinked marker pairs are more sensitive to recent events that affect Nev, whereas linked loci give more information about events in the past (recall from Chapter 2 the slow decay of LD over time for closely linked loci) (Barbato et al. 2015).
Few organisms have detailed recombination maps of their genomes, but many do have detailed physical maps, that is, the distance between linked sites is known in base pairs, and it has been assumed in some organisms such as humans, cattle, and horses that the distance between markers in Morgans (a unit of recombination) is proportional to the physical distance (Corbin et al. 2012; Barbato et al. 2015). However, in many organisms, recombination is concentrated into hotspots (including humans, Chapter 1), so this is a poor assumption when dealing with closely linked markers. Typically, the unevenness of the recombination rate across the physical genome is ignored, with unknown consequences on the estimator of variance effective size.
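To make the structure of Eq. (4.32) concrete, the sketch below evaluates it for one bin of marker pairs. It is an illustration under stated assumptions and not the implementation used by any published package: the function name is hypothetical, f(c) defaults to the identity f(c) = c (an assumption; a mapping function can be passed in instead), and the r² values in the usage lines are invented.

```python
import math


def ld_variance_effective_size(r2_values, n, c, b=1, a=2.2, f=None):
    """Evaluate Eq. (4.32) for one bin of marker pairs.

    r2_values : observed r^2 for each marker pair in the bin
    n         : sample size
    c         : recombination frequency shared by the pairs in the bin
    b         : 2 if phase in double heterozygotes is known, 1 otherwise
    a         : correction for mutation
    f         : function applied to c (identity if omitted; an assumption)
    """
    if f is None:
        f = lambda x: x
    # Adjust each r^2 for sample size, then average within the bin.
    r2_adj = [r2 - 1.0 / (b * n) for r2 in r2_values]
    mean_r2_adj = sum(r2_adj) / len(r2_adj)
    if mean_r2_adj <= 0:
        # The observed LD is no larger than expected from sampling alone,
        # so no finite estimate can be obtained.
        return math.inf
    return (1.0 / (4.0 * f(c))) * (1.0 / mean_r2_adj - a)


# Hypothetical bin of unlinked marker pairs (c = 0.5), unphased data (b = 1).
r2_bin = [0.03, 0.05]
for a_value in (0.0, 1.0, 2.0, 2.2):
    print(a_value, ld_variance_effective_size(r2_bin, n=50, c=0.5, a=a_value))
```

Looping over several values of a, as in the usage lines, is one simple way to see how strongly an estimate depends on the mutation correction.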


Table 4.2 Effective size estimates of four populations of Drosophila buzzatii and their 95% confidence intervals (CI) using two single-sample methods: linkage disequilibrium (LD) and sibship frequency (SF).

| Population | Nev(LD) | 95% CI (LD) | Nef(SF) | 95% CI (SF) |
| --- | --- | --- | --- | --- |
| Mulambin Beach | 46 | 32–76 | 290 | 151–1450 |
| Baradine | 31 | 23–45 | 87 | 44–373 |
| Maldon | 70 | 36–367 | 174 | 93–780 |
| Otamendi | 35 | 27–47 | 68 | 34–258 |

Source: Data and results from Barker (2011).

As shown in Chapter 2, mutation can also create linkage disequilibrium, and this source was ignored in the original equations of Hill (1981). Using theory developed by Ohta and Kimura (1971) under the infinite sites model, a correction of a = 2.2 for Eq. (4.32) was derived and is commonly used (Corbin et al. 2012). However, as noted in Chapter 1, the infinite sites model is a poor model for genomic data such as SNPs, and LD statistics based on this assumption can lead to egregiously wrong inferences (recall the 4-gamete test for recombination in Chapter 2). The impact of realistic mutational models upon the LD estimate of Nev is not yet known, but, in the meantime, it would be wise for users of the LD method to investigate the sensitivity of their estimates to a wide variety of values for the a parameter in Eq. (4.32). The four single-sample estimators discussed above (and others not discussed) are often treated as alternative estimators of a single Ne in papers that compare their statistical properties (e.g. Gilbert and Whitlock 2015) or in software packages that allow the user to implement several methods on their data (e.g. Do et al. 2014). Barker (2011) used several of these methods on a common data set of Drosophila buzzatii. Table 4.2 presents some of the results from Barker’s analyses using two of the estimators discussed above, the LD and SF, on four populations for which both methods yielded finite 95% CIs. As can be seen, the LD and SF methods yield different estimates of effective size in all populations, with some estimates having no or little overlap in their 95% CIs. In every case, the LD estimates are smaller than the SF estimates. Which estimates should we believe in, given these seeming contradictions? The answer is that there is no evidence for any contradiction or incompatibilities between the two estimators in any of the populations. 
As discussed above, the LD method estimates the variance effective size and the SF method the inbreeding effective size. These are completely different biological parameters, and we have no expectation that they should be the same. Hence, the LD and SF methods are not alternative estimators; rather, they are estimators of different biological parameters. Barker (2011) noted this on page 4468 of his paper: “Different methods are not necessarily estimating the same Ne, they are subject to different bias, and the biology, demography and history of the population(s) may affect different estimators differently. The question ‘What is the true Ne?’ for any population has no answer; it depends on which Ne ….” The other major class of estimators of effective sizes is from temporal sampling. The original definitions of effective sizes in terms of a genetic feature (Eqs. (4.10) and (4.18)) measure how these genetic features change over time. We saw earlier in this chapter how these equations that give the essential definition of effective sizes could be used to estimate inbreeding and variance effective sizes in Speke’s gazelle when complete pedigree data were available. In the absence of pedigree data, the idea of temporal sampling is to use direct genetic measurements instead of pedigree data and observe how they change over time. Accordingly, temporal sampling is often the most direct and accurate way of estimating effective sizes with the fewest additional assumptions.


Most of the literature and computer programs for temporal sampling have focused upon allele frequency changes and hence are estimators of the variance effective size. However, genomic surveys now allow the accurate estimation of F, the pedigree inbreeding coefficient, just from genetic survey data (Chapter 3), as shown by the work of Browning and Browning (2015) that was discussed earlier. Hence, genomic surveys at two or more time periods can provide accurate Fs for the sampled individuals, thereby allowing the inbreeding effective size to be estimated from Eq. (4.10). However, one modification needs to be made. Equation (4.10) was derived assuming that F(0) = 0. Letting time 0 correspond to the first sampling period, in general, we will have F(0) > 0, where F(0) is the average of the Fs inferred from the genomic data for the individuals sampled at t = 0. We can now return to Eq. (4.5) to derive a new form of Eq. (4.10) that no longer assumes F(0) = 0 to yield:

$$N_{ef} = \frac{1}{2\left[1 - \left(\dfrac{1 - F(t)}{1 - F(0)}\right)^{1/t}\right]} \tag{4.33}$$

Thus, if we had samples at times 0 and t (measured in generations) and their average Fs were determined by a genomic survey, we could use Eq. (4.33) to calculate the inbreeding effective size over the time period t. When the Fs are small, a good approximation to Eq. (4.33) is:

$$N_{ef} \approx \frac{t}{2\left[F(t) - F(0)\right]} = \frac{t}{2\Delta F} \tag{4.34}$$

where ΔF is the change in average F over the time period t. The estimation of variance effective size from temporal sampling is similar to that for inbreeding effective size, except it is based on Eq. (4.17) and changes in allele frequencies rather than Fs. Equation (4.17) can be expressed as:

$$\frac{\sigma_t^2}{pq} = V_s = 1 - \left(1 - \frac{1}{2N_{ev}}\right)^t \approx \frac{t}{2N_{ev}} \tag{4.35}$$

when Nev is large and where Vs is the standardized variance of allele frequency at time t. Krimbas and Tsakas (1971) showed that one could sample a population at two time periods, say times 0 and t (where time is measured in generations) and estimate the frequency of allele i at an autosomal locus in these two samples as xi at time 0 and yi at time t. Krimbas and Tsakas (1971) then showed that one could estimate Vs by:

$$V_s = \frac{1}{A}\sum_{i=1}^{A}\frac{(x_i - y_i)^2}{x_i(1 - x_i)} \tag{4.36}$$

where A is the number of alleles at the locus being surveyed. Combining Eq. (4.36) with the approximation given in Eq. (4.35), an approximate estimator of the variance effective size is:

$$N_{ev} = \frac{t}{2V_s} \tag{4.37}$$
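The temporal estimators in Eqs. (4.33), (4.34), (4.36), and (4.37) translate directly into a few lines of code. The sketch below is illustrative only: the function names are hypothetical, the input values are invented, and none of the later bias corrections from the literature are applied.

```python
def nef_temporal(f0, ft, t):
    """Eq. (4.33): inbreeding effective size from the average F at
    times 0 and t, with t measured in generations."""
    ratio = (1.0 - ft) / (1.0 - f0)
    return 1.0 / (2.0 * (1.0 - ratio ** (1.0 / t)))


def nef_temporal_approx(f0, ft, t):
    """Eq. (4.34): small-F approximation, t / (2 * delta F)."""
    return t / (2.0 * (ft - f0))


def standardized_variance(x, y):
    """Eq. (4.36): x[i] and y[i] are the frequencies of allele i at
    times 0 and t; every x[i] must lie strictly between 0 and 1."""
    terms = [(xi - yi) ** 2 / (xi * (1.0 - xi)) for xi, yi in zip(x, y)]
    return sum(terms) / len(terms)


def nev_temporal(x, y, t):
    """Eq. (4.37): approximate variance effective size, t / (2 * Vs)."""
    return t / (2.0 * standardized_variance(x, y))


# Hypothetical average F values five generations apart.
print(nef_temporal(0.02, 0.06, 5), nef_temporal_approx(0.02, 0.06, 5))

# Hypothetical frequencies of two alleles at one locus, four generations apart.
print(nev_temporal([0.5, 0.5], [0.6, 0.4], 4))
```

Printing the exact and approximate inbreeding estimates side by side gives a quick sense of how close Eq. (4.34) is to Eq. (4.33) for a given pair of F values.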

There have been many statistical refinements and bias corrections to the estimators in Eqs. (4.36) and (4.37) (Nei and Tajima 1981; Pollak 1983; Wang 2001; Wang and Whitlock 2003; Jorde and Ryman 2007), and programs are available that implement all of these temporal estimators as well as the single sample estimators in an integrated package (e.g. Do et al. 2014). Table 4.3 shows the applications of many of


Table 4.3 Variance effective size estimates of a population of Brown Trout (Salmo trutta) using various estimators using temporal samples and their 95% confidence intervals (CIs).

| Source of Estimation Method | Nev | 95% CI |
| --- | --- | --- |
| Pollak (1983) | 121 | 89–176 |
| Jorde and Ryman (2007) | 81 | 58–143 |
| Wang and Whitlock (2003), Closed Population | 127 | 106–154 |
| Wang and Whitlock (2003), Open Population | 106 | 89–130 |

Source: Data and results from Serbezov et al. (2012).

these methods to a common data set on a population of Brown Trout (Salmo trutta) (Serbezov et al. 2012). As can be seen, all the methods produced similar estimates of the variance effective size with extensive overlap in all 95% CIs. Temporal samples can also be used for inference on the underlying demographic parameters that influence the variance effective size. An example of this is given by the work of Renan et al. (2015) on the endangered Asiatic wild ass (Equus hemionus). This species was once abundant in western Asia, including the Negev Desert in southern Israel, but declined throughout its range due to hunting and habitat loss. This species became extinct in the Negev by the 1920s, but imported stock from zoos was used to create a breeding core in Israel. Between 1982 and 1993, 38 Asiatic wild asses were reintroduced to protected areas in the Negev, and by 2014, the population had expanded to about 250 individuals. Blood samples were available for animals from the original breeding core, and fecal samples were available for the wild population in 2014, with five generations separating these two sampling points. These samples were surveyed for microsatellites. The Renan et al. study was not focused on the variance effective size per se, but rather upon a demographic parameter that could strongly influence the variance effective size: the level of polygyny. Direct observations on the reintroduced population showed that the Asiatic wild ass displays resource-defense polygyny in which solitary males vigorously defend territories that contain resources (food, water, etc.) that provide mating opportunities with females that temporarily enter the territories (Saltz and Rubenstein 1995). Although there is much demographic information on this reintroduced population, it is difficult to know how many of the matings are due to the small fraction of territorial dominant males. 
Indeed, the real level of polygyny had never been measured in any wild population of equids at the time of this study. Because much was known about this population and its demographic history, a forward simulation approach was used that took advantage of this known history to determine the expected distributions of genetic variation after 5 generations under 20 different values of proportions of mating males (PMM), with 5000 simulations for each value of PMM. These simulated distributions were then tested for their goodness of fit to the observed genetic drift among alleles. When shifts in allele frequency were used as the measure of genetic drift, all values of PMM above 25% could be rejected at the 0.05 level of significance or lower. This corresponded well to the observed proportion of territorial males of 27%. To gain additional resolution and statistical power, Renan et al. (2015) used a statistical refinement suggested by Neuwald and Templeton (2013) that not only improves the statistical power but that also simplifies the entire theory of genetic drift. Instead of using allele frequencies to monitor


genetic drift, as in Eq. (4.36) and other equations, Neuwald and Templeton (2013) proposed that genetic drift could be better measured by a transformed allele frequency:

$$a_i = \arcsin\sqrt{p_i} \tag{4.38}$$

where p_i is the frequency of allele i and the arcsin is measured in radians. The transformation in Eq. (4.38) is called a variance-stabilizing transformation and is relevant to both the fundamental theory of genetic drift and some of the statistical difficulties encountered when trying to estimate variance effective size. The difficulty is apparent in Eq. (4.13), which gives the variance of the allele frequency induced by genetic drift in an ideal population of size N, for an autosomal allele with an initial frequency of p, as pq/(2N). Note that the variance of allele frequency under genetic drift is a function of two parameters: p and N. However, when dealing with effective sizes and the strength of drift, the focus is on N, and p is what is called a “nuisance parameter” in statistics when trying to make inferences about N. The variance induced by genetic drift for the transformed allele frequency given by Eq. (4.38) for an autosomal locus is 1/(8N) [in general, the variance of a transformed frequency is 1/(4n) where n is the sample size, but recall that the sample size for N individuals at an autosomal locus is 2N genes, thereby leading to the denominator of 8N]. Note that the variance of the transformed allele frequency is a function of only N, and the nuisance parameter p has totally disappeared. After t generations of genetic drift in a population with variance effective size Nev, applying the steps given in Box 4.1 to the transformed allele frequencies shows that the expected variance in a at an autosomal locus is t/(8Nev), an exact result analogous to the approximation given in Eq. (4.35). The arcsin, square root transformation is also statistically important because all alleles have the same sampling variance regardless of their allele frequency. This greatly simplifies the statistical analysis of temporal changes in the a’s as opposed to allele frequencies (e.g. Eq. 4.36).
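As a rough numerical check of the variance-stabilizing property, the sketch below computes the one-generation drift variance under binomial sampling of 2N genes, both for the raw allele frequency and for its arcsin, square root transform. The function name `drift_variances` and the values of N and p are illustrative assumptions and are not taken from any of the studies cited here. For moderate allele frequencies, the transformed variance should come out close to 1/(8N) whatever the value of p, whereas the raw variance follows pq/(2N).

```python
import math

def drift_variances(N, p):
    """One-generation variances under binomial sampling of 2N genes.

    Returns (variance of the allele frequency,
             variance of arcsin(sqrt(allele frequency))).
    """
    n = 2 * N
    q = 1.0 - p
    # Binomial probability of drawing k copies of the allele among n genes.
    probs = [math.comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]
    freqs = [k / n for k in range(n + 1)]
    trans = [math.asin(math.sqrt(f)) for f in freqs]

    def variance(values):
        mean = sum(w * v for w, v in zip(probs, values))
        return sum(w * (v - mean) ** 2 for w, v in zip(probs, values))

    return variance(freqs), variance(trans)

if __name__ == "__main__":
    N = 100  # hypothetical ideal population size
    for p in (0.2, 0.5, 0.8):
        var_p, var_a = drift_variances(N, p)
        print(f"p={p}: Var(p)={var_p:.6f} vs pq/(2N)={p * (1 - p) / (2 * N):.6f}; "
              f"Var(a)={var_a:.6f} vs 1/(8N)={1 / (8 * N):.6f}")
```

The probabilities are summed exactly over all possible outcomes rather than simulated, so the output is deterministic. Agreement with 1/(8N) is expected to weaken for alleles very close to frequencies of 0 or 1.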
Because the sample size of the core population was small, Renan et al. (2015) used a small sample size correction for this transformation (Bishop et al. 1975):

$$a = \frac{1}{2}\left[\arcsin\sqrt{\frac{np}{n+1}} + \arcsin\sqrt{\frac{np+1}{n+1}}\right] \tag{4.39}$$

where n is the sample size (number of genes sampled). With this stabilized variance transformation, Renan et al. (2015) compared the simulated distributions over all alleles of the statistic:

$$\Delta a = a_5 - a_0 \tag{4.40}$$

where a5 is the transformed allele frequency at generation 5 and a0 is the transformed allele frequency in the founders. Because of the transformation, the Δa’s have the same sampling and expected drift variances for every allele. Figure 4.14 shows the observed distribution of the Δa’s versus the simulated distributions. As can be seen, the goodness of fit improves considerably below a PMM value of 0.15 (only intervals of 0.05 of PMM were simulated), with the optimal goodness of fit occurring at PMM = 0.10. Hence, only about 10% of the males were contributing their gametes to the gene pool in this reintroduced population. Greenbaum et al. (2018) followed up on this work by investigating the impact of polygyny on variance effective size. They estimated genetically the variance effective size of this wild ass population to be 24.3 (95% CI: 13.8–44.0) with the temporal estimator of Wang (2001) based on allele frequencies. This genetic estimate of variance effective size is an order of magnitude less than the census size. To investigate the role of polygyny in reducing the variance effective size to this low


[Figure 4.14 plot: vertical axis Δa (−0.4 to 0.4); horizontal axis PMM (0.0 to 1.0); the observed distribution is shown at the left, followed by the simulated distributions.]

Figure 4.14 Distributions of the shifts in the transformed allele frequencies. The dots show the observed distribution of the Δa values of the wild population for 29 alleles at 5 generations after reintroduction. To the right of the observed distribution are the simulated distributions for 20 different proportions of mating males (PMM) values incremented by intervals of 0.05. Each shade represents a 5%-quantile of the distribution (darker shades represent higher shifts in transformed allele frequencies). Lower PMM values show more dispersed distributions, indicating stronger genetic drift in these scenarios. Source: Modified from Figure 5 of Renan et al. (2015).

value, Greenbaum et al. (2018) estimated the variance effective size from demographic data as well, adding on components to see their effect. First, they estimated the variance effective size from census sizes, population growth rates, and uneven sex ratios for each generation, using a demographic correction for the sex ratio from Wright (1931) for a single generation:

$$N_{ev} = \frac{4N_m N_f}{N_m + N_f} \tag{4.41}$$

where Nm and Nf are the numbers of adult males and females in a specific generation, respectively. These single generation numbers were placed into Eq. (4.25) to obtain an overall estimator of variance effective size that takes into account the growth and changes in census sizes over the generations. The resulting demographic estimator was 120.9 (114.7–126.5), which differed highly significantly from the genetic estimator of 24.3 (Figure 4.15). Hence, basic demographic information of the type commonly available could not explain the low variance effective size. Next, Greenbaum et al. (2018) added on the effect of polygyny by modifying Eq. (4.41) to incorporate the PMM parameter discussed above:

$$N_{ev} = \frac{4\,\mathrm{PMM}\,N_m N_f}{\mathrm{PMM}\,N_m + N_f} \tag{4.42}$$

Equation (4.42) reflects the fact that of males present, only a portion of them (PMM) actually reproduce. Figure 4.15 shows the variance effective size that uses demography plus polygyny plotted against PMM values ranging from 0 to 0.25 (recall that the Renan et al. analysis rejected any PMM > 0.25). As can be seen from Figure 4.15, when PMM = 0.106 (0.10–0.11), the variance effective size based on standard demography plus polygyny is identical to the genetically estimated variance effective size of 24.3. This result is completely consistent with the Δa analysis discussed above. Because of extensive monitoring of this population, there was also information about the reproductive success (RS) of females (measured by the number of foals who reach adulthood from a specific


[Figure 4.15 plot: vertical axis Nev (20 to 120); horizontal axis PMM (0.00 to 0.25); lines labeled Demography Only, Demography + Heritability of RS, Demography + Polygyny, Demography + Polygyny + Heritability of RS, and Genetic Estimate.]

Figure 4.15 Demographic and life history impacts on variance effective population size (Nev) in the Asiatic wild ass as a function of proportions of mating males (PMM). Lines with a constant value are not functions of PMM. These include the line that is the estimated Nev based on temporal genetic data, the top line that is estimated only from basic demography, and the middle line that is estimated from basic demography and heritable female RS. Only two curves intersect the genetic-estimate line (purple in the original figure): the curve that includes basic demography and male polygyny, and the curve that includes basic demography, male polygyny, and heritable female RS. Shading indicates margins of error; vertical continuous lines indicate the point estimates of PMM and dashed lines the ranges when estimated error is considered around the estimates of PMM. Source: Modified from Figure 1 of Greenbaum et al. (2018).

mother) as well as the maternal heritability (see Chapter 8) of RS. This is somewhat analogous to PMM in males as it quantifies the variation among females in their ability to reproduce. Nomura (2002) derived equations to correct variance effective size for RS and its heritability, and when these corrections were added on to the basic demography (but not including male polygyny), the resulting estimate of variance effective size was 69.1 – also significantly different than the genetic estimator of 24.3 (Figure 4.15). However, when male polygyny was also added to basic demography and female heritable RS, the variance effective size from these demographic variables equaled 24.3 when PMM = 0.195 (0.16–0.22), as shown in Figure 4.15. There are several lessons to learn from the results of Greenbaum et al. (2018). First, extreme polygyny was critical for explaining the low value of the variance effective size in this population of Asiatic wild ass. The only demographic models that could fit the genetic observations were those that included male polygyny, and the degree of polygyny had to be extreme (PMM is 0.106 if female RS is ignored and 0.195 when female heritable RS is included). Second, basic demography (census sizes and sex ratios) was inadequate to explain the variance effective size estimated directly from genetic data. Even when data on female reproductive success and its heritability were added (data rarely available in most studies), the demographic approach was still inadequate. Only when the demographic variable of polygyny was added (estimable only from genetic data, not demographic observations) could the model converge to the direct genetic estimate of variance effective size (Figure 4.15). This reinforces the earlier warning that unmeasured demographic parameters can undermine the demographic approach to estimating effective sizes. 
Third, this analysis shows how insight into the demographic variables influencing variance effective size can be achieved by combining demographic analyses with genetic analyses. Much is to be gained by combining these two approaches to effective size.
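The single-generation corrections in Eqs. (4.41) and (4.42) can be sketched in a few lines. The function name `nev_sex_ratio` and the census numbers below are hypothetical and chosen only for illustration; they are not the wild ass data, and the sketch omits the multi-generation step of Eq. (4.25) and the female RS corrections of Nomura (2002).

```python
def nev_sex_ratio(n_males, n_females, pmm=1.0):
    """Single-generation variance effective size from adult sex numbers.

    pmm is the proportion of mating males: pmm=1.0 corresponds to
    Eq. (4.41), and pmm<1.0 to the polygyny-adjusted Eq. (4.42).
    """
    breeding_males = pmm * n_males
    return 4.0 * breeding_males * n_females / (breeding_males + n_females)

if __name__ == "__main__":
    # Hypothetical adult census numbers, for illustration only.
    n_males, n_females = 50, 50
    for pmm in (1.0, 0.25, 0.10):
        nev = nev_sex_ratio(n_males, n_females, pmm)
        print(f"PMM={pmm:.2f}: Nev={nev:.1f}")
```

With an even sex ratio and all males breeding, the function returns the census size; lowering PMM shrinks the effective number of males and pulls Nev down sharply, which is the qualitative pattern described above.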


Although powerful, the temporal procedures inherently are limited to estimating effective sizes and demographic parameters over short time intervals since they must use the first sample as the reference generation, and the time span of the study is limited by human or institutional life spans. There are alternative methods for using genetic data to estimate effective sizes over long periods of evolutionary time, as already illustrated in Figure 4.13, and that theme will be amplified in the next chapter. There has been much concern with estimating various types of effective population size because genetic drift is an important evolutionary force in its own right and because the impact of other evolutionary forces in natural populations (such as natural selection) is always overlaid upon genetic drift since all real populations are finite. Moreover, genetic drift has direct implications in conservation biology for the preservation of the most fundamental level of biodiversity: genetic diversity within a species. Genetic diversity is the ultimate basis of all evolutionary change, so biodiversity at this level has an upwardly cascading effect on biodiversity at all other levels: local populations, species, communities, and ecosystems. Genetic drift and its quantification through effective sizes are therefore important topics within population genetics.


5 Genetic Drift in Large Populations and Coalescence

As shown in the previous chapter, the impact of drift as an evolutionary force is proportional to 1/(2N) for a diploid system in an idealized finite population of size N. From this, one might be tempted to think that drift is only important in small populations or populations that have experienced bottleneck/founder events, thereby making their effective sizes small. But drift can be important in large populations as well. In this chapter, we will consider two circumstances in which drift can play a major evolutionary role in populations of any size, including extremely large populations. These two circumstances are newly arisen mutations and neutral mutations that have no impact on the bearer’s ability to replicate and pass on DNA to the next generation. We will also examine genetic drift of neutral alleles backwards in time, that is, we will start with the present and look at the drift-induced evolutionary process going from the present to the past. This time-reversed approach is called coalescence, and it offers many new insights into the evolutionary impact of genetic drift, even in very large populations.

Newly Arisen Mutations

The first circumstance in which genetic drift can play an important role in both small and large populations involves the evolutionary fate of newly arisen mutations. Whenever a mutation first occurs, it is normally found in only one copy in one individual. Recall that every individual in our idealized population produces an average of two offspring (to maintain constant population size under sexual reproduction) under a Poisson probability distribution and assuming that all mutations are neutral. Under this probability distribution, the probability that this single mutant will leave no copies in the next generation is the probability of having i offspring (e^{−2}2^i/i! under a Poisson distribution, see Appendix B) times the probability that none of the i offspring received the mutant allele ((1/2)^i under Mendel’s first law), summed over all possible values of i, that is,

$$\text{Prob(0 copies)} = \sum_{i=0}^{\infty} \frac{e^{-2}2^{i}}{i!}\left(\frac{1}{2}\right)^{i} = e^{-2}\sum_{i=0}^{\infty}\frac{1}{i!} = e^{-1} = 0.37 \tag{5.1}$$

Thus, irrespective of population size, over a third of the mutations are lost in the first generation due to drift. Hence, the genetic variants that become available for subsequent evolutionary processes are strongly influenced by two random factors in all populations regardless of size: mutation and genetic drift.
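The reasoning behind Eq. (5.1) can also be checked by brute force. The sketch below follows a single new mutant through one generation: its carrier has a Poisson number of offspring with mean two, and each offspring inherits the mutant with probability 1/2. The helper names, the number of trials, and the random seed are assumptions made for this illustration, and the Poisson sampler is a simple textbook method suited only to small means.

```python
import math
import random

def poisson_draw(mean, rng):
    """Draw a Poisson variate by multiplying uniforms until the
    product falls to exp(-mean) or below (adequate for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_loss(mean_offspring=2.0, trials=20000, seed=1):
    """Fraction of trials in which a single new mutant leaves no copies."""
    rng = random.Random(seed)
    lost = 0
    for _ in range(trials):
        offspring = poisson_draw(mean_offspring, rng)
        # Each offspring inherits the mutant allele with probability 1/2.
        copies = sum(1 for _ in range(offspring) if rng.random() < 0.5)
        if copies == 0:
            lost += 1
    return lost / trials

if __name__ == "__main__":
    print("simulated:", simulate_loss(), "expected:", math.exp(-1))
```

Population size never enters the simulation, which mirrors the point that the loss of new mutants in the first generation does not depend on N.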

Population Genetics and Microevolutionary Theory, Second Edition. Alan R. Templeton. © 2021 John Wiley & Sons, Inc. Published 2021 by John Wiley & Sons, Inc. Companion website: www.wiley.com/go/templeton/populationgenetics2


Although total population size does not play much of a role in the survival of a new mutant in the first few generations after it occurred, population growth rates can play a major role. This can be modeled by letting the average number of offspring be k. If k > 2, the diploid population is expanding in size; if k < 2, the population is declining in size. Then,

$$\text{Prob(0 copies)} = \sum_{i=0}^{\infty}\frac{e^{-k}k^{i}}{i!}\left(\frac{1}{2}\right)^{i} = e^{-k/2}\sum_{i=0}^{\infty}\frac{e^{-k/2}(k/2)^{i}}{i!} = e^{-k/2} \tag{5.2}$$

Hence, as k increases, the chance of a mutant allele surviving the first generation goes up. For example, if k = 4, then the probability of 0 copies is reduced from the stable population size value of 0.37 to 0.14, whereas if the population is declining with k = 1, then the probability of losing a new mutant during the first generation is 0.61. This impact of population growth rate upon the evolutionary fate of new mutations has interesting implications when a founder or bottleneck event is followed by a period of rapid population growth (a population flush). Such founder-flush models (Carson 1968; Templeton 1980a; Carson and Templeton 1984) produce an interesting combination. First, the founder event causes a low variance effective size relative to the pre-founder event generation. As we saw in Chapter 4, this means that there is some loss of the old alleles that were present in the pre-founder event ancestral population. Second, the flush that follows the founder event results in k > 2, so new mutations that arise during the flush phase are more likely to persist in the population. Hence, genetic variation is initially reduced, but it is then replenished at a rapid rate. In contrast, if a founder event is followed by little or no population growth (k close to two), there will be no rapid replenishment of lost variation, and continued small population size over many generations will cause additional losses of ancestral alleles. Therefore, a founder event followed by slow or no population growth results in reduced levels of genetic variation, whereas a founder-flush event results in a rapid turnover of variation in a population with less serious reductions of overall levels of genetic variation. The rapid change in the available allelic forms under the founder-flush model may trigger large evolutionary changes in the population when combined with natural selection (Carson 1968; Templeton 1980a; Slatkin 1996). 
This also reinforces our earlier conclusion that not all founder or bottleneck events have the same population genetic consequences. Earlier, we saw that system of mating and amount of recombination have a major effect on modulating the evolutionary consequences of a founder or bottleneck event; now, we see that the population growth rate that exists shortly after a founder or bottleneck event is also a critical modulator of the evolutionary consequences. It is incorrect to talk about the founder effect in evolution; there are many types of founder effects in terms of their population genetic consequences depending upon the genetic and demographic context in which they occur (Templeton 1980a).
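To see how strongly the growth rate affects the first-generation fate of a new mutant, the following sketch evaluates Eq. (5.2) both as a truncated series and in closed form. The function name and the truncation at 60 terms are choices made for this illustration only.

```python
import math

def prob_loss_first_generation(k, terms=60):
    """Chance that a single new mutant leaves no copies after one generation.

    k is the mean number of offspring per individual (Poisson distributed).
    Returns (truncated series sum, closed form exp(-k/2)).
    """
    series = sum(math.exp(-k) * k**i / math.factorial(i) * 0.5**i
                 for i in range(terms))
    return series, math.exp(-k / 2.0)

if __name__ == "__main__":
    for k in (1, 2, 4):  # declining, stable, and expanding populations
        series, closed = prob_loss_first_generation(k)
        print(f"k={k}: series={series:.4f}, exp(-k/2)={closed:.4f}")
```

The three values of k correspond to the declining, stable, and expanding cases discussed in the text.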

Neutral Alleles

The Neutral Theory and Its Origins

The other major circumstance under which drift has a major evolutionary impact irrespective of population size is when the mutant alleles are neutral. A neutral allele is functionally equivalent to its ancestral allele in terms of its chances of being replicated and passed on to the next generation. Sewall Wright first elaborated the evolutionary importance of genetic drift for neutral alleles in small populations in the 1930s (Wright 1931, 1932). Wright assumed neutrality only for mathematical convenience because he was primarily interested in how genetic drift interacted with


non-neutral alleles subject to natural selection in populations with small variance effective sizes. He did not pursue the evolutionary role that drift could play irrespective of population size upon neutral alleles. The importance of genetic drift upon neutral alleles in all populations irrespective of size only became generally recognized in the 1960s. Two important breakthroughs in molecular biology occurred during that decade that contributed significantly to our understanding of genetic variation and the importance of drift upon neutral alleles. They were:

• amino acid sequencing
• protein electrophoresis

Now, we consider the role that each of these developments played for focusing attention upon neutral alleles. Amino acid sequencing allowed comparisons between homologous genes in different species, revealing the fate of genes over long periods of evolutionary time. Homology refers to traits (including amino acid or DNA sequences) found in two or more individuals (or chromosomes for DNA sequences) that have been derived, with or without modification, from a common ancestral form. Genes within a species’ gene pool are homologous when all the different copies of a gene occupying a specific locus are derived, ultimately, from a common ancestral gene through DNA replication events (premise 1 from Chapter 1). Just as genes can be homologous within a species, genes can also be homologous across species. The genes at a locus found in one species could be derived from the same ancestral gene in the distant past as the genes occupying a locus in a second species. If so, the genes are said to be homologous regardless of the species from which they were sampled. To make use of the interspecific homology of genes, a method was needed to score genetic variability across species. During the first half of the twentieth century, the primary techniques for scoring genetic variation were based upon some sort of breeding experiment (e.g. the classical “test-cross” of Mendelian genetics), and such breeding designs were primarily limited to assaying intraspecific variation. The ability to purify certain proteins and perform amino acid sequencing of these proteins advanced considerably in the 1950s and 1960s. As a result, starting in the 1960s, data sets became available on the amino acid sequences of homologous proteins (and, hence, homologous genes) across many species. Table 5.1 shows an example of the type of data being generated at this time (Dayhoff 1969). 
This table gives the pairwise amino acid sequence differences between the α-chains of hemoglobin (an essential portion of the protein complex that transports oxygen in our blood) in six species.

Table 5.1 Pairwise amino acid sequence differences in the α-chain of hemoglobin in six vertebrate species.

            Mouse   Chicken   Newt   Carp   Shark
Human        16       35       62     68     79
Mouse                 39       63     68     79
Chicken                        63     72     83
Newt                                  74     84
Carp                                         85

Note: Each entry represents the number of amino acid positions at which the two species differ in this protein. The columns are ordered by evolutionary relatedness such that the species at the top of any column is equally distant in evolutionary time from all species listed to its left.


[Figure 5.1 tree: tips labeled Human, Mouse, Chicken, Newt, Carp, and Shark; branches labeled X, Y, and Z.]

Figure 5.1 An evolutionary tree of the six species whose α-chain hemoglobin amino acid sequence differences are given in Table 5.1.

Note that the pairwise differences in Table 5.1 for humans versus sharks and carps versus sharks are approximately the same. Yet, humans overall are phenotypically very much unlike sharks, whereas carps are more similar to sharks (at least from the human perspective). The differences observed at the molecular level did not seem to correspond to the differences observed at the organismal level. Instead, these pairwise molecular contrasts reflected the historical evolutionary relationships of these species, which are shown in Figure 5.1, as inferred from the fossil record. The total lengths of the branches between any pair of current species and their common ancestral node in Figure 5.1 are proportional to the time since the two species diverged from their common ancestor. The columns in Table 5.1 are arranged such that the elements of each column represent an evolutionary divergence of the same length of time. Note that humans and carps shared a common ancestor after the lineage leading to the shark had already split off. Therefore, in terms of time, humans and carps are equally removed from sharks. Hence, the near constancy of the pairwise amino acid differences in a column seems to indicate that amino acid differences accumulate proportional to time, regardless of the amount of overall phenotypic evolution that occurred between species. The observation of uniformity in the rate of amino acid replacement (first noted by Zuckerkandl and Pauling 1962) led to the proposal of the molecular clock that molecular changes in genes and their protein products accumulate more or less constantly over time. The idea of the molecular clock was strongly supported in an influential paper by King and Jukes (1969). This paper was quite controversial at the time because many evolutionary biologists thought that changes in species (phenotypic and molecular) should reflect the influence of changes in environment or niche space. 
Yet, here was evidence, at the molecular level, that only time since divergence seemed important. Molecular evolution appeared to proceed at a constant rate, in great contrast to morphological evolution. This seemed to undercut the importance of natural selection


at the molecular level and indicated that some other evolutionary force may predominate at the molecular level. As we shall shortly see, King and Jukes argued that their molecular clock data could be best explained by a new variant of genetic drift theory articulated by Kimura (1968a). We now know that much of the molecular data are not as clock-like as they initially appeared to be. For example, suppose the number of pairwise differences from the shark to the common ancestor of humans and carps is “X” (which is not directly observable from the data since the protein of the ancestor cannot be directly determined), and the pairwise differences from that common ancestor to the human and to the carp are “Y” and “Z,” respectively, as shown in Figure 5.1. Assume now that the total distance between any two points in the evolutionary tree is approximately the sum of the intermediate branch lengths. Then, the number of pairwise differences between humans and sharks is X + Y, and the number of differences between carps and sharks is X + Z (which are observable entries in the Shark column of Table 5.1). Note that these two numbers are always going to appear to have similar values because they share “X” in common. Today, statistical procedures exist to correct for this artifact, and the molecular clock has been found to be frequently violated or subject to much larger errors than suspected in the 1960s. Nevertheless, the idea of a molecular clock was of tremendous importance to the formulation of scientific thought at the time. Although not always strictly clock-like in its behavior, the accumulation of molecular divergence with time is now well established and provides a useful albeit rough measure of the timing of evolutionary events (as will be seen in Chapter 7). The molecular clock challenged the preconceptions of many biologists about how interspecific genetic variation accumulated during the evolutionary process.
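The shared-branch artifact involving X, Y, and Z can be made concrete with the Table 5.1 entries for humans, carps, and sharks. Taking the additivity assumption stated above at face value, and reading the human versus carp entry as Y + Z, the three branch lengths can be solved for directly. This is only an illustrative sketch: the function name is hypothetical, and raw amino acid differences are not strictly additive, so the resulting values should not be read as real branch-length estimates.

```python
def three_point_branches(d_ab, d_ac, d_bc):
    """Branch lengths for three taxa A, B, C joined at one internal node,
    assuming each pairwise distance is the sum of two branch lengths.

    Returns (branch to A, branch to B, branch to C).
    """
    a = (d_ab + d_ac - d_bc) / 2.0
    b = (d_ab + d_bc - d_ac) / 2.0
    c = (d_ac + d_bc - d_ab) / 2.0
    return a, b, c

# Table 5.1 entries: human-carp = 68, human-shark = 79, carp-shark = 85.
# A = human (branch Y), B = carp (branch Z), C = shark (branch X).
y, z, x = three_point_branches(d_ab=68, d_ac=79, d_bc=85)

if __name__ == "__main__":
    print(f"X={x}, Y={y}, Z={z}")
    print(f"Y/Z = {y / z:.2f}, but (X+Y)/(X+Z) = {(x + y) / (x + z):.2f}")
```

Under these assumptions, the two lineage-specific branches differ from each other proportionally more than the two observed shark comparisons do, because both comparisons include the same shared branch X.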
Also, in the 1960s, a second advance in molecular biology was challenging preconceptions about the amount of intraspecific genetic variability: the technique of protein electrophoresis (details of this technique are given in Appendix A). Previously, most genetic survey techniques required variation at a locus to even identify the locus at all, thereby ensuring that all detectable loci had two or more alleles. Thus, the simple question of how many genes within a species were variable and how many were not could not be answered by direct means. This situation changed with the advent of protein electrophoresis (Appendix A). Protein electrophoresis allowed one to observe homologous gene products regardless of whether or not there was underlying allelic variation in amino acid sequence within species. To see why this was important, we need a brief history lesson. Two schools of thought originated quite early in the development of population genetics, and both traced their roots to Thomas Hunt Morgan’s famed Drosophila laboratory that made so many of the fundamental advances in genetics in the first half of the twentieth century. One school, the “classical school,” was associated with Morgan himself and some of his students, such as H.J. Muller. Morgan and Muller worked primarily with highly inbred strains of the fruit fly Drosophila melanogaster and scored genetic variation that affected morphological traits (e.g. eye color, wing vein patterns, etc.). They observed that most individuals were alike for these morphological traits and, hence, developed the idea that there is little genetic variation in natural populations, with most individuals being homozygous for a “wildtype” allele at each locus. Occasionally, mutants occur, but these are generally deleterious and are rapidly eliminated from the gene pool by natural selection. 
Even more rarely, a mutant is advantageous, in which case it rapidly increases in frequency in the gene pool and becomes the new “wildtype” allele. In either event, most loci in most individuals at most times are homozygous for the “wildtype” allele, that is, natural populations have very little genetic variation under the classical model. Alfred Sturtevant was another student, and later an associate, of Morgan. A Russian student by the name of Theodosius Dobzhansky came to Morgan’s laboratory and soon began working with Sturtevant. In the 1930s, Dobzhansky and Sturtevant began to look at genetic variation in natural


populations, as opposed to laboratory stocks. They scored genetic variation in Drosophila using the then new cytogenetic technique of staining the giant polytene chromosomes of the larval salivary glands (Figure 1.4). Such stained chromosomes can reveal up to 5000 bands over the entire genome, providing a degree of genetic resolution unparalleled at the time. One form of variation that was readily observable with this technique was large structural rearrangements of the chromosomes caused by inverted segments. They soon established that these inversions were inherited as Mendelian units, just like the genes being studied by Morgan and Muller. In contrast to the morphological traits studied by Morgan and Muller in laboratory strains, they found extensive genetic diversity in these inversions in natural populations with no obvious “wildtype.” Subsequent studies revealed selection operating upon these alternative chromosome inversion types, leading to the “balanced school” which postulated that natural populations had high levels of genetic diversity and that this diversity is maintained by selection (so-called “balancing selection”). The debate over the amount of genetic variation in natural populations continued until the mid-1960s because until then the techniques used to define genes usually required genetic variation to exist. This situation changed dramatically in the mid-1960s with the first applications of protein electrophoresis. The initial electrophoresis studies (Harris 1966; Johnson et al. 1966; Lewontin and Hubby 1966) indicated that about a third of all protein coding loci were polymorphic (i.e. a locus with two or more alleles such that the most common allele has a frequency of less than 0.95 in the gene pool) in a variety of species.
This amount of variation was much higher than many had expected, and it had to be an underestimate because protein electrophoresis can only detect a subset of the mutations causing amino acid changes in protein coding loci (Appendix A). These protein electrophoretic surveys made it clear that most species have large amounts of genetic variation. Interestingly, this observation did not settle the debate between the classical and balanced schools. At first glance, it would seem that the observation of much polymorphism in natural populations should support the balanced school. What happened instead was that the observation of high levels of genetic variation transformed the debate from the amount of genetic variation to the significance of genetic variation. Recall that the balanced school not only argued that high levels of genetic variation existed in natural populations but they also argued that the genetic diversity was maintained by balancing selection and was therefore not neutral. The classical school now became the school of neutral evolution, arguing that most of the genetic diversity being observed by the molecular techniques did not have any phenotypic impact at all. The wildtype allele of Morgan became a set of functionally equivalent alleles under the neutral theory. In this manner, the proponents of the classical school acknowledged the presence of variation but not that it was maintained by natural selection. An empirical demonstration of neutral alleles is provided by the work of Zeyl and DeVisser (2001). They founded 50 replicate populations from a single founding genotype of the yeast Saccharomyces cerevisiae. They then transferred these replicate populations to new media for several generations, going through single genetic individuals at each transfer. By going through single genetic individuals, mutations that occurred in the lineage leading to these individuals would accumulate and would not be eliminated by selection. 
However, any mutation that resulted in complete or nearly complete lethality or sterility would be eliminated under their experimental protocol, so their results cannot be used to estimate the incidence of extremely deleterious mutations. They kept replicates of the founding genotype in a dormant state that would not accumulate mutations, and then they tested their 50 replicate lines against the original, non-mutated genotype for competitive growth in their experimental media. A ratio of competitive growth of a mutant line to the nonmutant founder of one corresponds to neutrality. Ratios greater than one imply that a line carries

Genetic Drift in Large Populations and Coalescence

Figure 5.2 The distribution of competitive ratios of 50 mutation accumulation lines to the founding nonmutant genotype in yeast Saccharomyces cerevisiae. (Histogram: x-axis, competitive ratio of mutant lines to initial genotype, 0.6 to 1.08; y-axis, number of mutant lines, 0 to 10.) Source: Zeyl and DeVisser (2001). © 2001, Genetics Society of America.

beneficial mutations, and ratios less than one imply deleterious mutations under the laboratory environment. Figure 5.2 summarizes their results. As can be seen, the most common class of mutant lines (18% of the total) had a ratio of one, implying neutrality. Indeed, the distribution of selective effects is bimodal, with one mode being centered around neutrality and the other distributed over a broad range of deleterious effects. Other procedures for estimating mutant fitness effects are less biased against highly deleterious mutations, and, commonly, the class of highly deleterious mutations becomes the most common in genes (Bataillon and Bailey 2014; Tataru et al. 2017). However, these studies also reveal many mutations at or close to neutrality. Thus, the concept of a set of functionally equivalent neutral alleles has empirical validity. Motoo Kimura was the leading proponent and developer of the neutral theory. In 1968, Kimura published two papers that put forth a model of evolution of neutral alleles via genetic drift that explained both the observation of high levels of genetic variation and the molecular clock (Kimura 1968a,b). Kimura showed that many aspects of the evolution of truly neutral alleles through genetic drift did not depend upon population size at all. How can this be, given that in Chapter 4 we saw that the strength of genetic drift as an evolutionary force is inversely proportional to population size (in an ideal population)? The answer to this question lies in the fact that the neutral theory does not deal with drift alone but rather the interactions between genetic drift and mutation. Kimura considered an ideal population of size N in which all 2N genes at an autosomal locus are neutral. These 2N genes are not the same as 2N alleles. Alleles refer to different types of genes at a locus. Genes, in this context, refer to the individual copies at a locus. 
Therefore, in 10 individuals, there are necessarily 20 “genes” that in theory could be grouped into 1–20 different allelic classes. As shown in the previous chapter, drift in the absence of mutation will eventually fix one gene (and therefore one allelic type) and lose all the others. Under neutrality, all genes are equally likely to be fixed by definition, meaning that each of the original 2N genes has a probability of fixation of 1/(2N).


Kimura now introduced mutation into his model. Let μ be the mutation rate of neutral alleles in the population. He also assumed the infinite alleles model in which all mutations yield a new allele. Then, the rate of production of new neutral alleles is the number of genes at the locus in the population times the mutation rate = 2Nμ. Kimura next considered the balance of drift versus mutation. He pointed out that large populations have a much smaller probability of fixation (1/(2N)) but a greater rate of mutant production (2Nμ) than do small populations. These two effects balance one another out so that the overall rate of molecular evolution (the rate at which new alleles are produced times the probability of one going to fixation) is:

\[
\text{Rate of Neutral Evolution} = \frac{1}{2N} \times 2N\mu = \mu \tag{5.3}
\]
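The cancellation of N in the rate of neutral evolution can be checked numerically. The sketch below is only an illustration, not Kimura's derivation: it assumes an ideal Wright-Fisher population in which the 2N gene copies of each generation are a binomial sample of the previous one, and the function names `fixation_probability` and `neutral_substitution_rate` are hypothetical. It follows the exact probability distribution of the number of copies of a single new neutral mutant until nearly all of the probability has been absorbed at loss or fixation.

```python
from math import comb

def fixation_probability(two_n, generations=1000):
    """Probability that one new neutral copy is eventually fixed among
    two_n gene copies (assumed ideal Wright-Fisher sampling)."""
    # transition[i][j] = P(j copies next generation | i copies now)
    transition = [
        [comb(two_n, j) * (i / two_n) ** j * (1 - i / two_n) ** (two_n - j)
         for j in range(two_n + 1)]
        for i in range(two_n + 1)
    ]
    # Start with exactly one copy of the new mutant.
    dist = [0.0] * (two_n + 1)
    dist[1] = 1.0
    # Push the distribution forward; states 0 (loss) and two_n (fixation)
    # are absorbing, so probability accumulates there over time.
    for _ in range(generations):
        dist = [sum(dist[i] * transition[i][j] for i in range(two_n + 1))
                for j in range(two_n + 1)]
    return dist[two_n]

def neutral_substitution_rate(two_n, mu):
    """New mutants per generation (2N * mu) times their fixation probability."""
    return two_n * mu * fixation_probability(two_n)

if __name__ == "__main__":
    for two_n in (10, 20, 40):
        print(two_n, fixation_probability(two_n),
              neutral_substitution_rate(two_n, 1e-6))
```

Under these assumptions, the fixation probability falls as 1/(2N) while the product with 2Nμ stays at μ for every population size tried, which is the content of Eq. (5.3).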

Note that in Eq. (5.3), there is no effect of the population size on the rate of neutral evolution by drift. Hence, drift is an important evolutionary force for neutral alleles in all populations, not just small ones. Kimura (1968a) used Eq. (5.3) to explain the molecular clock, and his explanation was used by King and Jukes (1969). If the neutral mutation rate is relatively constant (an internal property of the genetic system as opposed to an external property arising from changing environments), then the rate of molecular evolution is independent of what is going on in the environment. This explains why we see an apparent molecular clock, as long as the alleles are neutral (functionally equivalent). One problem with this explanation was that mutation rates are generally measured per generation, but the clock was measured in absolute time (years). Kimura overcame this difficulty by claiming that mutation rates are constant in absolute time and not in generation time. Kimura (1968b) also used neutrality to explain why there is so much polymorphism in natural populations. Kimura noted that it can take many generations for a neutral allele to go from being a new mutation to fixation, and during this time, the population is polymorphic, albeit in a transient sense. But, quite often, one or more copies of the allele going to fixation will actually mutate to new neutral alleles before the original version achieved fixation. Hence, if the population is large enough, large amounts of neutral variation will be present at any given time, even though the specific alleles are in constant turnover due to mutation and drift. The impact of drift and mutation upon polymorphism can be quantified by returning to Eq. (4.3), the equation that describes how the average probability of identity-by-descent in an idealized population changes from one generation to the next due to the force of genetic drift. There is no mutation in Eq. 
(4.3), so we need to add a model of mutations in order to investigate how the balance of mutation and drift can explain intraspecific polymorphisms. We will assume an infinite alleles, neutral mutation model. Under the infinite alleles model, the only way to be “identical”-by-descent (which is synonymous with identical-by-state under this model) is to have no mutational events occurring between the generations in addition to being inbred in the pedigree sense. Given that neutral mutations occur with a probability μ, the probability that no mutation occurs is 1 − μ. Since two gametes are needed to produce an individual, identity-by-descent requires that no mutation occurred in the production of either gamete. Under the assumption that all meiotic events are independent, the probability that both the male and female gametes experience no mutation is (1 − μ)². We can now incorporate these new probabilities into Eq. (4.3) to obtain:

\[
F(t) = \left[\frac{1}{2N} + \left(1 - \frac{1}{2N}\right)F(t-1)\right](1-\mu)^2 \tag{5.4}
\]

where F(t) is the average probability of identity by descent at generation t, the bracketed term is the probability of identity by descent due to genetic drift, and (1 − μ)² is the probability of no mutation in both gametes.

As N decreases, F increases, but as μ increases, F decreases. Hence, Eq. (5.4) tells us that drift increases the average probability of identity-by-descent, whereas mutation decreases it. Over time, the average probability of identity-by-descent may reach an equilibrium Feq, reflecting a balance between the two antagonistic forces of drift and mutation. This equilibrium can be determined from Eq. (5.4) by setting F(t) = F(t − 1) to yield:

\[
F_{eq} = \frac{1}{2N\left[\dfrac{1}{(1-\mu)^{2}} - 1\right] + 1} \tag{5.5}
\]

Using a Taylor’s Series Expansion, (1 − μ)⁻² ≈ 1 + 2μ + 3μ² + 4μ³ + …. If μ is very small, terms of μ² and higher can be ignored, so a good approximation to Eq. (5.5) is:

\[
F_{eq} = \frac{1}{4N\mu + 1} \tag{5.6}
\]

To deal with populations that deviate from our idealized set of assumptions, we need to substitute Nef (Chapter 4) for N in Eq. (5.6). Because Feq is the equilibrium level of average identity-by-descent in the population and because identity-by-state equals identity-by-descent under the infinite alleles model, Feq also has the interpretation of being the expected homozygosity under random mating. Therefore, 1 − Feq = Heq is the expected heterozygosity. Let θ = 4Nefμ, where θ measures the proportional strength of mutation (μ) to genetic drift (1/Nef). Then, Eq. (5.6) can be recast as:

\[
1 - F_{eq} = H_{eq} = 1 - \frac{1}{\theta + 1} = \frac{\theta}{\theta + 1} \tag{5.7}
\]

Equation (5.7) gives the expected heterozygosity for neutral alleles as a function of the ratio of the strength of mutation versus drift (θ). Figure 5.3 shows a plot of expected heterozygosity versus θ. As can be seen, Heq can take on any value between 0 and 1. Thus, depending upon the exact value of θ, the neutral theory can explain any degree of genetic variability found in a population.
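The balance between drift and mutation described by Eqs. (5.4) through (5.7) can be explored numerically. The following is a minimal sketch under the idealized assumptions of the text; the function names and the parameter values (N = 1000, μ = 10⁻⁴) are illustrative choices and do not come from any data set. It iterates the recursion of Eq. (5.4) until it settles down, compares the result with the exact equilibrium of Eq. (5.5) and the approximation of Eq. (5.6), and also inverts Eq. (5.7) to recover θ from an observed heterozygosity.

```python
def next_f(f_prev, n, mu):
    """One generation of Eq. (5.4): drift term times no-mutation term."""
    drift = 1 / (2 * n) + (1 - 1 / (2 * n)) * f_prev
    return drift * (1 - mu) ** 2

def iterate_f(n, mu, generations, f0=0.0):
    """Apply Eq. (5.4) repeatedly, starting from identity-by-descent f0."""
    f = f0
    for _ in range(generations):
        f = next_f(f, n, mu)
    return f

def f_eq_exact(n, mu):
    """Equilibrium of Eq. (5.5)."""
    return 1 / (2 * n * ((1 - mu) ** -2 - 1) + 1)

def f_eq_approx(n, mu):
    """Small-mu approximation of Eq. (5.6)."""
    return 1 / (4 * n * mu + 1)

def h_eq(theta):
    """Expected heterozygosity of Eq. (5.7)."""
    return theta / (theta + 1)

def theta_from_h(h):
    """Eq. (5.7) solved for theta, given a heterozygosity h < 1."""
    return h / (1 - h)

if __name__ == "__main__":
    n, mu = 1000, 1e-4          # illustrative values only
    print(iterate_f(n, mu, 100_000), f_eq_exact(n, mu), f_eq_approx(n, mu))
    print(h_eq(4 * n * mu), theta_from_h(0.01), theta_from_h(0.10))
```

Because Heq = θ/(θ + 1) is an increasing function of θ, `theta_from_h` gives the single value of θ that neutrality requires for any observed heterozygosity, which is why an observed range of heterozygosities translates directly into a range of θ values.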

Critiques of the Neutral Theory

The neutral theory showed that genetic drift was an important evolutionary force in all populations, regardless of size. Under neutrality, drift and mutation could together explain much of the observed patterns in genetic variation both between and within species. However, a drawback of applying the neutral theory was that it explained the genetic observations in terms of parameters that were unknown or difficult to estimate. For example, the predictions of the neutral theory with respect to the molecular clock depend upon μ, the neutral mutation rate. The neutral theory always acknowledged that some mutations are not neutral, and, in particular, many new mutations are deleterious, just as shown in Figure 5.2. Hence, μ in the neutral theory is not the mutation rate, but rather the rate of mutation to functionally equivalent alleles only. Directly estimating mutation rates is difficult enough, but no one has devised a method of directly measuring the mutation rate to

Figure 5.3 The relationship between the equilibrium expected heterozygosity (Heq) under the infinite alleles model and the ratio of the strength of the neutral mutation rate (μ) to genetic drift (1/Nef) as measured by θ = 4Nefμ. (Plot: Heq on the y-axis against θ = 4Nefμ on the x-axis, with θ running from 0 to 10.)

just neutral alleles. The situation is even worse for explaining the amount of intraspecific variation under the neutral theory; as shown by Eq. (5.7), the neutral explanation depends upon the values of both μ and of Nef, the inbreeding effective size. As shown in the last chapter, estimating the inbreeding effective size is also a difficult task in many situations. In the absence of direct estimates, one could always set the values of μ and of Nef to explain any particular rate of interspecific evolution or intraspecific heterozygosity under the assumption of neutrality. Thus, the neutral theory had great explanatory power but was difficult to test. This difficulty became apparent in the 1970s, when many population geneticists attempted to test the neutral theory. It proved extremely difficult to make observations that unambiguously discriminated neutrality from selection. For example, the molecular clock is explained under neutrality by assuming that μ is a constant over time for the gene under study. However, given that μ is the neutral mutation rate, μ is not expected to be constant across genes that are subject to different degrees of functional constraint even if the mutation rate per nucleotide is constant in all genes (that is, the mutation rate that includes both neutral and non-neutral mutations). For example, the α-chain of the hemoglobin molecule carries oxygen to the cells and interacts with a variety of other globin chains during the course of development. Hence, it seems reasonable that most amino acid substitutions in this molecule would have deleterious consequences for the functioning of the molecule, leading to a low neutral amino acid replacement mutation rate. In contrast, the protein fibrinopeptide is involved in blood clotting, but only a small portion of the molecule is actually involved in this process; the bulk of the molecule seems to have little function other than to be cleaved off and discarded when the clotting process is initiated. 
Consequently, fibrinopeptide should have a higher neutral mutation rate than α-hemoglobin and, therefore, a faster molecular clock under the neutral theory. Figure 5.4 shows the observed amino acid sequence evolution in these two molecules, along with the estimated rate of molecular divergence, which should be the neutral mutation rate under the neutral theory. As can be seen, fibrinopeptide evolves much more rapidly than the more functionally constrained molecule of α-hemoglobin, as predicted by the neutral theory.

Figure 5.4 Rates of amino acid substitution in the fibrinopeptides and α-hemoglobin. The approximate time of divergence between any two species being compared is given on the x-axis (millions of years since divergence, 0 to 700) as estimated from paleontological data. The y-axis represents the number of amino acid changes per 100 residues (0 to 140) for a comparison of the proteins of two species, after correction for the possibility of multiple substitutions at a single amino acid site. Source: R.E. Dickerson (1971). © 1971, Springer Nature.

This line of argument for neutrality has been greatly extended with data sets based upon DNA sequencing. For example, we now know that third codon positions usually evolve more rapidly than the first and second positions. This fits the functional constraint argument for neutrality because a third position mutation is more likely to be synonymous than mutations at the first and second positions due to the degeneracy of the genetic code; hence, a third position mutation should be more likely to be neutral. Similarly, sometimes, a functional gene gets duplicated, but the duplicate is nonfunctional and is therefore called a pseudogene. In general, pseudogenes evolve more rapidly than their functional ancestral gene, a pattern also consistent with the greater probability of a neutral mutation in the pseudogene versus the functional gene. This pattern of faster evolution in molecules with less functional constraint is a test supporting neutrality only if this same pattern is inconsistent with the hypothesis that molecular evolution is driven by natural selection. Is it? Back in the 1930s, the English population geneticist and statistician R.A. Fisher (1930) argued that the smaller the effect of a mutation upon function, the more likely it would be that the mutation would have an advantageous effect on the phenotype of fitness (for now, we define fitness as the quantitative ability of an individual to replicate and pass on DNA). We will discuss Fisher’s reasoning in more detail in Chapter 12, but, for now, we simply point out that Fisher’s theory, formulated long before the neutral theory, also results in the prediction that molecules with low levels of functional constraint should evolve rapidly due to natural

Figure 5.5 The relationship between population size (measured on a logarithmic scale) and expected heterozygosity from protein electrophoretic data, as observed in a large number of species. (Plot: x-axis, logarithm of population size, 0 to 14; y-axis, expected heterozygosity, 0 to 0.25.) Source: Nei and Graur (1984). © 1984, Springer Nature.

selection because more mutations will be of small phenotypic effect and thereby are more likely to be selectively advantageous. Hence, interspecific clock patterns such as that shown in Figure 5.4 are compatible with both the neutral theory and Fisher’s theory of selectively driven advantageous mutations of small phenotypic effect. Now, consider testing the neutral theory with observations on intraspecific levels of genetic variation. Equation (5.7) allows any level of variation to be explained as a function of μ and Nef. Thus, at first glance, it would appear that the neutral theory is virtually unassailable by observations on the amounts of intraspecific variation and, thereby, not testable with such data. But is it? If μ is regarded as a constant for a given gene or set of genes (to explain the molecular clock), then any variation in expected heterozygosity levels across species for the same gene or set of genes would have to be due to differences in Nef under neutrality. Figure 5.5 shows a plot of expected heterozygosities (determined by protein electrophoresis) in several species versus the logarithm of the estimated actual population size of those species. Note that all the observed heterozygosities are less than 0.25. Under the neutral theory, the predicted relationship shown in Figure 5.3 maps the heterozygosities found in Figure 5.5 onto a very narrow range of θ values and, hence, of Nef values. Indeed, the bulk of the observed expected heterozygosities shown in Figure 5.5 are between 0.01 and 0.10, implying only a 10-fold range of variation in Nef across species whose actual population sizes span 14 orders of magnitude. Although Nef is not the same as census size, a discrepancy of 13 orders of magnitude seems difficult to explain under neutrality. Ellegren and Galtier (2016) have suggested that part of this dilemma is due to the dependence of Eq. (5.7) upon effective population size. 
They argue that many of the life history attributes of species interact with absolute population size in such a manner as to produce a more restricted range of effective population size. They also argue that linkage effects could further explain the dilemma. As will be shown in Chapter 12, when natural selection results in the fixation of a beneficial mutation, there is also a tendency to fix neutral mutations that are closely linked, causing a decrease in neutral variation. Also, when selection eliminates a deleterious mutation, it reduces the local inbreeding effective size of the genomic region closely linked to the deleterious mutation, also causing reduced


neutral variation in the region of the deleterious mutation. These linked selective effects tend to become more effective with increasing population size, which once again tends to produce a narrower range of neutral variation in the genome. Ellegren and Galtier (2016) do not claim that they have totally explained the neutral theory dilemma of a narrow range of genetic diversity, but life history traits and the linkage effects of selection do help. Another way out of this apparent dilemma of too little variation in expected heterozygosities across species is to turn attention to μ. Recall that μ is not the mutation rate, but the neutral mutation rate. But what exactly is a neutral allele? As originally used by Kimura, a neutral allele is one that has absolutely no impact on the phenotype of fitness under any conditions. However, one of Kimura’s colleagues, Tomoko Ohta, focused on mutations that displayed weak fitness effects – so-called nearly neutral mutations (Ohta 1976). In terms of Figure 5.2, Ohta was concerned with all the mutations in the upper mode centered around neutrality and not just the absolutely neutral ones. She argued that these mutations of small selective impact would be effectively neutral in very small populations because genetic drift would dominate the weak effects of selection. Hence, the effective neutral mutation rate would be high in small populations because it would include not only the absolutely neutral mutations but also many of the slightly selected mutations. As population size increased and the force of drift decreased, more and more of these mutations of slight effect would come under the influence of selection. Hence, as Nef increases, the effective neutral mutation rate μ would decrease. Because of these opposite effects, the product Nefμ would show a much dampened range of variation.
With Ohta’s inclusion of mutations with slight fitness effects, a wide range of Nef values should give similar H values under this modified neutral theory, as is observed (Figure 5.5). The trouble with Ohta’s explanation for the proponents of neutrality is that it undermines the prediction of the molecular clock. The μ in Eq. (5.3) is now a function of Nef, so discrepancies in the clock across species and across time are now expected as no one expects Nef to remain constant across species and over time. As a consequence, although Kimura (1979) briefly embraced Ohta’s idea of nearly neutral mutations, he quickly abandoned it and went back to a model of absolute neutrality in order to preserve the strict molecular clock of Eq. (5.3) but at the price of not having a satisfactory explanation for the patterns of intraspecific variation. The strict molecular clock (Eq. 5.3) derived by Kimura has itself come under attack. Kimura assumed a constant population size of N. Balloux and Lehmann (2012) have shown that when population size fluctuates over time (the realistic case for all species) and generations overlap, then the rate of neutral evolution is a function of population demography and not just the neutral mutation rate, thereby undermining the validity of Eq. (5.3). The constancy of the neutral mutation rate, μ, in Eq. (5.3) has also been questioned, not only by Ohta’s (1976) nearly neutral theory but also by questioning Kimura’s treatment of neutrality as being an intrinsic property of a mutation that is invariant to external factors. Hartl et al. (1985) showed both theoretically and empirically that enzymes will evolve under selection in such a manner that many mutations will evolve to be neutral or nearly neutral as a consequence of long continued natural selection on the enzyme-coding gene. 
A similar model for regulatory evolution at enhancers has also shown that the selective effects of mutations at a site change dramatically over time due to substitutions elsewhere in the enhancer, and even the overall degree of constraint across the enhancer can change considerably (Bullaughey 2011). One result of this model for enhancer evolution is that even deleterious mutations on the initial genetic background can evolve toward effective neutrality upon the selected genetic background. Zheng et al. (2019) performed experiments on the bacterium Escherichia coli adapting to a novel light environment and found that populations with many initial neutral or mildly deleterious genetic variants adapted with greater diversity and higher ultimate fitness than populations without such initial variants. The initial neutral and deleterious variants created more mutational


pathways to adaptive variants. For example, suppose variants A and B are neutral alleles separated by a single mutational change. Let C be an adaptive mutation in the new environment that is a single mutational step from B but two mutational steps from A. An ancestral population having both the A and B alleles in its gene pool is more likely to generate the adaptive mutant C than an ancestral population with only A in its gene pool. In this manner, neutral variants actually enhance adaptive evolution by providing a greater diversity of adaptive pathways (Zheng et al. 2019). All of these studies show that neutrality is not an intrinsic property of a mutant, but depends critically upon the genetic and environmental backgrounds in which that mutation occurs – backgrounds that are constantly changing. Given that neutrality itself can evolve, Eq. (5.3) no longer predicts a strict molecular clock. Kimura’s neutral clock does display some rate variation due to the stochastic nature of genetic drift, but Gillespie (1986, 1988) has shown that the molecular clock shows a greater range of variation than predicted by Kimura’s strict clock. Hence, molecular clocks still retain much validity in timing evolutionary events, but they are more subject to random variation than predicted by Kimura’s neutral theory. Because of difficulties such as those mentioned above, Kimura’s neutral theory is still controversial. For example, Kern and Hahn (2018) argued that modern data and theory have “overwhelmingly rejected” the neutral theory, but Jensen et al. (2019) argued that much of the neutral theory is still valid, particularly for the large fraction of the genome that is non-coding (Chapter 1). Nevertheless, most population geneticists accept the idea that genetic drift is a major player in the evolution of genes at the molecular level, for neutral, nearly neutral, and conditionally neutral mutations. 
The ideas of a strict molecular clock and of absolute neutrality do not appear to explain the observed data well, but molecular clocks are still useful and widely used. Moreover, as will be shown in Chapters 12 and 13, modern data sets allow many ways of investigating natural selection at the molecular level, so the focus of much current research is now not testing the neutral theory per se, but rather using the neutral theory as a convenient null hypothesis in testing for selection. The interesting evolutionary biology usually emerges when the null hypothesis of neutrality is rejected and the focus of the research shifts to “Why is neutrality rejected?”

The Coalescent

Up to now, we have been taking a forward-looking approach to see what will happen in the next generation given the current generation. Frequently, however, we make observations about present patterns and want to know how evolution occurred in the past to produce the present. Therefore, we also need to look at evolution backwards in time from the present. This backwards approach is in many ways the more practical one when dealing with natural populations rather than experimental ones. With natural populations, we do not know the future, but we can survey genetic variation in current populations, and then use the present-day observations to make inferences about the evolutionary past. Also, we often do not have the luxury or opportunity of conducting forward-looking experiments, particularly when we want to study evolution over long time scales or species with long generation times. Once again, we need a theory that deals with time backwards from the present in order to test hypotheses about evolution in such cases. One class of backward-looking models in population genetics is called coalescent theory (Kingman 1982a,b). We will introduce coalescent theory by modeling genetic drift in this time-backwards sense.


The Basic Coalescent Process

Recall from Chapter 1 our first population genetic premise: DNA replicates. In a forward sense, this means that one molecule of DNA can become two or more molecules in the future. In the backward sense, this means that two or more molecules of DNA observed today can coalesce into a single copy of DNA in the past (Figure 5.6). We say a coalescent event occurs when two lineages of DNA molecules merge back into a single DNA molecule at some time in the past. Hence, a coalescent event is simply the time inverse of a DNA replication event. To see how genetic drift can be recast in terms of coalescent events, consider a hypothetical drift experiment consisting of just six haploid individuals (Figure 5.7). This population has six copies of any homologous DNA segment at any given time. As this finite population reproduces, DNA replicates. By chance alone, some molecules get more copies into the next generation than others. Eventually, fixation occurs (after eight generations in Figure 5.7) such that all copies are now descended from a single DNA molecule at generation 1. Now, consider the drift process as observed backwards in time. Suppose generation 10 is the current generation and we survey all six genes. We now look at the genealogical history of those six genes. Note first that we cannot observe any part of the drift process that led to the extinction of a

Figure 5.6 DNA replication and its time inverse, coalescence.

Figure 5.7 A hypothetical case of genetic drift in a population with only six copies of a homologous gene. Each vertical or diagonal line indicates a DNA replication event, going from the top to the bottom. A DNA molecule with no lines coming from its bottom did not pass on any descendants. By generation 3, all six copies are descended from a single DNA molecule. This common ancestral molecule and all its descendants are shown in bold compared to all the other DNA lineages that went extinct by generation 10.

gene lineage prior to generation 10 (the parts in gray in Figure 5.7). Because we are starting only with the genes present in the current population (generation 10), we have no way of knowing about the gene lineages that no longer exist. (In some cases, we can get information about past genes from fossils, as will be discussed in Chapter 7, but such cases are exceptional.) Moreover, once all our current gene lineages have coalesced back to a common ancestor (generation 3 in Figure 5.7), we no longer have genetic variation in the coalescent process. Therefore, we have no information about evolutionary events that occurred prior to the appearance of the common ancestor because all observable aspects of evolution involve genetic variation. We do not know exactly how many other genes existed at that generation of ultimate coalescence. We also do not know how many copies of the “winning” gene (the one that became the ancestor of all current genes) existed in

Figure 5.8 A hypothetical case of the coalescent process in a population with only six copies of a homologous gene. Each combination of a vertical with a diagonal line indicates a coalescent event, going from the bottom to the top. This figure shows the same population illustrated in Figure 5.7, but shows only that part of the genealogical structure associated with the variation present at generation 10 back to the generation of coalescence (generation 3) of all six copies present at generation 10.

the coalescence generation or in previous generations. Finally, we do not know how long the winning gene existed in the population prior to it being the ancestor of all current variations. Hence, in the backward experiment, we cannot observe as much as in the forward experiment. The most we can observe about the coalescent process is shown in Figure 5.8. In general, consider taking a sample of n genes from a population. The word “genes” in coalescent models refers to the different copies of a homologous stretch of DNA. Because drift inevitably causes fixation in the future sense, this means in the backward sense that eventually all of the genes can be traced back in time to a common ancestral gene from which all current copies are descended. Thus, what we see as we look backwards in time is a series of coalescent events, with each event reducing the number of DNA lineages (as illustrated in Figure 5.8). This reduction proceeds until all sampled copies in the present coalesce to a common ancestral DNA molecule that existed at some time in the past. For example, all the copies of mitochondrial DNA (mtDNA) found in living humans must eventually coalesce into a single ancestral mtDNA. Because mtDNA is inherited as a maternal haploid in humans, this ancestral mtDNA must have been present in a female. Some scientists and much of


the popular media have dubbed this bearer of our ancestral mtDNA as “mitochondrial Eve” and have treated this as a startling discovery about human evolution, as will be detailed in Chapter 7. However, the existence of a mitochondrial Eve is trivial under coalescence. Finite population size (all real populations) ensures that all copies of any homologous piece of DNA present in any species have been derived from a single common ancestral DNA molecule – indeed, this is the very definition of genic homology (descent from a common ancestor). To say that humans have a mitochondrial Eve is to say only that all human mtDNA is homologous. When “Eve” is called the ancestor of us all, it only means that our mtDNA is descended from her mtDNA, and not necessarily any other piece of the human genome. All genes that are homologous have a common ancestor at some point in the past.

Assuming that all the genes are neutral, what can we say about how finite population size (genetic drift) influences the coalescent process? Consider first the simple case of a random sample of just two genes. The probability that these two genes coalesce in the previous generation is the same as being identical-by-descent due to pedigree inbreeding from the previous generation under random mating. As shown earlier, this probability is 1/(2Nef) for a diploid gene. In general, the probability of coalescence in the previous generation is 1/(xNef) where x is the ploidy level. Therefore, the probability that the two genes did not coalesce in the previous generation is 1 − 1/(xNef). Hence, the probability of coalescence exactly t generations ago is the probability of no coalescence for the first t − 1 generations in the past followed by a coalescent event at generation t:

\[
\text{Prob(Coalesce at } t) = \left(1 - \frac{1}{xN_{ef}}\right)^{t-1} \frac{1}{xN_{ef}} \tag{5.8}
\]

The average time to coalescence is then:

$$E(\text{time to coalesce}) = \sum_{t=1}^{\infty} t \left(1 - \frac{1}{xN_{ef}}\right)^{t-1} \frac{1}{xN_{ef}} = xN_{ef} \tag{5.9}$$
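As a numerical check on Eqs. (5.8) and (5.9), the following minimal sketch evaluates the geometric probabilities directly and truncates the infinite sum at a large number of generations. The function names, the truncation point, and the effective size of 100 are illustrative assumptions, not values from the text.

```python
def prob_coalesce_at(t, x, n_ef):
    """Eq. (5.8): probability that two gene lineages coalesce exactly t generations ago."""
    p = 1.0 / (x * n_ef)              # per-generation coalescence probability for one pair
    return (1.0 - p) ** (t - 1) * p


def expected_pair_coalescence_time(x, n_ef, max_t=None):
    """Approximate the infinite sum in Eq. (5.9) by truncating it at max_t generations."""
    if max_t is None:
        max_t = int(60 * x * n_ef)    # assumed cutoff; the remaining tail is negligible
    return sum(t * prob_coalesce_at(t, x, n_ef) for t in range(1, max_t + 1))


if __name__ == "__main__":
    x, n_ef = 2, 100                  # diploid locus, hypothetical effective size
    # Eq. (5.9) predicts x * n_ef = 200 generations.
    print(expected_pair_coalescence_time(x, n_ef))
```

The truncated sum agrees with xNef to within floating-point error, which is simply the mean of a geometric distribution with success probability 1/(xNef).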

We now consider a sample of n homologous autosomal genes taken from a large population of size N such that n is much smaller than N. Under these conditions, it is unlikely that more than a single pair of genes will coalesce at any given generation. Recall that the probability that a pair of genes coalesced in the previous generation in an idealized population is 1/(xNef), where x is the ploidy level. The number of pairs of genes in a sample of n is given by the binomial coefficient:

$$\text{Number of pairs of genes} = \binom{n}{2} = \frac{n!}{(n-2)!\,2!} = \frac{n(n-1)}{2} \tag{5.10}$$

We do not care which pair coalesced first, but simply that any pair coalesced. Hence, the probability that one pair coalesced in the previous generation is simply the product of the probability that a pair coalesced times the number of pairs:

$$\text{Prob(coalescence in the previous generation)} = \binom{n}{2} \frac{1}{xN_{ef}} = \frac{n(n-1)}{2xN_{ef}} \tag{5.11}$$

Hence,

$$\text{Prob(no coalescence in the previous generation)} = 1 - \frac{n(n-1)}{2xN_{ef}} \tag{5.12}$$


If no coalescence occurred in the previous generation, there are still n gene lineages, and the above probabilities still apply to the next generation in the past and so on until the first pair coalesced. Therefore,

$$\text{Prob(first coalescence in } t \text{ generations)} = \left(1 - \frac{n(n-1)}{2xN_{ef}}\right)^{t-1} \frac{n(n-1)}{2xN_{ef}} \tag{5.13}$$

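and the expected time to the first coalescence is:

$$E(\text{time to first coalescence}) = \sum_{t=1}^{\infty} t \left(1 - \frac{n(n-1)}{2xN_{ef}}\right)^{t-1} \frac{n(n-1)}{2xN_{ef}} = \frac{2xN_{ef}}{n(n-1)} \tag{5.14}$$

The variance of the time to first coalescence, $\sigma_1^2$, is given by:

$$\sigma_1^2 = \sum_{t=1}^{\infty} \left(t - \frac{2xN_{ef}}{n(n-1)}\right)^{2} \left(1 - \frac{n(n-1)}{2xN_{ef}}\right)^{t-1} \frac{n(n-1)}{2xN_{ef}} = \frac{2xN_{ef}}{n(n-1)} \left(\frac{2xN_{ef}}{n(n-1)} - 1\right) \tag{5.15}$$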

Once the first coalescent event has occurred, we now have n − 1 gene lineages, and, therefore, the expected time to the second coalescent event starting from the time at which the first coalescent event took place is:

$$E(\text{time to second coalescence} \mid \text{first}) = \sum_{t=1}^{\infty} t \left(1 - \frac{(n-1)(n-2)}{2xN_{ef}}\right)^{t-1} \frac{(n-1)(n-2)}{2xN_{ef}} = \frac{2xN_{ef}}{(n-1)(n-2)} \tag{5.16}$$

In general, the expected time and variance between the (k − 1)th coalescent event and the kth event are:

$$E(\text{time between the } (k-1)\text{th and } k\text{th coalescent events}) = \frac{2xN_{ef}}{(n-k+1)(n-k)} \tag{5.17}$$

$$\sigma_k^2 = \frac{2xN_{ef}}{(n-k+1)(n-k)} \left(\frac{2xN_{ef}}{(n-k+1)(n-k)} - 1\right) \tag{5.18}$$
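Equations (5.17) and (5.18) are easy to tabulate. The sketch below does so for an arbitrary sample; the function name and the example values (six genes, a diploid locus, Nef = 1000) are assumptions chosen only for illustration.

```python
def interval_mean_and_variance(n, k, x, n_ef):
    """Eqs. (5.17) and (5.18): mean and variance of the waiting time between
    the (k-1)th and kth coalescent events in a sample of n genes."""
    if not 1 <= k <= n - 1:
        raise ValueError("k must lie between 1 and n - 1")
    lineages = n - k + 1                        # lineages present during this interval
    mean = 2.0 * x * n_ef / (lineages * (lineages - 1))
    variance = mean * (mean - 1.0)
    return mean, variance


if __name__ == "__main__":
    n, x, n_ef = 6, 2, 1000                     # hypothetical sample and population
    for k in range(1, n):
        mean, variance = interval_mean_and_variance(n, k, x, n_ef)
        print(f"k={k}: mean={mean:.1f} generations, sd={variance ** 0.5:.1f}")
```

Note that the standard deviation of each interval is nearly as large as its mean, and that both grow as the number of remaining lineages falls.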

As the coalescent process proceeds, we finally get down to the n − 2 coalescent event that leaves only a single pair of genes. From the equation immediately above, we can see by letting k = n − 1, the next and final coalescent event, that the expected time between the second to last and the last coalescent event is xNef, the same result obtained for a random pair of genes. Notice that the time between coalescent events becomes progressively longer as we look farther and farther back into the past. Indeed, as we will now see, the time between the next to last coalescent event and the final one that unites all the original DNA lineages into a single ancestral molecule is nearly as much as the times for all the coalescent events that preceded it added together. To show this, we note that the total time it takes n genes to coalesce to a single ancestral molecule is the


sum of all the times from the first to the last coalescent event and that these times are independent because of the Markovian property (Box 4.1). Hence, we have that:

$$E(\text{time to coalescence of all } n \text{ genes}) = \sum_{k=1}^{n-1} \frac{2xN_{ef}}{(n-k+1)(n-k)} = 2xN_{ef}\left(1 - \frac{1}{n}\right) \tag{5.19}$$

$$\begin{aligned}
\text{Var}(\text{time to coalescence of all } n \text{ genes}) &= \sum_{k=1}^{n-1} \sigma_k^2 \\
&= \sum_{k=1}^{n-1} \frac{2xN_{ef}}{(n-k+1)(n-k)} \left(\frac{2xN_{ef}}{(n-k+1)(n-k)} - 1\right) \\
&\approx 4x^2N_{ef}^2 \sum_{k=1}^{n-1} \frac{1}{(n-k+1)^2(n-k)^2} \\
&= 4x^2N_{ef}^2 \sum_{i=2}^{n} \frac{1}{i^2(i-1)^2}
\end{aligned} \tag{5.20}$$

As the above equations reveal, the time to coalescence of a large sample of genes is about 2xNef, so the first n − 2 coalescent events take about xNef generations and the last coalescent event of the remaining pair of DNA lineages also takes xNef generations on the average. The average coalescent time to the common ancestor of all n genes is 2xNef(1 − 1/n). These equations for expected coalescence time reveal another property of the evolutionary impact of drift: drift determines how rapidly genes coalesce to a common ancestral molecule. The smaller the inbreeding effective size, the more rapidly coalescence is expected to occur. Also, note that the expected time for ultimate coalescence approaches 2xNef as the sample size (n) increases. Hence, the expected time to coalescence to a single molecule for any sample of genes is bounded between xNef and 2xNef. Therefore, if you are interested in “old” events, you do not need large samples of genes to include some of the oldest coalescent events in your sample. For example, with just a sample of 10 genes, the expected coalescence time of your sample is 90% of the expected coalescence time for the entire population. Indeed, just a sample of two genes yields 50% of the expected coalescence time for the entire population.

On the other hand, the expected time to the first coalescent event in a sample of n genes (that is, the coalescent event that reduces the number of gene lineages from n to n − 1) is 2xNef/[n(n − 1)], which is very sensitive to the sample size n. For example, suppose we take a sample of 10 genes at a diploid locus (x = 2); then the expected times to the first and last coalescent events in this sample are 0.0444Nef and 3.6Nef, respectively. Now, if we increase our sample to 100 genes, then the coalescent events in our sample are expected to fall between 0.0004Nef and 3.96Nef.
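The sample-size comparison just given can be reproduced with a few lines of code. This is a minimal sketch based on Eqs. (5.14) and (5.19); the function names are our own, and times are reported in units of Nef by setting Nef = 1.

```python
def time_to_first_coalescence(n, x, n_ef=1.0):
    """Eq. (5.14): expected time until the first of the n lineages coalesce."""
    return 2.0 * x * n_ef / (n * (n - 1))


def time_to_last_coalescence(n, x, n_ef=1.0):
    """Eq. (5.19): expected time until all n lineages reach one ancestral molecule."""
    return 2.0 * x * n_ef * (1.0 - 1.0 / n)


if __name__ == "__main__":
    x = 2                                        # diploid (autosomal) locus
    for n in (2, 10, 100):
        first = time_to_first_coalescence(n, x)
        last = time_to_last_coalescence(n, x)
        print(f"n={n:>3}: first event at {first:.4f} Nef, last event at {last:.2f} Nef")
```

Running the loop shows the asymmetry directly: the time to the last event changes little with n, whereas the time to the first event shrinks roughly as 1/n².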
Note that by increasing our sample by an order of magnitude, we had only a minor impact on the expected time to the oldest coalescent event (going from 3.6Nef to 3.96Nef) but changed the expected time to the first coalescent event by two orders of magnitude (0.0444Nef–0.0004Nef). Hence, if you are interested only in the old events in a gene’s history, then small samples are sufficient, but if you are interested in recent events as well, then large sample sizes are critical. The equation for the expected coalescent time for a large sample of genes, 2xNef, also indicates that ploidy level has a major impact on the time to ultimate coalescence. For an autosomal region, x = 2, so the expected time in a large sample to the most recent common ancestral DNA molecule for an autosomal region is about 4Nef. X-linked DNA is haploid in males and diploid in females, so in a population with a 50 : 50 sex ratio, x = 1.5 and the expected ultimate coalescent time is about


3Nef. MtDNA is inherited as a haploid element in many animals, so x = 1. Moreover, mtDNA is maternally inherited, so only females pass on their mtDNA. Thus, the inbreeding effective size for the total population of males and females, the Nef that is applicable to autosomal and X-linked DNA, is not applicable to mtDNA. Instead, the expected time to ultimate coalescence of mtDNA is influenced only by the inbreeding effective size of females, say Nef♀. Thus, with x = 1, the expected coalescence time of mtDNA is about 2Nef♀. Similarly, Y-chromosomal DNA is inherited as a paternal haploid, so its expected coalescent time is about 2Nef♂, where Nef♂ is the inbreeding effective size for males. Because the sex ratio is close to 50 : 50 in humans, it is commonplace to approximate each sex-specific inbreeding effective size by (1/2)Nef. Thus, a 1 : 1 : 3 : 4 ratio is expected for the relative coalescence times of Y-DNA, mtDNA, X-linked DNA, and autosomal DNA, respectively. However, inbreeding effective sizes are affected by many factors, including the variance of reproductive success (Chapter 4). In many species, the variance of reproductive success is larger in males than in females. The higher this variance, the lower the effective size, so in general Nef♀ > Nef♂. Therefore, Y-DNA is expected to coalesce the most rapidly of all the genetic elements in the human genome.

Templeton (2005) estimated the times to ultimate coalescence of human Y-DNA, mtDNA, 11 X-linked loci, and 12 autosomal loci by using the molecular clock method of Takahata et al. (2001). With this method, one first needs to compare the human genes to their homologs in chimpanzees. Let DCH be the average number of mutations that have accumulated between the chimp and human genes.
Given that the fossil record indicates that humans and chimps split about 6 million years ago (Haile-Selassie 2001; Pickford and Senut 2001), the time available for this number of mutations to accumulate is 12 million years (6 on the human lineage, and 6 on the chimp lineage). Let DH be the average number of mutations that have accumulated within the human sample of genes between lineages that diverged from the common human ancestral gene. Then, assuming a constant rate of mutational accumulation (the molecular clock, Eq. (5.3)), the estimated time to coalescence of the human gene sample is 12DH/DCH million years. These estimated coalescence times are presented in Figure 5.9. As expected, Y-DNA has the smallest coalescence time, with mtDNA not being much longer. The X-linked loci all have longer coalescence times than mtDNA, and the autosomal loci have the longest times to coalescence on the average. Hence, these data confirm the theoretical prediction that the average time to coalescence should be affected by the level of ploidy and the pattern of inheritance (unisexual versus bisexual).

Note in Figure 5.9 that there is much variation in coalescence times within the X-linked loci and within the autosomal loci. This result indicates that there is a large variance in coalescent times. Equation (5.20) shows that the variance in ultimate coalescence time for a sample of n genes is proportional to $N_{ef}^2$ while the mean is proportional to Nef. This variance in coalescence time is sometimes called “evolutionary stochasticity” to emphasize that it is an inherent property of the evolutionary realization of a single coalescent-drift process and is not related to the sample size or other sources of error. Unfortunately, many attempts to estimate coalescent times use Eq. (5.3) (the neutral molecular clock equation) and ignore the evolutionary stochasticity of the coalescent process. Kimura and others derived Eq. (5.3) for interspecific comparisons spanning millions of years (e.g. Table 5.1).
On that time scale, the fixation of new mutations (the time forward inverse of coalescence to a single molecule) within a species was regarded as instantaneous and, therefore, not a contributor to variance. One simply has to wait for a neutral mutation to occur and go to instantaneous fixation, which is regarded as a relatively rare, random event. These assumptions yield a Poisson probability distribution (Appendix B) as the descriptor of the number of mutations accumulated under the neutral molecular clock, so the neutral clock is also subject to evolutionary stochasticity. For the Poisson clock, though, the mean equals the variance (Appendix B). When dealing with the smaller time scales of interest to a population geneticist, regarding fixation as instantaneous is a poor assumption. As shown in Eqs. (5.19) and (5.20),

[Figure 5.9: chart of TMRCA (in millions of years; vertical axis, 0 to 9) for each DNA region. Regions shown: Y-DNA; mtDNA; X-linked loci PDHA1, TNFSF5, RRM2P4, APLX, AMLEX, G6PD, HS571B2, Xq13.3, MSN/ALAS2, FIX, MAO; autosomal loci MX1, FUT2, CCR5, LACTASE, FUT6, Hbβ, CYP1A2, HFE, MS205, ECP, EDN, MC1R.]

Figure 5.9 Estimated coalescent times for 25 human DNA regions. Details and references for the DNA regions studied are given in Templeton (2005). Source: Templeton (2005).

the time to the most recent common ancestral molecule has much greater variance than a Poisson process and therefore must always be regarded as subject to extensive evolutionary error stemming from genetic drift. This variance can be seen experimentally in Buri’s experiment on genetic drift in Drosophila (Figure 4.4). These fixation times, the forward analogue of coalescent times, are plotted in Figure 5.10. Note how “spread out” the observations are in this set of identical replicates, and most of the fixation is yet to occur! In general, coalescence times have a large variance and are skewed toward older times. In Buri’s experiment, we can directly visualize the enormous variance in coalescence times associated with genetic drift under identical demographic conditions because there were 107 replicates. For most natural populations, you have only one realization of the drift process that is subject to this large variance in coalescent time. For example, we mentioned earlier that there is a human mtDNA common ancestral molecule. How long ago in the past must we go to coalesce back to this ancestral molecule? Using the standard molecular clock (Eq. (5.3)) and an estimate of μ of $10^{-8}$ per year, the time to coalescence of all mtDNA to a common ancestral molecule has been estimated to be 290 000 years ago (Stoneking et al. 1986). (Actually, the mutation rate for mtDNA, much less the neutral mutation rate, is not accurately known, introducing further error into this estimation process, but that error will be ignored for now and μ will be treated as a known constant.) This figure of 290 000, however, is subject to much error because of evolutionary stochasticity (the inherent variance associated with the clock and with the coalescent process). When evolutionary stochasticity is taken into account (ignoring sampling error, measurement error, and the considerable ambiguity in μ), the 95% confidence interval around 290 000 is 152 000–473 000 years (Templeton 1993), a

[Figure 5.10: histogram of Number of Lines (vertical axis, 0 to 60) against Generation of Fixation (horizontal axis, generations 1 to 19, with a final column for >19).]

Figure 5.10 The observed times to fixation in the 107 replicate lines of populations of 16 Drosophila melanogaster in Buri’s (1956) experiment on genetic drift. The experiment was terminated after generation 19, and over half the lines were still polymorphic at that point. These lines are indicated in the last column and would obviously take longer than 19 generations to go to fixation. Source: Buri (1956). © 1956, John Wiley & Sons.

span of over 300 000 years! Hence, genetic drift induces much variance into coalescence times that can never be eliminated by increased sampling because natural populations represent a sample size of one of the coalescent process that we can observe for any single locus or DNA region, such as mtDNA. Thus, coalescent theory teaches us to be humble in our dating of events through intraspecific studies of molecular variation.
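Evolutionary stochasticity can also be illustrated by simulation. The following sketch, which is not the procedure used in any of the studies cited above, draws each inter-coalescent waiting time from the geometric distribution implied by Eq. (5.13) and sums the draws to obtain one realization of the total coalescence time. The sample size, effective size, number of replicates, and random seed are all assumed values.

```python
import math
import random
import statistics


def simulate_tmrca(n, x, n_ef, rng):
    """One realization of the time (in generations) for n genes to coalesce to one."""
    total = 0
    for lineages in range(n, 1, -1):
        p = lineages * (lineages - 1) / (2.0 * x * n_ef)   # Eq. (5.11) with current lineages
        u = 1.0 - rng.random()                             # uniform on (0, 1]
        # Inverse-transform draw from a geometric distribution on 1, 2, 3, ...
        total += int(math.log(u) / math.log(1.0 - p)) + 1
    return total


def summarize(n=10, x=2, n_ef=1000, replicates=20000, seed=1):
    """Mean, standard deviation, and central 95% range of simulated coalescence times."""
    rng = random.Random(seed)
    times = sorted(simulate_tmrca(n, x, n_ef, rng) for _ in range(replicates))
    mean = statistics.mean(times)
    sd = statistics.pstdev(times)
    lo = times[int(0.025 * replicates)]
    hi = times[int(0.975 * replicates)]
    return mean, sd, lo, hi


if __name__ == "__main__":
    mean, sd, lo, hi = summarize()
    # Eq. (5.19) gives an expected value of 2 * 2 * 1000 * (1 - 1/10) = 3600 generations.
    print(f"mean={mean:.0f}, sd={sd:.0f}, central 95% range={lo} to {hi}")
```

Each simulated replicate corresponds to one independent locus evolving under identical demographic conditions, so the width of the simulated range reflects drift alone and cannot be narrowed by sampling more genes at a single locus.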

Coalescence with Mutation

The basic coalescent model given above ignores mutation. Now, we want to include both genetic drift and mutation as potential evolutionary forces. Consider first a sample of just two genes. By adding mutation, the two gene lineages could coalesce, mutate, or do neither in any given generation. As before, the probability that these two genes coalesce in the previous generation is 1/(xNef), and the probability that they do not coalesce in the previous generation is 1 − 1/(xNef). Assuming an infinite alleles model of mutation, the probability that two genes are identical-by-descent is the probability that the two gene lineages coalesce before a mutation occurred in either lineage. If the two genes coalesced t generations ago, this means that there were 2t DNA replication events at risk for mutation (two gene lineages each undergoing t replication events). Hence, the probability that neither gene lineage experienced any mutation over this entire time period is $(1-\mu)^{2t}$. Putting this probability together with Eq. (5.8) that describes the impact of drift, we have:

$$\text{Prob(coalescence before mutation)} = \text{Prob(identity by descent)} = \left(1 - \frac{1}{xN_{ef}}\right)^{t-1} \frac{1}{xN_{ef}} (1-\mu)^{2t} \tag{5.21}$$

That is, Prob(no coalescence for t − 1 generations) × Prob(coalescence at generation t) × Prob(no mutation in 2t DNA replications).


Now, consider the probability that we observe a mutation before coalescence. Suppose the mutation occurred t generations into the past. This means that of the 2t DNA replication events being considered, only one experienced a mutation and the other (2t − 1) replication events did not. Because we have two gene lineages, either one of them could have mutated at generation t and we do not care which one it is. Hence, there are two ways for the mutation to have occurred at generation t. The total probability of having a single mutation at generation t in 2t DNA replication events is therefore $2\mu(1-\mu)^{2t-1}$. The probability of no coalescence in these t generations is $[1 - 1/(xN_{ef})]^{t}$. Putting these two probabilities together, we have that:

$$\text{Prob(mutation before coalescence)} = \left(1 - \frac{1}{xN_{ef}}\right)^{t} 2\mu(1-\mu)^{2t-1} \tag{5.22}$$

If μ is very small and Nef is very large, then the occurrence of both coalescence and mutation during the same generation can be ignored. Therefore, if we look backwards in time until these two gene lineages either coalesce or have a mutation, the conditional probability (see Appendix B) of a mutation before coalescence given that either mutation or coalescence has occurred is:

$$\begin{aligned}
\text{Prob(mutation before coalescence} \mid \text{mutation or coalescence)} &= \frac{2\mu(1-\mu)^{2t-1}\left(1 - \frac{1}{xN_{ef}}\right)^{t}}{2\mu(1-\mu)^{2t-1}\left(1 - \frac{1}{xN_{ef}}\right)^{t} + \frac{1}{xN_{ef}}(1-\mu)^{2t}\left(1 - \frac{1}{xN_{ef}}\right)^{t-1}} \\
&= \frac{2xN_{ef}\mu - 2\mu}{2xN_{ef}\mu - 3\mu + 1}
\end{aligned} \tag{5.23}$$

If μ ≪ Nefμ (i.e. a large inbreeding effective size), and by defining θ = 2xNefμ, Eq. (5.23) simplifies to:

$$\text{Prob(mutation before coalescence} \mid \text{mutation or coalescence)} = \frac{2xN_{ef}\mu - 2\mu}{2xN_{ef}\mu - 3\mu + 1} \approx \frac{2xN_{ef}\mu}{2xN_{ef}\mu + 1} = \frac{\theta}{\theta + 1} \tag{5.24}$$
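To see how good the approximation in Eq. (5.24) is, the exact ratio on the right-hand side of Eq. (5.23) can be compared with θ/(θ + 1), taking θ = 2xNefμ. The sketch below uses hypothetical values of μ and Nef; the function names are ours.

```python
def prob_mutation_first_exact(mu, x, n_ef):
    """Right-hand side of Eq. (5.23)."""
    a = 2.0 * x * n_ef * mu
    return (a - 2.0 * mu) / (a - 3.0 * mu + 1.0)


def prob_mutation_first_approx(mu, x, n_ef):
    """Eq. (5.24): theta / (theta + 1), with theta = 2 * x * Nef * mu."""
    theta = 2.0 * x * n_ef * mu
    return theta / (theta + 1.0)


if __name__ == "__main__":
    mu, x, n_ef = 1e-5, 2, 10000        # hypothetical mutation rate and effective size
    exact = prob_mutation_first_exact(mu, x, n_ef)
    approx = prob_mutation_first_approx(mu, x, n_ef)
    print(f"exact={exact:.6f}, approx={approx:.6f}, difference={approx - exact:.2e}")
```

With these assumed values θ = 0.4, and the two expressions differ only in the sixth decimal place, which is why the simpler form is used in practice.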

Note that the probability of mutation before coalescence given that one has occurred is the same as the expected heterozygosity under random mating for an autosomal locus (x = 2) that we derived in Eq. (5.7). This makes sense because if we had mutation before coalescence given that mutation or coalescence has occurred, then the two gene lineages being compared must be different alleles (recall, our mutational model is the infinite alleles model). Since we drew the two genes from the population at random, this is equivalent to the expected heterozygosity under random mating. Hence, whether we look at the joint impact of drift and mutation in a time forward sense (Eq. 5.7) or a time backward sense (Eq. 5.24), we get the same result for the impact of the balance of mutation to genetic drift upon the level of genetic variation present in the population.

Consider the coalescent process with mutation with a sample of n genes in a diploid population of inbreeding effective size Nef = N. Using Eq. (5.17), the expected time between the (k − 1)th coalescent event and the kth event, we can calculate the expected number of mutations that occur in any given lineage between the (k − 1)th coalescent event and the kth event, assuming that mutations occur at a rate μ per generation in every DNA lineage under the infinite alleles or infinite sites model (Chapter 1), as:

$$E(\text{number of mutations between the } (k-1)\text{th and } k\text{th coalescent events}) = \frac{4N\mu}{(n-k+1)(n-k)} = \frac{\theta}{(n-k+1)(n-k)} \tag{5.25}$$


Between the (k − 1)th coalescent event and the kth event, there exist n − k + 1 DNA lineages, each at risk for mutation. Hence, the total number of mutations that are expected to be observed between the (k − 1)th coalescent event and the kth event in a sample of n genes, say $S_k$, is:

$$S_k = (n-k+1)\,\frac{\theta}{(n-k+1)(n-k)} = \frac{\theta}{n-k} \tag{5.26}$$
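The step from Eq. (5.25) to Eq. (5.26) is just a multiplication by the number of lineages at risk, as the following minimal sketch makes explicit. The function name and the example values of n and θ are illustrative assumptions.

```python
def expected_mutations_in_interval(n, k, theta):
    """Expected mutations between the (k-1)th and kth coalescent events.

    Returns (per_lineage, total): Eq. (5.25) for a single lineage and
    Eq. (5.26) summed over the n - k + 1 lineages present in the interval.
    """
    if not 1 <= k <= n - 1:
        raise ValueError("k must lie between 1 and n - 1")
    lineages = n - k + 1
    per_lineage = theta / (lineages * (n - k))   # Eq. (5.25)
    total = lineages * per_lineage               # Eq. (5.26), equal to theta / (n - k)
    return per_lineage, total


if __name__ == "__main__":
    n, theta = 6, 2.0                            # hypothetical sample size and theta
    for k in range(1, n):
        per_lineage, total = expected_mutations_in_interval(n, k, theta)
        print(f"k={k}: per lineage {per_lineage:.3f}, all lineages {total:.3f}")
```

The final interval (k = n − 1) contributes θ expected mutations and the one before it (k = n − 2) contributes θ/2, because each older interval is longer but contains fewer lineages.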

Over the entire coalescent process in which the n sampled genes trace back to a single ancestral molecule, the expected number of mutational events is, summing Eq. (5.26) over all n − 1 coalescent events:

$$S = \sum_{k=1}^{n-1} S_k = \theta \sum_{k=1}^{n-1} \frac{1}{n-k} \tag{5.27}$$

For DNA sequence data, the above equations are derived under the infinite sites model. As pointed out in Chapter 1, this assumption eliminates the possibility of mutational homoplasy and ensures that each new mutation creates a new allele or haplotype in the DNA region of interest. Under the infinite sites model, S in Eq. (5.27) also has the biological meaning of being the number of segregating (polymorphic) nucleotide sites in the DNA region of interest in the sample. Either the n-coalescent with mutation or standard neutral theory can be used to derive the site frequency spectrum (SFS), the probability distribution for the number of times each allele occurs in the sample of n. Under the assumptions of constant population size, the standard neutral model, and the infinite alleles model of mutation, the SFS is:

$$\text{Prob(number of copies of an allele} = i) = \frac{\theta}{i} \tag{5.28}$$

Rothman and Templeton (1980) derived a general form of the SFS, and it is important to note that the exact form varies even under neutrality as a function of demographic history (Bhaskar and Song 2014; Rosen et al. 2018) and the mutational model (Bhaskar et al. 2012). The dependence on demographic history is readily apparent from Eq. (5.25). This equation divides the coalescent process into the intervals between each pair of successive coalescent events. Although only a single N (actually Nef) is used for all intervals in deriving Eq. (5.26) from Eq. (5.25), the N’s could be different at different times in the past, thereby making θ a function of i in Eq. (5.26). Consequently, the n coalescent is strongly influenced by past inbreeding effective sizes and their fluctuations over time. As we will see in Chapter 7, this also means that we can obtain insight and estimates of past effective sizes over time from the n coalescent with mutation. For a large sample of genes, the sum of (n − k)−1 from k = 1 to n − 1 in Eq. (5.27) converges to 2, so the total number of mutations expected in the n-coalescent process under neutrality is 2θ when n is large. Note that the last term in this sum (corresponding to k = n − 1) is 1. Recall that the expected time it takes between the next to last (k = n − 2) and the last (k = n − 1) coalescent event is half of the total expected coalescent time for the entire sample of n genes. Thus, for large n, half of the expected mutations occur in the more ancient half of the total coalescent process and half the mutations are expected to occur in the more recent half (each of these two equal time intervals should experience θ mutational events). Now, the next to the last term in the sum of (n − k)−1 in Eq. (5.27) (corresponding to k = n− 2) equals 1/2. 
As just mentioned above, the sum up to and including the n − 2 term is about 1 for large n, so the number of mutations expected to occur between the n − 3 and n − 2 coalescent events is θ/2, and thus θ/2 are expected to occur between the present and the n − 3 coalescent event. As shown earlier, both of these expected time intervals are equal in duration.


Continuing with this argument, it is evident that the expected number of mutations is uniformly distributed across time. Thus, in a large sample of genes, any single mutation that occurred during the coalescent process is as likely to be recent as ancient. If we sequence a DNA region either with a high mutation rate and/or sequence a large number of nucleotides to ensure that θ for the total sequenced region is large, we should encounter many recently arisen mutational events and their associated haplotype states. Collectively, young haplotypes are common in large samples with large θ even though any particular young haplotype is expected to be globally rare.

Genealogies, Gene Trees, and Haplotype Trees

With mutation in the model, distinctions can now be made among genealogies, gene trees, and haplotype trees. Genealogies describe all the ancestors and their mating patterns of a sample of individuals. A family pedigree (e.g. Figure 3.1, including all boxes and circles) is an example of a genealogy. Gene trees are genealogies of genes, or more commonly a specific genetic region, in a sample of individuals. Gene trees describe how different copies at a homologous DNA region are “related” by ordering coalescent events (e.g. Figure 5.8). Figure 3.1 also shows why gene trees are not the same as genealogies. The ½ entries in Figure 3.1 reflect Mendelian segregation and show that for an autosomal locus a parent only passes on one of the copies of any homologous locus that that parent received from his or her parents. Hence, for the individual at the bottom of the pedigree, all the grandparents are genealogical ancestors, but, due to Mendelian segregation, some of the grandparents may not be genetic ancestors for a particular gene locus. The probability of loss of a particular gene from a specific genealogical ancestor increases with increasing generations to that ancestor. Hence, in genealogies, the number of ancestors tends to increase with increasing generation time into the past (e.g. 2 parents, 4 grandparents, 8 great-grandparents, etc. assuming no pedigree inbreeding), but as we saw with the coalescent, gene trees lose ancestors as we go into the past until finally there is only one common genetic ancestor (e.g. Figure 5.8). Hence, a gene tree for a particular genomic region is a subset of the genealogy: the lucky winners at every generation in the Mendelian lottery. What surprises many is that this loss of genealogical ancestors extends to the whole genome, despite the advertising of many companies that claim to reveal your genealogy from genomic data (recall the discussion of ghost ancestors in Chapter 3).
Because of Mendelian segregation, any particular genealogical ancestor from n generations ago is a genetic ancestor for a particular nonrecombining locus or a nucleotide with probability (½)^n. When we consider the whole genome, the expected proportion of an individual’s genome that descends from that ancestor is also (½)^n. As n increases, this number becomes vanishingly small. As we learned in Chapter 3, recombination breaks up ancestral chromosomes into smaller and smaller segments in a probabilistic manner as n increases (Hanson 1959). However, the genome is not infinitely divisible by recombination, and recombination hotspots further constrain the locations of recombination events. The consequence of this is that as time proceeds, we inherit only a decreasing number of finite and increasingly smaller genomic regions that come from a particular genealogical ancestor. Genetic drift ensures random loss when dealing with a finite number of chromosomal segments, particularly when there are very few of them from an ancient ancestor, so, with time, it becomes increasingly likely to lose all the genetic ancestry from a given genealogical ancestor (Wiuf and Hein 1997). This occurs rather rapidly. For example, one expects only about 0.1% of one’s genome to come from an ancestor 10 generations ago, but, for human genomes, about 2/3 of one’s genealogical ancestors 10 generations ago make no genetic contribution whatsoever to one’s genome (Swamidass 2019). Genetic ghosts are genealogical ancestors that are no longer genetic ancestors

[Figure 5.11: panel (a), gene tree with mutational events marked and tips labeled A, B1, B2, C, D, E; panel (b), haplotype tree of the alleles (haplotypes) A, B, C, D, E.]

Figure 5.11 Gene trees and allele/haplotype trees. Panel (a) shows the same gene tree portrayed in Figure 5.8, but now with some mutational events added on. Each mutation creates a new, distinguishable allele, as indicated by a change in shape of the box containing the DNA molecule. The extinct, ancestral haplotype is shown only by an unboxed DNA motif. Panel (b) shows the haplotype tree associated with the gene tree in Panel (a). The only observable coalescent events are those associated with a mutational change, so each line in this tree represents a single mutational change. Letters correspond to different allelic categories that exist in the current population. Two identical copies of the B allele/haplotype are indicated by B1 and B2.

for any part of the genome for an individual or sample of individuals. Indeed, the vast majority of any person’s genealogical ancestors are ghosts and, therefore, do not appear in any gene tree. Indeed, many genealogical ancestors become super-ghosts that are simultaneously genealogical ancestors for everyone in the current population but genetic ancestors to no one (Gravel and Steel 2015). Your genetic ancestry is just a tiny subset of your genealogical ancestry, particularly for generations in the distant past. Resolution is further reduced when we turn our attention to haplotype trees. Figure 5.11a shows a simple gene tree for a sample of six copies of a homologous DNA region. This figure is a repeat of the gene tree shown in Figure 5.8, but now we allow some of the DNA replication events to have experienced mutation. Looking back through time, we can see that the A and B1 copies of the gene in Figure 5.11a coalesce prior to their coalescence with the B2 copy. However, note that the B1 and B2 genes share the same haplotype state, whereas the A gene differs from them by a single mutation. Thus, the A and B1 genes experienced mutation prior to coalescence, whereas the B1 and B2 genes coalesced prior to mutation. The gene tree shows precise information here about the genetic ancestry, including cases in which two genes have a closer common ancestor even though they are not identical (e.g. A and B1) compared to two genes that have a more distant common ancestor yet are identical in sequence (the B1 and B2 genes). Unless pedigree information is available (such as that for the captive Speke’s gazelle population, or the human population on Tristan da Cunha), such precise information about genetic ancestry is usually not known. For example, from sequence data alone, the B1 and B2 genes in Figure 5.11a are


identical and therefore indistinguishable. We would have no way of knowing from sequence data that the B1 gene is actually genealogically closer to the A gene than to its indistinguishable B2 copy. Generally, our resolution of ancestral relationships is limited to those copies of the DNA that are also distinguishable in sequence state. Such distinguishable copies are created by mutation (under the infinite alleles/sites model in the models considered in this chapter). The gene copies that differ from one another by one or more mutational events are called alleles if the DNA region we are examining corresponds to a functional gene locus, or more generally they are called haplotypes if the DNA region corresponds to something other than a traditional locus (Chapter 1) – such as a portion of a locus, non-coding DNA, a DNA region that spans several loci, or even an entire genome such as mtDNA. Haplotypes or alleles are important to us because they are the only copies of DNA that we can actually distinguish in the absence of pedigree data. The only branches in the gene tree that we can observe from sequence data are those marked by a mutation. We cannot observe the branches in the gene tree that are caused by DNA replication without mutation. Therefore, the ancestral coalescent tree observable from sequence data retains only those branches in the gene tree associated with a mutational change. This lower resolution tree is called an allele or haplotype tree. The allele or haplotype tree is the gene tree in which all branches not marked by a mutational event are eliminated. The haplotype tree corresponding to Figure 5.11a is shown in Figure 5.11b. As can be seen by contrasting Figure 5.11a and b, we cannot see all the coalescent events in the haplotype tree. We therefore group genes into their allelic or haplotype classes. The haplotype tree is a tree of genetic variation showing how mutational variation arose and created interrelated haplotypes. 
We also know the allele/haplotype frequencies in our current sample (for example, allele B in Figure 5.11 has two copies in our sample, B1 and B2; all other alleles are present in only one copy). Hence, the coalescent process that we can see only deals with the evolutionary history of alleles or haplotypes and their frequencies in our sample. In some cases, we do not know even as much information about the allele tree as that shown in Figure 5.11b. In that figure, the “black” ancestral DNA is portrayed as known, so it constitutes a root for the tree and gives us time polarity. For some allele or haplotype trees, we do not know the root of the tree. This results in an unrooted allele or haplotype tree called a haplotype network. Such unrooted networks show the mutational relationships that occurred in evolution to transform one haplotype (allele) into another, but do not indicate the temporal orientation of mutational events. Figure 5.12 shows the unrooted haplotype network corresponding to Figure 5.11b.
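The loss of resolution in going from gene copies to haplotypes can be sketched in code. The example below groups sampled copies into haplotype classes with their counts and then joins haplotypes that differ by exactly one mutation into an unrooted network. The six sequences are invented for illustration (the text does not give nucleotide states for Figure 5.11, and the resulting network is not claimed to match Figure 5.12), and the one-step rule assumes no homoplasy and no unsampled intermediate haplotypes.

```python
from collections import Counter
from itertools import combinations

# Hypothetical aligned sequences for six sampled gene copies (assumed data).
copies = {
    "A": "TCGTAC",
    "B1": "ACGTAC",
    "B2": "ACGTAC",   # identical in state to B1, so indistinguishable from it
    "C": "ACGTAG",
    "D": "ACCTAG",
    "E": "ACGAAC",
}


def hamming(seq1, seq2):
    """Number of sites at which two aligned sequences differ."""
    return sum(c1 != c2 for c1, c2 in zip(seq1, seq2))


def collapse_to_haplotypes(gene_copies):
    """Group gene copies by sequence state; values are haplotype counts in the sample."""
    return Counter(gene_copies.values())


def one_step_network(haplotypes):
    """Unrooted network edges: pairs of haplotypes separated by one mutational change."""
    return [(h1, h2) for h1, h2 in combinations(sorted(haplotypes), 2)
            if hamming(h1, h2) == 1]


if __name__ == "__main__":
    haplotypes = collapse_to_haplotypes(copies)
    for sequence, count in haplotypes.items():
        print(sequence, count)
    for edge in one_step_network(haplotypes):
        print(edge)
```

Six gene copies collapse to five haplotypes here, and the network records only which haplotypes are one mutation apart, with no information about which end of each edge is older.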

Figure 5.12 The unrooted network corresponding to the rooted allele/haplotype tree shown in Figure 5.11b. Each line in this network represents a single mutational change. Letters correspond to different allelic categories that exist in the current population.
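As a minimal sketch of the grouping step described above, the following collapses gene copies that are identical in sequence state into haplotype classes and records their sample frequencies. The sequences and the function name are hypothetical and serve only as an illustration; they are not data from Figure 5.11.

```python
from collections import Counter

def haplotype_frequencies(sequences):
    """Collapse gene copies with identical sequence states into haplotype
    classes and return the sample frequency of each class."""
    counts = Counter(sequences)
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()}

# Hypothetical sample of six gene copies; two copies are identical in state,
# so they are indistinguishable and fall into the same haplotype class.
sample = ["ACGT", "ACGA", "ACGA", "TCGA", "TCTA", "ACTT"]
freqs = haplotype_frequencies(sample)

for hap, freq in sorted(freqs.items()):
    print(hap, round(freq, 3))
```

Six gene copies yield only five haplotype classes here, mirroring the situation in which copies B1 and B2 cannot be told apart from sequence data alone.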

Genetic Drift in Large Populations and Coalescence

The first step in estimating a haplotype tree or network is to infer the haplotypes. For parts of the genome that are effectively haploid (e.g. mtDNA and X and Y chromosomes from males in humans), the haplotypes are obtained directly from many types of genetic survey data. It is also possible to directly obtain haplotypes in autosomal regions by some molecular techniques (e.g. Abelleyro et al. 2019). Haplotypes can also be inferred from multi-locus gene pool data and linkage disequilibrium patterns through several statistical methods (Climer et al. 2009, 2015; see Appendix B for more details).

Another problem in estimating a haplotype tree from genetic data is homoplasy (Chapter 1). If the same mutated form can arise repeatedly, it can obscure evolutionary history because identity-by-state no longer necessarily means identity-by-descent. Homoplasy represents a major difficulty when trying to reconstruct evolutionary trees, whether they are haplotype trees or the more traditional species trees of evolutionary biology. Dealing with homoplasy is a major issue in phylogenetics in general, and many methods have been developed for estimating evolutionary trees of species under some level of homoplasy. Many of these methods have also been applied to the problem of estimating haplotype trees within species.

One of the earliest methods developed for estimating trees from chromosomal mutations such as inversions was to find the tree that required the fewest mutations and homoplasies (Sturtevant and Dobzhansky 1936). Today, this principle of constructing evolutionary trees or networks by minimizing the number of mutational changes and homoplasy to explain derived states is known as maximum parsimony, and many computer programs are available to estimate evolutionary trees and networks under this criterion. Sturtevant and Dobzhansky (1936) also rooted their estimated inversion trees within a species by including closely related species in their sample.
Using a related species to root the intraspecific inversion network is based on the assumption that the root of the total tree that includes both intraspecific samples and one or more interspecific samples must lie somewhere on the branch connecting the intraspecific portion of the tree to the interspecific branch emerging from the intraspecific portion. This method of rooting is now called outgroup rooting. In general, outgroup rooting defines the root for the evolutionary tree of the entities (species, haplotypes, alleles, inversions, etc.) of primary interest (called the “ingroup”) as that specific entity or node in the network that connects the ingroup to another entity or set of entities (called the “outgroup”) that is thought to be more evolutionarily distant from all the ingroup entities than any ingroup entity is to another ingroup entity. The outgroup rooting method of Sturtevant and Dobzhansky (1936) is now one of the standard methods of rooting evolutionary trees of all sorts. An example of such an outgroup rooted, maximum parsimony haplotype tree is shown in Figure 5.13. Fullerton et al. (2000) sequenced 5.5 kb of the human Apoprotein E (ApoE) region in 96 individuals, revealing 23 SNPs. There was no evidence for recombination in this region. Figure 5.13 shows the haplotype network estimated for this region under maximum parsimony using a chimpanzee sequence as an outgroup. Note that there are several loops in this haplotype tree. Since time does not go in circles, there should be no loops in a true evolutionary history. Rather, each of the loops reflects ambiguity about the evolutionary history of these haplotypes due to homoplasy under the principle of maximum parsimony, although not all homoplasies are in loops. As we will see shortly, the loops are ideally broken in a true evolutionary tree, so not all of the homoplasies in a loop actually occurred. 
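The parsimony criterion can be illustrated by counting the minimum number of mutational changes that a given rooted tree requires at a single site. The sketch below uses Fitch's small-parsimony counting rule, a standard textbook algorithm swapped in here for illustration; it is not the procedure of Sturtevant and Dobzhansky (1936) or of Fullerton et al. (2000). The haplotype names, nucleotide states, and trees are hypothetical.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes at one site on a rooted binary tree.

    tree   : nested 2-tuples of leaf names, e.g. (("h1", "h2"), "h3")
    states : dict mapping each leaf name to its observed state at the site
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):            # a leaf: its state is observed
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:                           # children can share a state
            return common
        changes += 1                         # otherwise one change is required
        return left | right

    visit(tree)
    return changes

# Hypothetical site: haplotypes h1 and h3 carry T, h2 and h4 carry C.
site = {"h1": "T", "h2": "C", "h3": "T", "h4": "C"}
tree_a = (("h1", "h3"), ("h2", "h4"))   # groups identical states together
tree_b = (("h1", "h2"), ("h3", "h4"))   # forces the same change to occur twice

print(fitch_changes(tree_a, site), fitch_changes(tree_b, site))
```

Under these assumptions, tree_a explains the site with a single mutation, whereas tree_b requires two, that is, it implies a homoplasy. Maximum parsimony would prefer tree_a for this site; when several sites disagree about the preferred tree, ties between alternatives appear as the loops discussed in the text.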
However, a glance at Figure 5.13 reveals that homoplasies and mutational hotspots are common in this gene. Mutational homoplasy undermines the two most commonly used models of mutation in the population genetic literature: the infinite alleles model and the infinite sites model. When there is a substitution at a nucleotide, there are only three possible states for the new mutant nucleotide site (for example, a site with nucleotide A can only mutate to G, C, and T). Hence, at the single

Figure 5.13 The haplotype network of a 5.5 kb segment of the ApoE gene. Circles designate the haplotypes, each identified by a number (1 through 31) either inside or beside the circle. The relative sizes of the circles indicate the relative frequencies of the haplotypes in the sample. A “0” indicates an inferred intermediate haplotype that was not found in the sample. Each line represents a single mutational change, with the number indicating the mutated nucleotide position. Boxed mutational numbers indicate potential homoplasies. Arrow heads show the direction of time determined by the chimpanzee outgroup. A solid line is unambiguous under maximum parsimony, whereas all dashed lines are ambiguous alternatives under maximum parsimony. Thin lines with long dashes are ambiguous under both maximum and statistical parsimony. Dashed lines with short but thick dashes are resolved as occurring under statistical parsimony. Source: Fullerton et al. (2000). © 2000, Elsevier.

nucleotide level, we cannot justify an infinite alleles model of mutation. Second, the DNA regions being surveyed typically include many thousands of base pairs or more. If mutation is randomly distributed across these thousands of nucleotides and is relatively rare, then the chance of any single nucleotide mutating more than once should be low – the usual justification for the infinite sites model. Under this mutational model, there is no homoplasy because no nucleotide site ever mutates more than once. However, as pointed out in Chapter 1, mutation at the molecular level does not occur at each nucleotide independently but is strongly influenced by nearby nucleotides, resulting in DNA motifs that act as mutational hotspots (e.g. Table 1.2). The effects of hotspots are amplified in intraspecific data because haplotypes that are common in the gene pool are more likely to give rise to a new haplotype through mutation simply as a result of their high frequency in the population. As a result, multifurcations from common haplotypes rather than bifurcations (the norm in species trees) are common in intraspecific haplotype trees (e.g. look at the multifurcations associated with the three most common haplotypes – 1, 2, and 5 – in the ApoE haplotype tree in Figure 5.13). If one or more common haplotypes bear one or more mutational motifs, this frequency-dependent effect amplifies the impact of such hotspots, making homoplasies even more
common in intraspecific haplotype trees. This indeed is the case for the ApoE haplotype tree shown in Figure 5.13. Unlike the Lipoprotein Lipase (LPL) gene for which CG dinucleotides are mutational hotspots (Table 1.2), CG dinucleotides are not mutagenic in ApoE, perhaps because the Cs are not methylated in the germline (Chapter 1). Fullerton et al. (2000) showed instead that homoplasy is highly concentrated into two sites in the ApoE region, sites 560 and 624 that are both located within an Alu element. Alu is a family of short interspersed repeated elements of about 300 bp in length found in the genomes of primates. Alu elements are particularly abundant in the human genome, accounting for about 11% of the total DNA in our genome (Mustafina 2013). Alu sequences are thought to promote localized gene conversion (Cooper 1999), so the high rate of homoplasy at these two sites may be due to local gene conversion rather than traditional single nucleotide mutation (Fullerton et al. 2000). This mutagenic motif is found in the common haplotypes 1, 2, 4, and 5, thereby making homoplasy at these two sites extremely common in the ApoE region. Both LPL (Table 1.2) and ApoE (Figure 5.13) show that the biological rationale for using the infinite sites model for intraspecific haplotype trees is poorly supported despite its widespread use in population genetics. The fact that the mutagenic motifs in the human LPL region are different from those found in ApoE illustrates that different regions of the genome can have different types of mutational hotspots. It is therefore best to approach the analysis of any new DNA region cautiously, keeping in mind that the infinite sites model is frequently violated, but perhaps for different reasons in different genomic regions. 
As pointed out in Chapter 1, studies on interspecific phylogenetic inference have shown that using a mutational model that incorporates the context of the neighboring nucleotides versus models of independent nucleotide mutation has a large and highly significant effect on tree estimation (Siepel and Haussler 2004; Baele et al. 2008; Bérard and Guéguen 2012; Chachick and Tanay 2012). The benefits of incorporating context dependency unfortunately come at the expense of computational efficiency, so these context-dependent models are rarely used in phylogenetic inference. One possible solution is based on the fact that comparisons of haplotype backgrounds often allow us to distinguish between identity-by-descent versus identity-by-state (e.g. Figure 2.7). Indeed, the very construction of a haplotype tree under the common phylogenetic inference procedure of maximum parsimony reveals many potential homoplasies in a conservative fashion (Figure 5.13 and Templeton et al. 2000c) without having to invoke any mutational model at all. Nevertheless, standard maximum parsimony typically leaves many unresolved ambiguities in intraspecific haplotype trees due to homoplasy. Maximum parsimony is only concerned with minimizing the total number of mutations in the haplotype network. Under maximum parsimony, many loops of phylogenetic ambiguity exist, as shown in Figure 5.13. Because time is not circular, we know that these loops have to be broken in a true evolutionary tree. In the ApoE maximum parsimony tree, there are two loops of four mutational changes and one double loop (Figure 5.13). As shown in Figure 5.14a, each of the loops of four can be broken in 4 different ways, and the double loop can be broken in 15 different ways (Templeton 1997). Every way of breaking these loops results in the same number of total mutations in the overall tree. 
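The counts of ways to break a loop can be checked by brute force: delete edges until the network is a tree that still connects every haplotype, and count the distinct outcomes. The sketch below assumes, purely for illustration, that the double loop consists of two four-step loops sharing one branch; the actual topology in Figure 5.13 is not reproduced here, and the node labels are hypothetical.

```python
from itertools import combinations

def is_tree(nodes, kept_edges):
    """True if kept_edges contain no loop (checked with a simple union-find)."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for a, b in kept_edges:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:          # this edge would close a loop
            return False
        parent[root_a] = root_b
    return True

def count_loop_breaks(nodes, edges):
    """Count the loop-free networks that keep every node connected.
    A tree on k nodes has exactly k - 1 edges, and k - 1 loop-free edges
    on k nodes necessarily connect all of them."""
    n_keep = len(nodes) - 1
    return sum(is_tree(nodes, kept) for kept in combinations(edges, n_keep))

# A single loop of four mutational changes.
single_nodes = ["a", "b", "c", "d"]
single_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]

# Assumed double loop: two loops of four sharing the branch c-d.
double_nodes = ["a", "b", "c", "d", "e", "f"]
double_edges = single_edges + [("c", "e"), ("e", "f"), ("f", "d")]

n_single = count_loop_breaks(single_nodes, single_edges)
n_double = count_loop_breaks(double_nodes, double_edges)
print(n_single, n_double, n_single * n_single * n_double)
```

Under this assumed topology the enumeration gives 4 resolutions for a single loop and 15 for the double loop, and because disjoint loops are resolved independently, the counts multiply across loops.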
Thus, because of homoplasy, there are 4 × 4 × 15 = 240 different and equally parsimonious evolutionary histories for the ApoE haplotypes under maximum parsimony. Although maximum parsimony can identify many homoplasies, homoplasy still can create serious difficulties in estimating haplotype networks or trees under maximum parsimony. Fortunately, coalescent theory indicates that there are other sources of information about the evolutionary history of intraspecific haplotypes besides just the total number of mutations in the tree. For example, the coalescent Eqs. (5.26) and (5.27) indicate that nucleotide divergence becomes more probable with increasing time. By summing Eq. (5.26) from the time of the mutational event that caused a new haplotype to arise to the present, the ancestral and mutated haplotypes tend to

Figure 5.14 The difference between maximum parsimony (panel (a)) and statistical parsimony (panel (b)) for four haplotypes found at the human ApoE locus. Solid lines indicate mutational changes that are fully resolved under the relevant parsimony criterion, and dashed lines indicate mutational changes that may or may not have occurred, depending upon the true evolutionary history. The small double-headed arrows indicate the actual nucleotide substitution along with its position number in the reference sequence associated with each potential mutation. The possible evolutionary histories consistent with the criteria are shown underneath the loops.

diverge more and more at the sequence level the longer the time since they diverged. Templeton et al. (1992) coupled these divergence equations with a finite sites model in which a finite set of nucleotides are subject to mutation, and multiple mutational events at each site are allowed. Using a Bayesian estimation procedure (see Appendix B), Templeton et al. (1992) quantified the chance of

a homoplasy between two haplotypes as a function of the number of observed site differences between the haplotype pair. Hence, there is information about evolutionary history not only in the total number of observable mutational events in the overall haplotype tree but also in how homoplasy is allocated among the inferred individual branches. For example, two of the loops leading to tree ambiguity in ApoE involve both sites 560 and 624 as well as a mutation at site 1575. There are four equally parsimonious ways of breaking these loops (Figure 5.14a). However, consider breaking the loop between the two haplotypes at the top with the sequences TCC and TTC. These two haplotypes differ by only a single nucleotide out of 23 variable sites and about 5500 total sites. Breaking the loop between these two haplotypes implies an evolutionary history in which three mutations separate the haplotypes TCC and TTC rather than just the one observable difference. Similarly, a break between the haplotypes TTC and ATC implies that these two haplotypes that differ by only one observable nucleotide actually had three mutational events between them before they coalesced. On the other hand, suppose we broke the loop by deleting one of the dashed arrows that connects haplotype ACT to either TCC or ATC. In either case, haplotype ACT differs from both of these alternative connecting haplotypes by two observable mutational changes. Deleting either of these dashed arrows connecting ACT to TCC or ATC would allocate an extra pair of mutations between one of these more divergent haplotype pairs. Placing an extra pair of mutations between two haplotypes that already differ at two observable nucleotide positions is far more likely than placing an extra pair of mutations between haplotypes that differ at only one nucleotide position under the Bayesian estimator of Templeton et al. (1992).
A computer program for calculating this probability under a neutral finite sites model is available at http://darwin.uvigo.es. These coalescent probabilities lead to a refinement of maximum parsimony called statistical parsimony in which ambiguous branches that allocate homoplasies between less divergent haplotypes are eliminated when the estimated probability of homoplasy is less than 0.05 as a function of the level of observable divergence. (A program for estimating statistically parsimonious networks is available at the same website given above.) For example, under statistical parsimony, there are only two ways of breaking the loops (Figure 5.14b) rather than the four under maximum parsimony (Figure 5.14a). Although this may seem to be only a modest gain in resolving the haplotype tree, many haplotype trees contain loops under maximum parsimony that can be completely resolved by statistical parsimony. Moreover, the ambiguities caused by disjoint loops are multiplicative. For example, using all of the ApoE data, we already saw that there are 4 × 4 × 15 = 240 different maximum parsimony trees for the ApoE haplotypes. Statistical parsimony reduces the possibilities to two alternatives for the two single loops and for one of the loops in the double loop (Figure 5.14), yielding a total of 2 × 2 × 2 × 4 = 32 trees under statistical parsimony. Thus, the number of evolutionary possibilities has been reduced by an order of magnitude in this case by the application of coalescent theory through statistical parsimony.

Coalescent theory provides even more sources of information about evolutionary history. As pointed out with regard to Figures 5.11 and 5.13, the potentially observable aspects of the coalescent process are the haplotype tree and the current haplotype frequencies. Castelloe and Templeton (1994) showed that there are nonrandom associations between the frequency of a haplotype and its topological position in the haplotype tree under a neutral coalescent model.
For example, a haplotype that has many copies of itself in the gene pool is much more likely to give rise to a mutational descendant than a rare haplotype simply because there are more copies at risk for mutation. As previously pointed out, this effect is amplified when a common haplotype also bears a mutagenic motif. When dealing with haplotypes near the tips of the tree (that is, relatively recent events), the current haplotype frequencies are expected to be close to the frequencies in the recent past. Therefore, if a tip haplotype (one that is connected to only one other haplotype or node in the tree) has an

ambiguous connection in the haplotype tree, the relative probabilities of the alternative connections are proportional to the relative frequencies of the potential ancestral haplotypes under neutral coalescence (Crandall and Templeton 1993). For example, haplotype 21 in Figure 5.13 is a tip haplotype, and under statistical parsimony, it can be connected to the remainder of the tree either through haplotype 1 or through haplotype 26. As can be seen from the sizes of the circles, haplotype 1 is much more common than haplotype 26 (the actual haplotype frequencies are 0.234 and 0.005, respectively). The haplotype frequencies of 1 and 26 are significantly different, so the frequency information can be used to infer topological probabilities. In particular, under neutral coalescence, the probability of haplotype 21 being connected to haplotype 1 is 0.234/(0.234 + 0.005) = 0.98 and the probability of it being connected to haplotype 26 is 0.005/(0.234 + 0.005) = 0.02. Hence, we can, with great confidence, discriminate between these two alternatives, reducing the number of statistically parsimonious trees down to 16 of the 240 maximum parsimony trees as the loop in the upper-left corner of Figure 5.13 is now fully resolved. DeWitt et al. (2018) have confirmed through both simulations and experiment that much information for phylogenetic inference is contained in frequencies, so this source of information is quite valuable for inference of intraspecific haplotype trees and should not be squandered.

The genetic properties of the sequenced region provide yet another source of evolutionary information. As indicated above, mutations are not uniformly distributed in many DNA regions, and therefore, homoplasy is nonrandomly distributed across the variable sites, as was shown for the LPL region. This information can be used to resolve the double loop in Figure 5.13.
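The frequency-based weighting applied above to the tip haplotype 21 amounts to normalizing the frequencies of the candidate ancestral haplotypes. A minimal sketch, using the two frequencies quoted in the text (the function name and dictionary keys are hypothetical):

```python
def connection_probabilities(candidate_freqs):
    """Relative probability that a tip haplotype is connected to each candidate
    ancestral haplotype, taken as proportional to the candidate's frequency."""
    total = sum(candidate_freqs.values())
    return {hap: freq / total for hap, freq in candidate_freqs.items()}

# Frequencies of the two candidate connections for haplotype 21 (from the text).
probs = connection_probabilities({"haplotype 1": 0.234, "haplotype 26": 0.005})

for hap, p in probs.items():
    print(hap, round(p, 2))
```

The same normalization extends to any number of alternative connections, although it is informative only when the candidate frequencies differ significantly.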
As can be seen from Figure 5.13, the double loop is caused either by homoplasy at site 560, one of the Alu sites that displays much homoplasy even outside this loop, or by homoplasy at sites 2440 and 3937, which otherwise show no homoplasy at all. Moreover, site 3937 is one of the amino acid replacement sites, which is the rarest class of substitutions in this region. Therefore, the pattern of mutation in this DNA region indicates that the double loop is best resolved through homoplasy at site 560 with no homoplasy at sites 2440 and 3937. When this resolution is coupled with the previous resolutions achieved through statistical parsimony and haplotype frequencies, only two ambiguities remain in the entire haplotype network – the two alternatives of connecting haplotype 22 and two alternatives that eliminate the homoplasy at site 2440 in the double loop. The two alternative haplotypes to which haplotype 22 may be connected are both rare and do not have significantly different allele frequencies, and the same is true for the two possible branches involving site 2440 (although the haplotype 6 to haplotype 2 branch is more likely). Moreover, all alternative pathways involve homoplasy at one of the Alu sites 560 and 624. Although this one ambiguity cannot be fully resolved, we have gone from 240 maximum parsimony solutions of this haplotype tree to only four possible haplotype trees by using coalescent theory and our knowledge of fundamental mutational properties, as shown in Figure 5.15.

Parsimony is an example of using character states to estimate a phylogenetic tree in which the state and transitions of particular nucleotides or other types of mutation are used for inference. There are several other character state methods for inferring phylogenetic trees based on the statistical principles of maximum likelihood and Bayesian procedures (Appendix B).
Character state approaches yield much detailed information about the inferred tree, but one disadvantage is computational time, although this problem is diminishing due to more efficient algorithms and increases in computational speed and power. Finding the optimal character state tree(s) is still often a computational challenge. For n distinct haplotypes, there are a maximum of $(2n - 3)!/[2^{n-2}(n - 2)!]$ distinct rooted trees. Just 10 haplotypes have 34 459 425 possible trees. With modern data sets, it often happens that there are so many haplotypes that most computer programs cannot examine all the possibilities and instead use heuristic algorithms that are not

Figure 5.15 The statistical parsimony haplotype tree for the ApoE gene, incorporating additional resolutions based on haplotype frequencies and knowledge of mutational motifs. Only two ambiguities exist in this tree, as indicated by dashed lines. The nucleotide sites that have experienced multiple, independent mutational events to the same state are circled.

guaranteed to find the optimal solutions. Moreover, implementing even these heuristic algorithms with data sets with many haplotypes can be time-consuming with even the fastest computers. There exist other tree estimating algorithms that are much less computationally intensive. Almost all of these are based upon some form of genetic distance, so we first need to examine what a genetic distance means. There are three main types of genetic distance in the population genetic literature: genetic distances between populations, genetic distances between individuals, and genetic distances between molecules. For now, we are only interested in molecule genetic distances that ideally measure the number of mutational events that occurred in the two molecular lineages being compared back to their coalescence to a common ancestral molecule. Population and individual genetic distances will be discussed in the next two chapters, but we note for now that most population and individual genetic distances can reach even their highest values in the complete absence of mutation. Therefore, these types of genetic distance should never be equated or confused. Unfortunately, all these genetic distance measures are typically called “genetic distance” in the literature, so readers have to infer which type of distance is being used from the context. In this text, we will always use the phrases “molecule genetic distance,” “individual genetic distance,” and “population genetic distance” to make this distinction explicit. Many mathematically distinct distance measures exist within these classes, and we will not attempt an exhaustive survey. Rather, we will give only the simplest types of molecule genetic distance designed for DNA sequence data. The simplest molecule genetic distance measure for DNA comparisons is just the observed number of nucleotide differences between the molecules being

compared, often symbolized by π. The problem with this simple measure is homoplasy. When the same nucleotide mutates more than once, we still see this as only a single observable difference or even no difference at all if the second mutation reverts back to the ancestral state within one DNA lineage or if parallel mutational events occurred in both DNA lineages. Because any nucleotide can take on only four distinct states, any time a single nucleotide site undergoes more than one mutation, it is likely that a reversal or parallelism occurred. This problem is greatly accentuated by nonrandom mutation at the molecular level. For example, methylated C-G dinucleotides are prone to specifically mutating to a T-G dinucleotide (Chapter 1), thereby making homoplasy even more likely than mutation to a random, alternative nucleotide. This causes the observed number of differences to underestimate the actual number of mutational events separating two DNA molecules, which is our ideal standard for a molecule genetic distance. One of the simplest molecule genetic distances that corrects for multiple mutational hits is the Jukes and Cantor distance (Jukes and Cantor 1969). The derivation of this molecule genetic distance is given in Box 5.1, where it is shown that under the assumptions of neutrality, all sites mutating at the same homogeneous rate of μ, and all mutations being equally likely to go to any of the three alternative nucleotide states:

$$D_{JC} = -\frac{3}{4}\,\ln\left(1 - \frac{4}{3}\pi\right) \qquad (5.29)$$

where DJC is the Jukes and Cantor molecule genetic distance and π is the observed number of nucleotides that are different divided by the total number of nucleotides being compared. Figure 5.16 shows a plot of the Jukes and Cantor molecule genetic distance as a function of π, the observed proportion of nucleotides that differ between the DNA molecules being compared, as well as a plot of π against itself. As can be seen, when DJC is small, there is little difference between DJC and π. Thus, unobserved mutations become common only when the observed divergence level (π) becomes large. That is, when two DNA molecules show few observed nucleotide differences, it is unlikely that unseen homoplasies have occurred in their evolutionary history. This is exactly the same property used to justify statistical parsimony. The Jukes and Cantor model assumes neutrality, a constant mutation rate that applies uniformly and independently to all nucleotide sites, and an equal probability of mutating to all three alternative nucleotide states. There are many ways of deviating from this idealized set of assumptions, and there are therefore many other molecule genetic distance measures designed to deal with more complicated models of mutation. As already shown by the contrast of LPL and ApoE, the underlying model of mutation that is most appropriate can vary considerably from one DNA region to the next. In general, as the molecule genetic distances get larger, different molecule genetic distances become more sensitive to the underlying mutation model, and the model used can sometimes have a major impact on all subsequent inferences. Therefore, it is important to examine each DNA region carefully and then choose the molecule genetic distance that is appropriate for that region. This involves looking at overall base composition, evidence for transition/transversion biases in the mutational process, mutagenic sites, and other sources of mutational rate heterogeneity across nucleotides. 
Posada (2008) has developed a computer program to aid in such a search, although it does not incorporate multi-nucleotide mutagenic motifs and the nonindependence across nucleotides of mutagenesis (Chapter 1). Once the problem of multiple mutational events has been dealt with through an appropriate molecule genetic distance, a haplotype tree can be estimated through one of several algorithms that use molecule genetic distances instead of character states. One of the more popular ones is the

Box 5.1 The Jukes and Cantor Molecule Genetic Distance

Consider a single nucleotide site that has a probability μ of mutating per unit time (only neutral mutations are allowed). This model assumes that when a nucleotide site mutates it is equally likely to mutate to any of the three other nucleotide states. Suppose further that mutation is such a rare occurrence that in any time unit at most one of the two DNA lineages being compared is likely to mutate, not both. Finally, let $p_t$ be the probability that the nucleotide site is in the same state in the two DNA molecules being compared given they coalesced t time units ago. Note that $p_t$ refers to identity-by-state and is observable from the current sequences. Then, with the assumptions made above, the probability that the two homologous sites are identical at time t + 1 is the probability that the sites were identical at time t and that no mutation occurred, plus the probability that they were not identical at time t but that one molecule mutated to the state of the other, that is:

$$p_{t+1} = p_t (1 - \mu)^2 + (1 - p_t)\frac{2\mu}{3} \approx (1 - 2\mu)\,p_t + \frac{2\mu}{3}(1 - p_t)$$

with the approximation requiring μ to be small. The above equation can be rearranged as:

$$\Delta p = p_{t+1} - p_t = -2\mu p_t + \frac{2\mu}{3}(1 - p_t) = -\frac{8}{3}\mu p_t + \frac{2}{3}\mu$$

Approximating the above by a differential equation yields:

$$\frac{dp_t}{dt} = -\frac{8}{3}\mu p_t + \frac{2}{3}\mu$$

and the solution to this differential equation (with $p_0 = 1$) is:

$$p_t = \frac{1 + 3e^{-8\mu t/3}}{4}$$

A molecule genetic distance ideally measures the total number of mutations that occurred between the two DNA molecules being compared, not their observed number of differences. Jukes and Cantor used the neutral model throughout, so the expected total number of mutations between the two molecules under neutrality is 2μt. Therefore, we want to extract 2μt from the equation given above to obtain the expected total number of mutations, both observed and unobserved. The extraction proceeds as follows:

$$p_t = \frac{1}{4} + \frac{3}{4}e^{-8\mu t/3}$$

$$\frac{3}{4}e^{-8\mu t/3} = p_t - \frac{1}{4}$$

$$e^{-8\mu t/3} = \frac{4}{3}p_t - \frac{1}{3}$$

$$-\frac{8}{3}\mu t = \ln\left(\frac{4}{3}p_t - \frac{1}{3}\right)$$

$$2\mu t = -\frac{3}{4}\ln\left(\frac{4}{3}p_t - \frac{1}{3}\right)$$

where “ln” is the natural logarithm operator. The above equation refers to only a single nucleotide, so the observed $p_t$ is either 0 or 1. Hence, this equation will not yield biologically meaningful results when applied to just a single nucleotide. Therefore, Jukes and Cantor (1969) assumed that the same set of assumptions is valid for all the nucleotides in the sequenced portion of the two molecules being compared and that mutation occurs independently at all nucleotides. Defining π as the observed number of nucleotides that are different divided by the total number of nucleotides being compared, Jukes and Cantor noted that $p_t$ is estimated by 1 − π. Hence, substituting 1 − π for $p_t$ yields:

$$2\mu t = -\frac{3}{4}\ln\left(1 - \frac{4}{3}\pi\right) = D_{JC}$$

where $D_{JC}$ is the Jukes and Cantor molecule genetic distance.
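The final formula of Box 5.1 can be tried out numerically. The following is a minimal Python sketch, assuming two already aligned, gap-free sequences of equal length; the function name `jukes_cantor_distance` and the toy sequences are illustrative choices, not part of the original text.

```python
import math


def jukes_cantor_distance(seq1, seq2):
    """Return (pi, D_JC) for two aligned sequences of equal length.

    pi   : observed proportion of sites that differ
    D_JC : -(3/4) * ln(1 - (4/3) * pi), the Jukes and Cantor distance
    Assumption of this sketch: no gaps or ambiguous bases are present.
    """
    if len(seq1) != len(seq2) or len(seq1) == 0:
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    differences = sum(a != b for a, b in zip(seq1, seq2))
    pi = differences / len(seq1)
    # The logarithm is undefined once pi reaches 3/4 (sequences look random).
    if pi >= 0.75:
        return pi, math.inf
    d_jc = -0.75 * math.log(1.0 - (4.0 / 3.0) * pi)
    return pi, d_jc


if __name__ == "__main__":
    # Hypothetical toy sequences: one difference in ten sites.
    pi, d_jc = jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAA")
    print(pi, d_jc)
```

For small π the two measures are nearly equal, and the corrected distance exceeds π increasingly as π grows, because more of the mutations are hidden by multiple hits at the same site.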

Figure 5.16 A plot of the Jukes and Cantor molecule genetic distance (solid line) and π, the observed proportion of nucleotides that differ between the DNA molecules being compared (dashed line). Horizontal axis: observed proportion of different nucleotides, π. Vertical axis: molecule genetic distance, DJC or π.

algorithm of neighbor-joining (Saitou and Nei 1987) that estimates an evolutionary tree by grouping together the entities that are close together with respect to a molecule genetic distance measure. The details for estimating trees under neighbor-joining are given in Box 5.2. Many computer programs exist that can determine the neighbor-joining tree rapidly even for large data sets containing many haplotypes. This is a great advantage over character state approaches, such as parsimony, maximum likelihood or Bayesian. Unlike parsimony, neighbor-joining always produces only a single tree. This is a great disadvantage. The loops that appear in a statistical parsimony network or the multiple solutions under maximum parsimony are excellent reminders that we do not really know the true evolutionary history of the haplotypes; rather, we are only estimating this history, often with some error or ambiguity. Indeed, it is often best to estimate a haplotype tree by more than one method as another way of assessing this ambiguity. For example, Templeton et al. (2000a) estimated the LPL haplotype trees using both statistical parsimony and neighbor-joining. Many common phylogenetic programs allow users to overlay the character state changes upon the branches estimated by neighbor-joining, thereby allowing a direct


Box 5.2 The Neighbor-Joining Method of Tree Estimation

Step 1 in neighbor-joining is the calculation of the net molecule genetic distance of each haplotype from all other haplotypes in the sample. Letting $d_{ik}$ be the molecule genetic distance between haplotype i and haplotype k, the net molecule genetic distance for haplotype i is:

$$r_i = \sum_{k=1}^{n} d_{ik}$$

where n is the number of haplotypes in the sample. The net distances are used to evaluate violations of the molecular clock model. For example, suppose one haplotype lineage experienced a much higher rate of accumulation of mutations than the remaining haplotypes. Then, the haplotypes from that fast-evolving lineage would all have high r’s.

Step 2 is to use the net distances to create a new set of rate-corrected molecule genetic distances as:

$$\delta_{ij} = d_{ij} - \frac{r_i + r_j}{n - 2}$$

Step 3 is the actual neighbor-joining. The two haplotypes with the smallest $\delta_{ij}$ are placed on a common branch in the estimated evolutionary tree. That means that haplotypes i and j are now connected to a common node, say u, and the node u in turn is connected to the remaining haplotypes. Haplotypes i and j are now removed from the estimation procedure and replaced by their common node, u.

Step 4 is to calculate the molecule genetic distances from each of the remaining haplotypes to node u as $d_{ku} = (d_{ik} + d_{jk} - d_{ij})/2$.

Step 5 is to decrease n by 1, reflecting the fact that two haplotypes (i and j) have been replaced by a single node (u).

Steps 1 through 5 are then repeated until only one branch is left, thereby producing the neighbor-joining tree.
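The five steps of Box 5.2 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `neighbor_joining`, the nested-tuple representation of the tree, and the toy distance matrix are all hypothetical choices. Branch lengths are not computed, because the box does not describe them, and ties in the rate-corrected distances are broken by taking the first pair encountered.

```python
def neighbor_joining(names, dist):
    """Return the neighbor-joining topology as nested tuples.

    names : list of unique haplotype labels
    dist  : symmetric matrix (list of lists) of molecule genetic distances
    """
    nodes = list(names)
    # Distances stored as a dict of dicts so that new nodes can be added.
    d = {a: {b: float(dist[i][j]) for j, b in enumerate(nodes)}
         for i, a in enumerate(nodes)}

    while len(nodes) > 2:
        n = len(nodes)
        # Step 1: net distance of each haplotype from all others.
        r = {i: sum(d[i][k] for k in nodes) for i in nodes}
        # Steps 2 and 3: find the pair with the smallest rate-corrected distance.
        best = None
        for x in range(n):
            for y in range(x + 1, n):
                i, j = nodes[x], nodes[y]
                delta = d[i][j] - (r[i] + r[j]) / (n - 2)
                if best is None or delta < best[0]:
                    best = (delta, i, j)
        _, i, j = best
        u = (i, j)  # the new common node replaces i and j
        # Step 4: distances from each remaining haplotype k to node u.
        d[u] = {u: 0.0}
        for k in nodes:
            if k != i and k != j:
                d[u][k] = d[k][u] = (d[i][k] + d[j][k] - d[i][j]) / 2
        # Step 5: n decreases by one because i and j are replaced by u.
        nodes = [k for k in nodes if k != i and k != j] + [u]

    # The last two nodes are joined by the final branch.
    return tuple(nodes)


if __name__ == "__main__":
    # Hypothetical additive distances among four haplotypes.
    labels = ["A", "B", "C", "D"]
    matrix = [
        [0, 3, 5, 6],
        [3, 0, 6, 7],
        [5, 6, 0, 7],
        [6, 7, 7, 0],
    ]
    print(neighbor_joining(labels, matrix))
```

In this toy matrix the distances are exactly additive on a tree that pairs A with B and C with D, so the sketch groups them that way. With real data the distances are rarely additive, and established phylogenetic software should be preferred over a sketch like this one.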

comparison of parsimony, Bayesian and maximum-likelihood trees with neighbor-joining trees. This was done for the estimated LPL haplotype trees. The resulting statistical parsimony and neighbor-joining trees had some differences, but the differences were not statistically significant using the character-state tree comparison test of Templeton (1983a, 1987a). Such congruence indicates that these two very different algorithms for tree estimation are detecting a common evolutionary signal. Another potential disadvantage of neighbor-joining is that it reduces all of the mutational changes at the nucleotide sites into a single number. As we saw with both the LPL and ApoE examples, the details of the inferred mutational changes at each individual site can teach us much about the types of mutation shaping haplotype diversity and allow tests of specific hypotheses (such as mutagenic motifs in the genome and the fit of the infinite sites model). As we will soon see, detailed information about which nucleotides are changing on a branch can also give us much insight into recombination. Such valuable knowledge should not be ignored, so the individual site changes should be mapped onto the neighbor-joining tree and examined thoroughly even though these changes are not explicitly used in estimating the neighbor-joining tree. Many computer programs allow this to be done.


Coalescence and Species Trees

In the previous section, we noted that amplification of homoplasy through high frequency haplotypes in an intraspecific gene pool is one example of how intraspecific haplotype trees are biologically different from evolutionary trees of species. In species trees, the tips of the tree represent current species that are all typically given the same weight, and internal nodes in the tree represent extinct ancestral species. However, in the intraspecific haplotype tree, not only are many ancestral haplotypes (internal nodes) still present in the current gene pool, but some ancestral haplotypes are the most common haplotypes in the gene pool, such as haplotypes 1, 2, and 5 in Figure 5.13, whereas most tip haplotypes are rare. This is the norm, not the exception, for intraspecific haplotype trees (Crandall and Templeton 1993; Castelloe and Templeton 1994). The reason for the persistence of ancestral forms in haplotype trees is that in general there were multiple identical copies of each haplotype in the ancestral population (from premise 1, DNA can replicate). Hence, when one copy of the ancestral form mutates to create a descendant form, the other identical copies of the ancestral form in general do not simultaneously mutate. This leads to the coexistence in the population of the newly mutated descendant form with many identical copies of its ancestral form. Such persistence of ancestral haplotype lineages within polymorphic populations can sometimes be extreme. For example, Ebersberger et al. (2007) estimated haplotype trees from 23 210 homologous genomic regions in humans, three great apes, and the rhesus monkey, with the rhesus monkey serving as the outgroup. Figure 5.17 shows the resulting tree topologies with respect to species for the subset of haplotype trees that had a significantly resolved topology.
The leftmost tree in Figure 5.17 is the species tree that most genomic regions support, and much other evidence indicates that this is indeed the correct topology for the species tree among these taxa. However, notice that only about three quarters of the haplotype trees have this species tree topology. The haplotype trees from the remaining quarter are not incorrect or wrong – these are indeed the haplotype tree topologies associated with those genomic regions. Altogether, statistically significantly resolved haplotype trees yield 10 different topologies for these species. Thus, a substantial portion of the human genome has an evolutionary history of haplotypes that is discordant with the accepted species tree, and, moreover, different regions of the human genome have discordant evolutionary histories with each other. Why is this? The answer to this question arises from the polymorphic status of the gene pools of the ancestral nodes, as shown in Figure 5.18. As stated earlier, all the genetic variation found in a species at a homologous DNA region has to coalesce back to a single DNA molecule if we go back far enough in time, but the time needed to go back to the most recent common ancestral haplotype may actually be longer than the species has existed. When the coalescent time is older than the species, we can

Figure 5.17 The three most common significantly resolved topologies of haplotype trees at the species level. Haplotypes were determined from five species: humans (H), common chimpanzees (C), gorillas (G), orangutans (O), and the rhesus monkey (R). The rhesus monkey was used as an outgroup to root the trees. Panels from left to right, with tips in the order drawn: H, C, G, O, R (9148 DNA regions, 76.58%); C, G, H, O, R (1369 DNA regions, 11.46%); H, G, C, O, R (1361 DNA regions, 11.39%). Source: Based on Ebersberger et al. (2007).

Figure 5.18 Contrasting patterns of species tree versus gene trees. Panel titles: (a) Trans-specific Polymorphism (Both Species Polymorphic); (b) Trans-specific Polymorphism (Lineage Sorting with One Species Polymorphic); (c) Intraspecific Monophyly. The inverted Ys represent the splitting of one common ancestral species into two (Species A and Species B), with time going from top to bottom (the present). The gene pool for a species at any given time is represented by four genes. The gene tree of the genes is shown by the small lines contained within the inverted Ys. The resulting haplotype trees are shown below the inverted Ys. In all panels, black circles indicate past ancestral gene lineages that did not survive to the time of the split, open squares indicate a mutated ancestral gene lineage that is monophyletic in Species A, open diamonds a mutated ancestral gene lineage that is monophyletic in Species B, open circles a mutated ancestral gene lineage showing trans-specific polymorphism in cases (a) and (b), and open stars the ancestral gene lineage that is ancestral to all three mutant derivatives. Panels (a) and (b) show two types of trans-specific polymorphism, and panel (c) shows intraspecific monophyly of the haplotype tree.

have a trans-specific polymorphism. Trans-specific polymorphisms occur when some of the haplotypes found in one species are genealogically more closely related to haplotype lineages found in a second species than to other haplotypes found in their own species (Figure 5.18a and b). As shown in Figure 5.18, trans-specific polymorphisms arise from shared ancestral polymorphisms in the two (or more) isolated populations that arise from the splitting of the ancestral population. These shared polymorphisms then undergo independent lineage sorting (genetic drift) during coalescence within the isolates, which can result in the two isolates sharing a polymorphic haplotype lineage (Figure 5.18a) or one isolate having some haplotypes that are evolutionarily closer to haplotypes in the other isolate than they are to other haplotypes in their own gene pool (Figure 5.18b). Such patterns are common in the human genome (Figure 5.17), so of the two haplotypes you received from your parents, one of your parental haplotype lineages may coalesce with a haplotype lineage found in a gorilla before it coalesces with the haplotype lineage you inherited from your other parent.


With increasing time, the process of genetic drift within isolates ultimately leads to complete coalescence within the isolates, resulting in what is known as monophyly (Figure 5.18c). A monophyletic group consists of all the descendants (haplotypes, species, etc.) from a single, common ancestral form. For a haplotype tree to be monophyletic within a species, all haplotypes found within the species must be a monophyletic group found exclusively within that species (Figure 5.18c). This is the case for three-quarters of the human genome (Figure 5.17). However, Figure 5.17 clearly warns us that haplotype trees are not necessarily evolutionary trees of species. When the isolates are not species but just isolated populations within a species, the time scales are generally much shorter and the problem of trans-isolate polymorphisms and lineage sorting is much more common. Hence, haplotype trees should never be equated to evolutionary trees of populations within a species – and indeed, such population trees may not even exist at all within a species (as will be discussed in Chapter 7). This is not to say that haplotype trees do not contain information about evolutionary history within and among species. Such information exists, but it must be extracted carefully (such analyses will be described in Chapter 7). Haplotype trees should not naively be equated to species or population trees.
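The definition of intraspecific monophyly given above can be checked mechanically on a rooted haplotype tree. The following is a minimal Python sketch, assuming the tree is written as nested tuples with string tip labels; the function names, the `species_of` mapping, and the haplotype labels (A1, A2, B1, B2) are hypothetical and only loosely mimic the panels of Figure 5.18.

```python
def tips(tree):
    """Return the list of tip labels of a nested-tuple tree."""
    if isinstance(tree, str):
        return [tree]
    labels = []
    for child in tree:
        labels.extend(tips(child))
    return labels


def clades(tree):
    """Yield the tip set below every node of a rooted nested-tuple tree."""
    yield frozenset(tips(tree))
    if not isinstance(tree, str):
        for child in tree:
            yield from clades(child)


def is_monophyletic(tree, species_of, species):
    """True if all haplotypes of `species` form a clade containing no others."""
    target = frozenset(t for t in tips(tree) if species_of[t] == species)
    return any(clade == target for clade in clades(tree))


if __name__ == "__main__":
    species_of = {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}
    # Complete lineage sorting: each species forms its own clade.
    sorted_tree = (("A1", "A2"), ("B1", "B2"))
    # Trans-specific polymorphism: each clade mixes the two species.
    shared_tree = (("A1", "B1"), ("A2", "B2"))
    for tree in (sorted_tree, shared_tree):
        print(is_monophyletic(tree, species_of, "A"),
              is_monophyletic(tree, species_of, "B"))
```

A tree such as `((("A1", "A2"), "B1"), "B2")` gives a mixed answer: the Species A haplotypes are monophyletic but those of Species B are not, which resembles the case in which only one isolate remains polymorphic for the ancestral lineages.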

Recombination and Coalescence

Another major consideration in the estimation of haplotype trees from DNA sequence data is recombination, particularly when one is dealing with nuclear DNA. Recombination creates a far more serious problem for the estimation of haplotype trees than mutational homoplasy because recombination can undercut the very idea of an evolutionary tree. When recombination occurs, a single haplotype can come to bear different DNA segments that had experienced different patterns of mutation and coalescence in the past. Thus, there is no single evolutionary history for recombinant haplotypes. When recombination is common and uniform in a DNA region, there is no meaningful haplotype tree at all. Therefore, screening for the presence of recombination is an important step in an analysis of a nuclear DNA region. A variety of algorithms exist for such screening (Crandall and Templeton 1999; Templeton et al. 2000a; Martin et al. 2011; Wilton et al. 2015; Cámara et al. 2016; Wall and Stevison 2016; Mirzaei and Wu 2017). Interestingly, a haplotype tree estimated under the assumption of no recombination can be used to detect and test for recombination. For example, Figure 5.19 presents one of eight possible statistical parsimony networks of the LPL data under the assumption of no recombination (Templeton et al. 2000c). This network is filled with homoplasy. Some of this homoplasy is undoubtedly due to multiple mutational hits (recall Table 1.2), but some of this homoplasy is highly correlated with the physical position of adjacent sites in the DNA. For example, three sites near the 5′ end of the sequenced region (polymorphic sites 7, 8, and 13 in Figure 5.19) show much homoplasy, but the same three mutations appear repeatedly together throughout the statistical parsimony network.
A screen for recombination reveals that this pattern is not due to some mysterious mutational mechanism that causes these three sites to mutate again and again in concert, but rather it is due to a recombination hotspot (Chapter 1) in the sixth intron of this gene (Figure 1.6). As a result, the motif associated with sites 7, 8, and 13, located just 5′ to the recombination hotspot, has been placed upon several different sequence states defined by the region just 3′ to the hotspot (Templeton et al. 2000a). Indeed, the mutations encountered on the “tree” that interconnect these occurrences of the 7,8,13 motif are all 3′ of this motif. A statistical test exists to test the null hypothesis that this pattern of physical locations of mutations is random, and for the 7,8,13 motif, this hypothesis is strongly rejected (Templeton et al. 2000a). Overall, some 29 recombination events were detected in the LPL region in this manner, making recombination

Figure 5.19 The statistical parsimony haplotype network based on 69 variable sites (numbered 1–69, going from 5′ to 3′) for 9.7 kb in the human LPL gene. Each line represents a single mutational change, with the number by the line indicating which of the 69 variable sites mutated at that step. Haplotypes present in the sample are indicated by an alphanumeric designation. Intermediate nodes inferred to have existed but not present in the sample are indicated by small circles, with the exception of some that are marked by a lowercase letter. These nodes are inferred to have been parental haplotypes involved in a recombination event (see Figure 5.22). Mutational lines that are fully resolved are solid; mutational lines that may or may not have occurred indicating alternative statistically parsimonious solutions are indicated by dashed lines. The appearance of the same site number more than once indicates homoplasy in this case. The three 5′ sites 7, 8, and 13 seem to have mutated to the same state repeatedly but always together, as indicated by gray shading. Source: Modified from Templeton et al. (2000c).


the major source of homoplasy in the data (Templeton et al. 2000a). These recombination inferences are identical under each of the eight statistically parsimonious trees and, indeed, even under neighbor-joining trees. Although different “tree” topologies will allocate different branch locations and number of mutations, if recombination is the true cause, these differences cancel each other out (such as a mutation and its reversal) as one traces a path through the “tree” that interconnects the apparent physical clusters of homoplasy (such as the 7, 8, 13 motif in Figure 5.19). Thus, the “tree” portrayed in Figure 5.19 does not really represent the actual evolutionary history of this DNA region since many of the “mutations” shown in that figure are not true mutations at all but rather are artifacts created by how tree estimation algorithms fail to deal with recombination (these algorithms all assume no recombination and, therefore, are forced to explain recombinants through additional mutations with pseudo-homoplasy). Because of recombination, the haplotypes in many DNA regions cannot be ordered into an evolutionary tree. Haplotype trees are therefore only a possibility, not a certainty, in DNA regions subject to recombination. Recombination also undercuts some of the coalescent theory developed earlier that assumed the infinite alleles model. For example, Eqs. (5.21) and (5.22) assume through the infinite alleles model that the only way to create a new haplotype or allele is through mutation. However, when recombination occurs, a new haplotype can be created by rearranging the phase of previously existing polymorphic sites in the absence of mutation. Hence, haplotype diversity is created by a mixture of recombination and mutation, as is the case for the LPL haplotypes.
Further complicating this picture is that recombination itself is highly mutagenic, perhaps due to the repair of meiotic double-stranded breaks that are a prelude to recombination or exposure of single-stranded DNA to mutagenic agents during its repair (Arbel-Eden and Simchen 2019). These mutagenic effects still occur even when the double-stranded break does not lead to a crossover event, which is the most common resolution of such breaks (Halldorsson et al. 2019). All of these factors undercut the biological interpretation of Eqs. (5.24), (5.26), and (5.27) that explain the expected haplotype diversity only in terms of genetic drift and mutation without any homoplasy, either mutational or recombinational. The coalescent theory results that were derived from the infinite alleles model are most applicable to those DNA regions that have experienced little or no recombination and little or no homoplasy. This difficulty with the infinite alleles model created by recombination can be avoided by applying the infinite sites model to a single nucleotide. A single nucleotide cannot recombine, and the infinite sites model assumes that any specific nucleotide mutates only once, if at all, thereby ensuring that identity-by-state is the same as identity-by-descent at that nucleotide. Because the infinite alleles model and the infinite sites model played similar roles in developing coalescent theory by ensuring that identity-by-state is the same as identity-by-descent, some authors have mistakenly equated the two (e.g. Innan et al. 2005). In reality, these two models of mutation have extremely different properties, both in strengths and in weaknesses. As just stated, the infinite alleles model when applied to multi-nucleotide DNA regions can be undermined by recombination, whereas the infinite sites model when applied to single nucleotides is not affected by recombination. 
However, as also noted above, the mutational process at the nucleotide level is characterized by many mutagenic motifs, which in turn undercut, often seriously, the infinite sites model. If a site mutates more than once during the coalescent process, it will only violate the infinite alleles assumption if the exact same mutation occurs at the same site and on exactly the same ancestral haplotype background. Otherwise, a homoplasious mutation will create a new haplotype/allele (e.g. Figure 2.7), so the infinite alleles model is more robust at the multi-site level than the infinite sites model to multiple mutational hits at the same nucleotide site.
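The difference in robustness between the two models can be illustrated with a toy bookkeeping exercise. The following is a minimal Python sketch, assuming haplotypes are written as strings of 0s and 1s (one character per site) and that each mutation flips one site on a stated parental haplotype; the function name `classify_mutations` and the three-site example are hypothetical.

```python
def classify_mutations(root, events):
    """Classify each mutation against the two mutational models.

    root   : ancestral haplotype, e.g. "000"
    events : list of (parent_haplotype, site_index) in order of occurrence
    Returns a list of (new_haplotype, violates_infinite_sites,
    violates_infinite_alleles) tuples.
    """
    seen_haplotypes = {root}
    hits_per_site = {}
    report = []
    for parent, site in events:
        if parent not in seen_haplotypes:
            raise ValueError("parent haplotype has not arisen yet: " + parent)
        flipped = "1" if parent[site] == "0" else "0"
        child = parent[:site] + flipped + parent[site + 1:]
        # Infinite sites is violated whenever a site is hit a second time.
        violates_sites = hits_per_site.get(site, 0) >= 1
        # Infinite alleles is violated only if the haplotype already exists.
        violates_alleles = child in seen_haplotypes
        hits_per_site[site] = hits_per_site.get(site, 0) + 1
        seen_haplotypes.add(child)
        report.append((child, violates_sites, violates_alleles))
    return report


if __name__ == "__main__":
    events = [
        ("000", 0),  # first hit at site 0
        ("000", 2),  # first hit at site 2
        ("001", 0),  # second hit at site 0, on a different background
        ("000", 0),  # second hit at site 0 on the same background
    ]
    for row in classify_mutations("000", events):
        print(row)
```

In this example the third event violates the infinite sites model but still yields a new haplotype, whereas the fourth recreates an existing haplotype and so violates both models. Only the former kind of event leaves a trace that can be detected as homoplasy in sequence data.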


The enhanced robustness of the infinite alleles model to mutational homoplasy can be seen in the ApoE haplotype tree (Figure 5.15). Although recombination plays no role in this DNA region, mutational homoplasy does. The sites on all branches affected by mutational homoplasy are circled in Figure 5.15, that is, all the circled mutations occurred independently two or more times in the evolutionary history of this DNA region. As can be seen, most branches on this tree are defined by mutations at nucleotide sites that experienced multiple, identical mutational events. Each of these circled sites represents a violation of the infinite sites model that did not violate the infinite alleles model because the independent mutations at a particular site occurred on different ancestral haplotype backgrounds. In general, every case of homoplasy detected in any haplotype tree based on DNA sequence data in regions with no recombination represents a violation of the infinite sites model that did not violate the infinite alleles model. Certainly, it is possible for the same mutation to occur independently upon different copies of the same ancestral haplotype, thereby violating the infinite alleles model as well. Such violations are totally invisible to us with DNA sequence data, so some degree of error is unobservable in any haplotype tree analysis. Nevertheless, as shown in Figure 5.15, many violations of the infinite sites model do not violate the infinite alleles model and thereby yield detectable homoplasy. Consequently, in the absence of recombination but in the presence of mutational motifs, it is better to use haplotypes and not nucleotides or individual SNPs as the basis for population genetic analyses.
The contrasting strengths and weaknesses of the infinite alleles versus infinite sites models leave us in somewhat of a conundrum: the infinite sites model applied to a nucleotide deals well with the problems caused by recombination but not with mutational homoplasy, whereas the infinite alleles model applied to longer stretches of DNA deals better with the problems caused by mutational homoplasy but not with recombination. What are we to do? The fact that recombination is often concentrated into recombinational hotspots (Chapter 1) suggests that we can identify regions of the genome for which the infinite alleles model is applicable to haplotypes (little to no recombination) and can partition the roles of mutation and recombination as contributors to haplotype diversity in regions of the genome that experience recombination (Templeton et al. 2000c). We return to LPL (Chapter 1) as an example of such a partitioning. Although there is no biologically meaningful evolutionary tree for the haplotypes defined by all 69 ordered variable nucleotide sites found in the entire 9.7 kb LPL region, the concentration of recombination within the sixth intron of the LPL gene (Figure 1.6) implies that meaningful evolutionary histories should exist for the regions just 5′ and 3′ of this recombination hotspot. This conclusion is also indicated by the pattern of linkage disequilibrium in the LPL region (Figure 2.6). Recombination reduces the magnitude of linkage disequilibrium (Chapter 2), so regions of high recombination generally tend to have low amounts of linkage disequilibrium. Figure 2.6 shows that the region in which the 29 inferred recombination events occurred has much less linkage disequilibrium than is found in either the 5′ or 3′ flanking sets of variable sites (Templeton et al. 2000a). Accordingly, separate haplotype trees can be estimated for the 5′ and 3′ regions through statistical parsimony (Figure 5.20).
Extensive homoplasy was observed within each of these flanking regions of low recombination, indicating serious deviations from the infinite sites model at the nucleotide level. Some of this homoplasy resulted in “loops” in the 3′ flanking region haplotype network that could not be explained by recombination (Figure 5.20). An analysis of the sites involved in homoplasy and these loops of ambiguity revealed that they were preferentially the same mutagenic sites identified in Table 1.2 (Templeton et al. 2000c). Even with these ambiguities, most of the evolutionary relationships among haplotypes were resolved in the flanking regions, so we do have a good picture of haplotype evolution in the 5′ and 3′ LPL regions.


Figure 5.20 Haplotype trees for the 5′ (panel (a)) and 3′ (panel (b)) regions on either side of the sixth intron recombination hotspot in the human LPL gene (Figures 1.6 and 2.7). Each line represents a single mutational change, with the number by the line indicating which of the 69 variable sites (numbered 1–69, going from 5′ to 3′) mutated at that step. Haplotypes present in the sample are indicated by an alphanumeric designation. In some cases, the same 5′ nucleotide state has several different 3′ states associated with it, giving rise to a set of haplotypes that have the same 5′ state. These haplotype sets are indicated by “5′-#” where # is a number in panel (a). Similarly, sometimes, the same 3′ nucleotide state has several different 5′ states associated with it, giving rise to a set of haplotypes that have the same 3′ state. These haplotype sets are indicated by “3′-#” where # is a number in panel (b). Intermediate nodes inferred to have existed but not present in the sample are indicated by small circles. Mutational lines that are fully resolved under statistical parsimony are solid; mutational lines that may or may not have occurred depending upon the true evolutionary history are indicated by dashed lines. Source: Data from Templeton et al. (2000c).

As discussed in Chapter 1, 29 statistically significant recombination events and one gene conversion event were identified in the sixth intron of the LPL gene (Figure 1.6). A statistical parsimony haplotype tree can be estimated for the haplotypes that remain after excluding the recombinants and their mutational descendants, as shown in Figure 5.21. This represents the portion of the evolutionary history of this 9.7 kb region within the LPL gene that was determined only by mutation and coalescence. Altogether, 30 of the 88 haplotypes in the entire LPL region were produced by mutations alone, with no recombination in their evolutionary history. All the remaining haplotypes were either produced by one or more recombination events or were mutational derivatives of

Figure 5.21 The estimated haplotype tree for the 9.7 kb region in the LPL gene after all recombinant and gene conversion haplotypes and their mutational descendants have been removed. The layout of the network is the same as that given in Figure 5.20. The thick lines indicate two possible resolutions of recombination event 6 (see Figure 5.22). One of these pairs of thick lines should be removed, but the symmetry of this recombination event prevents any inference on which one leads to a parental type and which one to a recombinant.

ancestral recombinant haplotypes. Note that, in this genomic region, only a minority of the current haplotype diversity was created by mutation and coalescence. To have a complete evolutionary picture, we therefore need to consider recombination and gene conversion. Because the recombination inference procedure given in Templeton et al. (2000a) identifies the parental haplotypes (even if no longer present in the current population) and the site of the cross-over or gene conversion event, it is possible to construct a graph in which, looking backwards in time, a given haplotype lineage splits into two separate lineages due to a recombination or gene conversion event – the opposite of the coalescent process. Such a graph is called an ancestral recombination graph (Nordborg 2000), and Figure 5.22 shows the ancestral recombination graph for LPL (Templeton et al. 2000c), the first ever estimated. Figure 5.22 shows that many haplotypes created by recombination or gene conversion subsequently created additional descendant haplotypes through mutations, that is, a mini-coalescent/mutation process tracing back to a single recombinant haplotype. Hence, this ancestral graph shows the impact of recombination, gene conversion, mutation, and coalescence in creating haplotype diversity in this 9.7 kb region. Figures 5.21 and 5.22 together give the complete evolutionary history of haplotype diversity in this 9.7 kb region of the human genome for this sample of individuals as affected by coalescence, mutation, gene conversion, and recombination. These reconstructions of the evolutionary history of LPL show the analytical power of haplotypes and haplotype tree analyses, even when the haplotype tree is not even a true evolutionary tree, as shown in Figure 5.19. The above examples reveal that estimating haplotype trees and ancestral recombination graphs can sometimes be a difficult and laborious process, so why bother?
We will see in later chapters that both haplotype networks and haplotype trees, even if they contain many ambiguities and some recombination, can be used as powerful tools in population genetics for investigating a variety of evolutionary processes and patterns. Hence, the effort is worthwhile.


Figure 5.22 The ancestral recombination graph of the human LPL gene. Current haplotypes are indicated by a number followed by a letter (J, N, or R), and inferred ancestral haplotypes are indicated by “Node x”, where x is a lowercase letter (Figure 5.20), or by T-# (see Figure 5.21). The symbol “×” joins the two parental haplotypes in a recombination or gene conversion event. Thick arrows point to the haplotypes created by recombination events, which are numbered 1 through 29. Below each recombination number are two numbers in parentheses, which indicate the polymorphic nucleotide sites between which the recombination event occurred. A thin arrow indicates a haplotype created by a gene conversion event. Lines broken up by small circles with a number by them indicate mutational events that accumulated in various DNA lineages, where the number indicates which of the 69 variable sites mutated. Source: Modified from Templeton et al. (2000c).


6 Gene Flow and Population Subdivision

In deriving the Hardy–Weinberg Law in Chapter 2, we assumed that the population was completely isolated. Isolation means that all individuals that contribute to the next generation come from the same population with no input from individuals from other populations. However, most species consist of not just one deme but rather many local populations or subpopulations consisting of the individuals inhabiting a geographic area from which most mating pairs are drawn that is generally small relative to the species’ total geographic distribution. Although most matings may occur within a local population, in many species, there is at least some interbreeding between individuals born into different local populations. Gene flow is genetic interchange between local populations. In Chapter 1, we noted that DNA replication implies that genes have an existence in space and time that transcends the individuals that temporally bear them. Up to now, we have been primarily focused upon a gene’s temporal existence, but, with gene flow, we begin to study a gene’s spatial existence. In this chapter, we will study the evolutionary implications of gene flow and investigate how a species can become subdivided into genetically distinct local populations when gene flow is restricted. Restricted gene flow leads to variation in the frequency of a gene over space.

Gene Flow Between Two Local Populations

We start with a simple model in which two infinitely large local populations experience gene flow by symmetrically exchanging a portion m of their gametes each generation. We will monitor the evolution of these two populations at a single autosomal locus with two neutral alleles (A and a). The basic model is illustrated in Figure 6.1. In this simple model, there is no mutation, selection, or genetic drift. For any given local population, we assume that a portion 1 − m of the gametes are sampled at random from the same local area and that a portion m of the gametes are sampled at random from the other local population’s gene pool (that is, gene flow). Letting p1 be the initial frequency of the A allele in local population 1, and letting p2 be the initial frequency of A in local population 2, then the allele frequencies in the next generation in the two local populations are:

\[
\begin{aligned}
p_1' &= (1 - m)\,p_1 + m\,p_2 \\
p_2' &= (1 - m)\,p_2 + m\,p_1
\end{aligned}
\tag{6.1}
\]

Population Genetics and Microevolutionary Theory, Second Edition. Alan R. Templeton. © 2021 John Wiley & Sons, Inc. Published 2021 by John Wiley & Sons, Inc. Companion website: www.wiley.com/go/templeton/populationgenetics2


Figure 6.1 A model of symmetrical gene flow between two populations. The boxes represent the gene pools at an autosomal locus with two alleles, A and a, for the two populations over two successive generations, with m of the genes being interchanged between the two localities and 1 − m staying within the same locality.

We can now see if evolution occurred by examining whether or not the allele frequencies in either local population change across the generations:

\[
\begin{aligned}
\Delta p_1 &= p_1' - p_1 = (1 - m)\,p_1 + m\,p_2 - p_1 = -m\,(p_1 - p_2) \\
\Delta p_2 &= -m\,(p_2 - p_1)
\end{aligned}
\tag{6.2}
\]

Equations (6.2) show that gene flow acts as an evolutionary force (that is, gene flow alters allele frequencies) if the following two conditions are satisfied:

• m > 0 (the local populations have some genetic exchange and are not completely reproductively isolated), and
• p1 ≠ p2 (the local populations have genetically distinct gene pools).

In other words, gene flow is an evolutionary force when it occurs between populations with distinct gene pools. Gene flow causes evolution in a nonrandom, predictable fashion. Starting with the initial populations prior to gene flow in Figure 6.1, their genetic distinctiveness is measured by the difference in their allele frequencies, that is, d0 = p1 − p2. After one generation of gene flow, Figure 6.1 shows that:

\[
p_1' = (1 - m)\,p_1 + m\,p_2 = p_1 - m\,(p_1 - p_2) = p_1 - m\,d_0
\tag{6.3}
\]

and similarly,

\[
p_2' = p_2 + m\,d_0
\tag{6.4}
\]

Hence, the difference in gene pools between the two local populations after a single generation of gene flow is:

\[
d_1 = p_1' - p_2' = p_1 - m\,d_0 - p_2 - m\,d_0 = d_0\,(1 - 2m)
\tag{6.5}
\]


Note that Eq. (6.5) implies that |d1| < |d0| for all 0 < m < 1 and d0 ≠ 0. By using the above equations recursively, the difference in allele frequencies between the two local populations after t generations of gene flow is:

\[
d_t = d_0\,(1 - 2m)^t \to 0 \quad \text{as } t \to \infty
\tag{6.6}
\]
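The recursion in Eqs. (6.1) through (6.6) can be checked numerically. The following minimal Python sketch iterates Eq. (6.1) and compares the resulting allele frequency difference with the closed form of Eq. (6.6). The function names and the starting values (p1 = 0.9, p2 = 0.1, m = 0.05) are illustrative assumptions, not values from the text.

```python
def gene_flow_step(p1, p2, m):
    """One generation of symmetric gene flow between two demes (Eq. 6.1)."""
    return (1 - m) * p1 + m * p2, (1 - m) * p2 + m * p1


def simulate_differences(p1, p2, m, generations):
    """Return d_t = p1 - p2 for t = 0, 1, ..., generations."""
    diffs = [p1 - p2]
    for _ in range(generations):
        p1, p2 = gene_flow_step(p1, p2, m)
        diffs.append(p1 - p2)
    return diffs


# Hypothetical starting frequencies and gene flow rate.
m = 0.05
d = simulate_differences(0.9, 0.1, m, 50)
for t in (0, 1, 10, 50):
    closed_form = d[0] * (1 - 2 * m) ** t  # Eq. (6.6)
    print(f"t = {t:2d}  iterated = {d[t]:.6f}  closed form = {closed_form:.6f}")
```

The iterated and closed-form columns should agree up to floating-point rounding, and the difference shrinks by the factor 1 − 2m each generation.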

Therefore, gene flow decreases the allele frequency differences between local populations. Now, consider a special case of Figure 6.1 in which p1 = 0 and p2 = 1. In this case, the frequency of the A allele in the population 1 gene pool will go from being completely absent to being present with a frequency of m. This evolutionary change caused by gene flow mimics that of mutation. Let the mutation rate from a to A be μ; then the evolutionary change caused by mutation in a population initially lacking the A allele would be to introduce that allele with a frequency of μ. Hence, gene flow can introduce new alleles into a population, with m being the analog of the mutation rate. One major difference between gene flow and mutation as sources of new genetic variation for a local deme is that, in general, μ is constrained to take on only very small values, whereas m can be either small or large. A second major difference is that gene flow can introduce variation at many loci simultaneously, whereas mutation generally affects only one locus or nucleotide site at a time. A third major difference is that many new mutations are deleterious (Figure 5.2) and initially occur as a single copy, thereby ensuring that many are rapidly lost from the population (Chapter 5). In contrast, gene flow introduces genetic variation that has usually been around for more than one generation and can introduce multiple copies of new variants. Hence, gene flow has the potential to bring in a massive influx of new genetic variability, which is more likely to be beneficial than new mutations and which can drastically alter a local gene pool, even in only a few generations. The effects of gene flow on genetic variation between and within local populations described above can be summarized as follows: gene flow decreases genetic variability between local populations and increases genetic variability within a local population.
Recall from Chapter 4 that genetic drift causes an increase in genetic variability between populations (their allele frequencies diverge) and decreases genetic variability within a population (loss and fixation of alleles). Hence, the effects of gene flow on within and between population genetic variability are the opposite of those of genetic drift. In Chapter 2, we introduced the idea of population structure as the mechanisms or rules by which gametes are paired together in the reproducing population. We now include in those rules the exchange of gametes among local populations (gene flow). Parallel to this process-oriented definition of population structure, there is also a pattern-oriented definition: population structure is the amount of genetic variability and its distribution within and among local populations and individuals within a species. This definition emphasizes the spatial patterns of genetic variation that emerge from the rules of gametic exchange. The pattern of genotypic variability (heterozygosity versus homozygosity) among individuals within a local population is highly dependent upon the system of mating, as we saw in Chapter 3. As mentioned above, the distribution of allelic variation within and among local demes is influenced by both gene flow and genetic drift. Therefore, genetic population structure has three major components (ignoring age structure until Chapter 15):

• System of mating
• Genetic drift
• Gene flow

Because of the opposite effects of gene flow and genetic drift, the balance between drift and gene flow is a primary determinant of the genetic population structure of a species.


The concept of genetic population structure (hereafter called population structure) is critical for the remainder of this book. Genotypic variability provides the raw material for all evolutionary changes, including adaptive changes caused by natural selection. Population structure therefore determines the pattern and amount of genetic variability that is available for evolution within a species and its local populations. As will be seen later, natural selection and other evolutionary forces operate within the constraints imposed by the population structure. Accordingly, virtually all evolutionary predictions, particularly those related to adaptive evolution, must be placed in the context of population structure. Given the central importance of population structure to microevolutionary processes, we need additional tools to measure and quantify it. The tools for measuring system of mating have already been discussed in Chapter 3 and those for drift in Chapters 4 and 5, so now we need to develop measures for the balance between gene flow and drift.

The Balance of Gene Flow and Drift

Recall from Chapter 4 that to measure the impact of genetic drift upon identity-by-descent, we started with Eq. (4.3):

\[
\bar{F}(t) = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right)\bar{F}(t-1)
\]

where N is replaced by the inbreeding effective size for nonideal populations. To examine the balance between drift and mutation, we modified the above equation to yield Eq. (5.4):

\[
\bar{F}(t) = \left[\frac{1}{2N} + \left(1 - \frac{1}{2N}\right)\bar{F}(t-1)\right](1 - \mu)^2
\]

Because gene flow and mutation behave in an analogous manner with respect to genetic variation within a local deme, a similar modification of Eq. (4.3) can be used to address the following question: suppose a local deme of inbreeding effective size Nef is experiencing gene flow at a rate of m per generation from some outside source. What is the probability that two randomly drawn genes from this local deme are identical-by-descent AND came from parents from the same local population? That is, if one of the genes came from a migrant, we no longer regard it as “identical.” Effectively, this means that the outside source population or populations is/are assumed to share no identity-by-descent (F = 0) with the local deme of interest. The equation for the average probability of identity-by-descent within the local deme is then:

\[
\bar{F}(t) = \left[\frac{1}{2N_{ef}} + \left(1 - \frac{1}{2N_{ef}}\right)\bar{F}(t-1)\right](1 - m)^2
\tag{6.7}
\]

Equation (6.7) is the probability of identity by descent as a function of genetic drift in the local deme (Eq. 4.3) times the probability that both of the randomly chosen gametes came from the same local deme. At equilibrium, Eq. (6.7) yields (analogous to Eq. 5.6):

\[
\bar{F}_{eq} = \frac{1}{4N_{ef}m + 1}
\tag{6.8}
\]

if m is small such that m is much greater than m² and m is on the order of magnitude of 1/Nef or smaller.
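As a rough numerical check of the approximation in Eq. (6.8), the sketch below iterates the recursion of Eq. (6.7) until it settles and compares the result with 1/(4Nefm + 1). The function names, the starting value of zero identity, and the parameter values (Nef = 100, m = 0.01) are assumptions chosen for illustration only.

```python
def iterate_identity(n_ef, m, generations, f0=0.0):
    """Iterate Eq. (6.7) for the given number of generations, starting at f0."""
    drift = 1.0 / (2.0 * n_ef)   # new identity created by drift each generation
    local = (1.0 - m) ** 2       # both gametes drawn from the local deme
    f = f0
    for _ in range(generations):
        f = (drift + (1.0 - drift) * f) * local
    return f


def f_eq_approx(n_ef, m):
    """Approximate equilibrium identity from Eq. (6.8)."""
    return 1.0 / (4.0 * n_ef * m + 1.0)


# Hypothetical deme: N_ef = 100 and m = 0.01, so N_ef * m = 1.
iterated = iterate_identity(100, 0.01, 5000)
approx = f_eq_approx(100, 0.01)
print(f"iterated Eq. (6.7): {iterated:.4f}   Eq. (6.8): {approx:.4f}")
```

With these values the iterated equilibrium falls slightly below the approximate value of 0.20, which is the size of discrepancy expected when terms of order m² and m/Nef are dropped.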


Recall from Eq. (5.7) that the balance of mutation to genetic drift was measured by θ = 4Nefμ. The balance of gene flow to genetic drift is measured in Eq. (6.8) by a similar parameter: 4Nefm. The similarity between gene flow and mutation can also be framed in terms of a coalescent process. For example, we can determine the conditional probability that, for two genes randomly drawn from the same subpopulation, at least one lineage experienced a gene flow event before the two lineages coalesced back to a common ancestor, given that either coalescence or gene flow has occurred. In analogy to Eq. (5.24) for an autosomal, diploid locus,

\[
\text{Prob}(\text{gene flow before coalescence} \mid \text{gene flow or coalescence}) \approx \frac{4N_{ef}m}{4N_{ef}m + 1}
\tag{6.9}
\]

Since the probability of identity in this model is the probability of coalescence before gene flow given that either gene flow or coalescence has occurred, the equilibrium probability of identity in the gene flow coalescent model is simply one minus Eq. (6.9), which yields Eq. (6.8). Whether we look backward or forward in time, we obtain the same equilibrium balance of gene flow (strength proportional to m) to drift (strength proportional to 1/Nef), which is determined by the ratio of their relative strengths: m/(1/Nef) = Nefm. Note that our concept of “identity” has been altered once again from what it was in Chapter 4. Wright (1931), who first derived Eq. (6.8), defined the F̄eq in Eq. (6.8) as “Fst”, where the “st” designates this as identity-by-descent within the subpopulation relative to the total population. Therefore, we have yet another “inbreeding coefficient” in the population genetic literature that is distinct mathematically and biologically from the “inbreeding coefficients” previously used in this book. In particular, Fst does not measure identity-by-descent in the pedigree inbreeding sense (F), nor system of mating inbreeding (f), nor the impact of genetic drift within a single deme upon average identity-by-descent inbreeding (F̄); rather, the “inbreeding coefficient” Fst measures the ratio of the strength of drift to gene flow and how this ratio influences population structure in a process-oriented sense. In terms of patterns, a high value of Fst indicates that there is little genetic variation in a local population relative to the total population, whereas a small value indicates much local variation relative to the total. Hence, in terms of the pattern definition of population structure, Fst measures the proportion of genetic variation among individuals drawn from all demes that is due to genetic differences between demes. Fst is a commonly used measure of population structure in the evolutionary genetic literature.
Equation (6.8) shows us how processes can generate patterns of population structure. As m increases (gene flow becomes more powerful), Fst decreases (more variation within local demes and fewer genetic differences between them). As 1/Nef increases (drift becomes more powerful), Fst increases (less variation within local demes and more genetic differences between them). These properties are exactly as expected from the impact of genetic drift and gene flow on variation within and between local demes when considered separately. What is surprising from Eq. (6.8) is that even a small amount of gene flow can cause two populations to behave effectively as a single evolutionary lineage. For example, let Nefm = 1. Note that Nefm is not the actual number of migrating individuals per generation but rather is an effective number of migrants because it depends upon the local inbreeding effective size Nef and not the census number of individuals. Then, Fst = 1/5 = 0.20, as shown in Figure 6.2. This means, from Eq. (6.9), that 80% of the gene pairs drawn from the same subpopulation will show gene flow before coalescence. Hence, the genealogical histories of the local demes are extensively intertwined when Nefm ≥ 1. Note also that Nefm = 1 defines a transition point in the plot of Eq. (6.8) against the effective number of migrants (Figure 6.2). Fst declines only very slowly with increasing effective number of migrants when Nefm ≥ 1, but Fst rises very rapidly with decreasing effective number of migrants when Nefm ≤ 1. Accordingly, an effective number of


Figure 6.2 A plot of Fst in Eq. (6.8) versus the effective number of migrants per generation, Nefm. The plot shows the transition point at Nefm = 1, corresponding to Fst = 0.2.

migrants of one marks a biologically significant transition in the relative evolutionary importance of gene flow to drift. It is impressive that very few effective migrants are needed (only one or more per generation on average) to cause gene flow to dominate over genetic drift, leading to subpopulations that display great genetic homogeneity with one another. It is also surprising that the extent of this genealogical mixing depends only upon the effective number of migrants (Nefm) and not upon the rate of gene flow (m). For example, two subpopulations of one billion each would share 80% of their genes by exchanging only one effective individual per generation from Eq. (6.8), but so would two subpopulations of size 100. However, the rates of gene flow would be greatly different in these two cases: m = 0.000000001 for the first case and m = 0.01 for the second. Thus, very different rates of gene flow can have similar impacts upon population structure. Alternatively, identical rates of gene flow can have very different impacts on population structure. Suppose, for example, that m = 0.01 in both the cases considered above. For the two subpopulations with inbreeding effective sizes of one billion, the resulting Fst is effectively zero (2.5 × 10⁻⁸) from Eq. (6.8), whereas for the local populations of inbreeding effective size 100, Fst = 0.20. The reason why the same number of effective migrants, and not the same rate of gene flow, is needed to yield a specific value of Fst is that Fst represents a balance between the rate at which genetic drift causes subpopulations to diverge and the rate at which gene flow makes them more similar. In large populations, divergence due to genetic drift is slow, so small amounts of gene flow are effective in counterbalancing drift-induced divergence; as populations become smaller, larger and larger rates of gene flow are needed to counterbalance the increasing rate of drift-induced divergence.
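The numerical cases in this passage can be reproduced directly from Eq. (6.8). The short Python sketch below evaluates the three combinations of effective size and gene flow rate discussed above and then tabulates Fst across the transition at Nefm = 1. The function name `fst_equilibrium` and the grid of Nefm values are illustrative choices, not part of the original text.

```python
def fst_equilibrium(n_ef, m):
    """Equilibrium Fst under the drift and gene flow balance of Eq. (6.8)."""
    return 1.0 / (4.0 * n_ef * m + 1.0)


# The three cases discussed in the text.
cases = [
    ("N_ef = 1e9, m = 1e-9", 1e9, 1e-9),
    ("N_ef = 100, m = 0.01", 100, 0.01),
    ("N_ef = 1e9, m = 0.01", 1e9, 0.01),
]
for label, n_ef, m in cases:
    print(f"{label}: N_ef*m = {n_ef * m:g}, Fst = {fst_equilibrium(n_ef, m):.3g}")

# Fst on either side of the transition at N_ef*m = 1 (illustrative grid).
for nm in (0.01, 0.1, 0.5, 1, 2, 5, 10):
    print(f"N_ef*m = {nm:<5} Fst = {1.0 / (4.0 * nm + 1.0):.3f}")
```

The first two cases give the same Fst because they share the same product Nefm, while the third shows how the same rate m yields a far smaller Fst in a very large deme.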
Similarly, the ratio of the strength of gene flow to drift (Nefm) and not m alone determines the relative coalescence times of genes within and among local populations. If there is restricted gene flow among demes, it makes sense that the average time to coalescence for two genes sampled within a deme will be less than that for two genes sampled at random from the entire species. In particular, Slatkin (1991) has shown that these relative times are determined by Nefm. The exact relationship of coalescence times within and among local populations depends upon the pattern of gene flow. Consider the simple case of a species subdivided into a large number of local demes each


Figure 6.3 The island model of gene flow among multiple local demes, each of idealized size N, and each contributing a fraction m of its gametes to a common gene pool that is then distributed at random over all local demes in the same proportion.

of size Nef and each receiving a fraction m of its genes per generation from the species at large. This “island model” of gene flow among multiple local demes of identical inbreeding effective size (Figure 6.3) also leads to Eq. (6.8) and, hence, is a straightforward multi-deme extension of the two-deme model illustrated in Figure 6.1. Slatkin (1991) has shown that

\[
N_{ef}m = \frac{\bar{t}_0}{4\left(\bar{t} - \bar{t}_0\right)}
\tag{6.10}
\]

where $\bar{t}_0$ is the average time to coalescence of two genes sampled from the same subpopulation and $\bar{t}$ is the average time to coalescence of two genes sampled from the entire species. Hence, the ratio of within-deme coalescence time to entire species coalescence time is:

\[
\frac{\bar{t}_0}{\bar{t}} = \frac{4N_{ef}m}{1 + 4N_{ef}m}
\tag{6.11}
\]

Therefore, Fst can be reinterpreted in terms of coalescence times:

\[
F_{st} = \frac{\bar{t} - \bar{t}_0}{\bar{t}} \quad \text{or} \quad \frac{\bar{t}_0}{\bar{t}} = 1 - F_{st}
\tag{6.12}
\]
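Equations (6.8) and (6.10) through (6.12) can be inverted to move between Fst, the effective number of migrants, and the ratio of coalescence times. The sketch below is one way to do that in Python. The function names are hypothetical, and the input Fst = 0.156 is the human estimate discussed in this section, used here purely as a worked number.

```python
def nm_from_fst(fst):
    """Invert Eq. (6.8): effective number of migrants N_ef*m implied by Fst."""
    return (1.0 / fst - 1.0) / 4.0


def coalescence_ratio(nm):
    """Eq. (6.11): mean within-deme over mean species-wide coalescence time."""
    return 4.0 * nm / (1.0 + 4.0 * nm)


def nm_from_times(t0, t):
    """Eq. (6.10): N_ef*m from the two mean coalescence times."""
    return t0 / (4.0 * (t - t0))


fst = 0.156
nm = nm_from_fst(fst)
ratio = coalescence_ratio(nm)
print(f"N_ef*m = {nm:.2f}")
print(f"t0/t   = {ratio:.3f}  (equals 1 - Fst, Eq. 6.12)")
print(f"N_ef*m recovered from times = {nm_from_times(ratio, 1.0):.2f}")
```

All three functions assume the equilibrium island model behind Eq. (6.8); they say nothing about whether an observed Fst actually arose from that balance.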

For example, Fst = 0.156 when averaged over 109 loci for local populations of humans scattered throughout the world (Barbujani et al. 1997). From Eq. (6.8), this Fst value yields 1.4 effective migrants per generation. Thus, although there is seemingly little gene flow among human populations at a global scale in an absolute sense (a little over one effective migrant on average per generation among human subpopulations), this is sufficient to place the human species in the domain where gene flow dominates over drift, resulting in human populations at the global level showing little genetic differentiation by the standard criterion used in population genetics (Figure 6.2). An Fst value of 0.156 also means that (1 − Fst) × 100% = 84.4% of the time the gene you inherited from your mother and the gene you inherited from your father trace back to different human subpopulations even when your parents are both from the same current human subpopulation (from Eq. (6.9)). Finally, this means, from Eqs. (6.11) or (6.12), that the time it takes two genes sampled from the same local human population to coalesce is 84.4% of the time it would take two genes sampled at random over the entire human species to coalesce. In humans, there is little difference between local coalescence times and global coalescence times because of extensive gene flow. Therefore, human subpopulations even at the global level are extensively intertwined genetically. This conclusion


is compatible with the simulations of Rohde et al. (2004), which indicate that all humans would prove to be biological relatives if we could trace our ancestry back just a few thousand years, as discussed in Chapter 3. However, this interpretation is made under the assumption that the human Fst value is actually due to the equilibrium balance of gene flow and drift and not some other factors. This assumption will be examined in Chapter 7. Equation (6.7) can be extended to include mutation. If we assume that “identity” can be destroyed by both mutation and gene flow, then the appropriate analog to Eq. (6.7) is:

\[
\bar{F}(t) = \left[\frac{1}{2N_{ef}} + \left(1 - \frac{1}{2N_{ef}}\right)\bar{F}(t-1)\right]\left[(1 - \mu)(1 - m)\right]^2
\tag{6.13}
\]

If both μ and m are small, then using a Taylor’s series, we have:

\[
\bar{F}_{eq} = F_{st} = \frac{1}{4N_{ef}(\mu + m) + 1}
\tag{6.14}
\]
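The quality of the Taylor series approximation in Eq. (6.14) can be gauged by solving Eq. (6.13) exactly at equilibrium. Setting F̄(t) = F̄(t − 1) = F̄eq in Eq. (6.13) and writing a = 1/(2Nef) and c = [(1 − μ)(1 − m)]² gives F̄eq = ac / [1 − (1 − a)c]. The Python sketch below compares this exact value with Eq. (6.14). The function names and parameter values are assumptions for illustration.

```python
def f_eq_exact(n_ef, mu, m):
    """Exact fixed point of Eq. (6.13): F = a*c / (1 - (1 - a)*c)."""
    a = 1.0 / (2.0 * n_ef)
    c = ((1.0 - mu) * (1.0 - m)) ** 2
    return a * c / (1.0 - (1.0 - a) * c)


def f_eq_approx(n_ef, mu, m):
    """Approximate equilibrium from Eq. (6.14)."""
    return 1.0 / (4.0 * n_ef * (mu + m) + 1.0)


# Hypothetical parameter values with m much larger than mu.
n_ef, mu, m = 1000, 1e-5, 1e-3
print(f"exact fixed point of Eq. (6.13): {f_eq_exact(n_ef, mu, m):.5f}")
print(f"approximation, Eq. (6.14):       {f_eq_approx(n_ef, mu, m):.5f}")
```

For small μ and m the two values differ only in the third or fourth decimal place, and because μ enters only through the sum μ + m, gene flow dominates the result whenever m is much larger than μ.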

Equation (6.14) shows that the joint impact of mutation and gene flow is described by the sum of μ and m, once again emphasizing the similar role that the disparate forces of mutation and gene flow have upon genetic variation and identity-by-descent. Equation (6.14) also shows that when m is much larger than μ (frequently a realistic assumption), then gene flow dominates over mutation in interacting with genetic drift to determine population structure. In all of the above equations, Fst was defined in terms of identity-by-descent, but, in many cases, all we can really observe is identity-by-state. To get around the problem of identity-by-state not always being the same as identity-by-descent, Fst is often measured by the proportional increase in identity-by-state that occurs when sampling within versus between subpopulations (Cockerham and Weir 1987). Let Fs be the probability of identity-by-state of two genes randomly sampled within a deme, and let Ft be the probability of identity-by-state of two genes randomly sampled from the total species. If all subpopulations had identical gene pools, then Fs = Ft. But with population subdivision, we expect an increase in identity-by-state for genes sampled within the same subpopulation beyond the random background value of Ft. Note that 1 − Ft is the probability that two genes are not identical-by-state with random sampling of the total population. If Fst is now regarded as the additional probability of identity-by-state that occurs beyond random background sampling when we sample two genes from the same local deme, we have:

\[
F_s = F_t + (1 - F_t)\,F_{st} \quad \Rightarrow \quad F_{st} = \frac{F_s - F_t}{1 - F_t}
\tag{6.15}
\]
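To make Eq. (6.15) concrete, the sketch below computes Fs and Ft from known allele frequencies in each deme and then applies Eq. (6.15). It rests on simplifying assumptions that are not in the text: allele frequencies are treated as known without sampling error, the two genes are drawn with replacement, and each deme is chosen with probability equal to its weight. The function names and the two-deme example are hypothetical.

```python
def identity_within(deme_freqs, weights):
    """Fs: chance that two genes drawn from the same deme match in state."""
    return sum(w * sum(p * p for p in freqs)
               for freqs, w in zip(deme_freqs, weights))


def identity_total(deme_freqs, weights):
    """Ft: chance that two genes drawn from the pooled species match in state."""
    n_alleles = len(deme_freqs[0])
    pooled = [sum(w * freqs[a] for freqs, w in zip(deme_freqs, weights))
              for a in range(n_alleles)]
    return sum(p * p for p in pooled)


def fst_from_identity(fs, ft):
    """Eq. (6.15): proportional excess of identity within demes."""
    return (fs - ft) / (1.0 - ft)


# Hypothetical example: two equally weighted demes, two alleles each.
demes = [[0.2, 0.8], [0.8, 0.2]]
weights = [0.5, 0.5]
fs = identity_within(demes, weights)
ft = identity_total(demes, weights)
print(f"Fs = {fs:.2f}, Ft = {ft:.2f}, Fst = {fst_from_identity(fs, ft):.2f}")
```

When every deme has the same allele frequencies, Fs equals Ft and the function returns zero, matching the statement that identical gene pools give no excess identity.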

Equation (6.15) provides another way of estimating Fst through identity-by-state by randomly sampling pairs of genes drawn from the total population (Ft) and from within the subpopulations (Fs) (Davis et al. 1990). Equation (6.15) can also be used to extend the models to multiple hierarchies. All our models of gene flow so far have assumed a total population subdivided into a series of local demes or subpopulations, which are then all treated equally with respect to gene flow. However, this is often an unrealistic model. For example, most gene flow among human subpopulations occurs between subpopulations living in the same continent (Cavalli-Sforza et al. 1994). As a result, instead of subdividing the global human population simply into local demes, it is more biologically realistic to subdivide the global human population first into continental subpopulations, and then subdivide


each continental subpopulation into local intracontinental subpopulations. To deal with this hierarchy of three levels, we need three levels of sampling. Therefore, let Ft be the probability of identity-by-state of two genes randomly sampled from the total human species, Fc be the probability of identity-by-state of two genes randomly sampled from humans living in the same continent, and Fs be the probability of identity-by-state of two genes randomly sampled from the same intracontinental subpopulation. Then, in analogy to Eq. (6.15), we have:

\[
F_{ct} = \frac{F_c - F_t}{1 - F_t}, \qquad F_{sc} = \frac{F_s - F_c}{1 - F_c}
\tag{6.16}
\]

where Fct measures the increase in identity-by-descent due to sampling within continental subpopulations relative to the total human species, and Fsc measures the increase in identity-by-descent due to sampling within intracontinental local populations relative to the continental subpopulations. Fst can be recovered in this three-hierarchy model from (Wright 1969):

\[
(1 - F_{sc})(1 - F_{ct}) = \left(\frac{1 - F_s}{1 - F_c}\right)\left(\frac{1 - F_c}{1 - F_t}\right) = \frac{1 - F_s}{1 - F_t} = 1 - F_{st}
\tag{6.17}
\]
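The chain rule of Eq. (6.17) can be verified with a few lines of arithmetic. The Python sketch below applies Eq. (6.16) to three identity-by-state probabilities and confirms that the product (1 − Fsc)(1 − Fct) equals 1 − Fst. The values of Fs, Fc, and Ft are invented for illustration and are not estimates from any data set; the helper name `f_relative` is likewise hypothetical.

```python
def f_relative(f_inner, f_outer):
    """Generic form of Eqs. (6.15) and (6.16): excess identity at the inner level."""
    return (f_inner - f_outer) / (1.0 - f_outer)


# Hypothetical identity-by-state probabilities at three sampling levels.
f_s, f_c, f_t = 0.50, 0.45, 0.40

f_sc = f_relative(f_s, f_c)   # local populations within continents
f_ct = f_relative(f_c, f_t)   # continents within the total species
f_st = f_relative(f_s, f_t)   # local populations within the total species

print(f"Fsc = {f_sc:.4f}, Fct = {f_ct:.4f}, Fst = {f_st:.4f}")
print(f"(1 - Fsc)(1 - Fct) = {(1 - f_sc) * (1 - f_ct):.4f}")
print(f"1 - Fst            = {1 - f_st:.4f}")
```

Because the same helper works for any pair of adjacent levels, adding a further level of hierarchy only requires one more call and one more factor in the product.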

Equation (6.17) indicates the total Fst can be partitioned into two components: Fct that measures genetic differentiation between continental subpopulations, and Fsc that measures genetic differentiation among local populations living in the same continent. Both contribute to Fst as shown in Eq. (6.17). In particular, for humans, the Fst of 0.156 can be partitioned into a component of 0.047 that measures the relative proportion of genetic variation among local populations within continental groups and 0.108 between continental groups (Barbujani et al. 1997). Also, note that Eq. (6.17) defines a chain rule that allows F statistics for measuring population structure to be extended to an arbitrary number of levels (Wright 1969).

We saw in Chapter 3 that the inbreeding coefficient could be defined in terms of identity-by-descent (F) or by genotype and allele frequencies (f). The same is true for Fst. Up to now, we have defined Fst in terms of identity-by-descent. We saw in Chapter 4 that genetic drift influences many genetic parameters besides identity-by-descent, including the variance of allele frequencies across isolated replicate demes. This aspect of drift motivates an alternative fst in terms of variances of allele frequencies across the local demes (Cockerham and Weir 1987). No distinction is made between these two definitions in much of the population genetic literature, but, just like F and f, Fst and fst have distinct mathematical and biological meanings. Accordingly, we will always make the distinction in this book.

Consider a model in which a species is subdivided into n discrete demes where Ni is the size of the ith deme. Suppose further that the species is polymorphic at a single autosomal locus with two alleles (A and a) and that each deme has a potentially different allele frequency (due to past drift in this neutral model). Let pi be the frequency of allele A in deme i.
Let N be the total population size ($N = \sum_{i=1}^{n} N_i$) and wi the proportion of the total population that is in deme i (wi = Ni/N). For now, we assume random mating within each deme. Hence, the genotype frequencies in deme i are:

Genotype     AA      Aa       aa
Frequency    pi²     2piqi    qi²


The frequency of A in the total population is $p = \sum_{i=1}^{n} w_i p_i$. If there were no genetic subdivision (that is, all demes had identical gene pools), then, with random mating, the expected genotype frequencies in the total population would be:

Genotype     AA     Aa      aa
Frequency    p²     2pq     q²          (6.18)

However, in the general case where the demes can have different allele frequencies, the actual genotype frequencies in the total population are:

\[
\begin{aligned}
\text{Freq}(AA) &= \sum_{i=1}^{n} w_i p_i^2 \\
\text{Freq}(Aa) &= 2\sum_{i=1}^{n} w_i p_i q_i \\
\text{Freq}(aa) &= \sum_{i=1}^{n} w_i q_i^2
\end{aligned}
\tag{6.19}
\]

By definition, the variance in allele frequency across demes is:

\[
\text{Var}(p) = \sigma_p^2 = \sum_{i=1}^{n} w_i (p_i - p)^2 = \sum_{i=1}^{n} w_i p_i^2 - p^2 = \sum_{i=1}^{n} w_i q_i^2 - q^2
\tag{6.20}
\]

Substituting (6.20) into (6.19), the genotype frequencies in the total population can be expressed as

\[
\begin{aligned}
\text{Freq}(AA) &= \sum_{i=1}^{n} w_i p_i^2 - p^2 + p^2 = p^2 + \sigma_p^2 \\
\text{Freq}(aa) &= \sum_{i=1}^{n} w_i q_i^2 - q^2 + q^2 = q^2 + \sigma_p^2 \\
\text{Freq}(Aa) &= 1 - \text{Freq}(AA) - \text{Freq}(aa) = 2pq - 2\sigma_p^2
\end{aligned}
\tag{6.21}
\]

By factoring out the term 2pq from the heterozygote frequency in Eq. (6.21), the observed frequency of heterozygotes in the total population can be expressed as

\[
\text{Freq}(Aa) = 2pq\left(1 - \frac{\sigma_p^2}{pq}\right) = 2pq\,(1 - f_{st})
\tag{6.22}
\]

where $f_{st} = \sigma_p^2/(pq)$ is a standardized variance of allele frequencies across demes. In the extreme case where there is no gene flow at all (m = 0), we know from Chapter 4 that drift will eventually cause all populations to either lose or fix the A allele. Since drift has no direction, a portion p of the populations will be fixed for A, a portion q will be fixed for a, and the variance (Eq. (6.20)) becomes $p(1 - p)^2 + q(0 - p)^2 = pq$. Therefore, fst is the ratio of the actual variance in allele frequencies across demes, $\sigma_p^2$, to the theoretical maximum when there is no gene flow at all, pq. Using fst, the genotype frequencies in Eq. (6.21) can now be expressed as:

$$
\begin{aligned}
\text{Freq}(AA) &= p^2 + pq\,f_{st} \\
\text{Freq}(Aa) &= 2pq\,(1 - f_{st}) \\
\text{Freq}(aa) &= q^2 + pq\,f_{st}
\end{aligned} \tag{6.23}
$$
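The algebra leading to Eq. (6.23) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function name `wahlund` and the two-deme example values (equal weights, p = 0.2 and p = 0.8) are illustrative assumptions. It pools demes that each mate at random and confirms that the pooled heterozygote frequency equals 2pq(1 − fst).

```python
def wahlund(weights, p_by_deme):
    """Pool demes that each mate at random (hypothetical helper).

    weights   -- w_i, the proportion of the total population in deme i
    p_by_deme -- p_i, the frequency of allele A in deme i
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("deme weights must sum to 1")
    pairs = list(zip(weights, p_by_deme))
    p = sum(w * pi for w, pi in pairs)               # pooled frequency of A
    q = 1.0 - p
    if p in (0.0, 1.0):
        raise ValueError("fst is undefined when the locus is monomorphic")
    var_p = sum(w * (pi - p) ** 2 for w, pi in pairs)  # Eq. (6.20)
    return {
        "p": p,
        "f_st": var_p / (p * q),                      # standardized variance
        "AA": sum(w * pi ** 2 for w, pi in pairs),    # Eq. (6.19)
        "Aa": sum(w * 2 * pi * (1 - pi) for w, pi in pairs),
        "aa": sum(w * (1 - pi) ** 2 for w, pi in pairs),
    }


# Illustrative values only: two equally sized demes with divergent gene pools.
result = wahlund([0.5, 0.5], [0.2, 0.8])
p, q, f_st = result["p"], 1.0 - result["p"], result["f_st"]
print(result)
print("2pq(1 - fst) =", 2 * p * q * (1 - f_st))
```

With these assumed inputs, the pooled heterozygote frequency falls below the Hardy–Weinberg value 2pq even though every deme is itself in Hardy–Weinberg proportions, which is the deficiency that Eq. (6.22) describes.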


Note the resemblance between Eqs. (6.23) and (3.2) from Chapter 3. Equation (3.2) describes the deviations from Hardy–Weinberg genotype frequencies induced by system of mating inbreeding (f). Because a variance cannot be negative, fst ≥ 0, so the subdivision of the population into genetically distinct demes causes deviations from Hardy–Weinberg that are identical in form to those caused by system of mating inbreeding within demes (f > 0). This “inbreeding coefficient” is called fst because it refers to the deviation from Hardy–Weinberg at the total population level caused by allele frequency deviations in the subpopulations from the total population allele frequency. This deviation from Hardy–Weinberg genotype frequencies in the species as a whole that is caused by population subdivision is called the Wahlund effect after the man who first identified this phenomenon (Wahlund 1928).

The balance between drift and gene flow is the primary determinant of what fraction of a species’ genetic variability is available in local gene pools, but the local system of mating then takes the gene pool variation available at the gametic level and transforms it into genotypic variation at the individual level, as we saw in Chapter 3. Therefore, a full consideration of how genetic variation is distributed between demes, among individuals within a deme, and within individuals (heterozygosity versus homozygosity) requires a model that integrates the effects of system of mating, genetic drift, and gene flow – the three major components of population structure.

So far, we have only considered random mating within local populations. Now, we consider nonrandom mating within demes by letting fis be the system of mating inbreeding coefficient for individuals within a subpopulation. This is the same as the “f” introduced in Chapter 3, but now we add the subscripts to emphasize that we are considering system of mating in the context of a population consisting of several demes. 
Because fis functions in a manner identical to f in local demes, we have from Eq. (3.2) that, within each deme, the genotype frequencies are:

$$
\begin{aligned}
\text{Freq}(AA \text{ in deme } j) &= p_j^2 + p_j q_j f_{is} \\
\text{Freq}(Aa \text{ in deme } j) &= 2 p_j q_j (1 - f_{is}) \\
\text{Freq}(aa \text{ in deme } j) &= q_j^2 + p_j q_j f_{is}
\end{aligned} \tag{6.24}
$$

With respect to the total population, the AA genotype frequency is now:

$$
\begin{aligned}
\text{Freq}(AA) &= \sum_{j=1}^{n} w_j \left(p_j^2 + p_j q_j f_{is}\right) \\
&= \sum_{j=1}^{n} w_j p_j^2 + \sum_{j=1}^{n} w_j \left(p_j - p_j^2\right) f_{is} \\
&= p^2 + \sigma_p^2 + f_{is}\left(p - p^2 - \sigma_p^2\right) \\
&= p^2 + pq\left[f_{st} + f_{is}(1 - f_{st})\right]
\end{aligned} \tag{6.25}
$$

Letting fit = fst + fis(1 − fst) and performing similar derivations to Eq. (6.25) for the other genotype frequencies, we have:

$$
\begin{aligned}
\text{Freq}(AA) &= p^2 + pq\,f_{it} \\
\text{Freq}(Aa) &= 2pq\,(1 - f_{it}) \\
\text{Freq}(aa) &= q^2 + pq\,f_{it}
\end{aligned} \tag{6.26}
$$

Equation (6.26) superficially resembles Eq. (6.23), but Eq. (6.26) is a function of fit and not just fst. It also follows from fit = fst + fis(1 − fst) that (1 − fit) = 1 − fst − fis(1 − fst) = (1 − fst)(1 − fis), so the deviation of the heterozygote genotype frequency from Hardy–Weinberg at the total population


level, (1 − fit), is partitioned into a component due to the local system of mating (1 − fis) and a component due to differences in allele frequencies across local demes (1 − fst).

As an example of this decomposition, consider the snail Rumina decollata. This snail is particularly abundant in the parks and gardens along the Boulevard des Arceaux in the city of Montpellier, France. However, like most snails, it has limited dispersal capabilities, even over this area of about one hectare. Hence, despite thousands of snails living in this small area, the fst among 24 colonies along the Boulevard des Arceaux is 0.294 (Selander and Hudson 1976), nearly twice as large as the fst value of around 0.15 for humans on a global scale (Lewontin 1972). This snail is also capable of self-fertilization, and a detailed examination of its local system of mating reveals that 85% of the time the snails self-mate (Selander and Hudson 1976). As a result of much selfing within local demes, fit = 0.775, yielding fis = (fit − fst)/(1 − fst) = 0.681. In this snail, both local system of mating inbreeding (selfing in this case) and population subdivision make large contributions to an overall extreme deficiency of heterozygotes relative to Hardy–Weinberg expectations.

Another example of this decomposition is given by Ward and Neel (1976) on populations of Yanomama Indians of South America. Like many human populations, Yanomama have incest taboos. Because they live in rather small villages, an incest taboo can have a measurable impact as a deviation from Hardy–Weinberg genotype frequencies within villages (Chapter 3, Jacquard 1974), and, in this case, fis is −0.01, indicating an avoidance of system of mating inbreeding and a slight excess of observed heterozygosity within villages (Ward and Neel 1976). The small variance effective sizes within villages also result in a substantial amount of differentiation among villages, with fst = 0.073 (almost half of the global human fst). 
Hence, the overall deviation from Hardy–Weinberg genotype frequencies in the total Yanomama population is fit = fst + fis(1 − fst) = 0.073 − 0.01(0.927) = 0.064. Even though the Yanomama have a local system of mating characterized by an avoidance of inbreeding within villages (fis = −0.01 < 0), the Wahlund effect induced by the balance of drift and inter-village gene flow creates an overall heterozygosity deficiency (fit = 0.064 > 0) at the tribal level. If one ignored or was unaware of the genetic subdivision of Yanomama into villages, the deviations from expected Hardy–Weinberg genotype frequencies at the tribal level would imply that the Yanomama’s system of mating was characterized by inbreeding when in fact it is characterized by avoidance of inbreeding (fis < 0).

Because of the Wahlund effect, caution is required in interpreting deviations from Hardy–Weinberg expectations: a deficiency of heterozygotes in a sample could mean the population has system of mating inbreeding, or it could mean that the sample included individuals from different local populations with different allele frequencies.

From Eq. (6.22), an alternative expression for fst is given by:

$$
f_{st} = 1 - \frac{\text{Freq}(Aa)}{2pq} = \frac{2pq - \text{Freq}(Aa)}{2pq} = \frac{H_t - H_s}{H_t} \tag{6.27}
$$

where Ht = 2pq is the expected heterozygosity if the total population were mating at random, and Hs is the observed frequency of heterozygotes in the total population when random mating is assumed within each subpopulation. With nonrandom mating in local demes, Hs is the average expected heterozygosity under random mating within subpopulations. The definition of fst given by Eq. (6.27) is useful in extending the concept of fst to the case with multiple alleles, as expected and observed heterozygosities are easily calculated or measured regardless of the number of alleles per locus. There are several different methods of using Eq. (6.27) to estimate fst when data exist for multiple loci with multiple alleles (Holsinger and Weir 2009), depending upon how averages are taken across loci and adjustments made for uneven sample sizes. These different methods and alternative but similar statistics are often given different symbols in the population genetic literature,


such as gst (Nei 1973), θst (Weir and Cockerham 1984), and many others (Kane, https://www.molecularecologist.com/2011/03/should-i-use-fst-gst-or-d-2; Ma et al. 2015), but all are basically trying to estimate Eq. (6.27) or something similar from multi-locus, multi-allelic data. Equation (6.27) is used more commonly in the literature to measure population structure than Eq. (6.15), but readers need to be wary as many papers do not explicitly state which of the two definitions of Fst/fst is being used.

To see why the distinction of Fst versus fst is important, return to Eq. (6.8) that shows that the Fst defined in terms of probability of identity-by-descent can be related to the amount of gene flow, m, under the island model of gene flow in an equilibrium population. In the island model, a species is subdivided into a large number of local demes of equal size and with each local deme receiving a fraction m of its genes per generation from the species at large (Figure 6.3). Under this model, the variance in allele frequency across demes for an autosomal locus with two alleles reaches an equilibrium between drift and gene flow of (Li 1955):

$$
\sigma_p^2 = \frac{pq}{2N_{ev} - (2N_{ev} - 1)(1 - m)^2} \tag{6.28}
$$

Because $f_{st} = \sigma_p^2/(pq)$ for a two-allele system, we have:

$$
f_{st} = \frac{1}{2N_{ev} - (2N_{ev} - 1)(1 - m)^2} \approx \frac{1}{4N_{ev}m + 1} \tag{6.29}
$$

when m is small. Note that Eq. (6.29) is almost identical to Eq. (6.8). The only difference is that in Eq. (6.8) for Fst we have Nef and in Eq. (6.29) for fst we have Nev. This difference between Eqs. (6.8) and (6.29) points out that there are two qualitatively different ways of defining Fst/fst:

• Fst is measured through identity-by-descent or identity-by-state (Eqs. 6.8 and 6.15, or coalescent Eqs. 6.9 and 6.12)
• fst is measured through variances of allele frequencies or heterozygosities (Eqs. 6.22, 6.23, and 6.27).
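To see how close the small-m approximation in Eq. (6.29) is to the exact island-model expression, the two forms can be evaluated side by side. This is a minimal sketch; the function names and the parameter values (Nev = 50, a few values of m) are illustrative assumptions, not estimates from any data set discussed here.

```python
def fst_island_exact(n_ev, m):
    """Equilibrium fst from the exact form of Eq. (6.29)."""
    return 1.0 / (2 * n_ev - (2 * n_ev - 1) * (1 - m) ** 2)


def fst_island_approx(n_ev, m):
    """Small-m approximation from Eq. (6.29): 1 / (4 * Nev * m + 1)."""
    return 1.0 / (4 * n_ev * m + 1)


# Hypothetical variance effective size and migration rates, for illustration.
n_ev = 50
for m in (0.0, 0.001, 0.01, 0.1):
    exact = fst_island_exact(n_ev, m)
    approx = fst_island_approx(n_ev, m)
    print(f"m = {m:<6} exact = {exact:.4f}  approx = {approx:.4f}")
```

Both forms equal one when m = 0, matching the no-gene-flow limit in which the variance reaches its maximum of pq. The approximation drops a term of order m in the denominator, so the two forms drift apart as m grows.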

Both fst and Fst ideally measure the balance of gene flow to genetic drift and in many cases are similar in value. However, as we saw in Chapter 4, Nef and Nev can sometimes differ by orders of magnitude under biologically realistic conditions. Accordingly, the two alternative definitions of Fst and fst can also differ substantially because they focus upon different impacts of genetic drift. In particular, the two definitions can differ when recent events have occurred that disturb equilibrium. The identity-by-descent definitions depend upon long-term coalescent properties (e.g. Eq. 6.9) and are not as sensitive to recent disturbances as are the allele frequency variance definitions, which tend to respond more rapidly to altered conditions. For example, using an estimation procedure based upon Eq. (6.15) for Fst, Georgiadis et al. (1994) obtained an Fst value not significantly different from zero for mtDNA for African elephants sampled either within eastern Africa or within southern Africa. Using the same data with an estimation procedure based upon Eq. (6.27), Siegismund and Arctander (1995) found significant fst values of 0.16 and 0.30 for eastern and southern African elephant populations, respectively. This apparent discrepancy is expected given that human use of the habitat in these regions has fragmented and reduced the elephant populations over the last couple of hundred years in these two regions of Africa. Recall from Chapter 4 that Nef is generally larger than Nev when population size is decreasing. Hence, we expect 4Nefm > 4Nevm for these elephant populations, which means we expect Fst < fst, as observed. Fst and fst are not two alternative ways of estimating the same population structure parameter. As the elephant


example shows, Fst and fst are biologically different parameters that measure different aspects of population structure in nonideal (in sensu effective size) populations. Another limitation of estimating Fst/fst through Eq. (6.15) or (6.27) emerged as genetic data sets began to measure genetic variation at the haplotype and sequence level. When the concept of fst was first developed, genetic variation was primarily measured through allelic diversity, with alleles being different qualitative categories. One could say that two alleles were either the same or different, which is the basis of Eqs. (6.15) and (6.27), but nothing more. In our previous models of population subdivision, any pair of homologous genes was classified into one of two mutually exclusive categories: the pair was identical-by-descent or not (for defining Fst); or the pair was heterozygous or not (for defining fst). However, with the advent of haplotype and DNA sequence data, we can refine this categorization by the use of a molecule genetic distance (Chapter 5). For example, suppose we have sequence data on a 10 kb locus, and a pair of genes at this locus differs by only a single nucleotide site. Now, consider another pair of genes, but this time differing by 20 nucleotide sites. In the models used previously in this chapter, both of these pairs of genes would be placed into the same category: they are not identical-by-descent at this locus, or they are “heterozygous.” Intuitively, the second pair is much less identical than the first pair; or alternatively, the first pair has much less nucleotide heterozygosity than the second pair. These quantitative differences in nonidentity or in the amount of heterozygosity at the molecular level are ignored in our previous formulations. Taking these quantitative differences into account can increase power and sensitivity for detecting population subdivision and restricted gene flow (Hudson et al. 1992). 
Accordingly, there have been several proposed alternatives to fst that incorporate the quantitative amount of difference at the molecular level between heterozygous pairs of alleles (that is, a molecule genetic distance). Among these are Nst, which quantifies the amount of molecular heterozygosity by the average number of differences between sequences from different localities (Lynch and Crease 1990), and Kst, which uses the average number of differences between sequences randomly drawn from all localities (Hudson et al. 1992). In general, we will let Φst designate any fst-like statistic that uses a molecule genetic distance instead of heterozygosity as its underlying measure of genetic diversity (Excoffier et al. 1992). Recall that we can define multiple f statistics to partition genetic variation at different biological levels in a hierarchy of individuals and populations (e.g. Eq. (6.17)). The same types of hierarchical partitions can be made using Φ statistics instead of f statistics. A partition of genetic variation that substitutes a molecule genetic distance for heterozygosity is called an Analysis of MOlecular VAriance (AMOVA). These newer measures of population structure are desirable when dealing with data sets with extremely high levels of heterozygosity, as are becoming increasingly common with DNA sequencing. When there are high levels of allelic or haplotypic heterozygosity in all populations, both HT (the expected heterozygosity if the total population were mating at random) and HS (the observed or average expected heterozygosity within local populations) are often close to one in value. When all heterozygosities are close to one, the fst value calculated from Eq. (6.27) approaches zero regardless of the values of the underlying evolutionary parameters (such as Nev, m, or μ) and is difficult to estimate accurately. 
Thus, the all-or-nothing nature of traditional heterozygosity measures can induce serious difficulties when dealing with highly variable genetic systems. For example, recall from Chapter 1 the survey of 9.7 kb of the human lipoprotein lipase locus (LPL) in which 88 haplotypes were found in a sample of 142 chromosomes coming from three human populations. Regarding each haplotype as an allele, the value of fst across these three populations was 0.02 (Clark et al. 1998), which was not significantly different from zero. This seemingly indicates that these three human populations display no significant genetic differentiation. However, using the


number of nucleotide differences at the sequence level as the molecule genetic distance, Φst was 0.07, a value significantly different from zero. Therefore, these human populations did indeed show significant genetic differentiation, but these differences were not detected by the traditional fst using haplotypes as alleles and heterozygosity as a qualitative measure of genetic differentiation. In general, when high levels of genetic variation are encountered, a quantitative scale of differences between alleles or haplotypes is preferable to a qualitative one.
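The loss of sensitivity at high heterozygosity follows directly from Eq. (6.27): because Ht cannot exceed one, fst = 1 − Hs/Ht can never exceed 1 − Hs. The sketch below illustrates this with a made-up data set, not the LPL data: three equally weighted populations that share no haplotypes at all, each carrying 30 private haplotypes at equal frequency. The helper names are hypothetical.

```python
def expected_het(freqs):
    """Expected heterozygosity, 1 - sum of squared allele frequencies."""
    return 1.0 - sum(x * x for x in freqs)


def fst_from_heterozygosity(pops, weights):
    """fst = (Ht - Hs) / Ht from Eq. (6.27), treating each haplotype as an allele.

    pops    -- list of dicts mapping haplotype label -> frequency in that population
    weights -- proportion of the total population in each population
    """
    h_s = sum(w * expected_het(f.values()) for w, f in zip(weights, pops))
    pooled = {}
    for w, f in zip(weights, pops):
        for hap, x in f.items():
            pooled[hap] = pooled.get(hap, 0.0) + w * x
    h_t = expected_het(pooled.values())
    return (h_t - h_s) / h_t, h_s, h_t


# Hypothetical data: 3 populations, 30 private haplotypes each, no sharing.
k = 30
pops = [{f"pop{i}_hap{j}": 1.0 / k for j in range(k)} for i in range(3)]
weights = [1.0 / 3.0] * 3

f_st, h_s, h_t = fst_from_heterozygosity(pops, weights)
print(f"Hs = {h_s:.4f}, Ht = {h_t:.4f}, fst = {f_st:.4f}")
```

In this constructed case the populations are as differentiated as they can be in terms of haplotype sharing, yet fst stays small because Hs and Ht are both near one. A measure built on a molecule genetic distance is not capped in this way, which is the motivation for Φst given above.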

An Example of the Balance of Drift and Gene Flow

Both Fst and fst measure the relative balance of gene flow to drift, and in both cases, this balance appears as a product of m, the amount of gene flow, with an effective size, a measure of genetic drift. Further understanding of this balance and the factors altering it is revealed by studies on translocated populations of eastern collared lizards (Crotaphytus collaris collaris) living on glades on Stegall and Thorny Mountains in the Missouri Ozarks (Templeton et al. 2001, 2011; Neuwald and Templeton 2013). Ozark glades are barren rocky outcrops, usually with a southern or southwesterly exposure that creates a desert or prairie-like microhabitat (Figure 6.4). Desert and dry prairie-adapted plants and animals (such as prickly pear cacti, scorpions, tarantulas, and collared lizards)

Figure 6.4 The foreground is a glade on Thorny Mountain (Ozark Mountains, Missouri, USA). The background on the left is Stegall Mountain and shows the woodland matrix in which glades are embedded. The Thorny Mountain glades were colonized by lizards from Stegall Mountain after the fire management area was expanded to include Thorny Mountain and the valley between Stegall and Thorny Mountains.


expanded into the Ozarks about 8000 years ago during the Xerothermic maximum (the period of maximum warmth in our current interglacial period). Increasing rainfall at the end of the Xerothermic maximum about 4000 years ago allowed forests to expand into the Ozarks (Mondy 1970). As a result, the glades became desert and dry-prairie-like habitat islands in an ocean of forest. Until European settlement, frequent fires recurred that maintained the forest surrounding the glades as an open woodland. Fire scar data on old tree stumps indicated that the forest on Stegall Mountain burned roughly every five years prior to European settlement, and no decade between 1640 and 1800 CE was without at least one major fire (Guyette and McGinnes 1987). With European settlement, rounds of clear cutting occurred throughout most of the Ozarks, and fires were suppressed as a new forest grew in, particularly after the mid-twentieth century. This new forest was an oak-hickory forest with a dense, woody understory and a forest floor covered with leaf litter that prevented many of the woodland grasses and herbaceous perennials from growing (thereby reducing the insect communities associated with these woodland grasses and perennials). Collared lizard hatchlings can disperse kilometers through an open woodland that has abundant insect prey items on the forest floor and spots of sunshine for warmth, but they are reluctant to move even as little as 50 m through a forest with a dense understory, no sunshine, and sterile floor. Consequently, gene flow between glade populations in areas of fire suppression had virtually stopped. A single isolated glade typically can support only 10–30 adults, with only exceptional glades having larger local sizes (Sexton et al. 1992; and unpublished data). Small local population sizes coupled with little or no gene flow resulted in extreme genetic fragmentation and demographic instability, such as extreme fluctuations in the sex ratio. 
Moreover, fire suppression allowed the fire-sensitive red cedar (Juniperus virginiana) to invade the open glades, which in turn allowed successional invasion by other woody species. These woody invaders over several decades would shade out parts or all of a glade, resulting in a reduction in the amount of glade habitat that in turn led to the collapse of the entire glade community. Collared lizards, being the apex predator of the glade community, were a sensitive indicator species of the health of the glade habitat and often the first species to go extinct with habitat degradation. Fragmentation and habitat degradation led to much local extinction. By 1980, it was conservatively estimated that 75% of the Ozark collared lizard populations had been extirpated (Templeton 1982a). A glade restoration and lizard translocation program was initiated in 1982 to prevent the collapse of glade communities and the extinction of the Ozark collared lizards. Stegall Mountain, owned by the Missouri Department of Conservation and the National Park Service, was chosen as one of the initial translocation sites. The glade habitat on this mountain had been severely reduced during the period of fire suppression as inferred from aerial photographs from 1956 (Figure 6.5). No collared lizards could be found on Stegall and nearby mountains in 1980. Glade restoration began in 1982 by the clearing of the woody vegetation and prescribed burning of the existing glade habitat (but permission to burn the surrounding woodlands was denied at this time). The glade plant life and grasshopper communities (the primary prey of collared lizards) responded rapidly to clearing and burning, so translocations of collared lizards to restored glades began in 1984. This entailed capturing yearling and adult lizards from other Ozark glades that still had natural populations. No more than two lizards were ever removed from any one natural glade population to minimize the impact on the source population. 
Ten lizards (five males, five females) were released on one of the restored glades on Stegall Mountain in 1984 (Figure 6.5). Because of fragmentation of all glade populations throughout the Ozarks coupled with small local population sizes, the level of genetic diversity within any single natural glade population was very low. However, the average fst in the Ozark populations using a microsatellite genetic survey was 0.403, indicating that much genetic variation existed as differences between populations (Hutchison

[Figure 6.5 map labels: SM-7 (1984, 10 lizards); SM-8 (1987, 9 lizards); SM-9 (1989, 10 lizards); scale bar 500 m; north arrow.]

Figure 6.5 A topographic map of Stegall Mountain. Contour intervals are 20 ft (6.1 m), and the highest contour line is 1350 ft (411.5 m). The glade boundaries inferred from a 1956 aerial photograph are shown in gray, and the glade boundaries that existed in 1993 are indicated in yellow. The glades used in the translocation project are indicated by arrows from a box that gives the glade designation, the year of the translocation, and the number of lizards translocated.

and Templeton 1999). To restore genetic variation, and hence adaptive flexibility, the 10 lizards released in 1984 came from nine different source glades, making the translocated population highly diverse genetically. Additional translocations onto Stegall Mountain glades occurred in 1987 and 1989 (Figure 6.5) with the same protocols. These three translocated lizard populations all persisted, although the total population size for all three glades as estimated from mark/recapture studies remained less than 50 for all but one year from 1984 to 1993 (Figure 6.6). During 1984–1993, prescribed burning was limited to glades, and prescribed burning of the surrounding woodlands was forbidden. Although prescribed burning is a common tool in conservation management now, in the 1980s, such burning was controversial. During this pre-burn time period, only a single marked animal was observed to disperse, despite the fact that glades SM-7 and SM-8 were only 50 m apart. Moreover, no new glades were colonized in 10 years despite restored glades existing just 60 m away from a translocated glade. Genetic surveys did reveal that another individual dispersed during this 10 year period, but still gene flow was extremely weak during these initial 10 years. Accordingly, this pre-burn phase of the restoration program resulted in almost complete isolation between the three translocated glade populations. In 1990, the State of Missouri assembled a biodiversity task force that published its recommendations in 1992 (Nigh et al. 1992). All recommendations of the task force for preserving the

[Figure 6.6 plot. y-axis: Number of Lizards (0–500); x-axis: Year (1984–2010); annotation: Start of Burn Management.]

Figure 6.6 The total collared lizard population size on Stegall Mountain as estimated from mark/recapture studies from 1984 until 2010. A gap in these studies between 2006 and 2010 is indicated by a dashed line.

biodiversity of Missouri were unanimous save one – a recommendation for prescribed burning of forests and woodlands. A majority of the task force did support prescribed burning, and such burning was first implemented in the spring of 1994 on the northwestern half of Stegall Mountain that included translocated populations SM-7 and SM-8 (Figure 6.5). The woodland was substantially transformed just by this single burn. The canopy trees were mostly unaffected, but the woody understory was significantly reduced, the ground cover experienced a burst of diversity of herbaceous plants, some sunlight now reached the forest floor, and the woodland grasshopper community increased by more than an order of magnitude in abundance and species diversity. During the summer field season of 1994, several marked lizards were found that had dispersed to non-natal occupied glades and to an unoccupied glade. The burn policy was extended to all of Stegall Mountain in 1999 and eventually to many nearby mountains. The management plan mimics the pre-European settlement burn pattern by attempting to burn all areas every four to five years, weather conditions permitting. The onset of woodland burning marked the beginning of a new demographic phase for the collared lizards in which there was rapid growth between 1994 and 2000 (Figure 6.6), the colonization of many new glades (Figure 6.7), and much dispersal. By 2000 CE, the population size and number of occupied glades became roughly stable (Figures 6.6 and 6.7), although the lizards also began to colonize nearby mountains in an expanded burn management area (this will not be discussed here). This stability was actually quite dynamic because some populations on a glade continued to go extinct, but now the glades were later recolonized. Both extinction and colonization rates stabilized at about 10% per year by 2001 (Figure 6.8). A population consisting of many subpopulations with extinction and recolonization is called a metapopulation. 
Accordingly, after 2000, the collared lizard population on Stegall Mountain had entered a stable metapopulation phase. Genetic surveys of six microsatellite loci with a total of 42 alleles were performed on these lizards throughout all the demographic phases mentioned above. We can therefore examine the balance of

[Figure 6.7 plot. y-axis: Number of Occupied Glades (0–70); x-axis: Year (1984–2010); annotation: Start of Burn Management.]

Figure 6.7 The number of glades occupied by collared lizards on Stegall Mountain from 1984 until 2010. A gap in these studies between 2006 and 2010 is indicated by a dashed line.

[Figure 6.8 plot. y-axis: Probability of Being Newly Colonized or Newly Extinct (0–0.7); x-axis: Year (1994–2004); series: Proportion of Occupied Glades Newly Colonized, Probability Newly Extinct.]

Figure 6.8 The glade population extinction/recolonization probabilities on Stegall Mountain from 1994 until 2004. Source: Templeton et al. (2011). © 2011, John Wiley & Sons.

[Figure 6.9 plot. y-axis: fst (0–0.1); x-axis: Year (pre-burn intervals 1984, 1987, 1989; 1988:1993; 1989:1994; 1992:1997; 1993:1998; then 1994–2006); panels: Pre-Burn Phase, Colonizing Phase, Stable Metapopulation Phase.]

Figure 6.9 The fst measure of the balance of gene flow and genetic drift in the collared lizard population translocated to Stegall Mountain from 1984 to 2006. Because the three founding glade populations were translocated at different times, the fst values were calculated for them at common times since founding rather than absolute time during the pre-burn phase. Source: Based on data from Neuwald and Templeton (2013).

gene flow and genetic drift through the fst statistic from 1984 to 2006 (Figure 6.9). Figure 6.9 shows that fst increased during the pre-burn phase, as expected for mostly isolated, small populations. As the colonizing phase began, fst at first fell rapidly, as expected from the interchange of lizards among the glades. However, fst then increased to high levels during the last years of the colonizing phase. Given that the total population size increased rapidly during this phase (Figure 6.6), this secondary rise of fst may seem surprising. However, this phase was also characterized by a rapid colonization of new glades (Figure 6.7). Most of these colonizations involved just two or three lizards, so there were extreme founder effects during this stage affecting a majority of the glade populations. Because fst is a measure of genetic differentiation across all occupied glades, these recent founder effects would cause a high level of differentiation across glade populations, resulting in the observed high fst. By 1999, the colonizing phase had resulted in many more occupied glades than just the year before (Figure 6.7). Also, in 1999, the lizards made a transition to higher levels of the probabilities of dispersal that typified the stable phase (Table 6.1). With this increase in dispersal, the fst went down in 1999, anticipating the gene flow regime that became firmly established in the stable metapopulation phase. As the Stegall Mountain population entered the stable metapopulation phase, continued exchange of lizards among glades rapidly reduced fst to around 0.04, about half of the peak value during the colonizing phase (Figure 6.9). The fst value has been stable and significantly different from zero thereafter, reflecting an equilibrium balance of drift and gene flow, as well as a balance of extinction and recolonization events (Figure 6.8). 
Figure 6.9 demonstrates the dynamic nature of the shifting balance of gene flow and genetic drift in a population through time and illustrates that fst can measure these dynamic changes. As noted earlier, fst is expected to be more responsive to current and recent events than Fst.


Table 6.1 Dispersal probabilities for the eight classes of collared lizards defined by sex and significant heterogeneity in logistic regression and Fisher’s exact tests.

Category                       Probability of Dispersal
Hatchling Female Colonizing    0.088
Hatchling Female Stable        0.319
Hatchling Male Colonizing      0.143
Hatchling Male Stable          0.500
Yearlings                      0.095
Adult Females                  0.100
Adult Male Colonizing          0.128
Adult Male Stable              0.254

Source: Data from Templeton et al. (2011).


Figure 6.10 A topographic map of Stegall Mountain showing the dispersal events as arrows that were observed during the post-burn phases. Dispersal events originating from Stegall Mountain to adjacent mountains are also shown, although dispersal events internal to these other mountains are not shown. Source: Modified from Neuwald and Templeton (2013).

[Figure 6.11: bar chart of Probability of Dispersal (y-axis, 0 to 0.6) for five categories: Females* & Yearling Males; Hatchling Females Stable; Hatchling Males Stable; Hatchling & Adult Male Colonizing; Adult Males Stable.]

Figure 6.11 The probabilities of dispersal of different categories of lizards on Stegall Mountain. Lizards were classified into two sexes (females and males), three age categories (hatchlings, yearlings, and adults), and two post-burn demographic phases (colonizing and stable). The figure only shows the categories with significant differences. The asterisk by the word “Females” in the left-most category indicates that hatchling females from the stable demographic phase are not included. Source: Modified from Templeton et al. (2011).

Separate measures of gene flow and drift are possible in these populations. First, consider gene flow. All released lizards and all subsequently captured lizards were marked, thereby allowing a direct assessment of their dispersal from their natal glade upon recapture. During the 10-year pre-burn phase, one marked animal was observed to disperse from its natal glade. Genetic studies, to be discussed shortly, indicated another animal dispersed, which is not surprising as only about a third of the yearling and adult animals were marked during this phase (as opposed to 80–90% in the post-burn phase) and very few hatchlings were marked. The estimated total number of lizards at risk for observable dispersal (marked and recaptured) in all three glades over the entire pre-burn phase is 363, so the total amount of genetic interchange based on both dispersal and genetic inference during this phase is 2/363 = 0.0055 over 10 years. The generation time (see Chapter 15) during the pre-burn phase was estimated to be 3.5 years, so 10 years represents 2.9 generations, yielding m = 0.0019 per generation. Hence, gene flow was indeed weak during the pre-burn phase. In contrast, there was an explosion of dispersal of marked individuals as soon as prescribed burning of the woodland matrix began, with 218 observed dispersal events out of 1541 lizards with potentially observable dispersal between 1994 and 2006 (Figure 6.10). Ninety-five additional dispersal events were inferred genetically for this time period, for a total of 313 dispersal events. The generation time during the 5 years of the colonizing phase (1994–1999) was 3.1 years and during the 7 years (2000–2006) of the stable phase was 3.0 years, yielding an estimate of m of 0.0668 in the post-burn phases, more than a 35-fold increase over the pre-burn phase. This is undoubtedly an underestimate of gene flow as the marked population is heavily biased against hatchlings. Nevertheless, hundreds of hatchlings were marked, so instead of just pooling all observations, there is sufficient power to separate the impacts of sex, age class, and demographic phase upon dispersal in the post-burn period. Figure 6.11 shows the classes nested within sex that had significantly different probabilities of dispersal, using the results shown in Table 6.1. All of these probabilities are two orders of magnitude higher than the m estimated for the pre-burn phase.
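The pre-burn estimate of m can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function and argument names are assumptions, and it applies the calculation described in the text (the proportion of individuals exchanged, divided by the number of generations elapsed) to the pre-burn numbers only.

```python
def gene_flow_per_generation(dispersal_events, individuals_at_risk,
                             years, generation_time_years):
    """Per-generation exchange rate m from pooled dispersal observations.

    All argument names are hypothetical; the calculation follows the text:
    (events / individuals at risk) / (years / generation time).
    """
    proportion_exchanged = dispersal_events / individuals_at_risk
    generations_elapsed = years / generation_time_years
    return proportion_exchanged / generations_elapsed


# Pre-burn phase: 2 dispersal events (1 observed, 1 inferred genetically),
# 363 lizards at risk, 10 years, generation time of 3.5 years.
m_pre_burn = gene_flow_per_generation(2, 363, 10, 3.5)
print(f"m (pre-burn) = {m_pre_burn:.4f}")
```

Here 10/3.5 is about 2.86 generations, which the text rounds to 2.9, and the result rounds to the reported m = 0.0019.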
There is no doubt that the woodland matrix surrounding the glades has a major impact on gene flow, with a burned woodland promoting gene flow, even between mountains separated by valleys with no glades (Figure 6.10). Figure 6.11 also reveals that hatchlings are major agents of gene flow. This is precisely the age/stage category that is most difficult to mark and monitor, and this is commonplace with many other animal and plant species. Genetic surveys can be used to compensate for this severe limitation of many dispersal studies. One method of monitoring gene flow is to trace alleles or haplotypes across geographic space, and in the case of the lizards, alleles can be traced over both space and time. The genetic survey of the Stegall Mountain lizards initially revealed 42 distinct alleles at 6 loci. Of these 42 alleles, 14 were confined to just one of the original three translocated glades, but three of these 14 were rapidly lost in the fragmented pre-burn phase due to genetic drift. The remaining 11 alleles with unique geographic origins on Stegall Mountain were able to spread to other glades, and this spread could be plotted through time, as shown by the arrows connecting origin to destination in Figure 6.12. One of the arrows shown in Figure 6.12 corresponded to a dispersal event from SM-7 to SM-9 during the pre-burn phase, but all of the remaining arrows indicate gene flow during the post-burn phase. Note the similarity between Figures 6.10 and 6.12, indicating that dispersal and a direct measure of gene flow are much the same in this system (which is not always the case, as will be discussed later).

As pointed out in Chapter 3, genetic markers can be used to infer relatedness among individuals, such as parents and offspring. This knowledge in turn can be used to infer dispersal (Saenz-Agudelo et al. 2009; Cope et al. 2015; Bode et al. 2019). As also pointed out in Chapter 3, the ability to make accurate inferences about relatedness depends upon having a large number of genetic markers. Parentage inference in these collared lizards was not reliable due to having only six loci, but in general inferring parentage and other kin relationships is a powerful way of inferring dispersal and gene flow events when a large number of genetic markers are available.

Figure 6.12 A map of Stegall Mountain and its glades (shown in gray) with arrows indicating the movement of unique alleles from a single founder glade to other glades on the mountain. Source: Neuwald and Templeton (2013). © 2013, John Wiley & Sons.


Overall, these studies indicate that gene flow was extremely dynamic over time for the Stegall Mountain collared lizards. Gene flow was at a very low level during the pre-burn phase, but a dramatic increase in gene flow started with the onset of woodland prescribed burning. These changes in gene flow over time explain part of the pattern observed in Figure 6.9. The other component determining the pattern shown in Figure 6.9 is genetic drift. This component can also be measured directly for the Stegall Mountain collared lizards. As discussed in Chapter 4, a direct measure of the strength in genetic drift is the change it induces in allele frequencies over time. The changes in allele frequencies for the microsatellite alleles from one year to the next are shown in Figure 6.13. As can be seen, the changes of greatest magnitude tend to be found in the pre-burn and colonizing phases, indicating that genetic drift was stronger during these periods than during the stable period. Another measure of the strength of drift is loss of alleles. Six alleles were lost in the pre-burn phase (1.58 alleles lost per generation), zero were lost in the colonizing phase, and eight were lost during the stable phase (2.50 alleles lost per generation). By this criterion, genetic drift appears to be stronger in the pre-burn and stable phases than during the colonizing phase. The reason for this apparent contradiction between changes in allele frequency versus loss of alleles as measures of the strength of genetic drift goes back to the concept of effective population size. Recall from Chapter 4 that effective population size measures the strength of genetic drift in influencing some population genetic feature of interest. In a nonideal population (and this lizard population is far from the Fisher-Wright ideal), there is no reason to believe that genetic drift will have the same strength when looking at different population genetic features. 
The variance effective size is designed to measure the strength of genetic drift with respect to changes in allele frequencies and genetic differentiation among subpopulations, but the eigenvalue effective size measures the strength of genetic drift with respect to the rate of loss of alleles. The colonizing phase was characterized by many extreme founder events that induced much allele frequency change (Figure 6.13), but the colonizing phase was also characterized by exponential population growth (Figure 6.6). As pointed out in Chapter 5, rapid population growth decreases the probability of loss of alleles, even rare alleles. Hence, the colonizing phase is both a strong (allele frequency) and a weak (allele loss) phase for genetic drift, depending upon which population genetic feature is being measured. Once again, there is no such thing as the effective population size (Chapter 4). Some may also be surprised that the allele loss per generation was higher in the stable phase than in the pre-burn phase despite a much larger total population size during the stable phase (Figure 6.6). The answer to this mystery also lies in effective size and the impact of population subdivision on the variance effective size of the total population and upon allele loss in the total population size, as will be discussed later in this chapter. For now, we will simply note that total or near total isolation (as occurred in the pre-burn phase) is more efficient at retaining allelic variation at the total population level than when significant gene flow occurs among subpopulations (as occurred in the metapopulation phase). As also discussed in Chapter 4, it is better statistically to use the arcsin, square-root transformation of allele frequencies (Eq. 4.38) than the allele frequencies themselves when studying genetic drift. 
In the case of the collared lizards, there were sporadic genetic surveys during the early part of the pre-burn phase, and annual genetic surveys from 1991 until 2006. Thus, we can measure the strength of genetic drift between time periods i and j (with j < i) directly through the statistic:

$$Gd_{ij} = (\alpha_i - \alpha_j)^2 \tag{6.30}$$

[Figure 6.13: six stacked panels, one per microsatellite locus (O6, O21, O24, O25, E21, and N5), each plotting Δp for the alleles at that locus (y-axis) against Year (x-axis, 1984 to 2006), with the Pre-Burn, Colonizing, and Stable phases marked.]

Figure 6.13 The strength of genetic drift in the total Stegall Mountain population of collared lizards for the alleles at six microsatellite loci as measured by the annual change in allele frequencies at six microsatellite loci. Source: Based on data from Neuwald and Templeton (2013).


where the α’s are the true transformed allele frequencies at the two time periods. In general, we do not directly know these true values, but we do know the sample values, the a’s. Consider then the observed squared difference of transformed allele frequencies:

$$
\begin{aligned}
(a_i - a_j)^2 &= \left[(a_i - \alpha_j) - (a_j - \alpha_j)\right]^2 \\
&= (a_i - \alpha_j)^2 - 2(a_i - \alpha_j)(a_j - \alpha_j) + (a_j - \alpha_j)^2 \\
&= \left[(a_i - \alpha_i) - (\alpha_j - \alpha_i)\right]^2 - 2(a_i - \alpha_j)(a_j - \alpha_j) + (a_j - \alpha_j)^2
\end{aligned}
\tag{6.31}
$$

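Taking the expectation (Appendix B) of Eq. (6.31) under a model of pure genetic drift, we have:

$$
\begin{aligned}
E(a_i - a_j)^2 &= E(a_i - \alpha_i)^2 + E(\alpha_j - \alpha_i)^2 + E(a_j - \alpha_j)^2 \\
&= \mathrm{Var}(a_i) + E(Gd_{ij}) + \mathrm{Var}(a_j)
\end{aligned}
\tag{6.32}
$$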

where Var indicates the sampling variance. As shown in Chapter 4, the sampling variance of the transformed allele frequencies for an autosomal locus is 1/(8nk) where nk is the number of individuals in the sample (and hence 2nk genes in the sample for autosomal loci) scored for the locus of interest at time k. However, for the collared lizards, we have a sampling situation in which the sample often approaches the total population size. If we had complete sampling, then we would be measuring the true population allele frequencies, and by definition, there would be no sampling variance. It is therefore necessary to correct our sampling variances for the fact that the sample includes a substantial portion of the actual population. This is done by measuring the sampling variances at time k by:

$$\mathrm{Var}(a_k) = \left(\frac{N_k - n_k}{N_k}\right)\frac{1}{8 n_k} \tag{6.33}$$

where Nk is the population size (as opposed to the sample size, nk) at time k. Note that if we did sample all of the population (nk = Nk), then Eq. (6.33) would be 0 (no sampling variance in allele frequencies because we performed an actual census). However, if our sample size was much smaller than the population size (nk ≪ Nk), then the sampling variance would be close to 1/(8nk). Substituting Eq. (6.33) into Eq. (6.32) and solving for the expectation of the genetic drift measure, we have:

$$E(Gd_{ij}) = E(a_i - a_j)^2 - \left(\frac{N_i - n_i}{N_i}\right)\frac{1}{8 n_i} - \left(\frac{N_j - n_j}{N_j}\right)\frac{1}{8 n_j} \tag{6.34}$$

In the case of the collared lizards, we know all the n’s, and we have accurate estimates of all of the N’s from extensive mark/recapture studies with high sample coverage. Hence, all we need to estimate the strength of genetic drift is to estimate the expected statistic on the right-hand side of Eq. (6.34). This is simplified greatly by the fact that all alleles, regardless of their allele frequencies, have exactly the same expectation because of the arcsin, square-root transformation. One way of estimating the expected value of (ai − aj)² is to first calculate the observed values of this statistic for each allele, then sum these allele-specific values over all alleles and divide by the total number of alleles surveyed. We call this the allele average. This allele average is then used as the estimator of E(ai − aj)². Equation (6.34) is then evaluated to measure the strength of genetic drift. An alternative is to sum the allele-specific values of Eq. (6.34) at a particular locus and divide by the total number of alleles at that locus. Next, sum these locus-specific averages over all loci and divide by the number of loci. We call this the locus average. For the collared lizards, it made little difference which averaging technique was used, so in the following discussion, only the simpler allele average will be used.
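The allele-average estimator can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the transformation is assumed to be a = arcsin(√p) in radians, which is the form consistent with the sampling variance of 1/(8n) quoted above (Eq. 4.38 itself is not reproduced in this section).

```python
import math


def arcsin_sqrt(p):
    """Assumed form of the arcsin, square-root transformation (radians)."""
    return math.asin(math.sqrt(p))


def sampling_variance(n, N):
    """Eq. (6.33): finite-population-corrected sampling variance.

    n is the number of individuals sampled, N the population size.
    """
    return ((N - n) / N) * (1.0 / (8 * n))


def drift_strength_allele_average(p_i, p_j, n_i, N_i, n_j, N_j):
    """Eq. (6.34) evaluated with the allele average.

    p_i and p_j are equal-length lists of sample allele frequencies
    (one entry per allele, pooled over loci) at time periods i and j.
    """
    squared_diffs = [(arcsin_sqrt(pi) - arcsin_sqrt(pj)) ** 2
                     for pi, pj in zip(p_i, p_j)]
    allele_average = sum(squared_diffs) / len(squared_diffs)
    return (allele_average
            - sampling_variance(n_i, N_i)
            - sampling_variance(n_j, N_j))
```

With a complete census at both time points (n = N), both correction terms vanish and the estimate reduces to the mean squared change in transformed allele frequencies. Because the corrections are subtracted, the estimate can come out slightly negative when the observed change is no larger than sampling error would produce.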


A remaining issue is the biological level at which to apply Eq. (6.34). The strength of genetic drift can be measured at the total population level by pooling all the Stegall Mountain glade populations together for each time period. Alternatively, we can measure the strength of drift in a local glade population by applying Eq. (6.34) only to the samples from that particular glade. For explaining the balance of gene flow and drift as measured by fst, the more relevant measure of drift is at the local population level since fst measures the divergence among local populations due to drift. During the pre-burn phase, there are only three glade populations, so we can plot the strength of drift (Eq. 6.34) over time for each population. Unfortunately, SM-8 dropped to low levels, so no or just one lizard was captured for many years until year 6 after translocation, the year just before the onset of prescribed woodland burning for that population. However, both SM-7 and SM-9 have multiple samples in the pre-burn period. During this phase, genetic drift is virtually the sole force because gene flow was very weak, as noted before. Because not all genetic surveys were annual during this part of the study, an annual rate is obtained by dividing the strength of drift measure by the number of years between the two sampling points. The results are shown in Figure 6.14. Although these three populations were founded with nearly equal numbers and mostly drawn from the same source glades so that they were very similar genetically (recall that fst among the three populations at their origin was 0.018), their evolutionary fates under drift were quite distinct. The first translocated population, SM-7, maintained a steady population size up to the fifth year after founding (the initial founder size was 10, and the average population size during this time was 9.0), but with modest growth in the second half of the pre-burn phase (an average population size of 14). 
The strength of genetic drift reflects these changes in population size, with drift stronger in the first half of the pre-burn phase and weaker in the second. The operation of drift in this population is also reflected in the loss of alleles. The original founders of SM-7 had 30 alleles for the six microsatellite loci, and this was reduced to 23 by the fifth year after translocation, and finally to 19 alleles at the end of the pre-burn phase. The second translocated population, SM-8, is not plotted, but underwent a population bottleneck of between three and four individuals soon after translocation. During this time period, this population went from 23 alleles to 14, indicating very strong genetic drift with respect to allelic loss. The last of the translocated populations, SM-9, had consistently low and decreasing levels of genetic drift throughout the pre-burn period (Figure 6.14), and gradually went from having 29 alleles to 24. Unlike the other two glades, the SM-9 population

[Figure 6.14: line plot of Strength of Genetic Drift (y-axis, 0 to 0.1) against Years Since Translocation (x-axis) for glades SM-7 and SM-9.]

Figure 6.14 The strength of genetic drift in two of the translocated collared lizard populations on Stegall Mountain over time during the pre-burn phase of management. Source: Based on data from Neuwald and Templeton (2013).


showed rapid and consistent growth through much of its pre-burn phase, growing from the initial 10 founders to 48 individuals in just four years. The reason for this population growth can be found in Figure 6.5. Glade SM-9 was a much larger glade in 1956. After the release in 1989, additional within-glade clearing and burning expanded the size and quality of SM-9 during the pre-burn phase because this glade could be accessed via a jeep trail at the time, thereby facilitating the movement of crews and equipment. There were no trails of any sort to SM-7 and SM-8, which made such intensive management much more difficult. Thus, despite the initial similar founding conditions, these three glades displayed very different drift effects that reflected three different post-introduction demographies: approximate constant population size (SM-7), declining population size (SM-8), and increasing population size (SM-9). Figure 6.14 illustrates well the point made in Chapter 5 that there is no such thing as the founder effect, as founder events followed by different population size growth trajectories can have extremely different evolutionary consequences. The extensive gene flow in the post-burn phases undermines the use of allele-frequency changes as an indicator of the strength of genetic drift within local populations because local glade allele frequencies are strongly influenced by both drift and gene flow (Ryman et al. 2019). In contrast, the equations for estimating effective sizes given in Chapter 4 and Eq. (6.34) all assume that genetic drift is the sole cause of allele frequency changes within the population and, hence, are inapplicable to the local glade populations once gene flow had become established, particularly at high levels. However, Eq. 
(6.34) is still a valid measure of drift-strength for the total Stegall Mountain population because there were no lizards on any nearby mountains for most of this time period, and even when lizards from Stegall did colonize other mountains, the population sizes on these other mountains were very small for many years. Hence, Stegall Mountain as a whole can be regarded as a closed population up to 2006. The strength of genetic drift as measured by Eq. (6.34) in the total Stegall Mountain lizard population over all demographic phases is given in Figure 6.15. For the pre-burn period, time was measured in years since translocation. Glade SM-8 only had good genetic sampling in its sixth year, but

[Figure 6.15: plot of Strength of Genetic Drift (GD) (y-axis, 0 to 0.06) against Time Period (x-axis: Origin+6/7, then annual intervals from 1994–1995 through 2005–2006), with the Pre-Burn, Colonizing, and Stable Metapopulation phases marked.]

Figure 6.15 The strength of genetic drift in the total collared lizard population on Stegall Mountain over time as measured by the variance in transformed allele frequencies. Source: Based on data from Neuwald and Templeton (2013).


SM-7 had no sampling in its corresponding sixth year, but good sampling in its seventh year after translocation, as did SM-9. Hence, the pre-burn period is represented only by a single data point that measures drift for the total population at year 6/7 after translocation. As can be seen, drift was only a moderate force on the total population during the pre-burn phase. Why was drift only moderately strong given the small total population size during this phase (Figure 6.6)? This may seem to contradict the results discussed earlier, but those were for individual glade populations and not the total population. As will be discussed later in this chapter, a little-appreciated impact of fragmentation into isolates is that it can greatly increase the variance effective size of the total population even while decreasing the variance effective size of each isolate, thereby explaining the mild drift observed at the total population level during this fragmentation phase. Genetic drift greatly increases in strength as we enter the colonizing phase, but then drops and rebounds, as was observed for fst (Figure 6.9). The strength of genetic drift decreases and then stabilizes as the population transitions into the stable metapopulation phase. The strength of drift is weakest during the stable metapopulation phase, reflecting the now large total population size (Figure 6.6). Thus, just like gene flow, genetic drift is a dynamically changing evolutionary force in the Stegall Mountain populations that eventually stabilizes. The joint stabilization of genetic drift and of gene flow during the stable metapopulation phase led to the observed stabilization of fst, as shown in Figure 6.9.

Factors Influencing the Amount and Pattern of Gene Flow

Dispersal

The parameter m is defined in terms of gene pools, and therefore m represents the amount of exchange of gametes between the local populations and not necessarily individuals. In some species, gametes are exchanged directly without diploid individuals moving at all. For example, many trees are wind pollinated, and the pollen (haploid gametophytes) can be blown for hundreds of miles by the wind. Hence, tree populations that are quite distant can still experience gene flow, yet no diploid trees are walking back and forth! For many other species, m requires that individuals move from their local population of birth to a different local population, followed by reproduction in their new location. Because gene flow requires both dispersal and reproduction, any factor that influences the amount or pattern of dispersal of individuals between local populations or the chances for reproductive success of dispersing individuals can also influence the amount and pattern of gene flow. For example, in a review of over 400 studies on vertebrate species, species that walk tend to have larger fst’s (i.e. less gene flow) than those that swim or fly, and freshwater swimmers tend to have larger fst’s than those in oceans (Medina et al. 2018a). Obviously, the mode of locomotion greatly influences the type and severity of barriers to dispersal and hence gene flow. There are many methods of directly monitoring dispersal. In the collared lizard example, individuals were marked and later recaptured, often multiple times, thereby allowing the direct detection of multiple between-glade dispersal events (Figure 6.10). There are now many methods of marking individuals (Jonsson et al. 2016), including those, such as GPS (Global Positioning System) collars, that provide monitoring of the positions of individuals over user-defined time intervals (Giotto et al. 2015).
[Figure 6.16: scatter plot of Distance in Meters (y-axis, 0 to 700) against Interval in Days (x-axis, 0 to 16), with separate series for female and male distances.]

Figure 6.16 A plot of dispersal distances of marked Hine’s emerald dragonflies versus the number of days available for dispersal before recapture for females and males at Kay Branch. Source: Modified from figure K3 in Walker et al. (2020).

However, one must never forget that there are multiple types of dispersal (Dingle and Drake 2007), and not all forms of dispersal result in gene flow. One common mode is relatively short daily movements associated with individual foraging or territorial behavior. For example, a mark/recapture study was executed on the federally endangered Hine’s emerald dragonfly (Somatochlora hineana) that inhabits fens, a type of wetland with calcareous seepage flow (Walker et al. 2020). One of the sites for the mark/recapture study was a complex of nine fens along Kay Branch Creek in the Ozarks. Dragonflies were only marked at a central fen, but recaptures were made across all the Kay Branch fens. There was no significant association between the number of days available for an observed movement and the distance moved (Figure 6.16), indicating that the movements were primarily daily movements. Figure 6.16 includes same-day recaptures, and the distribution of movements within a day is not significantly different from movements over multiple days, again emphasizing that these are daily movements.

Collared lizards also show much movement within glades associated with foraging and territorial behavior (Conley et al. 2021). Both males and females establish territories, and males aggressively defend their territories against other adult males. A single male territory often includes the territories of multiple females, and these territories are also the sites of mating. The mark/recapture studies described earlier indicate that many individuals that disperse to a new glade remain in that glade and establish territories. Some adult males, however, appear to wander from one glade to another, apparently unable to establish a territory. Because interglade dispersal is associated with reproductive behavior in collared lizards, any factor that influences interglade dispersal and the ability to establish a territory should affect gene flow. The first factor to consider is the initiation of dispersal. As only one marked lizard dispersed during the pre-burn phase, whereas many did after the initiation of woodland burning, it is obvious that the nature of the woodland matrix separating the glade populations had a major impact upon the probability of dispersal.
Table 6.1 presents the probabilities of dispersal during the colonizing and stable phases and the significant heterogeneity that existed due to sex, age, and demographic phase. Yearlings of both sexes and adult females had the lowest probabilities of dispersing, whereas hatchling males followed by hatchling females had the highest probabilities of dispersing, but only during the stable metapopulation demographic phase (0.500 and 0.319, respectively). In contrast, the colonizing phase had the lowest probabilities of dispersal for both sexes and all age classes (ranging from 0.088 to 0.143). Territoriality may explain this shift. During the colonizing phase, much of the dispersal was to uninhabited glades. Once on such a glade, territories could be established without competition, and this would be the case until the population size increased on the new glade. Hence, there would be little motivation to disperse to a new glade for a few years after colonization. This hypothesis would also explain why adult males were twice as likely to disperse to a new glade during the stable phase when glades would more likely be saturated with territories compared to the colonizing phase when territories could often be established without aggressive attacks by territory-holding males (Table 6.1).

Conley et al. (2021) examined the impact of territoriality both through detailed behavioral studies on six focal glades and by discovering a physical proxy for male territoriality that allowed an analysis of dispersal over hundreds of glades. Collared lizards are visually oriented animals. They are sit-and-wait predators, perching upon a high rock until they see a prey item, then rapidly running after it, often on their hind legs. Males are highly colorful, using visual signals both to attract mates and to warn off other males. Ignored warnings can lead to physical fights, which can severely injure a lizard due to the strong bite force that adult males can inflict (as can be personally testified). There is much heterogeneity between glades in the visual environment experienced by the lizards. Exposed bedrock that produces an open visual field is the preferred habitat for collared lizards within a glade. Some glades consist mostly of exposed bedrock, whereas others have patches of exposed bedrock separated by tall grassy areas and shrubs, as seen in the glade in Figure 6.4.
These grassy areas and shrubs are effective visual barriers at lizard height and break up the visual openness of a glade. The degree of visual openness can be quantified by calculating the radius of gyration from digital aerial photographs. The radius of gyration represents the average distance a lizard randomly placed on bedrock could move before encountering a boundary defined by non-bedrock substrate. The correlation length is the area weighted mean radius of gyration and measures the extensiveness of all bedrock patches within a glade. Larger correlation length values indicate a more connected, more visually open glade. Because larger glades tend to have longer correlation lengths, the log transformed correlation length was regressed against the log of total bedrock area to yield the log residual correlation length. Positive log residual correlation lengths indicate that bedrock was more evenly spaced than on average after adjustment for bedrock area, whereas negative log residual correlation lengths indicate that bedrock was more clustered and the glade was less visually open. The log residual correlation lengths for the glades on Stegall Mountain ranged from −0.794 to 0.890 with a mean of −0.049 and standard deviation of 0.431, indicating much interglade heterogeneity in the amount of bedrock clustering and visual openness. Social interaction networks and territorial overlap networks were constructed from the detailed behavioral studies on the focal glades. These networks revealed that breaking up the bedrock and decreasing visual openness allowed a few males to monopolize the bedrock clusters, whereas glades with a more even, visually open environment decreased the potential for male monopolization. The dispersal patterns during the stable metapopulation phase, when establishing territoriality would be difficult, were then analyzed to find glade attributes that promoted dispersal. 
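The two summary statistics described above, the correlation length and the log residual correlation length, can be sketched as follows. This is a hedged illustration rather than the procedure of Conley et al. (2021): the function names and input layout are hypothetical, ordinary least squares is assumed for the regression, and base-10 logarithms are assumed because the source does not state the base (the size of the residuals depends on that choice).

```python
import math


def correlation_length(patches):
    """Area-weighted mean radius of gyration for one glade.

    `patches` is a list of (bedrock_patch_area, radius_of_gyration) pairs.
    """
    total_area = sum(area for area, _ in patches)
    return sum(area * rg for area, rg in patches) / total_area


def log_residual_correlation_lengths(corr_lengths, bedrock_areas):
    """Residuals from regressing log(correlation length) on log(bedrock area).

    One entry per glade; positive residuals indicate a more visually open
    glade than expected for its bedrock area.
    """
    x = [math.log10(a) for a in bedrock_areas]
    y = [math.log10(c) for c in corr_lengths]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
```

Given per-glade lists of correlation lengths and total bedrock areas, the second function returns one residual per glade. By construction these residuals sum to zero across the glades used to fit the regression.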
No significant factors were discovered for female dispersal, but male dispersal was highly significantly affected by both glade population size and glade log residual correlation length. In general, glades with larger population sizes produced fewer male dispersers, but larger sized glades with a high degree of visual openness produced significantly more male dispersers than larger sized glades with less visual openness. It is possible that fewer collared lizard males disperse out of glades with larger population sizes in general because the habitat is perceived as high quality. The fragmenting of bedrock facilitates monopolization of multiple females by a male, which makes such glades particularly valuable


habitats for a male that can establish a territory. If male hatchlings and yearlings can avoid damage from aggressive adult males by exploiting the visual barriers on glades with bedrock fragmentation, staying on a glade with a high potential for monopolization of females could be a more effective strategy than risking dispersal. A glade with many lizards and a high degree of visual openness does not allow nonterritorial males many opportunities to avoid aggression from the territorial males, and males disproportionately leave these seemingly optimal glades that can support a large population size but put them at increased risk to aggressive attacks. Given the decision to disperse, there follows the decision of where to go and stay. Conley et al. (2021) show that both female and male dispersers make nonrandom choices, preferentially going from glades with smaller population sizes to glades with larger population sizes, once again indicating that population size can be considered an indicator of habitat quality. The log residual correlation length had no significant impact on the choice of destination for either sex. Overall, these studies on collared lizards indicate that both the decision to disperse and the decision of where to stay given dispersal are complex and affected by multiple factors: including sex, age, demography, social interactions, habitat quality, and the nature of the matrix through which dispersal occurs. These complex decisions produce nonrandomness and asymmetries in the predicted amount and pattern of gene flow that is not captured by simple gene flow models such as that shown in Figures 6.1 and 6.3.

Isolation by Distance and Resistance

Equations (6.8) and (6.29) relate the balance of drift and gene flow (as measured by Fst or fst) to underlying quantitative measures of genetic drift (Nef or Nev) and gene flow (m). However, we derived Eqs. (6.8) and (6.29) only for specific models of gene flow: either symmetrical gene flow between two demes, or the island model of many demes. The two-deme model is of limited generality, and the island model depends upon several specific and biologically implausible assumptions, such as a portion m of the gametes being extracted from each deme and distributed at random over all other demes regardless of their locations relative to one another. The island model was chosen primarily for its mathematical convenience rather than its biological realism. Changing the underlying assumptions of the model can change the balance between drift and gene flow. Consequently, Eqs. (6.8) and (6.29) do not represent the general quantitative relationship between drift and gene flow as forces causing genetic subdivision. Instead, these equations represent only special and highly unrealistic cases. We now consider some alternate models.

One frequently unrealistic aspect of the island model is the assumption that all dispersing individuals (or gametes) are equally likely to migrate to any local deme. In most species, some pairs of local demes experience much more genetic interchange than others. One common type of deviation from the island model is isolation by distance in which local demes living nearby to one another interchange gametes more frequently than do geographically distant demes. We need look no further than the collared lizards for an example of this. Figure 6.17 presents the cumulative distributions of dispersal distance for various classes of collared lizards on Stegall Mountain. A cumulative distribution gives the proportion of dispersing individuals that have dispersed a given distance or less.
Such distributions always start at 0 for distance 0 (no dispersal) and increase to 1 at the distance by which all dispersing individuals have dispersed. As can be seen, adult females had the smallest dispersal distances such that the cumulative distribution reached 1 in just a few hundreds of meters, and the 50% dispersal distance of adult females was 76 m during the colonizing phase and 116 m during the stable metapopulation phase. In contrast, yearlings and hatchlings in the stable phase had dispersal events just short of 3 km, with the 50% dispersal distance for the yearlings being 269 m, and

[Figure 6.17 plot: cumulative distribution (y-axis, 0 to 1) against dispersal distance in meters (x-axis, 0 to 3000), with separate curves for Adult Female Colonizing, Adult Male Colonizing, Adult Female Stable, Adult Male Stable, Yearlings, Hatchlings Colonizing, and Hatchlings Stable.]

Figure 6.17 The cumulative distributions of dispersal distance for various classes of collared lizards from Stegall Mountain. Source: Templeton et al. (2011). © 2011, John Wiley & Sons.

515 m for hatchlings in the stable phase. As with the probability of initiating dispersal, the dispersal distances were strongly influenced by sex, age, and demographic phase (Figure 6.17). Cumulative distribution plots against distance such as that shown in Figure 6.17 are commonly observed in many species, including humans, although the shape and distance scale can vary considerably from species to species. Dispersal restricted by distance is extremely common in terrestrial species and in aquatic organisms living in rivers or streams. The major implication of dispersal distributions restricted by distance is that most gene flow tends to occur between nearby populations and falls off between populations with increasing distances. We therefore need to incorporate this distance restriction into our models of gene flow.

There are many models of isolation by distance in the population genetic literature, and we will only consider a few simple ones. We start with a one-dimensional stepping stone model in which a species is subdivided into discrete local demes, as with the island model. These local demes are arrayed along a one-dimensional habitat such as a river, a valley, or a shoreline. One version of this model allows two types of gene flow (Figure 6.18). First, a fraction m∞ of the gametes leave each deme and disperse at random over the entire species, just as in the island model. Second, a fraction m1 of the gametes from each deme disperse only to the adjacent demes. Because this is a one-dimensional model, each deme has just two neighbors (Figure 6.18, ignoring the demes at the two ends of the habitat), and we assume symmetrical gene flow at this local geographic level, that is, m1/2 go to one of the neighboring demes, and the other m1/2 go to the other (Figure 6.18). Then, Weiss and Kimura (1965) showed:

$$ f_{st} = \frac{1}{2N_{ev}\left[1 - \left(1 - \dfrac{1}{2N_{ev}}\right)\left(1 - \dfrac{2R_1 R_2}{R_1 + R_2}\right)\right]} \qquad (6.35) $$


[Figure 6.18 diagram: a row of demes, each of size N; each deme exchanges a fraction m1/2 of its gametes with each adjacent deme and contributes a fraction m∞ to a common gene pool from all demes.]

Figure 6.18 The one-dimensional stepping stone model of gene flow between discrete demes. Each deme is of idealized size N and is represented as a circle arrayed on a line. A portion m1 of the gametes from any one population are exchanged with the two neighboring populations, half going to each neighbor. Moreover, each population contributes a fraction m∞ of its gametes to a common gene pool that is then distributed at random over all demes in the same proportion.

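where the correlations between allele frequencies of demes one (R1) and two (R2) steps apart are:

$$ R_1 = \sqrt{\left[1 + (1 - m_1)(1 - m_\infty)\right]^2 - \left[m_1(1 - m_\infty)\right]^2} $$

$$ R_2 = \sqrt{\left[1 - (1 - m_1)(1 - m_\infty)\right]^2 - \left[m_1(1 - m_\infty)\right]^2} \qquad (6.36) $$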

When m1 = 0, all dispersal is at random over the entire species’ geographical distribution (the island model), and Eq. (6.35) simplifies to fst = 1/[1 + 4Nevm∞]. Hence, Eq. (6.29) is a special case of (6.35) when there is no additional gene flow between adjacent demes. In many populations, there is much more dispersal between neighboring demes than between distant demes. In terms of our stepping stone model, this means that m∞ is very small relative to m1. Under these conditions, Eq. (6.35) is approximately:

$$ f_{st} = \frac{1}{1 + 4N_{ev}\sqrt{2 m_1 m_\infty}} \qquad (6.37) $$
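The stepping stone results can be checked numerically. The following is a minimal sketch, assuming the reading of Eqs. (6.35) and (6.36) in which R1 and R2 are square roots and the island model is recovered when m1 = 0; the function names are illustrative and do not come from the original sources.

```python
import math


def stepping_stone_fst_exact(n_ev, m1, m_inf):
    """fst from Eq. (6.35), with R1 and R2 taken from Eq. (6.36)."""
    alpha = (1.0 - m1) * (1.0 - m_inf)   # (1 - m1)(1 - m_inf)
    beta = m1 * (1.0 - m_inf)            # m1(1 - m_inf)
    r1 = math.sqrt((1.0 + alpha) ** 2 - beta ** 2)
    r2 = math.sqrt((1.0 - alpha) ** 2 - beta ** 2)
    c = 2.0 * r1 * r2 / (r1 + r2)
    return 1.0 / (2.0 * n_ev * (1.0 - (1.0 - 1.0 / (2.0 * n_ev)) * (1.0 - c)))


def stepping_stone_fst_approx(n_ev, m1, m_inf):
    """fst from Eq. (6.37); intended for m_inf much smaller than m1."""
    return 1.0 / (1.0 + 4.0 * n_ev * math.sqrt(2.0 * m1 * m_inf))


# Illustrative parameter values: Nev = 100 and m1 = 0.1, with
# long-distance dispersal reduced tenfold between the two rows.
for m_inf in (0.01, 0.001):
    exact = stepping_stone_fst_exact(100, 0.1, m_inf)
    approx = stepping_stone_fst_approx(100, 0.1, m_inf)
    print(f"m_inf={m_inf}: exact={exact:.3f}, approx={approx:.3f}")
```

Because m1 and m∞ enter Eq. (6.37) only through their product, a tenfold drop in either parameter changes the approximate fst by the same amount.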

Note from Eq. (6.37) that even when m∞ is very small relative to m1, the long-distance dispersal parameter m∞ still has a major impact on genetic subdivision because the impact of gene flow is through the product of m1 and m∞. For example, let Nev = 100 and m1 = 0.1. Then, fst = 0.053 if m∞ = 0.01. However, if m∞ = 0.001, then fst = 0.150. Note in this example that large differences in fst are produced by changes in long-distance dispersal even though long-distance dispersal is 10 to 100 times less common than short-distance dispersal. At first, this sensitivity to rare, long-distance dispersal may seem counter-intuitive, but the reason for it can be found in Eq. (6.2). From those equations, we saw that the evolutionary impact of gene flow upon allele frequencies depends upon two factors: (i) how much interchange is actually occurring between two demes and (ii) how genetically distinct the demes are in their allele frequencies. When much dispersal occurs between neighboring demes, the allele frequencies in those neighboring demes are typically very similar. Hence, even though there is a large amount of exchange between neighbors relative to long-distance dispersal, exchange between neighboring demes has only a minor evolutionary impact upon allele frequencies. On the other hand, when long-distance dispersal occurs, it generally brings in gametes to the local deme that come from a distant deme with very different allele frequencies. As a consequence, long-distance dispersal has a large evolutionary


impact when it occurs. This trade-off between frequency of genetic interchange and the magnitude of the evolutionary impact given genetic interchange explains why both m1 and m∞ contribute in a symmetrical fashion to fst.

The importance of long-distance dispersal (m∞) upon overall gene flow even when it is rare means that gene flow is difficult to measure accurately from dispersal data. Many methods of monitoring dispersal directly are limited to short geographical scales, and it is usually impossible to quantify the amount of long-distance dispersal. Yet, these rare, long-distance dispersal events can have a major impact on a species’ genetic population structure. The problem of inferring long-distance dispersal is accentuated by the fact that, in many species, long-distance dispersal is primarily done by an age class or life stage that is often difficult to monitor by direct means. In Figure 6.17, we see that long-distance dispersal is most extreme in hatchling collared lizards. This is the age stage that is most difficult to monitor. Adult collared lizards will eat hatchlings, so the hatchlings tend to stay on the margins of a glade and are extremely cautious and quick to hide. Only when the adults go under rocks in preparation for winter do the hatchlings move onto the glade proper and become more active and visible. However, even then, they are more difficult to capture than adults. Another example of this problem occurs in the many marine species that live in coral reefs. The reefs represent only a very small part of the oceans, so the adults are typically concentrated into these reefs and rarely if ever disperse out of the reef in which they first settled. Dispersal between reefs occurs instead during a pelagic larval stage. Very few of these larvae survive to adulthood, but they can disperse many thousands of kilometers. This combination makes it virtually impossible to study dispersal directly in these species.
However, indirect genetic monitoring can be informative. For example, moray eels in the Indo-Pacific (about 2/3 of the world’s surface) have one of the longest pelagic larval phases of any reef fish. This extended pelagic phase should allow them to disperse over long distances due to oceanic currents. A genetic survey of the moray eel Gymnothorax undulatus revealed genetic homogeneity across populations over the entire Indo-Pacific, with no isolation by distance across 22 000 km (Reece et al. 2010). Hence, the extreme sedentary behavior of the adult eels is totally misleading about their patterns of gene flow. Indeed, their gene flow is so extreme that there is only one local deme in the entire Indo-Pacific.

Models of isolation by distance have also been developed for the case in which a species is continuously distributed over a habitat, and not subdivided into discrete local demes (Malécot 1950). In models of a species with continuously distributed individuals, the density of the individuals, δ, replaces the population size of the local demes (N in the discrete model shown in Figure 6.18), and σ, the standard deviation of the geographical distance between birthplace of parent and offspring, replaces m1 for short-distance dispersal. The parameter m∞ is retained as the long-distance dispersal parameter that measures random dispersal over the entire species range. Then, at equilibrium:

$$ f_{st} \approx \frac{1}{1 + 4\delta\sigma\sqrt{2 m_\infty}} \qquad (6.38) $$

which is the continuous analog of Eq. (6.37). These models have also been extended to species living in habitats with more than one dimension. For example, the analog of Eq. (6.38) for a two-dimensional habitat is:

$$ f_{st} \approx \frac{1}{1 + \dfrac{8\pi\delta\sigma^2}{-\ln(2 m_\infty)}} \qquad (6.39) $$

[Figure 6.19 plot: fst (y-axis) against density of population, δ (x-axis, up to 100), with one curve for a 1 Dimensional Habitat and one for a 2 Dimensional Habitat.]

Figure 6.19 The effect of dimensionality of the habitat upon population subdivision as measured by fst in the gene flow model over a continuous habitat. The dispersal variance σ2 is fixed at one, and the long-distance dispersal parameter m∞ is fixed at 0.01. Density, δ, is allowed to vary from 5 to 100 for both one-dimensional (upper curve) and two-dimensional (lower curve) habitats.
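The comparison in Figure 6.19 can be sketched numerically. This is a rough illustration, not a reproduction of the published figure. It assumes that the one-dimensional case is fst ≈ 1/[1 + 4δσ√(2m∞)] and that the two-dimensional case is fst ≈ 1/[1 + 8πδσ²/(−ℓn(2m∞))], with the logarithm taken of 2m∞. The function names are hypothetical.

```python
import math


def fst_continuous_1d(delta, sigma, m_inf):
    """One-dimensional continuous habitat (assumed form of Eq. 6.38)."""
    return 1.0 / (1.0 + 4.0 * delta * sigma * math.sqrt(2.0 * m_inf))


def fst_continuous_2d(delta, sigma, m_inf):
    """Two-dimensional continuous habitat (assumed form of Eq. 6.39).

    Assumption: the logarithm is of 2*m_inf, so 2*m_inf must be below 1.
    """
    return 1.0 / (1.0 + 8.0 * math.pi * delta * sigma ** 2 / (-math.log(2.0 * m_inf)))


# Parameter values from the Figure 6.19 caption: sigma^2 = 1, m_inf = 0.01.
sigma, m_inf = 1.0, 0.01
for delta in (5, 20, 40, 60, 80, 100):
    one_d = fst_continuous_1d(delta, sigma, m_inf)
    two_d = fst_continuous_2d(delta, sigma, m_inf)
    print(f"delta={delta:3d}  1-D fst={one_d:.4f}  2-D fst={two_d:.4f}")
```

Under these assumptions the one-dimensional value exceeds the two-dimensional value at every density, and both decline as density increases.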

For the same values of δ, σ, and m∞, the one-dimensional fst will be much larger than the two-dimensional fst, as illustrated in Figure 6.19. The reason for this is that any single population or geographical point is genetically interconnected with more distinct populations or individuals as dimensionality increases. For example, in the discrete two-dimensional stepping stone model in which demes are placed at intersections in a lattice, there are four adjacent demes to any one deme (ignoring edge effects), whereas there are only two adjacent demes to any one deme in the one-dimensional case (Figure 6.18). Hence, as dimensionality increases, the homogenizing effects of gene flow become more powerful because more local demes are intermixing at the short distance rate, resulting in smaller amounts of genetic subdivision as measured by fst.

The plots of fst given in Figure 6.19 show how fst varies as a function of density for a fixed σ and m∞. However, in many species, σ and m∞ change as density changes. For example, Levin and Kerster (1969) studied four separate colonies of the herb Liatris aspera. Individual plants were continuously distributed within each colony in their two-dimensional habitats (fields), but different colonies had different densities. Table 6.2 gives the densities of these plants in four colonies, which vary over an 11-fold range. Both seed dispersal and movement of pollen by pollinators mediated gene flow in these plant populations, and both types of movements were studied to obtain an estimate of σ2 within each colony, also shown in Table 6.2. Because the plants were distributed continuously within these colonies, there were no discrete demes within colonies to which we can assign a meaningful Nev.
For such continuously distributed populations, Wright (1946) proposed an alternative to a discrete deme called the neighborhood: the subregion within the population’s continuous distribution that surrounds a point in space from which the parents of individuals born near that point may be treated as if drawn at random. Assuming that dispersal is random in direction and follows a normal distribution, Wright showed that the neighborhood area is given by:

$$ \text{Neighborhood area for 1 dimension} = 2\sigma\sqrt{\pi} $$

$$ \text{Neighborhood area for 2 dimensions} = 4\pi\sigma^2 \qquad (6.40) $$

Table 6.2 Density-dependent gene flow in four colonies of the herb Liatris aspera living in continuous two-dimensional habitats.

Parameter                                       Colony I   Colony II   Colony III   Colony IV
δ, Density (Plants/m²)                          1          3.25        5            11
σ², Seed Plus Pollen Dispersal Variance (m²)    2.38       1.83        1.51         1.35
Neighborhood Area (m²)                          30         23          19           17
Neighborhood Size                               30         75          97           191

Source: Modified from Levin and Kerster (1969).
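The neighborhood areas in Table 6.2 follow directly from the two-dimensional case of Eq. (6.40). Here is a minimal sketch that recomputes them from the tabulated dispersal variances, taking neighborhood size as neighborhood area times density. The variable and function names are illustrative. Sizes recomputed from the rounded table entries can differ slightly from the published sizes, presumably because the published values were calculated from unrounded inputs.

```python
import math

# (density in plants per m^2, dispersal variance in m^2) from Table 6.2
COLONIES = {
    "I": (1.0, 2.38),
    "II": (3.25, 1.83),
    "III": (5.0, 1.51),
    "IV": (11.0, 1.35),
}


def neighborhood_area_2d(sigma2):
    """Neighborhood area for a two-dimensional habitat: 4 * pi * sigma^2."""
    return 4.0 * math.pi * sigma2


def neighborhood_size(density, sigma2):
    """Neighborhood size: neighborhood area multiplied by density."""
    return neighborhood_area_2d(sigma2) * density


for name, (density, sigma2) in COLONIES.items():
    area = neighborhood_area_2d(sigma2)
    size = neighborhood_size(density, sigma2)
    print(f"Colony {name}: area = {area:.1f} m^2, size = {size:.1f}")
```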

The neighborhood size is the neighborhood area times the density, δ. It is important to keep in mind that neighborhood size is not an effective size (Chapter 4), but rather is a measure of within-population spatial genetic structure that depends strongly on the dispersal characteristics of a species (Nunney 2016). Nevertheless, simulations reveal that neighborhood size can influence effective sizes (Nunney 2016). Table 6.2 shows both the neighborhood areas and the neighborhood sizes for four colonies of Liatris. As can be seen, the dispersal variance decreases with increasing density, and as a consequence, the neighborhood area also decreases as density increases. The reason for the decrease in neighborhood area with increasing density in Liatris was almost entirely due to the gene flow caused by pollinators. The pollinators tended to fly from one plant to its nearest neighbor, but, of course, the nearest neighbor distance between plants decreased with increasing plant density. As a consequence, the neighborhood area became smaller as density increased. Thus, even though the density increased by 11-fold in going from colony I to IV, the neighborhood size only increased by sixfold (Table 6.2). Hence, actual population sizes or densities are not necessarily reliable indicators of variance effective or neighborhood sizes, which are functions of population size/density, dispersal, and the interactions between them.

Another interesting implication of the isolation by distance models is that the degree of genetic differentiation between two demes or two points on a geographical continuum should increase with increasing separation – either the number of “steps” (in the stepping stone model) or the geographical distance (in the continuous distribution models). As distance increases in these models, gene flow decreases, which in turn shifts the balance of gene flow to drift more in favor of drift, resulting in increasing genetic differentiation with increasing distance.
To test for this predicted pattern, it is convenient to have a population genetic distance that measures the degree of genetic differentiation between two populations. There are several types of population genetic distances (not to be confused with the molecule genetic distances used in Chapter 5), and Box 6.1 shows Nei’s population genetic distance, one of the more commonly used measures. Another distance measure often used for isolation by distance models is the pairwise fst [or sometimes a related pairwise measure, fst/ (1 − fst)]. A pairwise fst is an fst calculated from Eq. (6.27) but applied to just two populations at a time. The total population now used to calculate Ht refers just to the two populations of interest, and all other populations in the species are ignored. Note that this population genetic distance (and all others as well) is biologically quite distinct from the molecule genetic distances discussed in Chapter 5. A molecule genetic distance ideally measures the number of mutations that occurred between two DNA molecules during their evolution from a common ancestral molecule. A pairwise fst is a function of allele frequencies between two demes, and mutation is not even


Box 6.1 Nei’s (1972) Population Genetic Distance

Consider two populations, 1 and 2, scored for allelic variation at a locus. Let p1i and p2i be the frequencies of the ith allele in populations 1 and 2, respectively. Then, the probability of identity-by-state of two genes chosen at random from population 1 is j1 = Σ p1i², where the summation is taken over all alleles. Similarly, the probability of identity-by-state of two genes chosen at random from population 2 is j2 = Σ p2i². The probability of identity-by-state between two genes, one chosen at random from population 1 and one chosen at random from population 2, is j12 = Σ p1i p2i. Nei (1972) defined the normalized genetic identity-by-state between these two populations as:

$$ I_{12} = \frac{j_{12}}{\sqrt{j_1 j_2}} $$

Note that this measure of identity between the two populations ranges from zero (when the two populations share no alleles in common, thereby making j12 = 0) to one (when the two populations share all alleles in common and at the same allele frequencies, thereby making j12 = j1 = j2). In contrast to an identity measure, a distance measure should get larger as the two populations share less and less in common in terms of alleles and their frequencies. Nei mathematically transformed I12 into a distance measure by taking the negative of the natural logarithm of identity:

$$ D_{12} = -\ln I_{12} $$

This population genetic distance ranges from zero when the populations share all alleles in common and at the same allele frequencies (I12 = 1) to infinity when the populations share no alleles in common (I12 = 0). When data from multiple loci exist, Nei (1972) recommended that the j’s be averaged over all loci, including monomorphic loci (loci with only one allele, thereby ensuring that all j’s are one at monomorphic loci). These average j’s are then used to calculate an overall identity, which is then transformed to yield an overall genetic distance.

Hillis (1984) pointed out that averaging the j’s across loci can sometimes lead to distances that make little sense biologically. For example, suppose that two populations are fixed for the same allele at one locus (and hence all the j’s are 1 and I12 = 1), and, at a second locus, they share no alleles, but each population is polymorphic for two alleles each with a frequency of 0.5. At this second locus, j12 = 0 and I12 = 0 because they share no alleles, but j1 = j2 = 0.5. Hence, across both loci, the average j12 = 1/2(1 + 0) = 0.5, and the average j1 = the average j2 = 1/2(1 + 0.5) = 0.75. Using these average values of the j’s, I12 is calculated to be 0.5/0.75 = 0.667 and D12 = 0.41.

Now, consider another case in which the first locus is polymorphic with both populations sharing two alleles and with each allele in each population having a frequency of 0.5. In this case, all the j’s are 0.5 in value, and I12 for this locus has a value of 1. At the second locus in this case, each population is fixed for a different allele, so j12 = 0 and j1 = j2 = 1. In this second case, the average j12 = 1/2(0.5 + 0) = 0.25 and the average j1 and average j2 are both 1/2(0.5 + 1) = 0.75. Using these average j’s, I12 = 0.25/0.75 = 0.333 and D12 = 1.1. In this case and in the previous case, the two populations are both completely identical at the first locus and completely different at the second, yet they have very different genetic distances (0.41 versus 1.1) as originally defined by Nei (1972). Hillis (1984) points out that situations like this are likely to occur when different loci have different overall rates of evolution. To make the population genetic distance measure robust to this heterogeneity in evolutionary rates across loci, Hillis (1984) recommends that the I’s be averaged across loci, not the j’s. For example, in the first case where the populations are identical at the monomorphic locus and different at the polymorphic one, the average I12 = 1/2(1 + 0) = 0.5, yielding a population genetic distance of 0.69. In the second case where the populations are identical at the polymorphic locus and different at the monomorphic one, the average I12 = 1/2(1 + 0) = 0.5, yielding the same population genetic distance of 0.69. Therefore, it is better to average the I’s and not the j’s in calculating this type of population genetic distance from multi-locus data when there is rate heterogeneity across loci.

Source: Modified from Nei (1972).
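The two averaging schemes described in Box 6.1 can be compared directly. The sketch below is illustrative only: the function names and the dictionary representation of allele frequencies are assumptions, and the allele labels in the two cases are arbitrary.

```python
import math


def locus_identities(p1, p2):
    """Return (j1, j2, j12) for one locus; p1 and p2 map allele -> frequency."""
    j1 = sum(f * f for f in p1.values())
    j2 = sum(f * f for f in p2.values())
    j12 = sum(f * p2.get(allele, 0.0) for allele, f in p1.items())
    return j1, j2, j12


def distance_from_identity(identity):
    """D = -ln(I), with an infinite distance when no alleles are shared."""
    return -math.log(identity) if identity > 0 else math.inf


def nei_distance(loci):
    """Average the j's across loci first (Nei 1972), then compute I and D."""
    js = [locus_identities(p1, p2) for p1, p2 in loci]
    n = len(js)
    j1 = sum(j[0] for j in js) / n
    j2 = sum(j[1] for j in js) / n
    j12 = sum(j[2] for j in js) / n
    return distance_from_identity(j12 / math.sqrt(j1 * j2))


def hillis_distance(loci):
    """Average the per-locus I's across loci (Hillis 1984), then compute D."""
    identities = [j12 / math.sqrt(j1 * j2)
                  for j1, j2, j12 in (locus_identities(p1, p2) for p1, p2 in loci)]
    return distance_from_identity(sum(identities) / len(identities))


# Case 1: identical at a monomorphic locus, no shared alleles at a polymorphic one.
case1 = [({"A": 1.0}, {"A": 1.0}),
         ({"B": 0.5, "C": 0.5}, {"D": 0.5, "E": 0.5})]
# Case 2: identical at a polymorphic locus, fixed for different alleles at the other.
case2 = [({"A": 0.5, "a": 0.5}, {"A": 0.5, "a": 0.5}),
         ({"B": 1.0}, {"C": 1.0})]

for label, loci in (("case 1", case1), ("case 2", case2)):
    print(label, round(nei_distance(loci), 2), round(hillis_distance(loci), 2))
```

Averaging the j's gives different distances for the two cases, whereas averaging the I's gives the same distance for both, which is the point made by Hillis (1984).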

necessary for this population genetic distance to take on its maximum value. For example, suppose an ancestral population was polymorphic for two alleles, A and a, at an autosomal locus. Now assume that the ancestral population split into two isolates, with one isolate becoming fixed for A and the other for a. Then, the pairwise fst for these two isolates would be 1 (and the Nei’s distance in Box 6.1 would be infinite), the maximum value possible, even though not a single mutation occurred. Population genetic distances should never be confused with molecule genetic distances.

In the isolation by distance models, let fst(x) be the pairwise fst between two populations x steps apart or x geographical units apart. Malécot (1950) has shown that:

$$ f_{st}(x) = \frac{e^{-x\sqrt{2 m_\infty / m_1}}}{1 + 4N_{ev}\sqrt{2 m_1 m_\infty}} \quad \text{when } m_\infty \ll m_1 \qquad (6.41) $$

for the discrete, one-dimensional stepping stone model, and for the continuous habitat models:

$$ f_{st}(x) \approx \frac{e^{-x\sqrt{2 m_\infty}/\sigma}}{1 + 4\delta\sigma\sqrt{2 m_\infty}} \quad \text{for 1 dimensional habitats} $$

$$ f_{st}(x) \approx \frac{e^{-x\sqrt{2 m_\infty}/\sigma}}{\left[1 + \dfrac{8\pi\delta\sigma^2}{-\ln(2 m_\infty)}\right]\sqrt{x}} \quad \text{for 2 dimensional habitats} \qquad (6.42) $$

In general, the isolation by distance models can be approximated by an equation of form (Malécot 1950):

$$ f_{st} = a e^{-bx} x^{-c} \qquad (6.43) $$

where c = 1/2(dimensionality of the habitat −1), and where a and b are estimated from the fst(x) data. The parameter c can also be estimated from the observed pairwise population genetic distances when it is not obvious what the dimensionality of the habitat may be. For example, a species may be distributed over a two-dimensional habitat, but more constraints on movement may occur

[Figure 6.20 plot: pairwise genetic distance (y-axis, –0.20 to 0.30) against geographic distance (x-axis, –1000 to 4000).]

Figure 6.20 Isolation by distance among the glade populations of collared lizards on Stegall Mountain. The y-axis is the pairwise fst genetic distances between two glade populations, and the x-axis is the distance in meters between the two glades. Source: Modified from Neuwald and Templeton (2013).

in one direction than another. Hence, noninteger dimensions between one and two are biologically meaningful. Equation (6.43) and similar results from more complex models all indicate that isolation by distance results in the population genetic distance between a pair of populations being an increasing function of the geographical distance between the pair.

The collared lizards on Stegall Mountain provide an example. By 1999, a pattern of isolation by distance had arisen that persisted into the stable metapopulation period. Figure 6.20 shows the pattern observed in 2004 between pairwise fst and pairwise geographical distance. Because pairwise data points are generally not independent, testing for a significant association between genetic and geographic distances is done through various types of permutation testing, such as the Mantel test that is effective for detecting isolation by distance (Kierepka and Latch 2015). The result shown in Figure 6.20 was highly significant, with a p-value of 0.002. Hence, the restricted dispersal distances shown in Figure 6.17 did indeed translate into a significant isolation by distance gene flow pattern in the Stegall Mountain collared lizards.

Collared lizards also display isolation by distance on a larger geographical scale. Figure 6.21 shows a plot of pairwise fst versus geographical distance for collared lizards from western Texas and Oklahoma, where they are distributed in a more continuous fashion in a predominantly non-forested region, and a plot of the pairwise fst’s for northeastern Ozark populations (Hutchison 2003). In western Texas and Oklahoma, the fst values gradually rise with increasing geographical distance, reaching a value averaging around 0.15 for the longest geographical distances. This is the classic pattern for isolation by distance, and it shows that collared lizards disperse well over relatively long distances in the absence of forests.
The northeastern Ozark populations represent a dramatic outlier to this isolation by distance pattern. The fst values range from near 0 to 0.9 in the northeastern Ozarks and show no association with geographical distance, with


the most distant populations being only 90 km from one another. This scattershot of fst values in the Ozarks is likely due to the recent fragmentation of glades from one another due to woodland fire suppression. The resulting unburned forest barrier caused a dramatic divergence of even nearby populations due to genetic drift without gene flow in small, isolated glade populations. Figure 6.21 also serves as a warning that a species can have different population structures in different regions of its distribution. The fragmentation effects shown in Figure 6.21 and the dramatic change in dispersal probabilities after prescribed woodland burning began in the Ozarks are indicative of a different type of isolation: isolation by resistance. Isolation by resistance emerges because organisms have different abilities to traverse certain landscape features and habitat types. As noted earlier, only one marked collared lizard traversed the 50 m of unburned woodland separating glades SM-7 and SM-8 (Figure 6.5) in 10 years. Once the woodland was burned, lizards frequently traversed this same distance (Figures 6.10 and 6.12) – indeed, now kilometers of woodland were no longer a barrier to dispersal (Figure 6.17). Obviously, geographical distance alone is not the only factor influencing the amount of dispersal and gene flow. The environmental features of the landscape play a major role, and this role can be quantified by attributing different resistances to dispersal as a function of different landscape features. Dealing with resistance takes us into the domain of landscape genetics, an area that combines landscape features with genetic data to infer resistance models (Fourtune et al. 2018), with isolation by distance frequently regarded as the baseline model with uniform resistance of all landscape features (van Strien et al. 2015). 
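The permutation logic behind a Mantel test, mentioned above in connection with Figure 6.20, can be sketched briefly. This is a simplified one-sided version written only with the standard library. It is not the procedure used in the cited studies, and the example matrices are hypothetical.

```python
import math
import random


def upper_triangle(matrix, order):
    """Pairwise entries above the diagonal, with rows and columns taken in `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(genetic, geographic, n_perm=999, seed=0):
    """Return (observed r, one-sided p-value) for a positive association.

    Population labels of the genetic matrix are shuffled jointly over rows
    and columns, which respects the non-independence of pairwise entries.
    """
    rng = random.Random(seed)
    labels = list(range(len(genetic)))
    geo = upper_triangle(geographic, labels)
    r_obs = pearson(upper_triangle(genetic, labels), geo)
    hits = 0
    for _ in range(n_perm):
        order = labels[:]
        rng.shuffle(order)
        if pearson(upper_triangle(genetic, order), geo) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Hypothetical example: six populations along a transect (positions in km),
# with genetic distance rising with separation plus small arbitrary offsets.
positions = [0, 1, 3, 6, 10, 15]
geographic = [[abs(a - b) for b in positions] for a in positions]
offsets = [0.00, 0.01, -0.01, 0.02, 0.00, -0.02]
genetic = [[0.0 if i == j else 0.01 * geographic[i][j] + abs(offsets[i] - offsets[j])
            for j in range(6)] for i in range(6)]
r, p = mantel_test(genetic, geographic)
print(f"Mantel r = {r:.3f}, one-sided p = {p:.3f}")
```

The p-value counts the observed arrangement as one of the permutations, so it can never be smaller than 1/(n_perm + 1).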
The Mantel test is often not as effective in detecting isolation by resistance as it is for isolation by distance (Kierepka and Latch 2015; Legendre et al. 2015), so a variety of analytical methods have been derived for performing landscape genetics (Dyer 2015; Kierepka and

[Figure 6.21 plot: pairwise fst (y-axis, 0.0 to 1.0) against distance between populations in kilometers (x-axis, 0 to 700), with separate point series for the Northeastern Ozarks and for Western Texas and Oklahoma.]

Figure 6.21 A plot of pairwise fst versus geographical distance for populations of collared lizards from the southwestern United States (solid diamonds) and for populations from the northeastern Ozarks (open circles). Source: Based on data from Hutchison (2003).


Latch 2015; Lundgren and Ralph 2019; Peterson et al. 2019). One approach is circuit theory that models dispersal between populations as current flowing through an electrical circuit composed of nodes and resistors (McRae 2006; McRae et al. 2008; Hanks and Hooten 2013). Populations are nodes, with the current produced related to the number of dispersers from the population. The resistance to current flowing between nodes is determined by the characteristics of the landscape. Circuit analysis better reflects the landscape as experienced by dispersing individuals by incorporating alternative movement pathways and the influence of matrix heterogeneity into predictions of overall connectivity. Circuit analysis has proven to be a flexible and robust method that is extensively used in conservation biology (Dickson et al. 2019). Conley et al. (2021) applied circuit analysis to the collared lizard populations on Stegall Mountain during the stable metapopulation phase. As we have already seen, significant isolation by distance had already been established during this demographic phase (Figure 6.20), but isolation by distance explained less than 10% of the variation in genetic distances between glade populations. The null model was isolation by distance for which the entire landscape was assigned a uniform resistance. Previous analyses (Templeton et al. 2011; Neuwald and Templeton 2013) had indicated that slope was important for dispersal, so slope was incorporated as an environmental feature into the circuit analysis. Many circuit analyses treat the nodes (in this case glades) as homogeneous, but we have already seen that the degree of visual openness as measured by the log residual correlation length has a significant impact on the decision to disperse from a glade. 
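The core quantity in circuit analysis, the effective resistance between two nodes, can be illustrated on a toy network. The sketch below is not the software or the landscape model used in the cited studies. It assumes a connected network of hypothetical glades, assigns each link a resistance, grounds one node, injects one unit of current at the other, and reads off the resulting voltage.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def effective_resistance(n_nodes, resistors, source, sink):
    """Effective resistance between two distinct nodes of a connected network.

    `resistors` is a list of (node_a, node_b, resistance) links.
    """
    laplacian = [[0.0] * n_nodes for _ in range(n_nodes)]
    for a, b, resistance in resistors:
        conductance = 1.0 / resistance
        laplacian[a][a] += conductance
        laplacian[b][b] += conductance
        laplacian[a][b] -= conductance
        laplacian[b][a] -= conductance
    keep = [i for i in range(n_nodes) if i != sink]        # ground the sink at 0 V
    reduced = [[laplacian[i][j] for j in keep] for i in keep]
    current = [1.0 if i == source else 0.0 for i in keep]  # inject unit current
    voltage = solve_linear(reduced, current)
    return voltage[keep.index(source)]


# Hypothetical landscape: glades 0 and 2 are linked through glade 1 over a
# steep (high-resistance) slope, with or without a gentler detour via glade 3.
steep_only = [(0, 1, 4.0), (1, 2, 4.0)]
with_detour = steep_only + [(0, 3, 3.0), (3, 2, 3.0)]
print(effective_resistance(3, steep_only, 0, 2))
print(effective_resistance(4, with_detour, 0, 2))
```

Adding the detour lowers the effective resistance between glades 0 and 2 even though the direct route is unchanged, which is how circuit analysis accounts for alternative movement pathways.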
This behavior related to an internal landscape feature of a glade (its visual openness) was incorporated into the circuit analysis by weighting the amount of current produced at each glade/node by the log residual correlation length. The various models produced by all possible combinations of these variables were simulated and tested against the pairwise fst’s between glade populations using the genetic data from 2000 to 2003, the years of the most extensive genetic sampling. The statistically significant, best fitting model that emerged is shown in Figure 6.22. This model incorporates increasing resistance to dispersal with increasing steepness of the slope and also the impact of the internal visual openness of a glade upon generating dispersal current. As can be seen from Figure 6.22, the amount of current (gene flow) across this landscape deviates substantially from a simple isolation by distance model, with some glades showing much gene flow over distances that have little gene flow on other parts of the mountain. This resistance model explains significantly more of the variance in fst’s between glades than the null model of isolation by distance or a resistance model that uses only slope (Conley et al. 2021).

One limitation of the above circuit analysis and other methods in landscape genetics is that they require discrete local populations (nodes in circuit analysis). However, many species have a continuous or semi-continuous distribution over their range, or at least parts of their range. One solution is to use a two-dimensional stepping stone model overlaid upon the continuous parts of the range with such a fine grid of nodes that it approximates a continuous distribution. Petkova et al.
(2016) used a genetic distance between two individuals (hence, no a priori population categories are needed) sampled from a fine grid and generated an expected individual genetic distance by simulating and integrating over all possible migration pathways in the stepping stone grid. Landscape features are not directly incorporated into these simulations, but a resistance distance is estimated by maximizing the fit of the expected genetic distances of two points through multiple simulations to the observed average individual genetic distances. These resistance distances are then interpolated across the map of the geographical area to produce an estimated effective migration surface (EEMS) similar to that shown in Figure 6.22. Effective migration in this case is not m, the rate of gene flow, but is more closely related to δσ² in Eq. (6.42). Al-Asadi et al. (2019) built upon the EEMS approach, but with some significant differences. Instead of using a measure of genetic distance

Gene Flow and Population Subdivision

Figure 6.22 Circuit analyses of gene flow among collared lizard populations on Stegall Mountain. The brighter the area, the more predicted gene flow through that area. Source: From Figure 6.6 in Conley et al. (2021).

between two individuals, Al-Asadi et al. (2019) used shared identity-by-descent segments between two individuals (Chapter 3), as previously done by Ringbauer et al. (2017, 2018). There are two advantages to using identity-by-descent segments. First, this results in a haplotype-based approach, and identity-by-descent of haplotypes is more robust to mutational models than single-nucleotide markers (Chapter 5). Second, recombination breaks up the identity-by-descent segments into smaller and smaller lengths as time progresses (Chapter 3), so by examining different length classes of identity-by-descent segments, one can estimate gene flow patterns in different time periods, both recent and historic. Using a likelihood model (Appendix B), Al-Asadi et al. (2019) estimated separate maps of δ and σ² upon two-dimensional grids, and therefore called this technique Migration And Population-size Surfaces (MAPS). When Al-Asadi et al. (2019) compared their results to the results of EEMS on the same data set (humans from western Europe), they found nontrivial differences. Part of this arose because EEMS is a surface of δσ² whereas MAPS produces separate surfaces for these two parameters. Another difference is the use of an individual genetic distance versus shared identity-by-descent segments. Converting the shared identity-by-descent segments into a genetic distance and using that distance for EEMS analysis reduced but did not eliminate the discrepancies. This implies that either differences in the underlying likelihood models are important or that the information in a genetic distance based on identity-by-descent segments is not the same as a direct analysis of identity-by-descent segments themselves. More work needs to be done to help clarify these issues. Another method of using haplotypes to enhance resolution is to use coalescent theory when haplotype trees are available. As can be seen from Eq.
(6.10), information about coalescent times within and between populations contains information about gene flow, and this has proven to be a higher resolution source of inference on gene flow than just allele or haplotype frequencies alone. For example, Crandall et al. (2019) reanalyzed mtDNA data on 41 marine species sampled throughout the Hawaiian archipelago that extends linearly more than 2500 km in the Pacific Ocean. Because of


pelagic larvae and current patterns, an isolation by distance pattern was hypothesized. However, using fst’s, only four species showed a significant isolation by distance pattern. Coalescent times can define a different type of pairwise Fst (Eq. (6.12)). By sampling coalescent times, 70% of the species showed a significant isolation by distance pattern from the same data set. They also performed simulations that indicated that coalescent sampling in a stepping stone model could detect isolation by distance in nearly 100% of the cases even with 100 migrants per generation, corresponding to an Fst of 0.002 from Eq. (6.8).

Total Effective Population Size in Subdivided Populations

So far, we have mostly considered the effects of genetic drift at the local level, with the exception of Figure 6.15. For example, the variance and inbreeding effective sizes invoked in all the models given previously in this book apply to local demes, and not the total population. Because these local populations are genetically interconnected, it is of great interest to see what impact limited gene flow and population subdivision have upon the evolutionary properties of the reproductive community as a whole. The mathematics for this problem can become complicated, so only some basic results are given here. More details and results are given in Crow and Maruyama (1971) and Maruyama (1972, 1977). Consider first the impact of population subdivision on the long-term inbreeding effective size of the total reproductive community, NefT. A long-term effective size assumes a reference population in the distant past and that sufficient time has elapsed for an equilibrium to have been established among the relevant evolutionary forces (here, genetic drift, mutation, and gene flow under neutrality). If we also assume random mating within each local deme, then the average probability of identity-by-descent among the individuals of this subdivided population is Fst (e.g. Eqs. 6.13 and 6.14). Using the basic definition of an equilibrium inbreeding effective size, we have:

$$F_{eqT} = F_{st} = \frac{1}{1 + 4N_{efT}(m + \mu)} \tag{6.44}$$

where μ is the mutation rate. Equation (6.44) yields the long-term inbreeding effective size of the total population to be:

$$N_{efT} = \frac{1}{4(m + \mu)} \cdot \frac{1 - F_{st}}{F_{st}} \tag{6.45}$$
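The step from Eq. (6.44) to Eq. (6.45) is a simple inversion. It is written out here for completeness, using the same symbols as those two equations:

$$
\begin{aligned}
F_{st} = \frac{1}{1 + 4N_{efT}(m + \mu)}
\;&\Longrightarrow\;
\frac{1}{F_{st}} = 1 + 4N_{efT}(m + \mu) \\
\;&\Longrightarrow\;
4N_{efT}(m + \mu) = \frac{1}{F_{st}} - 1 = \frac{1 - F_{st}}{F_{st}} \\
\;&\Longrightarrow\;
N_{efT} = \frac{1}{4(m + \mu)} \cdot \frac{1 - F_{st}}{F_{st}}
\end{aligned}
$$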

Note that the total inbreeding effective size is a decreasing function of Fst. Hence, the more subdivided a population is, the higher the chance of identity-by-descent under random mating at both the local and total levels, and thus the smaller the total inbreeding effective size. Consider the special case of an equilibrium island model with an infinite number of local demes. Under this model, each local deme has an inbreeding effective size of Nef, but because there are an infinite number of local demes, the total population size is also infinite. If there were no population subdivision, the total inbreeding effective size should also be infinite. But when there is subdivision, we can substitute Eq. (6.14) into the Fst term of Eq. (6.45) to yield:

$$N_{efT} = N_{ef}\,\frac{\mu + m}{\mu} \tag{6.46}$$
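Equation (6.46) is easy to explore numerically. The following is a minimal sketch rather than anything taken from the source; the function name and the local effective size of 50 are hypothetical choices made only for illustration. It shows how quickly the ratio of total to local inbreeding effective size grows as the gene flow rate m rises above zero for a fixed mutation rate.

```python
def total_inbreeding_effective_size(n_ef, m, mu):
    """Eq. (6.46): N_efT = N_ef * (mu + m) / mu (infinite island model)."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    if m < 0:
        raise ValueError("m cannot be negative")
    return n_ef * (mu + m) / mu


n_ef = 50   # hypothetical local inbreeding effective size
mu = 1e-5   # mutation rate
for m in (0.0, 0.0001, 0.001, 0.01):
    ratio = total_inbreeding_effective_size(n_ef, m, mu) / n_ef
    print(f"m = {m:<7} N_efT / N_ef = {ratio:.0f}")
```

With m = 0 the ratio is one, so the total size collapses to the local size. Each tenfold increase in m then raises the ratio roughly tenfold once m is much larger than μ.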


When there is no gene flow (m = 0) and each local deme is an isolate, then the total inbreeding effective size is simply the local inbreeding effective size. This observation has important implications in conservation biology. When a species becomes fragmented into completely isolated subpopulations, pedigree inbreeding will accumulate due to drift at a rate determined by the local population sizes within the fragments, not the total species population size. Thus, the total inbreeding effective size of a fragmented species can be much smaller than the census size. When there is gene flow, (μ + m)/μ > 1, and the total inbreeding effective size is larger than the local inbreeding effective size under the island model. Crow and Maruyama (1971) showed that in general NefT is larger than Nef but smaller than the total population size when m > 0. Note also that the larger the gene flow rate m, the larger the total inbreeding effective size. Given that the mutation rate μ is typically small, even a small degree of gene flow can result in a total inbreeding effective size that is many orders of magnitude larger than the total inbreeding effective size under complete isolation. For example, let μ = 10⁻⁵ and m = 0.01; then from Eq. (6.46) we have NefT = 1001Nef, versus NefT = Nef when m = 0. This illustrates that even low levels of genetic interchange among fragmented subpopulations can greatly increase the inbreeding effective size and reduce the rate of pedigree inbreeding, outcomes that are frequently a high conservation priority.

Although the total inbreeding effective size tends to decrease with increasing population subdivision (Crow and Maruyama 1971), the opposite is generally true for the total variance effective size. The total variance effective size measures how much variance in allele frequency is induced by genetic drift for the population as a whole and not for each local deme.
For example, Wright (1943) considered a finite island model in which an otherwise ideal population is subdivided into n local populations, each of ideal size N, with a gene flow rate of m. The total population size in this model is nN, and if the population were panmictic, its variance effective size would be nN. Wright showed that the total variance effective size in this case is

$$N_{evT} = \frac{nN}{1 - f_{st}} \tag{6.47}$$

When the population is indeed panmictic with no subdivision, fst = 0, then the total variance effective size equals the census size, as expected for this idealized population. However, note that when the population is subdivided due to restricted gene flow, fst > 0, so 1 − fst < 1 and NevT > nN. Thus, a subdivided population has a total variance effective size larger than the census size in this case! In general, a subdivided population has a variance effective size larger than the sum of the variance effective sizes of the isolates. We have an example of this effect of subdivision with the collared lizards. Recall that the strength of genetic drift in inducing variance in allele frequencies was moderate in the collection of all three isolated translocated populations in the pre-burn phase (Figure 6.15) while being strong within at least two of these isolates, SM-7 and SM-8. This effect can be examined more directly by converting the measures of strength of genetic drift (Eq. (6.34)) into variance effective sizes. Because individual lizards can survive and reproduce over several years, these lizard populations have overlapping generations. Using an analog of an estimator of variance effective size from untransformed allele frequencies in populations with overlapping generations (Charlesworth 1980), an estimator with the transformed allele frequencies for an autosomal, diploid locus is:

$$N_{ev} = \frac{t_{ij}}{8\,E(Gd_{ij})} \tag{6.48}$$


Table 6.3 Estimates of the variance effective size, Nev, for three translocated collared lizard populations six years (SM-8) and seven years (SM-7, SM-9) after translocation during the pre-burn phase and for the total population (all three glade populations combined). Also given are the estimated census sizes, N, at the sixth or seventh year after translocation and the ratio of the effective size to the census size.

Population          Nev      N        Nev/N
SM-7                3.76     10.24    0.38
SM-8                1.98     13.66    0.14
SM-9                7.30     35.57    0.21
Total Population    25.94    59.47    0.44

Source: Derived from data in Templeton et al. (2011) and Neuwald and Templeton (2013).

where tij is the time in generations between the sampling points i and j. It turns out that the generation times for both males and females change across the three demographic phases, so tij is given by:

$$t_{ij} = \frac{ty_{ij}}{\tfrac{1}{2}\left(g_f + g_m\right)} \tag{6.49}$$

where tyij is the time in years between sampling points i and j, gf is the generation length of females in years, and gm is the generation length of males in years. Table 6.3 shows the estimated variance effective sizes for each of these three translocated populations and for the total pooled population during the pre-burn phase obtained by substituting Eq. (6.49) into (6.48). As expected from the founder effect, all three of the translocated populations have a relatively small Nev, with SM-8 having the smallest variance effective size and SM-9 the largest. This is to be expected from their varied demographic histories, as SM-8 had a declining population that went down to three to four individuals, SM-7 had little population growth for many years, and SM-9 had extensive population growth. There are no surprises in Table 6.3 for these variance effective size estimates for the local populations. However, notice that the variance effective size for the total population is larger than the sum of the variance effective sizes of all of its component populations (25.94 versus a sum of 13.04). Just as Eq. (6.47) reveals, the whole is greater than the sum of its parts when dealing with variance effective size. Moreover, the ratios of variance effective size to census size are larger for the total population than for any of the three isolates. This ratio is expected to increase with increasing number of isolates, and as Eq. (6.47) shows, this ratio can even exceed one (that is, the effective size can be larger than the census size at the total population level). Also, if each population continued as an isolate, we would expect fst to eventually approach 1 as each isolate became fixed for a new mutational lineage. Note that as fst approaches 1, the total variance effective size goes to infinity even though the total census size is finite. 
Hence, studies on local effective sizes may not be informative about how the total population size of a species is responding to genetic drift when there is population subdivision. We already saw this earlier with respect to the rate of loss of alleles: the much larger stable metapopulation with little subdivision lost alleles more rapidly than the pre-burn, highly subdivided population of small size. Equation (6.48) can also be used to estimate the total variance effective sizes through all three demographic phases of the Stegall Mountain collared lizard population. The inverse of the strengths of genetic drift shown in Figure 6.15 can be converted into total variance effective size estimates after adjustment for generation time (Eqs. 6.48 and 6.49). These generation times differed for


the three demographic phases, being 3.34 years for females and 3.68 years for males during the pre-burn phase, 3.08 years for females and 3.07 years for males during the colonizing phase, and 2.65 years for females and 3.32 years for males during the stable metapopulation phase. Figure 6.23 plots the ratio of the total variance effective size to the total census size throughout this management program. As can be seen from Figure 6.23 and from Table 6.3, this ratio was high (0.44) during the pre-burn phase of near total isolation. This ratio plunged to low values when gene flow was established at the onset of burning in 1994 and then gradually increased as the total population size and number of glade populations increased with increasing fst values (Figures 6.6 and 6.9), only to change erratically at the beginning of the stable metapopulation phase, but finally stabilizing at a value of around 0.6 (Figure 6.23). Overall, this ratio was highest during the pre-burn and metapopulation phases even though the pre-burn phase had the lowest census size and the metapopulation phase had the highest census size (Figure 6.6). The isolation in the pre-burn phase explains its high ratio despite there being only three subpopulations and small population sizes, and the larger number of glade populations with significant subdivision as shown by fst (Figure 6.9) explains the high ratio through Eq. (6.47) during the stable metapopulation phase. Subdivision as measured by fst plummeted when gene flow suddenly increased in 1994, and so did the Nev/N ratio (Figure 6.23) as expected from Eq. (6.47). Subdivision as measured by fst gradually increased during the colonizing phase (Figure 6.9), and so did the Nev/N ratio (Figure 6.23) along with increasing numbers of local populations (Figure 6.7).
Dispersal increased again in 1999, and, once again, the Nev/N ratio plummeted initially but then behaved erratically until finally stabilizing at a high ratio, reflecting a large number of subpopulations with significant genetic subdivision. In general, the Nev/N ratio goes up with less gene flow (the pre-burn phase) and more subpopulations with significant subdivision (the stable phase) and is reduced by increases in gene flow with less subdivision, as occurred at the beginning and end of the colonization phase (Figure 6.9). Obviously, population structure has a major impact on the relationship of the total variance effective size to the census size.
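The arithmetic behind Eqs. (6.48) and (6.49) and Table 6.3 can be sketched in a few lines. The generation lengths and the table entries below are the values quoted in the text; the function names, the seven-year interval, and the value E(Gd) = 0.02 are hypothetical and serve only to show how the pieces fit together.

```python
def generations_elapsed(years, g_female, g_male):
    """Eq. (6.49): years divided by the mean of the sex-specific generation lengths."""
    return years / (0.5 * (g_female + g_male))


def variance_effective_size(t_generations, expected_gd):
    """Eq. (6.48): N_ev = t_ij / (8 * E(Gd_ij))."""
    if expected_gd <= 0:
        raise ValueError("expected_gd must be positive")
    return t_generations / (8.0 * expected_gd)


# Generation lengths in years (females, males) quoted in the text
generation_lengths = {
    "pre-burn": (3.34, 3.68),
    "colonizing": (3.08, 3.07),
    "stable metapopulation": (2.65, 3.32),
}
for phase, (g_f, g_m) in generation_lengths.items():
    t = generations_elapsed(1.0, g_f, g_m)
    print(f"{phase}: one year = {t:.3f} generations")

# Table 6.3: local estimates versus the pooled estimate
local_nev = {"SM-7": 3.76, "SM-8": 1.98, "SM-9": 7.30}
total_nev, total_n = 25.94, 59.47
sum_local_nev = sum(local_nev.values())
print(f"sum of local Nev = {sum_local_nev:.2f}, pooled Nev = {total_nev}")
print(f"pooled Nev/N = {total_nev / total_n:.2f}")

# Hypothetical use of Eq. (6.48): E(Gd) = 0.02 is a made-up value
t_ij = generations_elapsed(7.0, 3.34, 3.68)
print(round(variance_effective_size(t_ij, expected_gd=0.02), 2))
```

The comparison of the pooled estimate with the sum of the local estimates reproduces the "whole is greater than the sum of its parts" pattern discussed for Table 6.3.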

[Figure 6.23 plot. Panels: Pre-Burn Phase, Colonizing Phase, Stable Metapopulation Phase. Vertical axis: Nev/N, from 0 to 1. Horizontal axis: Year, with tick labels "6/7" and 1994 through 2006.]

Figure 6.23 The 2-year moving average of the ratio of the total variance effective size (Nev) to total census size (N) over time for the collared lizard total population on Stegall Mountain.


Nunney (1999) extended Wright’s results to the case where there is local inbreeding measured by fis to obtain:

$$N_{evT} = \frac{nN}{(1 + f_{is})(1 - f_{st})} \tag{6.50}$$
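Equations (6.47) and (6.50) can be compared directly in a short sketch. The function name and the values of n, N, fst, and fis used below are hypothetical; setting fis = 0 recovers Wright's Eq. (6.47) as a special case of Nunney's Eq. (6.50).

```python
def total_variance_effective_size(n_demes, deme_size, f_st, f_is=0.0):
    """Eq. (6.50); with f_is = 0 it reduces to Eq. (6.47)."""
    if not 0.0 <= f_st < 1.0:
        raise ValueError("f_st must lie in [0, 1)")
    if not -1.0 < f_is <= 1.0:
        raise ValueError("f_is must lie in (-1, 1]")
    census = n_demes * deme_size
    return census / ((1.0 + f_is) * (1.0 - f_st))


# Hypothetical population: 10 demes of ideal size 100 (census size 1000)
for f_st in (0.0, 0.1, 0.5):
    for f_is in (-0.1, 0.0, 0.1):
        n_evt = total_variance_effective_size(10, 100, f_st, f_is)
        print(f"f_st = {f_st:.1f}, f_is = {f_is:+.1f}: N_evT = {n_evt:.1f}")
```

Under these assumed values, raising fst increases the total variance effective size above the census size, and a negative fis increases it further.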

Notice that avoidance of system of mating inbreeding at the local deme level (fis < 0) further increases the total variance effective size, whereas local system of mating inbreeding decreases it. By contrasting Eqs. (6.45)–(6.47), we see that increasing population subdivision reduces total inbreeding effective size but increases total variance effective size. This paradoxical effect of genetic subdivision is also seen with the rate of loss of alleles due to drift at the level of the total reproductive community. For example, the rate of loss of alleles per generation due to drift in a two-dimensional continuous model of isolation by distance in an otherwise idealized population of total size NT is expected to be 1/(2NT) if δσ² is greater than or equal to one (Maruyama 1972). That is, with sufficient gene flow (δσ² ≥ 1), the total reproductive community is losing its allelic variation at the same rate as a single panmictic deme of idealized size NT. However, when gene flow is decreased such that δσ² < 1, then the rate of loss of allelic variation becomes δσ²/(2NT) < 1/(2NT). Hence, the eigenvalue effective size, which is defined in terms of this rate of loss, is Nee = NT/(δσ²) > NT when δσ² < 1. Thus, a continuous population of total size NT with restricted gene flow has lower overall rates of loss of allelic variation than a panmictic population of total size NT, and its eigenvalue effective size can exceed the census size.

The above models raise an important question: How can population subdivision increase the amount of pedigree inbreeding while simultaneously slowing down the rate of loss of genetic variation and reducing the variance in allele frequencies at the total population level? This paradox is resolved when we recall that any evolutionary force, including genetic drift, can have an impact only when there is genetic variation.
As a species becomes subdivided due to restricted gene flow, more and more of its local demes can become temporarily fixed for a particular allele. Such fixation increases the overall level of homozygosity due to identity-by-descent in the total population. However, the subset of the local demes that are fixed for a single allele at a given time is immune to frequency changes caused by drift because they have no genetic variation at that locus. Because of the random nature of drift, it is unlikely that each of the local demes that is fixed for some allele are all fixed for exactly the same allele. Thus, at any given time at the total population level, the genetic variation that is preserved as fixed differences between local areas is immune to loss due to genetic drift in the total population, thereby reducing the overall rate of loss of genetic variation relative to the total panmictic case. Moreover, the overall allele frequencies in the subset of the total population that are fixed for some allele cannot change due to drift, so the overall variance in allele frequencies is reduced relative to the total panmictic case. The effect of genetic subdivision on increasing total variance effective sizes and slowing down the loss of alleles can be quite large. For example, in an island model in which the effective number of migrants (Nevm) is 0.5 (the exchange of one effective individual every other generation), the variance of allele frequencies in the total population will be half of what it would have been under panmixia, corresponding to a doubling of the total variance effective size over total census size (Eq. (6.45)). As we saw in Chapter 4, we have to be very careful when dealing with effective sizes to define both what type of size we are measuring and what our reference population is in time. The results given here indicate that we also need to be careful in defining our spatial reference. 
The impact of an evolutionary force upon local demes can be just the opposite of its impact on a collection of local demes.
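The continuous isolation-by-distance result quoted above from Maruyama (1972) amounts to a piecewise rule, sketched below. The function names and the population size of 1000 are hypothetical. Setting the eigenvalue effective size equal to NT when δσ² ≥ 1 is an inference from the statement that the rate of loss then equals that of a panmictic deme of size NT.

```python
def allelic_loss_rate(total_size, delta_sigma2):
    """Per-generation rate of loss of allelic variation due to drift.

    Equals 1/(2*NT) when delta*sigma^2 >= 1 and
    delta*sigma^2/(2*NT) when delta*sigma^2 < 1.
    """
    if total_size <= 0 or delta_sigma2 <= 0:
        raise ValueError("total_size and delta_sigma2 must be positive")
    return min(1.0, delta_sigma2) / (2.0 * total_size)


def eigenvalue_effective_size(total_size, delta_sigma2):
    """N_ee = NT / (delta*sigma^2) when delta*sigma^2 < 1, otherwise NT."""
    if total_size <= 0 or delta_sigma2 <= 0:
        raise ValueError("total_size and delta_sigma2 must be positive")
    return total_size / min(1.0, delta_sigma2)


n_total = 1000  # hypothetical idealized total population size
for ds2 in (2.0, 1.0, 0.5, 0.1):
    rate = allelic_loss_rate(n_total, ds2)
    n_ee = eigenvalue_effective_size(n_total, ds2)
    print(f"delta*sigma^2 = {ds2}: loss rate = {rate:.6f}, N_ee = {n_ee:.0f}")
```

The sketch makes the spatial point of this section explicit: restricting gene flow below δσ² = 1 slows the loss of alleles in the total population even though it increases drift within local areas.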


The opposing effects of population subdivision upon individual pedigree inbreeding levels and total population levels of genetic variation can create difficult choices for conservation biologists. As noted earlier, a priority in many conservation management programs is to establish gene flow between fragmented isolates. Such gene flow has many beneficial effects from a population genetics perspective: it increases local inbreeding and variance effective sizes, it increases local levels of genetic variation and hence local adaptive flexibility, and it reduces the overall level of pedigree inbreeding, thereby minimizing the dangers of inbreeding depression. However, another goal of many management programs is to maintain high levels of genetic variation in the total population for long periods of time – the explicit goal of the 500 portion of the 50/500 rule (Chapter 4). This goal is more readily achieved by reducing gene flow and subdividing the population (Chesser et al. 1980, 1993; Chesser 1991). Thus, there is a trade-off, and management decisions require a careful assessment of what the priorities are for any given endangered species. Further complicating matters is that most of the genetic estimators of effective size (Chapter 4) are only appropriate for an isolated population (Ryman et al. 2019). When applied to subdivided populations, local effective sizes can be biased in an extreme manner when there is some gene flow, and different types of effective size can diverge considerably from one another (Ryman et al. 2019). The ubiquity of population subdivision, particularly in endangered species, further undercuts the applicability of the 50/500 rule in conservation biology (Chapter 4). There are no easy, universal answers.

Multiple Modes of Inheritance and Population Structure

Most of the theory presented in this chapter up to now has focused upon autosomal loci. However, we can study DNA regions with different modes of inheritance that are found in the same individuals. For example, many animals, including humans and fruit flies in the genus Drosophila, have DNA regions with four basic modes of inheritance (Chapter 1):

• Autosomal nuclear DNA regions, with diploid, biparental inheritance
• X-linked nuclear DNA regions, with haplo-diploid, biparental inheritance
• Y-linked nuclear DNA regions, with haploid, uniparental (paternal) inheritance
• Mitochondrial DNA, with haploid, uniparental (maternal in many species) inheritance

These different modes of inheritance interact strongly with the balance between genetic drift and gene flow that shapes genetic population structure. In our previous models, we assumed an autosomal system with diploid, biparental inheritance. As a consequence, an effective population size (of whatever type) of Ne individuals corresponds to an effective sample of 2Ne autosomal genes. As we saw earlier, the force of drift in this case is given by 1/(2Ne). In general, the force of drift is given by 1/(the effective number of gene copies in the population). An autosomal nuclear gene is only a special case in which the effective number of gene copies is 2Ne. The effective number of gene copies is not the same when we examine other modes of inheritance. For example, in a species with a 50 : 50 sex ratio (such as humans or Drosophila), Ne individuals correspond to an effective sample of (3/2)Ne X-linked genes. Because different modes of inheritance have different numbers of genes per individual, the force of drift is expected to vary across the different genetic systems that are coexisting in the same individuals. Indeed, we saw in Chapter 5 that the strength of drift as measured by coalescence times varies as a function of mode of inheritance (Figure 5.9). Now we note that variation in the force of drift directly induces variation in the balance of drift and gene flow. Therefore, we do not expect different DNA regions imbedded


within the same populations to display the same degree of genetic subdivision, even when the rate of gene flow is identical for all DNA regions. For example, if dealing with a haploid genetic system in a population of size Nev, Eq. (6.29) becomes:

$$f_{st} = \frac{1}{1 + 2N_{ev}m} \tag{6.51}$$

Note that the term “4Nev” in Eq. (6.29) has been replaced by “2Nev” in Eq. (6.51). This change by a factor of two is due to the fact that a diploid system in a population with variance effective size Nev actually is based upon an effective sample of 2Nev genes, whereas the haploid system only has an effective sample of Nev genes. If we also stipulate that the haploid system has uniparental inheritance in a species with a 50 : 50 sex ratio, only half the individuals sampled can actually pass on the haploid genes, reducing the effective size by another factor of two (assuming the sexes have equal variance effective sizes). Thus, for haploid, uniparental systems like mitochondrial or Y-chromosomal DNA in humans or Drosophila, the effective size for a haploid, uniparental locus is approximately (1/2)Nev, so Eq. (6.51) becomes:

$$f_{st} = \frac{1}{1 + N_{ev}m} \tag{6.52}$$

Similarly, adjusting for the number of genes per individual, the analog of Eq. (6.29) for an X-linked locus in a species with a 50 : 50 sex ratio is

$$f_{st} = \frac{1}{1 + 3N_{ev}m} \tag{6.53}$$
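The four equilibrium expressions differ only in the coefficient that multiplies Nevm, so they can be tabulated side by side. The following sketch uses a hypothetical value of Nevm = 5, and the dictionary and function names are illustrative rather than taken from the source.

```python
# Coefficient of N_ev * m in the denominator of the equilibrium f_st
COEFFICIENT = {
    "autosomal": 4.0,             # Eq. (6.29)
    "X-linked": 3.0,              # Eq. (6.53)
    "haploid": 2.0,               # Eq. (6.51)
    "haploid, uniparental": 1.0,  # Eq. (6.52)
}


def expected_fst(nev_m, mode):
    """Equilibrium f_st = 1 / (1 + coefficient * N_ev * m) for a mode of inheritance."""
    if nev_m < 0:
        raise ValueError("nev_m cannot be negative")
    return 1.0 / (1.0 + COEFFICIENT[mode] * nev_m)


nev_m = 5.0  # hypothetical effective number of migrants per generation
fst_by_mode = {mode: expected_fst(nev_m, mode) for mode in COEFFICIENT}
for mode, value in fst_by_mode.items():
    print(f"{mode:<22} f_st = {value:.4f}")
```

For any fixed positive Nevm, the expected fst rises as the coefficient falls from 4 to 1.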

Hence, as we go from autosomal to X-linked to haploid uniparental systems, we expect to see increasing amounts of genetic subdivision for the same rate of gene flow (m). This expected variation in the balance of drift and gene flow presents an opportunity to gain more insight into population structure by simultaneously studying several inheritance systems rather than just one. For example, populations of the fruit fly Drosophila mercatorum were studied that live in the Kohala Mountains on the northern end of the Island of Hawaii (DeSalle et al. 1987). Samples were taken from several sites in the Kohalas (Figure 6.24), but the bulk of the collection came from site IV and the three nearby sites B, C, and D. We will consider this as a two-population system by pooling the nearby sites B, C, and D and contrasting them with site IV. An isozyme survey (Appendix A) of nuclear autosomal genes yields an fst of 0.0002 for this contrast, which is not significantly different from zero. Given the sample sizes and using Eq. (6.29), this nonsignificant fst is statistically incompatible with any Nevm value less than 2. Hence, the nuclear autosomal system tells us that Nevm > 2. This is not surprising, as sites IV and B-C-D are less than 3 km apart, with the intervening area consisting of inhabited area for this species. Although the nuclear fst tells us that there is significant gene flow between these two localities, we cannot distinguish between Nevm of 3, or 30, or 300, or 3000 with the nuclear autosomal isozyme data alone. The same individual flies were also scored for restriction site polymorphisms (Appendix A) for mtDNA, yielding a statistically significant fst of 0.17. Using Eq. (6.52), this significant fst implies Nevm ≈ 5. This value of Nevm ≈ 5 from the mtDNA data is consistent with the inference that Nevm > 2 from the isozyme data. Thus, there is no biological contradiction for these samples in yielding a nonsignificant fst for isozymes and a significant fst for mtDNA. 
This population of flies is also polymorphic for a deletion of ribosomal DNA on the Y-chromosome (Hollocher et al. 1992). The Y-DNA fst was 0.08 and significantly different from 0. Using Eq. (6.52) for the Y-DNA data yields an estimate of Nevm ≈ 12, a result also compatible with the isozyme data. Hence, the nuclear


[Figure 6.24 map. Labeled sites: A, B, C, D, F, and IV. Contour labels run from 2200 to 3600, with a contour interval of 40 ft (12.2 m). The map also marks true north, a road, and scale bars of 1 km and 1 mile.]

Figure 6.24 A topographic map of collecting sites for Drosophila mercatorum in the Kohala Mountains near the town of Kamuela (also known as Waimea) on the Island of Hawaii. A transect of collecting sites on the slopes of the Kohalas is indicated by the letters A, B, C, D, and F, and a site in the saddle at the base of the Kohalas is indicated by IV.

autosomal system told us very little about the possible values for Nevm other than that the value is greater than two, but, by combining all the data, we know that Nevm in this population is around 5–12 and not in the hundreds or thousands or more. Although both of the unisexual, haploid-inherited elements were individually consistent with the isozyme results, note that the fst calculated from the Y-DNA is less than half that from the mtDNA. This difference was statistically significant even though both Y-DNA and mtDNA are haploid, uniparental systems. To understand why these two haploid, uniparental systems could yield significantly different results, recall that Eq. (6.52) was applied to both of them under the assumptions that the variance effective sizes of both sexes were equal and that gene flow was equal for both sexes. In most non-monogamous species, the variance in offspring number is greater in males than in females, resulting in males having a smaller variance effective size than females (see Eq. 4.31 in Chapter 4, and Chapter 5). This in turn, through Eq. (6.52), would imply that the fst for the Y-DNA should be larger than that for mtDNA – exactly the opposite of the observed pattern. Therefore, the typical pattern in variance of male versus female reproductive success and effective sizes cannot explain the observed results. We therefore turn our attention to sex-specific influences on the rate of gene flow. Direct mark/recapture studies on dispersal in this species reveal identical dispersal behaviors in males and females (Johnston and Templeton 1982), but this does not mean that m is identical in males and females. All dispersal in this species occurs during the adult phase, and almost all adult


D. mercatorum females are inseminated by one to three males. The females have a special organ for storing sperm and can retain viable sperm for several days. Recall that m measures the rate of exchange of gametes, not individuals. Therefore, when a male disperses, only male gametes are potentially being dispersed. However, dispersing females carry not only their own gametes but those of one to three males as well. So, even with equal dispersal rates for males and females, there is actually much more gene flow of male gametes than female gametes. This male-biased gene flow predicts a smaller fst for Y-DNA versus mtDNA, as is observed. Hence, we need two gene flow rates, mf for female gametes and mm for male gametes. The overall gene flow rate for an autosomal locus is just the average of the two sexes as they are in a 50 : 50 ratio, that is, m = (mf + mm)/2. Assuming the variance effective sizes are the same for both sexes, the nuclear isozyme results imply that Nevm = Nev(mf + mm)/2 > 2; the mtDNA implies that Nevmf = 5; and the Y-DNA implies that Nevmm = 12. All systems together therefore yield Nevm = Nev(mf + mm)/2 = (5 + 12)/2 = 8.5. Thus, by combining the results of several genetic systems with different modes of inheritance, we can gain more insight into population structure, including sex-specific differences.
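The estimates in this example come from inverting Eq. (6.52) for each uniparental marker and then averaging the two sexes. A minimal sketch follows; the function and variable names are hypothetical, and the fst values are those reported in the text.

```python
def nev_m_from_fst(f_st, coefficient):
    """Invert f_st = 1 / (1 + coefficient * N_ev * m) to obtain N_ev * m."""
    if not 0.0 < f_st <= 1.0:
        raise ValueError("f_st must lie in (0, 1]")
    return (1.0 / f_st - 1.0) / coefficient


# Haploid, uniparental markers use a coefficient of 1, as in Eq. (6.52)
nev_mf = nev_m_from_fst(0.17, coefficient=1.0)  # mtDNA, female gametes
nev_mm = nev_m_from_fst(0.08, coefficient=1.0)  # Y-DNA, male gametes

# Autosomal gene flow is the average of the two sexes (50 : 50 sex ratio)
nev_m_autosomal = (nev_mf + nev_mm) / 2.0

print(f"N_ev * m_f = {nev_mf:.2f}")
print(f"N_ev * m_m = {nev_mm:.2f}")
print(f"N_ev * m   = {nev_m_autosomal:.2f}")
```

Carrying the unrounded values through gives an average of about 8.2, compared with 8.5 when the rounded values of 5 and 12 are averaged as in the text. Either figure is consistent with the isozyme-based conclusion that Nevm exceeds 2.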

Admixture

Up to now, we have treated dispersal and gene flow as occurring at the individual level. However, in some cases, a large portion of a deme, a whole deme, or even a group of related demes moves as a population into another area. This can occur because of climate change, range expansion, the elimination or reduction of previous barriers to movement, or human-facilitated movement of populations, both of humans and nonhuman species (Bullock et al. 2018; Fontsere et al. 2019). As discussed in Chapter 3, population movements that bring two or more genetically differentiated populations into breeding contact are called admixture. Admixture events have been common in human evolutionary history, with over 100 instances having been identified over just the last 4000 years (Hellenthal et al. 2014). Once contact between two or more populations has occurred, the degree and speed of admixture depend heavily upon the system of mating, as discussed in Chapter 3. The impact of the system of mating is greatly augmented in such situations because admixture itself induces extensive linkage disequilibrium (Eq. 3.6), so that assortative or disassortative mating for a trait that initially differentiates the populations can have genome-wide effects. Indeed, as pointed out in Chapter 3, even assortative or disassortative mating on a non-genetic trait that is associated with the ancestral populations can have a large genome-wide effect on the amount and speed of genetic introgression between the populations.

First, consider assortative mating. The European corn borer, an insect pest, has two pheromone races that apparently had once been geographically separated but are now broadly overlapping (Harrison and Vawter 1977). There is strong assortative mating for pheromone phenotype in these insects with greater than 95% of the matings occurring within the pheromone types (Malausa et al. 2005).
Moreover, these races have allele frequency differences at many isozyme loci (Appendix A) because of their historical isolation. Recall from Chapter 3 that when two previously isolated, genetically differentiated populations make genetic contact with one another, extensive linkage disequilibrium is created in the mixed population (Eq. 3.6). Thus, in the areas of overlap of the pheromone races, there is linkage disequilibrium between the pheromone loci and all other loci having allele frequency differences between the historical races. Because assortative mating reduces the chances that individuals from the different pheromone races will mate with one another, it also reduces the


effective gene flow m for all loci that had different allele frequencies in the historical races. As a result, assortative mating for pheromone type greatly reduces gene flow as an evolutionary force for all differentiated loci. Despite close physical proximity of individual corn borers, the effective m is very small and the races have maintained their differentiation even at isozyme loci that have no direct impact on the pheromone phenotype. In contrast, disassortative mating enhances m for all loci. Drosophila melanogaster has a strong disassortative mating pheromone system (Averhoff and Richardson 1974, 1976), just the opposite of the European corn borer. D. melanogaster across the globe is predominantly a single, cosmopolitan species showing only modest geographical differentiation (except for Africa and some selected loci) even on a continental basis (Singh and Rhomberg 1987). This homogeneity has occurred despite the fact that populations of D. melanogaster, a human commensal species, have been moved around extensively by human commerce over the last several thousand years. Nevertheless, there are some regional differences. D. melanogaster originated in sub-Saharan Africa, and recently colonized much of the rest of the world (Duchen et al. 2013). During this expansion, the “cosmopolitan” population that left sub-Saharan Africa experienced a bottleneck (Chapter 4, Figure 4.5) that changed allele frequencies across much of the genome (Duchen et al. 2013). The sub-Saharan and “cosmopolitan” ancestral populations have since encountered each other on multiple occasions and have admixed in numerous geographic locations both within sub-Saharan Africa and worldwide (Pool et al. 2012). When these populations of D. melanogaster of diverse geographical origins came into contact, the disassortative mating pheromone system would result in extensive and rapid interbreeding, creating a pulse of introgression that appears as a single event on an evolutionary time scale.
Although this pulse would be rapid, it would still have lasting genetic consequences. Linkage disequilibrium between the pheromone loci and unrelated loci would rapidly dissipate for most loci, but much of the initial disequilibrium created by the admixture event (Eq. 3.6) would persist for many generations among loci unrelated to the pheromone system, particularly for unrelated markers that are closely linked to each other (Eq. 2.11 and Figure 2.5). The dissipation of linkage disequilibrium at these loci would be gradual, resulting in decreasing lengths of tracts of linkage disequilibrium over time. Modern sequencing studies give the resolution to examine the lengths of these tracts. Genetic surveys of unadmixed populations can help identify possible ancestral populations as well as identify linkage disequilibrium tracts that stem from a particular local ancestry. Medina et al. (2018b) used computer simulations to fit single pulse models of admixture to data on D. melanogaster populations from sub-Saharan Africa. The results are shown in Figure 6.25, which gives both the admixture proportions to the current gene pool (the M’s in Figure 6.25) and the times of admixture from non-Ethiopian populations into the unadmixed Ethiopian gene pool, the t’s. It is important to note that M measures the proportion of the gene pool of the current admixed population that is derived from a specific ancestral population. Hence, M is the cumulative amount of introgression from an ancestral population upon the current admixed population and not m, the per generation rate of gene flow. Sometimes, admixture is over many generations rather than a single pulse. Using genetic surveys on the ancestral populations to identify the lengths of ancestral chromosome segments (LACS, the genomic areas of high linkage disequilibrium that can be attributed to a specific ancestral population), Jin (2015) and Jin et al. 
(2014) showed that, under a pulse model, the probability distribution of the LACS from a given ancestral population involved in admixture t generations ago is:

$$f(x \mid M, t) = (1 - M)\,t\,e^{-(1 - M)tx} \qquad (6.54)$$


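[Figure 6.25 diagram: three ancestral gene pools, each carrying alleles A and a at frequencies p_i and q_i, contribute to the gene pool in present Gambella, Ethiopia: the unadmixed Ethiopian population (M_E = 0.31), the cosmopolitan population (M_C = 0.29, t_C = 372), and the West African population (M_W = 0.40, t_W = 4,962). Expected allele frequencies in Gambella: p_A = M_C p_C + M_E p_E + M_W p_W and q_a = M_C q_C + M_E q_E + M_W q_W.]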

Figure 6.25 A model of admixture between three populations of D. melanogaster to produce the current population in Gambella, Ethiopia. The three ancestral populations are a nearby unadmixed population from Ethiopia, a non-African cosmopolitan population, and a West African population. In this model, Mi represents the cumulative contribution from ancestral population i upon the current Gambella population. Below the M’s, ti is the estimated time in generations since the admixture event involving non-Ethiopian population i. The gene pools are depicted as having different allele frequencies at a hypothetical autosomal locus with two alleles, A and a, with frequencies pi and qi, respectively, in ancestral population i. The allele frequencies pA and qa give the expected allele frequencies in the current Gambella population as a function of the admixture parameters. Source: Based on data given in Medina et al. (2018b).

where x is a random variable indicating the segment length and M is the current proportion of the gene pool of the admixed population that derives from an ancestral population of interest. The mean and variance of the probability distribution given in Eq. (6.54) are:

$$\mu = \frac{1}{(1 - M)t}, \qquad \sigma^2 = \frac{1}{(1 - M)^2 t^2} \qquad (6.55)$$
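Equation (6.54) is an exponential density with rate (1 − M)t, so Eq. (6.55) can be checked by simulation. The following is a minimal sketch: the function names are hypothetical, the values M = 0.3 and t = 100 are made up for illustration and are not estimates from any study, and the unit of segment length is whatever unit Eq. (6.54) is expressed in.

```python
import random
from statistics import mean, variance


def pulse_lacs_mean(M, t):
    """Mean LACS under the single-pulse model, Eq. (6.55)."""
    return 1.0 / ((1.0 - M) * t)


def pulse_lacs_variance(M, t):
    """Variance of LACS under the single-pulse model, Eq. (6.55)."""
    return 1.0 / ((1.0 - M) ** 2 * t ** 2)


def sample_pulse_lacs(M, t, n, seed=1):
    """Draw n segment lengths from the exponential density of Eq. (6.54)."""
    rng = random.Random(seed)
    rate = (1.0 - M) * t
    return [rng.expovariate(rate) for _ in range(n)]


# Hypothetical parameter values, chosen only for illustration.
M, t = 0.3, 100
lengths = sample_pulse_lacs(M, t, n=20000)

print("mean:     theory", pulse_lacs_mean(M, t), "simulated", mean(lengths))
print("variance: theory", pulse_lacs_variance(M, t), "simulated", variance(lengths))
```

Because the rate grows with t, older admixture events leave shorter tracts, which is what allows tract lengths to date the admixture.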

They also modeled admixture beginning t generations ago followed by continual but gradual introgression until the present. Then, the mean and variance become:

$$\mu = \frac{2}{(1 - M)t}, \qquad \sigma^2 = \frac{4\left(\sum_{i=1}^{t} \frac{1}{i} - 1\right)}{(1 - M)^2 t^2} \qquad (6.56)$$

Notice that, on the average, the LACS are twice as long for a given t in the gradual introgression model as in the single pulse model. As mentioned in Chapter 3, African Americans represent an admixed population between ancestral west African and west European populations. Jin (2015) determined the distribution


of LACS in the African American gene pool and found that a gradual model of admixture fit the distribution better than a pulse model. In particular, the best fitting model had t = 14 generations (280 years ago with a generation time of 20 years, or 350 years ago with a generation time of 25 years) with European input occurring at a rate of 0.017 per generation to result in M = 0.245. These estimates correspond well to the historical record.

Admixture is normally limited to only a portion of the species range, but admixture can provide a powerful tool for examining gene flow in other parts of the species range. For example, the European sea bass (Dicentrarchus labrax) was subdivided into Atlantic and Mediterranean glacial lineages about 300 000 years BP. At the end of the last glacial period, Atlantic migrants began contact and interbreeding with the western Mediterranean population. Duranton et al. (2019) determined LACS from the Atlantic population at different distances across the Mediterranean. Using simulations, they were able to estimate the spatial scale of dispersal within the Mediterranean sea bass lineage by comparing the length distribution of introgressed Atlantic tracts in the Mediterranean population at different distances from the contact zone with the Atlantic lineage. The LACS decreased in length as one went from west to east, indicating that the LACS take time (and hence more recombination events) to reach the more eastern parts of the Mediterranean. This pattern allowed them to estimate the average per-generation dispersal distance within the Mediterranean lineage to be less than 50 km. This approach is similar to the allele tracing done in artificially admixed populations of collared lizards (Figure 6.12), although with the lizards, the time scale was short and only alleles were surveyed, not tract lengths.
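As a rough numerical companion to the African American example, the sketch below compares the expected LACS under the pulse model (Eq. 6.55) and the gradual model (the mean in Eq. 6.56) for the quoted values t = 14 and M = 0.245, and converts generations to years. The function names are hypothetical, M is read here as the European contribution, and the lengths are in the unit of Eq. (6.54); none of this reproduces the model fitting of Jin (2015).

```python
def mean_lacs_pulse(M, t):
    """Expected LACS under a single pulse t generations ago (Eq. 6.55)."""
    return 1.0 / ((1.0 - M) * t)


def mean_lacs_gradual(M, t):
    """Expected LACS under gradual introgression starting t generations ago (Eq. 6.56)."""
    return 2.0 / ((1.0 - M) * t)


M, t = 0.245, 14  # values quoted in the text for the best-fitting gradual model

for generation_time in (20, 25):
    print(generation_time, "years per generation ->", t * generation_time, "years ago")

pulse = mean_lacs_pulse(M, t)
gradual = mean_lacs_gradual(M, t)
print("pulse mean:", pulse)
print("gradual mean:", gradual)
print("ratio:", gradual / pulse)
```

The factor of two between the means is why the observed tract length distribution can discriminate between the two admixture histories for the same t.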

Identifying Subpopulations and Population Structure

With the exception of the two analyses that used genetic differences between individuals in a continuous population (Petkova et al. 2016; Al-Asadi et al. 2019), all the models and analyses presented in this chapter have assumed that the subpopulations were known a priori. In some cases, the habitat requirements of a species naturally result in discrete local subpopulations, such as the glade populations of the Ozark collared lizards. But in many cases, populations are not inherently distributed in a discrete fashion, and even when they are, these discrete subpopulations may still not represent local demes if there is much dispersal and genetic interchange among them. For example, the nine distinct fens in the Kay Branch study of Hine’s emerald dragonflies do not define nine discrete demes as the daily movement of individual dragonflies encompasses the entire fen complex (Figure 6.16). The discrete coral reefs distributed over the entire Indo-Pacific Ocean do not correspond to discrete demes for moray eels, which have only a single random mating deme in this gigantic ocean (Reece et al. 2010). Consequently, what constitutes a subpopulation or local deme or neighborhood is an important and nontrivial question.

Consider, for example, the fire salamander, Salamandra infraimmaculata, in northern Israel (Sinai et al. 2019). Northern Israel is the southernmost limit for this endangered species, and indeed Israel is the southernmost limit for the entire genus Salamandra. All members of this genus have an aquatic larval phase, and most salamander species specialize in the type of aquatic resource they use: springs, ponds, streams, etc. Such aquatic resources are scarce in Israel, so S.
infraimmaculata living in Israel have evolved as aquatic resource generalists, being able to use permanent water sources such as springs and permanent streams and ponds, as well as temporary sources, such as intermittent streams and rock pools that fill with water for a few months during the rainy season in the winter. In Israel, these salamanders are found only at higher elevations in four distinct


[Figure 6.26 map: habitat suitability classes 0–0.2, 0.2–0.4, 0.4–0.6, 0.6–0.8, and 0.8–1, with a 5 km scale bar and north arrow; regions labeled Upper Galilee, Lower Galilee, and Mt. Carmel.]

Figure 6.26 Maxent habitat suitability scores over the three major regions sampled in northern Israel for Salamandra infraimmaculata. Mount Carmel is shown in the lower left-hand corner, the Upper Galilee in the upper right-hand corner, and the Lower Galilee just south of the Upper Galilee. White circles mark the 97 water bodies known to serve as breeding sites for these salamanders. Source: From figure 8 in Sinai et al. (2019). © 2019, Springer Nature.

geographical areas: Tel Dan, an area with permanent springs in the northeast of Israel near the Syrian and Lebanese borders; the Upper Galilee that is continuous with the distribution of these salamanders into Lebanon; the Lower Galilee that is separated from the Upper Galilee by a deep but narrow valley; and Mount Carmel, a 50 km long ridge separated from the Lower Galilee by a low-elevation, broad valley. The Tel Dan population is quite distinct genetically and morphologically from all the other populations within Israel, so the main focus of the studies of Sinai et al. (2019) was on the other three regions. A Maximum Entropy analysis (Phillips et al. 2006) indicated that the Upper Galilee and Mount Carmel were more suitable habitats for this species than the lower elevation and drier Lower Galilee (Figure 6.26). Samples from these salamanders are most easily obtained near the aquatic breeding resources on rainy, winter nights. Hence, sampling tends to be geographically clustered at breeding sites. However, individual salamanders have been observed to disperse a kilometer or more to different breeding sites even within a single breeding season (Bar-David et al. 2007), so it is not clear if these clustered samples from breeding sites correspond to local demes. The samples obtained from these breeding site collections were scored for 15 microsatellite loci. The first question to answer is how


many distinct subpopulations, if any, are found in northern Israel (excluding the obviously distinct Tel Dan population). One common technique for answering this question is to use the program STRUCTURE (Pritchard et al. 2000) to cluster individuals into a finite number of subpopulations based solely on genetic data. STRUCTURE uses a model-based Bayesian analysis (Appendix B) to allocate individuals into K subpopulations, where K is specified beforehand. As shown previously, mixing individuals from different subpopulations creates deviations from single-locus Hardy–Weinberg (the Wahlund effect) and generates linkage disequilibrium between loci (Eq. 3.6). STRUCTURE seeks to identify clusters of individuals that minimize deviations from Hardy–Weinberg and the absolute values of linkage disequilibrium, that is, it attempts to identify random-mating local populations. This program does allow for the fact that some individuals may be the result of past gene flow or admixture, and thereby have a mixture of origins from these discrete K subpopulations. The output of STRUCTURE depends upon the chosen value of K and upon the prior (Appendix B) that specifies the proportional contributions of the source populations to the current, pooled sample. Unfortunately, the prior is typically ignored, so that the default uniform prior built into the program is chosen implicitly. As for choosing K, there are no statistically rigorous methods for this choice, but several heuristic methods have been proposed. One of the more popular methods is the deltaK measure of the change in the log of the probability of the data given K for two consecutive K’s (Evanno et al. 2005). This measure is then plotted over several values of K, and the optimal K is the one that maximizes deltaK. For the salamander data, the optimal K was two. The results are shown in Figure 6.27, which divided the sample into individuals from Mt.
Carmel and from the Galilee (both Upper and Lower combined), with only a few individuals showing admixture or gene flow between these two subpopulations. The STRUCTURE result was reinforced by another method to identify subpopulations: principal component analysis (PCA), a statistical procedure used to simplify multivariate data with a minimal loss of information. Multivariate data can always be plotted in a multi-dimensional space, with each dimension corresponding to one of the variables being measured. A PCA rotates the axes of the original coordinate system through this multidimensional space such that one rotated axis, called the first principal component, captures the maximum amount of variation possible in a single dimension. The second principal component is the rotated axis constrained to be perpendicular (and thereby statistically independent) to the first principal component that captures the maximum

[Figure 6.27 bar plot: inferred ancestry proportion (vertical axis, 0 to 1) for each individual, with individuals grouped by collecting sites 1–32 along the horizontal axis.]

Figure 6.27 Genetic clustering of Salamandra infraimmaculata individuals obtained with STRUCTURE with K = 2, the optimal K under the deltaK method. Individuals from the same geographic collecting site are grouped together. Each individual in the sample defines a vertical bar of color. Identical colors identify populations with a homogeneous genetic composition, while different colors represent genetically differentiated populations. The red is associated with individuals sampled from the Galilee (sites 1–23) and green from Mount Carmel (sites 24–32). Individuals that were inferred to have ancestry from both of these populations have vertical bars with both colors, with the proportion of colors reflecting the proportion of ancestry. Source: From figure 8 in Sinai et al. (2019). © 2019, Springer Nature.


Table 6.4 Allele frequency data on five human populations for the O allele at the ABO blood group locus and for the D− allele at the Rh blood group locus.

Population    Frequency of O    Frequency of D−
Africa        0.69              0.20
Asia          0.60              0.15
Europe        0.65              0.36
America       0.90              0.02
Australia     0.76              0.00

Source: Data from Cavalli-Sforza et al. (1994).

amount of the remaining variation, and so on until an entire new set of perpendicular (orthogonal) rotated axes has been defined. We will first illustrate the meaning of PCA to identify genetic clusters with a much smaller and simpler data set. Table 6.4 gives the frequencies of the D− allele at the human Rh blood group locus and the frequencies of the O allele at the human ABO blood group locus in five human populations (data from Cavalli-Sforza et al. 1994). We can plot the allele frequencies at these two loci against one another, as shown in Figure 6.28. These original axes correspond to the frequencies of the D− and O alleles, respectively, in this two-dimensional space. The first principal component is defined to be the line through this two-dimensional space that minimizes the sum of the squared perpendicular distances of each population’s data point from the line, as is shown in Figure 6.28. The projections of the original allele frequencies onto this line can always be expressed as a linear combination of the original allele frequencies, that is, a·pD− + b·pO, where pD− and pO are the frequencies of the D− and O alleles, respectively, from a particular population, and a and b are the weights assigned to a particular allele by minimizing perpendicular distances from a rotated axis (many computer programs exist to calculate these weights). The second principal component is then defined as a line perpendicular to the first principal component that minimizes the distances of the data points to this second line. Because the example shown in Figure 6.28 has only two dimensions (one corresponding to each of the two alleles being considered), there is only one way for the second principal component to be drawn in this case. However, when one has multiple alleles per locus and multiple loci, a total of Σ(ni − 1) dimensions are needed in such a plot where ni is the number of alleles at locus i and the summation is over all loci.
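The two-locus example of Table 6.4 is small enough to work through by hand. The sketch below finds the two principal components from the eigen-decomposition of the 2 × 2 sample covariance matrix, which for a single line is equivalent to minimizing the sum of squared perpendicular distances. It is a minimal illustration rather than the procedure used for Figure 6.28; the function name `pca_2d` and the variable names are hypothetical.

```python
import math

# Table 6.4: population -> (frequency of D-, frequency of O)
table_6_4 = {
    "Africa": (0.20, 0.69),
    "Asia": (0.15, 0.60),
    "Europe": (0.36, 0.65),
    "America": (0.02, 0.90),
    "Australia": (0.00, 0.76),
}


def pca_2d(points):
    """Return centre, PC1 weights, PC2 weights, and the variance along each PC."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Rotation angle of the major axis of the 2 x 2 covariance matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    pc1 = (math.cos(theta), math.sin(theta))    # weights (a, b) in a*pD- + b*pO
    pc2 = (-math.sin(theta), math.cos(theta))   # perpendicular to pc1
    spread = math.hypot((sxx - syy) / 2, sxy)
    var1 = (sxx + syy) / 2 + spread             # larger eigenvalue
    var2 = (sxx + syy) / 2 - spread             # smaller eigenvalue
    return (mean_x, mean_y), pc1, pc2, var1, var2


centre, pc1, pc2, var1, var2 = pca_2d(list(table_6_4.values()))

# Projections of each (centred) population onto the two rotated axes.
scores = {
    name: (
        (x - centre[0]) * pc1[0] + (y - centre[1]) * pc1[1],
        (x - centre[0]) * pc2[0] + (y - centre[1]) * pc2[1],
    )
    for name, (x, y) in table_6_4.items()
}

print("PC1 weights (a, b):", pc1)
print("share of variance on PC1:", var1 / (var1 + var2))
for name, (s1, s2) in scores.items():
    print(f"{name:10s} PC1 = {s1:+.3f}  PC2 = {s2:+.3f}")
```

The weights on the two alleles have opposite signs because populations with a high frequency of O tend to have a low frequency of D− in this table.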
In these higher-dimensional cases, the minimization criterion is needed to calculate all but the very last of the Σ(ni − 1) principal components. When the frequencies of alleles at different loci are correlated due to differentiation among demes, much of the variation in how individuals differ genetically will be found in the first principal component. There are progressively smaller amounts of variation in the amount of genetic differentiation as you go to the second, third, etc. principal components. This is also shown in Figure 6.28, which shows the projections of the original data upon the first and second principal components. As can be seen, the populations are much more spread out on the line corresponding to the first principal component than they are on the line corresponding to the second principal component. In general, much of the information about genetic differentiation should be contained in just the first few principal components. A PCA was performed on the salamanders using the data on the 15 microsatellite loci. Figure 6.29 shows a plot of the position of individuals for the first two principal components. As can be seen, the individuals from Mt. Carmel define a tight and well-defined cluster that is distinct from individuals


[Figure 6.28 plot: ABO O allele frequency (vertical axis, 0.5 to 0.9) against Rh D− allele frequency (horizontal axis, 0 to 0.4) for America, Australia, Africa, Europe, and Asia, with the first and second principal component axes redrawn above the plot.]

Figure 6.28 Principal component analysis of the allele frequency data shown in Table 6.4 for five human populations scored for the ABO O allele and the Rh D− allele. The two perpendicular lines representing the first and second principal components are shown. Dotted lines show the perpendicular projections of each point corresponding to a population’s allele frequencies upon the first principal component line, and dashed lines show the perpendicular projections onto the second principal component. At the top of the diagram, the principal components are redrawn, showing the points of intersection of the perpendicular projections from each population. Source: Modified with permission from figure 1.13.1 in L. Cavalli-Sforza et al. (1994). Copyright © 1994 by Princeton University Press.

from the Galilee, but there is much overlap between individuals from the Upper and Lower Galilee regions. Hence, PCA reinforces the conclusion of two major subpopulations (excluding Tel Dan) in northern Israel: Mt. Carmel and the Galilee. However, the PCA (Figure 6.29) revealed much more variation among individuals from the Galilee than Mt. Carmel – variation that is completely invisible to the STRUCTURE analysis (Figure 6.27). The PCA therefore captured more information about subpopulations than STRUCTURE. Despite its popularity, there are some serious difficulties and limitations with STRUCTURE besides the loss of information illustrated by contrasting Figure 6.29 with Figure 6.27. The first involves the choice of K, for which only heuristic criteria exist. Perhaps, a K ≥ 3 would have been more appropriate for the salamanders. Indeed, Janes et al. (2017) found that the deltaK method is strongly biased to yield K = 2 even when more subpopulations exist. When ancestral contributions

[Figure 6.29 scatter plot: axes PC1 (7.5%) and PC2 (3.1%).]

Figure 6.29 Results of the principal component analysis on the microsatellite data from the salamander samples. Only the first and second axes are presented. The dots show individual salamanders. Ovals represent ellipses containing 95% of the individuals from each of the 32 sampling sites. Blue represents the Upper Galilee; gray, the Lower Galilee; black, Mount Carmel. Source: From figure 8 in Sinai et al. (2019). © 2019, Springer Nature.

are unbalanced (as is common), incorrect estimates of K occur if the default prior is used, and individual assignments to populations are poor (Wang 2017). The clusters are prone to be artifacts of sampling when one site has a much larger sample size than other sites (Kalinowski 2011; Puechmaille 2016). A common practice is simply to redo the analyses with many K’s and choose those that are subjectively reasonable to the user. Unfortunately, there is no statistical test in STRUCTURE to see if adding another population results in a significantly better overall fit to the data or if the added population is significantly genetically differentiated from the other populations. In the absence of these tests, the choice of K is truly subjective. A more fundamental problem is whether or not an “optimal K” even exists. In most species with subdivided populations, population structure is hierarchical, as discussed earlier in this chapter (recall Eq. 6.17). We would expect such a hierarchical structure for the salamanders in northern Israel just based on their habitat suitabilities (Figure 6.26). Mt. Carmel is separated from the Galilee by a broad, low-elevation valley of extremely low suitability, which should translate into high resistance to dispersal and low gene flow. In contrast, the valley separating the Upper from the Lower Galilee is narrow and has a higher suitability than the valley between Mt. Carmel and the Lower Galilee. Accordingly, we expect a lesser degree of resistance between the Upper and Lower Galilee regions. The Lower Galilee itself contains much low suitability habitat, whereas the Upper Galilee consists mostly of high suitability habitat. Hence, we would expect greater subdivision within the Lower Galilee than within the Upper Galilee. All of


this implies that there could be multiple K’s that are all biologically meaningful as they are sensitive to different dispersal barriers and historical events, and no one K can capture hierarchical population structure. Kalinowski (2011) and Puechmaille (2016) used simulations to show that STRUCTURE with multiple K’s frequently results in clusters that are not consistent with the true hierarchical gene flow patterns. Another problem is that STRUCTURE regards all ancestral subpopulations to be identical in their own internal population structure, and specifically to be random mating demes. In many species, no local population is randomly mating (recall the snail R. decollata). In other species, not all local populations have the same internal population structure. For example, in humans, many populations approximate a random mating population, but about 10% of human populations have a system of mating characterized by inbreeding, in which marriages between relatives (mostly cousins) are preferred (Bittles and Black 2010). An additional problem occurs when there is isolation by distance, another extremely common pattern in many species. Geographically clustered sampling coupled with isolation by distance leads STRUCTURE to create sharp boundaries between populations when none actually exist (Serre and Paabo 2004; Handley et al. 2007). The problem of clustered sampling is not unique to STRUCTURE (Blair et al. 2012; Safner et al. 2011), so analyses of isolation by distance should be done before using STRUCTURE and similar programs. This is particularly important in conservation genetics as management recommendations have been based on STRUCTURE artifacts in populations that had isolation by distance (Frantz et al. 2009; Perez et al. 2018). In particular for the salamander study, there is significant isolation by distance in the Lower Galilee and Mt. Carmel (Figure 6.30), so STRUCTURE could easily lead to artifacts within these regions.
Finally, STRUCTURE is computationally slow, making it inapplicable to the large genomic data sets that are increasingly available.
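To make the deltaK heuristic discussed above concrete, the sketch below computes it from replicate STRUCTURE runs in the commonly used form of Evanno et al. (2005): the absolute second-order difference of the mean log probability of the data across consecutive K’s, divided by the standard deviation of the replicates at K. The function name `delta_k` is hypothetical and the log-probability values are invented for illustration; they are not the salamander results.

```python
from statistics import mean, stdev


def delta_k(log_probs):
    """Compute deltaK for every K that has both neighbours K - 1 and K + 1.

    log_probs maps K to a list of ln P(data | K) values from replicate runs.
    """
    result = {}
    for k in sorted(log_probs):
        if k - 1 not in log_probs or k + 1 not in log_probs:
            continue  # deltaK is undefined at the smallest and largest K tested
        second_difference = abs(
            mean(log_probs[k + 1]) - 2 * mean(log_probs[k]) + mean(log_probs[k - 1])
        )
        result[k] = second_difference / stdev(log_probs[k])
    return result


# Hypothetical replicate log probabilities, for illustration only.
runs = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4500.0, -4501.0, -4499.0],
    3: [-4480.0, -4478.0, -4482.0],
    4: [-4475.0, -4470.0, -4480.0],
}

dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk)
print("K chosen by deltaK:", best_k)
```

Because deltaK needs the log probability at K − 1, it cannot be evaluated at the smallest K tested, so K = 1 (no subdivision) can never be selected by this criterion.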

[Figure 6.30 plots: Fst/(1 − Fst) (vertical axis, 0 to 0.25) against log distance in km (horizontal axis) in three panels: Upper Galilee, Lower Galilee, and Carmel.]

Figure 6.30 Isolation by distance for Salamandra infraimmaculata within the three major geographic regions. The Mantel test was not significant for the Upper Galilee (top panel) but was significant for the Lower Galilee (middle panel) and Mt. Carmel (lower panel). Source: From figure 8 in Sinai et al. (2019). © 2019, Springer Nature.


Because of these problems, many programs similar to STRUCTURE have been developed that solve one or more, but not all, of the problems mentioned above. For example, another Bayesian clustering program is Bayesian Analysis of Population Structure (BAPS) (Corander et al. 2003). It differs in two significant ways from STRUCTURE: first, BAPS allows the use of geographical location in addition to genetic data to place individuals into clusters, and, second, it provides a direct Bayesian estimation of K. BAPS estimated K as 8 (9 when Tel-Dan is included), not 2 as with the deltaK method. There is no overlap of clusters between Mt. Carmel and the Galilee in the BAPS output (Figure 6.31), which is consistent with the K = 2 STRUCTURE analysis. However, BAPS indicates much subdivision within the Galilee but not Mt. Carmel, which is consistent with the variation within the Galilee found in PCA. In particular, the Upper Galilee is mostly genetically homogeneous, and the northern part of the Lower Galilee clusters with the Upper Galilee, indicating some gene flow across the narrow valley that separates the Upper and Lower Galilee. The remainder of the Lower Galilee is fragmented into many distinct clusters, consistent with its many areas of low-quality habitat (Figure 6.26).

It is also possible to abandon the model-based approach entirely, with its many assumptions about biology (e.g. random mating populations) and its inherent computational burden. Greenbaum et al. (2016) developed NetStruct that uses networks based on genetic similarity between pairs of individuals as an alternative method of defining and testing subpopulations from multilocus genotype data that can be applied to large genomic data sets. Genetic similarity between two individuals is measured by:

$$S_{ij} = \frac{1}{L}\sum_{l=1}^{L}\frac{1}{4}\left[(1 - p_{a,l})\left(I_{ac,l} + I_{ad,l}\right) + (1 - p_{b,l})\left(I_{bc,l} + I_{bd,l}\right)\right] \qquad (6.57)$$

[Figure 6.31 map: northern Israel with labels for Tel-Dan, Upper Galilee, Lower Galilee, Mt. Carmel, Lebanon, Syria, and the Mediterranean Sea.]

Figure 6.31 Genetic clustering of salamanders in northern Israel obtained with BAPS. Identical colors identify populations with a homogeneous genetic composition, while different colors represent genetically differentiated populations. Source: Derived from data in Sinai et al. (2019).


where $S_{ij}$ is the genetic similarity between individuals i and j with individual i having alleles a and b (which could be the same or not) and individual j having alleles c and d at locus l, $p_{x,l}$ is the frequency of allele x at locus l in the total sample, $I_{xy,l}$ is an indicator variable that is 1 if x = y at locus l and 0 otherwise, and L is the total number of loci surveyed. The weighting by one minus the global allele frequency in Eq. (6.57) gives greater weight to alleles that are globally rare. Rare alleles tend to be recent (from coalescent theory, Chapter 5) and are highly informative about gene flow patterns, as was shown by the rare alleles in the collared lizard study (Figure 6.12). Equation (6.57) is for autosomal loci, but similarity can be generalized to any level of ploidy. A network is constructed in which every individual in the sample is a node and is connected by an edge weighted by Eq. (6.57) to every other individual in the sample. Subpopulations (“communities” in network theory) are subsets of individuals that are more densely connected to each other than to individuals outside their subpopulation. A partition of the network into subpopulations is inferred from the modularity of the partition. Modularity measures whether the partition is more or less internally connected than would be expected if connections were randomly distributed, that is, the null hypothesis is no subdivision. The modularity, Q, of a particular subpopulation is defined as the weight of the intra-subpopulation connections minus the expected weight of the intra-subpopulation connections in a random network that preserves the edge weights:

$$Q = \frac{1}{S^{*}}\sum_{i}\sum_{j}\left[S_{ij} - \frac{1}{S^{*}}\left(\sum_{m} S_{im}\right)\left(\sum_{n} S_{jn}\right)\right]\delta_{ij} \qquad (6.58)$$

where $S^{*} = \sum_{m}\sum_{n} S_{mn}$ is the sum over all pairwise individual genetic similarities in the network and $\delta_{ij}$

is one if individuals i and j are in the same subpopulation and 0 otherwise. The expected value of Q is zero if there is no population subdivision. The statistical significance of Q is determined by random permutations of the genetic data (Appendix B). Subpopulations are indicated by significant positive modularities. Like STRUCTURE, subpopulations are defined only by individual genotype data and not by prior classifications. Unlike STRUCTURE, the identified subpopulations are not constrained to be random mating demes in Hardy–Weinberg but can have a variety of genetic properties. Different internal subpopulation structures can coexist within the same total network. Also, closely linked markers can be used in addition to unlinked and loosely linked markers because linkage disequilibrium is not used to identify subpopulations, unlike STRUCTURE. Subpopulations emerge directly from the genetic data, so no K value has to be assigned a priori. Unlike STRUCTURE, a test of the statistical significance of the partition is given, and it is possible to have all individuals in a single community (K = 1 and Q = 0), testing directly the null hypothesis of no population subdivision. The hierarchical nature of population structure is easily studied within this network by gradually increasing the threshold of the edge weights, say τ, on the modularity partitions. For a given τ, all edges with Sij < τ are eliminated from the network. Starting at τ = 0, the threshold is increased until the subpopulation partition based on modularity changes in a statistically significant fashion. This process of increasing τ is then continued, with statistical testing of all changes in the resulting subpopulation partitions, until all subpopulation signals are lost. Depending on the data set, one can still get clusters of family members and extended families, either within a subpopulation or scattered among subpopulations. 
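The modularity of Eq. (6.58) and the edge-pruning step can be sketched in a few lines. This is a minimal illustration and not the NetStruct implementation: the function names, the toy similarity matrix, and the choice of a zero diagonal (no self-similarity edges) are all assumptions made here, and the permutation test for significance is omitted.

```python
import numpy as np

def modularity(similarity, labels):
    """Q of Eq. (6.58) for a symmetric similarity matrix and one label per individual."""
    S = np.asarray(similarity, dtype=float)
    labels = np.asarray(labels)
    s_star = S.sum()                             # S* = sum over all m, n of S_mn
    strength = S.sum(axis=1)                     # sum over m of S_im, for each i
    delta = labels[:, None] == labels[None, :]   # delta_ij: same subpopulation?
    expected = np.outer(strength, strength) / s_star
    return float(((S - expected) * delta).sum() / s_star)

def prune_edges(similarity, tau):
    """Eliminate (set to zero) every edge with S_ij < tau."""
    S = np.asarray(similarity, dtype=float)
    return np.where(S < tau, 0.0, S)

# Hypothetical toy data: individuals 0-1 and 2-3 form two tight pairs.
S = np.array([[0.0, 0.3, 0.1, 0.1],
              [0.3, 0.0, 0.1, 0.1],
              [0.1, 0.1, 0.0, 0.3],
              [0.1, 0.1, 0.3, 0.0]])
two_groups = [0, 0, 1, 1]
one_group = [0, 0, 0, 0]

q_unpruned = modularity(S, two_groups)                  # tau = 0
q_pruned = modularity(prune_edges(S, 0.12), two_groups)  # weak edges removed
q_single = modularity(S, one_group)                     # K = 1
print(q_unpruned, q_pruned, q_single)
```

In this toy case the single-community partition gives Q = 0, as the text states for K = 1, and pruning the weak between-pair edges raises the modularity of the two-group partition.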
In this manner, the hierarchical nature of population structure can be observed and quantified with statistical testing at every threshold level that changes the partition, unlike STRUCTURE. Finally, this individual similarity approach is much more computationally

Figure 6.32 Genetic clustering in the study at three hierarchical levels obtained with NetStruct. Different colors represent different significant genetic clusters. The Lower Galilee sites are enclosed by a solid-lined polygon, and the Upper Galilee sites are enclosed by a dashed line polygon. At each sampling site, the distribution of assignments of individuals to clusters is shown. (a) The highest hierarchical level, obtained by analyzing the network of all individuals without edge pruning. (b) The second hierarchical level, obtained with edges representing genetic similarity below 0.12 pruned. (c) The third hierarchical level, with edge weights below 0.22 pruned. Source: From figure 8 in Sinai et al. (2019). © 2019, Springer Nature.

efficient than STRUCTURE or other model-based alternatives, so a network analysis is feasible to implement with the increasingly large data sets that are available in the genomic era (Greenbaum et al. 2019b). The results of applying NetStruct to the salamander data are shown in Figure 6.32. At τ = 0 (no pruning of edges), two statistically significant (p < 0.001) clusters were detected (Figure 6.32a). These two genetic clusters correspond to Mt. Carmel and the Galilee, with a small amount of shared genetic similarity between these two regions. This result corresponds to the STRUCTURE result with K = 2 (Figure 6.27). As τ was increased, a significant change in modularity occurred when edges below a genetic similarity of 0.12 were pruned. This pruning revealed three significant genetic clusters in the Galilee (Figure 6.32b), detecting some of the variation found by PCA (Figure 6.29). Note that none of the Galilean breeding sites corresponds to a pure, single genetic cluster. This result is not surprising in light of the previous work by Bar-David et al. (2007) on dispersal abilities in these salamanders, which essentially have to forage for breeding sites, so that the same individual can be found at more than one breeding site during a single breeding season. Nevertheless, these genetic clusters are not randomly distributed geographically (Sinai et al. 2019). The null hypothesis of random distribution throughout the entire Galilee is rejected at p < 0.0001 through an exact permutation test (Appendix B). When the test is restricted just to the Upper Galilee, the null hypothesis is not rejected (exact p = 0.1639), similar to the high degree of homogeneity inferred for the Upper Galilee by BAPS (Figure 6.31). However, the null hypothesis of random distribution within the Lower Galilee is strongly rejected (p < 0.0001).
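The logic of testing whether clusters are randomly distributed across sites can be illustrated with a small permutation sketch. Note the substitutions: the text reports an exact permutation test, whereas the code below uses a Monte Carlo approximation, and the chi-square statistic, function names, and toy data are assumptions chosen for illustration rather than the procedure of Sinai et al. (2019).

```python
import numpy as np

def chi_square(sites, clusters):
    """Chi-square statistic of the site-by-cluster contingency table."""
    _, site_idx = np.unique(sites, return_inverse=True)
    _, cluster_idx = np.unique(clusters, return_inverse=True)
    table = np.zeros((site_idx.max() + 1, cluster_idx.max() + 1))
    np.add.at(table, (site_idx, cluster_idx), 1)   # count individuals per cell
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
    return float(((table - expected) ** 2 / expected).sum())

def permutation_p_value(sites, clusters, n_perm=999, seed=0):
    """Monte Carlo p-value: shuffle cluster assignments among individuals."""
    rng = np.random.default_rng(seed)
    clusters = np.asarray(clusters)
    observed = chi_square(sites, clusters)
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(clusters)
        if chi_square(sites, shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical data: two sites with 20 individuals each.
sites = ["A"] * 20 + ["B"] * 20
segregated = [0] * 20 + [1] * 20   # each cluster confined to one site
mixed = [0, 1] * 20                # both clusters equally common at both sites

p_segregated = permutation_p_value(sites, segregated)
p_mixed = permutation_p_value(sites, mixed)
print(p_segregated, p_mixed)
```

Restricting the input to the individuals of one region, as was done for the Upper and Lower Galilee, only requires subsetting `sites` and `clusters` before calling the function.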
At this hierarchical level, the Lower Galilee is split into two homogeneous groups (Figure 6.32b), with the northern sites in the Lower Galilee being homogeneous with the Upper Galilee (exact p = 0.1547), just as occurred in the BAPS analysis (Figure 6.31). This pattern is reasonable because a stream descends from the Upper Galilee and then turns down the valley, abutting against the Lower Galilee at these sites. Hence, this is a highly probable dispersal route from the Upper Galilee to the Lower Galilee. Overall, at this second hierarchical level, the Galilee is split into two subpopulations: the Upper Galilee plus the northern Lower Galilee versus the more southern portion of the Lower Galilee. The last significant change in


modularity occurs at a threshold of 0.22 (Figure 6.32c). Five new significant (p < 0.001) genetic clusters appear at this level in the Galilee. These genetic clusters of highly similar individuals are typically mixed at a single breeding site, reinforcing the pattern observed in Figure 6.32b and indicating that such mixing has occurred in recent times. As shown in Figure 6.32c, the clusters observed at this level are not randomly distributed across the entire Galilee (p < 0.0001), but unlike the level shown in Figure 6.32b there is significant heterogeneity within both the Upper Galilee (p < 0.0001) and the Lower Galilee (p < 0.0001). The Lower Galilee heterogeneity can be explained in part by the three northern sites of the Lower Galilee displaying nonsignificant differentiation with several southern Upper Galilee sites that are near the stream that descends toward the Lower Galilee (exact p = 0.1223), as well as a pattern in the remaining Lower Galilee sites that is consistent with isolation by distance (Figure 6.30) as one goes from the northeast to southwest in the Lower Galilee. This differs from the BAPS output (Figure 6.31) that yields only discrete geographic subpopulations and cannot depict the gradual change in frequencies observed in the Lower Galilee in Figure 6.32c. As mentioned previously, NetStruct requires neither random-mating subpopulations nor homogeneity of subpopulation internal structure. This was not much of an issue for the salamanders, but it was for a NetStruct analysis of human populations (Greenbaum et al. 2016). Insight into this potential heterogeneity among subpopulations can be achieved by measuring each individual’s strength of association (SA) to the subpopulation to which the individual was assigned through modularity:

$$SA(C, i) = Q_C - \max_k Q_{C_k(i)} \qquad (6.59)$$

where C is a significant modularity partition into communities (subpopulations) and $C_k(i)$ is a partition identical to C except that individual i is assigned to subpopulation k instead of to its original subpopulation. Equation (6.59) examines only the maximum value over k, but the values for all k’s other than the originally assigned subpopulation can also be calculated and provide a measure of association of the individual with all the other subpopulations in the partition, much like STRUCTURE and related programs can estimate the proportion of an individual’s ancestry that comes from each of the assumed K random-mating demes. A high value of SA in Eq. (6.59) indicates that the individual is strongly associated with the subpopulation to which it was assigned, whereas low values indicate that there is at least one other subpopulation to which the individual shows a strong genetic association. This is expected to occur when an individual has recent ancestors from these other subpopulations, either due to gene flow or admixture. Plotting the SA values for every individual in the subpopulation yields a strength of association distribution (SAD) that describes the internal genetic structure of that subpopulation. When NetStruct was applied to a coarse geographic sampling of 11 human populations, the intermediate hierarchy identified three subpopulations: an African community, an East Asian community, and an Indo-European community (Greenbaum et al. 2016). The SADs for these three communities are shown in Figure 6.33. Each of these three subpopulations is very different in its internal structure. The African community displays the highest SA of all of the communities, but it is bimodal and has a trailing left skew. This reflects the fact that this community consists of both individuals from Africa and African Americans from the United States, who, as previously noted, are an admixed population.
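Equation (6.59) can be sketched as follows, assuming the weighted modularity described for Eq. (6.58). The helper names and toy similarity matrix are hypothetical, the diagonal is set to zero by assumption, and this is an illustration of the definition rather than the NetStruct code.

```python
import numpy as np

def modularity(similarity, labels):
    """Weighted modularity Q of a partition (see Eq. 6.58)."""
    S = np.asarray(similarity, dtype=float)
    labels = np.asarray(labels)
    s_star = S.sum()
    strength = S.sum(axis=1)
    delta = labels[:, None] == labels[None, :]
    expected = np.outer(strength, strength) / s_star
    return float(((S - expected) * delta).sum() / s_star)

def strength_of_association(similarity, labels, i):
    """SA(C, i) = Q_C minus the best Q obtained by moving i to another subpopulation."""
    labels = np.asarray(labels)
    original = labels[i]
    q_c = modularity(similarity, labels)
    alternatives = []
    for k in np.unique(labels):
        if k == original:
            continue                      # only reassignments to a different k
        moved = labels.copy()
        moved[i] = k                      # the partition C_k(i)
        alternatives.append(modularity(similarity, moved))
    if not alternatives:
        raise ValueError("SA needs at least two subpopulations in the partition")
    return q_c - max(alternatives)

# Hypothetical toy data: individuals 0-1 and 2-3 form two tight pairs.
S = np.array([[0.0, 0.3, 0.1, 0.1],
              [0.3, 0.0, 0.1, 0.1],
              [0.1, 0.1, 0.0, 0.3],
              [0.1, 0.1, 0.3, 0.0]])
partition = [0, 0, 1, 1]

# One SA value per individual; their distribution within a subpopulation is its SAD.
sa_values = [strength_of_association(S, partition, i) for i in range(len(partition))]
print(sa_values)
```

Collecting the per-subpopulation values in `alternatives` instead of taking only their maximum would give the association of the individual with every other subpopulation, as the text describes.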
The East Asian community of Chinese and Japanese individuals has a weaker SA on average than the African community, but shows a single sharp peak with a small variance, indicating that this community has not been affected by much recent gene flow or admixture with distant populations compared to the African/African American community. The Indo-European community shows the weakest SA on average

Figure 6.33 Strength of association distributions (SAD) for three human sub-populations identified by an intermediate threshold of genetic similarity in 11 human populations. The plot shows density against strength of association (axis range 0.000 to 0.004) for the Indo-European, East-Asian, and African communities. Dashed lines indicate the mean SA for each community. Source: Greenbaum et al. (2016). © 2016, The Genetics Society of America.

and displays considerable left skew. This group is known to have had an extensive history of admixture over many thousands of years (Chapter 7) as well as much admixture in the last 500 years. Thus, NetStruct inferred three communities at the intermediate threshold level, and each one displayed different internal properties of genetic similarity. NetStruct produces a true hierarchical analysis, as shown in Figure 6.32. For the salamanders, we are dealing only with 3 major geographic regions, 15 microsatellite loci, and 3 levels of hierarchy. Accordingly, a simple figure such as Figure 6.32 can capture the hierarchical nature of population structure in these salamanders. However, when NetStruct deals with larger samples, more geographical coverage, and many more genetic markers, the resolution of hierarchies can become too complex for a simple diagram such as Figure 6.32. Accordingly, Greenbaum et al. (2019b) developed a Population Structure Tree that portrays hierarchies in population structure. For example, they analyzed 938 human individuals sampled from 52 sites scattered around the world and scored for 647,976 SNPs. Figure 6.34 displays the low-resolution Population Structure Tree that emerged from the NetStruct analysis (see the original paper for a fine-resolution version). No root is shown because Figure 6.34 is not necessarily an evolutionary tree, and branch lengths reflect only an overall individual genetic distance regardless of how that distance arose. At the higher levels of the hierarchy, the main branches correspond to continental groups, indicating that intercontinental dispersal faced high resistance throughout much of human evolutionary history. More recent aspects of gene flow are apparent at the lower levels of the hierarchy.
For example, some individuals of Pima and Maya ancestry cluster with the Americas community, as expected for Native Americans, but other individuals in this group cluster with some East Asians and Indo-Europeans, as also indicated by other analyses (Pickrell and Pritchard 2012). This splitting of individuals is expected under admixture. It cannot be emphasized enough that a Population Structure Tree is not an evolutionary tree of the populations – it is simply a device to display the hierarchies found in a species’ population structure regardless of how those hierarchies originated. A population tree implies a series of population splits followed mostly by isolation, as will be discussed in detail in the next

Figure 6.34 The population structure tree for humans. Branch labels: Sub-Saharan Africa; Oceania; Europe; Americas; Middle East; North Africa; Central and Southern Asia; East Asia; Maya & Pima (subset). Source: Greenbaum et al. (2019b).

chapter. When the same data used in Figure 6.34 are subjected to a test for fitting a tree-like structure (see Chapter 7 for such tests), the null hypothesis of a tree is strongly rejected with $p < 10^{-200}$ (Templeton 2018a) – an abysmal fit to a population tree to say the least. This makes it clear that the population structure tree shown in Figure 6.34 is not an evolutionary tree of populations.

A Final Warning

Fst, fst, and related measures can be measured from genetic survey data. In contrast, measuring gene flow directly from dispersal studies is often difficult. The theory developed in this chapter shows that even rare exchanges between populations can have major consequences for population structure. Long distance dispersal is often the most difficult type of dispersal to study, but as we learned, its genetic impact can be great even when exceedingly rare. Moreover, dispersal of individuals is not the same as gene flow, as shown by the D. mercatorum and Hine’s emerald dragonfly examples. Therefore, gene flow, as opposed to dispersal, is frequently more accurately measured from genetic survey data than from direct observations on the movements of plants and animals. However, estimating gene flow from F or f and related statistics often assumes that the underlying cause of differentiation among demes or localities is the equilibrium balance of genetic drift and recurrent gene flow. Such estimates of gene flow are sometimes of dubious biological validity when forces other than recurrent gene flow and drift influence the spatial pattern of genetic variation, and/or when drift and gene flow may not be in an equilibrium balance. For example, suppose a species is split into two large subpopulations that have no genetic interchange whatsoever (that is, m = 0). Equation (6.29) then tells us that fst = 1. However, this is an equilibrium prediction, and historical events could have occurred that placed the populations far out of equilibrium or that created large temporal fluctuations in the amount of gene flow or drift. When equilibrium is disrupted, it can take time for the relationship shown in Eq. (6.29) to become established. If the subpopulations are large, it would take many generations after the cessation of gene flow before we would actually expect to see fst = 1.
Until that equilibrium is achieved, fst < 1 so that using the equilibrium equation in this case would incorrectly indicate that m > 0 when in fact m = 0. On the other hand, suppose a species recently expanded its geographical range into a new area from a small and genetically homogeneous founder population. The local demes formed in the new geographical range of the species would display much genetic homogeneity for many generations


because of their common ancestry regardless of what $N_{ev}m$ value is established in the newly colonized region (Larson 1984). Once again, if we sampled shortly after such a range expansion event, we would mistakenly infer high values of m regardless of what the current values of m were. Suppose now that an event occurred that restored gene flow among previously long-isolated populations. If m were small, fst would decline only slowly, so the observation of a nonequilibrium fst in this case would imply less gene flow than is actually occurring at present. These hypothetical examples show that fst is an effective indicator of current gene flow only if your populations are in equilibrium and have not been influenced by recent historical events. This is a big IF, so we will address how to separate current population structure from historical events in the next chapter.
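The first hypothetical case, complete isolation with m = 0, can be made concrete with a short numerical sketch. Two assumptions are made here and are not taken from this section: that Eq. (6.29) has the standard island-model form fst = 1/(4Nm + 1), which does give fst = 1 at m = 0 as stated above, and that fst among isolates of constant size N grows under pure drift as 1 − (1 − 1/(2N))^t from a homogeneous starting point with negligible mutation. The function names are hypothetical.

```python
def fst_after_isolation(n_e, generations):
    """Idealized pure-drift fst among isolates t generations after gene flow stops."""
    return 1.0 - (1.0 - 1.0 / (2.0 * n_e)) ** generations

def equilibrium_m_estimate(fst, n_e):
    """Migration rate implied by fst if one (wrongly) assumes drift/gene-flow equilibrium.

    Inverts the assumed island-model relation fst = 1 / (4 * n_e * m + 1).
    """
    return (1.0 / fst - 1.0) / (4.0 * n_e)

n_e = 10_000  # hypothetical size of each isolated subpopulation
for t in (100, 1_000, 10_000, 100_000):
    fst = fst_after_isolation(n_e, t)
    m_hat = equilibrium_m_estimate(fst, n_e)
    print(f"t = {t:>7}  fst = {fst:.4f}  implied m = {m_hat:.2e}")
```

Under these assumptions the implied m stays above zero for many generations even though the true m is zero, and it approaches zero only as fst approaches one.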


7 Population History

Premise 1 that DNA can replicate implies that genes have an existence in space and time that transcends the individuals that temporarily bear them (Chapter 1). In Chapter 6, we saw how we can study and measure the pattern of genetic variation over space. Genetic variation is ultimately created by the process of mutation (premise 2), and when a new genetic variant is first created, it is confined to one point in space and time. An initial mutant gene can only spread through space with the passing of generations and many DNA replication events. The spread of genes through space therefore occurs through time. The spread of the gene through space over the generations is influenced in part by evolutionary forces such as gene flow. Our first model of gene flow in Chapter 6 had dispersal occurring at a rate m every generation. Under this model, gene flow would influence the spread of the gene every generation. Suppose that genetic exchange among populations only occurred every tenth generation. Still, any genetic variants that persist for hundreds or more of generations (as many do in large populations) would have their spatial distribution influenced by multiple occurrences of gene flow. Gene flow in these models is therefore recurrent; recurrent forces or events are those that occur multiple times during the time from the present to the coalescence time to the common ancestral molecule for all the homologous molecules in a sample. The time to the most recent common ancestral molecule is a natural time period for defining recurrence because this is the time period in which the current array of genetic variation at the locus of interest has been shaped and influenced by evolutionary forces. Once we go back in the past beyond the most recent common ancestral molecule, there is no genetic variation observable in the coalescent process, so there is no potential for observing the effects of recurrent evolutionary forces such as gene flow.
Besides recurrent forces or events, unique or rare events such as colonizing a new area or having the population split into two or more isolates by climatic change could also have influenced the current spatial pattern of genetic variation. Historical events are events that occurred only once or at most a few times during the time from the present to the coalescence time to the common ancestral molecule for the sample of homologous DNA molecules. In general, both recurrent and historical events influence how genes spread through space and time. When we measure a pattern of spatial variation through a statistic such as fst, we often do not know how much the measured value has been influenced by recurrent evolutionary forces such as gene flow, genetic drift, and system of mating and how much it has been influenced by historical events such as population fragmentation (the split of a population into two or more subpopulations with little to no subsequent genetic interchange), range expansion (the expansion of populations of a species into new geographical areas), and bottleneck or founder events that induced drastic changes in effective population sizes. As indicated in Chapter 6, equilibrium equations relating to the balance of recurrent evolutionary forces such as drift, gene flow, and/or mutation must be

used cautiously when interpreting genetic observations. Such equations were derived under the assumption of a long-standing balance of recurrent evolutionary forces that are unchanging through time and uninfluenced by any historical events. Such an assumption may not be true for many real populations. To avoid making such an assumption, we need to go beyond just describing the current spatial pattern of genetic variation and instead investigate the spread of genes through both space and time together. One of the first statistics proposed to investigate historical events, and one still used, is Tajima’s D statistic (Tajima 1989a, b). Tajima used the neutral site frequency spectrum (SFS), Eq. (5.28), and the expected number of segregating sites under the infinite sites model (Eq. 5.27), to derive the statistic:

$$D = \frac{k - S/a_1}{\sqrt{c_1 S/a_1 + c_2 S(S-1)/(a_1^2 + a_2)}}$$

where $k$ is defined in terms of $\sum_{i<j} k_{ij}$ …
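As a rough numerical illustration of the statistic above, the following sketch computes D from a sample size, a number of segregating sites, and an average number of pairwise differences. The definitions of the symbols are cut off in this section, so everything below the function signature is an assumption: k is taken to be the average number of pairwise nucleotide differences, S the number of segregating sites, n the number of sequences, and the constants a1, a2, c1, and c2 follow the commonly stated formulation of Tajima (1989). The function name and example values are hypothetical.

```python
import math

def tajimas_d(n, S, k):
    """Tajima's D for n sequences, S segregating sites, and mean pairwise difference k.

    The constants follow the commonly stated Tajima (1989) definitions; they are
    assumptions here because this section does not show them.
    """
    if n < 2 or S < 1:
        raise ValueError("need at least two sequences and one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    variance = c1 * S / a1 + c2 * S * (S - 1) / (a1 ** 2 + a2)
    return (k - S / a1) / math.sqrt(variance)

# Hypothetical sample: 10 sequences, 16 segregating sites, 3.888889 mean differences.
d_example = tajimas_d(10, 16, 3.888889)
print(round(d_example, 3))
```

The numerator contrasts two estimates of the same neutral parameter, so D is zero when k equals S/a1 and departs from zero when the site frequency spectrum is skewed relative to the neutral expectation.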


Figure 7.3 Inferred dispersal surfaces (σ) and population density surfaces (De) over time for Europe as inferred by the program MAPS applied to genetic data from 2234 Europeans. Time is measured by shared identity-by-descent (IBD) length bins (a) 1–10 cM. Source: Al-Asadi et al. (2019).

with the maternally inherited mtDNA (Soodyall 1993) – a pattern also consistent with the oral traditions of the Lemba that the original admixture involved Jewish men and Bantu women. One extreme null hypothesis is that gene flow and admixture played no role in determining current population structure; rather, current population structure arose exclusively from historical fragmentation events that split past populations into isolated subpopulations that had no gene flow or admixture between them. Under this null hypothesis, only historical splitting events determine population structure and recurrent gene flow plays no role whatsoever. If this null hypothesis is true, population structure can be captured through a population tree that portrays the evolutionary history of populations as a series of splits of ancestral populations into two or more daughter


populations that are isolated from one another and evolve thereafter as separate lineages or branches on the population tree. The concept of a population genetic distance was introduced in Chapter 6, and we saw how such genetic distances could be plotted over geographical distance to test the fit to a model of recurrent gene flow constrained by isolation by distance (Figures 6.19 and 6.21). Population genetic distances can also be used to test the fit of a model of historical fragmentation followed by complete genetic isolation. Populations that were fragmented in the past and then remain genetically isolated inevitably diverge from one another over time due to genetic drift and mutation. This was shown experimentally in Chapter 4 with Buri’s experiments on 107 isolated subpopulations fragmented from a common ancestral stock of Drosophila melanogaster (Figure 4.4). The population genetic distances among isolated subpopulations increase with increasing time since the fragmentation event as the different subpopulations acquire different alleles by mutation and diverge in allele frequencies by drift. If different subpopulations split from one another at different times, the population genetic distances should ideally reflect the order of their splitting in time, that is, the older the split, the larger the population genetic distance. In Chapter 5, we saw that molecule genetic distances can be used to estimate an evolutionary tree for haplotypes through such algorithms as neighbor-joining. Similarly, population genetic distances can be used to estimate an evolutionary tree of populations by use of the same algorithms. Recall from Chapter 5 that recombination can undercut the very idea of a haplotype tree itself. When recombination occurs, a single haplotype can come to bear different DNA segments that had experienced different patterns of mutation and coalescence in the past. Thus, there is no single evolutionary history for such recombinant haplotypes. 
When recombination is common and uniform in a DNA region, the very idea of a haplotype tree becomes biologically meaningless. Similarly, an evolutionary tree that accurately portrays the historical relationships among populations can be estimated only if the populations are isolates after fragmentation. The analog of recombination in this case is gene flow and/or admixture. If genetic interchange occurs among members of the populations, then populations do not fit well into an evolutionary population tree that depicts population splits and divergence under isolation as the sole agents of current evolutionary relatedness. Most tree-building algorithms generate population trees from pairwise population genetic distances regardless of whether or not the null hypothesis of a population tree (treeness) is true. Figure 7.4 gives an estimated neighbor-joining tree of 52 human populations by Hunley et al. (2016) using 645 autosomal microsatellite loci in 1037 individuals – the same data as in the NetStruct analysis shown in Figure 6.34. As noted in Chapter 6, both the NetStruct analysis and a previous analysis using the program TreeMix (Pickrell and Pritchard 2012) found significant evidence of admixture in these data that violate the fundamental assumption of a population tree. Moreover, these same data fit an isolation by distance and resistance model very well (Ramachandran et al. 2005), as shown in Figure 7.5. In light of these analyses, “To tree or not to tree, that is the question?” (Smouse 1998). Population genetics deals primarily with intraspecific populations, so gene flow and admixture are realistic alternatives to a population tree. Hence, there is a great need to test the underlying assumptions that would justify a population tree. Fortunately, many such tests exist.
Cavalli-Sforza and Piazza (1975) derived a log-likelihood ratio test (LRT) (Appendix B) of the null hypothesis of treeness, but their test was computationally intensive and therefore little used for many decades. Technically, their test is only valid when the population tree is estimated by maximum likelihood, but this is also computationally difficult at present with a data set as large as that used in Figure 7.4. However, once we have a tree, it is possible to calculate this test statistic, and the resulting test rejected the null hypothesis of the population tree shown in Figure 7.4 with a p level less than $10^{-200}$ – the limit of the program used (Templeton 2018a). This p-value is biased

Figure 7.4 A neighbor-joining population tree of 52 human populations. Different geographic regions are indicated by text and colors. Region labels: Native American, East Asian, Oceania, Central South Asian, European, North African and Middle Eastern, and Sub-Saharan African populations. Scale: population genetic distance. Source: Modified from figure 1.B in Hunley et al. (2016).

Figure 7.5 Pairwise fst’s between 52 human populations across the globe versus the geographic distances that minimize travel over large bodies of water. Axes: fst (0.00 to 0.20) against geographic distance using waypoints (0 to 25,000 km). Source: Modified from figure 1, p. 15943 in Ramachandran et al. (2005).

downwards because the tree in Figure 7.4 is not a maximum-likelihood tree, but it is doubtful if this fact alone could account for such an extremely low p level. Moreover, when the test is implemented with a smaller sample of human populations with a population tree estimated with maximum likelihood, the null hypothesis is still rejected with a p-value of $1.3 \times 10^{-9}$ (Long and Kittles 2003).


Hence, the “tree” shown in Figure 7.4 is not a tree at all, but rather a pseudo-tree that misrepresents the underlying data that are strongly influenced by admixture and recurrent gene flow constrained by isolation by distance and resistance. Indeed, the very shape of the “tree” shown in Figure 7.4 indicates its lack of treeness. The genetic distances between two populations predicted by a tree are found as the sum of the branches that interconnect them in the tree. Notice that these predicted distances that the tree algorithm attempts to fit to the observed distances tend to increase with increasing geographical distance from the sub-Saharan African populations that have very short total branch lengths to the root compared to all other populations (under an ideal tree with a molecular clock, all current populations should be equally distant from the root). As will be shown later in this chapter, the sub-Saharan populations are the primary genetic source for all the other human populations. Such a shape in a “tree” with increasing cumulative branch lengths from the root with increasing geographical distance from the source populations is the expected artifact when an isolation by distance relationship (Figure 7.5) is forced into a “tree” (Templeton 2018a). The “tree” in Figure 7.4 is simply an artifact of the tree-generating algorithm that has no biological meaning other than an artifactual distortion of the better-fitting model of isolation by distance given in Figure 7.5. Another class of tests for treeness is based upon a fundamental phylogenetic property that an evolutionary tree involving four taxa (populations in our case) has one and only one internal branch in an unrooted tree. Suppose further that the populations are chosen such that one is confident that one population is an outgroup (Chapter 5) and that one of the other populations is closer to the outgroup than the remaining two. This results in the unrooted tree shown in Figure 7.6. 
Now, consider the four population statistic (Reich et al. 2009):

$$f_4(P_1, P_2, P_3, P_4) = (p_1 - p_2)(p_3 - p_4) \qquad (7.2)$$

where the pi’s are allele frequencies for a particular allele at a particular locus in the genetic survey in the respective populations Pi’s. Note that the internal branch shown in Figure 7.6 means that p1 − p2 should be uncorrelated with the difference in allele frequencies for the same allele between populations P3 and P4, p3 − p4, because the internal branch of the tree does not affect allele frequency differences between the population pair (P1, P2) under the null hypothesis of treeness. This means that the expected value of Eq. (7.2) is zero if the underlying tree (Figure 7.6) is true. However, if gene flow or admixture occurred between P3 and P1 and/or P2, then p1 − p2 is affected by P3 across the internal branch, and a nonzero expected value of Eq. (7.2) could occur. Of course, even if the assumed tree were correct, we could get deviations from the tree prediction that f4 = 0 due to sampling error and to shared ancestral polymorphisms and lineage sorting (Figures 5.17 and 5.18). These sources of error should cause random deviations from the expected value of zero, but should average out when the f4 statistic is estimated from a large number of alleles and loci as long as the underlying tree structure is true. Reich et al. (2009) used a normalized version of f4 to test the null hypothesis that there was no gene flow or admixture between P3 and (P1, P2). Reich et al. (2009) applied this approach to SNP data on 25 human populations from India along with a few outgroup

Figure 7.6 The evolutionary tree of four populations (Pi, i = 1,…,4) when P4 is an outgroup and P3 is regarded as being evolutionarily closer to the outgroup population than the other two populations. The dashed line shows the only internal branch present in such a tree.


populations from Africa, Europe, and East Asia. They used a principal component analysis to choose reasonable P3 populations, but also examined each SNP allele across all combinations of four populations, adjusting for linkage disequilibrium across the SNPs. Almost every possible tree topology of subsets of four populations resulted in a strong rejection of the null hypothesis of a population tree. A principal component analysis (Chapter 6) revealed a strong east–west gradient in the 25 Indian populations in the degree of relatedness to Chinese versus Europeans, a result more consistent with an isolation by distance model than with a tree model.

Many statistics related to Eq. (7.2) are collectively known as “f statistics” in the literature. The reader of the population genetic literature needs to be aware that these f statistics are quite different from the other f statistics already encountered in this book (e.g. f, F, fst, etc.). One commonly used member of this family is the ABBA/BABA test, which is based on the tree shown in Figure 7.6 being true. This test focuses specifically on the errors associated with the coalescent process discussed in Chapter 5. Suppose the outgroup population, P4, has allele A, and we search through the genetic survey data to find cases in which exactly one of the remaining three populations shares A but two have a derived allele, B. If the assumed tree (Figure 7.6) is true (no gene flow or admixture) and if there is no homoplasy, ancestral polymorphism, or lineage sorting, then the derived allele B must have arisen on the internal branch, resulting in the allele configuration BBAA for the four populations when ordered as (P1, P2, P3, P4). However, homoplasy, ancestral polymorphisms, and lineage sorting can create random deviations from this perfect tree pattern. Since P4 has to have A by definition, there are only two alternative error patterns: ABBA and BABA.
These two patterns are just random deviations from the perfect tree pattern of BBAA under the hypothesis that the tree is true, so the expected number of ABBA cases should equal the expected number of BABA cases. There are many versions of the statistics testing this expectation, but the simplest one is (Durand et al. 2011):

$$\frac{N(ABBA) - N(BABA)}{N(ABBA) + N(BABA)} \qquad (7.3)$$

where N(XYZW) is the number of nucleotide sites or loci at which the pattern XYZW occurred. Under the null hypothesis of a population tree, the expected value of Eq. (7.3) is zero, and the statistical significance of deviations from zero can be evaluated (Durand et al. 2011). Although the ABBA/BABA test is often presented as a test for admixture, it is actually testing the null hypothesis of a population tree (Peter 2016; Ralph 2019). Besides admixture, recurrent gene flow (Solis-Lemus et al. 2016; Ralph 2019) and isolation by distance (Eriksson and Manica 2014) can also lead to the rejection of a population tree through the ABBA/BABA test family.

Networks provide another means of testing for treeness. Networks of individuals and populations were introduced in Chapter 6. A population tree is a specific type of network, so it is always a subset of the edges shown in the more general networks discussed in Chapter 6. Accordingly, networks allow the investigation of evolutionary relationships that do not fit a tree model (Bapteste et al. 2013), but do fully encompass trees as a special case. One common approach to generating a more generalized network that includes a tree as a sub-network is to start with a population tree, test for treeness for specific subsets of branches, and then add gene flow/admixture edges as needed onto the population tree to increase the overall fit of the network to the data. This is the strategy of the TreeMix program (Pickrell and Pritchard 2012). Pugach et al. (2016) analyzed genomic data on human populations from Siberia with TreeMix, but found the TreeMix results difficult to interpret and in contradiction to well-accepted aspects of the population history. These difficulties may arise from a more fundamental problem with starting with a population tree and then adding edges as needed to explain gene flow and admixture. Solis-Lemus et al. (2016) and Wen and Nakhleh


(2018) showed that when gene flow and/or admixture are present, basing inference on an initial tree is statistically inconsistent (Appendix B), that is, the estimators do not converge to the true answer with increasing amounts of data. To say the least, inconsistency is a highly undesirable statistical property. For example, this means that the TreeMix program is unlikely to reconstruct past admixture events accurately, and the more data you have, the more probable it is that TreeMix will give you the wrong answer. This implies that we should start with a general network that makes no assumption of a tree. Bradburd et al. (2016) developed a program SpaceMix that does not assume any underlying population tree. Instead, it starts with a geographic framework because gene flow and population movements are often constrained by geography. Overlaid upon this framework are admixture arrows, so admixture/gene flow is tested for in the context of a null hypothesis of geography rather than a null hypothesis of a population tree. Pugach et al. (2016) also analyzed their data with SpaceMix and found that the SpaceMix results fit the genomic data better and without contradictions to well-accepted aspects of the population history. SpaceMix indicated a history that included isolation by distance, long-distance dispersals, and multiple admixture events – all of which violate the assumption of a population tree. It is unlikely that any tree could capture the complexity of this data set, and it was therefore an inappropriate starting point for the analysis. The literature in population genetics is filled with many population trees, particularly the area of human population genetics (Templeton 2018a). Unfortunately, the vast majority of population trees appearing in the literature are never subjected to testing; yet, when tested, the underlying data frequently reject the hypothesis of treeness. 
Population trees overall are an unlikely explanation for the current population structure of many species. Instead, population structure is frequently influenced by gene flow, isolation by distance and resistance, and admixture, and only occasionally by fragmentation followed by isolation. Population trees should never be assumed in population genetics; rather, the null hypothesis of treeness should always be tested before accepting a population tree.
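As a minimal sketch of the two treeness statistics discussed in this section, the following Python code computes the f4 statistic of Eq. (7.2), averaged over loci, and the ABBA/BABA ratio of Eq. (7.3). The function names, the 0/1 coding of the ancestral (A) and derived (B) alleles, and the toy data are assumptions made purely for illustration. A real analysis would also need a standard error, for example from a block jackknife over the genome, before any deviation from zero could be called significant.

```python
def f4(p1, p2, p3, p4):
    """Average of (p1 - p2) * (p3 - p4) over loci (Eq. 7.2).

    Each argument is a list of allele frequencies, one entry per locus,
    for populations P1..P4 respectively.
    """
    products = [(a - b) * (c - d) for a, b, c, d in zip(p1, p2, p3, p4)]
    return sum(products) / len(products)


def abba_baba(s1, s2, s3, s4):
    """Count ABBA and BABA sites and return (n_abba, n_baba, ratio) (Eq. 7.3).

    Each argument holds one allele per site for a single sequence sampled
    from P1..P4, coded 0 = ancestral allele A, 1 = derived allele B.
    Only sites at which the outgroup P4 carries A can match either pattern.
    """
    n_abba = n_baba = 0
    for site in zip(s1, s2, s3, s4):
        if site == (0, 1, 1, 0):      # A B B A
            n_abba += 1
        elif site == (1, 0, 1, 0):    # B A B A
            n_baba += 1
    total = n_abba + n_baba
    # The ratio is undefined without informative sites; None marks that case.
    ratio = (n_abba - n_baba) / total if total else None
    return n_abba, n_baba, ratio


# Toy data (hypothetical): two loci for f4, six sites for ABBA/BABA.
f4_value = f4([0.5, 0.2], [0.3, 0.4], [0.6, 0.1], [0.2, 0.5])
counts = abba_baba(
    [0, 1, 1, 0, 0, 0],   # P1
    [1, 0, 1, 1, 1, 0],   # P2
    [1, 1, 0, 1, 1, 0],   # P3
    [0, 0, 0, 0, 0, 0],   # P4 (outgroup, always A)
)
print(round(f4_value, 3), counts)
```

In the toy data, three sites show ABBA and one shows BABA, so the ratio is positive; under a true population tree the two counts are expected to be equal.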

Using Haplotype Trees to Study Population History

We have already seen how haplotypes and haplotype trees can be used to infer past effective population sizes and admixture/gene flow events. However, haplotype trees can be used to achieve even greater insight into a population’s or species’ recent evolutionary history. Haplotype trees provide information about a haplotype’s existence through time. The temporal information inherent in a haplotype tree can be used to shed light upon the evolutionary history of the spatial distribution of current haplotypic variation. To see why, consider the study of Templeton and Georgiadis (1996) on mitochondrial DNA (mtDNA) restriction site variation in eastern African populations of buffalo (Syncerus caffer) and impala (Aepyceros melampus). Regarding the different mtDNA haplotypes in these samples as alleles, the F-statistic estimator of Davis et al. (1990) yields an Fst of 0.08 for the buffalo and an Fst of 0.10 for the impala. Both of these Fst values are significantly different from zero, but they are not significantly different from each other. Moreover, in both species, most of the geographical sites surveyed were relatively close together in Kenya and Tanzania, but one site (Chobe) was far to the south. In both species, the Chobe samples had many haplotypes not found in the other locations, and it was the Chobe samples that were responsible for the significant Fst’s in both cases. Hence, this F-statistic analysis implies that both species are equally subdivided, have comparable rates of gene flow, and display restricted gene flow primarily between the Chobe and the Kenya/Tanzania localities. Now, let us add information from the mtDNA haplotype trees for these two species with seemingly similar patterns of spatial variation as measured by F-statistics.


Figure 7.7 shows the haplotype tree estimated from the mtDNA and indicates the haplotypes found only in Chobe in both species. In the buffalo, the Chobe haplotypes are scattered throughout the haplotype tree; in the impala, the Chobe haplotypes are tightly clustered within the tree. Although both species show the same degree and pattern of spatial subdivision as measured by Fst, they have obviously achieved this degree of subdivision in very different fashions through time. Clearly, the use of haplotype trees allows a finer discrimination of biological pattern than an F-statistic analysis. The reason for this is straightforward: by using a haplotype tree, you examine a spatial/temporal pattern of genetic variation, whereas you examine only the current spatial pattern with the F-statistic. The scattered spatial/temporal pattern found in the buffalo (Figure 7.7) indicates recurrent genetic interchange between Chobe and the more northern populations throughout the time period from the coalescence of the sampled mtDNA haplotypes to the present. The impala pattern is more difficult to interpret. Such a strong evolutionary clustering of haplotypes in a geographical region, particularly when the haplotype clusters are separated by a long branch length with missing intermediates, is often interpreted as evidence of a past fragmentation event (Avise 1994). However,

[Figure 7.7 panels: (a) Buffalo (Syncerus caffer); (b) Impala (Aepyceros melampus). Legend: haplotypes found only in Chobe.]

Figure 7.7 The unrooted mtDNA haplotype trees for two species of ungulates: (a) the buffalo (Syncerus caffer) and (b) the impala (Aepyceros melampus) sampled from the same geographical sites over eastern Africa. Each line represents a single restriction site change in the haplotype tree, with dashed lines indicating ambiguity under statistical parsimony. Haplotypes are designated by numbers, and a “0” indicates an internal node in the haplotype tree that was not found in the sample of current haplotypic variation. The haplotypes found in the distant, southern location of Chobe are shaded.


because impala are found in intermediate geographical locations that were not sampled, it is possible that this pattern arose from recurrent gene flow under an isolation by distance model (Chapter 6) such that geographically intermediate populations would fill in the missing haplotype nodes and show a gradual shift from one cluster of haplotypes to the other, as discussed in Chapter 6. Indeed, a rigorous quantitative analysis of these data (of a type to be discussed below) reveals that the sparseness of sampling prevents one from distinguishing between isolation by distance versus fragmentation of the Chobe population from the Kenyan/Tanzanian populations of impala (Templeton and Georgiadis 1996). The ambiguity in interpreting the impala mtDNA tree illustrates the dangers of making biological inferences by a visual inspection of how geography overlays upon a haplotype tree. Such visual inferences make no assessment of adequate sample sizes for statistical significance nor adequate sampling of geographical locations for distinguishing among potential causes of geographical associations. One method for addressing these difficulties is a nested-clade phylogeographic analysis (Templeton et al. 1995) that quantifies the associations between geography, time, and the haplotype tree. As we saw earlier concerning rare haplotypes and shared IBD segments, information about the age of a haplotype can be used as a powerful tool to investigate the trajectory of that haplotype over space through time. A haplotype tree contains much information about the relative ages of all haplotypes. One method of obtaining temporal information from the haplotype tree is to use a molecular clock to estimate the ages of the various haplotypes in a rooted tree. 
This seemingly simple approach is often not easy to implement because of the difficulties associated with obtaining a good calibration for the clock, possible deviations from the clock, and the inherent large evolutionary stochasticity associated with the coalescent process itself. We will make use of such detailed temporal reconstruction later in this chapter, but nested-clade analysis extracts much temporal information from a haplotype tree without the need for a molecular clock or age estimates. Nested-clade analysis first uses the haplotype tree to define a series of hierarchically nested clades (branches within branches). Such nested hierarchies are commonly used in comparative evolutionary analyses of species or higher taxa, but can also be applied to the haplotype variation found within a species if that variation can be placed into a haplotype tree (Templeton et al. 1987a). Temporal information is captured by the nested hierarchies because the age of a specific haplotype or clade of haplotypes must be less than or equal to the age of the clade within which it is nested. If the haplotype tree is rooted, then we can define a set of nested hierarchies in which each clade is strictly younger than the clade within which it is nested. In either case, a nested hierarchy of clades captures much of the relative temporal information inherent in a haplotype tree that does not depend upon a molecular clock. Templeton et al. (1987a) and Templeton and Sing (1993) proposed a set of rules to produce a nested series of haplotypes and clades. To achieve the first level of nesting, one starts at the tips of the haplotype tree. Recall from Chapter 5 that a tip simply refers to a haplotype that is connected to the tree by only one branch. For example, Figure 7.8 shows a simple haplotype tree for mtDNA from elephants obtained by mapping restriction enzyme cut sites (Georgiadis and Templeton, unpublished data). 
The tree consists of eight haplotypes found in savanna elephant populations from eastern Africa, with an Asian elephant haplotype used as an outgroup to root the tree. Notice that haplotypes 2, 3, 5, 6, 8, and 9 are all connected to the tree by a single branch, and hence are tips. In contrast, haplotypes 1, 4, and 7 have more than one branch connecting to them in the tree, and therefore represent interior nodes of the tree. These haplotypes are therefore called interior haplotypes. To create the first level of nested haplotypes, move one mutational step from the tips into the interior, and place all haplotypes that are interconnected by this procedure into a single clade.
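The first-level nesting rule just stated can be sketched in a few lines of Python. This is only an illustrative sketch, not the published algorithm: the function name is hypothetical, the adjacency list for the elephant tree is an assumption chosen to be consistent with the tips and interiors listed above for Figure 7.8 (the Asian outgroup is left out), and repeating the procedure on the pruned tree is how the sketch handles haplotypes that lie more than one step from any tip. A haplotype left stranded between clades is simply returned as a clade of its own, and a network containing a loop is rejected rather than resolved.

```python
def one_step_clades(adjacency):
    """Group haplotypes into 1-step clades by working inward from the tips.

    adjacency: dict mapping each haplotype to the haplotypes that are one
    mutational step away from it.
    """
    remaining = {h: set(nbrs) for h, nbrs in adjacency.items()}
    clades = []
    while remaining:
        tips = sorted(h for h, nbrs in remaining.items() if len(nbrs) <= 1)
        if not tips:
            raise ValueError("network contains a loop; resolve it first")
        clade_of = {}        # haplotype -> index into round_clades
        round_clades = []
        for tip in tips:
            if tip in clade_of:           # already grouped with another tip
                continue
            # A tip with no neighbor left is stranded and nests with itself.
            interior = next(iter(remaining[tip]), tip)
            if interior in clade_of:
                index = clade_of[interior]
            else:
                index = len(round_clades)
                round_clades.append(set())
            for h in (tip, interior):
                round_clades[index].add(h)
                clade_of[h] = index
        # Prune the haplotypes just nested before the next round.
        for h in clade_of:
            for nbr in remaining.pop(h):
                if nbr in remaining:
                    remaining[nbr].discard(h)
        clades.extend(round_clades)
    return clades


# Assumed adjacency for the African elephant haplotypes of Figure 7.8.
elephant = {1: [2, 3, 4], 2: [1], 3: [1], 4: [1, 5, 6, 7], 5: [4], 6: [4],
            7: [4, 8, 9], 8: [7], 9: [7]}
elephant_clades = [sorted(c) for c in one_step_clades(elephant)]

# Hypothetical five-haplotype chain in which the middle haplotype is stranded.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"],
         "E": ["D"]}
chain_clades = [sorted(c) for c in one_step_clades(chain)]
print(elephant_clades, chain_clades)
```

For the elephant tree the sketch returns three groups of three haplotypes; for the chain it returns the two tip clades plus the stranded middle haplotype, which additional nesting rules would then have to assign to one of its neighbors.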


For example, moving one mutational step in from tip haplotype 9 connects you to interior haplotype 7 (Figure 7.8). Similarly, moving in one step from tip haplotype 8 also connects you to interior haplotype 7. Therefore, haplotypes 7, 8, and 9 are grouped together into a “1-step clade” which is designated as clade 1-3 in Figure 7.8. Similarly, moving in one mutational step from tip haplotypes 5 and 6 connects to interior haplotype 4, which are now nested together into 1-step clade 1-2; and moving in one mutational step from tip haplotypes 2 and 3 connects to interior haplotype 1, which defines clade 1-1 (Figure 7.8). In this small haplotype tree, all haplotypes are now found in one of the 1-step clades, but in larger haplotype trees, there may be many interior haplotypes that are more than one mutational step from any tip haplotype. In those cases, the initial set of 1-step clades at the tips are pruned off the haplotype tree, and the same nesting procedure is then applied to the more interior portions of the pruned tree. Additional rounds of pruning and nesting are repeated as needed until all haplotypes have been placed into 1-step clades. Additional nesting rules are needed in case some haplotypes or clades are left stranded between two other nesting categories. Such a situation is shown in Figure 7.9. In cases such as these, the stranded haplotype or clade (haplotype C in Figure 7.9) should be nested with either the A-B clade or with the D-E clade. To determine which, first see if one clade has haplotypes that are mutationally closer to the stranded clade than the other. For example, suppose in Figure 7.9 that haplotypes A, B, C, and E are actually present in

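[Figure 7.8 diagram. Haplotypes and the sites at which each was found: 1 (A, T, S, H, V); 2 (A, T); 3 (A); 4 (A, T, H, V); 5 (S, H, M); 6 (S); 7 (H, V, M); 8 (H); 9 (V). 1-step clades: 1-1, 1-2, 1-3. Root: Asian elephants. Legend bars: Dc, Dn. Site codes: A = Amboseli, T = Tsavo, S = Sengwa, H = Hwange, V = Victoria Falls, M = Matetsi.]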

Figure 7.8 The rooted mtDNA haplotype tree for African elephants sampled from six geographical sites over eastern Africa. Each line represents a single restriction site change in the haplotype tree. Each circle represents a haplotype, with the number in the circle being the haplotype designation. Below the number, letters indicate the sampling sites at which that particular haplotype was found. Below that are bars that indicate the clade distance (black) and nested-clade distance (gray), as given in Table 7.1. The length of the bars in the legend box corresponds to a distance of 1000 km. The relative sizes of the circles represent the number of different sites at which a haplotype was present. The dashed line rectangles enclose the haplotypes nested together into 1-step clades using the nesting rules of Templeton et al. (1987a) and Templeton and Sing (1993). Three 1-step clades result, designated by 1−x, where x is a number designating a specific 1-step clade. Sources: Templeton et al. (1987a) and Templeton and Sing (1993).


Figure 7.9 A simple haplotype network consisting of five haplotypes (A through E). The standard nesting rules start at the tips and go one mutational step inward, so haplotypes A and B are nested together into 1-step clade 1-1, and haplotypes D and E are nested together into 1-step clade 1-2. This leaves haplotype C unnested between clades 1-1 and 1-2.

the sample of genes, but that D is an inferred intermediate node that is not actually present in the sample. In such a case, haplotype C is mutationally closer to clade 1-1 than to clade 1-2, and therefore C should be nested into clade 1-1. When the stranded clade is mutationally equidistant to its nesting alternatives, the stranded clade should be nested with the nesting clade that has the smallest number of observations, as this nesting maximizes statistical power (Templeton and Sing 1993).

The second level of nesting uses the same rules, but the rules are now applied to 1-step clades rather than haplotypes and result in “2-step clades”. For example, for the elephant mtDNA tree shown in Figure 7.8, clade 1-3 is a tip 1-step clade. Moving into the interior one mutational step from 1-3 connects to the interior clade 1-2. Therefore, clades 1-2 and 1-3 are nested together into a 2-step clade. However, in this case, clade 1-1 is left unnested. In order not to leave any clade out of the nested design at this level, clade 1-1 is nested together with 1-2 and 1-3 to form a single 2-step clade that contains all the African elephant haplotypes, as there is no other possible clade with which to nest it. In the case of a larger haplotype network, this nesting procedure is repeated using 2-step clades as its units, and so on until a nesting level is reached that would result in only a single clade spanning the entire original haplotype network. In Figure 7.8, the procedure ends at the 2-step level as there is only a single 2-step clade.

Nested haplotype trees such as that shown in Figure 7.8 contain much temporal information. For example, within clade 1-3, haplotypes 8 and 9 are younger than haplotype 7. Note that we do not know if haplotype 8 is younger than haplotype 9 or vice versa, but we do know that both of these tip haplotypes must be younger than the interior haplotype to which they are connected.
Even if the tree were unrooted, coalescent theory predicts that tips are highly likely to be younger than the interiors to which they are connected (Castelloe and Templeton 1994), so both rooted and unrooted trees contain temporal information in their nested-clade hierarchies. This temporal information extends to the higher nesting levels. For example, defining the age of a clade as the age of its oldest member, we know that clade 1-3 is younger than clade 1-2 because 1-3 is the tip relative to the 1-2 interior. In this manner, turning a haplotype tree into a series of nested clades captures much information about relative temporal orderings, although some aspects of time are left undefined. Nevertheless, even this partial information about temporal ordering can be used to analyze the spread of haplotypes and clades through space and time in a manner that does not depend upon a molecular clock or dating. A nested-clade analysis also requires the spatial distribution of haplotypes and clades of haplotypes to be quantified. The geographical data are quantified in two main fashions (Templeton et al. 1995; Posada et al. 2006). The first is the clade distance, Dc, that measures how widespread the clade is spatially. When measuring spatial spread with geographical distances, the clade distance is determined by calculating the average latitude and longitude for all observations of the clade in the sample, weighted by the local frequencies of the clade at each location. This estimates the geographical center for the clade. Next, the great circle distance (the shortest distance on the surface of a sphere between two points on the surface) from a location containing one or more members of the clade to the geographical center is calculated, and these distances are averaged over all locations


containing the clade of interest, once again weighted by the frequency of the clade in the local sample.

Sometimes, geographical distance is not the most appropriate measure of space. For example, suppose a sample is taken of a riparian fish species. Because rivers do not flow in straight lines and because the fish are confined in their movements to the river, the geographical distance between two sample sites on the river is not relevant to the fish; rather, the important distance in this case is the distance between the two points going only along the river. In cases such as these, the investigator should define the distances between any two sample points in the most biologically relevant fashion (including resistance rather than just geographical distance), and the clade distance is now calculated as the average pairwise distance between all observations of the clade, once again weighted by local frequencies.

Table 7.1 shows the geographical clade distances calculated for all the haplotypes and clades for the elephant mtDNA shown in Figure 7.8. Haplotype 3 has a clade distance of 0 km, indicating that this haplotype was found in only a single location. Haplotype 2 has a clade distance of 81 km, indicating the close geographical proximity of the two locations (Amboseli and Tsavo, both located in Kenya) in which this haplotype is found. Haplotype 1 has a clade distance of 1021 km, reflecting the fact that it is found at high frequency over a large geographic area extending from Kenya to Zimbabwe. Table 7.1 gives the clade distances for all the other haplotypes in Figure 7.8 and for all the 1-step clades formed from them. It is also important to note that the clade distances, being an average weighted by local clade frequencies, are a function of both how geographically widespread a clade is and also the frequencies of the clade across this geographical range.
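The geographical version of the clade distance described above can be sketched as follows. This is a minimal illustration rather than the GEODIS implementation: the function names, the use of the haversine formula with a mean Earth radius of 6371 km, the use of copy counts as the local-frequency weights, and the coordinates in the example are all assumptions.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km; inputs in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def clade_distance(observations):
    """Dc for one clade.

    observations: list of (latitude, longitude, count), one entry per
    sampling site at which the clade was found; count is the number of
    copies of the clade observed at that site.
    """
    n = sum(count for _, _, count in observations)
    # Geographical center: count-weighted mean latitude and longitude.
    center_lat = sum(lat * c for lat, _, c in observations) / n
    center_lon = sum(lon * c for _, lon, c in observations) / n
    # Count-weighted mean distance of the observations from that center.
    return sum(
        c * great_circle_km(lat, lon, center_lat, center_lon)
        for lat, lon, c in observations
    ) / n


# Hypothetical examples: a clade seen once, and a clade split evenly
# between two sites on the equator that are 2 degrees of longitude apart.
single_site = clade_distance([(-2.6, 37.3, 1)])
two_sites = clade_distance([(0.0, 0.0, 5), (0.0, 2.0, 5)])
print(single_site, round(two_sites, 1))
```

A clade observed at a single site necessarily has a clade distance of zero, whereas the evenly split clade has a clade distance equal to half the separation between its two sites.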
For example, note in Table 7.1 that clade 1-2 has a clade distance of 460 km, whereas clade 1-1 has a clade distance of 884 km. Yet, as shown in Figure 7.8, members of clade 1-2 are found at all six sampling sites in eastern Africa,

Table 7.1 The clade and nested-clade distances for the haplotypes and 1-step clades shown in Figure 7.8, and their old minus the young average clade and nested-clade distances within a nested group.

Within 1-Step Clades

| Haplotypes | No. in Sample | Dc (km) | Dn (km) |
|---|---|---|---|
| 1 | 35 | 1021 L∗∗∗ | 1027 L∗∗∗ |
| 2 | 20 | 81 S∗∗∗ | 657 S∗∗∗ |
| 3 | 1 | 0 | 601 |
| Old-Young | | 944 L∗∗∗ | 373 L∗∗∗ |
| 4 | 11 | 959 L∗∗∗ | 832 L∗∗∗ |
| 5 | 16 | 114 | 249 S∗ |
| 6 | 3 | 0 | 156 S∗ |
| Old-Young | | 862 L∗∗∗ | 598 L∗∗∗ |
| 7 | 27 | 47 | 47 |
| 8 | 1 | 0 | 126 |
| 9 | 1 | 0 | 68 |
| Old-Young | | 47 | −50 |

Within Total Tree

| 1-Step Clades | Dc (km) | Dn (km) |
|---|---|---|
| 1-1 | 884 | 1173 L∗∗∗ |
| 1-2 | 460 S∗∗∗ | 768 S∗∗∗ |
| 1-3 | 49 S∗∗∗ | 759 S∗∗ |
| Old-Young | 626 L∗∗∗ | 409 L∗∗∗ |

Note: Significance relative to the null hypothesis of no geographical associations of haplotypes or clades within a nesting clade is indicated as determined by 1000 random permutations using the program GEODIS, with L and S designating significantly Large and Small, respectively, and ∗ = significant at the 5% level, ∗∗ at the 1% level, and ∗∗∗ at the 0.1% level.


whereas clade 1-1 is found in only five. The reason for this apparent discrepancy is that the five sites at which 1-1 is found cover nearly the same geographical distance as the six sites at which 1-2 is found and because clade 1-1 is found in high frequency in both Kenya and Zimbabwe, whereas clade 1-2 is common only in one area, Zimbabwe. Therefore, on the average, clade 1-2 is much more geographically concentrated than clade 1-1. The second measure of geographical distribution of a haplotype or clade is the nested clade distance, Dn. The nested-clade distance quantifies how far away a haplotype or clade is located from those haplotypes or clades to which it is most closely related evolutionarily, that is, the clades with which it is nested into a higher level clade. For geographical distance, the first step in calculating the nested-clade distance is to find the geographical center for all individuals bearing members not only of the clade of interest but also bearing any other clades that are nested with the clade of interest at the next higher level of nesting. For example, to calculate Dn for haplotype 1 nested in 1-1 in Figure 7.8, one first finds the geographical center of haplotypes 1, 2, and 3 pooled together (the haplotypes nested within 1-1). This is the geographical center of the nesting clade. The nested-clade distance is then calculated as the average distance that an individual bearing a haplotype from the clade of interest lies from the geographical center of the nesting clade. Once again, all averages are weighted by local frequencies. When the investigator defines the distances between sample locations, the nested-clade distance is the average pairwise distance between an individual bearing a haplotype from the clade of interest to individuals bearing any haplotype from the nesting clade that contains the clade of interest. 
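The contrast between the two geographical measures can be made concrete with a short sketch. The code below computes both Dc and Dn for one clade within its nesting clade, following the verbal definitions above. It is an illustration only: the function names, the haversine formula with an assumed Earth radius of 6371 km, the use of copy counts as weights, and the two-site example are all assumptions rather than the GEODIS implementation.

```python
from math import asin, cos, radians, sin, sqrt


def great_circle_km(p, q, radius_km=6371.0):
    """Haversine distance in km between points p and q given as (lat, lon)."""
    phi1, phi2 = radians(p[0]), radians(q[0])
    dlam = radians(q[1] - p[1])
    a = sin((phi2 - phi1) / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * radius_km * asin(sqrt(a))


def center(rows):
    """Count-weighted mean latitude and longitude of (lat, lon, count) rows."""
    n = sum(c for _, _, c in rows)
    return (sum(lat * c for lat, _, c in rows) / n,
            sum(lon * c for _, lon, c in rows) / n)


def mean_distance(rows, point):
    """Count-weighted mean great-circle distance from the rows to a point."""
    n = sum(c for _, _, c in rows)
    return sum(c * great_circle_km((lat, lon), point)
               for lat, lon, c in rows) / n


def clade_and_nested_distance(nesting_clade, name):
    """Return (Dc, Dn) for the clade `name` inside `nesting_clade`.

    nesting_clade: dict mapping each nested clade to its (lat, lon, count)
    rows. Dc uses the clade's own center; Dn uses the center of all clades
    pooled within the nesting clade.
    """
    own = nesting_clade[name]
    pooled = [row for rows in nesting_clade.values() for row in rows]
    return mean_distance(own, center(own)), mean_distance(own, center(pooled))


# Hypothetical nesting clade: a tip confined to one site and an interior
# haplotype confined to another site 2 degrees of longitude away.
nesting = {"tip": [(0.0, 0.0, 4)], "interior": [(0.0, 2.0, 4)]}
dc, dn = clade_and_nested_distance(nesting, "tip")
print(dc, round(dn, 1))
```

The tip in this example has a clade distance of zero but a nested-clade distance of roughly 111 km: its bearers sit together at one site, yet that site lies away from the center of the clade within which the tip is nested.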
In those cases where some other distance is used (as in our fish example), the nested-clade distance is defined as the average pairwise distance between all copies of the clade of interest and all copies of clades in the same nesting group, including itself. Table 7.1 also gives the geographic nested-clade distances for all the elephant mtDNA haplotypes and the 1-step clades into which they are nested. For example, the nested-clade distance of haplotype 1 is 1027 km, which is nearly identical to its clade distance of 1021 km. This means that the geographical center of haplotype 1 is almost the same as the geographical center for clade 1-1, resulting in very little difference between these two distance measures. In contrast, the nested-clade distances for haplotypes 2 and 3 are 657 km and 601 km, respectively, indicating that the geographical centers of these two haplotypes are located far away from the geographical center of clade 1-1. The clade distances for haplotypes 2 and 3, however, are only 81 and 0 km, respectively. Thus, although these two haplotypes are far away from the geographical center of clade 1-1, all individuals bearing these haplotypes are found close together, with bearers of haplotype 3 being confined to a single sampling location (corresponding to a clade distance of 0).

Just as we saw above that it is dangerous to make biological inferences from a visual overlay of geography upon a haplotype tree, it is equally dangerous to make biological inferences from just the observed values of quantitative distance measurements such as those given in Table 7.1. For example, haplotype 3 has a clade distance of 0 km. However, from Table 7.1, we also see that haplotype 3 was only observed once in the entire sample. Hence, haplotype 3 has to have a clade distance of 0 because a single copy of this haplotype must by necessity occur at only one location. Consequently, this small clade distance is without statistical significance.
In contrast, tip haplotype 2 has a clade distance of 81 km, but this is based on 20 observations of this haplotype (Table 7.1). Thus, we can be much more confident that haplotype 2 has a highly restricted geographical range than haplotype 3. Our degree of confidence in the numbers presented in Table 7.1 can be quantified by testing the null hypothesis that the haplotypes or clades nested within a high-level nesting clade show no geographical associations given their overall frequencies. This null hypothesis is tested by randomly permuting the observations (Appendix B) within


a nesting clade across geographical locations in a manner that preserves the overall clade frequencies and sample sizes per locality (Templeton et al. 1995). After each random permutation, the clade and nested-clade distances can be recalculated. By doing this a thousand or more times, the distribution of these distances under the null hypothesis of no geographical associations for a fixed frequency can be simulated. The observed clade and nested-clade distances can then be contrasted to this null distribution, and we can infer which distances are statistically significantly large and which are significantly small. Table 7.1 also shows the significantly (at the 5% level) small and large clade and nested-clade distances as determined by the computer program GEODIS (Posada et al. 2000). Because our biological interest in haplotype trees centers around how space and time are associated, some statistical power can be enhanced within a nesting clade by taking the average of the clade and nested-clade distances for all the tips pooled together and subtracting the tip average from the corresponding average for the older interiors. The average interior-tip difference still captures the temporal contrast of old versus young within a nesting clade, but often has greater power to reject the null hypothesis of no geographical association. The interior minus tip clade and nested clade average distances are also shown in Table 7.1. One great statistical benefit of the nested design is that the tests in different nested clades are independent (Prum et al. 1990). Therefore, GEODIS corrects the significance levels for multiple testing across the entire nested design using a standard Dunn-Sidak correction. One potential problem with nested-clade analysis is that it depends upon an estimate of the haplotype tree, which is often estimated with some error or ambiguity (Chapter 5). Templeton and Sing (1993) addressed this problem. 
First, procedures such as statistical parsimony and other coalescent criteria can greatly reduce the ambiguity in an estimate of the haplotype tree (Chapter 5). Second, much of the ambiguity that remains does not affect the nested design. One of the most common sources of ambiguity in haplotype trees is in the exact connections between clusters of haplotypes separated by a long branch, as shown by the loop of ambiguity at the end of the long branch interconnecting the haplotype cluster at the right-side of the 3’ LPL haplotype tree (Figure 5.20b) with the left-side haplotype cluster. The nested design would be the same regardless of how this loop is resolved. However, in some cases, the ambiguity would affect the nested design, as was the case for the loop shown in the impala mtDNA tree (Figure 7.7). Templeton and Sing (1993) solved problems like this by simply including all haplotypes in such a loop of ambiguity into a single nested clade. This ensures that the results of the analysis are not affected by the ambiguity, but it does reduce the genetic resolution of the nested design and thereby erodes statistical power. Heckerman (2007) solved this problem by deriving a Bayesian (Appendix B) version of nested-clade phylogeographic analysis that explicitly quantifies the uncertainty of the underlying haplotype tree structure. This uncertainty is then fully propagated throughout the remainder of the analysis, including the nested design itself and the statistical significance of the clade geographic measures. Nested-clade analysis was the first method in the field now called statistical phylogeography because all phylogeographic inference is based on statistically significant results rather than just a visual inspection of how the haplotype tree overlays upon geography. Statistical significance is not the same as biological significance. 
Statistical significance tells us that the measures we are calculating are based upon a sufficient number of observations that we can be confident that geographical associations exist with the haplotype tree. However, statistical significance alone does not tell us how to interpret those geographical associations. To arrive at biological significance, we must examine how various types of recurrent gene flow or historical events can create specific patterns of geographical association.
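The permutation procedure described above can be sketched in a few lines. The following is only an illustrative sketch and not the GEODIS implementation. It assumes planar (x, y) coordinates in kilometers rather than great-circle distances, treats the labels as the clades nested within a single nesting clade, and uses hypothetical data and function names (clade_distance, permutation_test, sidak_alpha). Shuffling clade labels over individuals while leaving each individual at its sampling site preserves both the clade frequencies and the sample size per locality.

```python
import random

def clade_distance(labels, coords, clade):
    """Mean distance of a clade's individuals from that clade's own center.

    Assumption: coords are planar (x, y) positions in km, a simplification
    of the great-circle distances used in real analyses.
    """
    pts = [c for lab, c in zip(labels, coords) if lab == clade]
    cx = sum(x for x, _ in pts) / len(pts)
    cy = sum(y for _, y in pts) / len(pts)
    return sum(((x - cx) ** 2 + (y - cy) ** 2) ** 0.5 for x, y in pts) / len(pts)

def permutation_test(labels, coords, clade, n_perm=1000, seed=0):
    """Return (observed, p_small, p_large) for one clade's distance.

    Each permutation shuffles clade labels across individuals, so clade
    frequencies and per-locality sample sizes stay fixed.
    """
    rng = random.Random(seed)
    observed = clade_distance(labels, coords, clade)
    shuffled = list(labels)
    n_small = n_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        d = clade_distance(shuffled, coords, clade)
        n_small += d <= observed  # permuted value at least as small as observed
        n_large += d >= observed  # permuted value at least as large as observed
    return observed, n_small / n_perm, n_large / n_perm

def sidak_alpha(alpha, n_tests):
    """Per-test significance level under the Dunn-Sidak correction."""
    return 1 - (1 - alpha) ** (1 / n_tests)

# Hypothetical sample: three sites on a transect, 10 individuals each.
# Interior clade "A" occurs everywhere; tip clade "B" only at the middle site.
coords = [(0, 0)] * 10 + [(100, 0)] * 10 + [(200, 0)] * 10
labels = ["A"] * 10 + ["A"] * 5 + ["B"] * 5 + ["A"] * 10
obs, p_small, p_large = permutation_test(labels, coords, "B")
```

In this toy sample the tip clade is confined to one site, so its observed distance is zero and very few permutations give a value that small. With several tests in a nested design, each tail probability would be compared against sidak_alpha(0.05, n_tests) rather than 0.05.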

Expected Patterns Under Isolation by Distance

As detailed in Chapter 6, isolation by distance creates associations between genetic variation and geography. Because gene flow restricted by isolation by distance implies only limited movement by individuals during any given generation, it takes time for a newly arisen haplotype to spread geographically. Obviously, when a mutation first occurs, the resulting new haplotype is found only in its area of origin. With each passing generation, a haplotype lineage that persists has a greater and greater chance of spreading to additional locations via gene flow. Hence, the clade distances should increase with time under a model of restricted gene flow. One of the more common types of restricted gene flow is isolation by distance (Wright 1943; Chapter 6). Under isolation by distance, the spread of a haplotype through space occurs via small geographical movements in any given generation, resulting in a strong correlation between how widespread a haplotype (or clade) is (as measured by Dc) and its temporal position in the haplotype tree. The older the haplotype, the more widespread it is expected to be. Moreover, newer haplotypes are found within the geographical range of the haplotype from which they were derived (taking into account sampling error), and since geographical centers move slowly under this model, the clade and nested-clade distances should yield similar patterns of statistical significance.

The expectations under isolation by distance are illustrated by the mtDNA of elephants in the savannas of eastern Africa (Figure 7.8 and Table 7.1). As can be seen from Figure 7.8, the geographical range of a haplotype consistently increases as we go from younger (tip) to older (interior) haplotypes no matter where we are in the haplotype tree. The distances given in Table 7.1 quantify this visual pattern.
Starting with haplotypes nested within clade 1-1, we see that the clade distance of the older interior haplotype 1 is significantly large, whereas the clade distance of the younger tip haplotype 2 is significantly small. Overall, the older interior clade distance is significantly larger than the average of the younger tip-clade distances. This same pattern of statistical significance is observed in a parallel fashion with the nested-clade distances. Thus, both clade and nested-clade distances within clade 1-1 show a significant decline as one goes from older to younger haplotypes, as expected under isolation by distance. Moreover, this same pattern is repeated within clade 1-2. (Because tip haplotypes 8 and 9 were each only observed one time, there is no statistical significance to the patterns observed within clade 1-3.) As we go to the 1-step clades nested within the single 2-step clade for the elephant mtDNA tree, we also see both clade and nested-clade distances declining with declining age in a statistically significant fashion. Hence, the pattern of statistically significant results obtained with elephant mtDNA implies that savanna elephants over eastern Africa were genetically interconnected by recurrent gene flow that is restricted by geographical distance.
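To make the comparison of interior and tip distances concrete, the sketch below computes clade and nested-clade distances for a toy data set. It assumes the usual definitions, which are not restated in this section: the clade distance Dc is the mean great-circle distance of a clade's sampled individuals from that clade's geographical center, and the nested-clade distance Dn is their mean distance from the center of the nesting clade. The geographical center is approximated here by a count-weighted mean of latitude and longitude. The sites, counts, and function names are hypothetical and are not the elephant data of Table 7.1.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def great_circle_km(p, q):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def center(counts, sites):
    """Count-weighted mean latitude/longitude (a small-area approximation)."""
    n = sum(counts.values())
    lat = sum(sites[s][0] * k for s, k in counts.items()) / n
    lon = sum(sites[s][1] * k for s, k in counts.items()) / n
    return lat, lon

def mean_distance(counts, sites, ref):
    """Average distance of a clade's sampled individuals from a reference point."""
    n = sum(counts.values())
    return sum(great_circle_km(sites[s], ref) * k for s, k in counts.items()) / n

def clade_and_nested_distances(clade_counts, sites):
    """Map each clade to (Dc, Dn) within one nesting clade."""
    pooled = {}
    for counts in clade_counts.values():
        for s, k in counts.items():
            pooled[s] = pooled.get(s, 0) + k
    nesting_center = center(pooled, sites)
    return {
        name: (mean_distance(counts, sites, center(counts, sites)),
               mean_distance(counts, sites, nesting_center))
        for name, counts in clade_counts.items()
    }

def interior_minus_tip(distances, interiors, tips):
    """Average interior distance minus average tip distance, for (Dc, Dn)."""
    def avg(names, i):
        return sum(distances[n][i] for n in names) / len(names)
    return tuple(avg(interiors, i) - avg(tips, i) for i in (0, 1))

# Hypothetical sampling sites (lat, lon) and haplotype counts per site.
sites = {"s1": (0.0, 36.0), "s2": (0.0, 37.0), "s3": (0.0, 38.0)}
clade_counts = {
    "interior": {"s1": 5, "s2": 5, "s3": 5},  # older, widespread haplotype
    "tip": {"s2": 4},                          # younger haplotype inside that range
}
result = clade_and_nested_distances(clade_counts, sites)
contrast = interior_minus_tip(result, ["interior"], ["tip"])
```

In this toy case the tip haplotype sits inside the range of the interior haplotype, so both the Dc and the Dn interior-minus-tip contrasts are positive, which is the qualitative pattern expected under isolation by distance. Whether such contrasts are significant would still have to be judged against a permutation null distribution.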

Expected Patterns Under Fragmentation

Historical events can also create strong associations between haplotypes and geography. One such event is past fragmentation followed by complete or nearly complete genetic isolation. Genetic isolation means that haplotypes or clades that arose after fragmentation but in the same isolate will show concordant restricted spatial distributions that correspond to the geographical area occupied by the isolates in which they arose. Genetic isolation also means that fragmented populations behave much as separate species, but with the barrier to gene flow being only geographical in this case. Thus, the relationship of haplotype trees to geographically fragmented populations is like that shown in Figure 5.18 for the relationship between haplotype trees and species, but now with the additional restriction that isolates (the “species” shown in Figure 5.18) inhabit different geographical areas.

Not all types of fragmentation are covered by the original, single-locus nested-clade analysis. For example, single-locus nested-clade analysis is not applicable to the case of micro-vicariance in which an ancestral population is fragmented into numerous local isolates such that there is little to no correlation of geographical distance with genetic isolation (Templeton et al. 1995). An example of this type of micro-vicariance is illustrated by the fst values of the Ozark populations of collared lizards in Figure 6.21. However, single-locus nested-clade analysis is applicable to allopatric fragmentation in which an ancestral population is split into two or more isolates such that each isolate consists of geographically contiguous subpopulations occupying an area that is mostly geographically separated from the areas occupied by other isolates. It is also applicable to long-distance colonization in which a long-distance dispersal event establishes a new population that is disjunct from the ancestral population. This results in fragmentation as well when there is little to no subsequent gene flow between the ancestral population and the colony. A multi-locus version of nested-clade analysis will soon be described that can deal with all types of fragmentation.

Figure 5.18 reveals that several patterns are possible under fragmentation depending on the time of fragmentation relative to the time scale of the coalescent process. If the fragmentation event lasts longer than the typical coalescent time, the haplotype tree will develop monophyletic clades of haplotypes that mark the isolates (Figure 5.18c). If the fragmentation event is much older than the coalescent time, many mutations should accumulate, resulting in the clades that mark the different isolates being interconnected with branch lengths that are much longer than the average branch length in the tree.
However, not all cases of fragmentation and isolation are marked by strict monophyly of haplotype tree clades (Panels a and b in Figure 5.18), and strict monophyly can also be destroyed by subsequent admixture/gene flow events (Templeton 2001). Therefore, a strict monophyletic correspondence of clades with geography is a strong but not a necessary indicator of fragmentation given adequate geographical sampling. Regardless of whether there is monophyly or not, haplotypes or clades that arose after the fragmentation event either do not spread beyond the confines of the isolate in which they arose or spread in a highly restricted manner if limited introgression is present. This means that the clade distance tends not to increase much beyond the geographical ranges of the fragmented isolates. Even if this clade had been introduced to another isolate by some rare admixture or dispersal event, the frequency of the clade in the other isolate will generally be rare as long as isolation is the norm. Such rarely occurring admixture events therefore have little impact on clade distance. Although the magnitude of the clade distance is severely restricted by fragmentation, the same is not true for the nested-clade distances. The nested-clade distances can suddenly become much larger than the clade distances when the nesting clade contains haplotypes or clades found in other isolates, either because the nesting clade is older than the fragmentation event itself and therefore has descendants in more than one isolate, or because the nesting clade has members in more than one isolate due to the sorting of ancestral polymorphic lineages (Panels a and b in Figure 5.18). These sudden increases of nested-clade distance relative to clade distance generally occur in the higher nesting levels under fragmentation because the discrepancies generally arise within those nesting clades containing ancestral polymorphic lineages that predate the fragmentation event (Figure 5.18). 
In contrast, under restricted gene flow, as clades get older and older, the clades tend to become increasingly uniform in their spatial distributions. This means that the clade and nested-clade distances tend to converge with increasing age of the clade under restricted but recurrent gene flow models, as was observed for the savanna elephant mtDNA (Figure 7.8). Thus, a large increase in nested-clade distance over clade distance as one goes backwards in time is yet another signature of past fragmentation.

Another way of having a clade not increase its geographical distribution with increasing time into the past is when there is sufficient gene flow such that the clade has spread over the entire geographical range of the species. This results in a loss of significant geographical associations at the deepest clade levels because there is sufficient gene flow and time to have homogenized the geographical distributions of these older clades. However, fragmentation often appears at the oldest levels of the nested design, just the opposite of the expectations for spatial homogenization via gene flow. This is yet another signature of an historical fragmentation event. An example of these types of allopatric fragmentation patterns is given by nested-clade analyses of haplotype trees at five loci (mtDNA, Y-DNA, and three nuclear genes showing no evidence for recombination) in African elephants with Asian elephants as an outgroup, using data given in Roca et al. (2005). Figure 7.10 shows the inferences relating to fragmentation that were made from nested-clade analysis applied to these five haplotype trees. In all cases, the inferred fragmentation events split the African elephants into two mostly non-overlapping allopatric but contiguous groups: one population living primarily in the forested areas of central West Africa, and a second population living primarily in the savanna regions that border the forested region on the north, east, and south. Although no two haplotype trees have the same topology with respect to the groups of Asian elephant, African savanna elephant, and African forest elephant, all five trees lead to the inference of a past fragmentation event that subdivided the African elephants into mostly forest and savanna geographical regions. The commonly accepted population tree for these groups of elephants is to have the Asian elephant as the outgroup, followed by a split between the savanna and forest populations within Africa. 
Note that the inferred fragmentation event for the BGN locus in Figure 7.10 results in a haplotype tree that corresponds to this population tree, thereby corresponding to an example of the tree shown in Figure 5.18c. In contrast, the fragmentation event splits the haplotype tree of PHKA2 into the topology shown in Figure 5.18b. The remaining three haplotype trees in Figure 7.10 all have some haplotypes that are shared by both savanna and forest populations (like in Figure 5.18a), with the most extensive haplotype sharing being for mtDNA. Indeed, in all three of these cases, nested-clade analysis inferred a fragmentation event followed by range expansion and secondary contact resulting in some introgression, that is, admixture (Figure 7.10). These inferences are consistent with the observations of Roca et al. (2005) and Roca (2019) that in areas where the forest and savanna inter-digitate, savanna bulls mate with forest females. The hybrid offspring are fertile, but their larger size prevents them from living in the forest. In the savanna, the hybrid males tend to be excluded from mating by the larger savanna bulls, but the hybrid females backcross to savanna bulls, causing introgression of some genes, and most strongly mtDNA, from forest to savanna populations. Figure 7.10 makes it clear that nested-clade analysis does not equate haplotype trees to population trees. In that figure, five different tree topologies, only one of which was consistent with the commonly accepted population tree, all yield the same inference of allopatric fragmentation. Figure 7.10 also illustrates that the fragmentation inferences of nested-clade analysis are robust to lineage sorting and limited introgression/admixture. 
This robustness is achieved because nested-clade analysis inferences are based on local tests within the haplotype tree and do not depend on the overall haplotype tree topology, and because nested-clade analysis uses statistical criteria rather than absolute criteria such as perfect reciprocal monophyly.

Figure 7.10 The haplotype trees for five genomic regions (Y-DNA, mtDNA, and three nuclear loci: BGN, PLP, and PHKA2), as estimated by Roca et al. (2005). Each circle represents a distinct haplotype. The lines between haplotypes represent mutational changes between haplotypes, with lines with no tick marks representing a single mutational change and lines with tick marks representing multiple mutational changes with the number of changes equal to the number of tick marks. Pink represents haplotypes found in Asian elephants, blue represents haplotypes found in African elephants sampled from a forest habitat, and green represents haplotypes found in African elephants sampled from a savanna habitat. The relative frequencies with which some haplotypes are found in forest versus savanna habitats are indicated by the blue and green areas within a haplotype circle. NCPA inferred past fragmentation for two haplotype trees and past fragmentation followed by range expansion and secondary contact for three trees, with arrows indicating the location in the haplotype trees associated with the inference of fragmentation. Panel labels: Y-DNA, mtDNA, and PLP (past fragmentation followed by range expansion and secondary contact); BGN and PHKA2 (past fragmentation). Source: Based on data from Roca et al. (2005).

Expected Patterns Under Range Expansion

Another type of historical event that can create strong geographical associations is range expansion (including colonization). When range expansion occurs, those haplotypes found in the ancestral population(s) that were the source of the range expansion will become widespread geographically (large clade distances). This will sometimes include relatively young haplotypes or clades that are globally rare and often restricted just to the ancestral area. However, some of those young, rare haplotypes in the ancestral source population can be carried along with the population range expansion (haplotype “surfing”), resulting in clade distances that are large for their frequency. Moreover, some haplotypes or clades that arise in the newly colonized areas
(and being new mutations, tend to be tips) may have small clade distances, but will often be located far from the geographical center of their ancestral range, resulting in large nested-clade distances.

An example of a well-documented range expansion is the movement of humans into the New World from Eurasia. Figure 7.11 shows a portion of a nested analysis of some human mtDNA clades that yield a significant signature of range expansion (Templeton 1998a) based upon a survey of mtDNA genetic variation in Torroni et al. (1992). This figure indicates the rough geographical distributions of three 2-step clades of mtDNA haplotypes that are nested together into a single 3-step clade. The oldest clade in this group, 2-6, is found primarily in northeastern Asia, but also in North and Central America. This peculiar geographical distribution caused by the human range expansion into the Americas results in a significant reversal of the clade versus nested-clade distances, with the clade distance being significantly small whereas the nested-clade distance is significantly large. This reversal reflects the fact that almost all copies of this haplotype are found in a relatively restricted region of Asia, but the range expansion caused the geographical center of the 3-step clade within which 2-6 is nested to shift to the east, yielding a significantly large nested-clade distance. The mutation defining clade 2-7 apparently occurred during the expansion into the Americas, and, as a result, this clade was carried along (surfed) with the expansion to yield a widespread distribution within the Americas. Its clade distance of 2431 km is larger than the clade distance of 2103 km for 2-6 even though 2-7 is younger than 2-6. This illustrates a reversal of the expectation under isolation by distance. The mutation defining clade 2-8 occurred after the expansion, being limited to a portion of North America with a significantly small clade distance of 644 km.
However, its nested-clade distance is very large (3692 km), indicating that this relatively rare and spatially restricted haplotype clade arose far from the geographical origins of its closest evolutionary neighbors.

Figure 7.11 A portion of the human mtDNA tree from Torroni et al. (1992) as nested in Templeton (1998a). The rough geographical ranges of three 2-step clades nested within a single 3-step clade are shown. Clade 2-6 is the oldest clade as inferred from outgroup rooting, and is indicated by darker shading. The geographical ranges of the two tip clades, 2-7 and 2-8, are indicated by lighter shading. Panel title: A Significant Range Expansion in Human mtDNA. Labeled clades: 2-6 (root), 2-7, and 2-8. Sources: Torroni et al. (1992) and Templeton (1998a).

Table 7.2 Expected patterns under isolation by distance, allopatric fragmentation, and range expansion.

Isolation By Distance | Allopatric Fragmentation | Range Expansion
Dc tends to increase with increasing age, interior status, and nesting level. | Dc can abruptly stop increasing with increasing age, interior status, and nesting level. | Some tip or young clades can have significantly large Dc's when their ancestral clades do not.
Dn tends to increase with increasing age, interior status, and nesting level and converges with Dc with increasing nesting level. | Dn can abruptly increase with increasing age, interior status, and nesting level while Dc does not. | Some tip or young clades can have significantly large Dn's and significantly small Dc's.
There is an increasing overlap of spatial distributions of clades with increasing age, interior status, and nesting level. | Older, higher level clades can show nonoverlapping or mostly nonoverlapping spatial distributions. Such clades are often connected by long branches in the tree. | Clades with the patterns described in the rows above are found in the same general area that is geographically restricted relative to the species' total distribution.

Note in this case that it is
the younger clade that has the small clade distance and large nested-clade distance. In contrast, under allopatric fragmentation, older clades tend to have a small clade distance and large nested clade distance. These patterns associated with range expansion are distinct from those generated by either isolation by distance or fragmentation. These and the other patterns described above are summarized in Table 7.2. As can be seen from that table, restricted gene flow, fragmentation, and range expansion can all be distinguished by a detailed examination of the patterns formed by significantly small or large clade and nested clade distances. However, sometimes, a range expansion is accompanied by a fragmentation event. For example, a species colonizes a new area during a favorable climatic period (a range expansion), but subsequent climatic change isolates the colony from the ancestral range (allopatric fragmentation). In such cases, the signature is a mixture of the fragmentation and range expansion patterns (Templeton 2004a). Table 7.2 shows that biological inference is made from patterns of the primary statistics that are calculated and not just a single statistic. Hence, the clade and nested-clade distances and the old versus young contrasts are summary statistics that can be combined for biological inference. Moreover, as pointed out with the impala example (Figure 7.7), sometimes, the pattern associated with significant clade and nested-clade distances is an artifact of inadequate geographical sampling. In light of these complexities (which reflect the reality of evolutionary possibilities and sampling constraints), an inference key is provided with the GEODIS program to make the inferences discussed above and others in a systematic and consistent fashion that takes into account both predictions for coalescent theory and sampling considerations. 
This inference key has been extensively validated by a set of 150 positive controls (Templeton 2004a, 2008b) from real data sets where prior information existed concerning the existence of fragmentation and/or range expansion events in the evolutionary history of the species under study. No other phylogeographic method has been subject to such an extensive validation with actual data sets. In this analysis, any event that was inferred that was not known a priori was regarded as an error or false positive. Some of these events could be true, but regarding them all as false positives makes this analysis conservative. Despite this conservative criterion for false positives, the false-positive rate was at or below the nominal rate (0.05) set by the program. Hence, nested-clade analysis is a reliable method for inferring past fragmentation and range expansion events in the history of a species.

Multiple Patterns in Nested-Clade Analysis

Many data sets in the 150 positive controls used to validate the inference key of nested-clade analysis included multiple events with prior knowledge. The statistical analysis of these 150 positive controls revealed that nested-clade phylogeographic analysis inferred these multiple events without any statistical interference, that is, false inference rates were not affected by multiple events in the data set. Hence, nested-clade phylogeographic analysis is an ideal technique for inferring complex evolutionary histories.

Figure 7.12 provides an example of inferring multiple events from a single haplotype tree. This figure shows an mtDNA haplotype tree from the tiger salamander, Ambystoma tigrinum, roughly overlaid upon the geographical area sampled (Templeton et al. 1995). There is prior evidence that these salamanders were split into eastern and western groups (formally recognized as different subspecies) during the last glaciation, which created inhospitable conditions in the upper Great Plains. Nested-clade phylogeographic analysis of the mtDNA haplotype tree detects a significant fragmentation event. As shown in Figure 7.12, there is one branch in the tree that consists of 14 mutational steps that is much longer than any other branch in the tree. Moreover, this long branch separates the mtDNA tree into two clades that have strong geographical associations and correspond to the two named subspecies. Clade 4-1 consists of the haplotypes labeled A through F in Figure 7.12 and is found in Missouri.

Figure 7.12 A rough geographical overlay of the Ambystoma tigrinum mtDNA haplotype tree upon sampling locations found in the states of Missouri (MO), Kansas (KS), Nebraska (NE), and Colorado (CO). Haplotypes are designated by letters and are enclosed within a shape that indicates their rough geographical range. Many of these haplotypes are found as polymorphisms in the same ponds, but are shown as nonoverlapping in this figure for ease of pictorial representation. Consequently, the indicated geographical distributions are only approximate. A line without any tick marks indicates a single restriction site change on that branch in the haplotype tree. Lines with tick marks indicate multiple restriction site changes, with the number of tick marks indicating the number of restriction site differences. Labeled clades: 4-1 and 4-2. Source: Templeton et al. (1995).

In contrast, clade 4-2 (consisting of haplotypes I through Z) is found from the
front range of the Rocky Mountains, across the Great Plains and just barely overlaps clade 4-1 in the northwestern corner of Missouri. Note that, within clades 4-1 and 4-2, there is a tendency for geographical range to increase as one goes from tips to interiors, just like the elephants in eastern Africa (Figure 7.8). A nested-clade analysis reveals that much of the pattern of significant clade and nested-clade distances within 4-1 and within 4-2 is due to isolation by distance. As mentioned above, a sudden increase in nested-clade distance without a corresponding increase in clade distance is one signature of fragmentation. This signature is also present in the salamander data. The clade distance of haplotype C in Figure 7.12, the oldest haplotype within 4-1, is 191 km and its nested-clade distance is 189 km. Likewise, the clade distance of haplotype Y, the oldest haplotype within 4-2, is 208 km and its nested-clade distance is also 208 km. The clade distance of clade 4-1 as a whole is 201 km and that of clade 4-2 is 207 km, indicating no further increase in geographical range for these higher level clades. Thus, within clades 4-1 and 4-2, a pattern of isolation by distance prevails that has reached its maximum geographical extent by the time we encounter the oldest haplotypes within each of these clades. The failure of the clade distances of the 4-step clades to increase indicates the abrupt halt of recurrent gene flow. In contrast to the stability of the clade distances at the 4-step level, the nested-clade distances of 4-1 and 4-2 increase significantly to 607 and 262 km, respectively, indicating that both of these 4-step clades are marking fragmented populations located in different geographical areas. Thus, there are a variety of observable patterns that can discriminate fragmentation from isolation by distance.
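The contrast between stable clade distances and inflated nested-clade distances can be seen by tabulating the values quoted above. The short sketch below only restates those numbers; the excess helper is a hypothetical convenience, and neither the difference nor the ratio it prints is a test statistic used by nested-clade analysis.

```python
# (Dc, Dn) in km, as quoted in the text for the tiger salamander mtDNA tree.
distances = {
    "haplotype C (oldest in 4-1)": (191, 189),
    "haplotype Y (oldest in 4-2)": (208, 208),
    "clade 4-1": (201, 607),
    "clade 4-2": (207, 262),
}

def excess(dc, dn):
    """Nested-clade distance minus clade distance, in km."""
    return dn - dc

for name, (dc, dn) in distances.items():
    print(f"{name}: Dc={dc} km, Dn={dn} km, "
          f"Dn-Dc={excess(dc, dn):+d} km, Dn/Dc={dn / dc:.2f}")
```

For the oldest haplotypes within each 4-step clade the two distances are essentially equal (differences of -2 and 0 km), whereas at the 4-step level the nested-clade distance exceeds the clade distance by 406 km for clade 4-1 and 55 km for clade 4-2.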
After the end of the Ice Age, these two groups of salamanders have expanded their ranges, coming into contact in northwestern Missouri in historic times. The “stretched out” distributions of many of the haplotypes shown in Figure 7.12 also reflect the pattern associated with range expansion within each subspecies, with the western subspecies expanding to the east, and the eastern subspecies expanding from southeastern Missouri into northwestern Missouri. Because the clades yielding a significant inference of range expansion are nested within clade 4-1 and within clade 4-2 (the clades marking the fragmentation event), these range expansions must have occurred after the fragmentation event. This inference is consistent with historic records that indicate that it was only recently that these two populations came into contact in northwestern Missouri. This pattern of independent range expansion within each subspecies is overlaid upon a pattern of isolation by distance occurring within each subspecies within the confines of an older fragmentation event, as previously noted. Thus, the present-day spatial pattern of mtDNA variation found in these salamanders is due to the joint effects of fragmentation, range expansion, and gene flow restricted by isolation by distance. There is nothing about the evolutionary factors of restricted gene flow, fragmentation events, or range expansion events that make them mutually exclusive alternatives. One of the great strengths of the nested-clade phylogeographic inference procedure is that it explicitly searches for the combination of factors that best explains the current distribution of genetic variation and does not make prior assumptions that certain factors or events should be excluded. Moreover, by using the temporal polarity inherent in a nested design (or by outgroups when available), the various factors influencing current distributions of genetic variation are reconstructed as a dynamic process through time. 
Hence, nested-clade analysis does not merely identify and geographically localize the various factors influencing the spatial distribution of genetic variation; rather, it brings out the dynamical structure and temporal juxtaposition of these evolutionary factors.

Integrating Haplotype Tree Inferences Across Loci or DNA Regions

The 150 validating data sets indicated a low rate of false positives for nested-clade phylogeographic analysis, but false negatives were much more common, that is, events known to have occurred in
these species but were not detected. This is not surprising. No one locus or DNA region can capture the totality of a species’ population structure and evolutionary history. The processes of mutation and genetic drift, which shape the haplotype tree upon which the nested analysis is based, are both random processes, so sometimes the expected pattern will not arise just by chance alone. Moreover, haplotype trees only contain the branches in the coalescent process that are marked by a mutational change, so any haplotype tree-based analysis can detect only those events marked by a mutation that occurred in the right place in time and space. The occurrence of such a mutation is to some extent random, and therefore we expect that a nested-clade analysis will miss some events or processes just by chance alone. The DNA region sampled also determines important properties for inference. For example, mtDNA and Y-DNA tend to have shorter TMRCA’s than nuclear genes (Figure 5.9), and nuclear genes also show much variation in coalescent times (Figure 5.9). Accordingly, different DNA regions cover different time periods, so any one region is an incomplete temporal sample. Moreover, different regions show different rates of mutation, so their ability to mark past events varies (Templeton 1998a, 2004a). The temporal scale of coalescence and the substitution rate of mutations also constrain what a “recurrent” evolutionary process means. When gene flow restricted by isolation by distance is inferred, it does not necessarily mean that gene flow occurred every generation. Rather, such an inference only means that gene flow occurred recurrently relative to the mutation rate at that locus during its period of coalescence, and this can vary from locus to locus. Hence, the potential for a locus to detect a recurrent process can vary across loci. Time depth also influences what we see as recurrent versus historical. 
Recurrence only means the process occurred repeatedly upon the time scale of coalescence, which varies from locus to locus (Figure 5.9). One method of overcoming these problems is to study multiple loci or gene regions with little or no recombination. This both increases the number of potentially informative mutations and broadens the time range of inference. However, a simple pooling of inferences from many loci would also increase the false-positive rate. This problem can be addressed by cross validation in which an inference must be confirmed by another data set or by subsampling the original data set. In experimental science, cross validation can be achieved by replicating the experiment. However, a particular gene has only one, un-replicated evolutionary history. For phylogeographic inference, we can analyze other genes scattered over the genome that should all share a common history to some extent, that is, we replicate evolutionary history across genes or DNA regions within the genome. By looking at multiple DNA regions rather than just one, we can not only cross-validate our inferences, but we also can obtain a more complete evolutionary history because multiple loci are collectively more likely than any single gene to contain some mutations at the right place in time and space to mark past events or processes. The variation in the temporal period being sampled by a particular locus creates a problem in integrating inferences across loci. When the nested-clade analysis is applied to a single locus, the nesting hierarchy itself gives us the relative temporal sequence of the events and gene flow patterns that we are inferring. However, there is no such simple temporal ordering across loci. What we need is a common time scale for all loci. The molecular clock method of Takahata et al. 
(2001) was described and used in Chapter 5 to estimate the time of the most recent common ancestral haplotype (TMRCA) for the 25 human DNA regions (Figure 5.9) using a calibration of 6 000 000 years ago for the divergence of humans and chimpanzees (Haile-Selassie 2001; Pickford and Senut 2001). Errors in the calibration point would shift the estimated times for all loci by an equal proportional amount, so such errors do not affect cross validation across loci but only affect absolute dating.


The procedure of Takahata et al. (2001) can also be used to date any clade in a rooted haplotype tree by calculating the ratio of the average nucleotide differences within the clade, say k, to one-half the average nucleotide difference between an outgroup species and the species of interest. The method of Takahata et al. (2001) estimates the mean age of a clade but not the variance. Tajima (1983) showed that given k, the average pairwise number of nucleotide differences among present-day haplotypes across the node in the tree to be aged, then the expected time to coalescence, T, is

$$T = \frac{\theta (1 + k)}{2\mu (1 + n\theta)} \tag{7.4}$$

with a variance of

$$\sigma^2 = \frac{\theta^2 (1 + k)}{4\mu^2 (1 + n\theta)^2} \tag{7.5}$$

where n is the number of nucleotides that were sampled, μ is the mutation rate, and θ is the expected nucleotide heterozygosity (this is a different but equivalent parameterization of that given by Tajima). In this case, we estimate T using the procedure of Takahata et al. (2001). Note that Eq. (7.4) can be substituted into Eq. (7.5) to yield:

$$\sigma^2 = \frac{T^2}{1 + k} \tag{7.6}$$

The above variance reflects the fact, as shown in Chapter 5, that the variance of the coalescent process is proportional to the square of the mean (in this case the mean is T). It also reflects the fact that our ability to obtain accurate estimates from a molecular clock depends upon the mutational resolution, in this case measured by k. If very few mutations have occurred since an event happened, Eq. (7.6) tells us that we will not be able to estimate the time of the event accurately using this phylogenetic approach based upon a molecular clock. Kimura (1970) has shown that the overall distribution of time to coalescence is close to a gamma distribution (Appendix B). The mean calculated from the procedure of Takahata et al. (2001) and the variance calculated from Eq. (7.6) can be used to estimate the gamma probability distribution of the haplotype clade that marks the inference of interest. In this manner, the ages of the inferred events and processes can be regarded as random variables rather than known constants. Hence, time is regarded as being sampled with error, with the gamma distributions giving the sampling distribution. We now can begin a formal cross-validation analysis across loci. First, a nested-clade phylogeographic analysis is performed on each locus separately. Second, we retain only those inferences observed in two or more genomic regions. For example, Figure 7.10 shows that all five haplotype trees in African elephants yield a significant inference of fragmentation, so this inference is retained at this step. Third, we retain only those inferences that survive step 2 that are also geographically concordant at two or more genomic regions. All five of these significant inferences of fragmentation in Figure 7.10 separate the elephants into a forest population in central west Africa and a savanna population that surrounds it, albeit with some limited introgression of forest population haplotypes into the savanna population. 
Hence, all five fragmentation events are geographically concordant and retained at step 3. However, the question still remains: are all five genomic regions detecting the same fragmentation event, or were there multiple forest/savanna fragmentation events in the evolutionary history of African elephants? To address this question, we now ask whether the inferences that survived steps 1, 2, and 3 are also temporally concordant across loci. Temporal concordance can be tested with the gamma distributions. First, consider events such as fragmentation or range expansion events. Templeton (2004b) showed that the maximum-likelihood ratio test (Appendix B) of the null hypothesis that there is only one event (all gamma distributions share a common T) versus the alternative of multiple events (each locus or DNA region is associated with a unique T, say $t_i$ for locus i) is given by

$$G = -2 \sum_{i=1}^{j} (1 + k_i)\left[1 - \frac{t_i}{T} + \ln t_i - \ln T\right] \tag{7.7}$$

where j is the number of loci that detected the event, $t_i$ is the time estimated for the event from locus i, $k_i$ is the average nucleotide divergence at locus i used to estimate $t_i$, and

$$T = \frac{\sum_{i=1}^{j} t_i (1 + k_i)}{\sum_{i=1}^{j} (1 + k_i)} \tag{7.8}$$

is the maximum-likelihood estimator of the time of the event under the null hypothesis that all loci are detecting a single event. The test statistic G is asymptotically distributed as a chi-square with j−1 degrees of freedom. Small values of G favor the hypothesis of a single event, whereas large values favor the hypothesis of many distinct events. Figure 7.13 shows the five gamma distributions obtained using the Takahata method and Eq. (7.6) for the five genomic regions displaying geographically concordant inferences of fragmentation of African elephants. Applying the test in Eq. (7.7) to the five locus-specific gamma distributions yields G = 1.497 with 4 degrees of freedom and a p-value of 0.8272. The five genomic regions are therefore also temporally concordant in inferring a fragmentation event, and the null hypothesis of a single fragmentation event in African elephants is accepted. Moreover, the maximum-likelihood estimate of the time of this fragmentation event is 4 200 000 years ago using Eq. (7.8) and a calibration of 5 000 000 years ago for the split between Asian and African elephants (Palkopoulou et al. 2018).
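The test in Eqs. (7.7) and (7.8) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the series used for the chi-square tail probability, and the five ages and divergences are all hypothetical choices made here and are not the elephant data. Only the two formulas themselves come from the text.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with df degrees of freedom."""
    if x <= 0.0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z):
    # P(a, z) = z^a e^(-z) * sum_{n>=0} z^n / Gamma(a + n + 1)
    log_term = a * math.log(z) - z - math.lgamma(a + 1.0)
    if log_term < -700.0:  # extreme tail: avoid underflow in the series
        return 0.0 if z > a else 1.0
    term = math.exp(log_term)
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return max(0.0, 1.0 - total)


def temporal_concordance(t, k):
    """Return (T_hat, G, df, p) for the null hypothesis of a single event."""
    if len(t) < 2 or len(t) != len(k):
        raise ValueError("need matching t and k for at least two loci")
    w = [1.0 + ki for ki in k]  # gamma shape for locus i is 1 + k_i
    t_hat = sum(wi * ti for wi, ti in zip(w, t)) / sum(w)  # Eq. (7.8)
    g = -2.0 * sum(
        wi * (1.0 - ti / t_hat + math.log(ti) - math.log(t_hat))
        for wi, ti in zip(w, t)
    )  # Eq. (7.7)
    df = len(t) - 1
    return t_hat, g, df, chi2_sf(g, df)


# Hypothetical ages (millions of years) and divergences for five loci.
t_example = [3.6, 4.1, 4.4, 4.9, 4.0]
k_example = [2.0, 3.5, 1.2, 4.0, 2.8]
t_hat, g, df, p = temporal_concordance(t_example, k_example)
print(f"T = {t_hat:.3f}, G = {g:.3f}, df = {df}, p = {p:.4f}")
```

Calling `chi2_sf(1.497, 4)` reproduces the p-value of 0.8272 quoted above for the elephant fragmentation test, which is a useful check that the chi-square tail is being computed as the text intends.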

[Figure 7.13: plot of the probability density function (y-axis, 0.0 to 0.5) against time in millions of years before present (x-axis, 0 to 10), with one curve each for mtDNA, Y-DNA, BGN, PLP, and PHAK2.]

Figure 7.13 The gamma distributions for the times of the inferred fragmentation events of African elephants into forest and savanna populations from five different genomic regions, as shown in Figure 7.10.


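Pooling together the inferences from j homogeneous loci also results in a gamma distribution with mean and variance (Templeton 2004b):

$$\text{Mean} = T = \frac{\sum_{i=1}^{j} t_i (1 + k_i)}{\sum_{i=1}^{j} (1 + k_i)} \tag{7.9}$$

$$\text{Var}(T) = \frac{\sum_{i=1}^{j} (1 + k_i)^2\, \text{Var}(t_i)}{\left[\sum_{i=1}^{j} (1 + k_i)\right]^2} = \frac{\sum_{i=1}^{j} (1 + k_i)\, t_i^2}{\left[\sum_{i=1}^{j} (1 + k_i)\right]^2} \tag{7.10}$$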
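As a small numerical illustration of the pooled mean and variance, the sketch below evaluates both forms of the variance and converts the result to the shape and scale of the matching gamma distribution, using the standard gamma identities mean = shape × scale and variance = shape × scale². The function name and the input values are hypothetical, and the first form of the variance assumes Var($t_i$) = $t_i^2/(1 + k_i)$ as in Eq. (7.6).

```python
def pooled_gamma(t, k):
    """Pooled mean and variance (Eqs. 7.9 and 7.10) and the matching gamma."""
    w = [1.0 + ki for ki in k]
    sw = sum(w)
    mean = sum(wi * ti for wi, ti in zip(w, t)) / sw  # Eq. (7.9)
    # Eq. (7.10), first form, with Var(t_i) = t_i^2 / (1 + k_i) from Eq. (7.6)
    var_first = sum(wi**2 * (ti**2 / wi) for wi, ti in zip(w, t)) / sw**2
    # Eq. (7.10), second form
    var_second = sum(wi * ti**2 for wi, ti in zip(w, t)) / sw**2
    assert abs(var_first - var_second) <= 1e-12 * var_second
    shape = mean**2 / var_second  # gamma: mean = shape * scale
    scale = var_second / mean     # gamma: variance = shape * scale^2
    return mean, var_second, shape, scale


# Hypothetical ages (millions of years) and divergences for five loci.
t_example = [3.6, 4.1, 4.4, 4.9, 4.0]
k_example = [2.0, 3.5, 1.2, 4.0, 2.8]
mean, var, shape, scale = pooled_gamma(t_example, k_example)
print(f"mean = {mean:.3f}, variance = {var:.4f}, shape = {shape:.2f}, scale = {scale:.4f}")
```

With a single locus the pooled variance reduces to Eq. (7.6) and the shape to 1 + k, and when all loci give the same age the pooled shape is the sum of the individual shapes, which is why pooling concordant loci narrows the distribution.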

Equations (7.9) and (7.10) can be used to generate the gamma distribution for the pooled data, which in turn can be used to generate 95% confidence estimators for the pooled mean. For the inference of fragmentation in African elephants, the 95% confidence interval for the pooled date of 4 200 000 years ago is 3 100 000–5 600 000 years ago, using symmetric tails. However, since we used a calibration date of 5 000 000 years, we can condition on the constraint that the confidence interval cannot extend beyond 5 000 000 years. Using this conditional distribution, the 95% confidence interval is 3 200 000–5 000 000 years ago. Similarly, three of the genomic regions in the African elephant data inferred not only a fragmentation event but also secondary genetic exchange (resulting in limited introgression from the forest population into the savanna population). Applying test 7.7 to the three locus-specific gamma distributions of Y-DNA, mtDNA, and PLP (Figure 7.9) yields G = 0.043 with 2 degrees of freedom, yielding a p-value of 0.979. Hence, the null hypothesis of secondary contact following fragmentation is accepted at 3 700 000 years ago (95% confidence interval: 5 000 000–2 700 000 years ago). Equation (7.7) is appropriate for testing temporal concordance of events, but not recurrent evolutionary forces. We therefore now turn our attention from events to recurrent processes in the evolutionary history of a species. Because gene flow is a recurrent evolutionary force, there is no expectation that different inferences of restricted gene flow from the various genes should be temporally concordant, in contrast to historical events such as fragmentation events or rapid range expansions. Instead, we cross-validate the inference of gene flow in a given time period by quantifying the amount of overlap of the gamma distributions in that time range. 
The LRT of the null hypothesis of no gene flow (isolation) between geographic areas identified by the nested-clade analysis in the time period between time l and time u is (Templeton 2009a):

$$\text{LRT}(\text{isolation in } [l, u]) = -2 \sum_{i=1}^{j} \ln\left[1 - \int_{l}^{u} \frac{t_i^{k_i}\, e^{-t_i (1 + k_i)/T_i}}{\left(\dfrac{T_i}{1 + k_i}\right)^{1 + k_i} \Gamma(1 + k_i)}\, dt_i\right] \tag{7.11}$$

where Γ is the standard gamma function (Appendix B). An example of the use of Eq. (7.11) is given by the multi-locus nested-clade phylogeographic analysis of 25 genomic regions from human populations over the globe (Templeton 2004b, 2005, 2013, 2015). Figure 7.14 presents all the cross-validated results of restricted gene flow. As can be seen, there are many cross-validated inferences of various types of restricted gene flow in human evolutionary history, primarily isolation by distance. However, not all inferences of isolation by distance were cross-validated temporally. For example, Figure 7.15 gives the gamma distributions for the times of restricted but nonzero gene flow between Eurasia and sub-Saharan Africa as inferred from

[Figure 7.14: diagram with a vertical time axis and columns for Africa, S. Europe, N. Europe, S. Asia, N. Asia, Pacific, and Americas. Annotations, listed from the bottom of the figure (oldest) to the top: Out of Africa Expansion of Homo erectus, shown by CYP1A2, FUT2, and Lactase, 1.90 (0.99 to 3.10) MYA; Acheulean Out of Africa Expansion With Admixture With Eurasian Populations, shown by FUT6, G6PD, Hemoglobin β, HFE, Lactase, MS205, and MC1R, 0.65 (0.39 to 0.97) MYA; Gene Flow with Isolation by Distance and Some Long Distance Dispersal, shown by CYP1A2, ECP, G6PD, HFE, Hb β, MSN/ALAS2, RRM2P4, and Xq13.3; Out of Africa Expansion of Homo sapiens With Admixture With Eurasian Populations, shown by HFE, HS571B2, RRM2P4, mtDNA, and Y-DNA, 0.13 (0.10 to 0.17) MYA; Male-mediated Out of Asia Expansion With Admixture, shown by Hb β and Y-DNA; Range Extensions Into New Areas, shown by EDN, mtDNA, MS205, MC1R, MX1, and TNFS5F; Gene Flow with Isolation by Distance and Some Long Distance Dispersal, shown by mtDNA, Y-DNA, X-linked DNA, and Autosomal DNA.]

Figure 7.14 Recent human evolution as reconstructed from all cross-validated inferences made using nested-clade analyses on 25 DNA regions. Major range expansions of human populations are indicated by red arrows. Genetic descent is indicated by vertical lines and gene flow by diagonal lines. The cross-validating DNA regions are indicated for all significant inferences. The estimated dates and 95% confidence intervals for the three out-of-Africa range expansions are given on the left. See Templeton (2015) for details. Source: Templeton (2015).

19 clades from the 25 gene regions. The gene MX1 (shown by a dashed line in Figure 7.15) was excluded because it did not temporally cross validate with any other locus and was later found to be an artifact of a paralogous copy (see Chapter 12) that had inadvertently been included in the original sample. Seven of the remaining inferences of gene flow with isolation by distance between Africa and Eurasia were dated to the early Pleistocene in the time interval between the first out-of-Africa expansion (1 900 000 years ago) and the Acheulean expansion (650 000 years ago) (Figure 7.14). The likelihood ratio test of the null hypothesis of isolation (no gene flow) between African and Eurasian populations given by Eq. (7.11) yields a log-likelihood ratio statistic of 12.08 with 6 degrees of freedom, which is not significant at the 5% level. Hence, after the first

[Figure 7.15: plot of f(t) (y-axis, 0 to 3) against time in millions of years before present (x-axis, 0 to 3.0).]

Figure 7.15 The distributions for the ages of the youngest clade contributing to a significant inference of restricted gene flow, primarily with isolation by distance, between Eurasia and sub-Saharan Africa. The x-axis gives the age in millions of years before present, and the y-axis gives the gamma probability distribution, f(t). The genes or DNA regions yielding these distributions are, as ordered by their peak values of f(t) going from left to right, Xq13.1, MSN/ALAS2, HFE, FIX, HFE, G6PD, bHb, ECP, RRM2P4, EDN, PDHA1, CYP1A2, FUT2, FUT6, FUT6, FUT2, CYP1A2, CCR5, and MX1 (see Templeton 2005 for details on the genes). The curve for MX1 is shown in a dashed line to emphasize its outlier status. Source: Templeton (2002b).
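The statistic in Eq. (7.11) can be sketched numerically as follows. The sketch assumes that each locus contributes a gamma distribution with shape 1 + $k_i$ and mean $T_i$, so that the integral in Eq. (7.11) is the probability that the age for locus i falls in the interval from l to u. The function names, the series used for the gamma distribution function (adequate for moderate arguments; a statistics library would be preferable in practice), and the three ages and divergences are hypothetical.

```python
import math


def gamma_cdf(x, shape, scale):
    """P(X <= x) for a gamma variate, via the incomplete gamma series."""
    if x <= 0.0:
        return 0.0
    z = x / scale
    log_term = shape * math.log(z) - z - math.lgamma(shape + 1.0)
    if log_term < -700.0:  # extreme tail: avoid underflow in the series
        return 1.0 if z > shape else 0.0
    term = math.exp(log_term)
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    return min(1.0, total)


def chi2_sf(x, df):
    """Upper-tail chi-square probability (chi-square is gamma(df/2, scale 2))."""
    return 1.0 - gamma_cdf(x, df / 2.0, 2.0)


def lrt_isolation(ages, k, lower, upper):
    """Eq. (7.11): -2 * sum over loci of ln(1 - P(lower <= t_i <= upper))."""
    stat = 0.0
    for age_i, k_i in zip(ages, k):
        shape = 1.0 + k_i
        scale = age_i / shape  # so that the gamma mean equals age_i
        p_in = gamma_cdf(upper, shape, scale) - gamma_cdf(lower, shape, scale)
        p_in = min(max(p_in, 0.0), 1.0 - 1e-15)  # guard against rounding
        stat += -2.0 * math.log1p(-p_in)
    return stat


# Hypothetical ages (millions of years) and divergences for three loci.
ages_example = [0.8, 1.1, 0.9]
k_example = [2.0, 3.0, 1.5]
stat = lrt_isolation(ages_example, k_example, lower=0.65, upper=1.90)
print(f"LRT = {stat:.3f}")

# The two statistics reported in the text, with their stated degrees of freedom.
print(f"p(12.08, 6 df) = {chi2_sf(12.08, 6):.3f}")
print(f"p(23.94, 10 df) = {chi2_sf(23.94, 10):.3f}")
```

The more probability mass the locus-specific distributions place inside the interval, the larger the statistic and the stronger the evidence against isolation during that interval. In the worked examples of the text the statistic is referred to a chi-square distribution with one fewer degree of freedom than the number of inferences.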

expansion of humans (Homo erectus) into Eurasia, there was no significant gene flow between Eurasian and African populations up to the time of the Acheulean expansion, and we cannot reject the null hypothesis that these geographic populations were genetic isolates. Hence, these seven inferences of restricted gene flow are also eliminated by the temporal cross-validation criterion given in Eq. (7.11) in addition to the one inferred from MX1. In the time interval between the Acheulean expansion (650 000 years ago) and the expansion of anatomically modern humans out of Africa (130 000 years ago), there are 11 inferences of gene flow between Africa and Eurasia, with Eq. (7.11) yielding a log-likelihood ratio statistic of 23.94 with 10 degrees of freedom, which yields a p-value of 0.008. Hence, the null hypothesis of isolation between Africa and Eurasia is rejected during this time interval and restricted gene flow between Eurasia and sub-Saharan Africa in the late Pleistocene is cross validated. This test result indicates that humans by 650 000 years ago had the capability of moving both in and out of Africa and did so on a recurrent basis. An earlier nested-clade phylogeographic analysis on 10 genomic regions in humans resulted in a model very similar to that shown in Figure 7.14 (Templeton 2002b). The dominant and extremely popular model of human evolution at the time was the out-of-Africa replacement hypothesis, shown in Figure 7.16. The nested-clade model given in Templeton (2002b) was extremely controversial when published because of four major inconsistencies with the popular out-of-Africa replacement model. First, and most importantly, the nested-clade model indicated that there was not a complete replacement of archaic Eurasian populations by anatomically modern populations dispersing out of Africa. 
Instead, the nested-clade analysis (Figure 7.14) indicated that there was a small but highly significant amount of admixture, resulting in a “mostly out-of-Africa” model rather than complete replacement (Templeton 2002b). Second, the nested-clade analysis indicated significant gene flow, albeit restricted, between Eurasian and sub-Saharan populations from about 650 000 years ago to the present, not complete isolation as shown in Figure 7.16. Third, the nested-clade analysis indicated that there was a two-stage dispersal of modern humans out of Africa, with the first stage into the southern tier of Eurasia at about 130 000 years ago (about twice the age accepted by proponents of the replacement hypothesis) followed by a second stage 50 000 years ago or less into the more northern regions of Eurasia. Fourth, the nested-clade analysis found a significant population expansion out of Africa in the mid-Pleistocene that also resulted in admixture. This mid-Pleistocene expansion out of Africa was unexpected and was not present in any of the models of human evolution that were popular at the time, in particular the replacement model. Despite archeological evidence indicating a cultural expansion into Eurasia of the African Acheulean stone tool culture during this time period (Hou et al. 2000), the inference of an Acheulean expansion and admixture (Figure 7.14) was ill received and was not incorporated into the models of human evolution by many others.

[Figure 7.16: population tree titled “Out-of-Africa Replacement” with tips labeled Africans, Europeans, and Asians. Annotations: Dispersal of Homo erectus out of Africa at > 1 MYA; Dispersal of Homo sapiens Out of Africa About 60,000 to 50,000 Years Ago.]

Figure 7.16 The out-of-Africa replacement hypothesis. This hypothesis presents human evolution as a population tree, with the base defined by Homo erectus dispersing out of Africa to establish populations in Europe and Asia. Much later, about 60 000–50 000 years ago, anatomically modern Homo sapiens populations, which had evolved earlier in Africa, expanded out of Africa into Eurasia. These modern humans did not interbreed with the archaic Eurasian populations but rather drove them to complete extinction and replaced them, as indicated by the broken lineage lines associated with the earlier European and Asian populations.

Despite its controversial reception, the conclusions of the multi-locus nested-clade phylogeographic analysis of human evolution have withstood the test of time, whereas the alternatives, including the out-of-Africa replacement model, have not. Ancient DNA studies have vindicated the nested-clade inference of limited admixture between the modern human populations dispersing out of Africa and archaic Eurasian populations (reviewed in Templeton 2018a, b). Ancient DNA and extensive genomic surveys confirmed the nested-clade inference of admixture/gene flow since the mid-Pleistocene within Eurasia (reviewed in Templeton 2018a, b) and between Eurasia and sub-Saharan Africa (Chen et al. 2020b). Moreover, newer studies on current human genetic variation also indicate past movement of genes from Eurasia into Africa (Groucutt et al. 2015; Cabrera et al. 2018). Nongenetic data also support the inference of gene flow between Eurasia and sub-Saharan Africa.
Larrasoña (2012) reconstructed the paleoclimatic history of the Saharan Desert over the past 350 000 years, and found wet, “green Sahara” phases and expansion of subtropical savannas at 330,000; 285,000; 240,000; 215,000; 195,000; 170,000; 125,000; and 80,000 years ago. Hence, there were multiple, recurrent climatic phases that would have allowed humans to disperse out of and into sub-Saharan Africa. The Homo fossil record also supports the conclusion of widespread gene flow since the mid-Pleistocene. Fossils of multiple individuals over short time spans


from sites from the Middle and Late Pleistocene have revealed extreme variability within a single site coupled with remarkable similarity between sites that are roughly contemporaneous, implying “sporadic, but continuing multidirectional migrations and gene flow” (Simmons 1999, p. 107). The oldest fossils with key morphological features of modern humans are no longer from sub-Saharan Africa but instead are from Morocco in northwestern Africa at 315 000 years ago and South Africa at 259 000 years ago, indicating a pan-African origin of Homo sapiens (Hublin et al. 2017). No matter how the origin of H. sapiens is interpreted in light of these fossil discoveries, human populations had to be dispersing across the Sahara around 300 000 years ago or before, supporting the inference of trans-Sahara gene flow by Nested-Clade Phylogeographic Analysis (NCPA) in this time period (Figure 7.14). The date of 130 000 years ago for the expansion of modern humans out of Africa is now well supported by many diverse data sets. New genetic data sets also date the initial expansion of modern humans out of Africa to 125 000 years ago (Cabrera et al. 2018), between 90 000–130 000 years ago (Scally and Durbin 2012), 115 000 years ago (Scozzari et al. 2014), and 120 000 years ago (Groucutt et al. 2015). Nongenetic data also support the nested-clade date. Of the wet periods of green Sahara over the last 350 000 years, one of the most extreme (and therefore most optimal for population dispersal) was the one occurring around 125 000 years ago (Larrasoña 2012), which also overlapped with an Arabian Peninsula wet phase 130 000–125 000 years ago, which would further facilitate dispersal into Eurasia (Jennings et al. 2015). Kutzbach et al. 
(2020) gave evidence for wetter conditions in northern Africa and the Arabian peninsula at about 125 000 years ago that would increase vegetation and narrow the width of the Saharan–Arabian desert and semidesert zones, thereby facilitating dispersal into Eurasia. Hence, nested-clade analysis dates modern humans as dispersing out of Africa at one of the most climatically optimal times in the last 350 000 years, in great contrast to the replacement model. The fossil and archeological records also support the date of 130 000 years ago with a 95% confidence interval of 100 000 to 170 000 years ago (Figure 7.14). Many anatomically modern traits and archeological features appeared in pan-Africa around 300 000 years ago (Hublin et al. 2017), and then first appeared out of Africa in the Levant by about 177 000 years ago (Hershkovitz et al. 2018). Between 120 000 and 80 000 years ago, there is extensive archeological and fossil evidence for “modern” humans throughout the southern tier of Eurasia, including far eastern China (Liu et al. 2010a, 2015; Bae et al. 2017), consistent with Figure 7.14. Between 73 000 and 63 000 years ago, “modern” humans were in Sumatra (Westaway et al. 2017), Arabia between 95 000 and 86 000 years ago (Groucutt et al. 2018), and Australia by 65 000 years ago (Clarkson et al. 2017). Hence, the expansion of “modern” humans was already extensive throughout the Old World before the out-of-Africa replacement model predicted it even began (Figure 7.16). Finally, the Acheulean expansion is now more strongly supported by archeological, paleoclimatic, and paleontological data, as reviewed in Templeton (2018a, b). New genetic data also vindicate the nested-clade inference of a mid-Pleistocene expansion with admixture. Rogers et al. (2020) used bootstrapping to fit various historical models to human genomic data. 
Their best model had an out-of-Africa expansion around 700 000 years ago with admixture with the Eurasian populations descended from the initial colonization of Eurasia by African populations in the early Pleistocene, a result completely consistent with the significant nested-clade inference made in Templeton (2002b) and shown in Figure 7.14. Hence, all four of the controversial inferences made by multi-locus nested-clade phylogeographic analysis have been vindicated and strongly supported by subsequent discoveries and analyses. Multi-locus, nested-clade, phylogeographic analysis has also been vindicated by computer simulations. Knowles and Maddison (2002) performed computer simulations to test the validity of nested-clade analysis. Although the multi-locus version (Templeton 2002b) had been published

[Figure 7.17: diagram of one population of 10,000 (top oval) splitting into two isolates of 10,000 and then into four isolates of 10,000 each (bottom ovals), with each of the two intervals labeled “t = 5,000 Generations = 1/8 TMRC”.]

Figure 7.17 An outline of the micro-vicariance simulations performed by Knowles and Maddison (2002). The simulation starts with a large ideal population of 10 000 individuals (top oval) that then undergoes two rounds of rapid binary fragmentation and isolation without bottlenecks to produce four large isolates, shown by the ovals at the bottom of the figure.

9 months before, Knowles and Maddison only examined the single-locus version from the 1990s. Figure 7.17 shows the design of their simulations. They simulated a large population of 10 000 ideal individuals. This initial population underwent two rounds of splits followed by complete isolation with no bottlenecks. Moreover, the time between these splits was only 5000 generations, which is about 1/8 the expected coalescence time to the most recent common ancestral molecule. The four isolates at the bottom of Figure 7.17 were sampled for a single-locus nested-clade analysis. Because there were no bottlenecks in this simulation and the time between fragmentation events was short in terms of coalescence time, these simulations ensured much retention of ancestral polymorphism and lineage sorting, thereby making this an extremely difficult inference problem. Moreover, this simulation is one of micro-vicariance, which was explicitly excluded from single-locus nested-clade analysis (Templeton et al. 1995). It was therefore not surprising that the single-locus analysis could not accurately reconstruct this evolutionary history. Moreover, Knowles and Maddison acknowledged that their own phylogeographic approaches also did poorly with these simulated data sets. To see if these difficulties would extend to the multi-locus nested-clade phylogeographic analysis that was already available to Knowles and Maddison before they published their work, Templeton (2009b) regarded their 10 simulations as a single data set of 10 different genes and analyzed the simulated data in exactly the same manner as the previously published multi-locus, nested-clade, phylogeographic analysis of 10 genomic regions (Templeton 2002b). Now, the simulated scenario was inferred with complete accuracy and no false positives (Templeton 2009b). 
Hence, the simulations of Knowles and Maddison reveal the enhanced ability of the multi-locus version of nested-clade analysis to reconstruct a very difficult evolutionary history without creating false positives. Panchal and Beaumont (2010) performed a more complex range of simulations that included several models of gene flow of varying intensities. They then evaluated multi-locus nested-clade analysis using five simulated loci. Table 7.3 gives their results on the probabilities of various false positives.


Table 7.3 The probabilities of false positives under multi-locus, nested-clade, phylogeographic analysis of five loci under four different models of gene flow in the simulations of Panchal and Beaumont (2010).

                 Probability of a False Positive
                 Gene Flow                Events
Model            IBD       IBD + LDD     CRE       FRAG      LDC
Panmictic        0.1639    0.0042        0.0028    0.0000    0.0000
IBD              -         0.2792        0.0278    0.0028    0.0031
IBD + LDD        -         -             0.0406    0.0021    0.0021
Island           0.5306    -             0.0368    0.0007    0.0014

Note: The gene flow models are panmictic, isolation by distance (IBD), isolation by distance plus long-distance dispersal (IBD + LDD), and the Island model (which has LDD). The events include contiguous range expansion (CRE), allopatric fragmentation (FRAG), and range expansion through long-distance colonization (LDC). Source: Panchal and Beaumont (2010). © 2010, Oxford University Press.

The nominal false-positive rate was set at 0.05. As can be seen, the false-positive rates for all three events were below 0.05, often substantially, under all models of gene flow. Hence, multi-locus nested-clade analysis has a very low type I error rate for events. The error rate exceeded the nominal rate of 0.05 only for inferences about gene flow (Table 7.3). However, although Panchal and Beaumont included temporal cross-validation in their implementation of the cross-validation procedure for events, they misrepresented nested-clade analysis by stating that gene flow inferences “are not subjected to further ‘cross-validation’ because there is no stipulation that the inferences should be concordant across time” (Panchal and Beaumont 2010, p. 418). Their references for implementing nested-clade analysis included Templeton (2009a), which explicitly discussed the need for cross-validating inferences of gene flow and derived the test given by Eq. (7.11) along with worked examples, and Templeton (2004b), which gave an earlier version of Eq. (7.11) that only dealt with intervals from a time in the past to the present (to be used later in this chapter). Recall that of the 19 inferences of Eurasian-African gene flow, 8 (42%) were eliminated by the use of Eq. (7.11) as false positives. This example reveals that the recommended cross-validation test is necessary for gene flow inferences. Because Panchal and Beaumont (2010) deliberately did not perform the recommended temporal cross-validation test, their error rates on gene flow inferences are patently overinflated. After correcting for this misrepresentation, the simulations of Panchal and Beaumont (2010) show that multi-locus, nested-clade, phylogeographic analysis when implemented as recommended has extremely low false-positive rates for all types of inferences.
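The reading of Table 7.3 given above can be checked with a few lines of Python. The dictionary below is a transcription of the table made here for illustration, with empty cells simply omitted, so the placement of each value in its row and column is an assumption based on the printed layout.

```python
NOMINAL = 0.05
EVENTS = {"CRE", "FRAG", "LDC"}

# Transcription of Table 7.3; empty cells are omitted.
table_7_3 = {
    "Panmictic": {"IBD": 0.1639, "IBD + LDD": 0.0042,
                  "CRE": 0.0028, "FRAG": 0.0000, "LDC": 0.0000},
    "IBD": {"IBD + LDD": 0.2792, "CRE": 0.0278, "FRAG": 0.0028, "LDC": 0.0031},
    "IBD + LDD": {"CRE": 0.0406, "FRAG": 0.0021, "LDC": 0.0021},
    "Island": {"IBD": 0.5306, "CRE": 0.0368, "FRAG": 0.0007, "LDC": 0.0014},
}

# Every (model, inference, rate) entry that exceeds the nominal rate.
exceeding = [
    (model, inference, rate)
    for model, row in table_7_3.items()
    for inference, rate in row.items()
    if rate > NOMINAL
]
# Largest false-positive rate among the three event inferences.
max_event_rate = max(
    rate
    for row in table_7_3.values()
    for inference, rate in row.items()
    if inference in EVENTS
)
# Share of the 19 gene flow inferences removed by Eq. (7.11), as a percentage.
eliminated_pct = round(100 * 8 / 19)

for model, inference, rate in exceeding:
    print(f"{model} model, {inference} inference: {rate:.4f}")
print(f"largest event rate: {max_event_rate:.4f}; eliminated: {eliminated_pct}%")
```

Under this transcription, every entry above the nominal rate is a gene flow inference, and none is an event inference.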

Model-Based Approaches to Phylogeographic Analysis

Another approach to phylogeographic analysis is to propose a detailed model of evolutionary history or a set of alternative histories, and then evaluate how well a given proposed history fits the observed genetic data. One method of implementing this approach is to execute computer simulations of the proposed model/s. Simulations have long been used in population genetics to assess models, but this approach has benefitted from tremendous increases in computational power, more efficient simulation algorithms, and the development of more statistically sophisticated methods for judging goodness of fit and testing alternative models.

[Figure 7.18: flow diagram. Inputs: the prior distribution of model parameter θ and the observational data. (1) Compute summary statistic μ from the observational data. (2) Given a certain model, perform n simulations, each with a parameter θi drawn from the prior distribution. (3) Compute summary statistic μi for each simulation. (4) Based on a distance ρ(•, •) and a tolerance ε, decide for each simulation whether its summary statistic is sufficiently close to that of the observed data, that is, whether ρ(μi, μ) ≤ ε. (5) Approximate the posterior distribution of θ from the distribution of parameter values θi associated with accepted simulations. Output: the posterior distribution of model parameter θ.]

Figure 7.18 A flow diagram of Approximate Bayesian Computation. Source: Sunnåker et al. (2013).
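The five steps of Figure 7.18 can be sketched as a simple rejection sampler. Everything specific in this sketch is a hypothetical toy chosen for brevity and is not a phylogeographic model: the data are exponential waiting times with mean θ, the prior is uniform, the summary statistic is the sample mean, and the distance is the absolute difference.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy model: n exponential waiting times with mean theta."""
    return [rng.expovariate(1.0 / theta) for _ in range(n)]


def summary(data):
    """Toy summary statistic: the sample mean."""
    return statistics.fmean(data)


def abc_rejection(observed, prior_draw, n_sims, eps, rng):
    """Return the parameter values of the accepted simulations."""
    mu_obs = summary(observed)                       # step 1
    accepted = []
    for _ in range(n_sims):
        theta_i = prior_draw(rng)                    # step 2: draw from the prior
        mu_i = summary(simulate(theta_i, len(observed), rng))  # step 3
        if abs(mu_i - mu_obs) <= eps:                # step 4: distance vs tolerance
            accepted.append(theta_i)
    return accepted                                  # step 5: posterior sample


rng = random.Random(1)
observed = simulate(2.0, 50, rng)  # stand-in for real data, true theta = 2.0
posterior = abc_rejection(
    observed,
    prior_draw=lambda r: r.uniform(0.1, 10.0),  # hypothetical uniform prior
    n_sims=5000,
    eps=0.1,
    rng=rng,
)
print(f"accepted {len(posterior)} of 5000 simulations")
if posterior:
    print(f"approximate posterior mean of theta: {statistics.fmean(posterior):.2f}")
```

Shrinking the tolerance `eps` gives a better approximation to the posterior but accepts fewer simulations, which is the practical trade-off behind the tolerance in step 4.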

One of the more commonly used simulation-based methods in phylogeography is Approximate Bayesian Computation (ABC) (Beaumont et al. 2002; Beaumont 2010; Sunnåker et al. 2013). ABC phylogeographic analysis starts by the specification of one or more models of evolutionary history that are fully parameterized, that is, there are parameters for effective size, gene flow rates, admixture proportions, times for range expansions or fragmentation events, etc. This is a Bayesian approach (Appendix B) in which all parameters are treated as random variables with a prior probability distribution. The parameter values for any given run of the simulated model are sampled from these priors (Figure 7.18). The output of these runs is a simulated genetic data set, and a set of summary statistics are calculated for each simulation run (Figure 7.18). The summary statistics (e.g. SFS’s, Tajima’s D, linkage disequilibrium patterns, expected heterozygosities, various f/F statistics, genetic distances, shared IBD segments, etc.) can vary considerably depending upon the nature of the data (SNPs, haplotypes, microsatellites, etc.) and the nature of the model and desired


inference (population tree model, admixture model, isolation by distance model, etc.). The summary statistics from the simulation are then compared to the same summary statistics calculated from the observed data. Simulations that yield summary statistics that are sufficiently close (called the tolerance) to the observed summary statistics are retained and used to approximate the local posterior distribution of the model parameters (Figure 7.18). Once the posterior distributions are approximated, they can be used in a variety of standard Bayesian inference procedures. Moreover, formal testing of alternative models is possible, sometimes by equating the proportions of simulations that each model contributes to the posterior distributions of the alternative models under consideration. Choosing summary statistics is an important aspect of ABC (Cooke and Nakagome 2018). This is particularly true when attempting to choose among different models, as model choice under ABC can become highly inaccurate with a poor choice of summary statistics (Prangle et al. 2014). One wants to use a sufficient number of summary statistics to capture the relevant information contained within the data. However, as the number of summary statistics goes up, the more difficult it becomes to find a close match between simulated and observed summaries, which in turn makes it more difficult to obtain a good local approximation to the posterior distributions – often called “the curse of dimensionality.” ABC only attains a good approximation when the summary statistics are highly informative and limited in number. Therefore, effort spent on choosing appropriate summary statistics is essential for ABC. For example, Dellicour et al. (2014) were interested in range expansion, allopatric fragmentation, and isolation by distance in phylogeographic inference. They used simulations to first evaluate how informative 12 summary statistics would be with respect to these three evolutionary processes. 
They found that the summary statistics that were most informative varied with the desired inference (range expansion, fragmentation, and isolation by distance) and that each statistic is useful only in a portion of the possible parameter space. Jay et al. (2019) evaluated many potential summary statistics on ABC performance with 1000 simulated data sets. Their simulations included five different phylogeographic scenarios, and they discovered that different combinations of summary statistics worked best for the different scenarios. Smith and Flaxman (2020) showed that different summary statistics worked better for a RADseq (Appendix A) data set versus a short genomic sequence data set. These studies demonstrate that there is no universal set of summary statistics that works for all phylogeographic analyses and data types and that the choice of appropriate summary statistics should strongly vary among studies. Although much attention has been focused on choosing summary statistics and improving computational efficiency and the accuracy of the approximations in ABC, little attention has been given to assigning priors to all the parameters (Figure 7.18). This of course is not a problem limited to ABC but is common to all Bayesian procedures. For example, the program STRUCTURE (Chapter 6) uses a Bayesian procedure to assign individuals or portions of their genomes to various populations. This program contains a default ancestry prior that assumes that all source populations contribute equally to the pooled sample of individuals to be sorted. For many data sets, this is not a good assumption and yields poor individual assignments that can be substantially improved by using a different prior (Wang 2017). As Wang notes, having a default prior makes the program easy to use (by allowing the user to ignore the task of choosing a prior), but makes STRUCTURE easy to misuse (by allowing the user to passively choose an inappropriate prior). 
Unfortunately, despite the popularity of Bayesian procedures in population genetics, many users invest little or no effort in choosing priors, and reviewers and editors typically ignore this problem as well. For example, STRUCTURE is used extensively in the population genetic literature, but rarely do the authors even mention the prior that they used or justify the default prior that they passively accepted for their data set. See Appendix B for further discussion of this important problem.
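As a rough illustration of the rejection scheme in Figure 7.18, the following Python sketch runs ABC on a deliberately simple toy problem rather than on a phylogeographic model. Everything specific in it is an assumption made for illustration: the single parameter θ is an allele frequency, each "simulation" draws 100 gene copies, the summary statistic is the sample allele frequency, the distance is an absolute difference, the Beta(2, 2) prior is arbitrary, and the function names are hypothetical.

```python
import random

def simulate_summary(theta, n_copies, rng):
    """Simulate n_copies gene copies; return the sample allele frequency."""
    count = sum(rng.random() < theta for _ in range(n_copies))
    return count / n_copies

def abc_rejection(observed_summary, n_copies, n_sims, tolerance, rng):
    """Return the prior draws whose simulated summary is within the tolerance."""
    accepted = []
    for _ in range(n_sims):
        # Step 2: draw a parameter value from the (assumed) Beta(2, 2) prior.
        theta = rng.betavariate(2.0, 2.0)
        # Step 3: simulate data and compute the summary statistic.
        mu_i = simulate_summary(theta, n_copies, rng)
        # Step 4: keep the draw only if it is close enough to the observed summary.
        if abs(mu_i - observed_summary) <= tolerance:
            accepted.append(theta)
    return accepted

rng = random.Random(1)  # fixed seed so the sketch is reproducible
accepted = abc_rejection(observed_summary=0.30, n_copies=100,
                         n_sims=20000, tolerance=0.025, rng=rng)
# Step 5: the accepted draws approximate the posterior distribution of theta.
posterior_mean = sum(accepted) / len(accepted)
print(len(accepted), round(posterior_mean, 3))
```

In this toy case the exact posterior is a beta distribution, so the mean of the accepted values can be checked against it. Shrinking the tolerance improves the approximation but lowers the acceptance rate, which is the trade-off that becomes severe as the number of summary statistics grows.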


The fundamental rationale for Bayesian statistics is that prior information can be used for statistical inference (Appendix B). Indeed, the prominent statistician Bradley Efron only uses Bayesian analysis in the presence of genuine priors (Efron 2013), that is, priors based on previous, informative data. A second class of priors is Laplace priors, that is, noninformative priors that are based on no actual prior information but serve as a device to use Bayes’ theorem and to explore Bayesian analyses. Such noninformative priors should be viewed as exploratory and subjective, have poor properties in multi-dimensional models, and require later reassessment for performance and reproducibility (Fraser et al. 2016). The final class of priors is opinion priors, which are based on opinion and subjective views rather than actual prior data. Opinion priors generally should be avoided in a Bayesian analysis (Fraser et al. 2016).

The first step in choosing a prior is to discover what prior information exists. To illustrate how this step should be executed, we will examine the ABC phylogeographic analysis of human evolution presented by Fagundes et al. (2007), often presented as an exemplar of ABC phylogeographic analysis. Figure 7.19 shows the three basic classes of models they considered for human evolution. In all, 32 priors were needed to describe these models, and all 32 were uniform or log-uniform distributions. Uniform distributions are typically the noninformative Laplace priors invoked for a continuous variable bounded between two limits or for a bounded discrete variable with constant probabilities assigned to all the discrete outcomes possible within these bounds. For example, one of the parameters in their models is M, the proportion of admixture of an expanding modern African population with archaic Eurasians.
M can vary between 0 (no admixture with archaic Eurasians, that is, replacement of the archaic Eurasians by the expanding populations) and 1 (replacement of the expanding populations by archaic Eurasians). Fagundes et al. placed a noninformative uniform prior on M over the interval of 0 to 1. Noninformative, uniform priors often work with low dimension problems (see a worked example in Appendix B), but even in low dimensions, such noninformative priors can lead to incorrect and misleading inferences in realistic biological situations (Link 2013). Moreover, there is a widespread notion that default priors are noninformative, but this is not the case in many realistic biological situations and can result in “pathological properties” (Banner et al. 2020). In higher dimensions, many uniform priors jointly result in poor mathematical properties (Fraser et al. 2016) and greatly increase the uncertainty state space, making it likely that approximation algorithms will miss completely high probability regions of the posterior distribution (Joseph et al. 2019) and yield poor model discrimination (Shao et al. 2019). Extensive use of uniform priors in high dimension models is not good Bayesian practice. However, 31 of the 32 uniform priors invoked by Fagundes et al. were in some sense informative. In these cases, Fagundes et al. restricted the range of the uniform or log-uniform priors to a smaller interval than what is theoretically possible. For example, the time of the out-of-Africa expansion of modern humans was assigned a uniform prior of 1600–4000 generations ago, which translates into 40 000–100 000 years ago with a generation time of 25 years, or 32 000–80 000 years with a generation time of 20 years. 
The practice of using restricted range uniform priors is prone to severe statistical artifacts (see Appendix B for a worked example) unless the prior data are so strong that a zero probability of being outside the range is justifiable with absolutely no chance of error (Garthwaite et al. 2005). This absolute certainty is not the case here. Even by 2007, there was prior information available that the date of this out-of-Africa expansion event could be greater than 80 000–100 000 years ago. First, there was the multi-locus, molecular clock estimation of 130 000 years ago (Templeton 2002b). Although the fossil evidence is now extensive on this point, even before 2007, there was evidence of modern humans outside of Africa between 100 000–135 000 years ago (Grun et al. 2005; Vanhaeren et al. 2006), as well as archeological evidence (Vanhaeren et al. 2006). Similar considerations exist for many of the other priors of limited range used by


Fagundes et al. (2007). Did Fagundes et al. simply ignore all this prior information or were they certain without any possibility of error that all this prior information was completely false and therefore deserved to be assigned a probability of zero? Unfortunately, there is no way to answer this question because Fagundes et al. provided references to justify only 2 of their 31 priors of restricted range. Thus, as far as a reader can tell, 29 of their priors are opinion priors that should not be used in a proper Bayesian analysis (Fraser et al. 2016). The very first step of any Bayesian analysis should be to scour the literature for prior information and use that information in constructing priors. Unfortunately, the approach of just invoking priors (or using defaults without justifying them) is the norm in much of the biological literature, so the Fagundes et al. (2007) paper is not an outlier in this regard. Using priors of restricted range in particular is not only poor Bayesian practice (Garthwaite et al. 2005; see Appendix B and the discussion of Cromwell’s Rule in Bayesian statistics), it is easily avoided (Lemoine 2019). For example, a beta distribution can be used for variables with a fixed 0–1 range that covers the entire 0–1 range but with parameters that can be adjusted to concentrate almost all of the probability into the subrange indicated by prior data, as shown in Appendix B. Also shown in Appendix B is that Bayesian analyses based on probabilities of zero are prone to pathological artifacts whereas analyses based on assigning very small probabilities (but not zero) outside a desired range are much more robust. For any other variable with a restricted range different from 0 to 1, it is mathematically straightforward to transform the 0–1 beta distribution to cover any finite range desired. For variables with a fixed limit on only one end, such as time, a gamma distribution, such as those used in the multi-locus nested-clade analysis, can be used.
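The beta-prior alternative to a restricted-range uniform can be sketched numerically. The short Python example below is only an illustration under stated assumptions: the shape parameters (2 and 40), the subrange 0 to 0.1, and the helper names `beta_pdf`, `beta_mass`, and `rescale` are all hypothetical choices, not values taken from Fagundes et al. (2007) or from Appendix B. It shows that such a prior can place most of its probability in a chosen subrange while still assigning a small, nonzero density outside it.

```python
import math

def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

def beta_mass(lo, hi, a, b, steps=10000):
    """Probability that a Beta(a, b) variable lies in [lo, hi] (midpoint rule)."""
    width = (hi - lo) / steps
    return sum(beta_pdf(lo + (i + 0.5) * width, a, b) for i in range(steps)) * width

def rescale(x, lower, upper):
    """Map a value on the 0-1 scale onto the finite range [lower, upper]."""
    return lower + x * (upper - lower)

A, B = 2.0, 40.0                          # hypothetical shape parameters
mass_inside = beta_mass(0.0, 0.1, A, B)   # prior probability of the subrange 0-0.1
density_outside = beta_pdf(0.2, A, B)     # small but nonzero outside the subrange

print(round(mass_inside, 3), density_outside > 0.0)
```

A uniform prior restricted to 0 to 0.1 would instead give a density of exactly zero at 0.2, so no amount of data could ever move posterior probability there.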
As with the beta distribution, the gamma parameters can be adjusted to place most of the probability mass wherever the prior information indicates. Priors of restricted range should never be used in a Bayesian analysis unless prior data truly indicate with absolute certainty that the restriction is correct without any possibility of error.

Another serious problem in Fagundes et al. (2007) and in many other Bayesian analyses in biology is ignoring the logical relationships among the models being compared. One of the hallmarks of Bayesian analysis is that it always remains in the realm of probability measures (Appendix B). One fundamental property of a probability measure is that if A is a proper logical subset of B (i.e. event A is a special case of the broader event B), then the probability of A must be less than or equal to the probability of B. Note from Figure 7.19 that the out-of-Africa replacement model is a special case (M = 0) of the more general out-of-Africa with possible admixture model (0 ≤ M ≤ 1). Fagundes et al. (2007) estimated the posterior probability of the out-of-Africa replacement model as 0.781, the posterior probability of the out-of-Africa with possible admixture model as 0.001, and the posterior probability of the multiregional model as 0.218. On this basis, they strongly rejected the model in which there was any admixture between the expanding African populations and archaic Eurasian populations – favoring the replacement model that was popular at that time. However, note that 0.781 > 0.001, so the special case (M = 0) is much more “probable” than the general case (0 ≤ M ≤ 1) within which it is nested. When a probability measure violates the mathematical constraints of formal logic and measure theory, the result is said to be incoherent (Gabriel 1969) – a highly undesirable statistical outcome. Fagundes et al. (2007) arrived at an incoherent result because they treated all of their models as mutually exclusive and exhaustive, rather than as nested or overlapping. However, Bayesian model comparisons are sensitive to nesting (Lavine 1991; Shao et al. 2019), as demanded for any probability measure. Moreover, although prior evidence indicated that M > 0, it also indicated that M was small (Templeton 2002b). In ABC, this meant that simulations with M = 0 would often be within the tolerance limit (Figure 7.18), whereas simulations with values of M greater than zero sampled from a uniform on 0–1 would not (e.g. 90% of the sampled values of M would be greater than 0.1). This is an artifact of the tolerance limits used in the approximation and illustrates that

Figure 7.19 The three classes of models of human evolution simulated for an ABC analysis by Fagundes et al. (2007): Out-of-Africa Replacement (M = 0), Out-of-Africa With Possible Admixture (0 ≤ M ≤ 1), and Multiregional, each relating populations in Africa, Eurasia, and the Americas. M measures the amount of admixture of an expanding African population with archaic Eurasians. The double-headed arrows in the Multi-Regional Model represent gene flow between Africans and Eurasians. Source: Templeton (2018b).

approximations are not always accurate. A coherent test is possible by seeing if the 95% credible region of the posterior distribution of the parameter involved in the nesting relationship overlaps the value for the hypothesized special case, a well-established coherent Bayesian test for nested models (Lindley 1965). When the posterior distribution of M obtained by Fagundes et al. (2007) was used in this coherent test, the replacement model was rejected relative to the admixture model with p < 0.025 (Templeton 2010a) – a reversal of five orders of magnitude in the relative probabilities of these two models! Unfortunately, many of the programs used to implement ABC and other model-based approaches to phylogeography do not evaluate the logical relationships of the models being compared, so it is up to users to avoid incoherent inference by a rigorous examination of the logical relationships among the simulated models.

The above criticisms are not directed at Bayesian analyses in general nor ABC in particular. These are powerful and legitimate approaches to many problems in population genetics and biology. These criticisms are presented only to serve as a guide to the proper implementation of these Bayesian approaches. The main advice to users of Bayesian programs is to put much effort into constructing priors from actual prior results, not to accept defaults blindly, to take much care in defining the models to be simulated, and to be cognizant of the logical relationships of the models being simulated such that only coherent tests are used to compare alternative models.

ABC is not the only method for executing model-based phylogeographic inference. Chung and Hey (2017) developed an alternative Bayesian phylogeographic analysis that uses a Markov Chain Monte Carlo (MCMC) method instead of ABC.
A Markov chain is a stochastic process through time in which the probability of a particular state of the system at one time period depends only on the state of the system in the previous time period through a set of transition probabilities. For example, Eq. (5.4) defines a time-forward stochastic process for the average probability of identity by descent in terms of its previous state and two transition probabilities: [1/(2N)] for drift and μ for mutation. Similarly, Eq. (5.21) defines a time-backward transition probability from a state of two separate DNA lineages to one through coalescence. The Monte Carlo part of MCMC refers to generating


the transition probabilities with a random number generator on the computer to simulate the next state. Starting from some initial state, often close to the means of the priors, the next generation sample is simulated and replaces the previous state if it fits the data better using some predefined optimality criterion; otherwise, one goes back to the initial state and simulates again. This process is repeated to generate a chain of states that progressively fit the data better and better. Because the state at iteration i strongly affects the simulated state at i + 1, adjacent states are highly correlated and the initial assumed state can affect many subsequent iterations. Because of this sensitivity to the initial state, many of the early iterations are discarded as the burn-in period until a stationary phase is entered. Sometimes, more than one stationary phase may exist, so this iterative process may not reach the optimal stationary phase from some initial states. It is therefore important to try several initial states to minimize this problem. After the stationary phase has been reached, adjacent iterations of simulated states are still highly correlated, so samples of simulated states are taken every n iterations, with n chosen to be large enough to reduce these correlations among the sampled states. The simulated posterior distribution is then estimated from these sampled stationary states. Chung and Hey (2017) used MCMC to simulate coalescent trees, and then added gene flow analytically upon these sampled trees to generate the posterior distributions under multiple models.

Another type of Markov chain used in model-based phylogeographic inference is the hidden Markov model (HMM). We have already mentioned that the coalescent process can be simulated and sampled through an MCMC. However, as noted in Chapter 5, the gene tree (e.g. Figure 5.8) is generally not observable, that is, it is hidden.
So gene trees can be simulated as a hidden Markov process, and once they are obtained, simulated mutations can be overlaid upon them (often as a Poisson process with a specific mutation model, such as infinite sites) to make an observable haplotype tree (e.g. Figure 5.11) from which simulated statistical values can be calculated. These simulated coalescent HMM’s also provide a way of statistically evaluating phylogeographic models (Spence et al. 2018; Steinrücken et al. 2019). Some model-based approaches are based on maximum likelihood (Appendix B) rather than Bayesian analysis (Bertl et al. 2017). Typically, there is no analytical solution to find the maximum-likelihood estimators in the complex, high-dimensional models often found in phylogeography, but Bertl et al. developed a highly efficient stochastic approximation that eventually converges to the maximum-likelihood solution given a set of appropriate summary statistics. They show that their method can work efficiently with a large number of summary statistics both by simulations and by estimating the recent evolutionary history of orangutan populations with 56 summary statistics. Other maximum-likelihood approaches designed for short sequence blocks are given by Hearn et al. (2014) and Lohse et al. (2016). One common feature of all the model-based approaches is that they need to fully specify the set of models that will be subject to analysis. This also means that when models are tested with these procedures, they are only tested relative to this narrow set of user-specified models. Hence, the “best” model in this set may not correspond to reality at all. Yang and Zhu (2018) showed that when all of the models are nearly equally wrong, Bayesian procedures can generate high posteriors that seem to strongly support one of these models and reject the others, thereby producing overconfidence in the results. 
Two simulation papers on human evolution that were published within a week of one another illustrate this difficulty. Eswaran et al. (2005) simulated the replacement model (Figure 7.19) and a model with isolation by distance, and concluded, as indicated by their title, that “Genomics Refutes an Exclusively African Origin of Humans” because they found evidence for extensive gene flow among archaic human populations, including Africans with non-Africans, and no complete replacement. In contrast, Ray et al. (2005) claimed that their simulations “unambiguously distinguish between a unique origin and a multiregional model,” and favored a complete


African replacement versus a model of multiregional lineages with weak gene flow (the rightmost model in Figure 7.19). Thus, both sets of authors had great confidence that their simulations either supported or rejected the replacement model, but their model universes were not the same as they simulated very different alternatives to the replacement model. Indeed, there may be no contradiction at all in light of these different alternatives as both studies are consistent with the ordering: multiregional with isolation by distance > replacement > multiregional with weak and sporadic gene flow. Hence, whether or not replacement seems to be best or the rejected model depends upon the details of how gene flow was simulated and the alternative models to which replacement is being compared. Another serious limitation of the model-based approach is that you can only test that which you specify a priori. There is no room for unexpected results. For example, none of the models of human evolution shown in Figure 7.19 include the mid-Pleistocene Acheulean expansion seen in Figure 7.14 even though there is genetic, paleoclimatic, archeological, and fossil evidence supporting this expansion (Templeton 2018a, b), much of which was available well before 2007. Thus, all the simulated models of human evolution in Fagundes et al. (2007) were incomplete in their model universes with respect to the available prior information. In terms of a Bayesian analysis, this absence is equivalent to deciding that all the genetic, paleoclimatic, archeological, and fossil data indicating an Acheulean expansion are absolutely false with complete certainty. Such absolute certainty is rarely justified in science, and particularly in phylogeography. The discovery of the Acheulean expansion through nested-clade analysis was unexpected and reveals an important difference between nested-clade and model-based phylogeographic analyses – no prior model is needed for nested-clade analysis. 
Instead, the model of evolutionary history arises directly out of the data analysis as a set of statistically significant, cross-validated inferences. This is a great strength of nested-clade analysis over model-based approaches as it eliminates all the difficulties and artifacts associated with choosing a limited set of prior models. On the other hand, model-based approaches have several advantages over nested-clade analysis. First, nested-clade analysis is appropriate for one type of data: haplotype trees from genomic regions that have little or no recombination. Although it is no longer difficult to find such genomic regions and evolutionary history is most clearly written on them, there is also much information in other types of genetic data that can be used in model-based approaches but not by nested-clade analysis. This difference in the breadth of data sets that can be analyzed leads to a second advantage of model-based approaches: the haplotype trees used by nested-clade analysis generally provide only a coarse model of evolutionary history limited by mutational resolution, whereas model-based approaches can use much more data that allows a finer grain of inference by using summary statistics that depend on allele frequencies, etc. For example, the nested-clade analysis of humans (Figure 7.14) revealed much evidence for genetic interchange among archaic human populations within Eurasia and between Eurasia and Africa, but the details of these genetic interchanges were not inferred. In contrast, model-based approaches can reveal genetic interchange and admixture between specific populations at specific times. A third advantage of model-based approaches is that they provide estimators of the evolutionarily important parameters, whereas nested-clade analysis does not except for the timing of events or gross time intervals for gene flow. 
Another widely perceived difference between nested-clade analysis and model-based inference is that biological interpretations seem to be more direct in model-based approaches. In nested-clade analysis, the statistical significance of the summary statistics is connected to their biological interpretation via the inference key. Hence, the interpretive criteria are explicit and based on observable patterns. However, biological interpretation is also not direct in model-based approaches. The biological interpretation of the results of a model-based analysis is strongly determined and constrained by the choices made


for the summary statistics, the priors (for Bayesian analyses), and the models to be simulated or not simulated. The real difference between nested-clade analysis and model-based approaches is that the biological interpretation of the results is completely explicit and transparent in nested-clade analysis, but it is hidden and implicit in model-based analyses. At this point, it should be obvious that nested-clade and model-based phylogeographic analyses have complementary strengths and weaknesses. Accordingly, these two approaches are potentially synergistic when used together (Garrick et al. 2010; Templeton 2010b). Nested-clade analysis can provide the basic framework of evolutionary history without requiring prior knowledge or subjective opinions, including the discovery of unanticipated model features. Model-based approaches can then test within this model universe, fill in details, and estimate parameters. Biological inference is much stronger when these two approaches are used together than is possible when either is used alone. Strasburg et al. (2007) used this integrative approach to study the phylogeography of a vertebrate parthenogen, the Australian gecko Heteronotia binoei. First, a nested-clade phylogeographic analysis was performed, with the results shown pictorially in Figure 7.20. This analysis indicated that there were two old lineages, one centered in far-western Australia (3N2) and the second located farther inland to the east (3N1). The 3N1 lineage underwent an initial set of range expansions to the south-west, south-east, and east at 240 000 years ago, followed by dispersal restricted by distance, and then another range expansion of a more eastern population to the southeast at 60 000 years ago. The 3N2 lineage underwent range expansion to the east, northeast, and north at 70 000 years ago followed by dispersal restricted by distance. These range expansions overlaid well upon the climatic history of this region. 
Coalescent-based migration model analysis was then used to estimate the effective migration rates and 95% confidence limits throughout the ranges of both lineages, indicating the specific regions and rates that were interchanging individuals. Overall, the 3N1 lineage had more effective migrants between subregions than the 3N2 lineage, so although nested-clade analysis inferred dispersal restricted by distance in both lineages, the restrictions were more quantitatively extreme in the 3N2 lineage. Additional model-based analyses indicated that the range expansions in the 3N2 lineage were also occurring with population size expansion, with the timing of these size expansions coinciding well with the nested-clade inferred timings of the range expansions. The 3N1 lineage also had significant and rapid population size expansions that fit well to the range expansion times from nested-clade analysis. All in all, much more insight and detail was gained about the phylogeography of these geckos by using both nested-clade and model-based analyses.

Direct Studies over Space and Past Times

DNA technology has advanced to the point that we can now extract and survey DNA from the environment and nonliving sources (Brunson and Reich 2019). Environmental DNA (eDNA) is DNA that is shed by organisms through hair, skin, excrement, etc. into aquatic or terrestrial environments that can be sampled and surveyed with these modern technologies. For example, most of the genetic samples in the wild ass study discussed in Chapter 4 (Renan et al. 2015) were obtained from feces because obtaining direct samples from these large mammals would often involve shooting them with a tranquilizing dart, obtaining a blood sample, and then ensuring that the animal fully recovers. This is a highly invasive, time-consuming, and expensive procedure that also puts the animal at some risk for harm and at the very least constitutes a severe disturbance of the animal. Noninvasive sampling of eDNA from feces, hair, etc. has allowed population genetic surveys of many species, often of great conservation concern, that would otherwise not be practical.

Figure 7.20 The origin and spread of 3N1 and 3N2 parthenogens Heteronotia binoei. The map marks the approximate geographic origins of the 3N1 and 3N2 lineages. Solid lines outline the current distribution of the 3N1 lineage and dashed lines that of the 3N2 lineage. Arrows indicate significant range expansions. Also shown are timing estimates for expansions and hypothesized future expansions in 3N1 parthenogens based on climate modeling. Nested-clade analysis also inferred dispersal restricted by distance in several locations after expansion events. Phylogeographic events are overlaid on the predicted distribution in color for parthenogenetic Heteronotia binoei based on a statistical distribution model for present climatic conditions. Times given here are point estimates, but confidence intervals were also obtained. Source: Strasburg et al. (2007).

More relevant to this chapter is that modern DNA techniques have also allowed the sampling of DNA from nonliving specimens from the past, such as museum and herbarium specimens, and even from fossils (ancient DNA, or aDNA). Such sampling allows us to directly survey genetic and genomic variation from the past to gain insights into the past evolutionary history of species that would be impossible to obtain from only current sampling (Billerman and Walsh 2019). For example, one issue of great concern to conservation biologists is the loss of genetic diversity in a species because such diversity is the engine of evolutionary change and adaptation, which is more important than ever in this era of global climate change (Templeton 2017). Leigh et al. (2019) used historical specimens to estimate a 5.4–6.5% decline in within-population genetic diversity of 91 wild species since the industrial revolution. This study included many species that are not endangered, but the situation can be much worse for endangered species. One such endangered species is the Hine’s emerald dragonfly (Somatochlora hineana), one of the few federally endangered insect


species in the United States (Walker et al. 2020). This species is a specialist on fens – wetlands with calcareous seepage flow. There has been extensive destruction of fens and other wetland habitats in North America over the past several decades, and many previously known populations of this species have been completely extirpated. Figure 7.21 displays the results of a survey of mtDNA haplotype variation in current and extirpated populations of this dragonfly, with the extirpated populations represented by 12 museum specimens collected between 1929 and 1961 from Ohio. As can be seen in Figure 7.21, there is a north–south gradient in genetic diversity (to be discussed shortly), so the most comparable current populations to the extirpated Ohio populations are the ones collected in the 1990s from northern Illinois and southern Wisconsin, two regions of comparable size and similarly located on the southern border of the Great Lakes populations. After adjusting for sample size differences, there is a significant loss in the number of haplotypes (36% loss) and expected haplotype heterozygosity (31% loss). Hence, we have a rate of loss of genetic diversity greater than 30% over just several decades, far above the overall rate of loss of 6% over more than a century (Leigh et al. 2019). The study of Walker et al. (2020) also illustrates the strength of performing an integrated analysis of current specimens with historic specimens. The fens inhabited by these dragonflies are primarily found in the Great Lakes region of North America. However, there are also populations found in a small part of the Ozarks (the same highland region where the collared lizards discussed in Chapter 6 live). The fens in the Ozarks are located primarily along two nearby rivers that run in parallel over a length of a little over 100 km – an area much smaller than the Great Lakes region. 
Geographically, the Ozark populations appear to be a minor, disjunct group separated by a long distance from the bulk of the main species’ range. However, a nested-clade phylogeographic analysis on the haplotype tree that included both current and extirpated populations revealed that the Great Lakes region was established by a range expansion from the Ozarks. This conclusion is consistent with paleoclimatic data that indicate there were no fens in the current Great Lakes region until the Last Glacial–Interglacial transition (13 000–8500 years before present) (Yu 2000), whereas the Ozark fens have never been glaciated. Moreover, there is no significant reduction in genetic diversity in the entire Great Lakes region when extirpated populations are included relative to the Ozarks, although the current Ozark populations contain significantly more genetic diversity than all the current Great Lakes populations pooled together. Walker et al. hypothesized that there could have been many geographically intermediate fens available for these dragonflies as the glaciers retreated, so this expansion into the Great Lakes region could have involved large numbers of individuals. Indeed, no significant bottleneck in the past was detected in the Great Lakes population using model-based analysis. As this expansion occurred, there was a significant trend to lose genetic diversity the farther one got away from the Ozark source, a pattern also seen in human populations with increasing distance from the African source (Ramachandran et al. 2005). Moreover, the Ozarks had many haplotypes not observed in the Great Lakes regions, and these were significantly enriched for being evolutionarily older haplotypes and clades. Thus, despite its small area and distance from the main species range, the Ozark populations are the most diverse current reservoir for genetic diversity in this species and represent the core population in an historical sense. Until the study of Walker et al. 
(2020), ecological and genetic surveys focused primarily on the Great Lakes region. By placing the ecological and genetic studies into their historical context by using museum specimens, nested-clade analysis, and model-based inferences, the limitations of making conservation decisions based solely on current geographic distributions were clearly revealed. Both the Ozarks, an important reservoir of genetic diversity, and the Great Lakes region, an area suffering rapid loss of diversity through population extirpations, need more attention and resources for the protection of this endangered species.
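The two diversity measures compared above, the number of haplotypes and the expected haplotype heterozygosity, can be sketched as follows. The haplotype counts are hypothetical and are not the data of Walker et al. (2020); rarefaction is used here as one common way of adjusting for unequal sample sizes, and it is an assumption that it resembles the adjustment used in that study.

```python
from math import comb


def expected_heterozygosity(counts):
    """Unbiased expected haplotype heterozygosity: n/(n-1) * (1 - sum of p_i^2)."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sampled haplotype copies")
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_sq)


def rarefied_richness(counts, m):
    """Expected number of distinct haplotypes in a random subsample of size m."""
    n = sum(counts)
    if not 0 < m <= n:
        raise ValueError("subsample size must be between 1 and the sample size")
    # A haplotype seen c times is missed when all m draws come from the other n - c copies.
    return sum(1.0 - comb(n - c, m) / comb(n, m) for c in counts)


# Hypothetical haplotype counts (illustration only, not from the study).
historic = [5, 3, 2, 1, 1]   # 12 museum specimens, 5 haplotypes
current = [20, 6, 4]         # 30 current specimens, 3 haplotypes

m = min(sum(historic), sum(current))  # compare both samples at the smaller size
for name, counts in (("historic", historic), ("current", current)):
    print(name, round(rarefied_richness(counts, m), 3),
          round(expected_heterozygosity(counts), 3))
```

Comparing both samples at the smaller sample size keeps the larger current sample from appearing more diverse simply because more individuals were surveyed.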

Figure 7.21 A map showing the distribution and frequencies of the mtDNA haplotypes in the Hine’s emerald dragonfly (Somatochlora hineana) in the Ozarks (the south-western pie chart) and four subregions of the Great Lakes region (moving clockwise from the most northern chart: northern Michigan and Ontario, Ohio, northern Illinois and southern Wisconsin, and central and northern Wisconsin). Thick lines indicate lake and ocean boundaries; thin lines indicate state boundaries. The areas of pie charts are roughly proportional to their actual geographical range in the Great Lakes region, but the Ozark pie chart is associated with an area that is smaller than any of the Great Lakes subregions. The Ozark pie chart has been increased in size beyond its proportional area to display its greater haplotype diversity. Source: Walker et al. (2020).

Going back further in time with aDNA, Allentoft et al. (2014) studied moas, an extinct assemblage of nine species of large, wingless ratite birds native to New Zealand. These species went extinct in the late thirteenth century, shortly after the arrival of Polynesians. Allentoft et al. obtained mtDNA and microsatellite data from 217 radiocarbon-dated individuals of four species, dating from 12 966 to 602 years before present. They produced skyline plots and measures of genetic diversity, as shown in Figure 7.22, that indicated no drop in inbreeding effective size or in genetic diversity prior to extinction – quite different from what was found for the endangered Hine’s emerald dragonfly. Allentoft et al. also performed an ABC phylogeographic analysis of several demographic models using both mtDNA and microsatellites on Dinornis robustus, the species with the largest sample size, which indicated a large variance effective size and no population decline. These results indicate that these moa species were doing well before humans colonized the islands and began hunting these flightless birds. Going back even further in time to the last Ice Age, Barlow et al. (2018) and Cahill et al. (2018) analyzed genetic material from current and past populations of polar bears, brown bears, and the extinct cave bear, with material going back as far as 35 000–72 000 years before present. Both groups of researchers

[Figure 7.22 plot: (a) Bayesian skyline plot, y-axis Log Ne*τ from 1.E3 to 1.E6, x-axis Time (kyr) from 40 to 0; (b) y-axis Expected Heterozygosity HE from 0.0 to 0.8, x-axis Time (yr BP) from 3000 to 0, with Polynesian Colonization marked. Species labels in the plot: D. robustus, P. elephantopus, E. curtus, E. crassus.]

Figure 7.22 Demographic history and genetic diversity of extinct species of moas. Panel (a) gives the Bayesian skyline plot for Dinornis robustus (n = 87), where the y-axis depicts the inbreeding effective female population size multiplied by generation time. Year zero corresponds to the age of the youngest sample at 602 years before present. Panel (b) shows the expected heterozygosity (HE) for six microsatellite loci, measured across time in the four moa species (n = 188). Data points represent the mean age and mean HE (with SE) of the moa individuals in 1000-y time bins. Source: Allentoft et al. (2014). © 2014 National Academy of Sciences.

used ABBA/BABA-type tests to show that there was significant genetic interchange among these groups, with living brown bears having about 0.9–1.8% of their DNA from cave bears, with a 41 000-year-old specimen of a brown bear having 2.4% of its DNA from cave bears. Indeed, admixture or gene flow among all three of these taxa has been common in their recent evolutionary history. The data indicate that past climate change sometimes brought these groups together, resulting in peaks of genetic exchange in the past. These past patterns indicate that current climate change may well bring further gene flow or admixture between polar bears and brown bears (Cahill et al. 2018). Much of the aDNA work has focused on human evolutionary history, and this work is reviewed in Templeton (2018a, b). Through the use of ABBA/BABA and other tests, this work has clearly shown much gene flow and admixture of humans at and since the mid-Pleistocene, including admixture with archaic Eurasians after the most recent out-of-Africa expansion event and more ancient admixture after the Acheulean out-of-Africa expansion (Rogers et al. 2020) – exactly as shown by multilocus, nested-clade analysis (Figure 7.14). It should be pointed out that the ABBA/BABA and other 4-taxa tests are tests of the null hypothesis of a population tree, and generally cannot distinguish between admixture and recurrent gene flow (including isolation by distance) as causes for rejecting the null hypothesis of treeness (Durand et al. 2011; Eriksson and Manica 2014; Peter 2016). There are ways of distinguishing these different types of genetic exchange, but often, the sample size from fossils, particularly from very old ones, is insufficient for this purpose. We discussed this earlier in this chapter with respect to the confoundment of effective size and population structure that is intractable from genomic data from a single individual. 
Thus, although the human literature almost always interprets ABBA/BABA test results as admixture between populations that have “split” in the past (that is, a population tree), there really is no statistical evidence supporting a tree-like structure and its attendant “splits.” As pointed out earlier in this chapter, current human populations do not even remotely fit a population tree (Figure 7.4). These splits and their times of occurrence are artifacts of forcing data into tree-building programs that will produce a tree regardless of whether or not the data have an underlying tree-like structure.
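As a minimal sketch of the logic behind ABBA/BABA-type tests, the statistic usually reported is D = (nABBA − nBABA)/(nABBA + nBABA), computed over biallelic sites for four taxa arranged as (((P1, P2), P3), O). The function name and the toy sequences below are hypothetical, and significance testing, which is usually done by resampling blocks of the genome, is omitted.

```python
def abba_baba_d(p1, p2, p3, outgroup):
    """Count ABBA and BABA site patterns and return (n_abba, n_baba, D).

    The sequences are assumed to be aligned, one per taxon, with the
    population tree (((P1, P2), P3), O) serving as the null hypothesis.
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        site = {a, b, c, o}
        if len(site) != 2 or not site <= set("ACGT"):
            continue  # keep only biallelic sites with unambiguous bases
        if a == o and b == c:
            abba += 1  # P2 and P3 share the non-outgroup state
        elif b == o and a == c:
            baba += 1  # P1 and P3 share the non-outgroup state
    total = abba + baba
    d = (abba - baba) / total if total else float("nan")
    return abba, baba, d


# Toy alignment (hypothetical): three ABBA sites and one BABA site.
print(abba_baba_d("AATAAT", "TTAATT", "TTTATA", "AAAAAA"))
```

Under a strict population tree, ABBA and BABA sites are expected to be equally frequent, so D is expected to be zero. A significant departure rejects treeness but, as noted above, does not by itself distinguish admixture from recurrent gene flow.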

Historical Population Genetics and Macroevolution

Much of statistical phylogeography is based on coalescent theory. We have already seen that the coalescent process can and often does transcend the evolutionary process of speciation (Figures 5.17, 5.18, and PHKA2 in Figure 7.9), that is, the creation of a new species. Hence, statistical phylogeography is a tool not only for studying intraspecific evolutionary history but interspecific history as well. Such studies can illuminate much about the meaning and nature of species and speciation, the fundamental building blocks of macroevolutionary theory. As shown by the examples in this chapter, phylogeographic analyses can detect historical events such as fragmentation, colonization of new areas, founder events, bottleneck events, hybridization and admixture events. Many of these events are extremely rare. For example, only three detected out-of-Africa range expansion events occurred in the human lineage over the past 2 000 000 years (Figure 7.14), but these rare events still had a major impact on what it means to be human today. Speciation itself is a rare event (Slatkin 1996), but rare events can be amplified to global importance at the macroevolutionary level (Templeton 1986). What is critical to understand is that the types of rare events amenable to phylogeographic inference are also the types of rare historical events that are important in the processes of speciation (Templeton 1981). The ability to reconstruct these events through phylogeographic analysis therefore adds a powerful analytical tool to studies on

the process of speciation, and indeed upon the very meaning of species (Templeton 1998b, 1999b, 2001; Templeton et al. 2000b). In 1859, Darwin (1859) published “On the Origin of Species By Means of Natural Selection.” Despite the prominent use of the word “species” in his title, Darwin’s book was more about natural selection than about species or speciation, the evolutionary process that leads to a new species. Indeed, Darwin did not even have a clear definition of species. Instead, Darwin wrote “I look at the term species as one arbitrarily given, for the sake of convenience, to a set of individuals closely resembling each other ….” If species are truly “arbitrarily given,” they are not necessarily real biological entities. However, during the mid-twentieth century synthesis of Darwin’s ideas with Mendelian genetics and systematics, two of the major architects of this synthesis, Theodosius Dobzhansky and Ernst Mayr, argued that species were indeed real biological entities and offered a definition of species known as the biological species concept. Under this concept, species are groups of actually or potentially interbreeding natural populations that are intrinsically reproductively isolated from other such groups. By the mid-twentieth century, the biological species concept was the dominant concept among zoologists, but it was not so popular with botanists or microbiologists (Templeton 1989). As the twentieth century progressed, higher resolution genetic survey techniques were developed that revealed some introgression and hybridization even among many animal species, and aDNA studies have revealed this to be an even more common phenomenon than previously thought, as illustrated by the polar/brown bear example discussed above. As a result, even many researchers who primarily worked on animals began to question the generality of the biological species concept, and a plethora of new species concepts were proposed in the last quarter of the twentieth century. 
Many of these newer species concepts emphasized the biological reality of species but used a broad array of evolutionary roles to define species (Hull 1999). Some regarded species as a reproductive community, as is done by the biological species concept, but the boundaries of that reproductive community could be defined positively by mate recognition in the “recognition species concept” rather than only negatively by reproductive isolation in the biological species concept. Others emphasized the biological role of species as an ecological community that shared a particular niche or role in the ecosystem. Finally, many of the newer species concepts defined species as an evolutionary lineage of some sort. The cohesion species concept unites all three of these evolutionary roles to define species (Templeton 1989). A cohesion species is an evolutionary lineage that maintains its cohesiveness as a lineage over time because it is a reproductive community capable of exchanging gametes and/or it is an ecological community sharing a derived homologous adaptation or set of adaptations that are needed for successful reproduction in a particular environment (demographic or ecological exchangeability). The primary attribute of a cohesion species is that it is an evolutionary lineage, so the cohesion species is a special case of a species being an evolutionary lineage. However, not all lineages are species under the cohesion concept. A lineage exists only because of acts of reproduction, so a reproductive community defined by innate reproductive isolating barriers in a sexual organism will define a cohesion species. Consequently, the biological species concept is a special case of the cohesion species concept. Whenever the biological species concept works, so will the cohesion species concept.
However, the cohesion species concept is broader than the biological concept because it also recognizes that a lineage exists only because of acts of successful reproduction that occur in the context of an environment to which the species is adapted, and therefore the cohesion species concept adds on positive criteria for a reproductive community in addition to the traditional isolating barriers. The cohesion concept also adds on ecological criteria as a means of identifying species boundaries as lineages that are ecological communities and can even be applied to asexual taxa (Templeton 1989).

One great advantage of the cohesion species concept is that species can be inferred by rejecting two null hypotheses (Templeton 2001): (1) the organisms sampled are derived from a single evolutionary lineage; and (2) reproductive communities AND/OR ecological communities do not correspond to the lineages identified by rejecting null hypothesis 1. The rejection of both of these null hypotheses leads to the inference of a cohesion species. Advances in both genetics and ecology are making it ever more practical to test these two null hypotheses. For example, we have already seen how multi-locus nested-clade analysis can detect fragmentation events with high power (Templeton 2009a) and with low false-positive rates (Table 7.3). Fragmentation is the splitting of a single lineage that defines two or more new evolutionary lineages – and therefore potential species under the cohesion concept. As an example, multi-locus nested-clade analysis identified a significant fragmentation event in all five gene regions in African elephants (Figure 7.10), and these five inferences were cross-validated as a single fragmentation event occurring 4 200 000 years ago. Hence, savanna and forest elephants are potentially different cohesion species because we reject the first null hypothesis. Debruyne (2005) examined the data of Roca et al. (2005) and some additional data from captive African elephants and concluded that there was no evidence for more than one species of African elephant. Debruyne (2005) used an evolutionary lineage species concept that demanded reciprocal monophyletic clades, and indeed the only haplotype tree out of the five that corresponds to the presumed species tree is that for BGN (Figure 7.10).
However, Debruyne’s requirement of reciprocal monophyletic clades is inconsistent with coalescent theory, which shows that following a fragmentation event, lineage sorting (as seen for PHKA2 in Figure 7.10) or limited secondary contact (as seen for Y-DNA, mtDNA, and PLP in Figure 7.10) can cause a failure of reciprocal monophyly. Interestingly, Debruyne (2005) did claim that the forest and savanna forms do not correspond to cohesion species, but this claim was made in a single sentence without any of the recommended hypothesis testing necessary to implement the cohesion concept (Templeton 2001). The only way to check if Debruyne’s claim is valid is to implement the cohesion species concept by testing the two null hypotheses. We have already seen that the first null hypothesis is rejected and cross-validated, so there are two evolutionary lineages in Africa that split 4 200 000 years ago even though there has been some lineage sorting and limited introgression. Inferring lineages with nested-clade analysis does not require reciprocal monophyly of all or even any haplotype trees. Given that the first null hypothesis has been falsified, it is now necessary to test the null hypothesis that the lineages defined above do not correspond to different reproductive communities and/or different ecological communities. Besides fragmentation, three other significant, cross-validated inferences were made with the 5-locus nested-clade analysis of the data in Roca et al. (2005). One was a range expansion of savanna elephants from the southern and eastern portions of their range to the north. Of more relevance to testing species status, there were also significant, cross-validated inferences of gene flow restricted by distance within savanna elephants and within forest elephants. In both cases, at least one of the gamma distributions describing the timing of this restricted gene flow had much of its probability mass concentrated near 0, the present.
Hence, we can use the method of Templeton (2004b) to identify the time, T, in the past over which we can be 95% confident that recurrent gene flow restricted by isolation by distance (i.b.d.) was occurring in the interval [0,T] by finding the T that makes the following equation equal to 0.95:

$$\Pr(\text{i.b.d. in } [0,T]) = 1 - \prod_{i=1}^{j} \int_{0}^{T} \frac{t_i^{k_i}\, e^{-t_i (1+k_i)/T_i}}{\left(\dfrac{T_i}{1+k_i}\right)^{1+k_i} \Gamma(1+k_i)}\, dt_i \tag{7.12}$$

where j is the number of clades across all loci supporting the cross-validated inference of gene flow restricted by isolation by distance. For the savanna elephants, the significant signal for recurrent gene flow with isolation by distance goes back to 859 000 years ago with 95% confidence, and for the forest elephants it goes back to 4 200 000 years ago – the entire time period since fragmentation. As described earlier in this chapter, we expect the signal for isolation by distance to become weaker as we go farther back in time and to even disappear if the older clades have reached their lineage boundaries. The time needed to lose such a signal also depends upon the strength of the isolation by distance. The results here indicate that isolation by distance is weaker in savanna elephants than in forest elephants despite savanna elephants having a larger range. This probably reflects the fact that the forest is a much greater barrier to dispersal than an open savanna to a large-bodied mammal. In any case, the significant, recurrent gene flow within both elephant lineages shows that they are both reproductive communities in the positive sense. The significant fragmentation event that separates these two lineages reveals that the combined forest and savanna populations are not a single positive reproductive community despite a limited amount of introgression. Debruyne (2005) pointed out that this introgression shows that hybridization occurs and that the hybrid offspring are fertile, implying that forest and savanna elephants are not separated by intrinsic isolation and are not species under the biological species concept. Isolating barriers would reinforce the boundaries of the reproductive communities in a negative sense, but they are not necessary under the cohesion concept to define a species. 
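To make the use of equation 7.12 concrete, the following minimal sketch solves for T numerically. It assumes that each integrand is a gamma density with shape 1 + k_i and scale T_i/(1 + k_i), so that T_i is the mean of the distribution, and it reads the equation as one minus the product of the gamma distribution functions evaluated at T. The function names and the (T_i, k_i) pairs are hypothetical and are not the elephant estimates.

```python
from math import exp, lgamma, log


def gamma_cdf(t, mean, k):
    """CDF at t of a gamma distribution with shape 1 + k and scale mean / (1 + k)."""
    if t <= 0.0:
        return 0.0
    a = 1.0 + k
    x = t * a / mean
    # Power series for the regularized lower incomplete gamma function P(a, x).
    term = 1.0 / a
    total = term
    for n in range(1, 100000):
        term *= x / (a + n)
        total += term
        if term < 1e-15 * total:
            break
    return min(1.0, total * exp(a * log(x) - x - lgamma(a)))


def prob_ibd_back_to(T, clades):
    """Equation 7.12 as read here: 1 minus the product over clades of F_i(T)."""
    product = 1.0
    for mean, k in clades:
        product *= gamma_cdf(T, mean, k)
    return 1.0 - product


def solve_T(clades, target=0.95):
    """Bisection for the T at which prob_ibd_back_to equals the target."""
    lo, hi = 0.0, max(mean for mean, _ in clades)
    while prob_ibd_back_to(hi, clades) > target:
        hi *= 2.0  # the probability decreases with T, so widen the bracket
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if prob_ibd_back_to(mid, clades) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical (T_i, k_i) pairs in millions of years, for illustration only.
clades = [(0.2, 1.0), (1.5, 3.0)]
T95 = solve_T(clades)
print(T95, prob_ibd_back_to(T95, clades))
```

Because the probability equals 1 at T = 0 and declines as T increases, bisection on T is sufficient. Adding clades with older estimated times pushes the 95% boundary further into the past.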
Forest and savanna elephants are each positive reproductive communities, but their union is not; that is, multi-locus, nested-clade phylogeographic analysis identifies two statistically significant reproductive communities through recurrent gene flow. Moreover, these two reproductive communities correspond to the two evolutionary lineages identified by rejecting the first null hypothesis, thereby falsifying the second null hypothesis as well. Hence, forest and savanna elephants are separate cohesion species. This conclusion is reinforced by the ecological aspect of the second null hypothesis. Habitat data were also gathered at all sample sites by Roca et al. (2005), and all but one (Garamba) could be classified as savanna or forest. Excluding the Garamba sample, a nested-clade analysis of the categorical variables of forest versus savanna habitat types was performed using the same haplotype trees with the version of nested-clade analysis that is applicable to categorical variables (Templeton 1995). In every case, at least one strong significant association with habitat was detected, and Figure 7.23 uses arrows to point to the branches that had the strongest, and usually only, change in habitat associations. Note that these are the same positions used to identify the fragmentation events (Figure 7.10). The second null hypothesis posits that there is no association between these ecological transitions and the previously inferred fragmentation event that defines the two lineages. The probability of a match in haplotype tree i with ni informative branches (i.e. the branches defining the clades have sufficient sample size to potentially reject the null hypothesis of no ecological association) under the null hypothesis of no association between fragmentation and ecological habitat is 1/ni.
The probability that all five haplotype trees match the ecological transition with the fragmentation event is the product of these probabilities over all trees. In this case, the probability of a fivefold match under the null hypothesis of no association is 9.45 × 10−6, so the ecological null hypothesis is strongly falsified. Hence, the evolutionary lineages defined by the phylogeographic analysis correspond to two different ecological communities as defined by the habitat analysis. The null hypotheses that define cohesion species (only one lineage; reproductive and ecological communities are not associated with lineages) are thereby falsified, so the forest and savanna elephants are indeed two different cohesion species.
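The match probability described above is a simple product, sketched below. The numbers of informative branches are hypothetical placeholders, not the counts from the elephant haplotype trees, so the result does not reproduce the value reported in the text.

```python
from math import prod


def prob_all_match(informative_branches):
    """Probability that every tree matches by chance: the product of 1/n_i."""
    if any(n < 1 for n in informative_branches):
        raise ValueError("each tree needs at least one informative branch")
    return prod(1.0 / n for n in informative_branches)


# Hypothetical numbers of informative branches in five haplotype trees.
n_i = [6, 8, 10, 12, 15]
print(prob_all_match(n_i))
```

The product form treats the five haplotype trees as independent under the null hypothesis, which is why concordance across all five loci yields such a small probability even when each tree has only a modest number of informative branches.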

[Figure 7.23 image: haplotype trees labeled Y-DNA, mtDNA, PLP, BGN, and PHKA2, with numerical annotations 5 × 10^−15, 1.1 × 10^−23, 3.4 × 10^−93, 3.9 × 10^−136, and 8.6 × 10^−28.]

Figure 7.23 The haplotype trees of elephants for five genomic regions (Y-DNA, mtDNA, and three nuclear loci: BGN, PLP, and PHKA2), as estimated by Roca et al. (2005), using the same representations shown in Figure 7.9. Arrows point to the location in each haplotype tree that had the strongest association with the ecological habitat as determined by nested-clade analysis for categorical variables. Source: Based on data from Roca et al. (2005).

Bears in the genus Ursus provide another example. As discussed earlier in this chapter, the recognized species of polar bear (U. maritimus) and brown bear (U. arctos) have an evolutionary history of limited admixture and introgression. Luna-Aranguré et al. (2020) built a mtDNA and geographical database for all Ursus species as well as some outgroups, including aDNA from cave bears. Given geographical location data, the realized ecological niche for a species can be modeled with respect to underlying environmental variables using a maximum entropy analysis, as was done in Chapter 6 for the study on Salamandra infraimmaculata (Sinai et al. 2019). Luna-Aranguré et al. performed ecological niche modeling on four Ursus species (the polar and brown bears, as well as the American black bear, U. americanus, and the Asian black bear, U. thibetanus). The ecological

[Figure 7.24 image: time-calibrated haplotype tree with a time axis running from 7 to 0 million years. Legend: U. americanus, U. thibetanus, U. arctos, U. maritimus, U. spelaeus & deningeri, H. malayanus, M. ursinus. Node labels (posterior; age in million years [range]): A (1; 7.03), B (1; 4.55 [4.31–4.79]), C (1; 4.17 [3.97–4.41]), D (1; 3.85 [3.64–4.06]), E (1; 1.94 [1.21–2.71]), F (1; 0.95 [0.52–1.61]), G (1; 2.99 [2.81–3.16]), H (1; 0.91 [0.86–0.96]).]

Figure 7.24 Time-calibrated Bayesian haplotype tree in million years, based on 114 unique haplotypes obtained from 689 d-loop sequences of the four Ursus species studied ecologically. In addition, one sequence was included from each of the extinct species Ursus spelaeus and Ursus deningeri and from each of the sun bear (Helarctos malayanus) and the sloth bear (Melursus ursinus) as outgroups. The color bar on the right depicts the species (names in the insert). Source: Luna-Aranguré et al. (2020). © 2020 John Wiley & Sons.

modeling revealed distinct ecological niches for all four bear species, and these niche models were consistent with past ranges inferred from the fossil record under changing climatic conditions. These distinct ecological niches displayed a strong association with mtDNA lineages for these bear species (Figure 7.24). Note that the polar bear niche is associated with a monophyletic mtDNA lineage, but this lineage is embedded within the broader brown bear lineage, reflecting the major role of introgression between brown and polar bears that was discussed earlier. However, there is no doubt that a strong lineage/ecological niche association exists, as it does for the elephants. Luna-Aranguré et al. (2020) also investigated intralineage variation in the ecological models by developing a phyloclimatespace analysis to investigate the relationship of the mtDNA haplotype trees within named species with the ecological space (Figure 7.25). The patterns seen in Figure 7.25 were quantified by generating a pairwise molecular genetic distance matrix from the estimated mtDNA haplotype tree and a pairwise environmental distance matrix between the points in the ecological model. Matrix correlations were calculated from these paired distance

[Figure 7.25 plot: four panels (a)–(d), each with Temperature on the x-axis and Precipitation on the y-axis, with a scale for Distance To The Root and markers for Occurrence Records and Fossil Occurrences. Labeled regions include Vietnam, Japan, Russia, North Korea, Alaska, Oregon, Colorado, Montana, Texas, Greenland, Greenland (South), Kamchatka, Canada, and Iran.]

Figure 7.25 Phyloclimatespace results for (a) Ursus thibetanus, (b) Ursus americanus, (c) Ursus arctos, and (d) Ursus maritimus. The red lines indicate branches of the mtDNA haplotype tree within each species such that the end point of a branch (a haplotype) is on its respective climatic combination of precipitation and temperature where that haplotype is centered, and, also, the same is true for the internal nodes of the tree and its root through ancestral reconstruction. The current occurrence records for each species and the environmental values associated with the fossil occurrences are also shown in the model. Some geographical regions are labeled according to the occurrence records and the localities for the mtDNA sequences. Source: Luna-Aranguré et al. (2020). © 2020 John Wiley & Sons.

matrices, and significance testing of the correlation between these matrices was executed through random permutation (Appendix B). The Asian black bear displayed the largest amount of intraspecific correlation between the haplotype tree and ecological space (0.2929, p = 0.001), indicating much possible local adaptation, followed by the brown bear (0.1228, p = 0.001), and the American black bear (0.0146, p = 0.010). The polar bear showed no significant intraspecific signal of potential local adaptation (0.009, p = 0.268), indicating a homogeneous ecological niche for all polar bears.
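The matrix correlation with permutation testing described above can be sketched as a Mantel-style test. This is a minimal illustration on toy data; whether Luna-Aranguré et al. (2020) used exactly this statistic, a one-sided test, or this number of permutations is an assumption, and the function names are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("a distance matrix has no variation")
    return sxy / sqrt(sxx * syy)


def mantel(d1, d2, permutations=999, seed=0):
    """One-sided Mantel-style test of association between two distance matrices.

    The rows and columns of d2 are permuted together, which preserves the
    dependence among the distances that involve the same object.
    """
    n = len(d1)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    x = [d1[i][j] for i, j in pairs]
    r_obs = pearson(x, [d2[i][j] for i, j in pairs])
    rng = random.Random(seed)
    idx = list(range(n))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(idx)
        permuted = [d2[idx[i]][idx[j]] for i, j in pairs]
        if pearson(x, permuted) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Toy data (hypothetical): distances among six points on a line, and a
# second matrix that is a rescaled copy of the first.
points = [0, 1, 3, 7, 12, 20]
genetic = [[abs(a - b) for b in points] for a in points]
climate = [[2.0 * abs(a - b) for b in points] for a in points]
print(mantel(genetic, climate))
```

Permuting the objects rather than the individual distances matters because the entries of a distance matrix are not independent observations. With 999 permutations, the smallest attainable p-value is 0.001.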

As these examples and others (Templeton 1998b, 1999b, 2001; Templeton et al. 2000b) show, modern genetics allows more precise definitions of species. More importantly, species status can be tested by falsifying null hypotheses. Falsification of null hypotheses protects one against nonquantitative, subjective claims. With explicit testing, the nature of the data supporting species status and the statistical strength of that conclusion are all explicit and objective. Moreover, this hypothesis testing framework provides insight into the process of speciation. For example, many biologists feel that absolute geographic isolation due to a major geological or environmental change that splits a species into two or more completely isolated subpopulations is a necessary prerequisite for speciation, particularly among sexually reproducing animals (Mayr 1970; Coyne and Orr 2004). In both the bear and elephant examples, speciation has occurred without a complete geographic rupture leading to total genetic isolation, and indeed in both examples there is limited recurrent genetic introgression. In both cases, adaptive evolution to extremely different environments appears to be a driver of speciation even though complete geographic isolation was not necessary for speciation to occur. Obviously, more groups need to be studied before we can evaluate the relative roles of ecological divergence and geographic isolation in speciation, but population genetics has provided the tools to execute such studies. Hence, population genetics has an important role to play in both micro- and macroevolutionary theory and practice.

Part 2 Genotype and Phenotype

8 Basic Quantitative Genetic Definitions and Theory

In Chapter 1, we introduced the three premises upon which population genetics is founded. In Chapters 2 through 7, we explored the roles of premise one, DNA replication, and premise two, DNA mutation and recombination, on the fate of genes through space and time. Many powerful evolutionary mechanisms were uncovered during this exploration of premises one and two, but our discussion of evolutionary mechanisms remains incomplete until we weave the third premise into this microevolutionary tapestry. The third premise is that the information encoded in DNA interacts with the environment to produce phenotypes (measurable traits of an individual). Premise one, DNA replicates, tells us that genes have an existence in time and space that transcends the individual. This transcendent behavior of genes does not imply that individuals are not important. The evolutionary fate of genes does depend on the individuals that carry the genes. DNA cannot replicate except through the vehicle of an individual living and interacting with its environment. Therefore, how an individual interacts with the environment plays a direct role in the ability of DNA to replicate. As pointed out in Chapter 1, the fact that DNA replication is sensitive to how an individual interacts with its environment is the basis of natural selection and adaptive evolution. Premise three says that you inherit a response to an environment, not traits per se. Thus, the environmental context in which individuals live and reproduce cannot be ignored if we want a full understanding of evolution. In this chapter and the following two, we will lay the foundation for understanding the relationship between genotype and phenotype, a relationship that is essential to understand before turning our attention to natural selection and adaptive evolution in the final chapters of this book.
We will also explore how the genetic variation that arises in a population via premise two (DNA mutates and recombines) influences phenotypic variation in the population. Our basic approach to modeling evolution in the previous chapters was to start at a particular stage in the life history at one generation and then continue through the organisms’ life cycle until we reach the comparable stage at the next generation. We had to specify the genetic architecture and the rules of inheritance in going from individual genotypes to gametes, and we had to specify the rules of population structure to go from gametes to diploid individuals in the next generation. However, up to now, we ignored the fact that after fertilization, an individual zygote develops in and reacts to its environment to produce the traits that characterize it at different stages of life, including its adult traits. We now extend our models to include how individuals within each generation develop phenotypes in the context of an environment and how phenotypes are transmitted from one generation to the next. This problem of phenotypic transmission is far more complicated than Mendelian transmission of genes, as we shall see in this chapter.

Population Genetics and Microevolutionary Theory, Second Edition. Alan R. Templeton. © 2021 John Wiley & Sons, Inc. Published 2021 by John Wiley & Sons, Inc. Companion website: www.wiley.com/go/templeton/populationgenetics2


“Simple” Mendelian Phenotypes

Mendel was able to work out many of the rules for inheritance by focusing upon phenotypes that were primarily determined by a single locus and that had a simple mapping from genotype to phenotype in the environment in which the organisms normally lived and developed. Such simple genotype/phenotype systems are still the mainstay of introductory genetics textbooks because they illustrate Mendelian inheritance in a straightforward manner. A deeper examination of these “simple” Mendelian systems typically reveals more complexity in the relationship between genotype and phenotype than is usually presented in the textbooks. These “simple” Mendelian systems therefore provide an excellent vehicle for illustrating the issues we must deal with in going from genotype to phenotype.

Consider sickle cell anemia – the workhorse example of a “simple” Mendelian trait found in most genetic textbooks. Sickle cell anemia is a form of hemolytic anemia (that is, the red blood cells tend to lyse) that can lead to a variety of deleterious clinical effects and early death. Sickle cell anemia is commonly presented as a single nucleotide trait in which one nucleotide change in the sixth codon of the gene coding for the β-chain of hemoglobin produces the S allele (with valine at the sixth position) from the more common A allele (with glutamic acid at the sixth position). This single amino acid substitution in turn changes the biochemical properties of the resulting hemoglobin molecule, which in turn is typically presented as leading to the phenotype of sickle cell anemia in individuals who are homozygous for the S allele. Thus, sickle cell anemia is typically presented as a single locus, autosomal genetic disease and the S allele is said to be recessive to A.
Note that the word “recessive” refers to the mapping of genotype to phenotype, which, in this case means that the genotype homozygous for the S allele has the phenotype of sickle cell anemia, whereas AA and AS individuals do not have this phenotype. Recessiveness and allied concepts in Mendelian genetics, such as dominance, co-dominance, epistasis (when the phenotype is influenced by interactions between two or more genes), and pleiotropy (when a single genotype influences many different traits), are not innate properties of an allele. Instead, such words apply to the genotype–phenotype relationship in the context of a particular environment. To illustrate this, we will now explore many different phenotypes associated with the S and A alleles and examine the genotype–phenotype relationship in the context of an environment.

The Phenotype of Electrophoretic Mobility

Appendix A outlines the genetic survey technique of protein electrophoresis. Under some pH conditions, glutamic acid (associated with the A gene product) and valine (associated with the S gene product) will have a charge difference. Hemoglobin normally exists as a tetramer consisting of two α-globin chains and two β-globin chains. Under the appropriate buffer conditions, the hemoglobin tetramer can be disassociated into its component polypeptide chains. When protein electrophoresis is performed on blood samples in the proper pH and buffer environment, the β-hemoglobin chains display different electrophoretic mobility phenotypes in individuals differing in their genotypes with respect to the S and A alleles, as shown in Figure 8.1 (the α-globin chains, which have a distinct electrophoretic mobility from both types of β-chains, are not shown). As can be seen in that figure, the SS and AA homozygous genotypes each produce a single band, but with distinct phenotypes of electrophoretic mobility. However, note that the electrophoretic phenotype of the AS heterozygotes is the sum of the phenotypes of the two homozygotes. Thus, the phenotype associated with each allele in homozygous conditions is fully expressed in the heterozygote as well. In this case, we would say that the A and S alleles are codominant.


[Figure 8.1 labels: gel lanes for genotypes AS, SS, and AA; bands run from the origin along the direction of migration.]

Figure 8.1 Protein electrophoresis of the genotypic variation associated with the A and S alleles at the hemoglobin β-chain locus in humans. The buffer environment is such that the hemoglobin molecule is disassociated into its component α and β chains and the β-chains have a charge difference depending upon the amino acid they have at the sixth position of the amino acid chain.

The Phenotype of Sickling

Sickle cell anemia gets its name from the fact that the red blood cells (the cells that carry the hemoglobin molecules) will distort their shape from their normally disk shaped form to a sickle shape (Figure 8.2) under the environmental conditions of a low partial pressure of molecular oxygen (O2). Hemoglobin is the molecule that transports oxygen from the lungs to the tissues throughout our bodies. When the oxygen is released by the hemoglobin molecule due to a low partial pressure of oxygen in the ambient environment, an allosteric change occurs in the hemoglobin molecule. This three-dimensional change causes the valine in the βS-globin to protrude outward, where it can stick into a pocket in the three-dimensional structure of an α-chain on an adjacent hemoglobin molecule. The hemoglobin molecules are tightly packed together in the red blood cells, making such a joining of βS-globin to α-globin likely. Indeed, long strings of these joined hemoglobin molecules can assemble, and these strings in turn distort the shape of the red blood cell, leading to the trait of sickling under environmental conditions of a low partial pressure of oxygen. Such environmental conditions occur in the capillaries when oxygen is taken up by a peripheral tissue. Such conditions also occur at high altitude, or during pregnancy (fetal hemoglobin has a higher oxygen affinity than adult hemoglobin, allowing the fetal blood to take oxygen from the mother’s blood across the placenta). Because both AS and SS genotypes have βS-globin chains in their red blood cells, both of these genotypes show the sickling trait under the appropriate environmental conditions. Therefore, with respect to whether red blood cells sickle or not, the S allele is dominant.

Figure 8.2 Red blood cells showing the sickle cell shape (center) and the normal, disk shape. Source: Eye of Science/Science Source.

The Phenotype of Sickle Cell Anemia

In SS homozygotes, there are no βA-globin chains to disrupt the strings of joined hemoglobins, so the strings tend to be longer in SS individuals than in AS individuals. As a consequence, the distortion of the normal shape of the red blood cell tends to be more severe in SS individuals than in AS individuals under the environmental conditions of a low partial pressure of oxygen. Indeed, the distortion can be so severe in SS individuals that the red blood cell ruptures, losing its hemoglobin molecules and leading to anemia. The highly distorted red blood cells also pass poorly through the narrow capillaries, which is one of the places with the appropriate environmental conditions to induce sickling. The anemia and the inability of the distorted cells to move easily through the capillaries can lead to a wide spectrum of phenotypic effects (Figure 8.3) known collectively as sickle cell anemia.

[Figure 8.3 is a flow diagram; its box labels are: Abnormal Hemoglobin; Low Partial Pressure of O2; Sickling of Red Blood Cells; Clumping of Cells and Interference with Blood Circulation; Rapid Destruction of Sickle Cells; Collection of Sickle Cells in Spleen; Anemia; Local Failures in Blood Supply; Overactivity of Bone Marrow; Increase in Amount of Bone Marrow; Weakness and Lassitude; “Tower Skull”; Poor Physical Development; Impaired Mental Function; Heart Damage; Dilation of Heart; Heart Failure; Muscle and Joint Damage; Rheumatism; Gastrointestinal Tract Damage; Abdominal Pain; Kidney Damage; Kidney Failure; Brain Damage; Paralysis; Lung Damage; Pneumonia; Enlargement, then Fibrosis of Spleen.]

Figure 8.3 The sickle cell anemia syndrome. Source: Modified from Neel and Schull (1954).


As Figure 8.3 shows, sickle cell anemia is actually a complex clinical syndrome with multiple phenotypic effects (pleiotropy) and much variation in expression from one individual to the next. Indeed, the symptoms vary from early childhood death to no clinical symptoms at all (the absence of clinical symptoms in some individuals will be discussed in Chapter 12 but is due in part to epistasis with other loci). However, regardless of the exact degree of expression, the clinical syndrome of sickle cell anemia is found only in SS individuals. Therefore, with respect to the phenotype of sickle cell anemia, the S allele is recessive.

The Phenotype of Malarial Resistance

The AS and SS genotypes show resistance to falciparum malaria (Friedman and Trager 1981), one of the most lethal forms of malaria in humans. The malarial parasite enters the red blood cells of its host. A cell infected by the falciparum parasite, but not by the other malarial parasites, develops knobs on its surface, which lead to its sticking to the endothelium of small blood vessels. In such sequestered sites, sickling takes place in AS and SS individuals because of the low oxygen concentration in these small blood vessels. The infected red cell is also more acidic than the uninfected cell, a pH environment that enhances the rate of sickling. The spleen generally removes the red blood cells with the distorted sickle morphology before the parasite can complete its life cycle, leading to the phenotype of malarial resistance. Obviously, this phenotype can only be expressed under the environmental conditions of malarial infection. Because the genotypes AS and SS show this type of resistance to malaria, the S allele is dominant for the phenotype of malarial resistance.

The Phenotype of Health (Viability)

The A and S alleles influence how healthy an individual is, particularly with regard to viability, the ability of the individual to stay alive in the environment. First, consider an environment that does not have falciparum malaria. In such an environment, SS individuals have a substantial chance of dying before adulthood due to hemolytic anemia and other complications as shown in Figure 8.3. This is particularly true in areas that have poor health care. Because AA and AS individuals do not suffer from sickle cell anemia, the S allele is recessive for the phenotype of health in a non-malarial environment.

Now, consider health in a malarial environment. The SS individuals have poor health because they suffer from sickle cell anemia, but the AA individuals also have poor health because of falciparum malaria and have a high probability of childhood death. However, the AS individuals do not suffer from sickle cell anemia, and they have some resistance to malaria. Therefore, in an environment with falciparum malaria, the S allele is overdominant with respect to the phenotype of health because the AS heterozygotes have superior viability to either homozygote class.

Note that the S allele can be dominant, recessive, codominant, or overdominant depending upon which phenotype is being measured and the environment in which the measurement is made. Although we frequently use such expressions as a “dominant allele” or “recessive allele,” such expressions are merely a linguistic shorthand for describing the genotype–phenotype relationship in a particular environmental context. Dominance, recessiveness, etc. are NOT intrinsic properties of an allele. Context is always important when dealing with the relationship between genotype and phenotype.
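The dependence of dominance on the phenotype and the environment can be sketched in a few lines of Python. This is only an illustration: the function name `classify_dominance` and every numeric score below are hypothetical, chosen to mirror the qualitative statements in the text rather than taken from data. Codominance in the electrophoretic sense (both bands present in AS) cannot be expressed as a single numeric score, so it is left out.

```python
def classify_dominance(aa, as_, ss, tol=1e-9):
    """Classify how the S allele behaves relative to A for ONE phenotype
    measured in ONE environment, from hypothetical genotype scores."""
    low, high = min(aa, ss), max(aa, ss)
    if as_ > high + tol:
        return "overdominant"       # heterozygote exceeds both homozygotes
    if as_ < low - tol:
        return "underdominant"      # heterozygote falls below both homozygotes
    if abs(aa - ss) <= tol:
        return "no allelic effect"  # all three genotypes score the same
    if abs(as_ - ss) <= tol:
        return "dominant"           # AS resembles SS
    if abs(as_ - aa) <= tol:
        return "recessive"          # AS resembles AA
    return "intermediate"

# Hypothetical scores in the order (AA, AS, SS). A 0/1 score marks absence or
# presence of a trait; the viability values are made-up illustrative numbers.
examples = {
    "sickling at low O2": (0, 1, 1),
    "sickle cell anemia": (0, 0, 1),
    "malarial resistance": (0, 1, 1),
    "viability, no malaria": (0.9, 0.9, 0.2),
    "viability, malaria": (0.7, 0.9, 0.2),
}

for phenotype, scores in examples.items():
    print(f"{phenotype}: S is {classify_dominance(*scores)}")
```

The same allele is labeled differently in each row, because the label describes a genotype-to-phenotype mapping in a given environment and not the allele itself.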


Nature Versus Nurture?

Does nature (the genotype) or nurture (the environment) play the dominant role in shaping an individual’s phenotype? From premise three, we can see that this is a false issue. Phenotypes emerge from the interaction of genotype and environment. It is this interaction that is the true causation of an individual’s phenotype, and it is meaningless to try to separate genotype and environment as distinct causes for the individual’s phenotype. However, in population genetics, we are often concerned with a population of individuals with much phenotypic variability. Accordingly, in much of population genetics, our concern centers on causes of phenotypic variation among individuals within the deme rather than the causation of any single individual’s phenotype. Causes of phenotypic variation in a population are quite distinct from causes of individual phenotypes, and the nature/nurture issue is limited only to causes of variation.

We will illustrate these statements by considering yet another “simple” Mendelian genetic disease: phenylketonuria or PKU. The enzyme phenylalanine hydroxylase catalyzes the conversion of the amino acid phenylalanine to tyrosine and is coded for by an autosomal locus in humans. Several loss-of-function mutations have occurred at this locus (Scriver and Waters 1999; Scriver 2007), and homozygosity for loss-of-function alleles is associated with the clinical syndrome known as phenylketonuria or PKU. Let k designate the set of loss-of-function alleles, and K be the set of functional alleles at this locus. Because kk homozygotes cannot convert phenylalanine to tyrosine, they have a build-up of phenylalanine, a common amino acid in most foods. The degradation products of phenylalanine, such as phenylketones, also build up in kk homozygotes. The phenylketones are typically found at high levels in the urine of the kk homozygotes, an easily scored phenotype that gives the syndrome its name. However, there are other phenotypes associated with this syndrome.
For example, kk homozygotes tend to have a lighter skin color than most individuals that share their ethnic background because one of the main pigments in our skin, melanin, is synthesized from tyrosine, which cannot be produced from phenylalanine in kk homozygotes. However, the reason why PKU has attracted much attention is the tendency for kk homozygotes to suffer from mental retardation. As with sickle cell anemia, there is tremendous heterogeneity in the phenotype of mental ability among kk homozygotes that in part is due to epistasis with other loci (Scriver and Waters 1999). However, we will ignore epistasis for now and just treat PKU as a single locus, autosomal recessive genetic disease. The primary source of phenylalanine is our diet. The kk homozygotes typically have normal mental abilities at birth. While in utero, the kk homozygote is not eating but is obtaining its nutrients directly from the mother. Typically, the mother is a carrier of PKU with the genotype Kk, which means that she can catalyze phenylalanine to tyrosine. After birth, the kk homozygote cannot metabolize the phenylalanine found in a normal diet, and mental retardation will likely soon develop. If a baby with the kk genotype is identified soon after birth and placed on a diet with low phenylalanine, the baby will usually develop a normal level of intelligence. Thus, the same kk genotype can give radically different phenotypes depending upon the dietary environment. Because of the responsiveness of the kk genotype to environmental intervention, many countries require genetic screening of all newborns through a simple urine test to detect the kk homozygotes (Levy and Albers 2000; Berry et al. 2013). The PKU screening program has been successful in greatly reducing the incidence of mental retardation due to kk genotypes. Individuals who are kk are generally advised to maintain a low phenylalanine diet throughout their life. 
However, phenylalanine is such a common component of most protein bearing foods that such diets are highly restrictive and more expensive than normal diets. Moreover, the beneficial effects of the low phenylalanine diet are strongest in children. Once the brain has fully developed, kk individuals often do not perceive much of an impact of diet on their mental abilities. As a result, compliance with the diet tends to drop off with age (Singh et al. 2014). Note that the nature of the interaction of genotype (kk) with environment (the amount of phenylalanine in the diet) shifts with ontogeny (development) of the organism. Thus, genotype-by-environment interactions are not static even at the individual level.

Prior to the successful screening program, few kk women reproduced due to their severe retardation, but with dietary treatment, many kk women married and had children. However, many of these adult women were now eating a normal diet, and hence had high levels of phenylalanine and its degradation products in their blood. The developing fetus, usually with the genotype Kk that typically develops normally, was now exposed to an in utero environment that inhibited normal brain development. Such Kk children of kk mothers on a normal diet were born with irreversible mental retardation. Is the phenotype of mental retardation due to nature or nurture in this case? Obviously, both are important. One cannot predict the phenotype of an individual on the basis of genotype alone; the genotype must be placed in an environmental context before prediction of phenotype is possible. Thus, what is inherited here is not the trait of mental retardation, but rather the response to the dietary and maternal environments.

Now, consider the disease of scurvy. Ascorbic acid (vitamin C) is essential for collagen synthesis in mammals. Most mammals can synthesize ascorbic acid, but all humans are homozygous for a nonfunctional allele that prevents us from synthesizing ascorbic acid. As a result, when humans eat a diet lacking vitamin C, they begin to suffer from a collagen deficiency, which leads to skin lesions, fragile blood vessels, poor wound healing, loss of teeth, and eventually death if the vitamin deficient diet persists too long.
Thus, humans have an inherited response to the dietary environment that can lead to the disease of scurvy. Both scurvy and PKU therefore have a similar biological causation at the individual level. Both diseases result from the way in which an individual homozygous for a loss-of-function allele responds to a dietary environment. Yet, PKU is typically said to be a “genetic” disease, whereas scurvy is said to be an “environmental” disease. PKU is considered a genetic disease because although the disease arises from the interaction of genes and environment, the environmental component of the interaction is nearly universal (phenylalanine is in all normal diets) whereas the genetic component of the interaction, the kk genotype, is rare. As a consequence, when PKU occurs in a human population, it is because the person has the kk genotype since virtually all of us have a diet that would allow the PKU response given a kk genotype (at least until the screening program). Hence, the phenotype of PKU is strongly associated with the kk genotype in human populations. The condition of scurvy is also the result of an interaction between genes and environment, but, in this case, the genetic component of the interaction is universal in humans. However, the environmental component of the interaction, a diet without sufficient amounts of ascorbic acid, is rare. Therefore, the phenotype of scurvy is associated with a diet deficient in vitamin C in human populations. When we ask the question what causes PKU or scurvy, our answer is that an interaction of genes with environments is the cause of these diseases. However, when we ask the question what causes some people to have PKU or scurvy and others not, we conclude that genetic variation is the cause of phenotypic variation for PKU, whereas environmental variation is the cause of phenotypic variation for scurvy. 
As the PKU/scurvy example illustrates, the interaction of genes with environment creates a confoundment between frequency and apparent causation in a population of phenotypically variable individuals. When causation at the individual level arises from an interaction of components, then the rarer component at the level of the population is the one with the stronger association with phenotypic variation. Scurvy is an environmental disease because the dietary environment is rare but the genotypic component is common; PKU is a genetic disease because the dietary environment is common but the genotypic component is rare.

The dependency of causation of variation upon frequency in a population is illustrated by a hypothetical example in Table 8.1 in which a disease arises from the interaction of two independently varying components. In particular, the disease only occurs when the first component has state A1 and the second component has state B1. In this population, state A1 is relatively common, having a frequency of 0.9, and state B1 is relatively rare, having a frequency of 0.1. As shown in Table 8.1, the frequency of the disease in the population is given by the product (0.9)(0.1) = 0.09, and the remaining 91% of the population has no disease.

Table 8.1 A hypothetical disease arising from the interaction of two factors.

              B1 (0.1)             B2 (0.9)
A1 (0.9)      DISEASE (0.09)       No disease (0.81)
A2 (0.1)      No disease (0.01)    No disease (0.09)

Note: Component A has two trait states in the population, A1 with frequency 0.9 and A2 with frequency 0.1. Component B has two trait states in the population, B1 with frequency 0.1 and B2 with frequency 0.9. All frequencies are shown in parentheses.

Now, suppose that a survey is done in this population on component A. Given that an individual has state A1, then that individual will only show the disease when the individual also has trait B1, but we assume that trait B is not being monitored. Therefore, the probability of the disease in individuals with state A1 is the same as the frequency of trait B1, that is, 0.1. Hence, the frequency of the disease in individuals with trait A1 is just slightly above the overall incidence of the disease in the general population of 0.09. Thus, such a study would conclude that A1 is at best a minor cause of variation in the disease. In contrast, suppose a survey is conducted on component B but not A. The frequency of the disease in individuals with trait B1 is equal to the probability that the individual also has trait A1, that is, 0.9. Hence, trait B1 is strongly associated with the disease in this population. Knowing that a person has trait B1, we would conclude that they have a 10-fold higher risk for the disease than the general population incidence of 0.09, in great contrast to the trivial apparent effect of trait A1.
However, we know in this hypothetical example that A1 and B1 are equally important in actually causing the disease in any individual. Although A1 and B1 jointly cause the disease in affected individuals, A1 is not a good predictor of which individuals are at high risk for the disease, whereas B1 is a good predictor of disease risk. The cause of variation for a disease risk in a population is therefore a different concept than the cause of the disease in an individual. When dealing with populations, we must therefore change our focus from causation of phenotypes to causes of phenotypic variation. Nature and nurture can never be separated as a cause of a phenotype (premise three), but nature and nurture can be separated as causes of variation in a population, the focus of the remainder of this chapter.
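The arithmetic behind Table 8.1 can be checked with a short Python sketch. It uses only the frequencies stated in the table and the assumption, also stated in the text, that components A and B vary independently; the variable and function names (`joint`, `risk_given`, and so on) are illustrative choices, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# State frequencies from Table 8.1.
freq_A = {"A1": Fraction(9, 10), "A2": Fraction(1, 10)}
freq_B = {"B1": Fraction(1, 10), "B2": Fraction(9, 10)}

def has_disease(a, b):
    # The disease appears only when A1 and B1 occur together.
    return a == "A1" and b == "B1"

# Joint frequencies under independence of the two components.
joint = {(a, b): pa * pb for a, pa in freq_A.items() for b, pb in freq_B.items()}

# Overall incidence of the disease in the population.
incidence = sum(p for key, p in joint.items() if has_disease(*key))

def risk_given(position, state):
    """Disease frequency among individuals carrying `state`
    (position 0 = component A, position 1 = component B)."""
    carriers = {key: p for key, p in joint.items() if key[position] == state}
    diseased = sum(p for key, p in carriers.items() if has_disease(*key))
    return diseased / sum(carriers.values())

risk_A1 = risk_given(0, "A1")
risk_B1 = risk_given(1, "B1")

print("incidence:", float(incidence))
print("risk given A1:", float(risk_A1), "relative risk:", float(risk_A1 / incidence))
print("risk given B1:", float(risk_B1), "relative risk:", float(risk_B1 / incidence))
```

Although `has_disease` treats A1 and B1 symmetrically, the relative risk for the common state A1 is only 10/9, whereas for the rare state B1 it is 10.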

The Fisherian Model of Quantitative Genetics

The phenotypes considered in the previous section fall into discrete categories. Such discrete phenotypes were used by Mendel in his studies that uncovered the basic rules of inheritance. However, many phenotypes display continuous variation, for example, height or weight in humans. One of the major problems after the rediscovery of Mendelism at the beginning of the twentieth century was how to reconcile the discrete genotypes of Mendelian genetics with the continuous, quantitative phenotypes that often were of more practical utility in agriculture and medicine. We will now examine a model developed by R. A. Fisher (1918) that extends Mendelian genetics to cover quantitative phenotypes.

Fisher and others realized that there were two major ways in which discrete genotypes could map onto continuous phenotypes. First, there could be variation in the environment that interacts with genotypes to produce a range of phenotypes in individuals who share a common genotype that overlaps with the range of phenotypes associated with other genotypes (Figure 8.4). Second, the phenotypic variation could be associated with genetic variation at many loci. As the number of loci increases, the number of discrete genotypes becomes so large that the genotypic frequency distribution approximates a continuous distribution; so, even a simple genotype to phenotype mapping would produce a nearly continuous phenotypic distribution (Figure 8.5). In general, Fisher regarded most phenotypic variation as arising from both underlying environmental and multi-locus genotypic variation.

Just as we characterized our population of genotypes by genotype frequencies, we will characterize our population of phenotypes by phenotype frequencies. For quantitative phenotypes, a continuous probability distribution is used to describe the phenotype frequencies rather than the discrete probabilities that we had used in previous chapters to describe the genotype frequencies. Under the Fisherian model in which many variable factors, both genetic and environmental, contribute to phenotypic variation, we would expect many phenotypic distributions to fit a normal distribution due to the central limit theorem of statistics (see Appendix B).
This normal expectation is often found empirically to be a good approximation to the population distribution of many quantitative traits. Moreover, even when a trait does not fit a normal distribution, mathematical transformations can often be used to “normalize” the trait. Therefore, in this chapter, we will assume that any quantitative trait is distributed normally in the population. The normal distribution has many optimal properties, the most important of which, for the current discussion, is that only two parameters, the mean (population average) and the variance (the average squared deviation from the mean), fully describe the entire distribution. In particular, the normal distribution is symmetrically centered about the mean, μ, as shown in Figure 8.6. The variance, σ2, describes the width of the distribution about the mean (Figure 8.6). When the phenotypes of a population follow a normal distribution, the individuals may show any phenotypic value, but the phenotypes most common in the population are those found close to the central mean value. The frequency of a phenotype drops off with increasing deviations from the mean, with the rate of drop off of the frequency depending upon the variance. Because it takes only two numbers, the mean and the variance, to fully describe the entire phenotypic distribution, Fisher’s model of quantitative variation has two basic types of measurements: those related to mean or average phenotypes and those related to the variance of phenotypes. Both types of measurements will be considered, starting with the measurements related to the mean.
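The second route to continuous variation, many loci with simple additive effects, can be sketched numerically. The Python snippet below assumes equal allele frequencies, Hardy–Weinberg proportions at each locus, and independence among loci (an added assumption for this sketch), so that the number of capital-letter alleles carried by an individual follows a binomial distribution over 2L alleles for L loci. The function name `phenotype_distribution` is illustrative.

```python
from math import comb

def phenotype_distribution(n_loci, p=0.5):
    """Phenotype frequencies when each capital-letter allele adds +1 and each
    lower-case allele adds 0, summed over 2 * n_loci independent alleles."""
    n_alleles = 2 * n_loci
    return {
        k: comb(n_alleles, k) * p**k * (1 - p) ** (n_alleles - k)
        for k in range(n_alleles + 1)
    }

for n_loci in (1, 2, 3, 10):
    dist = phenotype_distribution(n_loci)
    mean = sum(k * f for k, f in dist.items())
    variance = sum((k - mean) ** 2 * f for k, f in dist.items())
    print(f"{n_loci} loci: {len(dist)} phenotype classes, "
          f"mean = {mean:.2f}, variance = {variance:.2f}")
```

With one locus there are only three phenotypic classes (frequencies 1/4, 1/2, 1/4); with ten loci there are 21 classes whose frequencies already trace out a bell-shaped curve summarized well by its mean and variance.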

Quantitative Genetic Measures Related to the Mean The most straightforward measure related to the mean is the mean phenotype itself, μ. (All the quantitative genetic definitions used in this chapter are summarized in Table 8.2.) The mean μ is the average phenotype of all individuals in the population. Alternatively, μ can be thought of as the mean phenotype averaged over all genotypes and all environments. Let Pij,k be the phenotype of a diploid individual with genotype ij (with ij referring to gamete i and gamete j that came together

305

Population Genetics and Microevolutionary Theory

0.05

0.04

0.03

0.02

0.01

0.00

40

60

80

100

120

140

160

Frequency

306

AA

Aa

aa

Figure 8.4 A continuous phenotypic distribution produced by interactions between the genetic variation at a single locus with two alleles (A and a) with environmental variation. The locus is assumed to have an allele frequency of 0.5 and obeys Hardy–Weinberg, with the height of the histogram at the bottom of the figure indicating the genotype frequencies. The arrows coming from a genotype block in the histogram indicate that the same genotype can give rise to many phenotypes depending upon the environment. A thin solid line indicates the phenotypic distribution arising from the interaction of genotype AA with environmental variation, a thin small-dashed line the phenotypic distribution associated with Aa, and a thin large-dashed line with aa. The areas under these normal distributions are proportional to the genotype frequencies in the population. The thick solid line indicates the overall phenotypic distribution in the population that represents a mixture of the three genotypic specific distributions as weighted by the genotype frequencies.

Basic Quantitative Genetic Definitions and Theory

Frequency

Three Loci

1

0

2

3

4

5

6

7

8

AAbb aaBB AaBb

Two Loci Frequency

Aabb aaBb

AABb AaBB

AABB

aabb 0

1

2

3

4

Aa Frequency

One Locus AA

aa

1

2 Phenotype

3

Figure 8.5 An approximate continuous phenotypic distribution produced by increasing the number of loci affecting phenotypic variation. A simple genotype-to-phenotype model is assumed in which each allele indicated by a small case letter contributes 0 to the phenotype, and each allele indicated by a capital letter contributes +1, with the overall phenotype simply being the sum over all alleles and all loci. At the bottom of the figure, the phenotypic distribution associated with a one-locus, two-allele model with equal allele frequencies is shown, the middle panel shows the phenotypic distribution associated with a two-locus, twoallele model with equal allele frequencies, and the top panel shows the phenotypic distribution associated with a three-locus, two-allele model with equal allele frequencies (the genotypes associated with the phenotypic categories are not indicated in that case). As the number of loci increases, the phenotypic distribution approximates more and more that of a continuous distribution.

to form genotype ij) living in a specific environment k in a population of size n. Then, the mean phenotype μ is

$$\mu = \sum_{ij} \sum_{k} \frac{P_{ij,k}}{n} \tag{8.1}$$

where the summation is over all genotypes ij and over all environments k, which is equivalent to summing over all n individuals. For example, Maxwell et al. (2013) scored the phenotype of the amount of total serum cholesterol (measured in milligrams, mg) found in a deciliter (dl) of blood

[Figure 8.6 plot: y-axis, Normal Distribution Function; x-axis, X (−4 to 4); curves for μ = 0, σ = 1; μ = 1, σ = 1; and μ = 0, σ = 2.]

Figure 8.6 Three normal distributions that show the role of the mean and variance upon the shape and position of the distribution. The x-axis gives the possible values for a random variable, X, and the y-axis is the corresponding value of the normal distribution function for the specified mean and variance.

serum in 9053 European-Americans (Table 8.3). Summing the values of total serum cholesterol over all 9053 individuals and dividing the total by 9053 yields a mean phenotype of μ = 213.02 mg/dl. To investigate the role of genetic variation as a source of phenotypic variation, it is necessary to relate mean phenotypes to genotypes. This is done through the genotypic value of genotype ij, Gij, the mean phenotype of all individuals sharing genotype ij. Letting nij be the number of individuals in the population with genotype ij, then

$$G_{ij} = \sum_{k} \frac{P_{ij,k}}{n_{ij}} \tag{8.2}$$

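The frequency of genotype ij in the population is given by nij/n, so Eq. (8.1) can be rewritten as:

$$\mu = \sum_{ij} \sum_{k} \frac{P_{ij,k}}{n} = \sum_{ij} \frac{n_{ij}}{n} \sum_{k} \frac{P_{ij,k}}{n_{ij}} = \sum_{ij} \frac{n_{ij}}{n}\, G_{ij} \tag{8.3}$$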

Hence, the overall mean phenotype is the average of the genotypic values as weighted by their genotype frequencies. Maxwell et al. (2013) also scored these European-American individuals for many SNPs at the autosomal Apoprotein E (ApoE) locus (see Chapter 5), a locus that codes for a protein that can form soluble complexes with lipids such as cholesterol so that they may be transported in the serum. The ApoE protein also binds with certain receptors on cells involved in the uptake of cholesterol from the blood. They surveyed SNPs that included two missense sites that define the classic three alleles at this locus as determined by protein electrophoresis (Sing and Davignon 1985), labeled ε2, ε3, and ε4. These three alleles in turn define six possible genotypes. Table 8.3 gives these six genotypes, along with the number of individuals bearing each of these six genotypes, and the genotype frequencies. Table 8.3 also gives the mean phenotypes (genotypic values) of total serum cholesterol found for each of the six genotypes, which are obtained by adding the phenotypes of all individuals that


Table 8.2 Quantitative genetic definitions.

| Name | Symbol | Mathematical Definition/Meaning |
|---|---|---|
| Phenotype of an Individual | $P_{ij,k}$ | The value of a trait for an individual with genotype ij living in a specific environment k |
| Population Size | $n$ | The size of the population (or sample) being measured |
| Mean | $\mu$ | $\mu = \sum_{ij}\sum_{k} P_{ij,k}/n$, the average phenotype of all individuals in the population |
| Genotype Number | $n_{ij}$ | The number of individuals in the population with genotype ij |
| Genotypic Value of Genotype ij | $G_{ij}$ | $G_{ij} = \sum_{k} P_{ij,k}/n_{ij}$, the mean phenotype of all individuals sharing genotype ij |
| Genotypic Deviation of Genotype ij | $g_{ij}$ | $g_{ij} = G_{ij} - \mu$, the deviation of the genotypic value of genotype ij from the mean of the total population |
| Environmental Deviation | $e_k$ | $e_k = P_{ij,k} - G_{ij}$, the residual phenotypic deviation inexplicable with the genetic model being used |
| Genotype Frequency of ij | $t_{ij}$ | The frequency of the unordered genotype ij |
| Average Excess of Gamete Type i | $a_i$ | $a_i = \frac{t_{ii}}{p_i} g_{ii} + \sum_{j \ne i} \frac{\frac{1}{2} t_{ij}}{p_i} g_{ij}$, the average genotypic deviation of all individuals who received a copy of gamete type i |
| Average Effect of Gamete Type i | $\alpha_i$ | The slope of the least squares regression of the genotypic deviations against the number of gametes of type i borne by a genotype |
| Breeding Value or Additive Genotypic Deviation of Genotype ij | $g_{aij}$ | $g_{aij} = \alpha_i + \alpha_j$, the sum of the average effects of the gametes borne by individuals with genotype ij |
| Dominance Deviation of Genotype ij | $d_{ij}$ | $d_{ij} = g_{ij} - g_{aij}$, the difference between the genotype’s genotypic deviation and its additive genotypic deviation in a one-locus model |
| Phenotypic Variance | $\sigma_P^2$ | $\sigma_P^2 = \sum_{ij}\sum_{k} (P_{ij,k} - \mu)^2/n$, the variance of the phenotype over all individuals in the population |
| Genetic Variance | $\sigma_g^2$ | $\sigma_g^2 = \sum_{ij} t_{ij} g_{ij}^2$, the variance of the genotypic deviations |
| Environmental Variance | $\sigma_e^2$ | $\sigma_e^2 = \sigma_P^2 - \sigma_g^2$, the variance left over after explaining as much as possible of the phenotypic variance with the genetic variance |
| Broad-Sense Heritability | $h_B^2$ | $h_B^2 = \sigma_g^2/\sigma_P^2$, the ratio of the genetic variance to the phenotypic variance |
| Additive Genetic Variance | $\sigma_a^2$ | $\sigma_a^2 = \sum_{ij} t_{ij} g_{aij}^2$, the variance in the additive genotypic deviations |
| (Narrow-Sense) Heritability | $h^2$ | $h^2 = \sigma_a^2/\sigma_P^2$, the ratio of the additive genetic variance to the phenotypic variance |
| Dominance Variance | $\sigma_d^2$ | $\sigma_d^2 = \sigma_g^2 - \sigma_a^2$, the residual left over after subtracting off the additive genetic variance from the genetic variance |


Table 8.3 The genotypes at the ApoE locus found in a population of 9053 European-Americans, along with their numbers, frequencies, and the genotypic values and deviations for the phenotype of total serum cholesterol.

| Genotype | ε2/ε2 | ε2/ε3 | ε2/ε4 | ε3/ε3 | ε3/ε4 | ε4/ε4 | Sum or Mean |
|---|---|---|---|---|---|---|---|
| Number | 66 | 1145 | 210 | 5399 | 2048 | 185 | 9053 |
| Frequency | 0.0073 | 0.1265 | 0.0232 | 0.5964 | 0.2262 | 0.0204 | 1 |
| Gij (mg/dl) | 194.46 | 201.03 | 203.74 | 213.40 | 219.59 | 223.07 | 213.02 |
| gij (mg/dl) | −18.56 | −11.99 | −9.28 | 0.38 | 6.57 | 10.05 | 0 |

Note: All means are weighted by genotype frequencies. Source: Data from Maxwell et al. (2013).
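The weighted-average relationship of Eq. (8.3) can be checked against the entries of Table 8.3. The sketch below is only an illustration: it works from the rounded genotypic values printed in the table rather than the raw data, so the result should be read as an approximation to the reported mean. The variable names are hypothetical.

```python
# Counts and genotypic values (mg/dl) as listed in Table 8.3.
counts = {"2/2": 66, "2/3": 1145, "2/4": 210,
          "3/3": 5399, "3/4": 2048, "4/4": 185}
G = {"2/2": 194.46, "2/3": 201.03, "2/4": 203.74,
     "3/3": 213.40, "3/4": 219.59, "4/4": 223.07}

n = sum(counts.values())                        # population size
freq = {gt: counts[gt] / n for gt in counts}    # genotype frequencies n_ij / n
mu = sum(freq[gt] * G[gt] for gt in counts)     # Eq. (8.3): frequency-weighted mean

print(n, {gt: round(f, 4) for gt, f in freq.items()}, round(mu, 2))
```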

[Figure 8.7 plot: y-axis, Relative Frequency; x-axis, Total Serum Cholesterol (mg/dl), 100 to 400; one curve for each genotype (2/2, 2/3, 2/4, 3/3, 3/4, 4/4).]

Figure 8.7 The normal distributions with the observed means and variances for the six genotypes defined by the ε2, ε3, and ε4 alleles at the ApoE locus in a population of 9053 European-Americans. The area under each curve is proportional to the genotype frequency in the population. Source: Data from Maxwell et al. (2013).

share a particular genotype and then dividing by the number of individuals with that genotype. The overall mean phenotype can also be calculated by multiplying the genotypic values by their respective genotype frequencies and summing the products over all genotypes. Figure 8.7 shows the normal distributions fitted to the genotypic values and phenotypic variances within each genotype class. As can be seen, the phenotypic distributions of the different genotypes are centered at different places, reflecting the differences in genotypic values given in Table 8.3. Fisher’s focus was upon genetic variation being associated with phenotypic variation. He therefore was not concerned with the actual mean phenotype of the total population. Fisher therefore calculated the overall mean phenotype μ only so that he could discard it in order to focus upon


phenotypic differences associated with genotypic differences. He eliminated the effect of the overall mean phenotype simply by subtracting it from a genotypic value to yield the genotypic deviation, gij, of genotype ij that is given by the equation:

$$g_{ij} = G_{ij} - \mu \tag{8.4}$$

Equation (8.3) shows that the average genotypic value (when weighted by the genotype frequencies) is μ, and using Eq. (8.4), the average genotypic deviation (when weighted by the genotype frequencies) is:

$$\text{Average}(g_{ij}) = \sum_{ij} \frac{n_{ij}}{n}\,(G_{ij} - \mu) = \sum_{ij} \frac{n_{ij}}{n}\, G_{ij} - \mu \sum_{ij} \frac{n_{ij}}{n} = \mu - \mu = 0 \tag{8.5}$$
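Equations (8.4) and (8.5) can be illustrated with the Table 8.3 entries. In this sketch μ is recomputed from the tabulated counts and genotypic values (an assumption made so that the identity holds exactly in floating point), and the helper name `genotypic_deviations` is hypothetical. The last lines also shift every genotypic value by an arbitrary constant to show that the deviations do not depend on the value of μ.

```python
# Counts and genotypic values (mg/dl) as listed in Table 8.3.
counts = {"2/2": 66, "2/3": 1145, "2/4": 210,
          "3/3": 5399, "3/4": 2048, "4/4": 185}
G = {"2/2": 194.46, "2/3": 201.03, "2/4": 203.74,
     "3/3": 213.40, "3/4": 219.59, "4/4": 223.07}


def genotypic_deviations(counts, G):
    """Return (mu, g) with g[genotype] = G[genotype] - mu, as in Eq. (8.4)."""
    n = sum(counts.values())
    mu = sum(counts[gt] / n * G[gt] for gt in counts)
    return mu, {gt: G[gt] - mu for gt in counts}


mu, g = genotypic_deviations(counts, G)
n = sum(counts.values())
average_g = sum(counts[gt] / n * g[gt] for gt in counts)  # Eq. (8.5)

# Adding a constant to every genotypic value moves mu but not the deviations.
shifted = {gt: value + 50.0 for gt, value in G.items()}
mu_shifted, g_shifted = genotypic_deviations(counts, shifted)

print(round(average_g, 10))
```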

Hence, the average genotypic deviation of the population is always zero. This is shown in Table 8.3 for the ApoE example. All of Fisher’s subsequent statistics related to genes and genotypes are based upon the genotypic deviations and, hence, are mathematically invariant to μ. In this manner, Fisher could focus exclusively upon the phenotypic differences between genotypes in the population irrespective of the value of the actual mean phenotype. Fisher realized that not all differences in phenotypes among individuals are due to genotypes. He therefore gave a simple model of an individual’s phenotype as:

$$P_{ij,k} = \mu + g_{ij} + e_k = G_{ij} + e_k \tag{8.6}$$

where ek is defined as the environmental deviation. Fisher’s choice of terminology for ek was unfortunate and has led to much confusion. The term ek is simply whatever one needs to add on to the genotypic deviation of an individual in order to obtain the value of that individual’s phenotype. This is shown graphically in Figure 8.8. Any factor that causes an individual’s phenotype to deviate from its genotypic mean (the genotypic value in Fisher’s terminology) therefore contributes to ek. For example, many genes are known to affect cholesterol levels in addition to ApoE (Sing et al. 1996; Templeton 2000; Helgadottir et al. 2016; Iotchkova et al. 2016). Therefore, some of the phenotypic variation that occurs within a particular genotypic class (as shown by the curves in Figure 8.7) is due to genetic variation at other loci, yet these genetic factors would be treated as “environmental” deviations in a genetic model that incorporates only the variation at the ApoE locus. Moreover, an individual’s phenotype emerges from how the genotype responds to the environment, making the roles of genotype and environment biologically inseparable at the individual level. For example, many of the genes influencing lipid phenotypes strongly interact with the environment, and particularly diet (e.g. Buckley et al. 2017). There is no clean partitioning of genetic and environmental factors at the individual level. However, Eq. (8.6) gives the appearance that an individual’s phenotype Pij,k can be separated into a genetic component (gij) and an “environmental” component (ek). This is a false appearance. Note that μ, gij, and Gij are all some sort of average over many individuals. Thus, Eq. (8.6) is not really a model of an individual’s phenotype; rather, it is a population model of how an individual’s phenotype is placed into a population context. 
Because ek is just whatever we need to add on to Gij, a quantity only defined at the level of a population, ek is also only defined at the level of a population. There is no way of mathematically separating genes and environment for a specific individual; Eq. (8.6) only separates causes of variation at the population level. Equation (8.6) separates phenotypic variation into two causes: the phenotypic variation due to genotypic variation in the population and whatever is left over after the genotypic

[Figure 8.8 plot: y-axis, Normal Distribution Function; x-axis, Phenotype Value, with μ, Gij, and Pij,k marked; gij spans μ to Gij and ek spans Gij to Pij,k.]

Figure 8.8 A pictorial representation of how the phenotype of an individual with genotype ij is partitioned into a genotypic deviation, gij, and an environmental deviation, ek, which is the number that needs to be added on to the genotypic value, Gij, in order to produce the individual’s phenotypic value of Pij,k.

component has been considered. Therefore, ek is more accurately thought of as a residual population deviation that is inexplicable by the genetic model being used. It does not truly measure “environmental” causes of variation (although it is certainly influenced by them). However, Fisher’s terminology is so ingrained in the literature that we will continue to call ek the environmental deviation, although keep in mind that it is really only the residual deviation inexplicable with the genetic model being used. Fisher defined ek as a residual term because this results in some useful mathematical properties. Recall that μ can be interpreted as the mean phenotype averaged over all genotypes and all “environments” (actually residual factors). Hence, averaging both sides of Eq. (8.6) across both genotypes and environments yields:

$$\begin{aligned}
\sum_{ij} \sum_{k} \frac{P_{ij,k}}{n} &= \sum_{ij} \sum_{k} r_k \frac{n_{ij}}{n}\, G_{ij} + \sum_{ij} \sum_{k} r_k \frac{n_{ij}}{n}\, e_k \\
\mu &= \sum_{ij} \frac{n_{ij}}{n}\, G_{ij} \sum_{k} r_k + \sum_{ij} \frac{n_{ij}}{n} \sum_{k} r_k e_k \\
\mu &= \mu + \sum_{k} r_k e_k \\
\sum_{k} r_k e_k &= 0
\end{aligned} \tag{8.7}$$

where rk is the frequency of residual factor k in the population. The term Σrkek is the average environmental deviation in the population, and Eq. (8.7) shows that it must be zero. Therefore, by using the device of deviations from the mean, Fisher has eliminated the mean μ from all subsequent calculations. Subtracting off μ also results in both the genetic factors and residual factors having a mean of zero. So far, all of the parameters of Fisher’s model simply give an alternative description of the phenotypes of individuals. Fisher also addressed the more complicated problem of how phenotypes are transmitted from one generation to the next. This is a difficult problem because, under Mendelian


genetics, the genotypes are not directly transmitted across generations; rather, haploid gametes transmit the genetic material. The essential problem faced by Fisher was that although the gametes represent the bridge across generations, the haploid gametes do not normally express the phenotypes found in diploid individuals. Therefore, to crack the problem of the role of genetic variation on transmission of phenotypic variation, Fisher had to come up with some way of assigning a phenotype to a haploid gamete even when only diploid individuals actually express the phenotype. Indeed, the core to understanding Fisher’s model is to think like a gamete! That is, one must look at the problem of the transmission of phenotypes from one generation to the next from the gamete’s perspective. Fisher came up with two ways of quantifying phenotypic transmission from a gamete’s point of view. Fisher’s first measure of the phenotype associated with a gamete is ai, the average excess of gamete type i, the average genotypic deviation of all individuals who received a copy of gamete type i. Consider drawing a known gamete of type i from the gene pool. Then, draw a second gamete according to the normal rules defined by population structure. Then, the average excess of gamete type i is the expected phenotype of the genotypes resulting from pairing the known gamete i with the second gamete minus μ. The average excess mathematically is called a conditional expectation or conditional average. Specifically, the average excess of gamete type i is the conditional average genotypic deviation given that at least one of the gametes an individual received from its parents is of type i. To calculate a conditional expected genotypic deviation, we first need the conditional genotype frequencies given that one of the gametes involved in the fertilization event is of type i. For the one-locus model, let pi be the frequency of gamete i in the gene pool. 
As seen in previous chapters, pi represents the probability of drawing at random a gamete of type i out of the gene pool. Now, let tij be the frequency of diploid genotype ij in the population where j is another possible gamete that can be drawn from the gene pool. Gamete j is drawn from the gene pool with the rules of population structure that are applicable to the population under study. Note, we are not assuming random mating or Hardy–Weinberg in this model. As shown in Appendix B, the conditional probability of event A given that B is known to have occurred is the probability of A and B jointly occurring divided by the probability of B. Therefore, the probability of an individual receiving one gamete of type i and one gamete of type j (thereby resulting in genotype ij) given that one of the gametes is known to be of type i is tij/pi. There is one complication with this equation for conditional genotype frequencies. In population genetics, it is customary to interpret tij as the frequency of the unordered genotype ij, that is, the frequency of ij plus the frequency of ji. We saw this convention in Chapter 2 in deriving the Hardy–Weinberg genotype frequencies for a model with two alleles, A and a. As shown in Table 2.2, there are actually two ways of obtaining the Aa heterozygote, Aa with genotype frequency pq and aA with genotype frequency qp, for a total unordered frequency of 2pq. When given one of the gamete types, there are no longer two ways of drawing a heterozygous combination. Now there is only one way of getting an ij heterozygote (j ≠ i): the other gamete must be of type j. Therefore, when using unordered genotype frequencies, the conditional genotype frequencies given one gamete is of type i are:

$$\begin{aligned}
\text{Prob}(ii \text{ given } i) &= t(ii \mid i) = \frac{t_{ii}}{p_i} \\
\text{Prob}(ij \text{ given } i) &= t(ij \mid i) = \frac{\tfrac{1}{2}\, t_{ij}}{p_i} \quad \text{when } j \ne i
\end{aligned} \tag{8.8}$$
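The conditional genotype frequencies of Eq. (8.8) can be sketched directly from unordered genotype counts. The example below uses the ApoE counts from Table 8.3 and conditions on the ε2 allele; the function name `conditional_frequencies` and the allele labels `"e2"`, `"e3"`, `"e4"` are hypothetical conveniences, not notation from the text.

```python
# Unordered genotype counts for the ApoE locus (Table 8.3).
counts = {("e2", "e2"): 66, ("e2", "e3"): 1145, ("e2", "e4"): 210,
          ("e3", "e3"): 5399, ("e3", "e4"): 2048, ("e4", "e4"): 185}


def conditional_frequencies(counts, allele):
    """Genotype frequencies given that one gamete is `allele` (Eq. 8.8)."""
    n = sum(counts.values())
    t = {gt: c / n for gt, c in counts.items()}  # unordered genotype frequencies
    # Allele frequency: homozygotes carry two copies, heterozygotes one.
    p = sum(t[gt] * gt.count(allele) / 2 for gt in t)
    cond = {}
    for (i, j), t_ij in t.items():
        if allele not in (i, j):
            continue  # genotype cannot arise from this gamete
        # Homozygotes keep their full frequency; heterozygotes are halved.
        cond[(i, j)] = (t_ij if i == j else 0.5 * t_ij) / p
    return cond


cond_e2 = conditional_frequencies(counts, "e2")
print({gt: round(f, 4) for gt, f in cond_e2.items()})
```

No assumption about random mating enters here: the observed genotype frequencies are used as they are, and the conditional frequencies sum to one because they cover every genotype in which the chosen gamete can occur.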


With these conditional genotype frequencies given gamete i, we can now calculate the average genotypic deviation given gamete i, ai, as:

$$a_i = \frac{t_{ii}}{p_i}\, g_{ii} + \sum_{j \ne i} \frac{\tfrac{1}{2}\, t_{ij}}{p_i}\, g_{ij} = \sum_{j} t(ij \mid i)\, g_{ij} \tag{8.9}$$

where gij is the genotypic deviation of the genotype ij. For example, consider the ApoE example given in Table 8.3. There are three possible gamete types in this population, corresponding to the three alleles ε2, ε3, and ε4. Using the genotype frequencies in Table 8.3, the allele frequencies for these three gametes are calculated to be p2 = 0.0821, p3 = 0.7727, and p4 = 0.1451 for the ε2, ε3, and ε4 alleles, respectively. Now suppose we know an individual has received the ε2 allele. Then, we know that this individual must be either genotype 2/2, 2/3, or 2/4. The frequencies of these genotypes are 0.0073, 0.1265, and 0.0232, respectively, from Table 8.3. Using Eq. (8.8), the conditional frequencies of these three genotypes given an ε2 allele are 0.0073/0.0821 = 0.0888 (using more decimal points in the division than shown here) for genotype 2/2, (1/2)(0.1265)/0.0821 = 0.7701 for genotype 2/3, and (1/2)(0.0232)/0.0821 = 0.1412 for genotype 2/4. These three conditional genotype frequencies sum up to one (with a slight deviation due to rounding error in this case) because they define the set of mutually exclusive and exhaustive genotypes in which a gamete bearing an ε2 allele can come to exist. From Table 8.3, we can also see that if an ε2 allele is combined with another ε2 allele to form the 2/2 genotype, then it has the genotypic deviation of −18.56. Likewise, if an ε2 allele is coupled with an ε3 allele to form a 2/3 genotype, it has a genotypic deviation of −11.99, and if an ε2 allele is coupled with an ε4 allele to form the 2/4 genotype, it has a genotypic deviation of −9.28. Therefore, the average excess of a gamete bearing an ε2 allele is, using Eq. (8.9),

$$a_2 = 0.0888(-18.56) + 0.7701(-11.99) + 0.1412(-9.28) = -12.19 \text{ mg/dl}$$
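The same calculation can be repeated for all three alleles. The following sketch applies Eq. (8.9) to the counts and genotypic deviations printed in Table 8.3; because it works from rounded table entries, the results should agree with the values quoted in the text only to about the second decimal place. The function name `average_excess` and the allele labels are hypothetical.

```python
# Unordered genotype counts and genotypic deviations g_ij (mg/dl), Table 8.3.
counts = {("e2", "e2"): 66, ("e2", "e3"): 1145, ("e2", "e4"): 210,
          ("e3", "e3"): 5399, ("e3", "e4"): 2048, ("e4", "e4"): 185}
g = {("e2", "e2"): -18.56, ("e2", "e3"): -11.99, ("e2", "e4"): -9.28,
     ("e3", "e3"): 0.38, ("e3", "e4"): 6.57, ("e4", "e4"): 10.05}


def average_excess(allele, counts, g):
    """Average excess of `allele` from observed genotype frequencies (Eq. 8.9)."""
    n = sum(counts.values())
    copies = sum(c * gt.count(allele) for gt, c in counts.items())
    p = copies / (2 * n)  # allele frequency in the gene pool
    total = 0.0
    for gt, c in counts.items():
        if allele in gt:
            t_ij = c / n
            # Conditional genotype frequency given the allele (Eq. 8.8).
            weight = t_ij / p if gt[0] == gt[1] else 0.5 * t_ij / p
            total += weight * g[gt]
    return total


excess = {a: average_excess(a, counts, g) for a in ("e2", "e3", "e4")}
print({a: round(value, 2) for a, value in excess.items()})
```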

Similarly, one can calculate the average excess of the ε3 allele to be 0.28 mg/dl and that of the ε4 allele to be 5.79 mg/dl. Thus, on the average, individuals who received an ε2 allele will have a total serum cholesterol level that is 12.19 mg/dl lower than the overall population mean, individuals who received an ε3 allele are close to the overall population mean (only 0.28 mg/dl above μ on the average), and individuals who received an ε4 allele will have a total serum cholesterol level that is 5.79 mg/dl higher than the overall population mean. In this manner, we have assigned a phenotype of cholesterol level to a gamete even though that gamete never actually displays a cholesterol level of any sort. Basically, the average excess is answering the question of what phenotype a gamete is expected to have once fertilization has occurred using the rules of population structure for that population. The average excess looks at phenotypic variation from the gamete’s perspective, not that of the diploid individual. Note also that the value of the average excess is explicitly a population-level parameter: it depends not only upon the average phenotype of individuals with a given genotype but it is also a function of gamete and genotype frequencies and population structure. Thus, any factor that changes population structure or gamete frequencies can change the average excess of a gamete even if the relationship between genotype and phenotype at the individual level is unchanging. In the above calculations, we made no assumption about population structure, simply taking the observed genotype frequencies as is. However, the calculations of the average excess are simplified for the special case of random mating. Under Hardy–Weinberg, $t_{ii} = p_i^2$ where pi is the frequency of allele i, and $t_{ij} = 2p_ip_j$ when j ≠ i. Hence, from Eq. (8.8), the conditional frequency of genotype ij given i is simply pj. This makes sense for random mating. Given one gamete bearing allele i, the


probability of drawing at random a second gamete for a fertilization event of type j is simply its allele frequency, pj. Therefore, for the special case of random mating in a Hardy–Weinberg population, Eq. (8.9) simplifies to

$$a_i = \sum_{j} p_j\, g_{ij} \tag{8.10}$$

where the summation is over all alleles j in the gene pool, including i. As an example, we will redo the ApoE example, but this time assuming that the population is in Hardy–Weinberg. Table 8.4 shows the expected Hardy–Weinberg genotype frequencies, which are not significantly different from the observed ones (the chi-square statistic for the Hardy–Weinberg test from Chapter 2 is only 0.90 with 3 degrees of freedom, yielding a p-level of 0.83). Hence, random mating is an appropriate model for this population. Table 8.4 assumes that the genotypic values are unchanged, that is, the same genotypes have the same mean phenotypes, as given in Table 8.3. However, because the genotype frequencies have changed somewhat, the mean μ of total serum cholesterol changed very slightly to 213.06, and, therefore, the genotypic deviations under random mating are slightly different from those given in Table 8.3. These differences serve as a reminder that Fisher’s quantitative genetic framework is applicable only to populations and not individuals: the same mapping of individual genotypes to individual phenotypes (which are the same for Tables 8.3 and 8.4) can result in different quantitative genetic parameters for populations differing in their gene pool or population structure. Under random mating, the probability of an ε2 allele being coupled with another ε2 allele is p2 = 0.0821, of being coupled with an ε3 allele is p3 = 0.7727, and of being coupled with an ε4 allele is p4 = 0.1451. Using the genotypic deviations in Table 8.4 and Eq. (8.10), the average excess of a gamete bearing an ε2 allele is

$$a_2 = 0.0821(-18.60) + 0.7727(-12.03) + 0.1451(-9.32) = -12.14 \text{ mg/dl}$$
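The random-mating shortcut of Eq. (8.10) can be sketched as follows. Allele frequencies are computed from the Table 8.3 counts, Hardy–Weinberg genotype frequencies are implied by summing over ordered allele pairs, and the genotypic values are the rounded table entries. Because of that rounding, the output need not reproduce the tabulated average excesses to the last digit. The helper `value` and the allele labels are hypothetical.

```python
# Unordered genotype counts and genotypic values G_ij (mg/dl), Table 8.3.
counts = {("e2", "e2"): 66, ("e2", "e3"): 1145, ("e2", "e4"): 210,
          ("e3", "e3"): 5399, ("e3", "e4"): 2048, ("e4", "e4"): 185}
G = {("e2", "e2"): 194.46, ("e2", "e3"): 201.03, ("e2", "e4"): 203.74,
     ("e3", "e3"): 213.40, ("e3", "e4"): 219.59, ("e4", "e4"): 223.07}
alleles = ("e2", "e3", "e4")

n = sum(counts.values())
p = {a: sum(c * gt.count(a) for gt, c in counts.items()) / (2 * n)
     for a in alleles}


def value(i, j, table):
    """Look up an unordered genotype regardless of allele order."""
    return table[(i, j)] if (i, j) in table else table[(j, i)]


# Mean under Hardy-Weinberg: ordered pairs (i, j) each have probability p_i * p_j.
mu_hw = sum(p[i] * p[j] * value(i, j, G) for i in alleles for j in alleles)

# Eq. (8.10): average excess is the allele-frequency-weighted genotypic deviation.
a_hw = {i: sum(p[j] * (value(i, j, G) - mu_hw) for j in alleles)
        for i in alleles}

print(round(mu_hw, 2), {i: round(a, 2) for i, a in a_hw.items()})
```

A useful internal check is that the average excesses, weighted by allele frequency, sum to zero, mirroring the zero mean of the genotypic deviations.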

Table 8.4 The genotypes at the ApoE locus found in a sample of 9053 European-Americans, along with their expected Hardy–Weinberg frequencies, and the values of various quantitative genetic parameters for the phenotype of total serum cholesterol under the assumption of random mating.

| Genotype | ε2/ε2 | ε2/ε3 | ε2/ε4 | ε3/ε3 | ε3/ε4 | ε4/ε4 | Sum or Mean |
|---|---|---|---|---|---|---|---|
| H.W. Frequency | 0.0067 | 0.1269 | 0.0238 | 0.5971 | 0.2243 | 0.0211 | 1.000 |
| Gij (mg/dl) | 194.46 | 201.03 | 203.74 | 213.40 | 219.59 | 223.07 | 213.06 |
| gij (mg/dl) | −18.60 | −12.03 | −9.32 | 0.34 | 6.53 | 10.01 | 0.00 |
| gaij (mg/dl) | −24.28 | −11.88 | −6.59 | 0.53 | 5.81 | 11.10 | 0.0 |
| dij (mg/dl) | 5.68 | −0.16 | −2.74 | −0.19 | 0.71 | −1.10 | 0.0 |

| Gametes | ε2 | ε3 | ε4 | Sum or Mean |
|---|---|---|---|---|
| Frequency | 0.0821 | 0.7727 | 0.1451 | 1.000 |
| ai (mg/dl) = αi | −12.14 | 0.26 | 5.55 | 0.0 |

Note: All means are weighted by genotype frequencies or allele frequencies. Source: Data from Maxwell et al. (2013).

The values for the average excesses of the other alleles are also shown in Table 8.4. The differences with the values calculated from the data in Table 8.3 are minor, as expected, given that the actual population is not significantly different from a Hardy–Weinberg population. As before, a gamete


bearing the ε2 allele tends to lower total serum cholesterol relative to μ in those individuals who receive it, a gamete bearing the ε3 allele tends to produce individuals close to the overall population mean, and a gamete bearing the ε4 allele tends to elevate cholesterol levels relative to μ in those individuals who receive it. Fisher also defined a second measure that assigns a phenotype to a haploid gamete. The average effect of gamete type i, αi, is the slope of the least squares regression of the genotypic deviations against the number of gametes of type i borne by a genotype. To calculate the average effect, let

$$Q = \sum_{i} \sum_{j} t_{ij} \left( g_{ij} - \alpha_i - \alpha_j \right)^2 \tag{8.11}$$

then solve for ∂Q/∂αi = 0 simultaneously for every αi. This solution minimizes the value of Eq. (8.11), which in turn means that this solution corresponds to the values of αi and αj that minimize the squared deviation from the genotypic deviation gij weighted by the genotype frequency. Using the criterion of a squared deviation, the average effects are those values assigned to a haploid gamete that best explain the genotypic deviations of diploid genotypes. The average excess has the straightforward biological meaning of being the average genotypic deviation that a specific gamete type is expected to have, but the biological meaning of the average effect is less apparent. To gain insight into the biological meaning of the average effect, consider the model in which the genotype frequencies can be described in terms of allele frequencies and the system of mating inbreeding coefficient f (Chapter 3) or fit (Chapter 6) that incorporates deviations from both random mating and population subdivision. Then, the general multi-allelic solution to the simultaneous equations defined by (8.11) yields (Templeton 1987b):

$$\alpha_i = \frac{a_i}{1 + f} \tag{8.12}$$
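The relationship in Eq. (8.12) can be checked numerically. The sketch below makes several assumptions that go beyond the text: the sum in Q is taken over unordered genotypes, each weighted by its frequency; genotype frequencies follow the standard inbreeding parameterization $t_{ii} = p_i^2 + f p_i(1 - p_i)$ and $t_{ij} = 2p_ip_j(1 - f)$; the allele frequencies and f = 0.1 are illustrative values rather than estimates; and the genotypic values are those of Table 8.3. The function name `average_effects` is hypothetical, and the small linear system is solved by plain Gauss–Jordan elimination to keep the example dependency-free.

```python
def average_effects(t, g, alleles):
    """Minimize Q (Eq. 8.11) over unordered genotypes via the normal equations."""
    k = len(alleles)
    rows = []
    for a in alleles:
        # d Q / d alpha_a = 0, written as sum_b A[a][b] * alpha_b = rhs_a,
        # where gt.count(x) is the number of copies of allele x in a genotype.
        lhs = [sum(t[gt] * gt.count(a) * gt.count(b) for gt in t) for b in alleles]
        rhs = sum(t[gt] * gt.count(a) * g[gt] for gt in t)
        rows.append(lhs + [rhs])
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(k):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return {a: rows[i][k] / rows[i][i] for i, a in enumerate(alleles)}


alleles = ("e2", "e3", "e4")
p = {"e2": 0.08, "e3": 0.77, "e4": 0.15}   # illustrative allele frequencies
f = 0.1                                     # illustrative inbreeding coefficient
G = {("e2", "e2"): 194.46, ("e2", "e3"): 201.03, ("e2", "e4"): 203.74,
     ("e3", "e3"): 213.40, ("e3", "e4"): 219.59, ("e4", "e4"): 223.07}

t = {(i, j): (p[i] ** 2 + f * p[i] * (1 - p[i])) if i == j
     else 2 * p[i] * p[j] * (1 - f)
     for (i, j) in G}
mu = sum(t[gt] * G[gt] for gt in G)
g = {gt: G[gt] - mu for gt in G}

# Average excess from Eq. (8.9), using the conditional frequencies of Eq. (8.8).
a = {i: sum((t[gt] if gt[0] == gt[1] else 0.5 * t[gt]) * g[gt]
            for gt in t if i in gt) / p[i]
     for i in alleles}
alpha = average_effects(t, g, alleles)

for i in alleles:
    print(i, round(a[i], 4), round(alpha[i], 4), round(a[i] / (1 + f), 4))
```

Setting `f = 0` in this sketch recovers the Hardy–Weinberg case, in which the least-squares average effects coincide with the average excesses.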

Equation (8.12) reveals some important biological insights about average excesses and average effects. First, we can see that the relationship between these two measures of phenotype assigned to gametes is affected by system of mating and population subdivision, as both factors influence f (or fit). This sensitivity to population-level factors emphasizes that the phenotypes assigned to gametes depend upon the population context and not just the relationship between genotype and phenotype. Second, we can see that the average excess and average effect always agree in sign. If the average excess for gamete i indicates that it increases the phenotype above the population average, then the average effect will also indicate the same and likewise for lowering the phenotype below the population average. Moreover, the average excess is zero if and only if the average effect is zero. Finally, for the special case of Hardy–Weinberg, f = 0 and the average excess and average effect are identical. Thus, the average effect is assigning a phenotype to a gamete in a manner proportional to and sometimes identical to average excess. So far, we have examined the parameters Fisher used to describe the phenotypes of diploid individuals (genotypic value and genotypic deviations) and to assign phenotypes to haploid gametes (average excesses and effects). However, to address the problem of the transmission of phenotypes from one generation to the next, we need to look forward to the next generation of diploid individuals. In doing so, we will assume a constant “environment” over time. Without this assumption, any change of environments or frequencies of residual factors in general can alter the relationship of genotype to phenotype at the individual level, and thereby change all population-level parameters. 
Because an individual genetically only passes on gametes to the next generation, Fisher argued that an individual’s phenotypic “breeding value” is determined by the individual’s gametes and not


the individual’s actual phenotype. In particular, Fisher defined the breeding value or additive genotypic deviation of individuals with genotype ij, gaij, as the sum of the average effects of the gametes borne by the individual, that is, gaij = αi + αj. Thus, an individual’s breeding value depends solely upon the phenotypic values assigned to the individual’s gametes (which are population-level parameters) and not upon the individual’s actual phenotype. Table 8.4 gives the additive genotypic deviations for the six genotypes at the ApoE locus assuming random mating. To calculate these, all we had to do was add up the average effects (which in this case are the same as the average excesses because of random mating) of the two alleles borne by each genotype. Note that the additive genotypic deviations are not the same as the genotypic deviations. We are not asking about the phenotype of the individual, but, rather, we are looking ahead to the next generation and asking about the average phenotype of that individual’s offspring. The gametes, not the individual’s intact genotype, determine this future phenotypic impact, and, hence, the individual’s “breeding value” is determined by what the individual’s gametes will do in the context of a reproducing population and not that individual’s phenotype per se. Fisher measured the discrepancy between the genotypic deviation and the additive genotypic deviation by the dominance deviation of genotype ij, dij, the difference between the genotype’s genotypic deviation and its additive genotypic deviation; dij = gij − gaij. Note that the dominance deviation is yet another residual term; it is the number that you have to add on to the additive genotypic deviation to get back to the genotypic deviation (gaij + dij = gij). The dominance deviations for the ApoE genotypes under the assumption of random mating are given in Table 8.4. It is unfortunate that Fisher named the genotypic residual term dij the “dominance” deviation. 
Dominance is a word widely used in Mendelian genetics and describes the relationship between a specific set of genotypes to a specific set of phenotypes. Dominance as used in Mendelian genetics is not a population parameter at all. However, the dominance deviation of Fisher is a population-level parameter; it depends upon phenotypic averages across individuals, allele frequencies, and genotype frequencies. It has no meaning outside the context of a specific population. A non-zero dominance deviation does require some deviation from codominance in a Mendelian sense, but the presence of Mendelian dominance (or recessiveness) does not ensure that a non-zero dominance deviation will occur. In this chapter, we have developed only the one locus version of Fisher’s model, but in multilocus versions, there is yet another residual term called the epistatic deviation (which we will examine in Chapter 10). Once again, the population-level residual term of epistatic deviation should never be confused with epistasis as used in Mendelian genetics. Mendelian epistasis is necessary but not sufficient to have an epistatic deviation (Cheverud and Routman 1995). Because the epistatic deviation is the last genetic residual to be calculated, it is often very small, with most of the effects of Mendelian epistasis having already been incorporated into the additive and dominance deviations. We have already seen a hint of this in Table 8.1. In that table, the disease is due 100% to an interaction between the A1 and B1 factors. Suppose these are alleles at two loci in a haploid organism. Then, the disease is due 100% to Mendelian epistasis. However, at the population level, we see only a minor marginal (additive) effect of the common allele A1, but a large marginal (additive) effect of the rare allele B1, as previously shown. Hence, at the population level, the disease is primarily associated with an additive effect of B1 with very little population-level epistasis. 
As this example shows, Mendelian epistasis often contributes to the additive effects at the population level. Unfortunately, this double use of the word “epistasis” has caused and continues to cause great confusion in the literature (Mäki-Tanila and Hill 2014; Álvarez-Castro and Le Rouzic 2015; Huang and Mackay 2016). The bottom line is that the lack of an epistatic deviation per se tells one little to nothing about the amount of Mendelian epistasis affecting the phenotype of interest.
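Returning to the one-locus quantities defined above, the breeding values and dominance deviations can be sketched from the average effects and genotypic deviations printed in Table 8.4. Since the inputs are rounded to two decimals, the results can differ from the tabulated gaij and dij by about 0.01 mg/dl. The allele labels and variable names are hypothetical.

```python
# Average effects (equal to average excesses under random mating), Table 8.4.
alpha = {"e2": -12.14, "e3": 0.26, "e4": 5.55}

# Genotypic deviations g_ij (mg/dl) under Hardy-Weinberg, Table 8.4.
g = {("e2", "e2"): -18.60, ("e2", "e3"): -12.03, ("e2", "e4"): -9.32,
     ("e3", "e3"): 0.34, ("e3", "e4"): 6.53, ("e4", "e4"): 10.01}

# Breeding value (additive genotypic deviation): g_aij = alpha_i + alpha_j.
breeding_value = {gt: alpha[gt[0]] + alpha[gt[1]] for gt in g}

# Dominance deviation: the residual d_ij = g_ij - g_aij.
dominance_dev = {gt: g[gt] - breeding_value[gt] for gt in g}

for gt in g:
    print(gt, round(breeding_value[gt], 2), round(dominance_dev[gt], 2))
```

By construction, adding each dominance deviation back to the corresponding breeding value recovers the genotypic deviation, which is what makes dij a residual term.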


Quantitative Genetic Measures Related to the Variance

A normal phenotypic distribution is completely defined by just two parameters, the mean and the variance. Up to now, we have discussed Fisher’s quantitative genetic parameters that are related to mean phenotypes or deviations from means. A full description of the phenotypic distributions now requires a similar set of parameters that relate to the variance of the phenotype. The most straightforward of these measures is the phenotypic variance that is the variance of the phenotype over all individuals in the population:

$$\sigma_P^2 = \frac{\sum_{ij}\sum_{k}\left(P_{ij,k} - \mu\right)^2}{n} \tag{8.13}$$

For example, the phenotypic variance of the European-American sample is 1524.90 mg²/dl². The phenotypic variance is nothing more than the variance of the phenotypic distribution. Hence, $\mu$ and $\sigma_P^2$ provide a complete description of the overall phenotypic distribution under the assumption of normality. Substituting Eq. (8.6) into (8.13) and noting that the summation over all n individuals is the same as summing over all genotypes and residual factors as weighted by their frequencies, we obtain:

$$\begin{aligned}
\sigma_P^2 &= \sum_{ij}\sum_{k} t_{ij} r_k \left(g_{ij} + e_k\right)^2 \\
&= \sum_{ij}\sum_{k} t_{ij} r_k \left(g_{ij}^2 + 2 g_{ij} e_k + e_k^2\right) \\
&= \sum_{ij} t_{ij} g_{ij}^2 \sum_{k} r_k + 2 \sum_{ij} t_{ij} g_{ij} \sum_{k} r_k e_k + \sum_{ij} t_{ij} \sum_{k} r_k e_k^2 \\
&= \sum_{ij} t_{ij} g_{ij}^2 + \sum_{k} r_k e_k^2
\end{aligned} \tag{8.14}$$

because the $t_{ij}$’s and the $r_k$’s must sum to one as they define probability distributions over genotypes and residual factors, respectively, and because the mean genotypic deviation and the mean environmental deviation are both 0, as shown previously. Equation (8.14) shows that the phenotypic variance can be separated into two components. Recalling that all variances are expected squared deviations from the mean (Appendix B) and that both $g_{ij}$ and $e_k$ have means of zero, we can see that both of the components in Eq. (8.14) are themselves variances. The first component is the genetic variance, $\sigma_g^2$, the variance of the genotypic deviations or equivalently the average genotypic deviation squared:

$$\sigma_g^2 = \sum_{ij} t_{ij} g_{ij}^2 \tag{8.15}$$

Fisher called the second component of Eq. (8.14) the environmental variance, $\sigma_e^2$, the variance of the environmental deviations or equivalently the average environmental deviation squared:

$$\sigma_e^2 = \sum_{k} r_k e_k^2 \tag{8.16}$$
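The algebra in Eq. (8.14) can be checked numerically. The following Python sketch uses a small, purely hypothetical population: three genotype classes and two residual classes whose frequencies and deviations were invented so that both sets of deviations average to zero (none of these numbers come from the text). Under those assumptions, it confirms that the phenotypic variance summed over all genotype-by-residual combinations equals the sum of the two components defined in Eqs. (8.15) and (8.16).

```python
from fractions import Fraction as F

# Hypothetical genotype frequencies t_ij and genotypic deviations g_ij (mean zero).
t = [F(1, 4), F(1, 2), F(1, 4)]
g = [F(-2), F(0), F(2)]

# Hypothetical residual frequencies r_k and environmental deviations e_k (mean zero).
r = [F(1, 2), F(1, 2)]
e = [F(-1), F(1)]

# Both sets of deviations must average to zero for the cross term to vanish.
assert sum(ti * gi for ti, gi in zip(t, g)) == 0
assert sum(rk * ek for rk, ek in zip(r, e)) == 0

# First line of Eq. (8.14): sum over every genotype-by-residual combination.
var_p = sum(ti * rk * (gi + ek) ** 2
            for ti, gi in zip(t, g)
            for rk, ek in zip(r, e))

var_g = sum(ti * gi ** 2 for ti, gi in zip(t, g))  # Eq. (8.15)
var_e = sum(rk * ek ** 2 for rk, ek in zip(r, e))  # Eq. (8.16)

print(var_p, var_g, var_e)  # 3 2 1
```

Exact fractions are used instead of floating-point numbers so that the equality holds without any rounding tolerance.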

Although $\sigma_e^2$ is called the “environmental” variance, it is really the residual variance left over after explaining a portion of the phenotypic variance with the genetic model being used. With these definitions, Eq. (8.14) can now be expressed as:

$$\sigma_P^2 = \sigma_g^2 + \sigma_e^2 \tag{8.17}$$


that is, the total phenotypic variation in the population can be split into a component due to genotypic variation and a component not explained by the genotypic variation being modeled or monitored. We can use Eq. (8.15) to calculate the genetic variance in the European-American sample for the phenotype of total serum cholesterol, assuming Hardy–Weinberg, from the data given in Table 8.4 as:

$$\begin{aligned}
\sigma_g^2 &= (0.0067)(-18.60)^2 + (0.1269)(-12.03)^2 + (0.0238)(-9.32)^2 + (0.5971)(0.34)^2 \\
&\quad + (0.2243)(6.53)^2 + (0.0211)(10.01)^2 \\
&= 34.52\ \mathrm{mg^2/dl^2}
\end{aligned}$$
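As a minimal sketch, the same calculation can be written in a few lines of Python. The six (frequency, deviation) pairs are the ones that appear in the worked calculation above; the function name `genetic_variance` is an illustrative choice, not something defined in the text.

```python
# (Hardy-Weinberg genotype frequency, genotypic deviation in mg/dl) pairs
# taken from the worked calculation in the text (Table 8.4).
terms = [
    (0.0067, -18.60),
    (0.1269, -12.03),
    (0.0238, -9.32),
    (0.5971, 0.34),
    (0.2243, 6.53),
    (0.0211, 10.01),
]


def genetic_variance(pairs):
    """Eq. (8.15): frequency-weighted sum of squared genotypic deviations."""
    return sum(freq * dev ** 2 for freq, dev in pairs)


sigma_g2 = genetic_variance(terms)
print(round(sigma_g2, 2))
```

With the rounded inputs shown here, the result comes out at about 34.5 mg²/dl², slightly below the 34.52 quoted in the text; the small gap presumably reflects rounding of the tabulated frequencies and deviations.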

The residual variance can now be calculated by solving Eq. (8.17) for $\sigma_e^2$ to obtain $\sigma_e^2$ = 1524.90 − 34.52 = 1490.38 mg²/dl². The proportion of the total phenotypic variation that can be explained by genotypic variation is the broad-sense heritability, $h_B^2$, the ratio of the genetic variance to the phenotypic variance:

$$h_B^2 = \frac{\sigma_g^2}{\sigma_P^2} \tag{8.18}$$

For example, in our Hardy–Weinberg European-American sample, the broad-sense heritability of the phenotype of total serum cholesterol is 34.52/1524.90 = 0.023. This means that 2.3% of the phenotypic variance in total serum cholesterol levels in this population can be explained by the genotypic variation at the ApoE locus with respect to the alleles ε2, ε3, and ε4. This may seem like a small amount of the total variance, and indeed it is, but ApoE explains more of the phenotypic variance in total serum cholesterol levels in human populations than any other single locus.

Broad-sense heritability refers to the contribution of genotypic variation to phenotypic variation, and not the contribution of gametic variation to phenotypic variation. Broad-sense heritability is therefore not useful in modeling phenotypic evolution across generations or the transmission of phenotypes from one generation to the next. To address these issues, it is necessary to have variance components that are related to the phenotypic impact of gametes. Fisher created such a measure with the additive genetic variance, $\sigma_a^2$, the variance in the additive genotypic deviations (breeding values) or, equivalently, the average additive genotypic deviation squared:

$$\sigma_a^2 = \sum_{ij} t_{ij} g_{aij}^2 \tag{8.19}$$

For example, using the Hardy–Weinberg frequencies in Table 8.4 for the $t_{ij}$’s and using the additive genotypic deviations in that table, the additive genetic variance (often just called the additive variance) can be calculated to be 33.26 mg²/dl². The proportion of the transmissible phenotypic variation is measured by the narrow-sense heritability (often just called heritability), $h^2$, that represents the ratio of the additive genetic variance to the phenotypic variance:

$$h^2 = \frac{\sigma_a^2}{\sigma_P^2} \tag{8.20}$$

Returning to the Hardy–Weinberg European-American sample given in Table 8.4, the heritability of total serum cholesterol is 33.26/1524.90 = 0.022. As we saw above, genetic variation at the ApoE locus for the alleles ε2, ε3, and ε4 could account for 2.3% of the phenotypic variation. The heritability tells us that the transmission of the gametes at this locus can explain 2.2% of the phenotypic variance in the next generation.
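The arithmetic behind these variance ratios is simple enough to sketch in Python. The function below is an illustrative helper (its name and signature are assumptions, not part of Fisher's model); the three input variances are the values quoted in the text for total serum cholesterol.

```python
def heritabilities(var_p, var_g, var_a):
    """Return (residual variance, broad-sense h2, narrow-sense h2)."""
    var_e = var_p - var_g      # Eq. (8.17) solved for the residual variance
    h2_broad = var_g / var_p   # Eq. (8.18)
    h2_narrow = var_a / var_p  # Eq. (8.20)
    return var_e, h2_broad, h2_narrow


# Phenotypic, genetic, and additive variances quoted in the text (mg^2/dl^2).
var_e, h2_broad, h2_narrow = heritabilities(1524.90, 34.52, 33.26)
print(round(var_e, 2), round(h2_broad, 3), round(h2_narrow, 3))  # 1490.38 0.023 0.022
```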


The additive genetic variance is always less than or equal to the genetic variance, so as Fisher did with his mean-related measures, he defined the dominance variance to be the residual genetic variance left over after subtracting off the additive genetic variance, that is:

$$\sigma_d^2 = \sigma_g^2 - \sigma_a^2 \tag{8.21}$$

For example, the dominance variance for the Hardy–Weinberg European-American sample in Table 8.4 is 34.52 − 33.26 = 1.26 mg²/dl². Similarly, when multi-locus models are used, there is another residual variance term called the epistatic variance. As with the mean-related measures, the dominance and epistatic variances refer to residual variance components in a population and should not be confused with dominance and epistasis as used in Mendelian genetics.

We now have the basic parameters and definitions of Fisher’s model of quantitative genetics. In the next two chapters, we will see how to apply these concepts to two major cases: the first when the gene loci underlying the quantitative variation are not measured and the second when genetic variation at causative loci or at loci linked to causative loci is being measured.


9 Quantitative Genetics: Unmeasured Genotypes

In the previous chapter, the basic definitions of Fisher’s quantitative genetic model were introduced through the worked example of how genetic variation at the ApoE locus influences the phenotype of total serum cholesterol level in a human population. Because the ApoE genotypes of the individuals in this population were known, the average phenotype of a genotype was easy to estimate since all we had to do was take the average cholesterol values of those individuals sharing a common genotype. Likewise, all the other parameters of Fisher’s model could easily be estimated in this case by making use of the data on ApoE genotypes. However, when Fisher first developed his theory, very few genes were identifiable in most species, and the genes underlying most quantitative traits were unknown. As a result, in the vast majority of cases, it was impossible to measure the genotypes associated with the phenotypes of interest. Modern genetics gives us the ability to measure many genotypes related to quantitative traits, as already illustrated by the ApoE example in the previous chapter. However, even today, the genotypes underlying most quantitative traits in most species are not known or measurable. Therefore, the methods used in the previous chapter to estimate the parameters of Fisher’s model cannot be used if we cannot assign an individual to a specific genotype category. Fisher realized that to make his model useful and practical, it would be necessary to estimate at least some of the parameters even when no genotypes were known. Fisher therefore provided statistical methods for estimating the parameters of a genetic model when no direct genetic information is available. This is a remarkable accomplishment that made his model of immediate practical utility in medicine and agriculture, a utility that is enhanced when we can measure the genotypes, as will be shown in the next chapter. How can we do genetics when we do not measure genotypes? 
Fisher and others came up with three basic approaches:

• Correlation between relatives
• The response to selection
• Controlled crosses

All of these approaches require some sort of information about the genetic relationships among individuals. In the first approach, we must gather information about the pedigree relationships between individuals whose phenotypes are being scored, although genetic markers now allow a direct estimation of relationship in the absence of pedigree information (Chapter 3). In the second approach, we must measure the phenotypes of the individuals from a population being selected for some trait values and then measure the phenotypes in their offspring. With the last approach, we make crosses, such as the classical Mendelian crosses of F1, F2, and backcrosses. In every case, information is being generated about the genetic relationships among a group or groups of individuals. Each method uses this information about genetic relationships as a proxy for the unmeasured genotypes. We will now examine how this is accomplished for these three major unmeasured genotype approaches.

Population Genetics and Microevolutionary Theory, Second Edition. Alan R. Templeton. © 2021 John Wiley & Sons, Inc. Published 2021 by John Wiley & Sons, Inc. Companion website: www.wiley.com/go/templeton/populationgenetics2

Correlation Between Relatives

Fisher (1918) used the correlation among relatives as his primary tool to calculate quantitative genetic variances and related parameters. To see how he did this, we first need to introduce the idea of a phenotypic covariance that measures the extent to which a phenotype in one individual deviates from the overall population mean in the same direction and magnitude as the phenotype of a second individual. Suppose that we measure the phenotypes of paired individuals, say X and Y, such that $x_i$ and $y_i$ are the phenotypes for a specific pair, i. We saw in Chapter 8 that the mean, $\mu$, measures the central tendency of these phenotypes and that the variance, $\sigma^2$, measures how likely they are to deviate from the mean. When we measure the phenotype as a sample of paired observations ($x_i$, $y_i$) with i = 1, …, n where n is the number of pairs sampled, we can also measure how the phenotype “covaries” between individuals. For example, if $x_i$ is a phenotypic trait value larger than average, does this give you any information about $y_i$? The extent of the tendency of the paired observations to covary is measured by the covariance of x and y (Appendix B), that is, the expected (average) value of the product of the difference between the phenotype of individual $x_i$ and the mean phenotypic value of the population from which individual $x_i$ is sampled times the difference between the phenotype of individual $y_i$ from the mean value of the population from which individual $y_i$ is sampled:

$$\mathrm{Cov}(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \mu_x)(y_i - \mu_y)}{n} \tag{9.1}$$

where Cov(X,Y) is the covariance between the paired variables X and Y. Equation (9.1) measures how the phenotypes of X and Y covary. For example, suppose that there is a tendency for X and Y to have similar values relative to their respective means. This implies that if X is larger than its mean, then Y also tends to be larger than its mean. Thus, both ($x_i − \mu_x$) and ($y_i − \mu_y$) tend to be positive together, so their product is likewise positive. X and Y covarying together also implies that if X is smaller than its mean, then Y does likewise. In that case, both ($x_i − \mu_x$) and ($y_i − \mu_y$) tend to be negative together, so their product still tends to be positive. When such a trend exists, positive terms dominate in averaging the products, resulting in an overall positive covariance. Thus, a positive covariance means that the traits of the two individuals tend to deviate from their respective means in the same direction.

In contrast, a negative covariance indicates that as one individual’s phenotype deviates in one direction from its mean, then the other individual’s phenotype tends to deviate in the other direction with respect to its mean. This results in many products between a positive number and a negative number, yielding a negative product and an overall negative average. When the covariance is zero, positive and negative products are both equally likely and cancel one another out on the average. This means that there is no association between the trait of X and the trait of Y in the populations being studied.

There is one special case of covariance that will be useful in subsequent calculations, namely that the covariance of a variable with itself is simply the variance of the variable. This is shown by:

$$\mathrm{Cov}(X, X) = \frac{\sum_{i=1}^{n} (x_i - \mu_x)(x_i - \mu_x)}{n} = \frac{\sum_{i=1}^{n} (x_i - \mu_x)^2}{n} = \sigma^2 \tag{9.2}$$
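The behavior of the covariance described above can be illustrated with a short Python sketch of Eq. (9.1). The paired values are hypothetical numbers invented for the example, and the helper names are illustrative only. Note that the divisor is n, as in Eq. (9.1), rather than the n − 1 used by many statistical libraries for sample covariances.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Eq. (9.1): average product of paired deviations (divisor n, not n - 1)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must be paired")
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical paired phenotypes, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_same = [2.0, 4.0, 6.0, 8.0, 10.0]      # deviates in the same direction as x
y_opposite = [10.0, 8.0, 6.0, 4.0, 2.0]  # deviates in the opposite direction

print(covariance(x, y_same))      # 4.0
print(covariance(x, y_opposite))  # -4.0
print(covariance(x, x))           # 2.0, the variance of x (Eq. 9.2)
```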


Fisher (1918) related his genetic model to covariances by letting X be the phenotype of one individual and Y that of a second (often related) individual. He then used principles of Mendelian genetics to predict how these phenotypes should covary within classes of paired individuals defined by a particular type of genetic relatedness. Let us first consider a sample of paired individuals from a population. Using the model given in Chapter 8 (Eq. 8.6), we can describe the phenotype of an individual from the population as follows:

$$P_i = \mu + g_{ai} + g_{di} + e_i \tag{9.3}$$

where $P_i$ is the phenotype of individual i, $\mu$ is the mean phenotype in the population for all individuals, $g_{ai}$ is the additive genotypic deviation for individual i, $g_{di}$ is the dominance deviation for individual i, and $e_i$ is the environmental deviation for individual i (we have dropped the multiple subscript convention used in Chapter 8 that indexes both genotypes and environments because we are now assuming we cannot measure the genotypes, only the phenotypes of individuals). We can now express the covariance of the phenotypes of a pair of individuals from the population with mean $\mu$, say $x_i$ and $y_i$, as:

$$\begin{aligned}
\mathrm{Cov}(P_{xi}, P_{yi}) &= E\left[(P_{xi} - \mu)(P_{yi} - \mu)\right] \\
&= E\left[(g_{axi} + g_{dxi} + e_{xi})(g_{ayi} + g_{dyi} + e_{yi})\right] \\
&= E\left[g_{axi} g_{ayi} + g_{dxi} g_{ayi} + e_{xi} g_{ayi} + g_{axi} g_{dyi} + g_{dxi} g_{dyi} + e_{xi} g_{dyi} + g_{axi} e_{yi} + g_{dxi} e_{yi} + e_{xi} e_{yi}\right]
\end{aligned} \tag{9.4}$$

Recall from Chapter 8 that Fisher defined his environmental deviation as a residual term relative to the genotypic deviation and similarly defined the dominance deviation as a residual term relative to the additive genotypic deviation. Mathematically, this means that all the unlike deviation parameters in Eq. (9.4) were defined in such a way as to have a covariance of zero. Thus, all the cross-product terms of unlike deviations must average to zero by definition. Hence, Eq. (9.4) simplifies to:

$$\mathrm{Cov}(P_{xi}, P_{yi}) = E\left[g_{axi} g_{ayi}\right] + E\left[g_{dxi} g_{dyi}\right] + E\left[e_{xi} e_{yi}\right] \tag{9.5}$$

Because all of these deviation parameters by definition have a mean of zero, Eq. (9.5) can also be expressed as:

$$\mathrm{Cov}(P_{xi}, P_{yi}) = \mathrm{Cov}(g_{axi}, g_{ayi}) + \mathrm{Cov}(g_{dxi}, g_{dyi}) + \mathrm{Cov}(e_{xi}, e_{yi}) \tag{9.6}$$

Fisher assumed that each individual experienced an independent “environment” (that is, residual factors), so the last term in Eq. (9.6) is zero by assumption. This is a critical assumption because it means that the covariance in phenotypes among individuals is only a function of covariances among genetic contributors to the phenotype, that is, the first two terms in Eq. (9.6). The assumption that different individuals have environmental deviations with zero covariance can be violated, and when that happens, the simple Fisherian model being presented here is no longer valid. When correlations exist in the environmental deviations of different individuals, either more complicated models are needed that take into account the correlations (see Lynch and Walsh 1998) or special sampling designs are needed to eliminate or adjust for such correlations. For example, one class of paired individuals frequently used in human genetic studies of quantitative traits is identical twins. Such paired individuals share their entire genotype, but they also frequently share many common environmental factors since much of the human environment is defined by our families, schooling, etc., which are typically also shared by identical twins. To eliminate this environmental covariance, some studies go to great expense and labor to find identical twins who were separated shortly after birth and reared apart, in the hope that this will eliminate or at least reduce the environmental covariance. However, even in these studies, significant covariance can still exist for some phenotypes because of other shared environmental features, such as sharing the same womb before birth (Devlin et al. 1997). When a positive covariance exists between the environmental deviation terms among related individuals, the simple Fisherian model will inflate the genetic component of phenotypic variation. This is a serious error that can lead to spurious biological conclusions, but, for the purposes of this book, we will assume the ideal case of no covariance between the environmental deviations of any two individuals in the population.

Now, we will consider specific classes of pairs of individuals. First of all, consider the class of mating pairs drawn from a random mating population. Random mating means that there is no covariance between the phenotypes of the mating individuals, so Eq. (9.6) must be zero in this case. Moreover, random mating means that the mating individuals do not share any more genes in common than expected by chance alone, so the covariance of all the genetically related parameters of the mating individuals (the first two terms of Eq. 9.6) must also be zero. Hence, under random mating, the covariance between parental phenotypes from mating pairs is zero because the covariances between all the terms in Eq. (9.6) are also zero.

Now consider the covariance between a parent (indicated by the subscript p) and an offspring (indicated by the subscript o). For this pair, Eq. (9.6) becomes (with the assumption of no covariance between environmental deviations):

$$\mathrm{Cov}(P_p, P_o) = \mathrm{Cov}(g_{ap}, g_{ao}) + \mathrm{Cov}(g_{dp}, g_{do}) \tag{9.7}$$

Recall from Chapter 8 that the additive genotypic deviations are the only aspects of phenotype that can be passed on from parent to offspring through a gamete. Because the dominance deviation is the residual genetic component that cannot be passed on from generation to generation, the covariance between the dominance deviations of parent and offspring must be zero by definition. Therefore, a phenotypic correlation between parent and offspring is attributed solely to the additive genotypic component of Eq. (9.7), Cov($g_{ap}$, $g_{ao}$). However, a parent passes on only one gamete to an offspring. The other gamete comes from the other parent, and it is assumed to be unknown. Recall that the additive genotypic deviation of an individual is the sum of the average effects of two gametes. However, because the genotypes are not measured, we cannot split the additive genotypic deviation of the parent into two gametic components, as we did in Chapter 8. However, we do know under Mendelian genetics that one half of the genes of the parent are passed on to the offspring. Therefore, under the supposition of Mendelian inheritance, our best estimate of the average effect of the gamete passed on to the offspring o from parent p is $\alpha_p = \tfrac{1}{2} g_{ap}$. Of course, the offspring received another gamete from the other parent, say parent m, and let the average effect of this gamete be $\alpha_m$. Therefore, we have that the additive genotypic deviation of the offspring is the sum of the two average effects, namely:

$$g_{ao} = \tfrac{1}{2} g_{ap} + \alpha_m \tag{9.8}$$

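Substituting Eq. (9.8) into (9.7) and recalling that Cov($g_{dp}$, $g_{do}$) = 0 by definition, we have:

$$\mathrm{Cov}(g_{ap}, g_{ao}) = \mathrm{Cov}\left(g_{ap}, \tfrac{1}{2} g_{ap} + \alpha_m\right) = \mathrm{Cov}\left(g_{ap}, \tfrac{1}{2} g_{ap}\right) + \mathrm{Cov}(g_{ap}, \alpha_m) \tag{9.9}$$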


Under random mating, there is no covariance between the average effects of gametes from one parent with that of the other. Therefore, in a random mating population, Cov($g_{ap}$, $\alpha_m$) = 0, and Eq. (9.9) simplifies to:

$$\mathrm{Cov}(g_{ap}, g_{ao}) = \mathrm{Cov}\left(g_{ap}, \tfrac{1}{2} g_{ap}\right) = \tfrac{1}{2}\mathrm{Cov}(g_{ap}, g_{ap}) = \tfrac{1}{2}\mathrm{Var}(g_{ap}) = \tfrac{1}{2}\sigma_a^2 \tag{9.10}$$

since the covariance of any variable with itself is the variance of the variable (Eq. 9.2) and the variance of $g_{ap}$ is by definition the additive genetic variance, $\sigma_a^2$ (Eq. 8.19). With Eq. (9.10), Fisher has managed to relate a fundamental quantitative genetic parameter, the additive genetic variance, to a phenotypic covariance that can be calculated simply by measuring the phenotypes of parents and their offspring. No actual genotypes need to be measured to estimate this fundamental quantitative genetic parameter.

Fisher also related other genetic parameters to the correlation between the phenotypes of classes of individuals. A correlation is a covariance that has been standardized by the product of the standard deviations of the two variables being examined (Appendix B). In particular, the correlation between X and Y is:

$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X) \times \mathrm{Var}(Y)}} \tag{9.11}$$
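Equation (9.10) can be checked by exact enumeration in a simple case. The Python sketch below assumes a single locus with two alleles in a random mating (Hardy–Weinberg) population and, to keep the bookkeeping small, lets the phenotype equal the genotypic value with no environmental deviation. The allele frequency and genotypic values are hypothetical, and the function name is illustrative. The additive variance is computed from the average effects of the two alleles, which equal their average excesses under random mating.

```python
def parent_offspring_check(p, values):
    """Exact one-locus check of Eq. (9.10) under random mating.

    p      -- frequency of allele A (the other allele, a, has frequency 1 - p)
    values -- genotypic values indexed by the number of A alleles (0, 1, 2)
    Returns (parent-offspring covariance, additive genetic variance).
    """
    q = 1.0 - p
    freq = {2: p * p, 1: 2 * p * q, 0: q * q}  # Hardy-Weinberg frequencies
    mu = sum(freq[k] * values[k] for k in freq)
    transmit_a = {2: 1.0, 1: 0.5, 0: 0.0}      # P(parent passes on allele A)

    cov = 0.0
    for parent, f_parent in freq.items():
        for from_parent, pr1 in ((1, transmit_a[parent]), (0, 1.0 - transmit_a[parent])):
            for from_mate, pr2 in ((1, p), (0, q)):  # random gamete from the mate
                child = from_parent + from_mate
                cov += (f_parent * pr1 * pr2
                        * (values[parent] - mu) * (values[child] - mu))

    # Average effects of the two alleles under random mating.
    alpha_big = p * values[2] + q * values[1] - mu
    alpha_small = p * values[1] + q * values[0] - mu
    var_a = 2.0 * (p * alpha_big ** 2 + q * alpha_small ** 2)
    return cov, var_a


# Hypothetical allele frequency and genotypic values (with some dominance).
cov_po, var_a = parent_offspring_check(0.3, {2: 10.0, 1: 8.0, 0: 0.0})
print(cov_po, 0.5 * var_a)
```

Both printed numbers agree (about 8.07 for these inputs) even though the heterozygote is not intermediate, which illustrates that the dominance deviations do not contribute to the parent-offspring covariance.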

The correlation coefficient measures associations among variables on a −1 to +1 scale, regardless of the original scale of variation. Substituting Eq. (9.10) into the numerator of Eq. (9.11), we have that the correlation between the phenotypes of parents and offspring is:

$$\rho_{po} = \frac{\tfrac{1}{2}\sigma_a^2}{\sqrt{\mathrm{Var}(P_p) \times \mathrm{Var}(P_o)}} \tag{9.12}$$

Assuming that the phenotypic variance, $\sigma_P^2$, is constant for the parental and offspring generations, then Var($P_p$) = Var($P_o$) = $\sigma_P^2$, and Eq. (9.12) becomes:

$$\rho_{po} = \frac{\tfrac{1}{2}\sigma_a^2}{\sigma_P^2} = \tfrac{1}{2} h^2 \tag{9.13}$$

where $h^2$ is the heritability of the trait (Eq. 8.20). In this manner, Fisher could estimate the important quantitative genetic parameter of heritability as twice the observable phenotypic correlation between parents and offspring. Note that Eq. (9.13) allows us to estimate that portion of the phenotypic variance that is due to genotypic variation that can be transmitted to the next generation through gametes even though not a single genotype contributing to the phenotypic variance is being measured or is even known.

Fisher also refined this estimate based upon parent/offspring correlations. Any offspring has two parents, and Eq. (9.13) makes use of information only on one parent. If the phenotype of both parents is known, then we can calculate the average phenotype of the two parents, known as the midparent value, as $\tfrac{1}{2} P_m + \tfrac{1}{2} P_f$ where the subscript m denotes the phenotype of the mother and f the phenotype of the father. When we have information about both parents, Eq. (9.8) becomes:

$$g_{ao} = \tfrac{1}{2} g_{am} + \tfrac{1}{2} g_{af} \tag{9.14}$$


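The covariance between the midparent value and the offspring is therefore:

$$\begin{aligned}
\mathrm{Cov}(\text{midparent}, \text{offspring}) &= \mathrm{Cov}\left(\tfrac{1}{2} g_{am} + \tfrac{1}{2} g_{af},\ \tfrac{1}{2} g_{am} + \tfrac{1}{2} g_{af}\right) \\
&= \tfrac{1}{4}\mathrm{Var}(g_{am}) + \tfrac{1}{4}\mathrm{Var}(g_{af}) \\
&= \tfrac{1}{2}\sigma_a^2
\end{aligned} \tag{9.15}$$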

This is the same result as when we used just one parent (Eq. 9.10). However, the total phenotypic variance of the midparents is only half the original phenotypic variance because it is a variance of an average of two values. Therefore,

$$\rho_{po} = \frac{\tfrac{1}{2}\sigma_a^2}{\sqrt{\tfrac{1}{2}\sigma_P^2 \times \sigma_P^2}} = \frac{\tfrac{1}{2}\sigma_a^2}{\sqrt{\tfrac{1}{2}}\,\sigma_P^2} = \sqrt{\tfrac{1}{2}}\,h^2 \tag{9.16}$$
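The halving of the phenotypic variance for midparents follows from the standard rule for the variance of a weighted sum. As a short worked step, assuming (as in the random mating case considered earlier) that the phenotypes of the mother and father have zero covariance and share the same variance $\sigma_P^2$:

```latex
\mathrm{Var}\left(\tfrac{1}{2} P_m + \tfrac{1}{2} P_f\right)
  = \tfrac{1}{4}\mathrm{Var}(P_m) + \tfrac{1}{4}\mathrm{Var}(P_f)
    + \tfrac{1}{2}\mathrm{Cov}(P_m, P_f)
  = \tfrac{1}{4}\sigma_P^2 + \tfrac{1}{4}\sigma_P^2 + 0
  = \tfrac{1}{2}\sigma_P^2
```

If mates resembled each other phenotypically, the covariance term would not vanish and the midparent variance would differ from one half of $\sigma_P^2$.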

Another statistical method for measuring the association between the phenotypes in the parental generation with the phenotypes in the offspring generation is the least-squares regression coefficient (Appendix B) of offspring phenotype on midparent value. Fisher showed that in general the least-squares regression coefficient between X and Y is related to the correlation coefficient between X and Y, as follows:

$$b_{YX} = \rho_{XY}\sqrt{\frac{\sigma_Y^2}{\sigma_X^2}} \tag{9.17}$$

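where $b_{YX}$ is the regression coefficient of Y on X. Putting Eq. (9.16) into (9.17), we have:

$$b_{op} = \rho_{po}\sqrt{\frac{\sigma_o^2}{\tfrac{1}{2}\sigma_P^2}} = \sqrt{\tfrac{1}{2}}\,h^2\sqrt{\frac{1}{\tfrac{1}{2}}} = h^2 \tag{9.18}$$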

Hence, another way of estimating the narrow-sense heritability of a trait is to calculate the regression coefficient of offspring phenotype on midparent value, as illustrated in Figure 9.1.

Figure 9.1 A hypothetical example of a regression of the offspring phenotype upon the midparent value. The slope of the resulting regression line is the heritability of the trait, $h^2$. [Plot not reproduced; axes: Midparent Phenotype and Offspring Phenotype; the regression line is labeled Slope = $h^2$.]
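A regression of this kind is straightforward to compute. The Python sketch below estimates the least-squares slope of offspring phenotype on midparent value as Cov(X, Y)/Var(X), which is equivalent to Eq. (9.17). The family data are hypothetical numbers invented for illustration, and the function name is an assumption rather than anything defined in the text.

```python
def regression_slope(xs, ys):
    """Least-squares slope of ys on xs: Cov(X, Y) / Var(X)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical family data (arbitrary units), invented for illustration only.
mothers = [0.0, 3.0, 2.0, 5.0, 4.0]
fathers = [2.0, 1.0, 4.0, 3.0, 6.0]
offspring = [2.0, 2.5, 3.5, 4.0, 5.5]

# Midparent value: the average of the two parental phenotypes.
midparents = [0.5 * m + 0.5 * f for m, f in zip(mothers, fathers)]

h2_estimate = regression_slope(midparents, offspring)
print(h2_estimate)  # 0.85
```

In a real study the estimate would also need a standard error, and the interpretation of the slope as $h^2$ rests on the assumptions stated above, in particular random mating and no environmental covariance between parents and offspring.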


The above equations show how data on the phenotypes of parents and offspring, with no genotypic data, can be used to estimate the additive genetic variance and the heritability of the trait. This makes sense because the phenotypes of a parent and offspring are genetically linked only by what is transmissible through a gamete, and that is exactly what the additive genetic variance and heritability were designed to measure.

Fisher also showed that estimates of the other genotypic components of variance are possible by looking at the phenotypic correlation of other types of genetic relatives. For example, consider the covariance between full-siblings (two individuals with the same mother and father). Full-siblings receive half of their genes from the common mother and half from the common father. Accordingly, Eq. (9.14) is applicable to both of the full-siblings, and likewise Eq. (9.15) describes their covariance for their additive genotypic deviations, which is $\tfrac{1}{2}\sigma_a^2$. Recall from Chapter 8 that the motivation for developing the additive genotypic deviation was the fact that parents pass on to their offspring only a haploid gamete and not a diploid genotype. The dominance deviation was defined as that residual of the genotypic deviation that is left over after taking out the effects transmissible through a haploid gamete. However, full-siblings share not only half of their genes, but they also share some of the same genotypes. In particular, under Mendelian inheritance, full-siblings share exactly the same genotype at one quarter of their autosomal loci, that is, both siblings receive identical alleles from both parents with probability (1/2)(1/2) = 1/4. Therefore, full-siblings share one quarter of their dominance deviations, the part of the residual genotypic deviations due to single locus diploid genotypes. Hence, Cov($g_{ds1}$, $g_{ds2}$) = $\tfrac{1}{4}\sigma_d^2$ where s1 and s2 index a pair of full-siblings.

Moreover, full-siblings will share some multi-locus genotypes as well and therefore some of the epistatic deviations. This will not be explicitly modeled here (for a fuller treatment, see Lynch and Walsh 1998). Therefore, Eq. (9.7) when applied to full-sibling pairs becomes:

$$\mathrm{Cov}(\text{full siblings}) = \tfrac{1}{2}\sigma_a^2 + \tfrac{1}{4}\sigma_d^2 \tag{9.19}$$
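The coefficients of one half and one quarter in Eq. (9.19) come from simple Mendelian counting, which can be verified by enumeration. The Python sketch below labels the four parental alleles at one autosomal locus as distinct (the labels are arbitrary placeholders), lists every equally likely pair of full-siblings, and counts how often the siblings inherit the same alleles.

```python
from fractions import Fraction
from itertools import product

# Label the four parental alleles at one autosomal locus as distinct.
maternal = ("m1", "m2")
paternal = ("f1", "f2")

# Each offspring gets one maternal and one paternal allele: 4 equally likely genotypes.
offspring = list(product(maternal, paternal))

# All 16 equally likely ordered pairs of full-siblings.
sib_pairs = list(product(offspring, repeat=2))

# Pairs that inherited the same allele from both parents (identical genotype by descent).
identical = sum(1 for s1, s2 in sib_pairs if s1 == s2)
p_identical = Fraction(identical, len(sib_pairs))

# Average proportion of the two alleles that the siblings share by descent.
shared = sum((s1[0] == s2[0]) + (s1[1] == s2[1]) for s1, s2 in sib_pairs)
mean_shared = Fraction(shared, 2 * len(sib_pairs))

print(p_identical, mean_shared)  # 1/4 1/2
```

The value 1/4 is the weight on the dominance variance, and 1/2 is the expected share of alleles in common that underlies the weight on the additive variance.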

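Dividing Eq. (9.19) by the phenotypic variance yields the phenotypic correlation between full-siblings:

$$\rho_{s1,s2} = \frac{\mathrm{Cov}(\text{full siblings})}{\sigma_P^2} = \frac{\tfrac{1}{2}\sigma_a^2 + \tfrac{1}{4}\sigma_d^2}{\sigma_P^2} = \tfrac{1}{2} h^2 + \tfrac{1}{4}\frac{\sigma_d^2}{\sigma_P^2} \tag{9.20}$$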
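Taken together, Eqs. (9.13) and (9.20) give a simple recipe for turning two observable correlations into estimates of variance fractions. The following Python sketch is one way to write that recipe; the function name and return order are illustrative choices. The example inputs, 0.237 and 0.333, are the parent-offspring and full-sibling correlations for systolic blood pressure reported by Miall and Oldham (1963). Like the equations it is based on, the sketch ignores epistatic deviations and any shared environment.

```python
def partition_from_correlations(r_parent_offspring, r_full_sibs):
    """Estimate variance fractions from two phenotypic correlations.

    Uses Eq. (9.13), rho_po = h2 / 2, and Eq. (9.20),
    rho_sibs = h2 / 2 + (sigma_d^2 / sigma_P^2) / 4.
    """
    h2 = 2.0 * r_parent_offspring
    dominance_fraction = 4.0 * (r_full_sibs - r_parent_offspring)
    h2_broad = h2 + dominance_fraction
    residual_fraction = 1.0 - h2_broad
    return h2, dominance_fraction, h2_broad, residual_fraction


# Correlations for systolic blood pressure (Miall and Oldham 1963).
result = partition_from_correlations(0.237, 0.333)
print([round(v, 3) for v in result])  # [0.474, 0.384, 0.858, 0.142]
```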

Recall from Eq. (9.13) that the phenotypic correlation between a parent and offspring is half the heritability. Therefore, if $\sigma_d^2 > 0$, then the phenotypic correlation between full-siblings is expected to be greater than that of a parent and offspring under Mendelian inheritance. For example, consider the phenotype of systolic blood pressure in humans (Miall and Oldham 1963). The correlation between parent and offspring in one human population was measured to be 0.237. This implies that the heritability of systolic blood pressure in this population is $h^2$ = 2 × (0.237) = 0.474. In this same population, the phenotypic correlation between full-siblings is 0.333. From Eq. (9.20), the difference between the phenotypic correlation of full-siblings from that of parent and offspring should be $\tfrac{1}{4}(\sigma_d^2/\sigma_P^2)$. Therefore, we can estimate the portion of the total phenotypic variance that is due to dominance deviations as $\sigma_d^2/\sigma_P^2$ = 4 × (0.333 − 0.237) = 0.384. Also, note that adding $\sigma_d^2/\sigma_P^2$ to the heritability provides an estimate of the broad-sense heritability, which in this case is $h_B^2$ = 0.474 + 0.384 = 0.858. Thus, 85.8% of the variation in systolic blood pressure is attributable to genotypic differences in this population, with 14.2% of the phenotypic variance due to “environmental” deviations. Moreover, the genetic proportion in turn can be split into an additive part that is transmissible from parent to offspring (47.4%) and a nontransmissible portion of 38.4%. Thus, by studying more than one class of relatives, a fuller description of the contribution of genetic variation to phenotypic variation is possible even though not a single genotype is actually measured or known.

The unmeasured approach using phenotypic correlations can be extended to many other classes of relatives and can accommodate some shared environmental influences, such as shared nongenetic maternal effects (Lynch and Walsh 1998). This approach can therefore provide much insight into the genetic contributions to phenotypic variation for many traits in any species for which it is possible to know the genetic relatedness among individuals.

The key requirement to this approach for quantitative genetic analysis is to have information about the degree of relatedness among a sample of individuals. Traditionally, this information came from pedigree data, which has its own shortcomings of variable and shallow pedigree depths and the assumption that all base individuals in pedigrees are unrelated (Chapter 3). Moreover, the requirement for pedigree information limited the relatedness approach to the few populations for which such information was available, such as humans and domestic species, or laboratory organisms for which breeding could be controlled. Ritland (1996) proposed that molecular markers could be used to measure relatedness in natural populations, thereby allowing quantitative genetic studies to be applicable to a much broader range of populations. However, the number of molecular markers available at the time of Ritland’s proposal precluded accurate estimators of relatedness and thereby led to poor quantitative genetic estimators (Gienapp et al. 2017). As discussed in Chapter 3, modern molecular genetic technologies have eliminated this problem, and it is now possible to perform quantitative genetic analysis through relatedness with virtually any species and in natural populations. The molecular relatedness measures discussed in Chapter 3 are designed to estimate the proportion of the genomes shared by two individuals, ideally due to identity-by-descent.
Accordingly, regressions based on these measures of relatedness are sensitive only to the additive genetic component of quantitative inheritance (Gienapp et al. 2017). As mentioned above, nonadditive dominance variance requires some deviation from strict Mendelian codominance and requires sharing of genotypes and not just alleles or identity-by-descent segments. This genotype information can also be captured by constructing a dominance relationship measure that depends upon genotype data (Muñoz et al. 2014). Having these two relationship measures also allows the estimation of dominance variance and of some epistatic components, such as additive-by-additive interactions, dominance-by-dominance interactions, and additive-by-dominance interactions. Nonadditive genetic variances are often difficult to estimate from pedigree data, and a more accurate separation of the additive and nonadditive components is often possible with marker-inferred relationships than with pedigree data (Gienapp et al. 2017). For example, Table 9.1 shows the partitioning of additive and nonadditive components for the phenotype of height in the tree Pinus taeda for which both pedigree and marker data were available (Muñoz et al. 2014). The goodness of fit of each model to the data is indicated by the Akaike Information Criterion (AIC, see Appendix B), with smaller values indicating better fits. As can be seen from Table 9.1, all analyses identified about the same level of broad-sense heritability, showing that the partitioning of the phenotypic variance into genetic and “environmental” components was about the same for both types of relatedness data. However, the pedigree analyses consistently estimated greater additive effects (narrow-sense heritability) but less dominance variance and no epistatic variance, whereas the marker analyses allocated much more of the genetic variance to the nonadditive components, and particularly to epistasis. 
The marker analyses in every case fit the data better than the corresponding pedigree analysis, with the overall best fitting model being the marker analysis with dominance × dominance epistatic variance (Table 9.1). In the next chapter, we will discuss how genomic markers can be used to identify specific loci or regions of the genome that contribute to phenotypic variation (genome-wide association studies, or GWAS). At first glance, this may seem a better use of genomic markers than simply using them as


Table 9.1 Estimates of genetic parameters and goodness of fit through Akaike Information Criterion (AIC) for several genetic analysis models applied to height data from the tree Pinus taeda.

Model    h²      σ²d/σ²P   σ²A×A/σ²P   σ²D×D/σ²P   σ²A×D/σ²P   h²B     AIC
P_A/A    0.233   0.055     0.000       -           -           0.288   2605.66
M_A/A    0.088   0.023     0.154       -           -           0.264   2603.06
P_D/D    0.228   0.058     -           0.000       -           0.286   2603.80
M_D/D    0.139   0.009     -           0.121       -           0.269   2600.38
P_A/D    0.231   0.056     -           -           0.000       0.288   2604.76
M_A/D    0.125   0.006     -           -           0.135       0.266   2601.08

Source: Data from Muñoz et al. (2014). Note: The models are P_A/A (pedigree analysis including additive-by-additive epistatic variance), M_A/A (marker analysis including additive-by-additive epistatic variance), P_D/D (pedigree analysis including dominance-by-dominance epistatic variance), M_D/D (marker analysis including dominance-by-dominance epistatic variance), P_A/D (pedigree analysis including additive-by-dominance epistatic variance), and M_A/D (marker analysis including additive-by-dominance epistatic variance). A hyphen marks an epistatic component that is not part of the model.

indicators of relatedness in a classical quantitative genetic analysis. However, these two types of analyses are not mutually exclusive, and the marker-relatedness analysis has several advantages over GWAS. First, GWAS ideally requires a well-annotated reference genome, which is not available for many species. Second, accurate relatedness analyses typically require up to tens of thousands of markers, whereas accurate GWAS typically require hundreds of thousands to millions of markers, although the number of markers needed can vary considerably from species to species depending upon chromosome number, recombination rate, effective population sizes, system of mating, and population structure and history (Gienapp et al. 2017). More importantly, the typical unit of statistical analysis in GWAS is a single marker (typically an SNP), which makes nonadditive components of genetic variation difficult to study, so they are often ignored, as will be discussed in the next chapter. Moreover, because a large number of SNPs serve as the units of analysis, GWAS incurs a correspondingly large statistical penalty for multiple testing to avoid false positives, and this penalty erodes statistical power. Because of these factors and others, GWAS tends to explain only a small portion of the heritability of a trait – the problem of “missing heritability” to be discussed in Chapter 10. This problem is exacerbated in natural populations, for which the environmental component of phenotypic variance tends to be larger than it is in domestic or laboratory populations. Thus, GWAS conducted in natural populations has limited power and frequently fails to identify any significant genetic associations even when significant heritability is indicated (Gienapp et al. 2017; Perrier et al. 2018). 
Moreover, the markers that show up in GWAS must have strong marginal effects, but as shown in Table 8.1, the markers with strong marginal effects are affected by the frequencies of other alleles and environments when interactions occur. Heterogeneity in gene pools and environments leads to a serious lack of reproducibility of GWAS results in natural populations (Holger et al. 2018). For all these reasons, using genomic markers for a classical relatedness analysis has a place in modern quantitative genetics that has not been displaced by GWAS.
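The AIC comparison in Table 9.1 can be checked with a few lines of Python. The sketch below only transcribes the AIC column of that table and ranks the models; the variable names are illustrative, and it is not meant to reproduce the model fitting of Muñoz et al. (2014).

```python
# AIC values transcribed from Table 9.1 (Muñoz et al. 2014); smaller is better.
aic = {
    "P_A/A": 2605.66, "M_A/A": 2603.06,
    "P_D/D": 2603.80, "M_D/D": 2600.38,
    "P_A/D": 2604.76, "M_A/D": 2601.08,
}

# Model with the smallest AIC, and each model's distance from it.
best_model = min(aic, key=aic.get)
delta_aic = {model: round(value - aic[best_model], 2) for model, value in aic.items()}

# Pair each pedigree model with the marker model that has the same epistatic term.
marker_gain = {
    term: round(aic[f"P_{term}"] - aic[f"M_{term}"], 2)
    for term in ("A/A", "D/D", "A/D")
}

print(best_model)
print(delta_aic)
print(marker_gain)
```

A positive entry in `marker_gain` means that the marker-based model has the lower AIC of the pair, which is the pattern described in the text for all three epistatic terms.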

The Distinction Between Heritability and Inheritance

Before proceeding to the other two major methods of unmeasured genotype analyses, we need to further investigate the meaning of heritability. Frequently, the primary genetic parameter that a


geneticist wants to estimate is the heritability. As we will see in the next section and in Chapter 11, it is the heritability that determines the response to selection, either artificial or natural. As we have seen, heritability can be estimated even when no genotypes are being examined. Because heritability is such an important concept in quantitative and evolutionary genetics, it is critical to draw a distinction between heritability and inheritance, two words that are sometimes confused with one another. Inheritance describes the way in which genes are passed on to the next generation and how the specific genotypes created after fertilization develop their phenotypes. For example, the mental retardation associated with phenylketonuria (PKU) is said to be inherited as a single-locus, autosomal recessive trait under a normal dietary environment (Chapter 8). The phrase “single-locus, autosomal” tells us that the genes that are passed on from generation to generation that are associated with the phenotype of PKU-induced mental retardation are found at just one locus on an autosome, and hence Mendel’s first law of equal segregation is required to understand and model the passage of PKU from generation to generation. The word “recessive” tells us that an individual must be homozygous for a specific allele in order to display the PKU trait in a normal dietary environment. The inheritance of traits is the primary focus of Mendelian genetics, and it requires some knowledge of the underlying genetic architecture. The concept of inheritance is applied to specific individuals in specific family contexts. In contrast heritability measures that portion of the parental population phenotypic variance that is passed on to the offspring generation through gametes. Heritability requires no knowledge of the underlying genetic architecture. Heritability is not defined for individuals, but rather for a population. 
Heritability is explicitly a function of allele and genotype frequencies, whereas inheritance is not. If any of these population-level attributes are altered, heritability can change but inheritance does not. Inheritance is necessary but not sufficient for heritability. If we know a trait is heritable, then we know that there are genetic loci affecting phenotypic variation, and these loci will have a pattern of inheritance, whether we can observe it or not. On the other hand, knowing that a trait is inherited only tells us that genetic variation affects the phenotype in some individuals, but without knowing the frequencies of the genotypes, system of mating, etc., we can say nothing of heritability. For example, PKU is inherited as a single-locus, autosomal recessive trait under normal dietary environments (Chapter 8), but is PKU heritable? Recall that the k allele class is very rare in the human population and that a random mating model seems to be appropriate for this locus. If we assign a phenotypic value of “1” to the “normal” genotypes under this pattern of recessive inheritance (KK or Kk) and a value of “0” to the affected genotype (kk) in a normal dietary environment, then the average excess/average effect of the PKU allele k in a random mating population in which q is the frequency of allele k is

$$a_p = (1 - q)\left[1 - \left(1 - q^2\right)\right] + q\left[0 - \left(1 - q^2\right)\right] = q^2 - q^3 - q + q^3 = -q(1 - q) \tag{9.21}$$

As q goes to zero, so does −q(1 − q). Therefore, because q is very small in human populations, the average excess/average effect of the k allele is close to zero. Similarly, the average excess/average effect of the K allele is also close to zero. Therefore, PKU is not heritable even though it is inherited! Another way of showing the lack of heritability of PKU is to consider the phenotypic correlation between parents and offspring. For a trait like PKU that is due to a rare, autosomal recessive allele in a random mating population, almost all babies born with PKU came from parents who are heterozygotes, that is, PKU children come from Kk × Kk crosses. However, the phenotype of Kk individuals is “1.” Therefore, PKU children with the phenotype of “0” come from “normal” parents with phenotype “1,” and “normal” children came from “normal” parents with phenotype “1.”


This means that there is no correlation between the phenotypes of the parents and the phenotypes of the children. Once again, we conclude that PKU is not a heritable trait. Indeed, many genetic diseases in humans are inherited as autosomal recessives for a rare allele. Very few of these genetic diseases that afflict humankind are heritable, although all are clearly inherited. Inheritance and heritability should never be equated.
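The PKU argument can be made concrete with a short numerical sketch. It uses the genotypic values from the text (KK = Kk = 1, kk = 0) and random mating. Two things are assumptions added for illustration and do not come from the text: the phenotype is taken to be fully determined by genotype, so that phenotypic variance equals genetic variance, and the allele frequency q = 0.01 is only a stand-in for "very rare". The function name is hypothetical.

```python
def pku_heritability(q: float) -> dict:
    """Average excesses and narrow-sense heritability for a rare recessive trait.

    Genotypic values follow the text: KK = Kk = 1 and kk = 0, random mating.
    Assumption (not from the text): no environmental variance, so the
    phenotypic variance is just the variance of the 0/1 genotypic values.
    """
    p = 1.0 - q
    mean = 1.0 - q**2                        # population mean phenotype
    a_k = p * (1 - mean) + q * (0 - mean)    # Eq. (9.21): equals -q(1 - q)
    a_K = p * (1 - mean) + q * (1 - mean)    # equals q**2
    var_additive = 2 * (p * a_K**2 + q * a_k**2)
    var_phenotypic = q**2 * (1 - q**2)       # variance of a 0/1 phenotype
    return {"a_k": a_k, "a_K": a_K, "h2": var_additive / var_phenotypic}


result = pku_heritability(0.01)   # q = 0.01 is an illustrative value
print(result)
```

Under these assumptions the heritability simplifies to 2q/(1 + q), which shrinks toward zero as the allele becomes rarer even though the pattern of inheritance is unchanged.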

Response to Selection

Another way of estimating heritability with phenotypic data alone (although this can also be enhanced by using molecular markers, Gienapp et al. 2017) is through the response to selection. This approach is widely used in agriculture. Keeping the environmental factors as constant as possible across the generations, plant and animal breeders select some segment of the parental population on the basis of their phenotype (for example, the cows that produce the most milk, the plants with the highest yields, etc.) and breed these selected individuals only. Obviously, the point of all this is to alter the phenotypic distribution in the next generation along a desired direction. The Fisherian model given above allows us to predict what the response to selection will be. From Eq. (9.18) or Figure 9.1, we see that the offspring phenotype can be regressed upon the midparent values. Now suppose that only some parental pairs are selected to reproduce. Assume that the selected parents have a mean phenotype μs that differs from the mean of the general population, μ, as shown in Figure 9.2. S = μs − μ is called the intensity of selection (widely known elsewhere as the selection differential) and equals the mean phenotype of the selected parents minus the overall mean of the total population (the selected and nonselected individuals). We can then use the regression line to predict the phenotypic response of the offspring to the selection, as shown in Figure 9.2, to be:

$$R = S h^2 \tag{9.22}$$

[Figure 9.2 graphic: x-axis, Midparent Phenotype; y-axis, Offspring Phenotype; regression line labeled Slope = h2; marked quantities: S, R, Selected Parent Mean, Offspring Mean.]

Figure 9.2 The response (R) to selection as a function of heritability, h2, and the intensity of selection (S). The y-axis is drawn to intersect the x-axis at μ, the overall mean phenotype of the parental generation (including both selected and nonselected individuals).


where h2 is the heritability of the phenotype being selected, and R is the response to selection as measured by the mean phenotype of the offspring (μo) minus the overall mean (selected and nonselected individuals) of the parental generation, μ, that is, R = μo − μ. Equation (9.22) shows that a population can only respond to selection (R) if there is both a selective force (S) and heritability (h2) for the trait. If there is no additive genetic variation, heritability is zero, so there will be no response to selection no matter how intense the selection may be. Hence, the only aspects of the genotype–phenotype relationship that are important in selection (and hence in agricultural breeding programs and adaptation via natural selection, as we will see in Chapter 11) are those genetic aspects that are additive as a cause of phenotypic variation. Keep in mind that this is additivity at the population level, and dominance and epistasis at the Mendelian level can certainly contribute to the response to selection as these Mendelian properties can and typically do affect the additive genetic variance. If the heritability of a trait is known, say from studies on the phenotypic correlations among relatives, then Eq. (9.22) allows the prediction of the response to selection. However, if the heritability is not known, then the heritability can be estimated by monitoring the intensity and response to selection as h2 = R/S. For example, Clayton et al. (1957) examined variation in abdominal bristle number in a laboratory population of the fruit fly Drosophila melanogaster. They first estimated the heritability of abdominal bristle number in this base population through parent–offspring regression (Eq. (9.18)) to be 0.51. In a separate experiment, Clayton et al. selected those flies with high bristle number to be the parents of a selected generation. The base population had an average of 35.3 bristles per fly, and the selected parents had a mean of 40.6 bristles per fly. 
Hence, the intensity of selection is S = 40.6 − 35.3 = 5.3. The offspring of these selected parents had an average of 37.9 bristles per fly, so the response to selection is R = 37.9 − 35.3 = 2.6. From Eq. (9.22), the heritability of abdominal bristle number in this population is now estimated to be R/S = 2.6/5.3 = 0.49, a value not significantly different from the value estimated by offspring–parent regression.
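The bristle-number arithmetic can be reproduced with a minimal sketch of Eq. (9.22) used in both directions: predicting R from a known heritability, and estimating heritability as R/S. The three means and the regression-based heritability of 0.51 are the values quoted above from Clayton et al. (1957); the function names are illustrative only.

```python
def response_to_selection(population_mean, selected_mean, h2):
    """Predicted response R = S * h2, with S = selected mean - population mean."""
    S = selected_mean - population_mean
    return S * h2


def realized_heritability(population_mean, selected_mean, offspring_mean):
    """Heritability estimated from one generation of selection as R / S."""
    S = selected_mean - population_mean
    R = offspring_mean - population_mean
    return R / S


# Means quoted in the text for abdominal bristle number.
h2_realized = realized_heritability(35.3, 40.6, 37.9)

# Response predicted from the parent-offspring regression estimate of 0.51.
predicted_R = response_to_selection(35.3, 40.6, 0.51)

print(round(h2_realized, 2), round(predicted_R, 2))
```

The predicted response under h2 = 0.51 is close to the observed response of 2.6 bristles, which is another way of seeing that the two heritability estimates agree.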

The Problem of Between-Population Differences in Mean Phenotype

Before we can discuss the third method for doing unmeasured-genotype quantitative genetics (crosses between populations), we must first address the problem of dealing with populations that have different mean phenotypes. Fisher designed all the quantitative genetic parameters in his model (Chapter 8) for a single population, sharing a common gene pool and characterized by a single system of mating for the phenotype of interest. Moreover, in order to estimate some of these parameters when genotypes are unmeasured, Fisher also had to assume that the environment (actually residual factors) was constant across the generations in the sense that the probability distribution of the environmental deviations was unchanging with time and that all individuals had an independent environmental deviation (this assumption can be relaxed in more complicated models). However, a frequent problem in quantitative and population genetics occurs when we want to compare two populations with distinct phenotypic distributions. Because two different populations may have distinct gene pools and may live in a different range of environmental conditions, the biological meaning of changes in the phenotype from one to the other is difficult to evaluate. Do the populations differ because they have different allele frequencies but have identical genotype-to-phenotype mappings and the same environment? Or do they differ because they have completely different allelic forms? Or do they differ because they have the same alleles and the same allele frequencies, but differ in the probabilities of various environmental or other residual factors? Or do they differ because of a combination of different allele frequencies, different alleles, and different


environments? The Fisherian model outlined in Chapter 8 does not address any of these questions because it was designed to be applicable only to a single population. The only statistic related to a comparison of different populations is Eq. (9.2), which is restricted to a population of parents and a population of their resulting offspring. Even in this limited comparison, Eq. (9.2) assumes that the environmental residuals have identical distributions in both populations. Equation (9.2) therefore does not provide a general method for comparing populations – even parent and offspring populations when they live in different environments. The inability of the Fisherian model to compare populations is particularly evident when we want to understand a difference in mean phenotype between two populations. The first step in Fisher’s model is to subtract off the mean of the population from all observations. All of Fisher’s quantitative genetic parameters are defined in terms of deviations from the mean and hence are mathematically invariant to the overall mean value of the population. Consequently, none of Fisher’s quantitative genetic measures such as broad-sense or narrow-sense heritability have anything to do with the mean phenotype of the population. Nevertheless, sometimes, the argument is made that because a trait is heritable within two different populations that differ in their mean trait value, then the average trait differences between the populations are also influenced by genetic factors (e.g. Herrnstein and Murray 1994). Because heritability is a within-population concept that refers to variances and not to means, such an argument is without validity. Indeed, heritability is irrelevant to the biological causes of mean phenotypic differences between populations. To see this, we will consider four examples. 
First, in Chapter 8, we examined the role of the amino acid replacement alleles at the ApoE locus in a European-American population from the 2010s that had a mean total serum cholesterol level of 213.06 mg/dl (Maxwell et al. 2013). In contrast, Sing and Davignon (1985) studied a Canadian sample of men from the mid-1980s that had a mean phenotype of 174.2 mg/dl. Hallman et al. (1991) studied the role of the same ApoE polymorphisms in nine different human populations, whose mean total serum cholesterol levels varied from 144.2 mg/dl (Sudanese) to 228.5 (Icelanders). These mean differences in cholesterol levels span a range of great clinical significance, as values above 200 mg/dl are considered an indicator of increased risk for coronary artery disease. Hence, these nine populations are greatly different in their phenotypic distributions in a manner that is highly significant both statistically and biologically. Despite these large differences in mean total serum cholesterol levels, a Fisherian analysis of the ApoE polymorphism within each of these populations results in estimates of the average excesses and effects, heritabilities, etc. that are statistically indistinguishable, as illustrated in Figure 9.3 for the average excesses for the Sudanese versus Icelanders. How can such large mean phenotypic differences between populations be totally invisible to the Fisherian model? It is known that the phenotype of total serum cholesterol is strongly influenced by many residual and environmental factors, such as diet, exercise regimes, alcohol consumption, and smoking. The populations examined by Hallman et al. (1991) differ greatly in these environmental variables, so it is not surprising that they also differ in their mean phenotypes. The homogeneity of the results of the Fisherian analyses of the ApoE polymorphism arises from two factors:

• The populations all have similar allele frequencies in their respective gene pools; e.g. the frequencies of the ε2, ε3, and ε4 alleles in the Sudanese, who have the lowest average total serum cholesterol, are 0.081, 0.619, and 0.291, respectively, and in the Icelandic population, which has the highest average total serum cholesterol, they are 0.068, 0.768, and 0.165.
• The genotypes all map onto phenotypes in a similar manner relative to their population means; that is, a genotype that tends to be above the mean in one population tends to be above the mean, and by the same relative amount, in all populations.
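The second factor can be illustrated numerically. In the sketch below, the allele frequencies and genotypic values are invented for illustration (they are not the estimates of Hallman et al. 1991), random mating is assumed, and the "environment" is modeled as a constant added to every genotypic value. Under those assumptions the population mean moves by the full shift while the average excesses do not move at all.

```python
from itertools import product


def average_excesses(freqs, genotypic_values):
    """Average excess of each allele under random mating.

    freqs[i] is the frequency of allele i, and genotypic_values[i][j] is the
    mean phenotype of genotype ij (a symmetric matrix).
    """
    n = len(freqs)
    mean = sum(freqs[i] * freqs[j] * genotypic_values[i][j]
               for i, j in product(range(n), repeat=2))
    # Given allele i, the other allele is j with probability freqs[j].
    return [sum(freqs[j] * genotypic_values[i][j] for j in range(n)) - mean
            for i in range(n)]


# Hypothetical frequencies for three alleles (order: e2, e3, e4).
freqs = [0.08, 0.77, 0.15]

# Hypothetical genotypic values (mg/dl) in a low-cholesterol environment.
g_low = [[140.0, 155.0, 165.0],
         [155.0, 170.0, 180.0],
         [165.0, 180.0, 185.0]]

# Same genotypes in an environment that raises every value by 60 mg/dl.
shift = 60.0
g_high = [[g + shift for g in row] for row in g_low]

excess_low = average_excesses(freqs, g_low)
excess_high = average_excesses(freqs, g_high)
print(excess_low)
print(excess_high)
```

Because the shift cancels when the population mean is subtracted, any statistic built from genotypic deviations is blind to it.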


[Figure 9.3 graphic: y-axis, Average Excess of Total Serum Cholesterol in mg/dl (−25 to 10); x-axis categories, ε2, ε3, and ε4; series, Icelanders, Sudanese, and Average of 9 Populations.]

Figure 9.3 The average excesses of the ε2, ε3, and ε4 alleles at the ApoE locus on the phenotype of total serum cholesterol in Sudanese, Icelanders, and the average over nine populations of humans. No significant differences exist in the average excess values across populations. Source: Data from Hallman et al. (1991).

The main impact of the environmental factors in this case is to shift the genotypic values (the mean phenotype of a genotype) up or down by approximately the same amount in all genotypes. Therefore, although environmental factors are clearly having a large effect on μ, the mean phenotype of the population, and upon the genotypic values, Gij, there is hardly any effect at all upon the genotypic deviations, Gij − μ. As shown in Chapter 8, all of the important parameters in the Fisherian quantitative genetic model are ultimately functions of the genotypic deviations, so if the genotypic deviations are the same and the genotype frequencies are similar (which they are for the ApoE locus), the quantitative genetic inferences will also be similar, as indeed they are in this case. This example shows that two populations can differ greatly in their mean phenotypes even though the populations are genetically homogeneous in alleles, allele frequencies, and average excesses. This example also illustrates that differences in environment can contribute to significant differences in mean phenotypes even though the phenotype is influenced by genetic variability in all populations to the same quantitative degree. Hence, the Fisherian parameters are irrelevant to, and tell us nothing about, the mean phenotypes of populations.

Our second example relates to the phenotype most abused by the spurious argument that high heritabilities imply that mean population differences are due to genetic differences between the populations: the intelligence quotient score (IQ) designed to measure general cognitive ability in humans. One of the earliest studies on the heritability of IQ is that of Skodak and Skeels (1949), a study that is still cited by those such as Herrnstein and Murray (1994) who make the argument that differences in IQ scores between populations are genetically based. 
The study of Skodak and Skeels is frequently cited in this context because they concluded that IQ has a high heritability. However, a closer examination of their results actually illustrates the inapplicability of heritability to mean phenotypic differences between populations – even a parent–offspring comparison in this case. As mentioned earlier, one great complication in human studies is the fact that environmental variables are often correlated among relatives due to shared family environmental effects. Skodak and Skeels attempted to eliminate or at least reduce these environmental correlations by examining the IQ scores of a population of adopted children and comparing the adopted children both to their biological mothers and to their adoptive mothers. Figure 9.4 shows the normal curves obtained

[Figure 9.4 graphic: x-axis, IQ Score (60 to 160); y-axis, Probability (0.005 to 0.025); curves labeled Biological Mothers, Adopted Children, and Adoptive Mothers.]

Figure 9.4 The normal distributions of IQ scores in the biological mothers (black curve), adopted children (red curve), and adoptive mothers (dashed brown curve) from the study of Skodak and Skeels (1949). Source: Modified from Skodak and Skeels (1949).

with the observed means and variances of these three populations (adopted children, biological mothers, and adoptive mothers). As shown in Figure 9.4, the adoptive mothers had a much higher average IQ score (110) than the biological mothers (86). These two populations of mothers also differed greatly in socioeconomic status, a strong indicator of many environmental differences in human populations. Skodak and Skeels also measured the correlation of the phenotypic scores between the adopted children and their biological and adoptive mothers. There was no significant correlation between the IQ scores of the children and those of their adoptive mothers, but there was a highly significant correlation of 0.44 between the IQ scores of the children and their biological mothers. From Eq. (9.13), this implies that the heritability of IQ is 0.88. Skodak and Skeels therefore concluded that genetic variability is the major contributor to variation in IQ scores within the population of adopted children. But it would be a mistake to infer from these data that IQ is genetically determined at the individual level or that environmental factors play no role in an individual’s IQ score. The average IQ score for the general population is adjusted to have a mean of 100. The subpopulation of mothers willing to give up their children for adoption was not randomly drawn from this general population, but rather was drawn disproportionately from the lower socioeconomic sectors of the general population. The highly selected nature of the population of biological mothers is reflected in the fact that they had a mean IQ of 86, nearly a full standard deviation below the general mean. The selective intensity in this case is S = 86 − 100 = −14, showing that the biological mothers were 14 IQ points below the mean of the general population. Using a heritability of 0.88, the response in IQ score of the children of the biological mothers should be, using Eq. (9.22), R = (−14)(0.88) = −12.32. 
Therefore, the expected average IQ of the adopted children should be, under the assumption of a constant environment across the generations, 100 + R = 87.68. However, as Figure 9.4 shows, the mean IQ score of the adopted children was 107 – nearly 20 IQ points higher than that predicted from Eq. (9.22) and statistically indistinguishable from the mean IQ of the adoptive mothers! Why such a discrepancy? The adoption agencies at that time were also highly selective in choosing the adoptive families, placing the children in families with higher than


average socioeconomic status. The highly selective nature of the families into which the children were placed is reflected in the average IQ score of the adoptive mothers, which deviates from the general mean of 100 but in a direction opposite that of the biological mothers. The adopted children showed no correlation – a measure of covariation – with their adoptive mothers for IQ, but their mean IQ matched that of their adoptive mothers and not that of their biological mothers. Hence, Skodak and Skeels concluded that the environments associated with the adoptive families strongly contributed to a significant increase in IQ scores (well over a standard deviation) for the adopted children. Thus, two major conclusions emerge out of the study of Skodak and Skeels:

• IQ has a high heritability within the population of adopted children, and genetic variation is the primary cause of phenotypic variation in the IQ scores of these children.
• The environment strongly influences the IQ scores of these children.
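The arithmetic behind these two conclusions can be sketched in a few lines. The first part uses only numbers quoted in the text (general mean 100, biological mothers' mean 86, heritability 0.88, children's observed mean 107). The second part uses invented mother and child scores, purely to show that adding the same constant to every child's score leaves the mother-child correlation unchanged; the `pearson` helper and all of the individual scores are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation; the means are subtracted off as the first step."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Part 1: prediction from Eq. (9.22) using the values quoted in the text.
h2 = 0.88                       # twice the mother-child correlation of 0.44
S = 86 - 100                    # biological mothers relative to general mean
R = S * h2
predicted_mean = 100 + R        # expected under a constant environment
gap = 107 - predicted_mean      # observed children's mean minus prediction

# Part 2: hypothetical scores; a uniform environmental boost of 20 points.
mothers = [70, 78, 84, 86, 90, 95, 99]
children_constant_env = [76, 80, 90, 85, 92, 91, 100]
children_enriched_env = [score + 20 for score in children_constant_env]

r_before = pearson(mothers, children_constant_env)
r_after = pearson(mothers, children_enriched_env)
print(round(predicted_mean, 2), round(gap, 2))
print(round(r_before, 4), round(r_after, 4))
```

The gap of roughly 19 points is what the text attributes to the adoptive environment, and the unchanged correlation shows why that gap is invisible to a heritability estimate.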

For those who do not understand the concept of heritability, these two conclusions may seem contradictory, but they are not. Heritability has nothing to do with the mean phenotype. If the altered socioeconomic environments into which these children were placed had a uniform and highly beneficial impact upon all of them, the mean IQ score could rise substantially (as it did) but without altering the strong positive correlation between IQ scores of biological mothers and adopted children. As shown explicitly in Eq. (9.1), the correlation coefficient measures the association between the paired observations relative to their respective population means. Consequently, the high heritability of IQ in this study indicates that biological mothers who had an IQ below the mean of 86 had a strong tendency to have a child with an IQ below the children’s mean of 107. Likewise, a biological mother with an IQ above the mean of 86 had a strong tendency to have a child with an IQ above the children’s mean of 107. The actual mean values themselves of 86 and 107 are subtracted off each observation in the very first step of calculating the correlation coefficient (Eq. 9.1, Appendix B) and are thereby mathematically irrelevant to the calculation of the correlation coefficient and of the heritability. It is only the deviations from the mean that influence heritability, not the mean value itself. Therefore, even if the heritability were 1, the environment could still be a major determinant of the mean phenotype, and thereby a major determinant of the actual phenotypic values of each individual. In the previous two examples, the environment had a strong impact upon the phenotype, but the environmental variation simply shifted the phenotypes up or down in a similar manner for all individuals irrespective of their genotype. 
The interpretation of phenotypic differences between populations is even more complicated when different genotypes respond to the environment in different, nonuniform fashions. For example, different populations of the plant Achillea (a yarrow) live at different altitudes and show large phenotypic differences in size and growth form, as well as much variation within a population (Clausen et al. 1958). The bottom panel of Figure 9.5 shows some of the variation found in five individual plants collected along an elevation gradient in California. An entire adult plant can be grown from a cutting in this species, which clones the entire intact genotype of that individual. Cuttings from these five plants were then grown at low-, middle-, and high-elevation sites, with the results shown in the three panels of Figure 9.5. The responses to environmental variation were substantially different across these five fixed genotypes. For example, the genotype that was the tallest at low elevation is of medium height at the middle elevation and dies at the highest elevation. In contrast, the shortest plant at the low elevation site becomes even shorter at the middle elevation site, but then grows much taller at the high elevation site. This heterogeneity in genotypic response to variation in elevation illustrates the norm of reaction, that is, the phenotypic response of a particular genotype to an environmental factor. In the ApoE example, all genotypes shared a common norm of reaction to the environmental variation that existed among

[Figure 9.5 graphic: drawings of five cloned plants in three panels with height scales in cm (ticks at 50 and 100); two plants are marked “Died”; source locations, left to right: San Gregorio, Knight’s Ferry, Mather, Tenaya Lake, Big Horn Lake.]

Figure 9.5 Variation in growth form and height as a function of elevational environmental variation in five cloned genotypes of the plant Achillea (yarrow) grown in a low elevation (bottom panel), a mid-level elevation (middle panel), and a high elevation (top panel). The original locations of the source of the clones are indicated by the names given below the bottom panel and are ordered from lowest elevation to highest in going from left to right. Source: Based on Clausen et al. (1958).

the nine populations sampled. However, for the plant Achillea, there is considerable genotypic variation in the norm of reaction itself. In this sense, the norm of reaction can be regarded as a phenotype (albeit a difficult one to measure as it requires measurements upon common genotypes in different environments) – a phenotype that can also show genotypic variability and heritability in some populations for some traits (height in Achillea) but not others (serum cholesterol as influenced by ApoE in humans). The fourth and last example involves the differences in male head shape between two species of Hawaiian Drosophila, Drosophila silvestris and Drosophila heteroneura (Figure 9.6). As can be seen from that figure, the head shapes are extremely different, with D. heteroneura having a hammerhead shape, whereas D. silvestris has a rounded head that is typical of most Drosophila species. Indeed, there is no phenotypic overlap between the male head shapes of these two species. Moreover, there is little phenotypic variation within some laboratory strains of each species for male head shape (Templeton 1977a; Val 1977), and the little variation that exists in these laboratory


[Figure 9.6 graphic: male heads of D. heteroneura (left) and D. silvestris (right), with head length (hl) and head width (hw) marked.]

Figure 9.6 The male shapes in two species of Hawaiian Drosophila, Drosophila heteroneura (left), and Drosophila silvestris (right). Shape is measured by the ratio of head length (hl) to head width (hw) or various mathematical transformations of that ratio. Source: From figure 1 in Val (1977). Copyright © 1977 The Society for the Study of Evolution.

strains seems to be attributable to environmental variation (Templeton 1977a). Therefore, the heritability of head shape is zero within these strains of both species. D. heteroneura and D. silvestris are interfertile, and the genetic basis of their head shape differences can therefore be studied directly, as will be detailed in the next section. These crosses clearly indicate that the head shape differences between these strains of different species are largely genetic. In this case, the heritability of head shape is negligible within these laboratory strains of D. heteroneura or D. silvestris, yet the head shape differences between the strains of D. heteroneura and D. silvestris are primarily genetically based.

Collectively, these four examples show that there is no necessary relationship between heritability within a population and the biological basis of phenotypic differences between populations. Traits can have high heritabilities within a population, yet large phenotypic differences between populations may be primarily due to environmental factors (e.g. IQ in the study of Skodak and Skeels 1949), or traits can have low or no heritability within a population, yet large phenotypic differences between populations may be primarily due to genetic differences (e.g. head shape in D. heteroneura and D. silvestris). This is not a surprising conclusion because the entire Fisherian model given in Chapter 8 was designed to investigate causes of variation in deviations from the overall population mean within a single population; it was not designed to study between-population differences. We therefore need a new quantitative genetic approach to study between-population differences. An unmeasured genotype approach to this problem is given in the following section.

Controlled Crosses for the Analyses of Between Population Differences

As shown in the previous section, the traditional quantitative genetic model is inapplicable to the study of between population differences. Moreover, with unmeasured genotypes, there is a serious confoundment between environmental and genetic factors when comparing differences between populations. This problem can be circumvented by finding or creating a hybrid or admixed


population and then studying the distribution of phenotypes in these hybrid and other crosses under a common environmental regime. Hybridization and admixture occur naturally in many circumstances. Such admixed populations provide a valuable opportunity for studying the genetic basis of phenotypic differences between the original parental populations. However, the uncontrolled nature of such natural crosses poses difficulties. If the admixture was very recent or ongoing and it is possible to identify individuals as being members of the traditional Mendelian categories of F1, F2, backcross, etc., then the approaches to be discussed in this section are applicable. In many cases of natural hybridization or admixture, it is impossible to accurately classify individuals or otherwise quantify the degree of admixture for an individual without the use of genomic markers. In those cases for which genomic markers are available, it is better to use the admixture to map the parts of the genome that actually contribute to phenotypic variation, as will be discussed in the next chapter on measured genotype approaches. In this chapter, we limit the discussion to the third major way of performing unmeasured genotype analyses: controlled crosses between populations or the use of recent and ongoing admixture between populations in which individuals can be unambiguously classified into traditional Mendelian cross categories. The approach of controlled or known crosses can address the genetic basis of phenotypic differences between populations. We will only deal with the simplest case in which the two populations have fixed genetic differences for the genes affecting the trait of interest, although more complicated situations can be handled (Lynch and Walsh 1998). In this simple case, the original parental lines can be regarded as homozygous stocks with no internal genetic variation. 
This also means that the trait has no heritability within either parental line for the reason that there is no genetic variation within lines. Consider now the simplest situation in which the phenotypic differences between the populations are due to a single locus in which individuals from the two parental populations (P1 and P2) are homozygotes and the F1 between them are heterozygotes:

Population:         P1      F1      P2
Genotype:           AA      Aa      aa
Genotypic Value:    G_AA    G_Aa    G_aa

As with the Fisherian model for a single population, our focus is not upon the genotypic values per se, but rather upon the phenotypic differences between the two parental populations. The difference in mean phenotypes between the parental populations is GAA − Gaa, and as long as we preserve this difference, we can rescale the genotypic values to any mathematically convenient value, just as the traditional one-population Fisherian analysis is mathematically invariant to the overall population mean μ, as shown in the previous section. A convenient transformation of the original genotypic values is based upon the midparental value of the two parental populations, mp = (GAA + Gaa)/2. Then, let

$$a = \text{new genotypic value of } AA = G_{AA} - mp = (G_{AA} - G_{aa})/2$$
$$d = \text{new genotypic value of } Aa = G_{Aa} - mp$$
$$-a = \text{new genotypic value of } aa = G_{aa} - mp = (G_{aa} - G_{AA})/2 = -(G_{AA} - G_{aa})/2$$

Note that the difference in mean phenotypes between the parental populations is now a − (−a) = 2a = (GAA − Gaa). Thus, the original mean phenotypic differences are preserved under this transformation, but now we have the mathematical advantage of describing the three original genotypic values by just two parameters, a and d, thereby simplifying our model.
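The rescaling to the midparental value can be sketched numerically. The function name and the three genotypic values below are hypothetical and chosen only for illustration; they are not taken from any data set in the text.

```python
def rescale_genotypic_values(g_AA, g_Aa, g_aa):
    """Return (mp, a, d) for the midparent-centered one-locus model."""
    mp = (g_AA + g_aa) / 2.0  # midparental value of the two parental lines
    a = g_AA - mp             # rescaled value of AA (aa becomes -a)
    d = g_Aa - mp             # rescaled value of the F1 heterozygote Aa
    return mp, a, d


# Hypothetical genotypic values (illustration only)
mp, a, d = rescale_genotypic_values(g_AA=12.0, g_Aa=11.0, g_aa=8.0)
print("mp =", mp, " a =", a, " d =", d)
print("parental difference preserved:", 2 * a == 12.0 - 8.0)
```

Whatever the original scale, the difference between the parental means remains 2a, and the three genotypic values are summarized by the two parameters a and d.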


Let σe2 be the phenotypic variance within population P1. This variance is assigned a subscript of e to emphasize that it must be an environmental variance because by assumption there is no genetic variation within P1; all individuals are homozygous AA. Now assume that environmental variation is identical in both parental populations and all crosses between them (the “common garden” aspect of this design) and that all genotypes respond to this variation with identical phenotypic variances. The phenotypic environmental variance σe2 is applicable to the individuals within population P2 (aa) and the F1 heterozygotes Aa as well. Thus, σe2 has the biological interpretation in this model as being the within-genotype phenotypic variance. With these assumptions, the various traditional Mendelian crosses have phenotypic means and variances as given in Table 9.2 (Cavalli-Sforza and Bodmer 1971). Because we have assumed that the original parental strains are homozygous, the first cross in Table 9.2 that has any genotypic variation within it is the F2. As can be seen from Table 9.2, the phenotypic variance in the F2 can be partitioned into a portion reflecting the mean differences between the genotypes as weighted by their F2 frequencies and σe2, the within-genotype variance. Moreover, note that the between-genotype component of the phenotypic variance can be divided into two terms, one that is solely a function of a and one that is solely a function of d. The portion that is a function of a is defined to be the additive genotypic variance and the portion that is a function of d is defined to be the dominance variance. Note that despite the identical terminology, the additive and dominance variances given in Table 9.2 are not in general the same as the additive and dominance variances given in Chapter 8. 
In Chapter 8, the additive and dominance variances are defined for a population with arbitrary genotype frequencies, but, in Table 9.2, the additive and dominance variances are defined specifically for an F2 population. The system of mating in this case is highly nonrandom and fixed by the investigator. Because of this, the genotype frequencies are the same as Mendelian probabilities; that is, the genotype frequencies of AA, Aa, and aa are in a 1 : 2 : 1 ratio. Hence, the additive and dominance variances here are strictly limited to a manipulated F2 cross and have no generality to any other population.

Although Table 9.2 was derived only for a one-locus model, as long as each locus contributes to the phenotype in a completely additive fashion with no epistasis, then a has the biological interpretation of being the sum of all the homozygote effects in the P1 and d the sum of all the heterozygote effects in the F1, and all else in Table 9.2 remains the same. Hence, Table 9.2 gives the expected phenotypic means and variances when the genetic component of the phenotypic difference between the original two parental populations is due to fixed differences at one locus or many loci with additive gene action. By performing these crosses and measuring the phenotypic variances and means, one can estimate the underlying genetic parameters of this idealized genetic model.

For example, do the male head shape differences between Drosophila heteroneura and D. silvestris (Figure 9.6) have a genetic basis? To answer this question, we first need a measure of head shape. Val (1977) measured the head length (hl) and head width (hw) (Figure 9.6) in individuals from laboratory strains of both species and their F1, F2, and backcross populations. Head length and head width are each a function of both head shape and head size. In order to measure only head shape and not size, Templeton (1977a) transformed the original measurements to polar coordinates. The angle θ = arctan(hl/hw), measured in radians, is a convenient measure of head shape because it depends only upon the relative proportions of the original head measurements through their ratio hl/hw and is therefore mathematically invariant to their absolute size. Table 9.3 shows the results of applying this transformation to the measurements of Val (1977). One simple way of estimating the parameters of this model is to equate the expectations with the observed values (Cavalli-Sforza and Bodmer 1971). When multiple observations can be related to the same parameter, an average of the relevant observed values is taken weighted by

Table 9.2 The expected means and variances for the one-locus model of between population phenotypic differences. The columns AA, Aa, and aa give the genotype frequencies in each population or cross.

Population or Cross | AA | Aa | aa | Phenotypic Mean | Phenotypic Variance
P1 | 1 | – | – | a | σe²
P2 | – | – | 1 | −a | σe²
F1 | – | 1 | – | d | σe²
F2 | 1/4 | 1/2 | 1/4 | (1/4)(a) + (1/2)(d) + (1/4)(−a) = (1/2)d | (1/4)(a − (1/2)d)² + (1/2)(d − (1/2)d)² + (1/4)(−a − (1/2)d)² + σe² = (1/2)a² + (1/4)d² + σe² = σa² + σd² + σe²
BC1 | 1/2 | 1/2 | – | (1/2)(a) + (1/2)(d) = (1/2)(a + d) | (1/2)[a − (1/2)(a + d)]² + (1/2)[d − (1/2)(a + d)]² + σe² = (1/4)a² + (1/4)d² − (1/2)ad + σe²
BC2 | – | 1/2 | 1/2 | (1/2)(d) + (1/2)(−a) = (1/2)(d − a) | (1/2)[d − (1/2)(d − a)]² + (1/2)[−a − (1/2)(d − a)]² + σe² = (1/4)a² + (1/4)d² + (1/2)ad + σe²
Sum of BC1 and BC2 | – | – | – | (1/2)(a + d) + (1/2)(d − a) = d | (1/2)a² + (1/2)d² + 2σe² = σa² + 2σd² + 2σe²

Note: The between-population additive and dominance variances are defined in this model to be σa² = (1/2)a² and σd² = (1/4)d², respectively. All other parameters are defined in the text.
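The entries of Table 9.2 can be checked numerically. The sketch below is only an illustration: the helper `cross_moments` and the parameter values for a, d, and the environmental variance are hypothetical, and exact fractions are used so that the identities hold without rounding error.

```python
from fractions import Fraction as F


def cross_moments(freqs, a, d, var_e):
    """Phenotypic mean and variance for genotype frequencies (AA, Aa, aa)."""
    values = (a, d, -a)  # rescaled genotypic values of AA, Aa, aa
    mean = sum(f * g for f, g in zip(freqs, values))
    between = sum(f * (g - mean) ** 2 for f, g in zip(freqs, values))
    return mean, between + var_e  # add the within-genotype variance


# Hypothetical parameter values (illustration only)
a, d, var_e = F(2), F(1), F(3)
var_a = a ** 2 / 2  # between-population additive variance, (1/2)a^2
var_d = d ** 2 / 4  # between-population dominance variance, (1/4)d^2

f2_mean, f2_var = cross_moments((F(1, 4), F(1, 2), F(1, 4)), a, d, var_e)
bc1_mean, bc1_var = cross_moments((F(1, 2), F(1, 2), F(0)), a, d, var_e)
bc2_mean, bc2_var = cross_moments((F(0), F(1, 2), F(1, 2)), a, d, var_e)

print("F2 mean equals d/2:", f2_mean == d / 2)
print("F2 variance equals var_a + var_d + var_e:",
      f2_var == var_a + var_d + var_e)
print("BC1 + BC2 variance equals var_a + 2 var_d + 2 var_e:",
      bc1_var + bc2_var == var_a + 2 * var_d + 2 * var_e)
```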


Table 9.3 A quantitative genetic analysis of male head shape differences between Drosophila silvestris and Drosophila heteroneura, using the model given in Table 9.2.

Population or Cross | N | Mean θ | Var (×10 000) | Expected Variance
D. silvestris | 20 | 1.157 | 1.21 | σe²
D. heteroneura | 20 | 1.234 | 0.66 | σe²
F1 | 37 | 1.197 | 0.67 | σe²
F2 | 71 | 1.207 | 1.59 | σa² + σd² + σe²
BC1 | 141 | 1.187 | 1.55 |
BC2 | 123 | 1.224 | 1.21 | Sum (BC1 + BC2) = σa² + 2σd² + 2σe²

Source: Modified from Templeton (1977a).

their sample sizes (for variances, the sample size is adjusted for the estimation of the mean by subtracting one; see Appendix B). For example, the phenotypic variances of the two parental strains and the F1 should all equal σe². Therefore, letting Vi be the observed estimated phenotypic variance in strain or population i (letting i = s designate D. silvestris and i = h D. heteroneura), the estimate of σe² is:

$$V_e = \frac{(n_s - 1)V_s + (n_h - 1)V_h + (n_{F1} - 1)V_{F1}}{(n_s - 1) + (n_h - 1) + (n_{F1} - 1)} = 0.000081 \tag{9.23}$$

Similarly, the additive and dominance variances can be estimated by:

$$V_a = 2V_{F2} - V_{BC1+BC2} = 0.000042 \tag{9.24}$$

and

$$V_d = V_{F2} - V_a - V_e = 0.000036 \quad \text{or} \quad V_d = \tfrac{1}{2}V_{BC1+BC2} - \tfrac{1}{2}V_a - V_e = 0.000036 \tag{9.25}$$

Hence, the total genotypic variance in this case is 0.000042 + 0.000036 = 0.000078. Given that the estimated environmental variance is 0.000081, this means that about half of the variability in male head shape between these species is attributable to genetic differences between the species. More sophisticated statistical and genetic models can be used for controlled crosses and natural hybrids with individuals of known cross type (e.g. Wu and Li 2000, and see Lynch and Walsh 1998). However, the important message is that to study the genetic basis of between population differences, genetic crosses in a common environment are essential. It is impossible to infer the biological basis of inter-population differences in phenotypes from observations on the phenotypes of populations that potentially live in different environments. It is essential to control the environment and place all genotypes into identical environments (as was done, for example, in the studies on the plant Achillea, also called “the common garden” design) and do crosses, the traditional mainstay of Mendelian genetic analysis. The traditional definitions of genotypic variance, heritability, etc. as given in Chapter 8 are irrelevant to the problem of between population phenotypic differences, and arguments based on such intra-population parameters have no biological validity.
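The estimates in Eqs. (9.23) through (9.25) can be reproduced from the sample sizes and variances reported in Table 9.3. The following is a minimal sketch; the dictionary layout and variable names are our own, and the symbol for the summed backcross variance is taken to be the observed BC1 variance plus the observed BC2 variance, as in the "Sum" row of Table 9.2.

```python
# Observed values from Table 9.3: (sample size, variance x 10,000)
table_9_3 = {
    "silvestris": (20, 1.21),
    "heteroneura": (20, 0.66),
    "F1": (37, 0.67),
    "F2": (71, 1.59),
    "BC1": (141, 1.55),
    "BC2": (123, 1.21),
}
SCALE = 1e-4  # Table 9.3 reports variances multiplied by 10,000


def pooled_variance(groups):
    """Average of variances weighted by (n - 1), as in Eq. (9.23)."""
    numerator = sum((n - 1) * v for n, v in groups)
    denominator = sum(n - 1 for n, _ in groups)
    return numerator / denominator


no_genetic_variation = [table_9_3[k] for k in ("silvestris", "heteroneura", "F1")]
v_e = pooled_variance(no_genetic_variation) * SCALE            # Eq. (9.23)
v_f2 = table_9_3["F2"][1] * SCALE
v_bc_sum = (table_9_3["BC1"][1] + table_9_3["BC2"][1]) * SCALE
v_a = 2 * v_f2 - v_bc_sum                                      # Eq. (9.24)
v_d = v_f2 - v_a - v_e                                         # Eq. (9.25), first form
v_d_alt = 0.5 * v_bc_sum - 0.5 * v_a - v_e                     # Eq. (9.25), second form
genetic_fraction = (v_a + v_d) / (v_a + v_d + v_e)

print(f"V_e = {v_e:.6f}, V_a = {v_a:.6f}, V_d = {v_d:.6f}")
print(f"genetic fraction of F2 variance = {genetic_fraction:.2f}")
```

The two forms of Eq. (9.25) are algebraically identical once Eq. (9.24) is substituted, so they return the same estimate of the dominance variance.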


The Balance Between Mutation, Drift, and Gene Flow Upon Phenotypic Variance

When genotypes are observed, we can measure the balance between the evolutionary forces of mutation, drift, and gene flow upon genetic variation (Chapters 2 through 7). However, in this chapter, we are assuming that only phenotypic variation is observable, not genotypic variation. Nevertheless, with some additional assumptions, it is still possible to make some predictions about the balance of these evolutionary forces upon quantitative genetic variation that contributes to phenotypic variation. Suppose that there is phenotypic variation for a neutral trait. By a neutral trait, we mean that all individuals make the same average genetic contribution to the next generation regardless of an individual’s trait value. Moreover, suppose that phenotypic variation for this trait is influenced by environmental variation in a manner that is homogeneous throughout the species’ range and affects within-genotype phenotypic variance to the same extent in all genotypes (just as assumed in the previous section). With these assumptions, first consider the balance between mutation and drift upon the genotypic variance in a single, undivided population. Because phenotypes are being measured and not genotypes, the mutation rate μ is not directly relevant; rather, only the extent to which new phenotypic variation is being produced by mutation is relevant. Therefore, let σm² be the mutational additive variance, that is, the additive genetic variance in the phenotype that is created by mutation every generation. This mutational additive variance is the analog of the mutation rate in this model of phenotypic evolution. Just as we assumed in previous chapters that μ was constant over the generations, we make the same assumption about mutational additive variance. Lynch and Hill (1986) then showed that the equilibrium additive genetic variance is given by:

$$2N_{e\lambda}\sigma_m^2 \tag{9.26}$$

where Neλ is the eigenvalue effective size that measures the rate at which alleles become fixed or lost (Chapter 4). As we saw in Chapter 4, the balance of mutation and drift upon various measures of genetic variation depends upon the product of an effective population size times the mutation rate (reflecting the balance of the power of mutation divided by the power of drift, which is inversely proportional to an effective population size). Equation (9.26) says much the same at the phenotypic level. Additive genetic variance is created by an amount σm² per generation by the action of mutation, and it is lost at a rate 1/(2Neλ) due to the action of drift per generation. The equilibrium additive genetic variance is just the ratio of the rate of creation divided by the rate of loss. Whitlock (1999) extended this basic model to incorporate various models of population subdivision and gene flow. In dealing with subdivided populations, we are concerned both with the amount of additive genetic variance available within the local demes, designated by Vwithin, and the additive genetic variance found among the demes, Vamong. When population subdivision is measured by Fst (Chapter 6), then the equilibrium additive genetic variances are:

$$V_{\text{within}} = 2N_{e\lambda}(1 - F_{st})\sigma_m^2$$
$$V_{\text{among}} = 4N_{e\lambda}F_{st}\sigma_m^2 \tag{9.27}$$

where Neλ is the eigenvalue effective size for the species as a whole and not the local demes. Recall that Fst itself is a function of the inbreeding effective size and the pattern and amount of gene flow (Chapter 6). Hence, a great deal of biological complexity is implicit in Eq. (9.27). Equations (9.27) show that the amount of additive genetic variance within a local deme increases with decreasing Fst, converging to the single, undivided population result (Eq. 9.26) when Fst = 0.


As shown in Chapter 6, this means in turn that as local eigenvalue or inbreeding effective sizes become larger and/or as there is more gene flow between demes, more and more additive genetic variance becomes available within the local demes. In contrast, the additive genetic variance among demes is an increasing function of Fst, so this component of additive genetic variance is augmented by smaller local eigenvalue or inbreeding effective sizes and less gene flow. As subdivision becomes increasingly extreme, more and more of the additive genetic variance shifts from the within component to the among, until at Fst = 1, there is no additive genetic variance within demes and all additive genetic variance exists due to fixed genetic differences among demes. The total additive genetic variance in the species is Vwithin + Vamong, which from Eq. (9.27) is:

$$2N_{e\lambda}\sigma_m^2(1 + F_{st}) \tag{9.28}$$

Whenever a population is subdivided due to local drift and restricted gene flow, Fst > 0 and Eq. (9.28) will be greater than Eq. (9.26), the total additive genetic variance in a species consisting of a single, undivided population. Recall from Chapter 6 that a subdivided population has a total variance effective size larger than an undivided population with the same census size. The same is true for eigenvalue effective size (Maruyama 1972). Consequently, for the same number of individuals, Neλ is generally going to be larger for a subdivided population than for an undivided one. Moreover, population subdivision converts more of what would be nonadditive genetic variance in an undivided population into additive variance (Goodnight 1995). These impacts of population subdivision further increase the total additive genetic variance. Hence, population subdivision not only causes a partitioning of additive genetic variance within and among demes but it also increases the total amount of additive genetic variance found in the species as a whole given the same mutational input per generation. As we will see in Chapter 12, this impact of subdivision upon additive variance can have important adaptive consequences.
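The partitioning described by Eqs. (9.26) through (9.28) can be illustrated with a short calculation. This is only a sketch: the function name and the values of the eigenvalue effective size and the mutational variance are hypothetical, and the effective size is held fixed across values of Fst purely for illustration, even though, as noted above, it generally changes with subdivision.

```python
def equilibrium_additive_variance(n_e_lambda, var_m, f_st=0.0):
    """Return (V_within, V_among, V_total) from Eqs. (9.27) and (9.28)."""
    v_within = 2.0 * n_e_lambda * (1.0 - f_st) * var_m
    v_among = 4.0 * n_e_lambda * f_st * var_m
    return v_within, v_among, v_within + v_among


# Hypothetical parameter values (illustration only)
N_E_LAMBDA = 1000.0
VAR_M = 1e-3

for f_st in (0.0, 0.1, 0.5, 1.0):
    v_within, v_among, v_total = equilibrium_additive_variance(
        N_E_LAMBDA, VAR_M, f_st
    )
    print(f"Fst={f_st:.1f}  within={v_within:.2f}  "
          f"among={v_among:.2f}  total={v_total:.2f}")
```

At Fst = 0 the total reduces to Eq. (9.26), and at Fst = 1 all of the additive variance lies among demes and the total is twice the undivided value.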


10 Quantitative Genetics Measured Genotypes

As seen in the previous two chapters, Fisher developed a framework for analyzing the genetic contributions to phenotypic variability even when no genotypes were measured. In the same year that Fisher published his paper (1918), Payne (1918) studied the genetic basis of scutellar bristle number in Drosophila melanogaster by crossing two lines with different bristle numbers and following their response to selection. Payne made his inferences about the underlying genetic contribution to scutellar bristle number by monitoring the fate of several visible genetic markers scattered throughout the genome that displayed fixed differences between the original parental strains. By following these markers and overlaying bristle number data on the marker genotype data, Payne inferred that several loci affect bristle number. This was perhaps the first example of using measured genotypes to study the genetics of a multi-locus quantitative trait. Although the measured genotype approach is as old as the Fisherian unmeasured approach, the measured approach has only recently become widespread. There are two primary reasons for this new excitement over an old approach.

• Molecular genetics has revealed many polymorphic loci scattered throughout the genome of most species. As a result, genetic markers are now available for virtually any organism, and the scoring of the markers is more amenable to automation than the visible markers used by Payne. Moreover, much is known about the biochemical and/or physiological functioning of some of these genetic markers, which can allow the formulation of direct hypotheses about specific genotypes influencing specific phenotypes.
• Computing power is now available to implement intensive calculations.

This combination of molecular genetic and computer technology allows studies on quantitative traits that were impractical earlier and extends the applicability of measured genotype approaches in quantitative genetics to many more phenotypes and organisms. There are two main types of measured genotype approaches. First are marker association studies that measure genotypes at loci that are not known to directly affect the phenotype of interest but that may display indirect associations, typically through genetic linkage, with those loci that do influence the phenotypic variation. Payne’s study is an example of this approach. He had no reason to propose that the visible markers he was monitoring were in any way related to the phenotype of scutellar bristle number. However, in the context of a cross between two strains and subsequent reproduction, he reasoned that any marker located on the same chromosome as a locus actually affecting scutellar bristle number would display phenotypic associations through genetic linkage. Marker association studies can be divided into three major subtypes on the basis of

Population Genetics and Microevolutionary Theory, Second Edition. Alan R. Templeton. © 2021 John Wiley & Sons, Inc. Published 2021 by John Wiley & Sons, Inc. Companion website: www.wiley.com/go/templeton/populationgenetics2


the primary factor that is thought to lead to the indirect associations between genetic variation at marker loci and the phenotypic variation of interest:

• Admixture Mapping. As seen in Chapters 3 and 7, past or ongoing admixture events between genetically differentiated populations induce extensive linkage disequilibrium. If the parental populations also display phenotypic differences that have a genetic basis, then marker loci that have large differences in allele frequency in the parental populations are expected to show indirect phenotypic associations in the admixed population through linkage disequilibrium with the loci actually contributing to the phenotypic differences between the parental populations. This type of marker study is appropriate for studying the genetics of between-population phenotypic differences.
• Markers of Linkage. Linkage can affect the cosegregation of genetically variable loci within a family or set of controlled crosses even in the absence of linkage disequilibrium between the loci at the population level. Therefore, studies of marker segregation in families with known pedigrees or controlled or known crosses provide a method for detecting indirect associations that do not depend upon a population’s evolutionary history or upon linkage disequilibrium in the population. This approach is appropriate for studying the genetics of both intra- and interpopulational phenotypic differences.
• Genome-Wide Association Studies. Linkage disequilibrium occurs within a single population. Indeed, as shown in Chapter 2, the very act of mutation creates disequilibrium. Disequilibrium induced by mutation generally decays very rapidly due to recombination under random mating and is therefore mostly found between polymorphic sites with a low recombination rate. As shown in Chapter 4, linkage disequilibrium can extend across larger segments of DNA if recent founder or bottleneck effects had occurred. As with markers of interpopulation disequilibrium, this approach makes use of a population’s past evolutionary history, but tight physical linkage is generally required to detect indirect phenotypic associations through intrapopulation linkage disequilibrium. This type of marker study is appropriate for studying the genetics of the phenotypic diversity found within a single population but can be extended to two or more subpopulations.

All of these marker approaches share the common theme of indirect associations influenced by physical genomic linkage between markers and causative loci. The second major type of a measured genotype approach is the candidate locus study in which prior information is used to implicate specific loci as likely to contribute to the phenotype of interest. The candidate locus approach is appropriate for studying the genetics of both intra- and interpopulational phenotypic differences. This approach can be, and often is, integrated with the marker loci approach. In choosing markers for a marker association study, an investigator can include genetic markers at known candidate loci. Also, when marker studies reveal a segment of the genome that appears to contribute to phenotypic variation, that segment can be screened in detail for potential candidate loci, which then become the focus of subsequent studies. The measured genotype approaches offer greater insight into the underlying genetic architecture of phenotypic variation than is possible with unmeasured or marker approaches. Moreover, candidate locus studies allow aspects of the genotype-to-phenotype mapping to be studied and estimated that are inaccessible to the unmeasured approaches. For example, we saw in Chapter 8 that a measured genotype approach using the candidate ApoE locus allowed the estimation of genotypic values, a simple quantitative genetic parameter that is difficult or impossible to estimate with unmeasured genotype approaches. Also, interactions among factors, both genes interacting with genes and genes interacting with environments, are much easier to study with measured


than with unmeasured genotype approaches. Finally, both types of measured genotype approaches offer the promise of actually identifying the specific loci that contribute to phenotypic variation in the population, thereby opening up many more research possibilities. We will now discuss in more detail these measured genotype approaches in quantitative genetics.

Marker Association Studies

Admixture Mapping

The study of phenotypic associations with genetic markers in recently admixed populations is the measured genotype analog to the use of controlled or known crosses to study between population differences with the unmeasured approaches (Chapter 9). As with the unmeasured approaches, the primary focus is typically on the genetic basis of phenotypic differences observed between the parental populations. The advantage of the measured approach over the unmeasured approach is that it can be applied to cases of natural hybridization or admixture. With uncontrolled admixture events, we can often identify an admixed population but cannot characterize the cross status of specific individuals. To implement the unmeasured approach illustrated in Table 9.2 in the previous chapter, it was essential to classify individuals as coming from specific crosses such as F1, F2, and various backcrosses. After a few generations of admixture, such a simple categorization is usually impossible, and the unmeasured genotype approach described in Chapter 9 cannot be used. However, molecular markers can be used to estimate the degree of admixture for particular individuals within a population with recent or ongoing admixture and to identify the specific parts of an individual’s genome that carry chromosome segments from a specific parental ancestral population (Chapter 6).

African Americans provide an example of an uncontrolled admixture between western European populations and western, tropical African populations that has been going on for about 350 years (Jin 2015; Chapter 6). This ongoing admixture creates a complex genetic situation in which current individuals can vary greatly in their percentages of ancestral genome segments, and these segments in turn can vary greatly in size and location due to recombination and segregation over many generations. In order to identify the ancestral origins of genomic segments in present-day individuals, it is first necessary to have ancestral informative markers (AIMs), that is, polymorphic sites that have large allele frequency differences between the ancestral populations. In this genomic age, AIMs can be found even for a species with little overall geographic differentiation, such as humans (Chapter 6). For example, as of the end of 2019, there were over 335 million validated SNPs in the human genome, and only a few thousand are needed to provide the coverage needed for an admixture mapping study in African Americans. Because of isolation by distance and other restrictions to gene flow (Chapter 6) and because of the recency of some mutations that have not yet had time to spread far (Chapter 7), it is almost always possible to find a few thousand AIMs out of millions of markers even in a species displaying little overall differentiated population structure. Once AIMs have been identified, the program developed by Bercovici et al. (2008) can be used to choose those AIMs that cover the genome in a manner that allows the identification of ancestral blocks.

The act of measuring genotypes does not necessarily avoid the problem of spurious associations that can be created when populations and environments are nonrandomly associated. Hence, this approach is best when it can be demonstrated that the individuals from the admixed population also live in a common environment. Under these circumstances, such studies can examine the genetic


basis of between-population differences in a manner that eliminates or reduces the confoundment that arises when the ancestral populations have both genetic and environmental differences.

African Americans and European Americans differ greatly in their risk for developing nondiabetic end-stage kidney disease (ESKD), a disease that requires either kidney dialysis or a kidney transplant to avoid death. An individual’s racial identity has a major impact on many environmental factors that an individual experiences in America, so admixture studies on African Americans are generally limited to individuals who self-identify as African Americans (note, “race” here is used only as a cultural term, as biological races do not exist in humans, Templeton 2013, 2018a). Shlush et al. (2010) performed an admixture mapping study on 576 self-identified African Americans with ESKD, about half of whom had diabetes. Using the program of Bercovici et al. (2008), they assembled a panel of 2016 AIMs to identify the genomic regions of European versus west African ancestry in their African American sample. The logic behind admixture mapping is shown in Figure 10.1. One looks for genomic regions in which ancestry from one parental population is shared with high probability across the individuals with the phenotype of interest (non-diabetic ESKD in this case) but is not shared with high probability across the individuals without the phenotype of interest (diabetic ESKD in this case). Shlush et al. (2010) found only one genomic region on chromosome 22 that was statistically in excess for African origin genome segments in the non-diabetic ESKD subjects.
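The logic of comparing ancestry between cases and controls can be illustrated with a toy calculation. Everything in this sketch is hypothetical: the ancestry calls are invented, the windows are arbitrary, and a simple difference in ancestry fractions stands in for the formal statistical tests used in real admixture mapping studies. It is not the procedure of Shlush et al. (2010).

```python
# Hypothetical local-ancestry calls (illustration only). Each row is one
# individual, each column is one genomic window, and each entry is the number
# of chromosome copies (0, 1, or 2) inferred to derive from one of the two
# ancestral populations at that window.
cases = [
    [1, 0, 2, 1, 0],
    [0, 1, 2, 0, 1],
    [1, 1, 2, 1, 0],
    [0, 0, 1, 1, 1],
]
controls = [
    [1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 1, 1],
]


def ancestry_fraction(group):
    """Fraction of chromosomes carrying the focal ancestry in each window."""
    n_chromosomes = 2 * len(group)  # diploid individuals
    return [sum(window) / n_chromosomes for window in zip(*group)]


case_fraction = ancestry_fraction(cases)
control_fraction = ancestry_fraction(controls)
excess = [c - k for c, k in zip(case_fraction, control_fraction)]
best_window = max(range(len(excess)), key=lambda i: excess[i])

print("excess ancestry in cases by window:", excess)
print("window with the largest excess:", best_window)
```

In these invented data, only the third window (index 2) shows an excess of the focal ancestry in cases, which is the kind of signal that would prompt denser genotyping of that region.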
They therefore added several more SNPs on chromosome 22 to refine the location and mapped the association to a small region that contained only a few genes: MYH9, a gene that codes for a cytoskeletal protein in many cell types (including podocytes, the kidney cell type that degenerates in ESKD) and that contained the SNP with the strongest association, and four genes of the ApoL family, with ApoL1

Figure 10.1 A hypothetical chromosome from eight individuals from an admixed population with four having a phenotype of interest (cases) and four without (controls). Chromosomal blocks derived from one of the ancestral populations are shown in white, and gray indicates blocks from the other ancestral population. The chromosomal region indicated by a thick vertical line shows an elevated frequency of the phenotype of interest in chromosomal blocks indicated by gray compared to blocks indicated by white, indicating a genetic association in this region with the case phenotype due to ancestry from the ancestral population indicated by gray.

being tightly linked to MYH9 and coding for a cell surface protein that is not normally expressed in podocytes. Another group (Kopp et al. 2008) independently performed admixture mapping on African Americans with ESKD and found the same strong SNP association in MYH9 as Shlush et al. (2010), but unlike Shlush et al., Kopp et al. declared “MYH9 is a major-effect risk gene” for ESKD. In contrast, the laboratory in which the work of Shlush et al. was performed took a more cautious approach and performed follow-up studies on the detailed patterns of linkage disequilibrium in this region and on the frequencies of SNPs and haplotypes in this region in other populations, including African populations that were not at increased risk for non-diabetic ESKD (Tzur et al. 2010). Such additional studies are essential before concluding that a specific gene is the “risk gene.” First, as pointed out in Chapter 2, the pattern and magnitude of linkage disequilibrium in many small genomic regions is determined mostly by evolutionary history, both the history of mutational origins creating linkage disequilibrium on a specific genetic background and the history of subsequent mutational homoplasies reducing linkage disequilibrium without recombination. As a consequence, the pattern of linkage disequilibrium among markers within a small genomic region often has little to no correlation with the physical positions of the markers (Templeton 1999a). Hence, a strong association with a marker in one gene does not mean that this gene is causative, nor does it mean that other nearby genes can be excluded. Second, surveying additional populations places the markers and haplotypes associated with the phenotype of interest into a broader evolutionary context that can be quite illuminating. Such turned out to be the case here. Tzur et al.
found that ApoL1 and MYH9 do indeed define a high linkage disequilibrium region in which recombination is apparently rare and in which evolutionary history would play the dominant role of linkage disequilibrium patterns rather than physical location. Indeed, in their detailed analysis of this region, the intron SNP in MYH9 did not show the strongest association with ESKD; rather, the strongest associations in this genomic region were found with two different missense variants in ApoL1 that are in almost perfect negative linkage disequilibrium (i.e. the two risk variants are almost never found on the same chromosome). Tzur et al. also showed that the “risk” SNP in MYH9 was also found in high frequency in African populations that were not at increased risk for non-diabetic ESKD, whereas the two missense variants in ApoL1 were found only in African populations at increased ESKD risk and were absent in African and European populations not at increased disease risk. Subsequent studies in human kidney cell tissue culture and animal models strongly indicate that the ApoL1 risk alleles are indeed causative and that MYH9 plays no role in ESKD (Anderson et al. 2015; Olabisi et al. 2016). How could a gene that is not normally expressed in the kidney contribute to kidney disease? Subsequent studies revealed that ApoL1 can indeed be transcribed in kidney cells in the presence of interferon, a group of signaling proteins made and released by the body in response to certain viral infections. When ApoL1 is expressed in kidney cells, cell death is induced when the risk alleles are present (Nichols et al. 2015). This work suggests that non-diabetic ESKD requires two hits: having one or more of the risk alleles at ApoL1 and having certain viral infections that can induce expression of ApoL1 in the kidney cells, much like the situation outlined in Table 8.1 with factor A1 in that table corresponding to the risk alleles and factor B1 corresponding to certain viral infections. 
This model is consistent with the pattern of observed risks, namely, that although about 70% of African Americans with non-diabetic ESKD have one or both risk alleles, only about 10% of African Americans with the risk alleles progress to ESKD (Kruzel-Davila et al. 2017). However, when the individual is infected with HIV, a retrovirus, this risk increases to 50% for those individuals with the ApoL1 risk alleles versus 2.5–4% for HIV-infected individuals without the ApoL1 risk alleles (Kruzel-Davila et al. 2017). Accordingly, much of the ongoing clinical research on this system is focusing on other viral candidates that could interact with the ApoL1 risk alleles to induce ESKD (Kruzel-Davila et al. 2017).


There are three major lessons to be learned from the ESKD example that are applicable to virtually all measured genotype quantitative genetic studies:
1) Association, even when coupled with plausibility, ≠ causation.
2) The probability of the trait given the gene ≠ the probability of the gene given the trait.
3) Traits arise from interactions between genes and environmental factors (Premise 3 from Chapter 1).
The tendency in many modern quantitative genetic studies is to jump from association to causation, resulting in paper after paper announcing the discovery of gene X for trait Y. In actuality, causation is only rarely established and typically requires additional observations and experiments that lie outside the realm of quantitative genetics, just as were needed to show that variants in ApoL1 are causative. Lesson 2 is also often ignored. Most quantitative genetic studies only measure the probability of the gene given the trait. Even when this probability is high, such as it was for the ApoL1 risk alleles in African Americans given that they had non-diabetic ESKD, this does not necessarily mean that bearers of these genes have a high probability of displaying the trait. Indeed, most bearers of these ESKD risk alleles never display ESKD. The probability of the trait given the gene is far more relevant clinically and evolutionarily but is only rarely determined, and typically, additional data need to be gathered to estimate it. Finally, Premise 3 from Chapter 1 should never be ignored even when the genetic associations are exceedingly strong. The association of the risk alleles at ApoL1 with non-diabetic ESKD was very strong, but the environmental factor of viral infections coupled with the risk alleles was likewise important for non-diabetic ESKD.
These three lessons are the three fundamental rules of quantitative genetic research that all workers in this area need to engrave upon their foreheads (metaphorically, of course).
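Lesson 2 can be made concrete with Bayes' rule, P(trait | gene) = P(gene | trait) × P(trait) / P(gene). The sketch below uses the two figures quoted above (about 70% of non-diabetic ESKD patients carry a risk allele, but only about 10% of carriers progress to ESKD). The carrier frequency and disease prevalence in the code are hypothetical placeholders, chosen only so that the arithmetic reproduces those two figures; they are not estimates from the literature.

```python
def prob_trait_given_gene(p_gene_given_trait, p_trait, p_gene):
    """Bayes' rule: P(trait | gene) = P(gene | trait) * P(trait) / P(gene)."""
    return p_gene_given_trait * p_trait / p_gene


# Figures quoted in the text (Kruzel-Davila et al. 2017).
p_gene_given_trait = 0.70   # risk-allele carriers among non-diabetic ESKD cases
p_trait_given_gene = 0.10   # carriers who progress to ESKD

# The two conditional probabilities together fix the ratio P(trait) / P(gene).
implied_ratio = p_trait_given_gene / p_gene_given_trait

# HYPOTHETICAL values, chosen only to satisfy that ratio.
p_gene = 0.35    # assumed frequency of risk-allele carriers
p_trait = 0.05   # assumed frequency of the trait in the same population

print("P(gene | trait) =", p_gene_given_trait)
print("P(trait | gene) =", prob_trait_given_gene(p_gene_given_trait, p_trait, p_gene))
print("implied P(trait) / P(gene) =", implied_ratio)
```

The two conditional probabilities coincide only when the trait and the gene are equally common; whenever carriers greatly outnumber affected individuals, a high P(gene | trait) is compatible with a low P(trait | gene).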

Markers of Linkage

Physical linkage between two loci affects their pattern of cosegregation in a pedigree or set of controlled crosses regardless of whether or not there is linkage disequilibrium between the two loci in the population as a whole. Consequently, if one of these loci is a measured marker locus and the other is an unmeasured polymorphic quantitative trait locus (QTL) that contributes to phenotypic variation, then the pattern of segregation at the marker locus in a pedigree or controlled crosses will be associated with phenotypic differences among individuals within the pedigree or set of crosses. The strength of this association depends upon two factors:

• the magnitude of the phenotypic impact of the QTL and
• the amount of recombination between the marker locus and the QTL.

An experimenter has no control over the first of these factors, but the second can be controlled to some extent by the choice of the number and the genomic locations of the marker loci. Ideally, enough markers should be studied to cover the entire genome. Minimally, this means markers every 20 cM, which implies that any QTL will be ≤10 cM from a marker (and therefore double crossovers will not be important in most species). Instead of using a single-site analysis of association, it is also possible to combine information from two or more sites into an integrated analysis of linkage association. One of the most common approaches of this type is interval mapping in which a QTL is hypothesized to lie between two adjacent markers and the likelihood of the QTL being at various intermediate positions between the

Figure 10.2 A hypothetical quantitative trait locus (QTL) (X) located between two adjacent marker loci, A and B. The recombination frequency between A and B is r and that between locus A and the QTL is rx. [Diagram: marker locus A, QTL X, and marker locus B in that order along the chromosome, with rx spanning A to X, r − rx spanning X to B, and r spanning A to B.]

flanking markers is statistically evaluated. For example, consider the simple case of a controlled cross design in which two inbred strains are crossed to produce an F1, which in turn is then backcrossed to one of the parental strains. Let the marker alleles from one parental strain be designated by capital letters and those from the other strain by small letters, with the backcross being to this latter strain. Now, consider two marker loci, say locus A and locus B, that are adjacent to one another on the chromosome map with a recombination frequency of r. We hypothesize a QTL, say locus X, in between these two marker loci. Assume that one parental strain is fixed for the X allele and that the other strain is fixed for the x allele at this hypothetical QTL. Assume that the genotypic value for the phenotype being measured in the backcross progeny is GXx for individuals with genotype Xx and Gxx for individuals with genotype xx. Assuming that the recombination frequency r between the markers is sufficiently small that double crossovers can be ignored (this assumption can be dropped, but then one needs a mapping function), the recombination frequency between marker A and QTL X is rx and the recombination frequency between marker B and the QTL is r − rx, as shown in Figure 10.2. Given these assumptions, Table 10.1 shows the expected phenotypic means for the observed marker genotypes as a function of the hypothesized phenotypic effects of the QTL at the hypothesized map position of rx from marker locus A. Similar models exist for F2 crosses and pedigree data, although they are generally more complex. The procedure of interval mapping has also been extended to make use of information from several markers at once and not just the two flanking markers. The important point is that the fit of the observed phenotypes to the expected phenotypes depends upon both the hypothesized phenotypic impact of locus X and its chromosomal position.
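The dependence of the fit on both the QTL effect and its position can be illustrated with a small sketch of the backcross model in Figure 10.2 and Table 10.1. The first function returns the expected marker-class means from Table 10.1. The position scan then uses a regression shortcut (regressing phenotype on the probability that an individual is Xx given its marker class, in the style of Haley-Knott regression) rather than a full mixture likelihood, so it is an approximation swapped in for brevity and not the procedure of any particular study. The data are artificial and noise is added in a fixed alternating pattern so that the example is deterministic; all function and variable names are illustrative.

```python
import math


def expected_marker_means(r, rx, g_Xx, g_xx):
    """Expected phenotypic mean of each backcross marker class (Table 10.1)."""
    if not 0.0 <= rx <= r:
        raise ValueError("rx must lie between 0 and r")
    return {
        "AB/ab": g_Xx,
        "Ab/ab": ((r - rx) * g_Xx + rx * g_xx) / r,
        "aB/ab": (rx * g_Xx + (r - rx) * g_xx) / r,
        "ab/ab": g_xx,
    }


def prob_Xx(r, rx):
    """P(QTL genotype is Xx | marker class), ignoring double crossovers."""
    return {"AB/ab": 1.0, "Ab/ab": (r - rx) / r, "aB/ab": rx / r, "ab/ab": 0.0}


def lod_at(r, rx, data):
    """Approximate LOD score for a QTL at position rx.

    data is a list of (marker_class, phenotype) pairs. Assuming normal
    residuals, the likelihood ratio of the QTL model to the no-QTL model is
    (RSS0 / RSS1) ** (n / 2), so LOD = (n / 2) * log10(RSS0 / RSS1).
    """
    p = prob_Xx(r, rx)
    x = [p[marker] for marker, _ in data]
    y = [value for _, value in data]
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # estimates GXx - Gxx
    intercept = mean_y - slope * mean_x    # estimates Gxx
    rss1 = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    rss0 = sum((yi - mean_y) ** 2 for yi in y)
    return (n / 2) * math.log10(rss0 / rss1)


# Artificial backcross: r = 0.20 between markers, QTL placed at rx = 0.05.
r, true_rx = 0.20, 0.05
means = expected_marker_means(r, true_rx, g_Xx=12.0, g_xx=10.0)
# Class counts follow the expected frequencies (1 - r)/2 and r/2 for n = 20.
counts = {"AB/ab": 8, "Ab/ab": 2, "aB/ab": 2, "ab/ab": 8}
data = [
    (marker, means[marker] + (0.5 if i % 2 == 0 else -0.5))
    for marker, k in counts.items()
    for i in range(k)
]

# Scan hypothesized positions between the two markers.
grid = [i / 100 for i in range(1, 20)]
lods = {rx: lod_at(r, rx, data) for rx in grid}
best_rx = max(lods, key=lods.get)
print("best-supported position rx =", best_rx)
```

With only the two flanking markers, the information about position comes entirely from the two recombinant classes (Ab/ab and aB/ab), which is why large samples or additional markers are needed to localize a QTL within an interval.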
This fit is typically measured by a likelihood ratio test (Appendix B) of the hypothesis that there is a QTL with a phenotypic effect at position rx (GXx ≠ Gxx in the backcross model shown in Table 10.1) versus the null hypothesis of no phenotypic effect at position rx (GXx = Gxx in the backcross model). The LOD score is the logarithm to the base 10 of the likelihood ratio (Appendix B) of a QTL at a specific position in the genome. Because many such tests are performed and they are not independent, various procedures are used to determine which LOD scores are statistically significant at the level of the entire genome (e.g. Cheverud 2001). Sometimes, the p-values of the likelihood ratio tests or some transformation of the p-values are plotted instead of LOD scores, but these are all alternative ways of showing the level of statistical significance. As noted at the end of Chapter 7 and illustrated by the Drosophila heteroneura/silvestris example in Chapter 8, many species pairs are reproductively compatible. As a result, population and


Table 10.1 The observed and expected phenotypic means of the measured marker genotypes at two adjacent loci in a backcross experiment.

| Observed Marker Genotype | Observed Average Phenotype | Possible Genotypes When QTL Is Included | Expected Frequency Assuming Model in Figure 10.2 | Expected Phenotype Given QTL with Genotype Values of GXx and Gxx |
| --- | --- | --- | --- | --- |
| AB/ab | GAB | AXB/axb | ½ (1 − r) | GXx = GAB |
| Ab/ab | GAb | AXb/axb; Axb/axb | ½ (r − rx); ½ rx | [(r − rx)GXx + rxGxx]/r |
| aB/ab | GaB | aXB/axb; axB/axb | ½ rx; ½ (r − rx) | [rxGXx + (r − rx)Gxx]/r |
| ab/ab | Gab | axb/axb | ½ (1 − r) | Gxx = Gab |

quantitative genetic studies are increasingly relevant to problems in macroevolution, particularly with respect to the evolutionary processes of speciation and the genetic basis of species differences. Linkage marker studies have been particularly powerful in addressing this latter problem. Byers et al. (2020) provide an example of such a study. They focused on two interfertile species of Heliconius butterflies, Heliconius melpomene, and Heliconius cydno (Figure 10.3). Mating rituals in these species involve many signals, including the use of male aphrodisiac pheromones during courtship that are necessary for successful mating. These pheromones are located in a special area of the male hindwing called the androconia (left side of Figure 10.3). One of these potential pheromones, octadecanal, is present in H. melpomene males but almost completely absent in H. cydno males (right side of Figure 10.3). Octadecanal elicits significant physiological and behavioral responses in females of both species. Byers et al. (2020) therefore examined the genetic basis of this

Figure 10.3 The left side shows the dorsal forewings and male hindwings for Heliconius melpomene and Heliconius cydno, including the silvery androconial region of the hindwing used during male courtship. The right side shows the androconial chemistry through an ion chromatogram (x-axis: time in minutes, 25 to 45) of H. melpomene (top) and H. cydno (bottom) that reveals the following chemicals: P1, syringaldehyde; P2, octadecanal; P3, 1-octadecanol; P4, henicosane; P5, (Z)-11-icosenal; P6, (Z)-11-icosenol; P7, (Z)-13-docosenal; IS, internal standard (2-tetradecyl acetate); x, contaminant; C21, henicosane; C22, docosane; C23, tricosane. Source: Taken from figure 1, p. 355 in Byers et al. (2020). © 2020, John Wiley & Sons.

phenotypic difference in the presence versus absence of octadecanal between these two species by making F1’s and backcrosses in both directions. A linkage map was constructed by first mapping markers onto the H. melpomene reference genome followed by map construction using 447 818 SNPs on the backcrosses. To facilitate computations, markers were thinned evenly across the genome to yield 44 782 SNPs with no missing data that were used to perform the actual marker linkage analysis for the phenotype of the amount of octadecanal in male hindwings. Genomic significance thresholds were obtained by permutation testing (Appendix B). The analysis was performed separately on the two backcross samples, with both backcrosses analyzed with all individuals pooled and with kinship structure taken into account (several different pairs or families were used to construct the backcrosses). Figure 10.4 shows the results of the linkage mapping across the genome, revealing a single significant QTL on chromosome 20. Both backcrosses independently indicated this QTL to be on chromosome 20. The significant region shown in Figure 10.4 spans 3.4 Mb and contains 160 genes (Byers et al. 2020). Hence, although QTL refers to a quantitative trait locus, genome scans do not actually identify loci but rather chromosomal regions, as also previously shown for admixture mapping. Of these 160 genes, 16 were identified as likely candidate genes based on known biosynthetic pathways, but as the ApoL1/MYH9 example revealed, all 160 should still be regarded as candidates. Our knowledge of gene function is often crude and very incomplete, and even when some knowledge exists, it can be misleading, as was the case for MYH9 and kidney disease. It is also possible that more than one locus in this region is influencing the phenotype of interest, but their effects are not

Figure 10.4 A quantitative trait locus (QTL) map for the production of octadecanal in backcrosses of Heliconius melpomene and Heliconius cydno (y-axis: LOD score; x-axis: chromosomes 01 to 21). Red lines indicate analyses that ignored the family structure within the backcrosses (without kinship), and blue lines indicate analyses that corrected for kinship structure (LMM with LOCO). The straight lines show the genome-wide significance thresholds for different type I error rates (α = 0.05, 0.01, and 0.001), with red being the thresholds that ignore kinship and blue the thresholds that incorporate kinship.

separable in the QTL genome scan. This possibility is illustrated by the study of Steinmetz et al. (2002) on the genetic basis of phenotypic variation in the ability to grow under high temperatures in two yeast strains of Saccharomyces cerevisiae. They performed a high-resolution linkage scan of the haploid offspring of hybrids between these two strains, using 3444 markers with an average interval of 1.2 cM. They found two significant QTL regions, one of which corresponded to an interval of 32 kb. The entire 32 kb interval was then sequenced in 12 haploid strains, 6 each of opposite phenotypes for growth under high temperatures. This information when combined with additional genetic tests allowed them to identify three tightly linked loci in this interval. No one locus was either necessary or sufficient for growth at high temperatures; rather, the phenotypic effects of this single QTL region emerge from the joint effects of all three loci with complex cis and trans effects among them. This shows that the acronym QTL is a misleading one; a better acronym would be QTR, a quantitative trait region. However, the term QTL is firmly engrained in the literature, but readers of this literature should always keep in mind that QTL in general refers to a multigene region rather than a single locus. Only a handful of studies actually identify a single locus. Just as mapping studies can make use of more than one marker at a time to detect associations, it is also possible to use more sophisticated genetic models of how QTLs contribute to phenotypic variation. Many models treat each QTL as a single locus with two alleles making its own additive contribution to the phenotype, as in Table 10.1. This is a highly simplified genetic architecture, and more complicated mappings of genotype to phenotype can be used. For example, van der Knaap et al. 
(2002) crossed a strain of cultivated tomato with highly elongated fruit to a wild tomato relative that produces nearly spherical fruit and went on to breed F2’s. Ninety-seven markers showing fixed differences between the original tomato strains were used to cover the entire tomato genome with an average interval of 12.8 cM. Both single-site and interval mapping approaches were used to analyze 85 F2 plants assuming an additive QTL model. The interval mapping analysis revealed four QTLs influencing fruit shape, each on a different chromosome. After identifying these four QTLs, van der Knaap et al. (2002) looked for epistasis between them for the phenotype of fruit shape and found strong, highly significant interactions. They therefore concluded that the four QTLs can be viewed as functionally related in that they control one aspect of fruit shape: the degree of eccentricity. Thus, the original genetic architecture assumed for the initial mapping was shown to be significantly incomplete. One difficulty with the approach of van der Knaap et al. (2002) is that all the QTLs examined for epistasis were identified solely on the basis of their additive effects on the phenotype. However, suppose that the phenotypic contribution of two loci was primarily through their interactions, with only small marginal or additive effects at each locus. In such a situation, a mapping analysis that examined only one marker site at a time or even one map interval at a time would not detect the QTLs in the first place, particularly given the statistical penalties associated with multiple testing over the entire genome. Therefore, just to incorporate pairwise epistasis into the genetic architecture requires a more refined analysis that examines pairs of markers or intervals. Such an analysis is computationally and analytically much more difficult and moreover generally requires much larger sample sizes.
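The scenario of strong interaction with negligible marginal effects can be shown with a deliberately extreme toy example. The genotypic values below are hypothetical and describe a backcross at two unlinked QTLs, so that the four two-locus genotypes are equally frequent; the function names are illustrative only.

```python
# HYPOTHETICAL genotypic values for a backcross at two unlinked QTLs.
# Key: (genotype at locus 1, genotype at locus 2), with 0 = xx and 1 = Xx.
values = {(0, 0): 10.0, (0, 1): 12.0, (1, 0): 12.0, (1, 1): 10.0}


def marginal_effect(genotypic_values, locus):
    """Mean of the Xx class minus mean of the xx class at one locus,
    averaging over the other locus with equal genotype frequencies."""
    hi = [v for g, v in genotypic_values.items() if g[locus] == 1]
    lo = [v for g, v in genotypic_values.items() if g[locus] == 0]
    return sum(hi) / len(hi) - sum(lo) / len(lo)


def interaction_contrast(genotypic_values):
    """Two-locus interaction contrast; zero when effects are purely additive."""
    g = genotypic_values
    return g[(1, 1)] - g[(1, 0)] - g[(0, 1)] + g[(0, 0)]


print("marginal effect, locus 1:", marginal_effect(values, 0))
print("marginal effect, locus 2:", marginal_effect(values, 1))
print("interaction contrast:", interaction_contrast(values))
```

A scan that tests one locus at a time sees no difference between the xx and Xx classes at either locus, even though the two loci jointly determine the phenotype; only an analysis of genotype pairs would reveal them.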
Larger sample sizes are required because in a single-locus, two-allele model, the sample is subdivided into two (backcrosses) or three (F2’s) genotypic categories. However, with just pairwise epistasis, there are four genotypes in a backcross and nine F2 genotypic categories, so the same amount of data is now distributed over more genotypic categories, with a resulting reduction in statistical power. The importance of searching for epistasis is illustrated by the work of Peripato et al. (2004) on the phenotype of litter size in mice. They performed a genome scan on the litter size of 166 females from an F2 cross between two inbred strains of mice. Standard interval mapping identified two


significant QTLs on chromosomes 7 and 12. These two QTLs account for 12.6% of the variance in litter size. They next reanalyzed the same data with a genome-wide epistasis scan based on a two-locus interaction model and found eight epistatic QTLs on chromosomes 2, 4, 5, 11, 14, 15, and 18 that explained 49% of the variance in litter size. Note that the regions found by the standard interval mapping that assumes no epistasis were not involved in the epistatic component of the genetic architecture. Thus, searching for significant QTLs under the assumption of no epistasis and then looking for epistasis between the identified QTLs (as done in the tomato study of van der Knaap et al. 2002) can miss most of the epistasis that contributes to phenotypic variance. Note also that the single-locus additive model genome scan only explained 12.6% of the phenotypic variance, whereas the two-locus epistatic model genome scan explained 49% of the phenotypic variance. This indicates the importance of epistasis as a component of the genetic architecture of litter size in this mouse population. These experiments with mice reveal a major limitation of the measured marker approach to quantitative genetics. Because the loci directly affecting the phenotype are not known a priori, a model for their phenotypic effects must always be invoked. Even simple models of single-locus, two-allele additive phenotypic effects allow us to detect many loci contributing to phenotypic variation, but the mouse experiments illustrate that what we see is dependent upon what we look for. The model of genotype to phenotype mapping imposes a major constraint upon all measured marker approaches and ensures that such approaches will give an incomplete and sometimes misleading picture of the genetic architecture underlying the phenotypic variation. Even when epistasis is specifically looked for between two genome regions, additional epistasis that depends upon three or more genes may go undetected.
Indeed, two- and three-locus epistasis was detected in the Drosophila marker-association experiments of Templeton et al. (1976), and the three-locus epistasis was not predictable from the two-locus interactions. Since most of the models used to search for epistasis are limited to interactions between two QTLs, we still tend to underestimate the significance of epistasis as a contributor to phenotypic variance and obtain only an incomplete view of genetic architecture. Also, recall the distinction between a QTR and a QTL. Almost all studies in this area identify QTRs, which could hide strong interactions among tightly linked loci, as shown by the work of Steinmetz et al. (2002). Indeed, this is an exceedingly likely possibility because the genomes of many species often have families of functionally related genes close together (Cooper 1999). All of these properties put together indicate that marker association studies are a flawed vehicle for illuminating genetic architecture, but nevertheless such approaches are yielding much more insight into genetic architecture than the unmeasured genotype approaches discussed in the previous chapter. Direct comparisons between the unmeasured genotype approach and the marker linkage mapping association studies are possible because both approaches require either pedigree data or controlled crosses. Consequently, the same data can be analyzed using both measured and unmeasured genotype approaches. For example, Ober et al. (2001) analyzed 20 phenotypes of clinical relevance in Hutterite (a religious group displaying a strong founder effect) pedigrees with both a marker association approach and the unmeasured genotype approach based on pedigree relatedness outlined in Chapter 9. They found no correlation between the classic heritabilities of these traits and the strengths of association found in the QTL mapping study, indicating that the two approaches were detecting different genetic contributors to phenotypic variation.
As pointed out in Chapter 9, the unmeasured genotype approaches detect the heritable genetic contributors to phenotypic variation. In contrast, marker association studies detect inherited genetic contributors to phenotypic variation. For example, marker association studies generally work best when the phenotype is primarily determined by a single Mendelian locus of large effect, as is the case for many autosomal recessive


genetic diseases. Yet, as pointed out in Chapter 9, such genetic diseases are generally not heritable. Therefore, these two quantitative genetic approaches are not designed to detect the same components of genetic variation contributing to phenotypic variation. The lack of correlation between heritability and QTLs in studies such as Ober et al. (2001) is surprising only to those who confuse inheritance with heritability. This is one source for a problem called missing heritability, which will be discussed in more detail in the next section.

Genome-Wide Association Studies

All the marker-association approaches in quantitative genetics depend upon a balance between physical linkage of markers to QTLs, which allows phenotypic associations with the markers to be detected, and recombination, which breaks up that association and thereby allows the QTLs to be mapped and localized. The ability to map a QTL to a position in the genome is limited by the number of observable recombination events. At one extreme, as we saw with the ApoL1 and ESKD example, inferences of physical location cannot be made from strength of association when dealing with a genomic region with little or no recombination. With the linkage marker approaches discussed in the previous section of this chapter, there are generally only one or maybe two generations for recombination events to occur in pedigrees and controlled crosses (recall Table 10.1), so mapping ability is rather coarse. This allows investigators to use relatively few markers to cover the genome, but the resulting QTLs often encompass hundreds of genes due to the lack of a large number of recombination events. Populations that have a history of admixture over many generations have accumulated many more recombination events, and this accumulated diversity in recombination events often allows admixture mapping to make more precise inferences about the genomic location of QTLs, as shown by the ESKD example. With linkage mapping, recombination is measured directly by a linkage map, as in the Heliconius example. Linkage maps are useful for pedigree and controlled-cross data, but to extend quantitative genetic analyses to a broader range of sampling designs, including samples of unrelated individuals, a different measure of recombination is needed. Equation 2.11 and Figure 2.5 show that linkage disequilibrium in a population tends to decrease as a function of the recombination rate.
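The contrast between one or two generations of recombination in a controlled cross and many generations in a population can be illustrated with the standard decay of linkage disequilibrium under random mating, D_t = D_0 (1 − r)^t. The sketch below assumes that Eq. 2.11 takes this standard form; the parameter values are hypothetical and the function names are illustrative.

```python
import math


def ld_after(d0, r, t):
    """Linkage disequilibrium after t generations of random mating,
    assuming the standard decay D_t = D_0 * (1 - r) ** t."""
    return d0 * (1.0 - r) ** t


def generations_to_fraction(r, fraction):
    """Generations needed for D to fall to the given fraction of D_0."""
    return math.log(fraction) / math.log(1.0 - r)


# HYPOTHETICAL illustration: a marker roughly 1 cM from a QTL (r = 0.01).
r = 0.01
for t in (1, 2, 10, 100, 1000):
    print(f"t = {t:4d} generations: D_t / D_0 = {ld_after(1.0, r, t):.4f}")
print("generations to halve D:", generations_to_fraction(r, 0.5))
```

Under this assumption, one or two generations leave the association between a marker and a QTL 1 cM away almost intact, so sparse markers suffice but localization is coarse, whereas many generations erode associations with all but the most tightly linked markers.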
Consequently, under a broad range of conditions, measures of linkage disequilibrium in a population can serve as an inverse proxy for the recombination rate, with high absolute values of linkage disequilibrium corresponding to small recombination rates and close physical distance in the genome. An example of this is shown in Figure 10.5. Linkage disequilibrium is influenced by the accumulation of many recombination events over many generations (Eq. 2.11 and Figures 2.5 and 5.22). However, phenotypic associations with markers quickly drop off with physical distance in the genome when there is a large number of recombination events. Many markers (hundreds of thousands or more) must therefore be used to ensure that all potential QTLs are tightly linked to a marker in order to map associations through linkage disequilibrium in a population. Genome-wide association study (GWAS) is based upon the premise that the strongest linkage disequilibrium between a QTL and markers will be between the markers that are tightly linked to the QTL and hence the phenotypic associations will be strongest for those observable markers. A typical GWAS performs a phenotypic regression across the allele dosage of each marker locus separately (Visscher and Goddard 2019). Hence, GWAS uses the classic Fisherian model to estimate the average effects of all marker alleles. All nonadditive effects are ignored in a standard GWAS. The results of a GWAS on 1857 house sparrows (Passer domesticus) for the phenotype of bill depth using 181 454 SNPs are shown in Figure 10.6 (Lundregan et al. 2018). Only a single SNP on chromosome 3 with p-value of 3.14 × 10−7 exceeded the genome-wide significance threshold with an average effect of 0.054 and explaining 1.6% of the variance. Since the SNP dosage regressions only

Figure 10.5 The decay of linkage disequilibrium as measured by r2 (Eq. 2.16; y-axis, 0.00 to 0.25) with recombination distance (x-axis, in cM, 0.05 to 0.35) in several human populations. The populations labeled ALB, CAN, CAB, CAR, MON, and ROA represent small isolated villages from an Apennine valley in northwestern Italy; Valley is the conglomerate of all of these isolates; VER represents the Italian population from the Veneto region; TSI the Italian population from Tuscany; CEU the European population; and YRI a large African population. Source: Colonna et al. (2013). © 2012, Macmillan.

capture additive effects, this means that the heritability of bill depth revealed by the significant associations in the GWAS was 0.016. Classical quantitative genetic studies based on pedigrees had estimated the heritability of bill depth as 0.333 in males and 0.299 in females (Jensen et al. 2008), which are far greater than 0.016. The small heritability found in the house sparrow GWAS for bill depth versus the much larger heritability found in pedigree studies is a common phenomenon in comparisons of classical quantitative genetic analyses versus GWAS: the problem of missing heritability. There are many factors that could explain why GWAS typically detects only a small fraction of the heritability. One factor relates to the goals of these two types of analyses. In classical quantitative genetics, the goal is often to estimate the heritability itself, whereas, in GWAS, it is to identify specific inherited regions in the genome that have a significant association with the phenotype, as mentioned in the previous section. To achieve this latter goal, GWAS must impose a severe statistical threshold to identify significant genomic regions, as shown in Figure 10.6. One way of matching the goals of the two analyses is to use all the GWAS SNP regressions regardless of their p-values to explain as much of the additive variance as possible through SNPs without trying to identify individual SNPs with significant associations. The resulting GWAS heritability estimate using all SNPs is called the genomic heritability. Lundregan et al. (2018) estimated the genomic heritability of bill depth in these sparrows as 0.35, a value close to the classical heritability estimates. Hence, in this case, all of the missing heritability was recovered at the whole genome level. In most cases, the missing heritability is not fully recovered just by eliminating significance thresholds.
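The per-marker regression on allele dosage that underlies a standard GWAS, and the fraction of phenotypic variance that a single SNP explains, can be sketched as follows. This is a bare ordinary least-squares illustration on hypothetical toy data; an actual GWAS such as that of Lundregan et al. (2018) would also include covariates, account for relatedness, and apply a genome-wide significance threshold. The function and variable names are illustrative only.

```python
def dosage_regression(dosages, phenotypes):
    """Regress phenotype on allele dosage (0, 1, or 2 copies of an allele).

    Returns (slope, r_squared): the slope estimates the average effect of
    the allele, and r_squared is the fraction of phenotypic variance
    explained by this SNP under a purely additive model.
    """
    n = len(phenotypes)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(dosages, phenotypes))
    tss = sum((y - mean_y) ** 2 for y in phenotypes)
    return slope, 1.0 - rss / tss


# HYPOTHETICAL toy sample: six individuals, two SNPs, one phenotype.
phenotypes = [1.0, 1.2, 1.5, 1.7, 2.0, 2.2]
snps = {
    "snp_1": [0, 0, 1, 1, 2, 2],
    "snp_2": [0, 1, 2, 0, 1, 2],
}

# A GWAS repeats the same regression separately for every marker.
results = {name: dosage_regression(d, phenotypes) for name, d in snps.items()}
for name, (slope, r2) in results.items():
    print(f"{name}: average effect = {slope:.3f}, variance explained = {r2:.3f}")
```

Because each SNP is fitted separately with a single dosage term, dominance and epistasis contribute nothing to the estimated effect beyond their additive component, which is why the variance explained by significant SNPs is read as a contribution to heritability.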
For example, human height has a heritability of about 0.8 in classical heritability analysis (Silventoinen et al. 2003), and several GWASs identified about 50 QTLs that collectively


Figure 10.6 Results of a GWAS on bill depth in house sparrows. SNPs are plotted on the x-axis according to their position on each numbered chromosome. Adjacent chromosomes by number alternate between black and gray circles, with the position of the circle on the y-axis indicating a transformed p-value, −log10(p-value), for the null hypothesis of no effect for the SNP. The dotted line indicates the threshold for 5% genome-wide significance. Source: Lundregan et al. (2018). © 2018, John Wiley & Sons.

account for a heritability of 0.05 (Gudbjartsson et al. 2008; Lettre et al. 2008; Weedon et al. 2008; Yang et al. 2010). The genomic heritability of height is 0.45 (Yang et al. 2010), so much but not all of the missing heritability was recovered by discarding the significance threshold. This is a common and expected result. de los Campos et al. (2015) showed that classical heritability, $h^2$, is related to genomic heritability, $h_g^2$, by the equation:

$$h_g^2 = h^2 \, \frac{\sigma^2_{gwas}}{\sigma^2_a} \qquad (10.1)$$

where $\sigma^2_{gwas}$ is the amount of additive variance explained by all the markers used in the GWAS and $\sigma^2_a$ is the classical additive variance. In general, we expect the ratio of these two additive variances to be less than one because markers are often not causative but are only detecting an association through linkage disequilibrium that is typically imperfect. Hence, even if the GWAS detected every genomic region that contributes to the phenotypic variance, we would still get $\sigma^2_{gwas} < \sigma^2_a$. The genetic architecture that emerges from GWAS will always be incomplete unless every causative variant is included as a marker (de los Campos et al. 2015). Another cause of missing heritability is rare alleles. There are three important features of rare alleles that can cause them to have a major effect on heritability as measured by pedigree studies but not in a GWAS of unrelated individuals. First, by definition, such alleles are rare in the population, but in a particular family or pedigree, an otherwise rare allele can be quite common and can be carried by many relatives. Second, although rare alleles are rare by definition, the category of rare alleles can be quite common, so many families can be affected by a rare allele, but, in general, the alleles (and even the loci) will be different in different families. For example, Mancuso et al. (2016) performed targeted resequencing on 63 autosomal regions that were candidates for prostate cancer risk from previous GWASs and found many rare SNPs (frequencies of 0.001–0.01) that


collectively explained 42% of the genetic variance in risk for prostate cancer. This study directly shows that rare alleles collectively can be a major source of additive variance. Third, there is a tendency for rare alleles to have large phenotypic effects. Many authors have sought biological reasons for why rare alleles have large phenotypic effects (e.g. rare alleles are more likely to be recent mutations with deleterious effects that have not yet been eliminated by natural selection), but much of the pattern of rare alleles having a large impact on additive variance is an artifact of just being rare. We saw in Table 8.1 that interaction systems attribute large marginal (additive) effects to rare components simply because they are rare. Fisher’s quantitative genetic model that underlies most GWAS also attributes larger phenotypic deviations to rare alleles than to common alleles simply because all measurements are taken as deviations from the mean. If a genetic factor is common, it has a disproportionate contribution to the mean and, hence, mathematically can show only small deviations from the mean. A rare factor, on the other hand, does not contribute much to the overall mean and hence mathematically can have a large deviation from that mean. To see this, go back to Table 8.4 and the ApoE example for total serum cholesterol. The average effect of the ε3 allele, the most common allele, is 0.26 mg/dl – a very small additive effect. In contrast, the average effect of the ε2 allele, the rarest allele, is −12.15 mg/dl – the largest additive effect in magnitude. Is this difference due to the biological properties of these two alleles? To see if this is the case, consider Table 10.2. This table is identical to part of Table 8.4 except that we have made ε2 a common allele with a frequency of 0.96 and ε3 and ε4 are now rare alleles with frequencies of 0.02 each. 
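The dependence of additive effects on allele frequency can be checked numerically. The following Python sketch recomputes the population mean and the average excesses from the genotypic values and altered allele frequencies of Table 10.2. It assumes Hardy–Weinberg genotype frequencies (random mating), so that the average excess of an allele is the frequency-weighted mean genotypic deviation over the alleles it can pair with. The function and variable names are illustrative and are not taken from the text.

```python
# Genotypic values (mg/dl) from Table 10.2; they do not depend on allele frequency.
genotypic_value = {
    ("e2", "e2"): 194.46, ("e2", "e3"): 201.03, ("e2", "e4"): 203.74,
    ("e3", "e3"): 213.40, ("e3", "e4"): 219.59, ("e4", "e4"): 223.07,
}
# Altered allele frequencies from Table 10.2 (e2 common, e3 and e4 rare).
allele_freq = {"e2": 0.96, "e3": 0.02, "e4": 0.02}


def value(a, b):
    """Genotypic value of the unordered genotype a/b."""
    return genotypic_value[tuple(sorted((a, b)))]


def hw_mean(freq):
    """Population mean phenotype under Hardy-Weinberg genotype frequencies."""
    return sum(freq[a] * freq[b] * value(a, b) for a in freq for b in freq)


def average_excess(allele, freq):
    """Mean genotypic deviation of gametes bearing `allele` under random mating."""
    mu = hw_mean(freq)
    return sum(freq[b] * (value(allele, b) - mu) for b in freq)


mean = hw_mean(allele_freq)
excess = {a: average_excess(a, allele_freq) for a in allele_freq}

print(f"mean = {mean:.2f} mg/dl")
for a, x in excess.items():
    print(f"average excess of {a} = {x:+.2f} mg/dl")
```

Rounded to two decimals, the sketch gives a mean of 195.11 mg/dl and average excesses of −0.33, 6.54, and 9.34 mg/dl for ε2, ε3, and ε4, in agreement with Table 10.2. Editing only `allele_freq` changes the average excesses even though `genotypic_value` stays fixed, which is the point of the comparison.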
Table 10.2 The ApoE example from Table 8.4 with the allele frequencies altered but retaining identical genotypic values for the phenotype of total serum cholesterol in mg/dl.

| Genotype | ε2/ε2 | ε2/ε3 | ε2/ε4 | ε3/ε3 | ε3/ε4 | ε4/ε4 | Sum or Mean |
|---|---|---|---|---|---|---|---|
| H.W. Frequency | 0.9216 | 0.0384 | 0.0384 | 0.0004 | 0.0008 | 0.0004 | 1 |
| Gi (mg/dl) | 194.46 | 201.03 | 203.74 | 213.40 | 219.59 | 223.07 | 195.11 |
| gi (mg/dl) | −0.65 | 5.92 | 8.63 | 18.29 | 24.48 | 27.96 | 0 |

| Gametes | ε2 | ε3 | ε4 | Sum or Mean |
|---|---|---|---|---|
| Frequency | 0.96 | 0.02 | 0.02 | 1 |
| ai (mg/dl) = αi | −0.33 | 6.54 | 9.34 | 0 |

Of particular importance is that we have retained the same genotypic values so that the mapping of genotype to phenotype is identical in these two tables. In other words, we have exactly the same biology in how ApoE genotypes affect the phenotype of total serum cholesterol – the only change is one of allele frequencies and nothing about the phenotypes of genotypes. Despite retaining the same genotype-to-phenotype mapping as found in Table 8.4, the average excess of the ε2 allele, now the common allele, is −0.33 mg/dl – the smallest additive effect in magnitude. In contrast, the ε3 and ε4 rare alleles now have large average excesses: 6.54 and 9.34 mg/dl (Table 10.2). Hence, rareness itself results in large additive effects in Fisher’s model. The above three attributes of rare alleles mean that in a classic quantitative genetic analysis based on pedigrees, rare alleles can make a substantial contribution to heritability. Most GWAS studies use random samples of individuals rather than individuals from pedigrees or crosses, and the rareness of these alleles in a random sample makes them virtually undetectable in a GWAS based on a random sample. Moreover, the SNP markers used in GWAS tend to be common by deliberate choice, but common alleles typically tag nearby rare alleles very poorly (Evans et al. 2018). This missing heritability can be recovered in a GWAS on pedigrees rather than a random population


sample, but pedigrees are impossible to obtain for many species and pedigree sampling is far more expensive, difficult, and time-consuming than random sampling of individuals. Hence, this source of missing heritability is not easily circumvented in many cases. A partial solution lies in the fact that although pedigree data are often not available, the genomic markers used in the GWAS can also be used to estimate the realized kinship among all pairs of individuals, as mentioned in Chapter 9. Incorporating the realized kinship matrix can recover some of the missing heritability, but there can still be a downward bias in heritability estimation when the study sample contains many remotely related individuals (Wang and Thompson 2019), as is typically the case for many GWAS samples. Pedigrees also provide information about the phenotypes of mating pairs and hence can be used to detect assortative and disassortative mating, which are common phenomena for many traits (Chapter 3). As shown in Chapter 3, assortative mating can induce linkage disequilibrium between alleles at different loci with similar phenotypic effects, and this in turn can substantially increase the additive genetic variance and heritability (Lynch and Walsh 1998). GWAS based on random samples cannot detect or correct for assortative mating, so assortative mating is yet another source of missing heritability. A standard GWAS is based on Fisher’s one-locus quantitative genetic model and performs multiple single-locus regressions. Epistasis between loci is ignored in such an analysis. This in turn is yet another source of missing heritability (Morgante et al. 2018) and is an even worse problem for polygenic prediction of phenotypes (Dai et al. 2020). Recall that epistasis between loci still contributes to additive variance (Chapter 8), but that without explicitly looking for epistasis, many loci are missed completely (e.g. Peripato et al. 2004, as discussed earlier).
Epistasis can greatly complicate GWAS both computationally and statistically. Recall that n marker SNPs induce a severe statistical penalty for multiple testing when n is very large, as needed for GWAS. However, even to look at pairwise interactions among these n SNPs requires examining ½(n² − n) potential pairwise interactions, which blows up both computational time and the statistical penalty for multiple testing. There are methods of greatly reducing this computational and statistical burden, such as eliminating combinations that have virtually no chance of being significant or finding subsets that can be reduced to a single variable that is independent of the remainder of the data; these methods allow an effectively complete GWAS with epistasis (e.g. Llinares-López et al. 2018; Niel et al. 2018). As with the study of Peripato et al. (2004) discussed earlier, these epistatic GWAS scans find many additional QTLs through epistasis that were totally missed by single SNP GWAS. Another method of reducing the number of contrasts is to focus upon interactions of a small number of candidate loci with the other GWAS markers, but this will be discussed in the candidate locus section. Multi-locus interactions can also be analyzed through a network approach using the custom correlation coefficient (CCC) discussed in Chapter 2 as a measure of linkage disequilibrium. Climer et al. (2014a) analyzed GWAS data on psoriasis, a common but complex skin disease in humans. Two independent data sets were analyzed to ensure replicability, both consisting of cases (individuals with psoriasis) and controls (individuals without psoriasis but matched as much as possible for several other variables, such as gender and age). Previous GWAS studies based on SNPs as units of analysis had identified several genes that are associated with psoriasis, all on human chromosome 6 and within or near the major histocompatibility complex (MHC) region.
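To give a sense of scale for the pairwise-interaction burden described above, the following Python sketch counts the ½(n² − n) SNP pairs for the 443 020 SNPs surveyed in the psoriasis discovery data set of Climer et al. (2014a) and compares the resulting significance thresholds. The Bonferroni correction at a 5% genome-wide level is used here only as an illustrative assumption; the text does not say which correction any particular study applied, and the function names are hypothetical.

```python
import math


def pairwise_tests(n):
    """Number of unordered SNP pairs among n SNPs: (n^2 - n) / 2."""
    return n * (n - 1) // 2


def bonferroni_neg_log10(alpha, tests):
    """-log10 of the per-test p-value threshold under a Bonferroni correction."""
    return -math.log10(alpha / tests)


n_snps = 443_020  # SNPs in the psoriasis discovery data set
n_pairs = pairwise_tests(n_snps)

single_threshold = bonferroni_neg_log10(0.05, n_snps)
pairwise_threshold = bonferroni_neg_log10(0.05, n_pairs)

print(f"single-SNP tests: {n_snps:,}  ->  -log10(p) threshold {single_threshold:.2f}")
print(f"pairwise tests:   {n_pairs:,}  ->  -log10(p) threshold {pairwise_threshold:.2f}")
```

Under these assumptions, moving from single-SNP tests to all pairwise tests raises the number of tests from under half a million to nearly 10¹¹ and pushes the −log10(p) threshold from about 7 to about 12.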
Collectively, these SNP studies explained less than 20% of the heritability of the disease. The discovery data set analyzed by Climer et al. (2014a) had 929 cases and 681 controls surveyed for 443 020 SNPs scattered across the genome. They first calculated the CCC allele-specific disequilibrium measure (Eq. 2.17) between all alleles at all SNPs. Because the sample is enhanced for the phenotype of psoriasis, there would be an enhancement of disequilibrium among alleles that interacted in such a manner as to


increase the risk of psoriasis even if there were no linkage disequilibrium between these alleles in the general population. The top 443 020 CCC measures were retained as edges between alleles at different SNPs, thus retaining the same dimensionality as the original SNP-based GWAS. The program BlocBuster (Climer et al. 2014a) not only calculates the CCCs but it also organizes the retained edges into allelic networks. For example, suppose allele A at SNP1 is connected to allele B at SNP2 through a retained edge weighted by CCC, and likewise suppose allele C at SNP3 is connected to allele B at SNP2 through another retained edge. Suppose no other retained edges connect to these three alleles. This would define an allelic network A–B–C. Extremely efficient versions of this program exist to calculate these CCC values and BlocBuster networks (Joubert et al. 2019), so this approach is applicable to even massive genomic data sets. Of the SNP alleles, 71.3% had no retained CCC edges to another SNP allele. These SNPs were eliminated from further study, greatly reducing the dimensionality of the problem. The remaining 54 425 networks ranged from 2 to 313 nodes (alleles), with an average of 4.7 alleles per network. It was these multi-locus allelic networks that were now the units of analysis rather than individual SNPs, further reducing the dimensionality of the analysis. Only one network had a significant association with psoriasis after correcting for multiple testing. This retained network consisted of 17 alleles from SNPs in three different genes (PSORS1C1, PSORS1C2, and CCHCR1). All of these genes had been associated with psoriasis in previous GWASs, and all were located in the MHC region on chromosome 6. One set of these 17 alleles was associated with increased risk for psoriasis, and its allelic inverse at each SNP (all SNPs were biallelic) was associated with protection from psoriasis. This same network replicated in the second data set. 
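The A–B–C example above suggests that an allelic network can be read as a connected component of the graph formed by the retained edges. The following Python sketch illustrates that reading: it keeps the highest-weight edges and groups the connected alleles. It is not the BlocBuster implementation, the edge weights are made-up stand-ins for CCC values, and the function name and the `keep` parameter are hypothetical.

```python
from collections import defaultdict


def allelic_networks(weighted_edges, keep):
    """Group alleles into networks using only the `keep` highest-weight edges."""
    retained = sorted(weighted_edges, key=lambda edge: edge[2], reverse=True)[:keep]

    # Build an undirected adjacency map from the retained edges.
    neighbors = defaultdict(set)
    for allele_1, allele_2, _weight in retained:
        neighbors[allele_1].add(allele_2)
        neighbors[allele_2].add(allele_1)

    # Collect connected components with a depth-first search.
    networks, seen = [], set()
    for start in neighbors:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        networks.append(component)
    return sorted(networks, key=len, reverse=True)


# Hypothetical (SNP, allele) nodes and illustrative weights, not real CCC values.
edges = [
    (("SNP1", "A"), ("SNP2", "B"), 0.9),
    (("SNP3", "C"), ("SNP2", "B"), 0.8),
    (("SNP4", "G"), ("SNP5", "T"), 0.7),
    (("SNP6", "A"), ("SNP7", "C"), 0.1),  # too weak to be retained when keep=3
]
networks = allelic_networks(edges, keep=3)
for network in networks:
    print(sorted(network))
```

With `keep=3`, the sketch returns the three-allele network A–B–C and a separate two-allele network, while the alleles at SNP6 and SNP7 have no retained edge and drop out, mirroring how SNP alleles without retained edges were eliminated from further study.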
The highest odds ratio for any single SNP in this network was 2.41 (that is, the odds of psoriasis for carriers of the risk allele at this SNP were 2.41 times the odds for noncarriers), but the odds ratios for the 17-allele network were 3.64 and 3.86 for the discovery and validation data sets, respectively. Hence, far greater phenotypic predictability was achieved at the multi-locus level than at the single-SNP level. The results of Climer et al. (2014a) reveal that individual SNPs may not be the most appropriate unit of analysis in GWAS. Interestingly, in short segments of the genome, networks of nearby SNP alleles produced by BlocBuster turn out to be phased haplotypes (Appendix A). Using haplotypes as alleles in small regions of the genome can reduce the dimensionality of the GWAS, thereby reducing the statistical penalty of genome-wide significance without losing significant information because of the high linkage disequilibrium among SNPs in small regions. Indeed, there is better information about identity-by-descent in haplotypes than in SNPs, so it is not surprising that haplotypes usually increase associative power over SNPs, as shown by Climer et al. (2014a). Li et al. (2018) have a version of GWAS based upon subdividing the genome into nonoverlapping clusters of haplotypes by placing the break points in areas of low linkage disequilibrium (recombination hotspots, Chapter 1). N’Diaye et al. (2017) used a sliding window to define haplotype regions and found that haplotypes out-performed single SNP analysis. Hamazaki and Iwata (2020) have developed another haplotype-based GWAS that they show can detect associations not detected by SNP GWAS. Simulations and a worked example by Lorenz et al. (2010) led them to conclude that although haplotypes often have greater power to detect associations than SNPs, haplotypes do not work as well as SNPs in capturing associations in genomic regions with much recombination.
Hence, they recommend that GWAS should be executed with both SNPs and haplotypes to take advantage of the full information content of the genotype data. Climer et al. (2020) developed an extension of CCC called DUO for analyzing transcriptome data. Transcriptome data (Chapter 1) allow a unique variant of a GWAS-type study. With transcriptomes, the phenotype is the degree of gene expression in an organism or cell type. The genomic markers are the genes themselves. The goal of many transcriptome studies is to relate the molecular-level


phenotype of gene expression to a higher order phenotype. For example, Webster et al. (2009) examined gene expression in human cortex tissue for 8560 genes from 364 brains. These brains were taken from 176 cases of people with Alzheimer’s disease (one of the most common degenerative brain diseases) and 188 controls. To compare DUO to a modern traditional analysis, Climer et al. (2020) reran the analysis of the data from Webster et al. (2009) with the well-established CoExp method (Ruan et al. 2010). This method calculates the pairwise Pearson correlation coefficients between all pairs of differentially expressed genes. This produced a network of 1565 genes with the edges weighted by the significant pairwise Pearson correlations. Such a large number of genes (18% of all the genes in the study) makes biological interpretations difficult. The traditional method therefore looks for correlated clusters of genes within this large network in the hope of revealing more biologically interpretable subsets. Climer et al. (2020) subdivided this large network into highly correlated clusters using three standard methods all based on modularity (see Chapter 6). The three methods yielded diverse results, ranging from 8 to 103 clusters, and it was not obvious which clustering technique was the most biologically sound (Climer et al. 2020). Note that this traditional analysis is similar to many measured genotype studies in that it calculates significant expression differences given the phenotype (cases versus controls) and limits all subsequent analyses to this subset of significant genes. However, what is of greater clinical relevance for prediction and diagnosis is the risk of the disease given the gene expression pattern (the transcriptome version of Lesson 2 given earlier in this chapter). For the DUO analysis, Climer et al. (2020) pooled all cases and controls into a single sample regardless of disease status.
Instead of looking at gene expression differences given the phenotype, Climer et al. (2020) divided the expression levels in the total sample into three categories: the genes with the highest 25% levels of expression (high, H), the middle 50% of expression values (neutral), and the lowest 25% levels of expression (low, L). Co-expression is measured by first defining four categories or types of gene pairs: both genes with high expression (HH), both with low expression (LL), and genes with opposite expression (HL or LH). Let $R_{ij}(t)$ be the frequency of individuals in the total sample that have expression relationship type $t$ between genes $i$ and $j$. Then, in analogy to the CCC defined by Eq. (2.17), the elements of the DUO vector correlation are given by:

$$\mathrm{DUO}_{ij}(t) = R_{ij}(t)\, ff_{it}\, ff_{jt} \qquad (10.2)$$

with the frequency factor $ff_{it} = 2[1 - f_{it}/1.5]$, where $f_{it}$ is the frequency of the relevant expression direction for relationship $t$ in the sample. The numbers 2 and 1.5 were chosen to give the DUO elements an approximate 0–1 range. The elements of the DUO vectors were calculated for all gene pairs. After some preliminary explorations, only the top 1000 DUO elements were retained, a number that virtually eliminates all false positives. BlocBuster was then used to construct co-expression networks. Nine distinct networks of two or more genes were defined ranging in size from 2 to 136 genes within them, as shown in Figure 10.7. Climer et al. (2020) next tested for associations between these networks and disease status, with five of the nine networks displaying highly significant associations with corrected p-values no larger than $3.9 \times 10^{-39}$ (Table 10.3). The largest network of 136 genes had the strongest impact on Alzheimer’s risk with an odds ratio of 3.0 (Table 10.3). Previous studies had indicated through GWAS or implicated biochemical pathways that 63 of these 136 genes were prior candidates for Alzheimer’s disease. This is an incredible enrichment of candidate loci for a genome-wide survey. Moreover, 22% of the genes in these DUO networks did not show significant expression differences between cases and controls. How could this be? Consider the following hypothetical case. Suppose that risk for Alzheimer’s is increased when genes A and B both show high expression, but also when gene A and gene C both have low expression. Looking at the marginal effect of gene A, sometimes it has


Figure 10.7 Co-expression networks for a sample of 364 brain cortexes (176 Alzheimer cases and 188 controls). Red and blue nodes represent genes with high and low expression, respectively. Each edge represents a significant DUO correlation among the top 1000 edges between two nodes/genes. Five networks had significant associations with Alzheimer Disease status and are labeled R# if associated with increased risk and P# if associated with decreased risk. Source: Climer et al. (2020).

Table 10.3 Co-expression networks of genes (Figure 10.7) with significant associations with Alzheimer’s disease.

| Network | No. of Genes | Expression Level | Disease Impact | Odds Ratio |
|---|---|---|---|---|
| R1 | 136 | High | Increased Risk | 3.0 |
| P1 | 116 | Low | Protective | 0.7 |
| R2 | 2 | Low | Increased Risk | 1.3 |
| P2 | 2 | Low | Protective | 0.6 |
| P3 | 2 | Low | Protective | 0.4 |

Source: Data from Climer et al. (2020).
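The DUO elements of Eq. (10.2) can be illustrated with a small Python sketch. It is a reading of the definition given in the text, not the code of Climer et al. (2020): expression values are split by rank into the lowest quarter (L), the highest quarter (H), and the remainder (neutral, N), with ties broken arbitrarily by sort order, and $f_{it}$ is taken to be the sample frequency of the direction (H or L) that relationship $t$ requires of gene $i$. The toy expression values and all function names are hypothetical.

```python
def label_expression(values):
    """Label the lowest quarter 'L', the highest quarter 'H', the rest 'N' (by rank)."""
    n = len(values)
    k = n // 4
    order = sorted(range(n), key=lambda idx: values[idx])
    labels = ["N"] * n
    for idx in order[:k]:
        labels[idx] = "L"
    for idx in order[n - k:]:
        labels[idx] = "H"
    return labels


def frequency_factor(f):
    """Frequency factor ff = 2 * (1 - f / 1.5) from the text."""
    return 2.0 * (1.0 - f / 1.5)


def duo(labels_i, labels_j, relation):
    """DUO element for relation 'HH', 'LL', 'HL', or 'LH' between genes i and j."""
    direction_i, direction_j = relation
    n = len(labels_i)
    # R_ij(t): fraction of individuals showing the required pair of directions.
    r = sum(a == direction_i and b == direction_j
            for a, b in zip(labels_i, labels_j)) / n
    f_i = labels_i.count(direction_i) / n
    f_j = labels_j.count(direction_j) / n
    return r * frequency_factor(f_i) * frequency_factor(f_j)


# Toy expression values for eight individuals (illustrative only).
gene_a = [1, 2, 3, 4, 5, 6, 7, 8]
gene_b = [2, 4, 6, 8, 10, 12, 14, 16]   # rises and falls with gene_a
gene_c = [8, 7, 6, 5, 4, 3, 2, 1]       # opposite to gene_a

la, lb, lc = (label_expression(g) for g in (gene_a, gene_b, gene_c))
print("DUO_ab(HH) =", round(duo(la, lb, "HH"), 4))
print("DUO_ac(HH) =", round(duo(la, lc, "HH"), 4))
print("DUO_ac(HL) =", round(duo(la, lc, "HL"), 4))
```

In this toy sample, genes that are high together in a quarter of the individuals receive an HH element of 0.25 × (5/3)², about 0.69, whereas genes with opposite expression receive an HH element of zero and a correspondingly large HL element. Because the measure is computed on the pooled sample, no information about disease status enters at this step.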

high expression in cases, and sometimes low, so the signal of differential expression is weakened. It is therefore not surprising that the results of the DUO analyses and the traditional analysis upon the same data set are extremely different. The results of Climer et al. (2020) gave a novel and distinct insight into the biology of the disease as shown by the large enrichment of candidate loci and by calculating the risk odds of the disease given the genetic network as opposed to expression differences given the disease status. Both analyses are valid in different ways, but the results of Climer et al. (2020) clearly indicated that multi-locus expression context is an important factor that should not be ignored. In addition to epistasis, GWAS can also be extended to examine another major feature of genetic architecture: pleiotropy. For example, Chhetri et al. (2019) performed GWAS on 14 morphological and physiological traits on 882 trees of Populus trichocarpa in a common garden experiment. They used whole-genome sequencing to define more than 6.78 million SNPs and analyzed the data for both single-trait and multi-trait measures. Traits were combined based on their pairwise


Figure 10.8 GWAS results for individual leaf traits and a multi-trait leaf measure in Populus trichocarpa. The colors of the dots correspond to single- or multi-trait associations as indicated by a key in the figure. The position of the circles indicates the location of the SNP in the genome on the x-axis. Only SNPs with p ≤ 10−3 are plotted. SNPs above the red line passed the genome-wide correction for significance, and SNPs above the blue line are considered suggestive. Source: Chhetri et al. (2019). © 2019, John Wiley & Sons.

correlations and their functions. For example, leaf area, leaf dry weight, leaf length, and leaf width were combined to form a multi-trait set because these traits are highly intercorrelated and represent the leaf as a structural unit. Figure 10.8 shows the GWAS results for these leaf traits considered singly and combined into a multi-trait set. In this example and in the others given in Chhetri et al. (2019), many more significant associations were revealed with multi-trait sets than with single traits, thereby revealing the importance of pleiotropic effects of individual genes. Indeed, for the leaf traits, no single leaf trait had a significant or even suggestive QTL, but the multi-trait leaf measure did (Figure 10.8). It is also possible to study gene × environment interactions with GWAS. Ørsted et al. (2018) studied three traits related to cold tolerance of the fly D. melanogaster over five different rearing temperatures. The cold tolerance traits displayed plasticity (i.e. the norm of reaction discussed in Chapter 9 resulted in different phenotypes across environments), and there was a significant genomic heritability in variability in plasticity, variation within environments, and within-environment variation across environments. These studies show that the “environmental variance” of Fisher’s quantitative genetic model is influenced by genetic variation as well and can even be a heritable trait. This illustrates why it is better to think of $\sigma_e^2$ as a residual variance and not just something solely defined by environmental variables. A population structure with genetically distinct local demes can be a danger in GWAS. Consider the following hypothetical example. Suppose that a researcher sampled several people from Quebec, and this sample included some French Canadians and some English Canadians.
Because the French Canadians have a population history with some founder and bottleneck events, there are many loci at which these two groups show distinct allele frequencies. Moreover, there is assortative mating by ethnicity, so these two groups have not yet experienced sufficient gene flow to merge into a single, randomly mating deme. Hence, the effects of population history continue to influence the present. As a result, a current sample of people from Quebec would be stratified,


Figure 10.9 Panel (a) shows the results of a conventional GWAS for the phenotype of body weight performed on a stratified sample of mice from two different laboratory strains that differ in mean body weight. Line length indicates −log (p-value) of SNP markers arranged by chromosome and position on the chromosome for the mouse genome. Panel (b) shows the results of the EMMA GWAS that controls for stratification on the same data. Source: Sul et al. (2018) https://doi.org/10.1371/journal.pgen.1007309. Licensed under CC-BY-4.0.

that is, the sample would contain individuals from two or more genetically distinct subpopulations. Now, consider a hypothetical bi-allelic locus in which this population history has resulted in the A allele being very common in French Canadians, and the a allele common in English Canadians. Suppose one now did a GWAS on the phenotype of the language one speaks at home. The result would be that the A allele is associated with speaking French, and the a allele is associated with speaking English. Hooray! We have just discovered the gene that determines the language you speak at home! Sorry, we have only violated Lesson 1 given earlier. This hypothetical example shows how stratified sampling can produce serious associative artifacts in a GWAS, even for a trait that is nongenetic. Sul et al. (2018) created a stratified sample of laboratory mice by mixing together individuals from two different strains that differed in body weight. Figure 10.9a shows the results of a standard GWAS on this stratified sample for the phenotype of body weight. As can be seen, the conventional GWAS detected numerous significant QTLs on every chromosome of the mouse genome. However, are these QTLs for body weight or for strain origin? To address this question, Sul et al. (2018) first constructed a matrix of the relatedness or kinship of all pairs of individuals using the genetic markers that were also used in the GWAS, just as we described in Chapters 3 and 9. In this case, the relatedness matrix would show two distinct submatrices with higher kinship within and separated by distant kinship between. Sul et al. used a spectral decomposition of the relatedness matrix (a well-known procedure from linear algebra) to correct for stratification using a procedure called efficient mixed model association (EMMA). The results of the EMMA GWAS are shown in Figure 10.9b. 
Now there was only one strong signal for a QTL for body weight on chromosome 8, a region associated with body weight in previous GWASs. Hence, almost all of the QTLs shown in panel (a) were artifacts of stratification. As this figure shows, stratification can cause large and abundant artifacts that should not be ignored. If stratification is suspected, even as a remote


possibility, it is best to correct for it through a relatedness matrix using EMMA or other methods (e.g. Chaves et al. 2016). A better way of dealing with stratification is to avoid it. This of course requires prior knowledge about the population structure of the species, which can be obtained as discussed in Chapter 6.
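The way stratification manufactures associations can be seen in a toy data set. In the Python sketch below, two hypothetical strains differ both in marker allele frequency and in mean body weight, yet within each strain the marker genotype is unrelated to weight. The numbers are invented for illustration, and the within-strain centering used as a correction is a deliberately simple stand-in: it is not EMMA, which works from the relatedness matrix in a mixed model.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def center(values):
    """Subtract the group mean, removing between-strain differences."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical data: genotype = copies of one marker allele, weight in grams.
strain1_genotype, strain1_weight = [2, 1, 2, 1], [30.0, 30.0, 32.0, 32.0]
strain2_genotype, strain2_weight = [0, 1, 0, 1], [20.0, 20.0, 22.0, 22.0]

# Conventional analysis: pool the strains and ignore the stratification.
pooled_r = pearson(strain1_genotype + strain2_genotype,
                   strain1_weight + strain2_weight)

# Within each strain the marker carries no information about weight.
within_r1 = pearson(strain1_genotype, strain1_weight)
within_r2 = pearson(strain2_genotype, strain2_weight)

# Simple correction: center genotype and phenotype within each strain, then pool.
adjusted_r = pearson(center(strain1_genotype) + center(strain2_genotype),
                     center(strain1_weight) + center(strain2_weight))

print(f"pooled r = {pooled_r:.3f}")
print(f"within-strain r = {within_r1:.3f}, {within_r2:.3f}")
print(f"adjusted r = {adjusted_r:.3f}")
```

The pooled correlation is about 0.69 even though the marker has no effect within either strain, and it vanishes once strain membership is accounted for. Any marker whose frequency differs between the strains would show the same artifact, which is consistent with the conventional analysis of the mixed mouse sample showing signals on every chromosome.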

Candidate Loci

As our biological knowledge has expanded at all levels from genes to organisms, it has become increasingly possible to identify genes whose known function might contribute directly to a phenotype of interest. For example, the gene ApoE discussed in Chapter 8 codes for a protein that combines with insoluble lipids to form soluble lipoproteins that can be transported in blood serum. One of the lipoproteins that contains the ApoE protein is high-density lipoprotein (HDL), the second most important contributor to total serum cholesterol. In addition, the ApoE protein binds to the low-density lipoprotein (LDL) receptor, thereby competitively inhibiting the binding and uptake of LDL in peripheral cells. This in turn has an impact on LDL levels, the major component of total serum cholesterol. Thus, the known function of the protein coded for by the ApoE locus relates directly to the phenotype of total serum cholesterol, as well as to both HDL and LDL levels. ApoE is therefore said to be a candidate locus for the phenotype of total serum cholesterol and its major subcomponents. There are tools available for identifying candidate genes even in non-model organisms. One such tool is the Phenoscape Knowledgebase (https://kb.phenoscape.org), which contains curated annotations from genotypes in many species that show variation in a wide variety of phenotypes. For example, Edmunds et al. (2016) used this tool to uncover candidate genes in the ostariophysan fishes that bear resemblance to mutant-scale phenotypes in a model organism, the zebrafish. Such candidates must be regarded as hypotheses, and the candidates were tested for their endogenous expression patterns in the channel catfish, Ictalurus punctatus, which has an ancestral phenotype for loss of scales. These experiments resulted in candidate genes that were useful in testing evolutionary hypotheses about changes in morphology in the ostariophysan fishes (Edmunds et al. 2016).
QTLs (or more properly, QTRs) often suggest a set of candidate genes. For example, Tyler et al. (2019) prioritized the genes located in a large QTR for histamine sensitivity in mice by using functional genomic networks, whose links encode functional associations among genes, to identify the genes most likely to be associated with a trait of interest. In this manner, they were able to show that three top-ranked genes did indeed show strong associations with histamine sensitivity (Tyler et al. 2019). However, recall the example of MYH9 and ApoL1, candidates suggested by a QTR. The known functions and pattern of gene expression for these two candidates were initially misleading. Instead, their roles were clarified by additional fine-scale molecular evolutionary studies in the QTR region, additional population-level genetic surveys, and experimental evidence. One common approach for analyzing candidate genomic regions is to treat each SNP as a separate locus. This approach is like doing a mini-GWAS on the SNPs in the QTR or candidate locus. However, polymorphic sites within a candidate region are, virtually by definition, tightly linked and often show strong linkage disequilibrium that is more reflective of evolutionary history than of genomic position. The fundamental premise of a typical GWAS is that linkage disequilibrium is an inverse proxy of recombination (Figure 10.5). Such a premise is often inapplicable when dealing with a small candidate region, making statistical and biological interpretation difficult. Indeed, as we have seen in the MYH9 and ApoL1 example, the typical GWAS premise is not just inapplicable, it can be actively misleading within a candidate region.

Quantitative Genetics

[Figure 10.10 graphic: horizontal axis, location in kilobases from the start of the sequenced region (0 to 5), with exons 1 through 4 marked. Polymorphic nucleotide sites: 73, 308, 471, 545, 560, 624, 832, 1163, 1522, 1575, 1998, 2440, 2907, 3106, 3673, 3701, 3937, 4036, 4075, 4951, 5229A, 5229B, 5361.]

Figure 10.10 The physical position of the polymorphic SNPs found by Fullerton et al. (2000) in a 5.5 kb region of the ApoE gene. The two amino acid replacement polymorphic sites that determine the ε2, ε3, and ε4 alleles are indicated by boxed numbers. A larger box shaded with gray encloses the portion of this region that was “sequenced” in a hypothetical study. Source: Modified from Fullerton et al. (2000).

To illustrate the difficulties in biological interpretation that arise from high levels of linkage disequilibrium, consider again from Chapter 5 the genetic survey of Fullerton et al. (2000). They sequenced 5.5 kb of the ApoE region in 96 individuals, revealing 23 SNPs. Figure 10.10 shows the distribution of the 23 SNPs over the sequenced region. This figure also shows the physical positions of the two amino acid changing mutations that define the three major amino acid sequence alleles at this locus, ε2, ε3, and ε4. As shown in Chapters 8 and 9, these alleles are associated with large differences in the phenotype of total serum cholesterol in human populations. Moreover, Stengard et al. (1996) showed that men bearing the ε4 allele have a substantial increase in risk of mortality from coronary artery disease (CAD), as shown in Figure 10.11. Because the protein product of the ε4 allele has many altered biochemical properties relative to the protein products of the other alleles, such as its binding affinity to the LDL-receptor protein, it is reasonable to assume that the amino acid replacement mutation that defines the ε4 allele is the causative mutation of some of this clinically important phenotypic variation. Such will be assumed here. The mutation defining the ε4 allele is the boxed mutation at position 3937 in Figure 10.10. Most studies of candidate loci only sequence or survey SNPs in a small portion of the chromosome. Suppose, hypothetically, that instead of sequencing the 5.5 kb shown in Figure 10.10 that the sequencing had been terminated near the beginning of exon 4, as shown by the gray boxed region in Figure 10.10. Because position 3937 is in exon 4, this position would not have been sequenced in this hypothetical study. Suppose further that no prior knowledge existed about the importance of the amino acid replacement mutations or even their very existence. 
Hence, the investigators in this hypothetical study would only have available the SNPs in the large gray box shown in Figure 10.10. Suppose now that each of these available SNPs were tested one by one for associations with CAD. Because the causative mutation at site 3937 is not included in this hypothetical study, any associations detected with the available SNPs would be due to linkage disequilibrium with the ε4 allele. Interestingly, the ε4 allele (the SNP at site 3937 in Figure 10.10) does not show any significant disequilibrium with the SNP at site 3701, the SNP that lies closest to 3937 in the genomic region sequenced in our hypothetical study. In contrast, there is significant disequilibrium between the SNPs at sites 3937 and 832. Accordingly, this hypothetical study would find an association between the SNP at site 832 and CAD mortality. Site 832 lies in the 5′ enhancer region of this gene,


Figure 10.11 Relative risk of mortality due to coronary artery disease as a function of ApoE genotype in a longitudinal study of Finnish men. Source: Data from Stengard et al. (1996).

[Figure 10.11 graphic: vertical axis, relative risk of CAD mortality (1 to 6); horizontal axis, ApoE genotype (ε2/ε3, ε3/ε3, ε3/ε4, ε2/ε4 & ε4/ε4).]

so our hypothetical investigators could easily create a plausible story about how a 5′ regulatory mutation was “causing” the observed associations with CAD mortality, even though the causative mutation was actually located past the opposite end of their sequenced region. Why does this hypothetical single SNP association study associate the 5′ regulatory end of the gene region with CAD mortality but find no association with a SNP only a little more than 200 base pairs away from the causative mutation? The answer to this can be found in Figure 10.12, the estimated ApoE haplotype tree. A glance at this figure shows why sites 832 and 3937 show significant disequilibrium: the mutations at these sites are located next to one another in the evolutionary tree of haplotypes. As a result, most haplotypes today fall into just two genetic states for these sites, resulting in significant disequilibrium. Indeed, if it were not for homoplasy at site 832, the disequilibrium would have been even stronger. Because no recombination is detected in this region, recombination is irrelevant and linkage disequilibrium primarily reflects temporal proximity in evolutionary history rather than physical proximity on the DNA molecule. In general, linkage disequilibrium is strongest between markers that are old (near the root of the tree) and that are on nearby branches in the haplotype tree (temporal proximity). Physical proximity is irrelevant in the absence or near absence of recombination. Consequently, a candidate gene study should not be regarded as a mini-QTL marker association study; the biological meaning of linkage disequilibrium can be extremely different at the physical scales of a marker study measured in centiMorgans versus a candidate locus study covering only a few thousand base pairs. One way to eliminate the problems caused by disequilibrium within a candidate region is to use haplotypes as the units of analysis. This method transforms the problem into a one-locus,

[Figure 10.12 graphic: haplotype network of haplotypes 1 through 31, with branches labeled by mutated nucleotide position. The network is subdivided into four groups by the states at sites 832 and 3937: G at site 832 & C at site 3937; G at site 832 & T at site 3937; T at site 832 & T at site 3937; T at site 832 & C at site 3937.]

Figure 10.12 The statistical parsimony network based on 23 variable sites (numbered by their nucleotide position in the reference sequence) in a 5.5 kb segment of the ApoE gene. Circles designate the haplotypes, each identified by its haplotype number (1 through 31) either inside or beside the circle. The relative sizes of the circles indicate the relative frequencies of each haplotype in the sampled population. A “0” indicates an inferred intermediate haplotype that was not found in the sample. Each line represents a single mutational change, with the number associated with the line indicating the nucleotide position that mutated. A solid line is unambiguous under the principle of statistical parsimony, whereas dashed lines represent ambiguous inferences under statistical parsimony. The two sites showing significant disequilibrium are enclosed by ovals, and the tree is subdivided by the state of these two sites. Source: Modified from Fullerton et al. (2000)

multi-allelic problem. As already noted earlier with respect to GWAS, haplotypes often increase statistical power and the ability to reveal significant phenotypic associations that are not detectable when each SNP is analyzed separately, and the same can be true for candidate locus studies. For example, Drysdale et al. (2000) examined 13 SNPs in the human beta-2-adrenoceptor gene, a candidate locus for asthma because it codes for a receptor protein on the bronchial smooth muscle cells in the lung that mediates bronchial muscle relaxation. Drysdale et al. could find no association between any of the 13 SNPs and asthma, but when the SNPs were phased (Appendix A) to produce 12 haplotypes, they did find significant associations between haplotypes at this locus and asthma. This result is not surprising: SNPs do not affect phenotypes in isolation but rather only affect phenotypes in the context of the genetic state of all the other SNP states with which they are associated by evolutionary history. Haplotypes recover some of this context, and when the SNP is removed from this context by analyzing it in isolation, biological information is lost, particularly if there was homoplasy. Placing the SNP back into the context of a haplotype recovers that lost information, diminishes the problem of homoplasy, and increases the probability of identity-by-descent of the alleles being used (haplotypes versus SNP alleles).
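The pairwise disequilibrium discussed above for sites such as 832 and 3937 can be computed directly once haplotypes are phased. The following is a minimal sketch, not the procedure used by Fullerton et al. (2000): the function name `pairwise_ld` and the haplotype counts are hypothetical and chosen only for illustration. It returns the disequilibrium coefficient D and the squared correlation r² for two biallelic sites.

```python
from collections import Counter


def pairwise_ld(haplotypes, allele1, allele2):
    """Return (D, r2) for two biallelic sites.

    haplotypes: iterable of (state_at_site1, state_at_site2) tuples, one per
                sampled chromosome (phase must already be known).
    allele1, allele2: the states used as the reference alleles at each site.
    """
    haplotypes = list(haplotypes)
    n = len(haplotypes)
    counts = Counter(haplotypes)
    # Marginal allele frequencies at each site
    p1 = sum(c for (s1, _), c in counts.items() if s1 == allele1) / n
    p2 = sum(c for (_, s2), c in counts.items() if s2 == allele2) / n
    # Frequency of the haplotype carrying both reference alleles
    p12 = counts[(allele1, allele2)] / n
    d = p12 - p1 * p2
    denom = p1 * (1 - p1) * p2 * (1 - p2)
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Hypothetical counts for the four two-site classes (NOT the real ApoE data)
sample = (
    [("G", "T")] * 60
    + [("T", "C")] * 30
    + [("G", "C")] * 5
    + [("T", "T")] * 5
)
d, r2 = pairwise_ld(sample, "G", "T")
print(f"D = {d:.4f}, r^2 = {r2:.3f}")
```

Note that nothing in this calculation uses the physical distance between the two sites; in a region without recombination, the value depends only on how the mutations are arranged on the haplotype tree.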


Analyzing genotypes defined by haplotypes treated as alleles works well as long as the number of distinct haplotypes or alleles is not too large (Seltman et al. 2001), as was the case in the asthma study for which only three haplotypes tended to be common in most human populations. In cases such as these in which the haplotype diversity is low, the problem of detecting associations between genotype and phenotype can be approached using standard statistical tests, just as was done with the 3-allele system of ApoE in Chapter 8. This approach does not work well when the number of haplotypes is large, which is the more typical situation. For example, recall from Chapter 1 the genetic survey of Clark et al. (1998) on 71 individuals from three populations that were sequenced for a 9.7 kb region within the lipoprotein lipase locus (LPL), representing only about a third of the total gene. In all, 88 polymorphic sites were discovered, and 69 of these sites had their phases determined to define 88 distinct haplotypes (Clark et al. 1998). Thus, using only a subset of the known polymorphic sites in just a third of a single gene, a sample of 142 chromosomes reveals 88 “alleles” or haplotypes, which in turn define 3916 possible genotypes – a number considerably larger than the sample size of 71 people. Indeed, virtually every individual in the sample had a unique genotype, and there were more haplotypes or alleles than individuals. When haplotype diversity is large, even extremely large samples only provide sparse coverage of the possible genotypic space defined by the haplotypes. This sparseness results in low statistical power (Templeton 1999a; Seltman et al. 2001). One solution to the problem of high levels of haplotype diversity is to pool or cluster the haplotypes in some manner that reduces the dimensionality of the problem. The first method of clustering haplotypes for candidate gene analysis was nested-clade analysis (Templeton et al. 1987a). 
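The count of 3916 possible genotypes follows from the number of unordered pairs of haplotypes, n(n + 1)/2 for n haplotypes. A short sketch of this arithmetic (the function name `n_genotypes` is hypothetical):

```python
def n_genotypes(n_alleles: int) -> int:
    """Number of distinct diploid genotypes for n alleles:
    n homozygotes plus n(n - 1)/2 heterozygotes."""
    return n_alleles * (n_alleles + 1) // 2


n_haplotypes = 88    # distinct LPL haplotypes reported by Clark et al. (1998)
n_individuals = 71   # sample size in the same study

possible = n_genotypes(n_haplotypes)
print(possible)                  # 3916
print(possible / n_individuals)  # possible genotypes per sampled individual
```

Because the genotype space grows quadratically with the number of haplotypes, even a large increase in sample size leaves most genotype classes empty, which is the sparseness problem described above.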
Nested-clade analysis has already been introduced as a tool for phylogeographic analysis (Chapter 7), but it was originally developed for studying genotype–phenotype associations at candidate loci. The underlying rationale for this use of nested-clade analysis is that just as SNPs can be placed into the context of a haplotype to increase the level of biological information, so can haplotypes be placed into their evolutionary context through the device of a haplotype tree, at least when recombination is sufficiently rare in the candidate region. The evolutionary historical context of haplotypes can provide additional biological information and a solution to the problem of large amounts of haplotype diversity. One method of using the haplotype tree to cluster haplotypes is the nested design already discussed in Chapter 7. The premise upon which a nested analysis is based is that any mutation having functional significance will be imbedded in the historical framework defined by the haplotype tree, and therefore, whole branches (clades) of this tree will show similar functional attributes. Nesting has several advantages. First, nesting categories are determined exclusively by the evolutionary history of the haplotypes and not by a phenotypic pre-analysis, thereby eliminating a major source of potential bias. Second, the clades define a nested design that makes full and efficient use of the available degrees of freedom. Nesting performs only evolutionarily relevant contrasts and does not squander statistical power on less informative or redundant contrasts. Third, statistical power has been enhanced by contrasting clades of pooled haplotypes instead of individual haplotypes, thereby directly addressing the problem of too much diversity eroding statistical power. Fourth, the tests in the nested design are independent for haploid or homozygous data sets, simplifying the statistical analysis. The first nested-clade analysis (Templeton et al. 
1987a) was performed on the haplotype variation found at the Alcohol dehydrogenase (Adh) locus for 41 homozygous strains of the fruit fly D. melanogaster (Aquadro et al. 1986). The estimated haplotype tree and nested design using the rules outlined in Chapter 7 are shown in Figure 10.13. Table 10.4 shows the results of a nested analysis of variance of the phenotype of Adh activity using the nested design shown in Figure 10.13 (Templeton et al. 1987a). The sums of squares for the 1-step and 0-step (haplotype) clade levels

[Figure 10.13 graphic: unrooted tree of haplotypes 1 through 25, nested into 1-step clades 1–1 through 1–11, 2-step clades 2–1 through 2–5, and 3-step clades 3–1 and 3–2.]

Figure 10.13 The unrooted haplotype tree for restriction site variation at the Drosophila melanogaster Adh region, along with the nested design. Haplotypes are indicated by positive integers, with a “0” indicating an inferred intermediate not observed in the sample. Each arrow indicates a single mutational change. Haplotypes are nested together into 1-step clades, indicated by “1–#,” which in turn are nested together into 2-step clades, indicated by “2–#,” which in turn are nested together into 3-step clades, 3–1 and 3–2. Source: Templeton et al. (1987a).

are decomposed into their independent components nested within the next higher level. When a clade with a significant association contained three or more possible contrasts, each contrast was tested and subject to a Bonferroni correction to localize the specific branch with the strongest effect. These analyses are then repeated recursively at the higher nesting levels. As can be seen from Table 10.4 or Figure 10.14, four significant associations are identified, which subdivide the haplotypic variation into five allelic classes with respect to Adh activity. Figure 10.15 gives the actual mean phenotypes of all the homozygous strains used in this study, grouping these strains by which one of the five clades they bore. As can be seen, the strongest effect in Table 10.4 is associated with the difference between clade 3–1 versus clade 3–2. The actual distribution of Adh activity is bimodal (Figure 10.15), and this clade contrast captures most of that bimodality. This bimodality is also easily detected by a standard one-way ANOVA (Analysis of Variance, see Appendix B) using the haplotypes as treatments. However, the nested analysis also detected phenotypic heterogeneity within both clades 3–1 and 3–2. Within clade 3–1, the nested analysis in Table 10.4 identifies clade 1–4 as having a significantly different Adh activity than the other clades nested within clade 3–1. As can be seen from Figure 10.15, the four strains that constitute clade 1–4 have the four highest Adh activities within the lower Adh activity mode. Thus, within the lower activity mode, there is a strong association between evolutionary relatedness (being in clade 1–4) and phenotypic similarity. This significant effect is only detectable using evolutionary information. The same is true for


Table 10.4 Nested-clade analysis of Adh activity at the Adh locus.

| Source | Sum of Squares | Degrees of Freedom | Mean Square | F-Statistic |
| --- | --- | --- | --- | --- |
| 3-Step Clades | 138.33 | 1 | 138.33 | 366.50a |
| 2-Step Clades | 0.88 | 3 | 0.29 | 0.78 |
| Within 2–1 | 1.50 | 1 | 1.50 | 3.98 |
| Within 2–2 | 5.74 | 2 | 2.87 | 7.61b |

1-Step Clades
1–4 vs. 1–3: Bonferroni Significance 0.50
17 vs. 19: Bonferroni Significance …

… s > 0 is the selection coefficient against aa homozygotes and 0 ≤ h ≤ 1 is a measure of dominance (h = 0 means that A is completely dominant over a for the phenotype of fitness, and h = 1 means that a is completely dominant over A for the phenotype of fitness, with intermediate values of h reflecting an intermediate deleterious effect on the heterozygotes). The average excess of fitness for the a allele under random mating is (Eq. (8.10)):

$$a_a = p(1 - hs - \bar{w}) + q(1 - s - \bar{w}) \tag{12.5}$$

The average excess of allele a is negative for all q > 0, so Eq. (12.1) implies that the only selective equilibrium is $q_{eq} = 0$ (that is, all copies of the genes at this locus are from the functional allelic class). Note that the selective and mutational equilibria are complete opposites. If we assume s is much larger than μ, then we can assume that selection will overwhelm mutation when far from the selective equilibrium (which ensures a negative value of $a_a$ that is large in magnitude). But when close to the selective equilibrium, even the weak force of recurrent mutation can have a major effect as the average excess of a approaches zero. Given that the population is close to the selective equilibrium, then p ≈ 1 and Eq. (12.4) becomes Δq(mutation) ≈ μ. Also, $\bar{w} \approx 1$ near selective equilibrium because almost all individuals in this randomly mating population are AA homozygotes. Hence, Eq. (12.5) simplifies to $a_a \approx p(-hs) + q(-s) = -hs + qs(h - 1)$. From Eq. (12.3), the equilibrium between selection and mutation is given by

$$q_{eq} = \frac{\mu}{hs + q_{eq}s(1 - h)} \tag{12.6}$$

since $\bar{w} \approx 1$. For the special case of a recessive genetic disease, h = 0, and Eq. (12.6) can be rearranged as $q_{eq}^2 = \mu/s$. Hence, the equilibrium allele frequency for an autosomal, recessive deleterious allele is:

$$q_{eq} = \sqrt{\frac{\mu}{s}} \tag{12.7}$$

Note that the equilibrium frequency of a is explicitly a balance between mutation and selection, in this case as measured by the square root of the ratio of the mutation rate divided by the selection coefficient. As an example of this balance, let the a allele be completely lethal when homozygous (s = 1) and the mutation rate be $10^{-6}$. Then, the equilibrium frequency of the a allele from Eq. (12.7) is 0.001, some three orders of magnitude greater than the mutation rate. Because the square root of any number much smaller than 1 is going to be much larger than the original number, Eq. (12.7) implies that recessive, deleterious alleles will accumulate in randomly mating populations until they reach frequencies several orders of magnitude greater than their mutation rates. The reason for the ineffectiveness of natural selection in eliminating these recessive, deleterious mutations is that selection can only eliminate a recessive, deleterious allele when it is in a homozygote, which is a rare event under random mating for a rare allele. Almost all copies of the recessive, deleterious alleles are in heterozygotes where they are protected from selection. As a consequence, the average excess (Eq. (12.5)) is dominated by the heterozygote genotypic deviation (which is close to zero) and the trait has very low heritability (as shown in Chapter 9). The fundamental theorem of natural selection (Chapter 11) predicts that selection is ineffective on this recessive, deleterious trait as long as it is rare, thereby allowing the seemingly weak force of mutation to have a major impact on the equilibrium allele frequency. Although rare at any one locus, recessive deleterious alleles can be collectively common in outcrossing populations or species. For example, genomic surveys found


that the average Icelander carried 149 loss-of-function alleles, almost all in heterozygous condition (Gudbjartsson et al. 2015). Given the rarity of any one of these loss-of-function alleles, it is unlikely that two Icelanders who are not closely related would carry the same loss-of-function variants, but a mating between two first cousins would be expected to share 18.6 of these loss-of-function variants, and their inbred offspring would be expected to be homozygous for 4.7 loss-of-function alleles. Of course, not all loss-of-function alleles are necessarily deleterious, but many are, so this increased risk of being homozygous for rare loss-of-function alleles is a contributor to why inbreeding depression (Chapter 3) is so common in outcrossing species. Now consider a deleterious allele that has some effect on the heterozygotes such that h > 0. Given that $q_{eq}$ is small because a is deleterious, even small values of h will often yield an hs term that is much larger than $q_{eq}s(1 - h)$. Accordingly, a good approximation to Eq. (12.6) is now

$$q_{eq} = \frac{\mu}{hs} \tag{12.8}$$

For example, consider the previous case in which s = 1 and $\mu = 10^{-6}$, but now let h = 0.01. The fitness of the heterozygote is reduced by just 1% relative to the fitness of AA, a decrease so slight that it would be difficult to detect in most studies. Despite this tiny heterozygous effect, Eq. (12.8) now predicts an equilibrium frequency of 0.0001, an order of magnitude less than the equilibrium for this case under complete recessive inheritance. Thus, a fitness decrease of one one-hundredth in the heterozygote yields a tenfold reduction in the frequency of this deleterious allele from the recessive case. Indeed, note from Eq. (12.8) that as long as the heterozygote has a 1% fitness decline (hs = 0.01), it really does not make much difference how deleterious this allele is in the homozygotes. In terms of the equilibrium between selection and mutation, it makes little evolutionary difference whether the fitness of the homozygote is reduced 1% or 100% (lethal); the 1% decline in the heterozygote dominates the evolutionary balance between mutation and selection under random mating. The average excess provides the reason why even slight effects in heterozygotes dominate this evolutionary balance against deleterious alleles; newly arisen mutations in a randomly mating population exist almost exclusively in heterozygotes, and the homozygote effect is, therefore, relatively unimportant. This illustrates that in measuring the balance between mutation and selection, we must always look at selection from the gamete’s perspective as quantified through the average excess.
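The two approximations, Eqs. (12.7) and (12.8), can be compared against the full equilibrium condition of Eq. (12.6). The sketch below assumes only that Eq. (12.6) is rearranged into the quadratic s(1 − h)q² + hsq − μ = 0 and solved for its positive root; the function names are hypothetical.

```python
import math


def q_eq_full(mu: float, s: float, h: float) -> float:
    """Positive root of s(1-h) q^2 + h s q - mu = 0, i.e. Eq. (12.6) rearranged."""
    a = s * (1.0 - h)
    b = h * s
    if a == 0.0:  # h = 1: the quadratic term vanishes
        return mu / b
    return (-b + math.sqrt(b * b + 4.0 * a * mu)) / (2.0 * a)


def q_eq_recessive(mu: float, s: float) -> float:
    """Eq. (12.7): fully recessive deleterious allele (h = 0)."""
    return math.sqrt(mu / s)


def q_eq_partial(mu: float, s: float, h: float) -> float:
    """Eq. (12.8): approximation when hs dominates the denominator of Eq. (12.6)."""
    return mu / (h * s)


mu, s = 1e-6, 1.0
print(q_eq_recessive(mu, s))      # recessive lethal, about 0.001
print(q_eq_partial(mu, s, 0.01))  # 1% heterozygous effect, about 0.0001
print(q_eq_full(mu, s, 0.0))      # agrees with Eq. (12.7) when h = 0
print(q_eq_full(mu, s, 0.01))     # slightly below the Eq. (12.8) value
```

Under these assumptions the full solution with h = 0.01 lies within about 1% of the Eq. (12.8) approximation, which is why the homozygote term can be dropped.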

The Interaction of Natural Selection with Mutation and System of Mating

The system of mating can also interact with natural selection to influence evolutionary dynamics and equilibria. However, the interaction of selection with system of mating is more direct than with many other evolutionary forces because system of mating directly influences the average excess (Chapter 8). To illustrate this direct interaction between selection and system of mating, consider the mutation/selection balance models of the previous section but now allowing deviations from random mating as measured by f (Chapter 3). The average excess associated with a deleterious allele now becomes

$$a_a = p(1 - hs - \bar{w})(1 - f) + [f + q(1 - f)](1 - s - \bar{w}) \tag{12.9}$$

Note that Eq. (12.5) is a special case of Eq. (12.9) when f = 0 (random mating). Assuming, as before, that $\bar{w} \approx 1$ near equilibrium, the average excess of the deleterious allele a is approximately −fs + (1 − f)[−hs + qs(h − 1)], and when dealing with a recessive allele (h = 0) the average excess further

Interactions of Natural Selection with Other Evolutionary Forces and the Detection of Natural Selection

simplifies to −fs − (1 − f)qs. At equilibrium, we expect $q_{eq}$ to be very small, so even modest deviations from random mating with f > 0 can ensure that the fs term dominates over the $(1 - f)q_{eq}s$ term. Hence, at equilibrium in an inbreeding population (f > 0), the average excess of the deleterious a allele becomes −fs, and the equilibrium allele frequency defined by Eq. (12.3) is

$$q_{eq} = \frac{\mu}{fs} \tag{12.10}$$

In the example of a recessive lethal allele (s = 1) with a mutation rate of $10^{-6}$, recall that Eq. (12.7) yields an equilibrium frequency of 0.001 in a randomly mating population. Now consider the evolutionary impact of a seemingly minor deviation from random mating with f = 0.01 (a 1% deviation from Hardy–Weinberg genotype frequencies). With this modest f, Eq. (12.10) yields an equilibrium allele frequency of 0.0001. Thus, a 1% deviation from random mating is amplified into a tenfold reduction in the equilibrium frequency of a recessive, lethal allele. Equation (12.10) tells us that populations that systematically deviate from random mating with an inbreeding system of mating accumulate fewer deleterious recessive alleles than would a randomly mating population. For example, Chelo et al. (2019) monitored inbreeding depression through genome-wide SNPs in experimental populations of the worm Caenorhabditis elegans under outcrossing, partial selfing, and exclusive selfing. They found that ancestral inbreeding depression was maintained under outcrossing, but became reduced with partial and exclusive selfing, in part due to the elimination of recessive, deleterious alleles. Arunkumar et al. (2015) followed the transition from outcrossing to selfing in the plant Eichhornia paniculata and found evidence for strong purifying selection consistent with the deleterious recessive mutations that had accumulated in the outcrossing ancestors becoming exposed to intense selection under selfing and its resulting high levels of homozygosity even for rare alleles (Eq. (12.10)). Because the average excess is a direct function of f, deviations from random mating will influence the outcome of natural selection on all selected genetic systems, and not just selection against rare, deleterious alleles. 
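The effect of f on the recessive equilibrium can be checked numerically. This sketch assumes, consistent with how Eqs. (12.6) and (12.10) are obtained in the text, that equilibrium occurs where the mutational input μ balances q times the magnitude of the approximate average excess, fs + (1 − f)qs. The function name is hypothetical.

```python
import math


def q_eq_recessive_inbred(mu: float, s: float, f: float) -> float:
    """Positive root of (1-f) s q^2 + f s q - mu = 0 for a recessive allele (h = 0).

    Assumes w_bar ~ 1 and p ~ 1 near equilibrium, as in the text.
    """
    a = (1.0 - f) * s
    b = f * s
    if a == 0.0:  # f = 1: complete inbreeding, only the fs term remains
        return mu / b
    return (-b + math.sqrt(b * b + 4.0 * a * mu)) / (2.0 * a)


mu, s = 1e-6, 1.0
for f in (0.0, 0.01, 0.1):
    full = q_eq_recessive_inbred(mu, s, f)
    approx = mu / (f * s) if f > 0 else math.sqrt(mu / s)  # Eq. (12.10) or (12.7)
    print(f"f = {f}: full = {full:.3e}, approximation = {approx:.3e}")
```

With f = 0 the result reduces to Eq. (12.7), and with f = 0.01 it falls close to the value of 0.0001 given by Eq. (12.10), so the dropped (1 − f)qs term matters little once f is appreciably larger than q.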
To illustrate this pervasive impact of system of mating upon adaptive outcomes, let us return to the example from Chapter 11 of the Bantu peoples adapting to a malarial environment at the β-Hb locus with respect to the A, S, and C alleles. In Chapter 11, we only considered the case of a randomly mating population. Now we assume a hypothetical inbreeding population with f > 0. As before, consider an initial gene pool with pA ≈ 1, pS ≈ pC ≈ 0 for the frequencies of the A, S, and C alleles, respectively, and with fitnesses as given in Table 11.1 for the malarial environment. With inbreeding, the average excesses of the S and C alleles are (from Eq. (8.9)):

$$\begin{aligned}
a_S &= p_A(1 - f)(1 - \bar{w}) + [f + p_S(1 - f)](0.2 - \bar{w}) + p_C(1 - f)(0.70 - \bar{w}) \\
a_C &= p_A(1 - f)(0.89 - \bar{w}) + p_S(1 - f)(0.70 - \bar{w}) + [f + p_C(1 - f)](1.31 - \bar{w})
\end{aligned} \tag{12.11}$$

Under the initial conditions, almost everyone in the population is AA, so the average fitness is close to 0.89, the fitness of the AA genotype under malarial conditions. Hence, the initial response to selection for the S and C alleles is determined by their initial average excesses of:

$$\begin{aligned}
a_S &= p_A(1 - f)(0.11) + [f + p_S(1 - f)](-0.69) + p_C(1 - f)(-0.19) \approx (1 - f)(0.11) - f(0.69) = 0.11 - 0.8f \\
a_C &= p_A(1 - f)(0) + p_S(1 - f)(-0.19) + [f + p_C(1 - f)](0.42) \approx 0.42f
\end{aligned} \tag{12.12}$$

when pA ≈ 1, pS ≈ pC ≈ 0 in the initial gene pool. Recall that the initial average excesses in the random mating case were 0.11 for S and 0 for C (Eqs. (11.6) and (11.8) in Chapter 11), which is a


special case of Eq. (12.12) when f = 0. As Eq. (12.12) reveals, inbreeding decreases the average excess of the S allele and increases the average excess of the C allele relative to their values under random mating. When f > 0.11/0.8 ≈ 0.14, Eq. (12.12) reveals that natural selection always operates to eliminate the S allele and favors the C allele. However, even when selection under inbreeding initially favors an increase in both the S and C alleles (when 0 < f < 0.11/0.8), the adaptive landscape is altered by inbreeding to decrease the “height” of the A/S polymorphic peak relative to the random mating case. This alteration broadens the conditions under which the population will evolve toward fixation of C under natural selection. For example, Figure 12.1 shows the adaptive topography with the fitnesses given in Table 11.1 for a malarial environment with f = 0.04, as well as the evolutionary trajectories over this landscape for the same set of initial conditions considered in Figure 11.7. In contrast to Figure 11.7, all initial populations adapt to malaria in this case through the fixation of the C allele, and none remain polymorphic for sickle cell.
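The initial average excesses in Eq. (12.12) can be reproduced from the general form in Eq. (12.11). The sketch below uses the genotype fitnesses that appear in Eq. (12.11) and in the surrounding text (AA = 0.89, AS = 1, SS = 0.2, AC = 0.89, SC = 0.70, CC = 1.31); the function and variable names are hypothetical, and the gene pool is set exactly to pA = 1 to represent the limiting initial condition.

```python
import math

# Genotype fitnesses in the malarial environment, as used in Eq. (12.11)
FITNESS = {"AA": 0.89, "AS": 1.00, "SS": 0.20, "AC": 0.89, "SC": 0.70, "CC": 1.31}


def fitness(i: str, j: str) -> float:
    """Fitness of the genotype formed by alleles i and j (order does not matter)."""
    return FITNESS.get(i + j, FITNESS.get(j + i))


def mean_fitness(p: dict, f: float) -> float:
    """Average fitness when genotype frequencies deviate from Hardy-Weinberg by f."""
    w_bar = 0.0
    for i in p:
        for j in p:
            # Ordered-pair frequency: random-union part plus the extra homozygosity
            pair = (1 - f) * p[i] * p[j] + (f * p[i] if i == j else 0.0)
            w_bar += pair * fitness(i, j)
    return w_bar


def average_excess(allele: str, p: dict, f: float) -> float:
    """Average excess of fitness for one allele, following the form of Eq. (12.11)."""
    w_bar = mean_fitness(p, f)
    total = 0.0
    for j in p:
        # Probability that the partner gamete carries j, given this gamete's allele
        cond = (1 - f) * p[j] + (f if j == allele else 0.0)
        total += cond * (fitness(allele, j) - w_bar)
    return total


initial = {"A": 1.0, "S": 0.0, "C": 0.0}
for f in (0.0, 0.04, 0.1375):
    a_s = average_excess("S", initial, f)
    a_c = average_excess("C", initial, f)
    print(f"f = {f}: a_S = {a_s:.4f}, a_C = {a_c:.4f}")
```

At f = 0.04, the value used for Figure 12.1, both average excesses are positive, so both S and C initially increase; the sign of a_S changes at f = 0.11/0.8 = 0.1375.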

[Figure 12.1 graphic: panels (a) and (b); vertical axis, average fitness w; base triangle with vertices A, S, and C and allele frequency axes pA, pS, and pC.]

Figure 12.1 The adaptive surface defined by the A, S, and C alleles at the β-Hb locus in an inbreeding population with f = 0.04 using the fitnesses given in Table 11.1 for a malarial environment. The gene pool space is shown by the triangle near the bottom, with the allele frequencies given by the perpendicular distances from the point to the sides of the triangle, and with the vertices associated with fixation of a particular allele labeled by the letter corresponding to that allele. The vertical axis gives w as a function of these three allele frequencies. One peak exists in this adaptive surface corresponding to fixation for C, as indicated by the white dot. Black dots indicate the initial state of the gene pool for four populations: one starting at pA = 0.95, pS = 0.025, and pC = 0.025; a second at pA = 0.85, pS = 0.025, and pC = 0.125; a third at pA = 0.85, pS = 0.125, and pC = 0.025; and a fourth at pA = 0.75, pS = 0.125, and pC = 0.125. Black lines from these dots plot the evolutionary trajectory across generational time as defined by Eq. (12.1). Part a shows the entire adaptive landscape, and part b shows the same expanded portion as Figure 11.7b.

These results make sense if you take the gamete’s perspective as measured by the average excess. An S allele in an inbreeding population is more likely to be coupled with another S allele, so the deleterious fitness effects of the SS homozygote become more important and the beneficial effects of the AS heterozygote become less important under inbreeding from an S-bearing gamete’s perspective. In contrast, the highly beneficial fitness effects of the CC homozygote become more important under inbreeding from the perspective of a C-bearing gamete. The contrast in evolutionary outcomes between Figures 11.7 versus 12.1 reveals that adaptation cannot be explained only in terms of the fitnesses of genotypes, which are identical in these two cases (Table 11.1). Even complete knowledge of the fitnesses of every individual is inadequate for predicting the course of natural selection. We need, in addition, the information about population structure that directly affects the average excesses of fitness. Hence, our fundamental equation of natural selection for measured genotypes (Eq. (12.1)) reveals that the outcome of adaptive evolution represents an interaction between fitness differences and population structure. Once again, we see the fallacy in the statement that “natural selection is survival of the fittest”: the fittest genotypes do not by themselves define the outcome of adaptive evolution and, indeed, the fitnesses of all the genotypes do not by themselves define the outcome of adaptive evolution. The f in the above equations was interpreted as measuring inbreeding as a system of mating, but f ≠ 0 can also arise in a locus-specific manner due to assortative or disassortative mating (Chapter 3). Nishi et al. (2020) scored more than a million SNPs in a sample of 1683 human couples of opposite sex and identified many genomic regions associated with assortative and disassortative mating scattered across the genome. 
Interestingly, they found that the SNPs associated with the higher signals of assortative mating also displayed strong signals of recent positive selection, implying that assortative mating sped up adaptive evolution at these loci. This study reinforces the notion that the course and speed of adaptive evolution cannot be predicted by fitnesses alone, but only with fitnesses in the context of population structure.

The Interaction of Natural Selection with Gene Flow

Gene flow can influence the adaptive course of natural selection both by determining what genetic variation is available within a deme’s gene pool and by directly altering allele frequencies (Chapter 6). To see this, consider the simple model from Chapter 6 of symmetrical gene flow at rate m between two demes, 1 and 2. The change in allele frequency at an autosomal locus in deme 1 is given by Δp1 = −m(p1 − p2) where pi is the frequency of the allele in deme i (Eq. (6.2)). Combining the effects of selection and gene flow, Eq. (12.2) becomes:

$$\Delta p_1 = \frac{p_1 a_A}{\bar{w}} - m(p_1 - p_2) \qquad (12.13)$$
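As a rough numerical illustration of Eq. (12.13), the following Python sketch iterates the selection and gene flow terms for two demes. It assumes random mating within each deme, so that the average excess is aA = p(wAA − w̄) + q(wAa − w̄). The function names and all fitness values are hypothetical choices made for illustration; they are not taken from the text.

```python
def average_excess(p, w_AA, w_Aa, w_aa):
    """Return (a_A, w_bar) for allele A at frequency p under random mating."""
    q = 1.0 - p
    w_bar = p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa
    # An A-bearing gamete pairs with A with probability p and with a with probability q.
    a_A = p * (w_AA - w_bar) + q * (w_Aa - w_bar)
    return a_A, w_bar


def delta_p(p_focal, p_other, m, fitnesses):
    """Eq. (12.13): selective component minus the gene flow component."""
    a_A, w_bar = average_excess(p_focal, *fitnesses)
    return p_focal * a_A / w_bar - m * (p_focal - p_other)


def iterate(p1, p2, m, fit1, fit2, generations):
    """Apply Eq. (12.13) to both demes simultaneously, generation by generation."""
    for _ in range(generations):
        d1 = delta_p(p1, p2, m, fit1)
        d2 = delta_p(p2, p1, m, fit2)
        p1, p2 = p1 + d1, p2 + d2
    return p1, p2


# Hypothetical fitnesses (w_AA, w_Aa, w_aa): A is favored in deme 1, neutral in deme 2.
FIT_DEME_1 = (1.10, 1.05, 1.00)
FIT_DEME_2 = (1.00, 1.00, 1.00)

# Deme 1 starts without the A allele; deme 2 carries it at frequency 0.5.
isolated = iterate(0.0, 0.5, 0.0, FIT_DEME_1, FIT_DEME_2, 2000)
connected = iterate(0.0, 0.5, 0.01, FIT_DEME_1, FIT_DEME_2, 2000)
print(isolated[0], connected[0])
```

With m = 0 the frequency in deme 1 stays at zero no matter how strongly A is favored, whereas with even a small m the allele enters deme 1 and selection can then raise its frequency.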

Suppose that aA > 0 in deme 1, that is, natural selection favors an increase in the frequency of allele A. However, if p1 were initially zero, this adaptive course could never start. Suppose further that the A allele is present in population 2. Once A is introduced into deme 1 through gene flow, natural selection can now operate to increase its frequency. As we saw in Chapter 11, natural selection does not create genetic variants but only operates upon the genetic variation available in the gene pool. Gene flow can be an important source of such variation. The sign of the component of allele frequency change induced by gene flow in Eq. (12.13) is determined solely by the initial difference in allele frequencies among the demes, in this case (p1 − p2). The sign of the selective component of Eq. (12.13) is determined solely by the average excess of
fitness, which in turn is a function of within deme allele frequencies, system of mating, and the genotypic deviations of fitness in the deme. As a result, there is no biological necessity for the selective and gene flow components to have the same sign. In some cases, selection and gene flow can operate in the same direction, allowing an allele to increase (or decrease) in frequency; in other cases, selection and gene flow will be in opposite directions and the evolutionary outcome will depend upon their balance. In general, restricted gene flow will reduce the rate of spread of a beneficial allele relative to panmixia (Barton et al. 2013), but restricted gene flow will reduce the rate of spread of neutral alleles more, and of deleterious alleles even more. Hence, different parts of the genome will effectively have different rates of gene flow because of these interactions with natural selection. This means in turn, from Eq. (6.29), that when selection is present there will be variation across the genome in fst under restricted gene flow. Recall from Chapter 6 that fst measures the amount of variance in allele frequency across subpopulations. The interaction of selection and restricted gene flow causes a sieve-like effect in which some alleles experience effectively augmented gene flow (an allele being positively selected in all subpopulations), neutral alleles experience the general background level of gene flow, whereas deleterious alleles experience effectively less gene flow. In particular, the locus-specific fst should have much higher values than the background level when some of the selection is associated with local adaptation, that is, when different subpopulations experience different environments that select for different alleles. Local adaptation will be discussed in more detail in Chapter 14, but for now recall that we have already seen an example of this in Chapter 11: the adaptations to malaria in human populations that live in malarial regions. 
We saw that the sickle-cell allele, S, is selected for in a malarial region, but selected against in a non-malarial region (Table 11.1). In cases such as this, the average excess in Eq. (12.13) tends to change sign relative to the sign given to the gene flow component. For example, if allele A is favored in subpopulation 1 and selected against in population 2, then generally p1 > p2 and the gene flow component of Eq. (12.13) will be negative. Gene flow from population 2 will reduce the frequency of A in population 1 below its selective optimum whenever p1 > p2, thereby giving allele A a positive average excess for fitness. Because the two components of Eq. (12.13) tend to have opposite signs under local adaptation, the change in allele frequency induced by gene flow is below the background rate m, that is, local adaptation effectively mimics reduced gene flow for the A locus, thereby increasing the value of fst for that selected locus. Lewontin and Krakauer (1973) proposed that testing for heterogeneity of fst’s across loci would be a test of the null hypothesis that all loci are neutral. Finding significant heterogeneity across loci for fst would indicate at least some loci were selected. The selected loci would tend to be high outliers for local adaptation and low outliers for uniform selection across all subpopulations, particularly for balanced polymorphisms when selection tends to homogenize allele frequencies to a common equilibrium (Eq. (11.13)). However, when there are negative fitness interactions between alleles in an environment (as for S and C in a malarial environment) or negative epistasis (which also exists for human malarial adaptations, as will be shown later in this chapter), even selection in a uniform environment can increase fst as different local populations adapt to the same environment under balancing selection with different alleles and/or loci (Brandt et al. 
2018), as we saw with the S and C alleles in different Bantu populations (Figure 11.5). Indeed, Ralph and Coop (2015) show that even selectively equivalent alleles in a balanced polymorphism due to a uniform environment can result in locally differentiated areas when there is sufficiently restricted gene flow across the species range. Hence, although high fst values are usually interpreted as local adaptation, and most likely are, other models of selection can also result in high fst values. Regardless, all these models show that selection can create both high and low values of fst that deviate from the expected background rate due to the balance of drift and gene flow.
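The claim that local adaptation raises fst at a selected locus can be illustrated with a small deterministic sketch. It iterates Eq. (12.13) in two demes until the allele frequencies settle and then computes fst as the variance in allele frequency across demes divided by p̄(1 − p̄). That variance-based definition is an assumption here and may differ in detail from the estimators of Chapter 6. The fitness values, migration rates, and function names are hypothetical. Because drift is ignored, the neutral locus in this sketch converges to an fst of essentially zero rather than to a positive background value.

```python
def selection_term(p, w_AA, w_Aa, w_aa):
    """Selective component p * a_A / w_bar of Eq. (12.13) under random mating."""
    q = 1.0 - p
    w_bar = p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa
    a_A = p * (w_AA - w_bar) + q * (w_Aa - w_bar)
    return p * a_A / w_bar


def settle(fit1, fit2, m, p1=0.9, p2=0.1, generations=5000):
    """Iterate selection plus symmetric gene flow in two demes."""
    for _ in range(generations):
        d1 = selection_term(p1, *fit1) - m * (p1 - p2)
        d2 = selection_term(p2, *fit2) - m * (p2 - p1)
        p1, p2 = p1 + d1, p2 + d2
    return p1, p2


def fst(freqs):
    """Variance in allele frequency across demes, standardized by p_bar * (1 - p_bar)."""
    n = len(freqs)
    p_bar = sum(freqs) / n
    if p_bar <= 0.0 or p_bar >= 1.0:
        return 0.0
    variance = sum((p - p_bar) ** 2 for p in freqs) / n
    return variance / (p_bar * (1.0 - p_bar))


# Hypothetical fitnesses (w_AA, w_Aa, w_aa).
FAVORS_A = (1.10, 1.05, 1.00)
DISFAVORS_A = (0.90, 0.95, 1.00)
NEUTRAL = (1.00, 1.00, 1.00)

fst_neutral = fst(settle(NEUTRAL, NEUTRAL, m=0.01))
fst_local_low_m = fst(settle(FAVORS_A, DISFAVORS_A, m=0.01))
fst_local_high_m = fst(settle(FAVORS_A, DISFAVORS_A, m=0.04))
print(fst_neutral, fst_local_low_m, fst_local_high_m)
```

In this toy setting the locally adapted locus keeps a large fst while the neutral locus is homogenized by the same rate of gene flow, and raising m lowers the fst of the selected locus.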

Figure 12.2 An fst outlier analysis of 15 557 SNPs from 11 populations of Andrew’s toad. [Plot: average pairwise FST (vertical axis, 0.0 to 0.6) for each SNP (horizontal axis, SNPs 0 to 15 000).] Source: Guo et al. (2016). © 2016, John Wiley & Sons.

At the time when Lewontin and Krakauer (1973) made their proposal, not enough loci could be surveyed to make their test practical, but that is no longer a problem. Now we can screen enough markers to cover the entire genome. Neutral markers should have a common background fst that reflects the balance between local drift and gene flow (e.g., Eq. (6.29)), although demographic history can also affect the background fst. Regardless, the fst’s should be roughly the same for all neutral regions of the genome, although correcting for gene flow patterns, isolation by distance, and past range expansion events can reduce the incidence of false outliers (Lotterhos and Whitlock 2014, 2015; Whitlock and Lotterhos 2015).

Guo et al. (2016) surveyed 15 557 SNPs in 11 populations of Andrew’s toad (Bufo andrewsi) from the edge of the Tibetan plateau at an altitude of 2768 m above sea level down to an altitude of 1690 m.a.s.l. They calculated the average pairwise fst for each of these markers, as shown in Figure 12.2. This figure shows that the baseline fst was rather low (the average fst over all SNPs was 0.023), but 132 SNPs (0.8% of the total SNPs) were low outliers and 454 (2.9% of the total SNPs) were high outliers after adjusting for a false discovery rate of 0.01. The low outliers are close to the false discovery rate, but the high outliers are well above it, indicating significant local adaptation in these toad populations. Further insight into the hypothesis of local adaptation is possible by coupling the fst outlier analysis with an environmental outlier analysis. Environmental variables were measured that characterize the habitats in which the subpopulations are located. Guo et al. then looked for significant associations between the marker alleles and the environmental variables after adjustment for population structure using the neutral (nonoutlier) markers.
Figure 12.3 shows the results of the outlier analyses for altitude and average annual temperature for the toad populations. In total, 387 SNPs (2.5%) were significant outliers with correlated variation in altitude, and 750 (4.8%) with temperature (121 were significantly correlated with both altitude and temperature). These environmental outlier SNPs significantly overlapped with the 454 high outlier fst SNPs, with 43 (9.5%) being correlated with altitude and 59 (13.0%) with temperature, a pattern that further supports the hypothesis of local adaptation. Correlations with environmental variables need to be interpreted cautiously, however. For example, computer simulations of species that had experienced range expansion reveal that much genetic diversity can align with the expansion axis and create false-positive associations with environmental variables under nonequilibrium conditions (Frichot et al. 2015). It is, therefore, best to investigate the phylogeography of a species (Chapter 7) before interpreting ecological associations and local adaptation.

Figure 12.3 An environmental outlier analysis of 15 557 SNPs from 11 populations of Andrew’s toad. Panel (a) is an analysis of altitude, and panel (b) is an analysis of average annual temperature. [Plots: log10(BF) (vertical axis, 0 to 25) for each SNP (horizontal axis, SNPs 0 to 15 000) in each panel.] Source: Guo et al. (2016). © 2016, John Wiley & Sons.

Outlier fst scans have many robust statistical properties (Jones et al. 2019; Matthey-Doret and Whitlock 2019), but the simulations of Crawford and Nielson (2013) indicate that when the background fst is >0.20, the power tends to decrease. However, they also found that admixed populations can be used to map differentially selected loci with high power even for very large fst’s. We, therefore, turn our attention to admixture, another type of gene flow (Chapter 6). Just as selection interacts with recurrent gene flow to create a selective sieve of genetic interchange, selection does the same with admixture. Workman et al. (1963), Workman (1973), and Adams and Ward (1973) were early advocates of detecting natural selection through the acceleration and retardation of selected genomic regions as deviations from the background admixture rate. These groups focused on the example of African Americans, a population influenced by asymmetric admixture involving demes originally derived from Europe and West Africa (Chapter 6). They measured the allele-specific impact of admixture over several generations by estimating M (the current proportion of admixture, Chapter 6) by:

$$M = \frac{p_A - p_W}{p_E - p_W} = \frac{\text{change in allele frequency in African Americans from West Africans}}{\text{initial difference in allele frequency between Europeans and West Africans}} \qquad (12.14)$$
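A minimal sketch of the calculation in Eq. (12.14) follows. Here pA, pW, and pE are read as the allele frequencies in African Americans, West Africans, and Europeans, respectively. The function name and the allele frequencies are hypothetical and serve only to show how a background value, a slightly negative value, and an elevated value of M can arise.

```python
def admixture_proportion(p_admixed, p_west_african, p_european):
    """Eq. (12.14): M = (pA - pW) / (pE - pW)."""
    initial_difference = p_european - p_west_african
    if initial_difference == 0:
        raise ValueError("M is undefined when the parental populations do not differ")
    return (p_admixed - p_west_african) / initial_difference


# Hypothetical allele frequencies, chosen only to illustrate the calculation.
background = admixture_proportion(0.155, 0.10, 0.60)   # close to 0.11
retarded = admixture_proportion(0.19, 0.20, 0.45)      # slightly negative M
accelerated = admixture_proportion(0.06, 0.10, 0.00)   # M well above background
print(round(background, 3), round(retarded, 3), round(accelerated, 3))
```

Because M is scaled by the initial frequency difference, alleles with very different starting frequencies can be compared on the same footing. Estimates become unstable when pE and pW are nearly equal.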

Because M is standardized for the initial allele frequency difference, M should be identical for all neutral alleles at all polymorphic loci showing any initial difference in allele frequency between Europeans and West Africans if admixture were the only evolutionary force operating. Adams and Ward (1973) estimated M for alleles at several loci in African Americans from Claxton, Georgia (Table 12.1) and found significant heterogeneity across alleles. Such heterogeneity implies that selection may have altered the allele frequency dynamics at some loci.

Table 12.1 Admixture estimates (M) for several loci in African Americans from Claxton, Georgia, U.S.A.

| Allele | M | Possible Explanation |
|---|---|---|
| Ro | 0.107 | Alleles at several blood group loci that may be neutral and that may reflect the background impact of asymmetric gene flow resulting in about 11% admixture in this African American population |
| R1 | 0.110 | (as above) |
| r | 0.117 | (as above) |
| Fya | 0.108 | (as above) |
| P | 0.092 | (as above) |
| Jka | 0.164 | (as above) |
| A | −0.037 | Materno-fetal incompatibility at the ABO blood group locus may select against European alleles |
| R2 | 0.446 | Unknown |
| T | 0.466 | Unknown |
| Hp1 | 0.619 | Alleles at loci implicated with malarial adaptation. Selection may occur against African alleles in a non-malarial environment |
| G6PD A− | 0.395 | (as above) |
| β-Hb S | 0.614 | (as above) |

Note: The estimates were significantly heterogeneous across loci. The last column gives possible explanations for that heterogeneity. Source: Modified from Adams and Ward (1973).

A majority of the alleles yields an estimate of M of about 0.11 for this African–American population, and
combining the estimates across all alleles as weighted by the variances of the estimators yields an overall M of 0.13. The most straightforward interpretation of these results is that these alleles are neutral or nearly neutral in this population and are reflecting the background impact of about 11% admixture. However, there are many allelic outliers from this apparently neutral background M. For example, the A allele at the ABO blood group locus shows a slightly negative M, but not significantly different from 0. Based on this allele alone, there seems to be no admixture at all. Waterhouse and Hogben (1947) have implicated the A allele as a cause of fetal wastage (spontaneous abortions) of AO fetuses arising from matings between women with blood type O (genotype OO) with men of blood type A (genotypes AA or AO) (this type of selection will be examined in more detail in the next chapter). Bottini et al. (2001) have shown increased recurrent spontaneous abortion in matings between women of blood type B (genotype BB or BO) with men of blood type A. West African populations have a higher frequency of the O and B alleles and a lower frequency of the A allele than European populations (Adams and Ward 1973), and much of the original admixture is known historically to be between men of European ancestry with women of African ancestry. This allele frequency and mating pattern would result in the preferential elimination of A bearing fetuses, thereby reducing the flow of A alleles from the European–American population into the African–American population. In contrast to the A allele at the ABO locus, there are several alleles at other loci that have high values of M, about 0.4 or above. A majority of these alleles have been implicated in malarial adaptation, and moreover these same alleles are often associated with deleterious effects in a nonmalarial environment (as already discussed for the G6PD A− and β-Hb S alleles). 
Selection for malarial resistance was much reduced in North America relative to tropical West Africa, so it is reasonable to conclude that selection would be occurring against the anti-malarial alleles in the North American environment. Selection would favor an increase in alleles of European ancestry
at these loci, thereby accentuating the effects of asymmetric gene flow. Overall, the results shown in Table 12.1 illustrate that natural selection interacts with admixture as a selective filter that blocks or retards the flow of some genes from one population to another but accentuates the spread of others. Our ability to investigate the genetic consequences of admixture has increased substantially since the 1970s (Chapters 6 and 10), so this method of detecting selection can now be done by scanning for genomic regions that are outliers above and below the background admixture rate. For example, Norris et al. (2020) scanned for admixture outliers in Latin–American populations that have a history of admixture between three continental populations: Native Americans, Europeans, and sub-Saharan Africans. They first defined ancestry-specific haplotypes (recall AIMs from Chapter 10) and used these to estimate the ancestry fractions (fanc) for each of the three ancestral populations as fanc(i) = hanc(i)/htot, where i indexes the three ancestral populations, hanc(i) is the number of ancestral haplotypes from ancestral population i in the sampled genomic region of interest, and htot is the total number of ancestry-assigned haplotypes. The mean, μanc(i), and standard deviation, σanc(i), of ancestry from population i across the genome were then calculated, and the normalizing transformation (Appendix 2) was used as a single-locus ancestry enrichment score, zanc(i):

$$z_{anc(i)} = \frac{f_{anc(i)} - \mu_{anc(i)}}{\sigma_{anc(i)}} \qquad (12.15)$$
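The enrichment score of Eq. (12.15) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the pipeline of Norris et al. (2020). The haplotype counts are invented, the standard deviation is taken as the population standard deviation across regions, and the p-value is assumed to be a one-sided upper tail of a standard normal distribution. Correction for multiple testing is not shown.

```python
from statistics import NormalDist, mean, pstdev


def enrichment_scores(h_anc, h_tot):
    """Return (f, z, p) lists for one ancestry across genomic regions.

    h_anc[k] is the number of haplotypes assigned to the ancestry in region k,
    and h_tot[k] is the total number of ancestry-assigned haplotypes there.
    """
    f = [a / t for a, t in zip(h_anc, h_tot)]
    mu = mean(f)
    sigma = pstdev(f)  # assumption: population standard deviation across regions
    z = [(x - mu) / sigma for x in f]
    # Assumption: one-sided upper-tail p-value from a standard normal distribution.
    p = [1.0 - NormalDist().cdf(score) for score in z]
    return f, z, p


# Hypothetical haplotype counts for six genomic regions (not data from the study).
counts = [20, 22, 18, 21, 19, 60]
totals = [200] * 6
f, z, p = enrichment_scores(counts, totals)
top = max(range(len(z)), key=lambda k: z[k])
print(top, round(z[top], 2), round(p[top], 3))
```

A real scan would involve many thousands of regions, so the genome-wide mean and standard deviation would be far less sensitive to a single enriched region than they are in this six-region toy example.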

Initial p-values were calculated from these scores, and corrections were made for multiple testing. Figure 12.4 shows the results for a scan specifically of enrichment for African ancestry to detect positively selected genomic regions from Africa in the current admixed Latin–American populations. A major region for positive selection was detected on chromosome 6, and further mapping revealed that this positive selection occurred in the major histocompatibility complex (MHC), and specifically in a region involved in immune response. These results suggest that admixture enabled extremely rapid adaptive evolution within the last 500 years (about 20 generations) in these human populations.

Figure 12.4 A genome scan for outliers for enrichment of African ancestry in Latin American populations. [Plot: −log10 P for African ancestry enrichment (vertical axis, 0 to 6) by chromosome (horizontal axis, 1 to 22).] Dots indicate the p-values for specific genomic regions, with contrasting shading to indicate different chromosomes. A dashed line shows the genome-wide significance threshold. Source: Norris et al. (2020). Licensed under CC-BY-4.0.

The results of Norris et al. (2020) as well as the older results of Adams and Ward (1973) indicate that admixture should be considered as a fundamental mechanism for the acceleration of adaptive
evolution. Admixture, even at low levels, often creates a major influx of new genetic variation into a population. As discussed in Chapter 11, natural selection is limited by the genetic variation that is available to a population either by mutation or gene flow (including admixture), so such a pulse of new genetic variation can open up many new adaptive possibilities. For example, modern humans expanded out of sub-Saharan Africa around 130 000 years ago, and soon their fossil record extended from western North Africa to eastern China (Chapter 7). However, there was a long delay before these modern humans expanded into the more northern parts of the Eurasian continent around 50 000 years ago (Figure 7.14). What caused this delay? Modern humans and Neanderthals overlapped during this lag period in the Levant, so admixture was certainly possible during this time period. Greenbaum et al. (2019a) showed through simulations that infectious diseases and low levels of admixture can provide a plausible explanation for this delay. Their model shows that a high disease burden negatively influences the admixture rate, and indeed the admixture rate between modern humans and Neanderthals was low (Chapter 7). This results in a persistent quasi-stable phase in their simulations. Eventually in their simulations sufficient adaptive introgression in response to novel pathogens occurred through the limited admixture, resulting in a transition from the quasi-stable phase to a rapid phase of evolution that allowed a secondary population range expansion to occur at the expense of the archaic populations. Supporting this hypothesis is the observation that the introgressed genomic segments from archaic populations that are present in modern humans are significantly enriched for signals of positive selection, many of which are related to immune function (Greenbaum et al. 2019a; Hsieh et al. 2019; Gokcumen 2020; Rees et al. 2020).
Hence, adaptive introgression resulting from low levels of admixture may have played a major role in shaping modern human evolution, as illustrated in Figure 7.14. The evolutionary significance of admixture cannot be judged just from its average level. Because of the interaction of natural selection with admixture, even low levels of admixture can have a large evolutionary impact.

The Interaction of Natural Selection with Genetic Drift

Genetic drift induces random changes in allele frequency, so the second term in Eq. (12.2) when applied to drift is expected to vary in sign at random from generation to generation. Hence, unlike gene flow, genetic drift results in no consistent enhancement or retardation of the effects of selection. To R. A. Fisher, this meant that genetic drift would play little role in shaping adaptive outcomes unless the population size is extremely small or, as we will see below, in dealing with the fate of a newly arisen mutation. To Fisher, selection (the first term in Eq. (12.2)) provides a consistent direction to evolution or maintains an equilibrium, whereas drift causes random deviations from the selective trajectory that tend to cancel each other out in the long run. Under this view, drift has little long-term importance in adaptive evolution. As we will see later in this chapter, Sewall Wright disagreed with this point of view and contended that drift could under some conditions strongly influence the course of adaptive evolution through its interactions with natural selection. However, we begin this section by examining the fate of newly arisen, selected mutations, an area in which both Fisher and Wright agreed that drift plays an important evolutionary role. Drift is always strong when dealing with newly arisen mutant alleles because, regardless of the total population size, the mutant is originally present as a single copy and hence is subject to maximal sampling error. To see how selection and drift interact to determine the fate of newly arisen alleles, recall from Chapter 5 that the probability of a newly arisen, neutral, autosomal allele surviving in a population and going to fixation is 1/(2N), its initial allele frequency as a new mutation in a population of
size N. Let this probability of fixation of a neutral mutation be given by u0 = 1/(2N). Natural selection changes this probability. For example, suppose initially that the population is fixed for allele a at an autosomal locus in a finite randomly mating deme of size N, but mutation creates a single copy of the new allele A. Suppose further that A is a favorable allele under the existing environmental conditions such that the relative fitnesses are 1 for aa, 1 + s for Aa, and 1 + 2s for AA, where the selection coefficient s is greater than zero. In this case aA is greater than zero for every p > 0, so under natural selection alone (Eq. (12.1)) we would expect the favorable A allele to go to fixation. However, the second term of Eq. (12.2) can take on negative values by chance due to drift, and when the A allele is still rare in the population, the drift term can be sufficiently negative to lower the frequency of A, even to the point of loss, contrary to the predictions of the fundamental theorems of natural selection. In particular, the probability of fixation of the favorable allele A is not one, as expected from selection alone, nor u0, as expected from drift alone, but rather is (Crow and Kimura 1970):

$$u = \frac{1 - e^{-2 (N_{ev}/N) s}}{1 - e^{-4 N_{ev} s}} \qquad (12.16)$$

where Nev is the variance effective size. For simplicity, let Nev = N; then Eq. (12.16) becomes:

$$u = \frac{1 - e^{-2s}}{1 - e^{-4Ns}} \qquad (12.17)$$
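Equations (12.16) and (12.17) are easy to evaluate numerically, and the following sketch does so for a single new mutant. The function name and the overflow guard are choices made for this illustration. Passing a negative s is also an assumption of the sketch: algebraically it gives (e^(2|s|) − 1)/(e^(4N|s|) − 1), the expression used later in the text for a deleterious allele. Values computed this way should be close to the tabulated ones, although small differences in the last digit can arise from rounding.

```python
import math


def fixation_probability(s, N, N_ev=None):
    """Eq. (12.16) for a single new mutant; Eq. (12.17) when N_ev is left equal to N."""
    if N_ev is None:
        N_ev = N
    if s == 0:
        return 1.0 / (2.0 * N)  # neutral case, u0
    if -4.0 * N_ev * s > 700.0:
        return 0.0  # strongly deleterious in a large population; avoids overflow
    numerator = -math.expm1(-2.0 * (N_ev / N) * s)   # 1 - exp(-2 (N_ev/N) s)
    denominator = -math.expm1(-4.0 * N_ev * s)       # 1 - exp(-4 N_ev s)
    return numerator / denominator


# Favorable (s = 0.01) and deleterious (s = -0.01) mutants across population sizes.
for N in (10, 50, 100, 10**6):
    u0 = 1.0 / (2.0 * N)
    print(N, round(fixation_probability(0.01, N), 4),
          round(fixation_probability(-0.01, N), 5), u0)
```

Using `math.expm1` rather than `1 - math.exp(...)` keeps the ratio accurate when s is very small, which is the regime in which u approaches the neutral value 1/(2N).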

If we take the limit of Eq. (12.17) as s goes to 0 (that is, the A allele approaches neutrality), we get (using Taylor’s theorem) u ≈ 2s/(4Ns) = 1/(2N). Consequently, the neutral fixation probability, u0, is a special case of Eq. (12.17) when the selection coefficient becomes very small. As selection becomes stronger, drift and natural selection interact with one another to influence the evolutionary fate of the favorable allele, as indicated by the product 4Ns appearing in the denominator of Eq. (12.17). In this simple codominant fitness model, the strength of natural selection as measured by the average excess is proportional to s when the allele is rare, and the strength of drift is, of course, proportional to 1/N. Hence, the product Ns is the ratio of the strength of selection divided by the strength of drift. As we have seen many times before, it is the relative strengths of two evolutionary forces that must be taken into account to determine their joint evolutionary impact.

Table 12.2 The fixation probabilities of a selectively favorable mutant allele with s = 0.01 as a function of population size (N).

| N | u | u0 = 1/(2N) | Percent Change in Selected Case from Neutral Case |
|---|---|---|---|
| 10 | 0.061 | 0.050 | 22% |
| 50 | 0.023 | 0.010 | 129% |
| 100 | 0.020 | 0.005 | 300% |
| ∞ | 0.020 | 0.000 | |

Note: The fixation probabilities of a neutral mutant in a population of the same size are also shown, as well as the percent increases in the fixation probability in the selected case compared with the neutral case, 100(u − u0)/u0.

Table 12.2 shows the values of u and u0 for s = 0.01 and for various values of N. This table shows that as N gets larger (selection becomes stronger relative to drift), selection is able to cause ever increasingly large deviations from the expectations under pure genetic drift (neutrality). Note, however, that the probability of fixation of this favorable mutant stabilizes at 0.020. This result may seem surprising given that selection favors the fixation of the A allele, yet even in an effectively
infinite population the favored allele is much more likely to be lost than fixed. This is not just limited to the case of s = 0.01. In general, as N goes to infinity, the denominator in Eq. (12.17) goes to 1, so the fixation probability is approximately 1 − e^(−2s) in large populations. When s is small, 1 − e^(−2s) ≈ 2s. This approximation predicts that the fixation probability of a favorable allele with s = 0.01 should be 0.02 in a large population, as is indeed the case in Table 12.2. Note that in this case, the favorable allele will be lost 98% of the time. Hence, even in a large population, the most likely fate of a selectively favorable allele is to be lost due to genetic drift. This occurs because genetic drift overpowers selection in the first few generations, during which it is likely that only one or a handful of copies of the favorable allele exist in the population.

Because the direction of genetic drift is random, it is even possible for genetic drift to overpower natural selection and lead to the fixation of a deleterious mutant. Suppose now that A is a deleterious allele under the existing environmental conditions such that the relative fitness is 1 for aa, 1 − s for Aa, and 1 − 2s for AA, where the selection coefficient s is greater than zero. In this case, aA is less than zero for every p > 0, so under natural selection alone (Eq. (12.1)) we would expect the deleterious A allele to be lost from the population. However, with drift, the fixation probability for the deleterious mutant is u = (e^(2s) − 1)/(e^(4Ns) − 1), which is positive for all finite values of N. Thus, a deleterious allele has a chance of going to fixation in a finite population, in violation of the fundamental theorems of natural selection. Table 12.3 shows these fixation probabilities for s = 0.01 as a function of population size. As the strength of genetic drift decreases relative to selection with increasing population size, the deviations from neutrality are accentuated in magnitude.
However, unlike the case for an advantageous mutant, as population size becomes large, the results converge to that expected from selection alone (elimination of the deleterious allele). This occurs because the primary effect of genetic drift in a large population is to eliminate rare alleles. For an advantageous allele, this effect of drift was in opposition to selection, but in the case of a deleterious allele, both selection and drift tend to reinforce one another in eliminating the newly arisen rare allele.

Table 12.3 The fixation probabilities of a selectively deleterious mutant allele with s = 0.01 as a function of population size (N).

| N | u | u0 = 1/(2N) | Percent Change in Selected Case from Neutral Case |
|---|---|---|---|
| 10 | 0.041 | 0.050 | −18% |
| 50 | 0.003 | 0.010 | −68% |
| 100 | 0.00038 | 0.005 | −92% |
| ∞ | 0.000 | 0.000 | |

Note: The fixation probabilities of a neutral mutant in a population of the same size are also shown, as well as the percent change in the fixation probability in the selected case compared with the neutral case, 100(u − u0)/u0.

Although Fisher and Wright agreed that drift can interact with selection to have a large impact on newly arisen mutations, Fisher looked upon drift as primarily limiting the amount of genetic variation that would effectively enter a gene pool. However, once a new variant had achieved an appreciable frequency, Fisher regarded the adaptive process as being dominated by natural selection. Wright, in contrast, felt that adaptively important interactions between natural selection and genetic drift could emerge when multiple selective equilibria existed with unequal average fitness (multiple “peaks” of unequal height in the adaptive landscape metaphor introduced by Wright 1932). There are two aspects to the adaptive importance of genetic drift when multiple selective
peaks exist; first, the role of genetic drift in determining the initial conditions upon which selection acts; and second, the role of genetic drift in allowing populations to evolve from one selective peak to another in violation of Fisher’s fundamental theorem (called peak shifts). To illustrate the potential importance of genetic drift in influencing the initial conditions under which selection operates, let us return to the hemoglobin β-chain locus with the three alleles, A, S, and C. Figure 12.5 shows the adaptive landscape under the non-malarial environmental fitness given in Table 11.1 in a randomly mating population. This represents the adaptive landscape in West Africa before malarial conditions were created by the introduction of the Malaysian agricultural complex into that area. Instead of a peak, the adaptive landscape in this non-malarial environment has a “ridge” connecting the points corresponding to the fixation of the A and C alleles. This reflects the fact that in the non-malarial environment, the A and C alleles are neutral with respect to one another. Figure 12.5 shows the adaptive trajectories from the same four initial conditions illustrated in Figure 11.7 under a malarial environment. In the non-malarial environment, adaptive evolution simply results in the elimination of the deleterious S allele (or its near loss, as selection becomes ineffective against this deleterious recessive allele as it becomes rare) and takes the population to a point on the A/C adaptive ridge. However, once on top of this ridge, there is no longer any evolution induced by natural selection. The ridge surface is a neutral one influenced only by genetic drift. Hence, populations are free to evolve toward any point on this ridge according to the random dictates of drift. However, once the environment changes to a malarial one (Figure 11.7), none of the

C A

1

S 0.8

w 0.6

0.4 C A

pS

pA

pC

S

0.2

Figure 12.5 The adaptive landscape defined by the A, S, and C alleles at the β-Hb locus in a randomly mating population using the fitnesses given in Table 11.1 for a non-malarial environment. All the graphing conventions used in Figure 12.1 are used in this figure. An adaptive ridge exists in this adaptive surface connecting the A and C vertices, corresponding to all possible A/C polymorphic states with no S alleles. Black dots indicate the initial state of the gene pool for four populations: one starting at pA = 0.95, pS = 0.025, and pC = 0.025; a second at pA = 0.85, pS = 0.025, and pC = 0.125; a third at pA = 0.85, pS = 0.125, and pC = 0.025; and a fourth at pA = 0.75, pS = 0.125, and pC = 0.125. Black lines from these dots plot the evolutionary trajectory across generational time as defined by Eq. (12.1), ending at the selective equilibria points indicated by white dots. All equilibria have the same average fitness, and genetic drift can therefore occur on the ridge.


populations shown in Figure 12.5 would be upon an adaptive peak. However, the two populations close to fixation of A would evolve under natural selection toward the A/S polymorphic adaptive peak, whereas the two populations that had higher initial frequencies of C would evolve toward the fixation for C adaptive peak. Evolution upon a ridge of neutrality governed by genetic drift could produce local populations with a variety of initial frequencies of the A and C alleles, which in turn would lead to highly divergent adaptive outcomes once the environment changed. Such neutral ridge evolution may be a contributor, perhaps even the determining factor, to the divergent adaptive trajectories observed in West African populations today in which some increase the frequency of S and others the frequency of C (Figure 11.5). In this manner, genetic drift can create a diversity of initial conditions for allelic subsets that are neutral under one environmental condition that strongly influence the adaptive trajectory when the environment is altered. This adds to the difficulty in predicting the course of adaptive evolution, but it also increases the diversity of adaptive responses shown by populations to altered environments. The other adaptively important interaction between selection and drift is a peak shift. Recall that Fisher’s fundamental theorem of natural selection requires populations to evolve under selection only in a manner that increases average fitness. In terms of the metaphor of an adaptive landscape, this means that it is impossible for natural selection to cause a population to evolve “downhill.” However, genetic drift can cause populations to evolve in a manner that violates the fundamental theorem, as we already saw in Tables 12.2 and 12.3. 
This ability of genetic drift to cause violations of the fundamental theorem creates a finite chance that a population will evolve “downhill” and cross a selective “valley” to reach the slopes of a different adaptive peak. Once in the domain of this new adaptive peak, natural selection makes it more likely to evolve toward the top of the new peak rather than return to the original peak. To see this, consider first a simple two-peak adaptive landscape defined by a one locus, two allele system in a randomly mating population with the following fitnesses: AA with a fitness of 1; Aa with a fitness of 0.9; and aa with a fitness of 0.95. The adaptive

[Figure 12.6 plot: average fitness w (0.92 to 1) against p (0.0 to 1), with a vertical line at p = 0.333.]

Figure 12.6 The adaptive landscape for a one locus, two allele model with the fitness of AA = 1, of Aa = 0.9, and of aa = 0.95. The surface represents a plot of w under the assumption of random mating against p, the frequency of the A allele. A line at p = 0.333 indicates the minimum value of w and an unstable equilibrium.

[Figure 12.7 plot: allele frequency p (0.0 to 1.0) against generation (0 to 400).]

Figure 12.7 Eight runs of a computer simulation of an idealized randomly mating population of size 100 with the fitness model described in Figure 12.6, all starting with the initial condition of p = 0.3.

landscape for this example is plotted in Figure 12.6 as a function of p, the frequency of the A allele. This adaptive landscape has two adaptive peaks, one at p = 0 with $\bar{w} = 0.95$ and the second at p = 1 with $\bar{w} = 1$. The average excess of fitness of the A allele in this case is $a_A = p(1) + (1 - p)(0.9) - \bar{w}$, and the average excess of a is $a_a = p(0.9) + (1 - p)(0.95) - \bar{w}$. At the polymorphic equilibrium, $a_A = a_a$, so $p + (1 - p)(0.9) = p(0.9) + (1 - p)(0.95)$, which yields $p_{eq} = 1/3$. This polymorphic equilibrium point is also indicated in Figure 12.6. However, as can be seen from Figure 12.6, this equilibrium point is at a fitness valley, not a peak. Therefore, depending upon which side of p = 1/3 a population initially lies, it should either go to fixation of the a allele (initial p < 1/3) or the A allele (initial p > 1/3). Suppose now that a population was at a selective equilibrium with p = 0.3, but the environment shifts to create a new adaptive surface like that shown in Figure 12.6. According to Eq. (12.1) and the fundamental theorem of natural selection, this population should evolve toward loss of the A allele and go to the lower peak shown in Figure 12.6. It is impossible under natural selection alone for such a population to evolve toward fixation of A and to thereby achieve the higher fitness peak. However, Figure 12.7 shows the results of eight runs of a computer simulation of an idealized, randomly mating population of size 100 with these fitnesses and an initial allele frequency of 0.3. As can be seen, two of the eight runs evolved to the peak defined by p = 1: a clear violation of the fundamental theorem of natural selection. Hence, drift interacts with selection to allow evolutionary outcomes that would be prevented by selection alone. In particular, drift plus selection allows for peak shifts that would be prevented by natural selection in large local populations.
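The deterministic behavior described above can be checked numerically. The following Python sketch is only an illustration: it assumes the standard one-locus recursion in which the change in p per generation equals p times the average excess of A divided by the average fitness, and the function names are hypothetical rather than taken from the text.

```python
"""Deterministic selection on the two-peak landscape of Figure 12.6 (a sketch)."""

W_AA, W_Aa, W_aa = 1.0, 0.9, 0.95  # genotype fitnesses given in the text


def mean_fitness(p):
    """Average fitness under random mating; p is the frequency of A."""
    q = 1.0 - p
    return p * p * W_AA + 2.0 * p * q * W_Aa + q * q * W_aa


def average_excess_A(p):
    """Average excess of fitness of the A allele at frequency p."""
    return p * W_AA + (1.0 - p) * W_Aa - mean_fitness(p)


def next_p(p):
    """One generation of selection (assumed recursion, see lead-in)."""
    return p + p * average_excess_A(p) / mean_fitness(p)


def iterate(p0, generations=2000):
    """Apply selection alone for a fixed number of generations."""
    p = p0
    for _ in range(generations):
        p = next_p(p)
    return p


if __name__ == "__main__":
    for p in (0.0, 1.0 / 3.0, 1.0):
        print(f"p = {p:.3f}  average fitness = {mean_fitness(p):.4f}")
    for p0 in (0.30, 0.34):
        print(f"initial p = {p0:.2f} -> p after 2000 generations = {iterate(p0):.4f}")
```

Starting points on either side of p = 1/3 should diverge toward the two different peaks, which is the behavior that selection alone enforces and that drift can override.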
Although drift allowed peak shifts for some of the runs of this computer simulation, six out of the eight runs evolved to the peak defined by p = 0, thereby obeying the fundamental theorem of natural selection (Figure 12.7). In general, the direction of evolution is biased in favor of the direction initially favored by natural selection. Nevertheless, the fact that some runs went to p = 0 and others to p = 1 shows that genetic drift permits populations to “explore” the adaptive landscape more thoroughly than by natural selection alone. Note also that it is impossible to predict the outcome of any given population in the runs shown in Figure 12.7. We can only describe what should happen if the evolutionary “experiment” were to be repeated many times. If the species of interest were


subdivided into several small subpopulations with the same initial allele frequencies, we would have multiple replications of this “experiment.” Each of the subpopulations would evolve, and some would go to the higher peak even though, given the initial conditions, doing so violates Fisher’s fundamental theorem. Moreover, when we have multiple subpopulations, another evolutionary force can enter: gene flow. Indeed, Wright considered gene flow a critical force for modulating the adaptive interactions between selection and drift, as we will now see.
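The idea of repeating the evolutionary “experiment” many times can be sketched in Python. This is a minimal illustration, not the program used for Figure 12.7: it assumes a Wright-Fisher scheme in which selection acts first and the next generation is formed by binomial sampling of 2N gametes, and the function names, seeds, and replicate counts are hypothetical choices.

```python
import random

W_AA, W_Aa, W_aa = 1.0, 0.9, 0.95  # fitnesses from Figure 12.6


def select(p):
    """Frequency of A after one round of selection under random mating."""
    q = 1.0 - p
    w_bar = p * p * W_AA + 2.0 * p * q * W_Aa + q * q * W_aa
    return p * (p * W_AA + q * W_Aa) / w_bar


def run_deme(p0, n_individuals, max_generations, rng):
    """Follow one isolated deme until fixation, loss, or max_generations."""
    n_genes = 2 * n_individuals
    p = p0
    for _ in range(max_generations):
        if p <= 0.0 or p >= 1.0:
            break
        p_sel = select(p)
        # Drift: binomial sampling of 2N gametes (assumed sampling scheme).
        count = sum(1 for _ in range(n_genes) if rng.random() < p_sel)
        p = count / n_genes
    return p


def peak_shift_fraction(p0, n_individuals=100, replicates=100,
                        max_generations=2000, seed=1):
    """Fraction of replicate demes that end fixed for A (the p = 1 peak)."""
    rng = random.Random(seed)
    finals = [run_deme(p0, n_individuals, max_generations, rng)
              for _ in range(replicates)]
    return sum(1 for p in finals if p >= 1.0) / replicates


if __name__ == "__main__":
    for p0 in (0.1, 0.3, 0.5):
        print(f"initial p = {p0}: fraction reaching p = 1 = "
              f"{peak_shift_fraction(p0):.2f}")
```

Individual runs remain unpredictable; only the fraction of replicates reaching each peak is a stable quantity, and it should rise as the initial frequency moves toward and past the valley at p = 1/3.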

The Interactions of Natural Selection, Genetic Drift, and Gene Flow

Wright (1932) felt that there would be virtually no chance for a peak shift in a local population with a large variance effective size. Instead, such a population would evolve toward the nearest local peak, and then selection would maintain the population on that local peak, even if a vastly superior adaptive alternative existed. We have already seen an example of this with the β-Hb locus in human populations living in the malarial regions of Africa; most such populations have evolved toward the A/S polymorphic peak even though the peak at fixation for C seems to be a far superior adaptive outcome in terms of average fitness (Figure 11.7), the percentage of the population protected against malaria, the degree of protection against malaria, and the elimination of deleterious side effects such as severe hemolytic anemia from the population. Because selection could keep populations on inferior adaptive peaks, Wright regarded natural selection as a potential impediment to global adaptation on the adaptive landscape when selection strongly dominates over genetic drift. On the other hand, an isolated population of small variance effective size is also not optimal for adaptive evolution. Genetic drift is powerful in such a population, but the primary manifestation of the power of drift is the rapid loss of genetic variation. Without genetic variation, there is no evolution of any sort. What is needed is to have local demes of small variance effective size that simultaneously have access to much genetic variation. As shown in Chapters 6 and 9, population subdivision can both induce small variance effective sizes at the local deme level and maintain higher levels of genetic diversity and additive genetic variance at the global population level than an equally sized panmictic population (see Eqs. (6.47) and (9.28)).
What is needed is the right amount of gene flow between the local demes: too much, and the population becomes effectively panmictic and selection prevents genetic drift from allowing populations to explore the adaptive landscape; too little, and the local demes generally have little genetic variation and hence low adaptive potential. Under an island model of population subdivision, Wright (1932) argued that an Nm term (the product of the local variance effective size times the migration rate) of about one provided the right balance. This conclusion has been supported by Barton and Rouhani (1993) who showed that peak shifts are most likely if Nm is slightly below one. The analytical work and computer simulations of Bitbol and Schwab (2014) reveal that population subdivision with limited gene flow can facilitate adaptive peak shifts but that other factors can affect the optimal amount of restricted gene flow such as the mutation rate, the number of demes, local deme size, and the strength of selection. Instead of an optimal value, these parameters create a range or plateau of close to optimal values of restricted gene flow to accelerate peak shifts (Bitbol and Schwab 2014). To illustrate peak shifts in a subdivided population, a computer simulation was run using the same fitness parameters as that given in Figure 12.6 and with the local variance effective size being 100, as in Figure 12.7, but now with an island model of gene flow with m = 0.01. Hence, Nm = 1 in these simulations. As before, we assume that the population had adapted to a previous environment with an equilibrium peq = 0.3. More of the local demes will tend to go to the p = 0 peak than the


p = 1 peak upon the environmental change, so we will assume that the average frequency of the A allele in the global population is 0.1. Note that 0.1 is well below the threshold of p = 1/3 that separates the domain of the lower peak from that of the higher. Hence, gene flow in these simulations acts as a directional force to bring local populations to the lower fitness peak. Figure 12.8 shows eight runs of the evolution of a local deme under the island model of subdivision, each starting with p = 0.3 but receiving input at rate m = 0.01 every generation from the global gene pool with p = 0.1. As can be seen, two of the eight local demes still evolved toward the higher peak (although gene flow now prevents fixation of A). Obviously, gene flow is not making peak shifts impossible despite the fact that gene flow is consistently pulling down the local frequency of A in these simulations. We need to keep in mind the balances of the evolutionary forces relative to one another to understand why gene flow is not preventing shifts to the higher allele frequency peak even though gene flow is a directional force against shifts to the higher allele frequency peak in this case. Gene flow in one sense is helping genetic drift to explore the adaptive surface because gene flow helps maintain local genetic diversity. This allows both selection and drift to operate at the local level. As long as there is genetic variation within the deme, even the demes that are near the lower frequency peak still have a finite chance of undergoing a peak shift (Figure 12.8). Without gene flow, once fixed on the lower peak, there is no possibility of a peak shift for an isolated deme in the absence of mutation (Figure 12.7). Moreover, the amount of gene flow is still sufficiently low (Nm = 1) so that genetic drift can overcome locally the biases of both selection and directional gene flow to cause a peak shift. 
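A local deme under this island model can be sketched as follows. The sketch is not the program behind Figure 12.8; it assumes a life cycle of selection, then gene flow from a fixed global gene pool, then binomial sampling of 2N gametes, and the function and parameter names are hypothetical.

```python
import random

W_AA, W_Aa, W_aa = 1.0, 0.9, 0.95  # fitnesses from Figure 12.6


def select(p):
    """Frequency of A after one round of selection under random mating."""
    q = 1.0 - p
    w_bar = p * p * W_AA + 2.0 * p * q * W_Aa + q * q * W_aa
    return p * (p * W_AA + q * W_Aa) / w_bar


def simulate_deme(p0=0.3, n_individuals=100, m=0.01, p_global=0.1,
                  generations=400, seed=None):
    """Trajectory of p in one local deme of an island model (Nm = N * m)."""
    rng = random.Random(seed)
    n_genes = 2 * n_individuals
    p = p0
    trajectory = [p]
    for _ in range(generations):
        p_sel = select(p)
        # Gene flow: a fraction m of the gametes comes from the global pool.
        p_mig = (1.0 - m) * p_sel + m * p_global
        # Drift: binomial sampling of 2N gametes (assumed order of events).
        count = sum(1 for _ in range(n_genes) if rng.random() < p_mig)
        p = count / n_genes
        trajectory.append(p)
    return trajectory


if __name__ == "__main__":
    for run in range(8):
        final_p = simulate_deme(seed=run)[-1]
        side = "higher" if final_p > 1.0 / 3.0 else "lower"
        print(f"run {run}: final p = {final_p:.3f} ({side}-peak domain)")
```

Because the migrant input keeps reintroducing both alleles, neither p = 0 nor p = 1 is absorbing in this sketch, which is the sense in which gene flow maintains the local variation that drift and selection act upon.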
Once a deme drifts into the domain of the higher peak, there is a shift in the balance of selection versus the bias of gene flow and the force of genetic drift that favors selection. Recall from our earlier models that what is important is not the magnitude of selection or drift, but rather their ratio (measured by Ns in the models of newly arisen mutations). Similarly, the ability of selection to

[Figure 12.8 plot: allele frequency p (0.0 to 1.0) against generation (0 to 400).]

Figure 12.8 Eight runs of a computer simulation of an idealized randomly mating population of size 100 with the fitness model described in Figure 12.6, all starting with the initial condition of p = 0.3, and part of a subdivided global population with an overall frequency of 0.1 for the A allele with the subdivision corresponding to an island model with Nm = 1.


overcome a gene flow bias depends upon the relative magnitudes of the selective intensities to the gene flow rate. When selection is stronger in keeping a deme close to the higher peak, the same amount of gene flow or drift is less important relative to selection on the higher peak than on the lower. Thus, for the same amount of gene flow and drift, it is generally more likely to shift from a lower peak to a higher peak than the opposite. Although peak shifts are random at the local deme level, the global probabilities of peak shifts over many demes are biased in favor of higher peaks. Therefore, even though “random” genetic drift is the mechanism for exploring the adaptive surface, the demes preferentially end up on the higher peaks (Templeton 1982b; Barton and Rouhani 1993). The random exploratory mechanism of genetic drift allows the demes to explore more than one adaptive solution. The shift in the balance of selection to drift makes it likely that the demes will remain on the better adaptive peaks that they encounter. Even though natural selection is the only evolutionary force necessary for adaptation, Wright argued that adaptive evolution is more effective when natural selection is not in sole control. Indeed, Wright felt that natural selection could prevent adaptive evolution by keeping a population on a local but not globally optimal adaptive peak. Thus, natural selection is necessary to explain adaptation, but it is not sufficient in Wright’s view. There is another shift in the balance of gene flow relative to selection and drift as this evolutionary process in a subdivided population continues. We assumed in the previous simulations that most populations rather rapidly shifted from p = 0.3 toward the lower peak, resulting in a global allele frequency of 0.1 for the A allele. 
However, because gene flow now maintains local genetic diversity and hence local genetic drift, and because the shift in the balance of selection to drift and gene flow creates a bias in favor of the higher peak shown in Figure 12.7, more and more local demes will evolve toward the domain of the higher peak as time progresses. This in turn will increase the global allele frequency of the A allele. As this occurs, gene flow shifts from being an evolutionary force biased against the higher peak to a force that is more neutral, and finally to one that is biased in favor of the higher peak. For example, suppose the global allele frequency has reached a value of 0.3 as more and more demes shifted toward the higher peak. This global allele frequency is still below the 1/3 threshold, but now gene flow is only slightly biased against the higher peak. Figure 12.9 shows the results of computer simulations that are identical to the simulations shown in Figure 12.8 except for a global allele frequency of 0.3 instead of 0.1. In this case, four of the eight subpopulations have gone to the higher peak, and another is well on the way. This is the result of the shift in the balance of gene flow to selection and drift, with gene flow now favoring peak shifts by placing populations into the fitness valley where drift is most effective. Thus, even more demes go toward the higher peak, and the global allele frequency increases even more. Figure 12.10 shows the simulated results obtained when the global allele frequency has reached 0.5. At this point, gene flow has shifted toward a bias favoring the higher peak. It is now virtually inevitable for all the local demes to evolve toward the higher peak (Figure 12.10). Wright argued that gene flow might be an even more powerful force in bringing more and more demes to a superior adaptive peak when there is an interaction between selection and the amount of gene flow.
His argument applies to the case in which the superior adaptive peak results in a higher absolute fitness (recall that the adaptive surface is defined from relative fitnesses, so this is not necessarily the case). For example, in the β-Hb A, S, and C example, the superior adaptive peak is associated with fixation for C (Figure 11.7), and populations on this peak are characterized by 100% of the individuals having resistance to malaria that is superior to the resistance of A/S individuals, who represent only a minority of the individuals in populations on the lower peak. Hence, the absolute average viability of individuals (their ability to survive in a malarial environment) is indeed superior for those populations on the higher peak in this example. Wright felt that such a situation would often be true, so that those demes on the higher peaks would have more offspring than demes


[Figure 12.9 plot: allele frequency p (0.0 to 1.0) against generation (0 to 400).]

Figure 12.9 Eight runs of a computer simulation of an idealized randomly mating population of size 100 with the fitness model described in Figure 12.6, all starting with the initial condition of p = 0.3, and a part of a subdivided global population with an overall frequency of 0.3 for the A allele with the subdivision corresponding to an island model with Nm = 1.

[Figure 12.10 plot: allele frequency p (0.0 to 1.0) against generation (0 to 400).]

Figure 12.10 Eight runs of a computer simulation of an idealized randomly mating population of size 100 with the fitness model described in Figure 12.6, all starting with the initial condition of p = 0.3, and a part of a subdivided global population with an overall frequency of 0.5 for the A allele with the subdivision corresponding to an island model with Nm = 1.
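The changing bias of gene flow across Figures 12.8 to 12.10 can be illustrated without any drift at all. The Python sketch below is an illustration under stated assumptions: selection followed by island-model gene flow at m = 0.01 in an effectively infinite deme, with hypothetical function names and a bisection bracket of 0.2 to 0.6 chosen by hand. It locates the allele frequency that separates the domains of the two peaks for each global allele frequency.

```python
W_AA, W_Aa, W_aa = 1.0, 0.9, 0.95  # fitnesses from Figure 12.6


def step(p, m, p_global):
    """One generation of selection followed by island-model gene flow."""
    q = 1.0 - p
    w_bar = p * p * W_AA + 2.0 * p * q * W_Aa + q * q * W_aa
    p_sel = p * (p * W_AA + q * W_Aa) / w_bar
    return (1.0 - m) * p_sel + m * p_global


def long_run(p0, m, p_global, generations=5000):
    """Deterministic outcome (no drift) after many generations."""
    p = p0
    for _ in range(generations):
        p = step(p, m, p_global)
    return p


def basin_boundary(m, p_global, lo=0.2, hi=0.6, iterations=60):
    """Bisection for the unstable equilibrium separating the two domains.

    Assumes p decreases at lo and increases at hi, which holds for the
    parameter values used here.
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if step(mid, m, p_global) < mid:
            lo = mid  # still pulled toward the lower peak
        else:
            hi = mid  # already pulled toward the higher peak
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for p_global in (0.1, 0.3, 0.5):
        boundary = basin_boundary(0.01, p_global)
        outcome = long_run(0.3, 0.01, p_global)
        print(f"global p = {p_global}: domain boundary = {boundary:.3f}, "
              f"deterministic outcome from p = 0.3 is {outcome:.3f}")
```

Under these assumptions the boundary lies above 1/3 when the global frequency is 0.1 or 0.3 and below 1/3 when it is 0.5, so a deme starting at p = 0.3 needs progressively less help from drift, and eventually none, to reach the domain of the higher peak.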


on lower peaks. As a result, the same migration rate per offspring would result in those demes on the higher peaks producing a greater number of migrants going out to other demes. This increase in the absolute numbers of migrants biases the evolutionary process to favor the spread of those alleles associated with the higher peaks throughout the species. However, this differential migration, although empirically supported (Wade 2013), does not appear to be a limiting step in peak shifts and their spread (Phillips 1993; Wade 2013). Wright (1931, 1932) called the above model of adaptive evolution the shifting balance theory, in which shifting balances between the relative strengths of selection, drift, and gene flow allow local demes in a subdivided population to explore the adaptive surface, then preferentially evolve toward the higher peaks in this surface, and ultimately draw other demes toward the higher peaks via gene flow and selection. From the outset, the shifting balance theory has been controversial (Coyne et al. 2000; Goodnight and Wade 2000; Whitlock and Phillips 2000). Nevertheless, shifting balance was not empirically tested until 60 years after it was proposed (Wade 2016). This long gap between hypothesis and experiment stems in large part from the difficulty and labor involved in testing the shifting balance theory. Such tests require multiple coexisting demes, an overall large population size, and the manipulation and/or monitoring of many evolutionary parameters such as gene flow, drift, and selection. Wade and Goodnight (1991) provided these first empirical tests using laboratory populations of the flour beetle Tribolium castaneum and demonstrated that shifting balance can work under the appropriate circumstances, and there have been other experiments showing shifting balance (discussed in Wade 2016). However, this does not resolve the question of how often the appropriate circumstances occur.
We will therefore examine some of the critical requirements of shifting balance to gauge how likely this mode of adaptive evolution may be.

Population Subdivision

Population subdivision with restricted gene flow is a critical requirement for shifting balance according to Wright. Many species do have subdivided populations. Even humans have a global Nm close to one (Chapter 6). Our own species is instructive about the potential for shifting balance. Although our global Nm is close to one, there is much spatial heterogeneity over our geographical distribution for the amount of local population subdivision. This is often true of other species. For example, the eastern collared lizard (Crotaphytus collaris collaris) showed extreme population subdivision with little to no gene flow among isolated demes in the northeastern Ozarks prior to prescribed woodland burning (Figure 6.21), isolation by distance in the southwestern Ozarks and in Texas (Hutchison and Templeton 1999; Figure 6.21), and isolation-by-distance and resistance upon a single mountain after prescribed woodland burning (Figures 6.20 and 6.22). For many species, the necessary degree of subdivision may exist only in a part of the species’ range. Therefore, the entire species does not have to be highly subdivided for shifting balance to occur. Barton and Rouhani (1993) point out that if Nm varies sufficiently from place to place, then a superior fitness peak can be established in the geographical regions where conditions are appropriate for shifting balance and can then spread through the rest of the range by the synergistic effects of selection and gene flow. Besides spatial heterogeneity in the degree of population subdivision, there can also be temporal heterogeneity in subdivision. For example, collared lizards in the northeastern Ozarks prior to prescribed woodland burning were highly fragmented with virtually no gene flow (Chapter 6, Figure 6.21), and therefore, little chance for shifting balance due to a lack of genetic variation in local demes.
However, this lack of gene flow itself was only recently created by the suppression of forest fires in much of the Ozarks. When fire regimes were re-established, a population structure


characterized by restricted gene flow in the context of a metapopulation rapidly emerged (Chapter 6, Figures 6.8, 6.9, 6.20, 6.22). Jangjoo et al. (2020) documented extreme shifts in the population genetic structure among populations of the alpine butterfly, Parnassius smintheus. The population structure in these butterflies varied rapidly over time from highly fragmented local demes with small variance effective sizes that lost spatial genetic structure to a network of local populations showing spatial structure due to isolation by distance and resistance to forest cover. Consequently, there may be only occasional times in a species’ history when population subdivision is appropriate for shifting balance, but when those episodes occur, a species could experience adaptive breakthroughs associated with going to a superior peak that will profoundly influence its future evolutionary fate. Recall from Chapter 6 that after prescribed burning the collared lizards established a type of population structure known as a metapopulation in which the population is subdivided into many local demes that are subject to local extinction and recolonization. Theoretical work (Slatkin 1981; Peck et al. 1998, 2000) has also indicated that the gene flow/population subdivision conditions that are required for shifting balance may not be as restrictive as originally envisioned by Wright (1931, 1932) when recolonization in metapopulations is associated with repeated founder events. Metapopulations can also help spread a superior adaptation throughout the local demes, particularly if most recolonization occurs from just one or a few nearby demes and if the less fit demes are the ones more likely to go extinct and the more fit demes contribute disproportionately to the pool of colonists (McCauley 1993; Wade and Goodnight 1998).

Genetic Architecture

Shifting balance also requires multiple local equilibria; that is, multiple adaptive peaks that are separated by fitness valleys in Wright’s metaphor. The shape of the adaptive surface depends in part upon the underlying genetic architecture of fitness. Recall from Chapter 10 that the genetic architecture refers to the number of loci and their linkage relationships, the numbers of alleles per locus that contribute to a trait, and the mapping of genotype onto phenotype (dominance, recessiveness, pleiotropy, epistasis, etc.). The multiple-peak adaptive surface required for shifting balance arises in part from a genetic architecture characterized by strong interactions between genes (either between alleles at the same locus or between alleles at different loci, that is, epistasis) and/or pleiotropy. For example, the two-peak surface shown in Figure 11.7 for the β-Hb A, S, and C example in a malarial environment arises in part from the pleiotropic effects associated with these alleles on the traits of malarial resistance and hemolytic anemia and in part from the interactions between alleles (S is dominant to A for the trait of malarial resistance and recessive for the trait of anemia; C is recessive to A for the trait of malarial resistance and codominant with S for the trait of anemia). One implication of a multiple-peak adaptive surface is that the fitness effects of an allele are highly context dependent. For example, in Figure 11.7 the S allele is a beneficial allele in the context of a randomly mating population close to fixation for the A allele but is a deleterious allele in the context of a randomly mating population close to fixation for the C allele. Fisher countered Wright’s view of rugged adaptive landscapes with his own visual metaphor: the adaptive hypersphere.
In Fisher’s metaphor, the degree of adaptation of a population is represented by its closeness to a fixed point in a multidimensional space that corresponds to the single optimal adaptive type, in contrast to the multipeak adaptive landscapes of Wright. A second point in Fisher’s hyperspace represents the population’s current average adaptive phenotype. In contrast to the rugged adaptive surfaces of Wright, Fisher envisioned a smooth, continuous relationship between the optimal point and other points in this adaptive space such that the degree of adaptation

[Figure 12.11 diagram, with points labeled “Position of Current Population” and “Optimal Population”.]

Figure 12.11 Fisher’s adaptive target. The point in the center of the circle corresponds to a population that is adapted to the environment in the optimal fashion. The point on the circle corresponds to the level of adaptation of the current population, with increasing distance from the optimal point corresponding to decreased levels of adaptation. The circle encloses the area within which a population would be better adapted than the current population. Fixation for a mutation is indicated by an arrow starting at the current population and ending at a new point in space that is random in direction and magnitude from the current population. On the right is an expanded section of the circle in the vicinity of the current population to illustrate the increased chances of mutations with small phenotypic effect being selectively advantageous.

of any population was a simple decreasing function of its distance from the optimal point. Hence, all the points closer to the optimal type than the current population are more fit than the current population. The points corresponding to better adapted populations relative to the current population are, therefore, found within a hypersphere whose center is at the optimum point and whose radius is the distance between the optimum and the current population. When dealing with two dimensions, this hypersphere becomes a simple circle or “target” (Figure 12.11). Fisher felt that natural selection would ensure that the population was near the optimal point. The only reason why populations would not be exactly at the optimal point is that the appropriate mutations had not yet occurred. Thus, the random mechanism responsible for exploring Fisher’s adaptive space is mutation, not genetic drift. Fisher represented mutation as a vector coming from the current population that is random in both direction and magnitude. Fixation for this mutation would move the population to the point in the adaptive space indicated by the end of the mutational arrow. Fisher then calculated the probability of such a random vector having its end point land within the adaptive target, that is, closer to the optimum than the starting point. Fisher showed that this probability is zero whenever the magnitude of the vector exceeds the diameter of the hypersphere (see Figure 12.11). As the size of the fitness effect associated with a mutation declines, the probability of a mutation of random direction resulting in a favorable change increases and ultimately reaches a limit of one-half for mutations of very small effect (Figure 12.11). Fisher, therefore, concluded that mutations of very small effect would provide the primary raw material for adaptive change since they are more likely to be advantageous. Fisher assigned each mutation a single vector to represent its fitness effects. 
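Fisher's geometric argument can be sketched numerically. In the Python sketch below, the Monte Carlo routine follows the description above (a vector of fixed magnitude r and random direction, starting from a population on a hypersphere of diameter d), while the two-dimensional closed form arccos(r/d)/π is an elementary-geometry result supplied here for comparison and is not taken from the text. All function names are hypothetical.

```python
import math
import random


def exact_2d(r, d):
    """Two-dimensional probability that a step of length r from a point on
    a circle of diameter d lands strictly inside the circle."""
    if r >= d:
        return 0.0
    return math.acos(r / d) / math.pi


def monte_carlo(r, d, n_dims=2, samples=20000, seed=1):
    """Estimate the same probability in n_dims dimensions by simulation."""
    rng = random.Random(seed)
    radius = d / 2.0
    hits = 0
    for _ in range(samples):
        # A random direction: normalize a vector of independent Gaussians.
        direction = [rng.gauss(0.0, 1.0) for _ in range(n_dims)]
        norm = math.sqrt(sum(x * x for x in direction))
        # Current population at (radius, 0, ..., 0); optimum at the origin.
        new_point = [r * x / norm for x in direction]
        new_point[0] += radius
        if math.sqrt(sum(x * x for x in new_point)) < radius:
            hits += 1
    return hits / samples


if __name__ == "__main__":
    d = 1.0
    for r in (0.001, 0.25, 0.5, 1.0, 1.5):
        print(f"r/d = {r:5.3f}  exact 2-D = {exact_2d(r, d):.3f}  "
              f"simulated 2-D = {monte_carlo(r, d):.3f}  "
              f"simulated 10-D = {monte_carlo(r, d, n_dims=10):.3f}")
```

The estimates should approach one-half as r shrinks and fall to zero once r exceeds d; raising the number of dimensions in the sketch lowers the chance that a mutation of a given size is favorable.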
The phenotypic effect is determined at the moment of mutation and remains constant throughout evolutionary history. Crow (1957) provided a rationale for Fisher’s assumption by arguing that


the genetic background is constantly changing in large, random-mating populations such that those mutations that have a consistent advantageous phenotypic effect regardless of genetic background will be the ones most likely utilized by natural selection to build adaptive traits. Therefore, in the Fisherian model, adaptation occurs by the accumulation of many small mutational steps toward an optimum, with the underlying genetic architecture of adaptive traits being due to a large number of loci with each locus having functional alleles with small, additive phenotypic effects that are insensitive to the genetic background. Matuszewski et al. (2014) pointed out a violation of Fisher’s prediction that mutations of small effect are the primary raw material of adaptive evolution. They considered a geometric model like Fisher’s but with one exception: the adaptive target is moving through time in response to environmental change. In contrast to Fisher’s predictions, larger adaptive steps often occur with a moving optimum. Mutations of small effect are not always the main material of adaptive change even when there is a single adaptive optimum, albeit a moving one. Wright agreed with Fisher that adaptive traits generally have a multi-locus genetic architecture. Unlike Fisher, Wright did not model the phenotypic effects as intrinsic, constant attributes of a specific mutation; rather, Wright regarded the phenotypic effects of a mutation as highly context dependent because of inter-allelic interactions, epistasis, and pleiotropy. Indeed, such interactions ensure that only rarely could we regard an allele as being intrinsically of major or minor effect (the critical distinction in Fisher’s metaphor). As shown in Chapter 10, the same allele can be of either major or minor effect because of contextual interactions (e.g., the ApoE/LDLR example in Chapter 10) and because phenotypic effects are measured as deviations from the mean.
In contrast to the fixed-length vectors in Fisher’s metaphor (Figure 12.11), what is regarded as a major or minor allele can change dramatically as the genetic background and allele frequencies are altered (Chapter 10). Pleiotropy and interactions, particularly epistasis, make multiple peaks far more probable and peak shifts adaptively important, as supported by theory (Goodnight 1995; Jain et al. 2011; Kaznatcheev 2019), simulations (Bergman et al. 1995; Covert et al. 2013), experiments (Kvitek and Sherlock 2011; Draghi and Plotkin 2013; Puchta et al. 2016), and natural examples (Rogers et al. 2012). Consequently, one critical difference between Fisher and Wright is their opposing views of genetic architecture and particularly the role of epistasis and genetic background. Until recently, we had little insight into the genetic architecture of most quantitative traits. Recall from Chapter 9 that the classical, unmeasured genotype analysis of quantitative genetic traits provides little information about genetic architecture and the role of epistasis and other interaction effects. Indeed, the Fisherian unmeasured genotype analysis is biased against the detection of epistasis (Chapter 10). It is also important to keep in mind that Mendelian epistasis is what creates rugged adaptive landscapes, and the Fisherian epistatic variance is irrelevant to this problem. Because of the measured genotype approach to quantitative genetics, we now can detect epistasis when the effort is made, and these newer studies indicate that epistasis is a common component of genetic architecture (Chapter 10). Measured genotype approaches are finding extensive epistasis even in the genetic architecture of “simple Mendelian traits” that had traditionally been regarded as examples of single-locus genetic architectures. By definition, a single-locus genetic trait is free of epistasis. But are such “simple” traits truly single locus traits, and if not, is there any role for epistasis? 
Consider sickle-cell anemia—the workhorse example of a “simple” Mendelian trait found in most genetic textbooks. Up to now, sickle-cell anemia has been treated as a single-locus trait in this book. Indeed, sickle-cell anemia is commonly presented as a single nucleotide trait due to one A to T nucleotide change in the second position of the sixth codon of the β-chain of hemoglobin (the S allele) that is said to “cause” sickle-cell anemia when homozygous. However, when individuals


who are homozygous for the S allele are examined, tremendous heterogeneity in clinical severity is revealed (Odenheimer et al. 1983; Sing et al. 1985; Williams and Thein 2018). Epistasis is a major determinant of this heterogeneity. For example, the S allele is found in high frequency in certain Greek populations, but clinical manifestations are mild. Studies (e.g. Williams and Thein 2018) have shown that persistence of fetal hemoglobin into the adult can ameliorate the severity of sickle-cell anemia. Fetal hemoglobin is coded for by the tandemly duplicated γ loci (called A and G), which are closely linked to the β locus (Figure 2.7). Normally, the γ genes are turned off after birth and the β gene is activated, leading to the transition from fetal to adult hemoglobin. Several mutations in or near the γ loci can cause persistence of fetal hemoglobin and are found in Greek populations (Berry et al. 1992; Patrinos et al. 1996). Sickle cell is also common in certain populations in Saudi Arabia that live in historic malarial regions (el-Hazmi and Warsy 1996). An XmnI polymorphic site 5′ to the Gγ locus (Figure 2.7) and a HindIII polymorphic site in the Gγ locus are associated with persistence of fetal hemoglobin and are in disequilibrium with the sickle-cell allele in some Arabian populations. This same haplotype is also found in populations from India, who likewise have mild clinical symptoms with homozygous SS and have persistence of fetal hemoglobin (Ramana et al. 2000). Note that having persistence of fetal hemoglobin at the Gγ locus increases the fitness of homozygotes for the S allele at the β-Hb locus because that genetic combination results in sickle-cell homozygous individuals that display mild clinical symptoms (el-Hazmi et al. 1992; Ramana et al. 2000).
However, persistence of fetal hemoglobin by itself is expected to decrease the ability of the blood to deliver oxygen to the peripheral tissues because fetal hemoglobin has a higher binding affinity for oxygen (this allows the developing fetus to take oxygen from the mother’s blood across the placenta). The disequilibrium found in both Greek and Arab populations, therefore, favors the high fitness combination at the expense of the low fitness combinations. This nonrandom association between alleles at different loci is, therefore, presumably due to natural selection operating upon a multi-locus, epistatic genetic architecture. Hence, the sickle-cell phenotype is not really a single locus phenotype, but rather in some human populations it is a coadapted gene complex in which the frequencies of alleles at different loci are mutually adjusted with respect to one another by natural selection favoring epistatic combinations with high fitness. In the particular case of the Gγ and β-Hb loci in Greek and Arab populations, the physical closeness of these two loci in the genome (see Figure 2.7) means that the linkage disequilibrium favored by natural selection is not readily dissipated by recombination. The resulting genetic stability of this combination of closely linked alleles means that the combination itself approximates an “allele” in its inheritance pattern. Closely linked loci with coadapted combinations of alleles in linkage disequilibrium are called supergenes (Kelly 2000). Accordingly, the DNA region around the β-Hb locus (Figure 2.7) is really a supergene in the selective context of a malarial environment. The behavior of this region as a supergene also means that the adaptive topographies shown in Figures 11.7, 11.9, 12.1 and 12.5 are actually only three-dimensional projections of a higher dimensional adaptive landscape.
For example, the frequency of the S allele as shown in these previous adaptive landscapes is actually the frequency of the superallele S that is not linked to the Gγ allele that causes persistence of fetal hemoglobin. To accommodate the Greek and Arab populations into the adaptive surface, the adaptive surface would now need a fourth dimension that measures the frequency of the superallele that has S and the Gγ allele that causes persistence of fetal hemoglobin on the same chromosome. In other words, at the Gγ/β-Hb supergene level there are four alleles: A, C, S-no persistence of fetal hemoglobin, and S-persistence of fetal hemoglobin. By ignoring the epistasis between the Gγ and β-Hb loci and focusing only on the β-Hb locus (the norm in most textbooks), we would mistakenly conclude that Greek, Arab, and Bantu populations living in malarial environments had adapted to malaria


by being on the same A/S polymorphic adaptive peak. However, when we consider the four-dimensional adaptive surface associated with the Gγ/β-Hb supergene (which unfortunately cannot be adequately depicted in a two-dimensional figure), we find that many of the Greek and Arab populations that are polymorphic for sickle-cell anemia are actually in a different portion of the adaptive surface than the Bantu populations that are polymorphic for sickle-cell anemia. Thus, just considering this one small DNA region alone, different human populations have adapted to malaria by evolving toward three different parts of the fitness surface (increasing the frequency of C as in Upper Volta, the A/S-no persistence of fetal hemoglobin polymorphism as in many Bantu populations, and the A/S-persistence of fetal hemoglobin polymorphism as in some Greek and Arab populations). Obviously, there is no single optimal adaptive target for selection to hit, and different human populations have evolved different adaptive strategies in response to a malarial environment with a genetic architecture profoundly influenced by epistasis. Even our four-dimensional fitness surface with its added adaptive option is an oversimplification. Two other loci unlinked to S are known to affect fetal hemoglobin expression in sickle-cell patients (Galarneau et al. 2010), and there is evidence for an X-linked locus as a contributor to the persistence of fetal hemoglobin in Arabian populations (el-Hazmi et al. 1994a), creating an even more complicated pattern of epistasis involving the HbF/HbS interaction. In addition, as pointed out in Chapter 11, several other loci are involved in malarial adaptation, and these and other loci show epistasis with one another and with sickle cell. For example, the protein haptoglobin forms complexes with hemoglobin and is thereby the major determinant of hemoglobin excretion (Giblett 1969), a symptom of hemolytic anemia.
Given this direct physiological interaction, it is not surprising that genetic variants of haptoglobin also display epistasis with the S allele (Giblett 1969). Epistatic interactions for clinical severity of anemia have also been reported between the S allele and α-thalassemia (caused by genetic variation at the α-Hb locus) (el-Hazmi et al. 1994b; Raffield et al. 2018), although S and α-thalassemia show negative epistasis for malarial resistance (i.e. both together have less resistance than either one alone, Dong et al. 2008; Penman et al. 2011). There is also epistasis for clinical severity of sickle-cell anemia with G6PD deficiency (Chapter 11) (el-Hazmi et al. 1994b; Raffield et al. 2018), and the ApoL1 kidney-disease risk variants (Chapter 10) (Freedman and Skorecki 2014; Kruzel-Davila et al. 2017). G6PD deficiency in turn displays epistasis with thalassemia (caused by mutations at the α and β globin loci) (Siniscalco et al. 1966). The clinical severity of thalassemia is also reduced by persistence of fetal hemoglobin (Xu et al. 2011), which as mentioned above strongly interacts with sickle cells. A GWAS using the S allele as a focal candidate locus identified 390 significant interaction peaks in the genome for clinical status, and a transcriptome analysis revealed 6 significant expression SNPs (eSNP) for interaction effects on clinical status (Quinlan et al. 2014). G6PD deficiency achieves malarial resistance by reducing the ability of the malarial parasite to use the oxidative shunt pathway in the parasitized red blood cell (Giblett 1969), a different molecular mechanism from that associated with the S allele. Accordingly, G6PD deficiency has a different set of clinical effects than the S allele, even though epistasis partially interweaves the two systems. Some bearers of G6PD deficiency are extremely sensitive to environmental oxidizing agents such as fava beans (Chapter 11). Sensitivity to fava beans (favism) can cause death through hemolytic crisis.
However, not all carriers of G6PD deficiency are susceptible to favism, and some of this heterogeneity is due to epistasis between the X-linked G6PD locus and the two autosomal globin loci associated with thalassemia (Siniscalco et al. 1966). It should be clear by now that “simple Mendelian” systems are in reality merely low resolution projections of the complex systems involved in an expanding web of pleiotropic effects and epistatic interactions (Templeton 2000). This web of epistasis and pleiotropy creates the rugged adaptive


landscapes that are essential for shifting balance and that often prevent Fisherian selection from finding the globally optimal adaptive peak. Genetic architecture, therefore, does not appear to be a factor that would limit the operation of Wright’s shifting balance process. These results also undermine Fisher’s treatment of mutation as having a fixed and constant phenotypic effect. When it comes to genetic architecture, modern studies have shown that Wright was right and Fisher was off target. It is important to note that shifting balance and Fisherian targets are not mutually exclusive ways of viewing the process of adaptation; both can occur under some circumstances. Wright never regarded shifting balance as a universal model of adaptation; rather, he portrayed it as a special case applicable only under special conditions that may be rare in most species most of the time. However, rarity never means unimportant in evolution. The ability of shifting balance to make adaptive breakthroughs amplifies its macroevolutionary importance even if it is rare within most species.

The Interactions of Natural Selection, Genetic Drift, and Mutation

The balance between genetic drift and mutation was shown in Chapter 5 to set the substitution rate of neutral mutations equal to the rate of fixation due to drift times the rate of input of new mutations, or 1/(2N) × 2Nμ = μ. The expected level of neutral polymorphism was also shown in Chapter 5 to be about 4Nμ = θ, reflecting the ratio of mutation to the strength of drift, or 2μ/[1/(2N)] = 4Nμ. The balance of drift and mutation as measured by θ was shown to influence many properties of the coalescent process under the assumption of neutrality, such as the expected number of mutations that occur between coalescent events and over the entire haplotype tree. Natural selection interacts with drift and mutation to alter all of these balances and expectations. These alterations provide several methods of detecting natural selection. After Kimura proposed the neutral theory (Chapter 5), there were extensive efforts made to test it. However, the basic parameters of the neutral theory (μ, the mutation rate to neutral alleles, and various effective population sizes) are difficult to estimate. As a result, many of the resulting data sets were interpreted as supporting neutrality by Kimura and other “neutralists” but were interpreted as supporting selection by “selectionists.” A test statistic was needed that did not depend upon estimating some or all of these elusive parameters. Maynard Smith (1970) proposed testing the neutral hypothesis through its fundamental prediction that the neutral mutation rate both determines the rate of interspecific divergence and influences the amount of intraspecific polymorphism (Kimura 1968a, b). As we saw in Chapter 5, the rate of interspecific divergence under neutrality is proportional to μ, and the expected heterozygosity is θ/(1 + θ) ≈ θ for small θ.
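The neutral expectations recalled from Chapter 5 can be checked numerically. The following is a minimal sketch, not part of the original analyses: the function name `neutral_expectations` and the parameter values (N = 10,000 and the two mutation rates) are hypothetical choices made only for illustration.

```python
def neutral_expectations(N, mu):
    """Neutral expectations for a diploid autosomal locus.

    N  : inbreeding effective size (hypothetical value chosen by the caller)
    mu : neutral mutation rate per generation (hypothetical value)
    """
    # Fixation probability of a new neutral mutation times mutational input.
    substitution_rate = (1 / (2 * N)) * (2 * N * mu)
    # Expected neutral polymorphism parameter.
    theta = 4 * N * mu
    # Expected heterozygosity, approximately theta when theta is small.
    heterozygosity = theta / (1 + theta)
    return substitution_rate, theta, heterozygosity


# Illustrative values only: the same N with two different mutation rates.
for mu in (1e-8, 1e-6):
    k, theta, H = neutral_expectations(N=10_000, mu=mu)
    print(f"mu={mu:g}  substitution rate={k:g}  theta={theta:g}  "
          f"H={H:.6f}  theta/substitution rate={theta / k:g}")
```

In this sketch the substitution rate tracks μ alone, while the ratio of θ to the substitution rate stays at 4N whatever value of μ is supplied, which is the property that makes a polymorphism-to-divergence comparison attractive.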
The ratio of intraspecific polymorphism to interspecific divergence is proportional to θ/μ = 4N, where N is the inbreeding effective size. Hence, by looking at the relative proportion of intraspecific polymorphism to interspecific divergence, we end up with a statistic that does not depend upon the inestimable neutral mutation rate. Moreover, if all types of mutations and all loci are evolving under the neutral model, this ratio should be 4N for all autosomal mutations and loci. Therefore, one could in theory test neutrality by testing for homogeneity across mutational types and/or loci in the amount of intraspecific polymorphism relative to interspecific divergence, thereby eliminating any need to estimate N or μ. When Maynard Smith published this paper, such a simple homogeneity test was not possible because the primary data on intraspecific polymorphism came from protein electrophoresis and the primary data on interspecific divergence came from


amino acid sequencing. These two techniques are not comparable in how they measure genetic diversity, making the ratio of intraspecific polymorphism to interspecific divergence difficult to estimate. This situation changed with the advent of haplotype trees, which as pointed out in Chapter 5 can include both intraspecific and interspecific branches in a common evolutionary tree. The first statistical implementation of Maynard Smith’s ideas for testing neutrality was through a simple contingency test of homogeneity on a combined inter/intraspecific evolutionary tree of inversion variation in Drosophila (Templeton 1987c). Inversion evolution in Drosophila was one of the few types of data in the pre-molecular era that could be used to combine intraspecific variation and interspecific differences into a single, integrated evolutionary tree (Sturtevant and Dobzhansky 1936). Indeed, the studies of inversion evolution led to the “balanced school” that eventually became the main opposition to the neutral theory (Chapter 5). Evolutionary trees of Drosophila inversions were abundant by the 1980s. Templeton (1987c) divided inversion mutations into two topological categories that reflect their position in the evolutionary tree: polymorphic (those mutations located on branches contained within a species) and fixed (those mutations located on branches that interconnect species). The mutations were also divided into two genomic categories, autosomal versus X-linked, to test a hypothesis that X chromosome inversions would be under strong selection in Hawaiian Drosophila but not in mainland Drosophila. This created a two-by-two contingency table, and the null hypothesis of homogeneity (neutrality) was tested through a Fisher’s exact test (Appendix B).
Consistent with these prior hypotheses, the Templeton contingency test revealed significant accelerated evolution of Hawaiian Drosophila X-linked inversions over autosomes, but accepted neutrality (homogeneity) between X-linked and autosomal chromosomes for the mainland Drosophila repleta group. As molecular genetics advanced, intra/interspecific haplotype trees based on DNA sequences became more common. McDonald and Kreitman (1991) performed the first contingency test of neutrality with DNA sequence data. They used the same “fixed” and “polymorphic” categories as Templeton (1987c), but since they had sequence data on protein-coding genes, the mutational categories of their contingency table were amino acid changing (replacement or nonsynonymous mutations) or not (silent or synonymous mutations, due to the redundancy of the genetic code). Kimura and other neutralists had strongly argued that silent or synonymous mutations were far more likely to be neutral than replacement or nonsynonymous ones, so this categorization should be sensitive to selection operating at the level of the amino acid sequence of the protein. The tree positions should also be sensitive to selection. The fixed differences have obviously gone to fixation within a species and have persisted through time; therefore, they are of proven evolutionary success. The polymorphic class is not yet proven, containing some mutations that will eventually go to fixation and others that will be lost. Hence, if selection biases the probabilities of loss and fixation, as we have seen that it does (Tables 12.2 and 12.3), these two topological categories should be differentially affected. The tree topological categories and the mutational categories together define a two-dimensional contingency table. 
The hypothesis of neutrality implies homogeneity within this contingency table, so now the test of neutrality is a straightforward contingency test of homogeneity (Templeton 1987c), for which there are many well-known statistical options (Appendix B). One can also just look at the number of fixed and polymorphic nucleotide sites without going through the intermediate step of estimating a haplotype tree, and this is commonly done. However, using only sites and not a haplotype tree makes the analysis much more sensitive to the infinite sites model, which as pointed out in Chapter 5 is frequently violated even intraspecifically, but is violated even more commonly for interspecific comparisons. By constructing a haplotype tree, one can obtain an estimate of homoplasy (multiple mutations to the same state at the same site, and therefore, violations of the infinite sites model), and therefore mutational counts based on tree branches


are more robust to deviations from the infinite sites model (Chapter 5) than are counts based on fixed and polymorphic sites. Moreover, with some procedures such as statistical parsimony, it is possible to further correct for the undercounting of mutations that are caused by deviations from the infinite sites model (Templeton 1996). Figure 12.12 shows an example of this contingency test approach with an estimated intraspecific and interspecific haplotype tree for mitochondrial cytochrome oxidase II (COII) DNA sequences from several hominoid primates, including humans, as well as the distribution of silent and

[Figure 12.12 image: haplotype tree with haplotypes labeled Hsa1–6 (Homo sapiens), Ptr1–5 (Pan troglodytes), Ppa1–4 (Pan paniscus), and Ggo1–6 (Gorilla gorilla). Legend: silent versus replacement mutations; tip, intraspecific interior, and fixed branches; scale bar = 1 mutational change.]

Figure 12.12 The intra/interspecific haplotype tree for the mitochondrial cytochrome oxidase II gene in humans (Homo sapiens), two species of chimpanzees (Pan troglodytes and Pan paniscus), and the gorilla (Gorilla gorilla). Thick lines indicate replacement mutations and thin lines silent mutations. Solid lines indicate fixed differences between species, dashed lines represent the intraspecific interior branches, and dotted lines indicate the tip branches. The polymorphic branches refer to the combined class of tip and intraspecific interior, and the interior branches refer to the combined class of fixed and intraspecific interior. Small circles indicate interior nodes in the tree that are not represented by any current haplotype. Haplotypes are named with the abbreviation of the species in which they were found and a number indicating the sample identity. Three samples in humans (1, 3, and 4) all had the same haplotype. Source: Modified from Templeton (1996).


Table 12.4 Two-by-two contingency analysis of the silent/replacement mutations versus polymorphic/fixed tree topological categories for the mitochondrial cytochrome oxidase II gene.

              Silent   Replacement
Polymorphic       42            14
Fixed            113             8

Note: The probability under the null hypothesis of homogeneity is determined by Fisher’s exact test (FET) to be 0.001. Source: Templeton (1996). © 1996, The Genetics Society of America.
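The Fisher's exact test applied to Table 12.4 can be sketched in a few lines. This is a minimal illustration rather than the procedure of Appendix B: the function name `fisher_exact_two_sided` is hypothetical, and the two-sided rule used here (summing every table with the same margins that is no more probable than the observed one) is one common convention that is assumed, not taken from the source. The counts are read from Table 12.4 as polymorphic (42 silent, 14 replacement) and fixed (113 silent, 8 replacement).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the observed
    margins that are no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # The small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Rows: polymorphic, fixed. Columns: silent, replacement (Table 12.4).
p_value = fisher_exact_two_sided(42, 14, 113, 8)
print(f"two-sided FET probability = {p_value:.4f}")
```

Under these assumptions the probability comes out at roughly 0.001, in line with the note to Table 12.4.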

replacement substitutions upon all branches (Templeton 1996). As can be seen, the replacement substitutions are primarily found in the intraspecific portions of the haplotype tree. When the mutational numbers are counted up and placed in a two-by-two contingency table (Table 12.4), this visual impression is confirmed by the Fisher’s exact test (Appendix B) that reveals a highly significant deviation from homogeneity/neutrality. In particular, you can see from either Figure 12.12 or Table 12.4 that the replacement mutations are disproportionately found in the polymorphic part of the haplotype tree, that is, the portion of the tree of unproven evolutionary success. Under the assumption that the silent substitutions are more likely to be neutral, this implies that selection at this gene is negative or conservative by preferentially eliminating amino acid changes. This simple contingency test can be extended to yield even more insight into the nature of selection. For example, the motivation for subdividing the mutations into replacement and silent categories was the strong a priori belief that these two classes of mutations would differ with respect to natural selection. In many cases, we have additional a priori information to make even finer distinctions among mutations. In the case of the cytochrome oxidase II protein, we know that the molecule is split into two halves with drastically different biochemical functions. On the N-terminal side of the central aromatic domain, the polypeptide is hydrophobic and is found in association with the transmembrane portion of the cytochrome oxidase complex. The C-terminal side of the molecule is hydrophilic and protrudes into the cytosol. It contains the CuA site, crucial for the transfer of electrons to O2 and for the cytochrome c binding site. 
Because of the extreme difference in the biochemical role played by these two regions, mutations are categorized as N-terminal mutations or C-terminal mutations, yielding a total of four mutational categories: N-terminal silent, N-terminal replacement, C-terminal silent, and C-terminal replacement. We can also refine our categorization of the topological positions in the haplotype tree. For example, in Chapter 7, we noted how we could distinguish between tip branches (those that lead to a single haplotype) versus interior branches (those that interconnect internal nodes). All fixed branches between species are interiors, but so are some of the polymorphic branches. Hence, we can subdivide the tree positions into three types: fixed, intraspecific interior, and tip. Recall that the motivation for subdividing the tree into the fixed and polymorphic classes was the a priori belief that these categories should be differentially affected by selection when selection was present. Splitting the intraspecific “polymorphic” class into tip and intraspecific interior also makes sense because these two classes should differ in their sensitivity to selection. Interior haplotypes tend to be older than tip haplotypes (Chapter 7) and have given rise to one or more descendant haplotypes. Hence, mutations on interior branches also have more demonstrated evolutionary success on the average than mutations on tip branches. Table 12.5 shows a new 3-by-4 contingency analysis on cytochrome oxidase II that makes use of these more refined mutational and tree-topological positions. The evidence for selection still remains strong, but now we can see that the excess of replacement substitutions is primarily found on the tip branches in the C-terminal portion of the molecule. Hence, the selection for conserving


Table 12.5 Contingency analysis of the cytochrome oxidase II data with four mutational categories and three tree topological categories.

            N-terminal   N-terminal    C-terminal   C-terminal
            Silent       Replacement   Silent       Replacement
Tip              8            2            12            7
Interior        10            3            12            2
Fixed           60            6            53            2

Note: Because some of the cells have few observations, an exact permutational test was used to yield the probability under the null hypothesis of homogeneity/neutrality to be 0.000. Source: Templeton (1996). © 1996, The Genetics Society of America.
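A permutational test of homogeneity for a table with sparse cells, such as Table 12.5, can be approximated by Monte Carlo shuffling. The sketch below is an assumption-laden illustration, not the exact permutational procedure used by Templeton (1996): it uses the Pearson chi-square as the test statistic, a random sample of permutations rather than a full enumeration, and hypothetical function names. The assignment of the three data rows to tip, interior, and fixed is inferred from the totals in Table 12.4 and the surrounding text; the test result itself does not depend on the row labels.

```python
import random


def chi_square_stat(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n
            stat += (obs - exp) ** 2 / exp
    return stat


def permutation_test(table, n_perm=2000, seed=1):
    """Monte Carlo estimate of the probability under homogeneity.

    Row and column margins are held fixed by shuffling the column labels
    of the individual mutations while keeping their row labels in place.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])
    # One (row label, column label) pair per observed mutation.
    rows = [i for i, row in enumerate(table) for k in row for _ in range(k)]
    cols = [j for row in table for j, k in enumerate(row) for _ in range(k)]
    observed = chi_square_stat(table)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(cols)  # break any association between rows and columns
        perm = [[0] * n_cols for _ in range(n_rows)]
        for i, j in zip(rows, cols):
            perm[i][j] += 1
        if chi_square_stat(perm) >= observed - 1e-9:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Rows assumed to be tip, interior, fixed; columns N-terminal silent,
# N-terminal replacement, C-terminal silent, C-terminal replacement.
table_12_5 = [[8, 2, 12, 7], [10, 3, 12, 2], [60, 6, 53, 2]]
stat = chi_square_stat(table_12_5)
p_est = permutation_test(table_12_5)
print(f"chi-square = {stat:.2f}, permutation probability ~ {p_est:.4f}")
```

Because the statistic and the sampling scheme are assumptions, the estimated probability should only be expected to agree with the table note in being very small, not to reproduce it digit for digit.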

amino acid sequence is primarily directed to that portion of the molecule involved in electron transport and cytochrome c binding. Both the fixed and intraspecific interior categories have demonstrated some degree of evolutionary success in terms of temporal persistence and leaving descendants, whereas the tip branches have not. Therefore, pooling tips with intraspecific interiors to form the class “polymorphic” will tend to dilute the signal for selection. If any classes are to be pooled, a stronger signal for selection will be achieved by pooling fixed and intraspecific interior into the class “interior” and contrasting it with tips. Such pooling is particularly important when trying to detect weak selection. For example, although synonymous substitutions are regarded as more likely to be neutral in general compared with replacement substitutions, synonymous mutations can have some functional consequences and hence could be under selection to some extent (Sauna and Kimchi-Sarfaty 2011). For example, if the pools of the tRNAs that share a common amino acid differ in concentration within a cell, a silent substitution could affect the rate of translation, and thereby potentially have a selectable phenotypic effect. Such selection can result in codon bias, the preferential use of some codons over others among a synonymous set. For example, Llopart and Aguade (1999) found such codon bias in the RpII215 gene in several species of Drosophila. Using these data, they could divide the synonymous mutations into two categories: transitions from unpreferred to preferred codons based on the observed bias, and transitions from preferred to unpreferred. To see if natural selection was influencing this codon bias, they then performed contingency tests using first the fixed versus polymorphic categories, and second the tip versus interior (both interspecific and intraspecific) categories (Llopart and Aguade 1999). Their results are summarized in Table 12.6. 
As can be seen, the evidence for selection is only marginally significant when the fixed versus polymorphic

Table 12.6 Contingency analyses of the preferred (u → p) and unpreferred (p → u) synonymous substitutions in the Drosophila RpII215 gene.

                   Fixed   Polymorphic   Interior   Tip
u → p                  8            11         16     3
p → u                  5            27         14    18
FET Probability           0.0497              0.0074

Source: Modified from Llopart and Aguade (2000).


categories are used, but it becomes highly significant when contrasting the interior versus tip categories. Another advantage of using tips versus interiors is that it allows the contingency test approach to be used on data sets that contain only intraspecific observations, which is a common occurrence in population genetics. For example, Markham et al. (1998) sequenced the HIV-1 glycoprotein 120 gene in HIV clones extracted from 15 subjects positive for the AIDS causing virus HIV-1. This gene influences the mode of entry of HIV virions into human cells and is a target of the immune system, so there are many potential selective forces that could operate (Templeton et al. 2004). The classes “polymorphic” versus “fixed” cannot be applied to these data, but tips versus interior can, as shown in Table 12.7. A highly significant contingency chi-square is obtained in this case, resulting in a strong rejection of neutrality. Note that in Table 12.7 there are proportionately more replacement mutations in the interior class compared to the tips. This implies that natural selection is favoring amino acid changes in this protein that is both a target for the host’s immune system and influences the host cell types that the HIV-1 virus can infect. This shows that the contingency test approach can not only detect selection but also discriminate between different types of selection: negative selection for conserving sequences as in Table 12.4, and positive selection for change as in Table 12.7. The simple Templeton contingency test of natural selection has proven to be quite flexible and can be extended in a variety of fashions. For example, Castellano et al. (2016) tested the predicted Hill–Robertson interference (Hill and Robertson 1968) that the efficiency of natural selection is reduced when two or more linked selected sites do not segregate freely. 
They assembled a data set on 6141 autosomal protein-coding loci from Drosophila melanogaster and Drosophila yakuba, which they analyzed with a derivative of the McDonald–Kreitman test. They found that the rate of adaptive evolution due to positive selection was reduced by 27% due to linked selected sites, supporting the Hill–Robertson effect. Castellano et al. (2019) came to similar conclusions in an analysis of humans and nonhuman Homininae. Chen et al. (2019b) created different categories of replacement mutations by using a physicochemical distance between amino-acid pairs and showed with a contingency test approach that in both Drosophila and hominoids negative selection is stronger within replacement substitutions when the physicochemical difference between the amino acids is larger, whereas positive selection in general is not affected by the physicochemical difference. However, the strongest positive selection is associated with large physicochemical differences, indicating selection favoring a large leap in the physicochemical properties of the protein, a result at odds with the Fisherian “target.” Eyre-Walker and Keightley (2009) developed an extension of the McDonald–Kreitman test to correct a bias against detecting positively selected substitutions when deleterious mutations are present, because deleterious mutations would inflate polymorphism relative to fixed differences. Eyre-Walker and Keightley corrected for this bias by estimating the distribution of

Table 12.7 Contingency analyses of the silent vs. replacement mutations in tip vs. interior branches for the haplotype trees of the HIV-1 glycoprotein 120 sequences in 15 HIV-positive subjects.

            Silent    Replacement
Tip         273       640
Interior    110       421

Note: Contingency chi-square = 14.06 with 1 degree of freedom, yielding a probability of 0.0002 under the null hypothesis. Source: Based on data from Markham et al. (1998).
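The contingency test on Table 12.7 can be checked with a few lines of Python. This is a minimal sketch, not the author's procedure: the function names are hypothetical, and the use of Yates' continuity correction is an assumption, made because it gives a value (about 14.07) close to the reported chi-square, whereas the source does not say which variant of the test was used.

```python
import math

def yates_chi_square(a, b, c, d):
    """Chi-square for the 2x2 table [[a, b], [c, d]] with Yates' correction (assumed)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Shrink |ad - bc| by n/2 (Yates' correction), never below zero.
    diff = max(abs(a * d - b * c) - n / 2.0, 0.0)
    return n * diff ** 2 / (row1 * row2 * col1 * col2)

def chi_square_p_1df(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Table 12.7: rows are tip and interior, columns are silent and replacement.
chi2 = yates_chi_square(273, 640, 110, 421)
p_value = chi_square_p_1df(chi2)

# Proportion of replacement mutations in each branch class.
tip_fraction = 640 / (273 + 640)
interior_fraction = 421 / (110 + 421)
print(round(chi2, 2), p_value, tip_fraction, interior_fraction)
```

The two fractions make the direction of the effect explicit: replacement mutations form a larger share of interior mutations than of tip mutations, which is the pattern the text interprets as positive selection.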


fitness effects (DFE, see Chapter 5) of new deleterious mutations from the site frequency spectrum (Chapter 5). An accurate estimate of the DFE generally requires information from multiple loci, so this method works best when data from a large number of loci are available (Cagan et al. 2016). Cagan et al. (2016) used this DFE extension to study natural selection in the great apes using data from 3859 protein-coding genes. They estimated the DFE bias-corrected rate of adaptive substitutions from the McDonald–Kreitman tests across all the loci and plotted the adaptive rate against an estimate of variance effective size (Figure 12.13). They found a highly significant positive correlation between the rate of adaptive substitution and variance effective size, a result consistent with the expectation that the efficacy of natural selection increases with population size, a pattern found in other species as well (Strasburg et al. 2011). Although results such as that shown in Figure 12.13 indicate that the rate of adaptation increases with diminishing genetic drift (measured by increasing variance effective size), this type of result does not imply that genetic drift cannot play a positive role in adaptive evolution. For example, genetic drift plays a critical role in Wright’s shifting balance theory, but only in the context of local demes being part of a larger subdivided population. Recall Eq. (6.47) that shows that population subdivision increases the variance effective size, even to the extent that the variance effective size can exceed the census size. Hence, Figure 12.13 is compatible with shifting balance theory as it gives an estimate of variance effective size for the species, not local demes. When dealing with genetic drift and effective size in adaptive evolution, we always need to take careful consideration of the biological level that we are dealing with (e.g.
a local deme or a total subdivided population) and the measurement of the force of drift that we are using (e.g. inbreeding effective size or variance effective size, at the local or at the global level). Unfortunately, such care is rarely taken in much of the literature, so readers will have to beware. Another way of making use of the fixed versus polymorphic categories was suggested by Hudson et al. (1987). They tested for homogeneity of the balance of fixed interspecific differences to intraspecific polymorphisms across two or more loci. If one rejects homogeneity across loci with this

Figure 12.13 The correlation between the rate of adaptive substitutions (alpha, vertical axis) and the variance effective population size (Ne, horizontal axis) in five species of great apes (PanPan = Pan paniscus, PanTroEl = P. troglodytes ellioti, PanTroVer = P. t. verus, Gorilla = Gorilla gorilla, and PonAb = Pongo abelii); R = 0.90, P = 0.004. Source: Cagan et al. (2016). © 2016, Oxford University Press.

HKA test (named after the authors), then one or more loci in the sample are inferred to be either not neutral or closely linked to a selected locus. The HKA test does not identify the locus under selection, but if applied to a large number of loci, the outliers are typically regarded as the selected loci. Another approach to detecting selection in protein-coding genes is to examine the ratio of nonsynonymous substitutions per nonsynonymous site (dN) to synonymous substitutions per synonymous site (dS) and to test the null hypothesis of homogeneity across branches of the haplotype tree and/or across sites in the gene (Yang et al. 2000). Rejecting this null hypothesis indicates the presence of selection, and the type of selection is generally interpreted by the ratio dN/dS. Under neutrality, we expect dN/dS ≈ 1, with negative selection dN/dS < 1, and with positive selection dN/dS > 1. Note that these rates are measured per synonymous/nonsynonymous site. Often, one just looks at all possible mutations to identify the number of synonymous and nonsynonymous sites, which is equivalent to assuming that all nucleotide mutations are equally likely, as in the Jukes–Cantor model (Chapter 5). However, nonrandom mutagenesis and gene conversion (Chapter 1) at the molecular level can bias this test (Berglund et al. 2009; Hurst 2009). The dN/dS test was initially developed for detecting selection in highly diverged sequences due to substitutions fixed between species, but in principle it can be applied to intraspecific sequences as well. However, this ratio is biased for the small pairwise differences between sequences that are typical of intraspecific samples (dos Reis and Yang 2013). Moreover, this ratio is insensitive to the force of selection and behaves in a manner that makes interpretation difficult for intraspecific samples (Kryazhimskiy and Plotkin 2008; Arenas and Posada 2014).
For example, the expectation dN/dS > 1 is often violated under positive selection within a species (Kryazhimskiy and Plotkin 2008). Hence, this test should be avoided or used with extreme caution for intraspecific population genetic studies. A method for using just tip branches to detect recent natural selection is the singleton density score (SDS) (Field et al. 2016). This test is based on the expectation that positive selection for a new mutant distorts the haplotype tree to produce shorter tip branches that bear the favored recent mutant. Hence, haplotypes carrying the favored mutant tend to carry fewer singleton mutations at nearby nucleotide sites (i.e., mutations observed only once in the sample that tend to be recent mutations). To calculate the SDS, first a test SNP is designated with a derived allele hypothesized to be under positive selection. Second, assuming that the test SNP is biallelic (which most are), the distances along the chromosome from the test SNP to the nearest singleton SNP are determined for the three genotypes defined by the test SNP locus. If the ancestral and derived alleles are known at the test SNP (by outgroup data), the two alleles serve as natural controls for genomic regional variation in mutation and recombination rates. Third, the raw SDS is calculated as the maximum-likelihood estimate of the log ratio of the mean tip-branch lengths for the derived versus the ancestral alleles at the test SNP. Finally, the raw SDSs are normalized (Appendix B) to produce the final SDSs with an overall mean of 0 and variance of 1. An SDS > 0 corresponds to an increased frequency of the derived allele at the test SNP. Any SNP can be the test SNP, so a genome can be scanned for recent positive selection through positive SDS outliers. Figure 12.14 shows the results of an SDS scan of a human population that detected three loci subject to recent positive selection (Field et al. 2016).
The three loci are the lactase locus, a well-known target of recent selection due to the use of dairy products as food, the MHC, another well-known target of selection, and the region of the gene WDFY4 that has no prior information indicating recent selection. Another class of tests of natural selection is based on the balance of mutation and drift under neutrality as measured by the parameter θ = 4Nμ and the site frequency spectrum (Eq. (5.28)). We already saw in Chapter 7 that Tajima’s D statistic (Eq. (7.1)) could be used as a test of natural selection (Tajima 1989b) if one was willing to assume a constant population size throughout a

Figure 12.14 A genome scan of SDS in a sample of 3195 individuals from the United Kingdom, plotted as −log10(p-value) against position along chromosomes 1 to 22. Different shades of gray are used to indicate alternate chromosomes. Three clusters of SDS values were above the genome-wide threshold for significance (indicated by the dashed line). These clusters were located in the regions of the lactase locus (LCT), the major histocompatibility complex (MHC), and the WDFY4 locus. Source: Field et al. (2016). © 2016, American Association for the Advancement of Science.

species’ history. The basic idea of Tajima’s D was to estimate θ directly by the observed number of nucleotide differences between randomly drawn pairs of sequences that makes no assumption about neutrality (k in Eq. (7.1)) and to contrast this direct estimator with some other observable statistic (in Tajima’s case, the observed number of segregating sites) that can provide an estimate of θ under the assumption of neutrality (S/a1 in Eq. (7.1)). Tajima (1989b) showed that the difference between these two estimators does not have a zero expectation if natural selection occurs under demographic stability. For example, suppose a mutation arose that was favored by natural selection and rapidly went to fixation. The fixation of the selected site would also increase the frequencies of variants at other polymorphic sites that were in linkage disequilibrium with the selected site, a phenomenon known as hitchhiking (Maynard-Smith and Haigh 1974). Indeed, in the extreme case of sites showing no recombination with the selected site, fixation of the selected site causes fixation of all other sites as well, resulting in what is called a selective sweep. After the selective sweep, each new mutation creates a new segregating site under the infinite sites model, so S/a1 recovers rapidly from the sweep. However, these new mutations are initially rare and contribute little to average heterozygosity, which is most sensitive to haplotypes with intermediate gamete frequencies. As a result, k will be smaller than S/a1 after a selective sweep, converging only slowly at a rate determined by the mutation rate. Hence, if a selective sweep has occurred in the recent past in the DNA region sequenced, D should be negative (k < S/a1). In contrast, if natural selection favors the maintenance of polymorphisms, then there will be haplotypes at intermediate frequencies beyond neutral expectations. 
Haplotypes with intermediate frequencies contribute much to average heterozygosity but not to the number of segregating sites, so D should be positive (k > S/a1). Thus, significant differences between k and S/a1 not only indicate selection, but the sign of the difference indicates the nature of selection. As noted in Chapter 7, the problem for Tajima was that the alternative estimator under neutrality also depended upon several other assumptions, making interpretation difficult (Tajima 1989a, b). In particular, the assumption of a constant population size throughout a species history was considered problematic. Nevertheless, Tajima’s D is frequently calculated in genome scans with the idea that demographic history should affect all genomic regions in a similar manner. Hence, genomic regions that are outliers for the D statistic would indicate selection (Clemente et al. 2014).
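The contrast between the two estimators can be illustrated with a short Python sketch. It computes only the difference k − S/a1 that determines the sign of D, not the full statistic of Eq. (7.1), which also divides by an estimated standard deviation. The function name and the toy alignments are hypothetical, and a1 is assumed to be the usual harmonic sum 1 + 1/2 + ... + 1/(n − 1).

```python
from itertools import combinations

def tajima_components(seqs):
    """Return (k, S, a1) for a list of equal-length aligned sequences."""
    n = len(seqs)
    length = len(seqs[0])
    # k: average number of nucleotide differences over all pairs of sequences.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(x != y for x, y in zip(s, t)) for s, t in pairs) / len(pairs)
    # S: number of segregating (polymorphic) sites.
    S = sum(len({s[j] for s in seqs}) > 1 for j in range(length))
    # a1: harmonic sum 1 + 1/2 + ... + 1/(n - 1) (assumed definition).
    a1 = sum(1.0 / i for i in range(1, n))
    return k, S, a1

# Toy sample resembling the aftermath of a sweep: every variant is a singleton.
k, S, a1 = tajima_components(["AAAA", "TAAA", "ATAA", "AATA"])
sweep_diff = k - S / a1

# Toy sample with two haplotypes at intermediate frequency.
k, S, a1 = tajima_components(["AAAA", "AAAA", "TTTA", "TTTA"])
balanced_diff = k - S / a1
print(sweep_diff, balanced_diff)
```

Both toy samples have three segregating sites, so S/a1 is the same in each; only k changes. Rare variants give a negative difference and intermediate-frequency haplotypes give a positive one, matching the signs described above.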


However, another assumption of the Tajima D statistic is the infinite sites model of mutation (which underlies Eqs. (5.28) and (7.1)). Recall from Chapter 1 that mutational hotspots that seriously violate the infinite sites model are scattered throughout the genome. Hu et al. (2016) simulated both a finite sites model that allowed homoplasy and a model that allowed variation in the mutation rate across sites. Their simulations showed that both homoplasy and rate variation biased D. Hence, the outliers of D could arise either through mutation, particularly mutational hotspots (Chapter 1), or selection, so ambiguity of interpretation still remains. One way of salvaging this approach is to find other estimators under neutrality that eliminate or reduce the importance of these other assumptions. Fay and Wu (2000) suggested a way of overcoming this confoundment of selection with demography by devising yet another estimator of θ, called θH. Under neutrality and the infinite sites model, another unbiased estimator of θ is given by

\[
\theta_H = \sum_{i=1}^{n-1} \frac{2 V_i i^2}{n(n-1)} \tag{12.18}
\]

where Vi is the number of derived variants found i times in the sample of n sequences. By a derived variant, Fay and Wu are referring to the nucleotide state at the polymorphic sites that represents a mutation after the most recent common ancestral molecule. Outgroup data are used to determine whether a state is derived or ancestral, so Eq. (12.18) requires and makes use of more information than is contained in Eq. (7.1). Note also that those variants that are in high frequency contribute heavily to Eq. (12.18) (the i² term), whereas haplotypes of intermediate frequency contribute heavily to Eq. (7.1). θH gives added weight to derived variants that are in high frequency, but ancestral states of high frequency are given little weight. Under neutral coalescence, ancestral states tend to be more common on the average (Casteloe and Templeton 1994), so this new way of weighting can result in substantial differences with Eq. (7.1) (Ronen et al. 2013). Fay and Wu showed that population growth after a bottleneck does not tend to make derived variants common, so that an excess of derived haplotypes at high frequency is a unique pattern associated with positive directional selection. Fay and Wu (2000) measure this potential excess by the statistic H = k − θH. They found that only a few high-frequency derived variants are needed to detect directional selection with this statistic since not many are expected under neutrality. Simulations indicate that the H statistic outperforms the Tajima D statistic (Ferretti et al. 2010). Data analyses also indicate the superiority of H over D. For example, Rafajlovic et al. (2014) calculated H and D statistics across the human genome using a sliding window of 100 kb and a step size of 10 kb on all windows with five or more SNPs on three human populations of African, European, and Chinese origin.
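As a rough illustration of Eq. (12.18) and the H statistic, the sketch below works from the derived-allele count at each polymorphic site, which is equivalent to summing over the Vi. The function names are hypothetical, the sites are assumed to be biallelic with the derived state known from an outgroup, and k is computed from the same counts rather than from raw sequences.

```python
def theta_h(derived_counts, n):
    """Eq. (12.18): each site with derived count i contributes 2*i**2 / (n*(n-1))."""
    return sum(2.0 * i * i for i in derived_counts if 0 < i < n) / (n * (n - 1))

def pairwise_k(derived_counts, n):
    """Average pairwise differences: each biallelic site contributes 2*i*(n-i) / (n*(n-1))."""
    return sum(2.0 * i * (n - i) for i in derived_counts if 0 < i < n) / (n * (n - 1))

def fay_wu_h(derived_counts, n):
    """H = k - theta_H, following the definition given in the text."""
    return pairwise_k(derived_counts, n) - theta_h(derived_counts, n)

# Toy data for n = 10 sequences: derived alleles near fixation versus rare.
h_high = fay_wu_h([9, 9, 8], 10)
h_low = fay_wu_h([1, 1, 2], 10)
print(h_high, h_low)
```

The two toy samples have identical k, because a derived count of i and of n − i contribute equally to pairwise differences. Only θH distinguishes them, so the excess of high-frequency derived variants shows up as a strongly negative H.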
The H statistic has a mean close to zero for all three populations, indicating that on the average most genomic regions were neutral or near neutral. In contrast, the D statistic is skewed negatively from 0 for the African population and positively from 0 for the European and Chinese samples, implying that most genomic regions are under selection in these populations. However, these patterns with the D statistic probably reflect the different population growth histories of these populations rather than strong selection operating upon most regions of the genome (Rafajlovic et al. 2014). Rafajlovic et al. also found that adjustment of demographic history with a model of two instantaneous changes in population size can improve both H and D. However, Koch and Novembre (2017) found that the site frequency spectrum can have complex responses to more realistic demographic histories, making it difficult to make a general or simple demographic adjustment. Achaz (2009) has produced a general class of θ estimators that encompasses both the D and H tests and others. This generalized framework allows the construction of new tests to overcome


specific violations of the commonly used assumptions. Ferretti et al. (2010) used Achaz’s framework to derive an optimal general test that performed well under their simulations. However, these generalizations and adjustments of D still share the assumption of the infinite sites model. Unfortunately, there has been little work on the sensitivity of these statistics to nonrandom mutation at the molecular level and particularly with the realistic multi-site models of mutagenesis (Chapter 1). The site frequency spectrum (Eq. (5.28)) arises from coalescent theory that incorporates drift and mutation. It is also possible to use other aspects of coalescent theory to detect selection as a deviation from the expectations due to just drift and mutation. Ferretti et al. (2017) showed that D, H, and similar tests of neutrality can be decomposed into two components: the waiting time between coalescent events, and tree topology. Specifically, the H test depends mostly upon tree imbalance and the length of the lower branches (the branches closest to the present). A tree starts when the common ancestral molecule splits into two DNA lineages (Chapter 5). Tree imbalance measures the number of subsequent splits that occur within each of these two original lineages. If both of the original two lineages give rise to the same number of splits, the tree is completely balanced. The most extreme imbalance occurs when one lineage has no further splits at all, so that all the haplotype diversity arises from splits on the other DNA lineage. Ferretti et al. proposed a new test that depends on tree imbalance, the length of the upper branches (those closer to the root), and the length of the lower branches. Their test has the same general form, a difference between two estimators of θ, but it is explicitly derived from these tree topology measures.
They performed simulations that indicate their new statistic has similar power to the H statistic for detecting ongoing directional, positive selection, but has higher power for other types of selection. Wang et al. (2014) showed that a statistic based on the imbalance of accumulated mutation across different DNA lineages within a coalescent process could not only identify genomic regions subject to positive selection but also aid in identifying candidate causal mutations. Yang et al. (2018) proposed a test for positive selection that uses imbalance between the two subtrees created when the common ancestral molecule divides into two lineages. The idea is that variation in population size and other aspects of demographic history should be the same for the two lineages, thereby eliminating the demographic confoundment of the D statistic. However, this assumption that both subtrees experienced the same demographic history is not always true due to the complexities that arise during phylogeographic history. For example, Figure 12.15 shows the mtDNA haplotype tree estimated by Cann et al. (1987). As can be seen, one of the subtrees is limited to Africa, whereas the other subtree is found throughout the world. This probably reflects the out-of-Africa range expansion events shown in Figure 7.14 that influenced one subtree but not the other and not selection. This example warns us that tests based on tree imbalance do not necessarily eliminate all historical confounding factors. Another property of a coalescent tree that can be used to detect selection is the time to the most recent common ancestor (TMRCA, Chapter 5). Short TMRCAs should occur under positive, directional selection, whereas balancing or diversifying selection that maintains polymorphisms should result in long TMRCAs. Indeed, strong balancing and diversifying selection has been shown to result in transpecific polymorphisms in which the polymorphisms have been retained through one or more speciation events. 
Detecting transpecific polymorphisms is one method of inferring balancing or diversifying selection (Gao et al. 2015; Cheng and DeGiorgio 2018; Wang et al. 2019). For example, some of the polymorphic DNA lineages in humans in the MHC region have a TMRCA going back 35 million years (Zhu et al. 1991), indicating that the DNA lineages within humans are much, much older than the human species, or even the genus Homo. However, one does not need to find transpecific polymorphisms to infer selection. For example, Hunter-Zinck and Clark (2015) scanned consecutive 10 kb windows of some human genomic regions to obtain estimates of the TMRCA in each window and then looked for significant outliers with a statistic

Figure 12.15 The mtDNA haplotype tree estimated by Cann et al. (1987), with each tip representing a modern haplotype. The geographical location of those haplotypes (Africa, Asia, Australia, New Guinea, or Europe) is indicated by shape and shading differences, as shown in the legend. The horizontal axes give sequence divergence (%) from the ancestor, from 0 to 0.6. Source: Cann et al. (1987). © 1987 Springer Nature.

Figure 12.16 The TSel score (vertical axis) that measures outliers for TMRCA over consecutive 10 kb windows in the region of the VASH1 gene on human chromosome 14, plotted against position (77.00 to 77.30 Mb). Source: Hunter-Zinck and Clark (2015). © 2015, Oxford University Press.

based on these times that they called TSel. Figure 12.16 shows a plot of TSel across a portion of the genome that had a significantly old TMRCA outlier that mapped to a region previously inferred to be under balancing selection (Leffler et al. 2013). Bitarello et al. (2018) created another model for detecting balancing selection, particularly when the balanced polymorphism is not transpecific. Their model assumes that balancing or diversifying selection occurs within a small genomic region to produce a stable target frequency (like Eq. (11.13)) for a haplotype in that region. They scanned the human autosomal genome using a 3 kb sliding window with 1.5 kb steps, although windows with fewer than 10 SNPs were excluded. Suppose a particular window has n SNPs. Then, they calculated:

\[
\mathrm{NCD}(tf) = \sqrt{\frac{\sum_{i=1}^{n} (p_i - tf)^2}{n}} \tag{12.19}
\]

where pi is the minor allele frequency at SNP i, and tf is the target frequency for the minor haplotype frequency under balancing selection. In each window, they calculated statistic (12.19) for three target frequencies: 0.3, 0.4, and 0.5. They then scanned the genome for significant outliers of Eq. (12.19), doing separate scans for the three target frequencies. They found about 8% of the protein-coding genes in the human autosomal genome have a significant signal of balancing or diversifying selection over the three tf’s, with the strongest signals being found for immune-related genes. As will be discussed in Chapter 14, disease-related genes are likely to be the targets of diversifying selection. The NCD statistic is based on the prediction that the SNP alleles in the same haplotype under balancing selection will tend to have similar allele frequencies. Siewert and Voight (2020) present two statistics that also make use of this prediction. The first uses only intraspecific data and looks for an excess of SNPs near to a presumed core SNP under balancing selection. Because of


hitchhiking, these nearby SNPs should have similar allele frequencies to the core SNP. Unlike NCD, this statistic (called β(1)), which has the form of a difference between two estimators of θ, does not require the pre-specification of a target frequency. Their second statistic, β(2), uses in addition the number of fixed differences from an outgroup species, which are informative for long-term balancing selection. Their simulations indicate that these β statistics are more powerful at detecting balancing selection than NCD and Tajima’s D. Climer et al. (2015) provide the most striking example of multiple adjacent SNPs sharing a common polymorphic allele frequency. Using CCC (Chapters 2 and 10), Climer et al. (2015) constructed haplotypes in 11 human populations at and surrounding the Gephyrin locus that codes for a protein that is crucial in synapse formation and plasticity in the nervous system. They discovered two haplotypes in this region that were polymorphic in all 11 populations (Figure 12.17 shows data from four of the 11). These two haplotypes spanned the entire Gephyrin locus and about 300 kb upstream and downstream and differed by 284 SNPs. These two haplotypes were the only two common haplotypes in this region, with most intermediates completely absent or very rare. Hence, the allele frequencies of all 284 SNPs were almost identical. Such a pair is called a yin–yang haplotype pair, and this pair was the largest by an order of magnitude discovered by that time. Using a chimpanzee outgroup, it was further discovered that both the yin and yang haplotypes were highly and equally divergent from the ancestral state. These are all the signals for strong balancing selection, and positive selection for the derived alleles was also confirmed with Fay and Wu’s H statistic and additional statistics that will be discussed in the next section. 
Another method of detecting recent balancing and positive selection is through identity-by-descent (Chapter 3) across individuals using

Figure 12.17 The genotypes at 1036 SNPs that span the Gephyrin (GPHN) locus and adjacent regions for individuals from four populations: Toscani in Italy (TSI, n = 102), Maasai from Kenya (MKK, n = 143), Luhya from Kenya (LWK, n = 110), and Gujarati Indians from Houston, Texas (GIH, n = 110). SNPs are indicated by vertical lines and individuals by horizontal lines. Dark blue indicates homozygosity for one allele at a SNP, red homozygosity for the alternative allele at a SNP, and light blue heterozygosity at the SNP. Source: Climer et al. (2015). © 2015, Springer Nature.

dense SNP surveys (Albrechtsen et al. 2010), and this yin–yang pair also satisfies that criterion since people from all over the world share these two long, derived haplotypes. Hence, there is little doubt that strong balancing selection is operating at the Gephyrin locus.
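The NCD statistic of Eq. (12.19), discussed earlier in this section, is simple enough to sketch directly. The function below is a hypothetical illustration rather than the code of Bitarello et al.: it assumes the root-mean-square form of the statistic, takes the minor allele frequencies of the SNPs in one window, and mirrors the exclusion of windows with fewer than 10 SNPs through an assumed min_snps parameter.

```python
import math

def ncd(minor_allele_freqs, target_freq, min_snps=10):
    """Root-mean-square distance of SNP frequencies from a target frequency.

    Returns None for windows with fewer than min_snps SNPs (assumed rule,
    following the exclusion described in the text).
    """
    n = len(minor_allele_freqs)
    if n < min_snps:
        return None
    squared = sum((p - target_freq) ** 2 for p in minor_allele_freqs)
    return math.sqrt(squared / n)

# Toy windows: frequencies clustered at the target versus rare variants only.
at_target = ncd([0.5] * 10, 0.5)
rare_only = ncd([0.1] * 10, 0.5)
too_few = ncd([0.5] * 3, 0.5)
print(at_target, rare_only, too_few)

# One scan per target frequency, as in the three separate scans described above.
scores = {tf: ncd([0.42, 0.45, 0.38, 0.40, 0.47, 0.41, 0.39, 0.44, 0.43, 0.40], tf)
          for tf in (0.3, 0.4, 0.5)}
```

Small values indicate that the SNPs in a window sit close to the target frequency, so candidate windows are those in the lower tail of the genome-wide distribution for a given target.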

The Interactions of Natural Selection, Genetic Drift, Mutation, and Recombination

Mutation and recombination both create genetic variation (Chapter 1), and natural selection operates on the genetic variation created by both of these sources. It is therefore expected that interactions will arise between natural selection and recombination. Indeed, two such interaction effects have already been mentioned: Hill–Robertson interference and hitchhiking. Hitchhiking is an important consequence of the interaction between selection and recombination, and indeed Hill–Robertson interference is a type of hitchhiking effect. As shown in Chapter 2, whenever a mutation occurs, it is in linkage disequilibrium with the alleles at polymorphic sites that happened to be on the same physical chromosome on which the mutation occurred. Much of this initial linkage disequilibrium is rapidly dissipated due to recombination (Eq. (2.11)), but nearby sites will retain much disequilibrium for long periods of time, particularly if the mutation occurs in an area of the genome with low or no recombination. If this mutation is selected, these nearby sites will also have their frequencies changed through linkage disequilibrium. We begin our exploration of the evolutionary impact of hitchhiking by returning to the models at the beginning of this chapter of a deleterious mutation that is selected against. As we saw earlier in this chapter and in Chapter 5, even a deleterious mutation may persist in the population for many generations, particularly if it is recessive in its deleterious effects in a random-mating population. However, we do expect selection to eventually eliminate many lineages of a deleterious mutation. By eliminating the DNA lineages bearing the deleterious mutation, natural selection also reduces the number of DNA lineages in the genomic region that retains linkage disequilibrium with the deleterious mutation.
The reduction in DNA lineages results in a corresponding reduction of the coalescent and inbreeding effective sizes in that region. This reduction in effective sizes is not genome wide, but rather depends on the recombinational distance to the deleterious mutation and the time the deleterious mutation persists in the population (e.g., Eq. (2.11)). This reduction in effective sizes in turn means that even neutral variation at sites in disequilibrium with a deleterious mutation will have reduced heterozygosity through Eqs. (5.6) and (5.7). Hudson and Kaplan (1995) showed that the equilibrium heterozygosity at a neutral site with a recombination frequency of r from a deleterious mutation site with selection coefficient s, and dominance measure h (Eq. (12.6)) is reduced from the standard approximate neutral value of 4Nμ where N is the inbreeding effective size and μ is the mutation rate to

\[
H_{eq} \approx 4N\mu \left( 1 - \frac{\mu s h}{2 (sh + r)^2} \right) \tag{12.20}
\]

The portion of Eq. (12.20) in parentheses is less than or equal to one, so selection against a nearby deleterious mutation often reduces the effective sizes and neutral heterozygosity in a genomic region determined by the recombination rate. This local reduction of effective sizes and expected heterozygosity through hitchhiking near a site with deleterious mutations is called background selection.
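To see how quickly the effect of Eq. (12.20) fades with recombinational distance, the sketch below evaluates the term in parentheses for a few values of r. It is only an illustration under stated assumptions: the function names and the parameter values are invented for the example, and the equation is taken in the form 1 − μsh/[2(sh + r)²].

```python
def background_selection_factor(mu, s, h, r):
    """Term in parentheses in Eq. (12.20): 1 - mu*s*h / (2 * (s*h + r)**2)."""
    return 1.0 - (mu * s * h) / (2.0 * (s * h + r) ** 2)

def equilibrium_heterozygosity(N, mu, s, h, r):
    """Approximate neutral heterozygosity, 4*N*mu, scaled by the factor above."""
    return 4.0 * N * mu * background_selection_factor(mu, s, h, r)

# Hypothetical parameter values chosen only for illustration.
mu, s, h = 1e-5, 0.02, 0.5
factors = {r: background_selection_factor(mu, s, h, r) for r in (0.0, 0.01, 0.5)}
for r, f in factors.items():
    print(r, f)
```

With these values the reduction from a single deleterious site is tiny even at r = 0, which is why the effect becomes important mainly when summed over the many deleterious mutations expected in regions of low recombination.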


Although the effect of background selection may first appear to be minor, keep in mind that deleterious mutations are the most common class of mutations (Chapter 5). Hence, the overall effect of background selection can be quite substantial in shaping genome-wide patterns of variation (Charlesworth 2012; Zeng 2013). The impact of background selection should vary considerably across the genome as mutation and recombination rates vary and as only some parts of the genome (and even genes) are under the strong functional constraints that make deleterious mutations more likely (e.g., Figure 5.4 and Table 12.5). Ewing and Jensen (2016) showed that background selection can lead to underestimating effective sizes and overestimating population growth rates. Bank et al. (2014) and Johri et al. (2020) have shown that ignoring background selection and its variable impacts across the genome can bias demographic and adaptive inferences. These observations are particularly disturbing as much demographic inference in the literature has been made from genetic studies on mtDNA and Y-DNA, both of which typically have little or no recombination—a feature that would maximize the potential for background selection. Demographic inferences from mtDNA are particularly likely to be biased because much of the mitochondrial genome is coding (Figure 1.5) and under strong functional constraints (Table 12.5) as well as nonrecombining. Hence, deleterious mutations are expected with high frequency in mtDNA, as illustrated by the analysis of mitochondrial cytochrome oxidase II (COII) DNA discussed earlier in this chapter. More extreme hitchhiking effects can arise when a beneficial mutation occurs (Maynard-Smith and Haigh 1974). The simplest case is a hard selective sweep that arises when a new mutant allele has a positive average excess of fitness for all allele frequencies and is swept to fixation by selection. 
As the selected mutant goes to fixation, so do all other alleles at those sites that retained the disequilibrium generated at the selected mutation’s origin. This will result in fixation of a haplotype containing the selected allele, with the length of the haplotype depending mostly upon the amount of recombination and the time to fixation of the selected locus. Figure 12.18 gives an example of these genomic signatures (Lamason et al. 2005). As modern humans expanded into northern Eurasia (Figure 7.14), there was selection for lighter skin (to be discussed in Chapter 14). The locus SLC24A5 has two alleles associated with an amino acid difference that has a large impact on skin color. The ancestral allele, characterized by the nucleotide G in codon 111, is nearly fixed in sub-Saharan populations and is associated with dark skin color. The derived allele, A, is nearly fixed in European populations. Lamason et al. (2005) scanned the genome in Africans and Europeans, finding the strongest signal of reduced heterozygosity on Chromosome 15, the location of SLC24A5. Figure 12.18 shows the detailed scan for heterozygosity in this region. As can be seen, there is a striking loss of heterozygosity in the European sample surrounding the A allele in SLC24A5 and near fixation for a long haplotype in this region. Hard selective sweeps can be detected by outliers of expected heterozygosity, as shown in Figure 12.18, by statistics that detect fixation or near fixation of long haplotypes (the long-range haplotype score, LRH; the integrated haplotype score, iHS; the extended haplotype homozygosity score, EHH; linkage disequilibrium decay, LDD), shared genomic segment analysis, and many others (Crisci et al. 2013; Ferrer-Admetlla et al. 2014; Garud and Rosenberg 2015; Jacobs et al. 2016; Harris et al. 2018; Refoyo-Martínez et al. 2019; Zheng and Wiehe 2019).
The genomic signature of a hard sweep begins to decay after fixation as new mutations occur in the region, so recent selective sweeps are generally easier to detect than older ones. Hard selective sweeps interact strongly with population structure and restricted gene flow. Restricted gene flow among subpopulations slows down the rate of fixation, which means there is more time for recombination to occur during the transient polymorphism phase (Kim and Maruki 2011). This in turn makes the sweep more difficult to detect by confining its strongest hitchhiking effects to a smaller genomic region.

Figure 12.18 Reduced heterozygosity in the region of the SLC24A5 locus in a human sample of European ancestry (CEU) versus high heterozygosity in a sub-Saharan African sample (YRI). The plot shows heterozygosity (0 to 0.5) against position on Chromosome 15 (46.00 to 46.30 Mb), with the genes SLC24A5, CTXN2, MYEF2, and SLC12A1 marked along the position axis. Source: Modified from Lamason et al. (2005).

Zheng and Wiehe (2019) show that no one test is optimal over all the demographic scenarios that they simulated, particularly those with restricted gene flow. However, in general, they found that haplotype-based methods worked best. As noted earlier, background selection can also create genomic regions of low heterozygosity, but the haplotype-based tests for a selective sweep are less sensitive to background selection and are better indicators of a hard selective sweep (Enard et al. 2014). An explicit correction for background selection nevertheless increases the power for detecting a selective sweep (Huber et al. 2016). Background selection can occur during the transient polymorphism phase of a hard selective sweep if deleterious mutations occur on the DNA lineages bearing the positively selected mutation. Such deleterious mutations can actually increase in frequency due to the sweep, which can lead to an accumulation of deleterious mutations around the alleles under positive selection (Lenz et al. 2016). The accumulation of deleterious mutations in turn can depress the probability of fixation of the beneficial mutation (Pénisson et al. 2017). Alternatively, the beneficial mutation could have occurred on a chromosome that already had linked deleterious mutations nearby, and this initial condition can reduce the probability of fixation of the beneficial mutation or at least prolong its time in the transient polymorphic phase (Assaf et al. 2015). A staggered sweep can occur when recombination rids the beneficial mutation from being associated with a deleterious mutation, resulting in a sudden increase in the rate of the selective sweep (Assaf et al. 2015). A hard selective sweep occurs when a single beneficial mutation is driven to fixation by natural selection, dragging with it a single haplotype background (e.g. Figure 12.17). However, sometimes selection favors the increase of many copies of the beneficial allele upon multiple haplotype

backgrounds. This is called a soft selective sweep. Soft sweeps can occur when the environment changes and previously neutral standing variation becomes adaptive. We saw this with the A and C alleles at the β-Hb locus in Chapter 11. Under a pre-malarial environment, these two alleles are neutral with respect to each other (Table 11.1, Figure 12.5), so the C allele could drift to higher frequencies after it originated by mutation. When the environment changes to a malarial environment, the C allele now becomes beneficial, but it could be a part of the standing variation in the gene pool at the time of this environmental change. Hence, multiple copies on multiple haplotype backgrounds could increase in frequency due to natural selection, resulting in a soft sweep. Soft sweeps can also result from mutational homoplasy. For example, the S allele at the β-Hb locus apparently originated by mutation at least five times, so when malaria became a strong selective agent in humans, five different haplotype backgrounds all bearing the S allele have been increased in frequency by natural selection (Figure 2.7). The multiple haplotype backgrounds dilute the genomic signal of a selective sweep, so soft sweeps are in general more difficult to detect than hard sweeps. Garud and Rosenberg (2015) developed some statistics that detect both hard and soft sweeps. H12 pools the two most frequent haplotypes together and calculates their pooled expected homozygosity plus the homozygosities of all the remaining haplotypes. H2/H1 is the ratio of H2, the sum of the expected homozygosities of all haplotypes except for the most common haplotype to H1, the sum of the homozygosities of all haplotypes, but with no pooling. The ratio H2/H1 is then adjusted with H12 to form a test statistic that discriminates between hard and soft sweeps. Harris et al. (2018) produced analogous statistics when only unphased genotype data are available rather than haplotypes. 
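The haplotype homozygosity statistics just described can be illustrated with a minimal Python sketch. It assumes that the input is simply a list of haplotype labels observed in one genomic window and that sample frequencies stand in for the expected homozygosities; the function name and the toy samples are hypothetical, and the final adjustment of H2/H1 by H12 into a formal test statistic is not shown.

```python
from collections import Counter


def haplotype_homozygosity_stats(haplotypes):
    """Return (H1, H12, H2/H1) for a list of haplotype labels in one window."""
    counts = Counter(haplotypes)
    n = len(haplotypes)
    # Haplotype frequencies sorted from most to least common.
    freqs = sorted((c / n for c in counts.values()), reverse=True)

    # H1: sum of squared haplotype frequencies, with no pooling.
    h1 = sum(p * p for p in freqs)

    # H12: pool the two most common haplotypes before squaring,
    # then add the squared frequencies of all remaining haplotypes.
    pooled = freqs[0] + (freqs[1] if len(freqs) > 1 else 0.0)
    h12 = pooled ** 2 + sum(p * p for p in freqs[2:])

    # H2: the same sum as H1 but omitting the most common haplotype.
    h2 = h1 - freqs[0] ** 2
    return h1, h12, h2 / h1


# Hypothetical samples: one dominant haplotype versus two common haplotypes.
hard_like = ["A"] * 9 + ["B"]
soft_like = ["A"] * 5 + ["B"] * 4 + ["C"]
for label, sample in (("hard-like", hard_like), ("soft-like", soft_like)):
    h1, h12, ratio = haplotype_homozygosity_stats(sample)
    print(label, round(h1, 3), round(h12, 3), round(ratio, 3))
```

In this toy comparison both samples have a high H12, but the sample with two common haplotypes has the larger H2/H1, which is the contrast these statistics are meant to capture.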
Berg and Coop (2015) point out that it is easier to detect and discriminate soft sweeps when samples from multiple populations or time points are available. To further complicate the issue, sometimes a soft sweep will convert to a hard sweep. This occurs when the multiple haplotype backgrounds are more or less equivalent selectively. In that case, the different haplotype lineages are a part of a coalescent process that is subject to genetic drift, including random loss. With each loss of one of these selectively equivalent haplotypes, the sweep progressively “hardens” (Wilson et al. 2014). This hardening of the sweep can be accelerated by bottleneck effects or by one of the haplotypes having superior fitness over the others. An incomplete selective sweep occurs when the population is sampled during a transient polymorphism phase or when there is a balanced polymorphism. With a transient polymorphism, an incomplete sweep can be indicated by haplotype tree imbalance, as the DNA lineage bearing the favored allele should have very low polymorphism at nearby linked loci while the DNA lineage bearing the ancestral allele should have a normal pattern of nearby polymorphism (Vy and Kim 2015). The detection of balanced polymorphisms was discussed earlier in this chapter.

Candidate Loci

Candidate loci have been used throughout the history of population genetics to detect selection, and we have already seen an example of this in Chapter 11. Shortly after the genetic basis of sickle cell was discovered (Neel 1949) and its relation to malarial resistance was known, the β-Hb locus became a candidate locus for natural selection related to malarial environments, leading to many studies that produced data sets like that given in Table 11.1. Our ability to identify candidate loci in many species has increased considerably in the twenty-first century, so this old approach is becoming more and more common. Many examples of this approach will be given in later chapters, so only a few examples will be given now.

In Chapter 10, we discussed the butterfly genus Heliconius, a speciose genus found mostly in Central and South America whose members have brightly colored and distinct wing patterns (e.g. Figure 10.3). As will be discussed in more detail in Chapters 13 and 14, these wing patterns have long been the focus of studies on natural selection. Heliconius butterflies are unpalatable to bird predators, and their bright coloration and patterns serve as a warning signal to predators. Moreover, the different Heliconius species and subspecies inhabiting a common geographic area tend to have nearly identical wing patterns and color, thereby forming a Müllerian mimicry complex in which different species share a common warning signal that facilitates predators in learning to avoid that pattern. The genes controlling wing pattern and color are now known, making them excellent candidate loci for natural selection in this group, particularly selection related to Müllerian mimicry. Moest et al. (2020) studied four major candidate regions that control much of the wing pattern and color variation in nearly 600 individuals from 53 populations in 6 closely related species representing 25 subspecies with distinct wing patterns and colors. They scanned the four candidate genomic regions for selective sweeps using the program SWEEPFINDER2 (DeGiorgio et al. 2016). Using the extensive background information available on this group and their own phylogeographic analyses, Moest et al. estimated that they could detect only those selective sweeps that occurred in the last 800 000 years as the genomic signatures of older sweeps would have been reduced to undetectable levels. They found significant selective sweeps in all four candidate regions in many of the subspecies they screened, confirming the strong selection exerted on these candidate regions and their effect on wing pattern and color. Figure 12.19 shows three examples.
Many of the selective sweeps they found did not have a well-defined peak but rather a broad plateau. Recall from Chapter 10 that the species (and subspecies) can and do hybridize in this complex (Edelman et al. 2019). Therefore, Moest et al. hypothesized that these sweeps were neither hard sweeps from a single mutation nor soft sweeps from multiple copies within the gene pool of a species, but rather were due to introgression, as discussed earlier in this chapter. Hence, Moest et al. called these introgressed selective sweeps, and they performed simulations to indicate that results like that shown in Figure 12.19 do indeed have the expected pattern under adaptive introgression. One advantage of the candidate approach is that there is a direct connection between the selective sweep and the phenotypic changes that are the target of this selection, as shown in Figure 12.19. This knowledge of phenotypic connections also allows detailed hypotheses about the environmental variables

Figure 12.19 Three examples of selective sweeps involving the split forewing band in Heliconius melpomene plesseni in the WntA candidate region, the yellow and white wing colors in H. cydno weymeri in the cortex candidate region, and the red dennis patch in H. m. meriana in the optix candidate region. CLR refers to the composite likelihood ratio, the criterion used to identify selective sweeps in SWEEPFINDER2. Each panel plots CLR (0 to 800) against position (Mb); the figure also carries a 1/alpha scale running from 0 to >0.2. Source: Figure 4, page 12, in Moest et al. (2020).

Figure 12.20 The frequency of the medionigra allele in the Cothill population of Panaxia dominula. The plot shows allele frequency (0.00 to 0.10) against year (1940 to 2000). Source: Foll et al. (2015). © 2014 John Wiley & Sons.

inducing the selection, which is likely to be avian predation in the context of a multi-species community of unpalatable butterflies with warning coloration for this Heliconius example. Candidate locus studies are often amenable to temporal sampling as a tool for detecting selection, particularly in organisms with short generation times and in experimental populations. Indeed, many of the classic examples of natural selection fall into this category, such as the rise and fall of industrial melanism in the moth Biston betularia over centuries in England as air pollution rose and then fell (Cook and Saccheri 2013). More will be said about this example in Chapter 14. Selection is detected at a candidate locus over time by showing that the allele frequency changes are not consistent with a pure genetic drift model. This in turn requires some knowledge of or an estimate of Nev. For example, Figure 12.20 shows the changes in the frequency of the medionigra allele, a codominant allele for wing color variation in the moth Panaxia dominula for a population at Cothill Fen near Oxford, England. This moth has a generation time of one year, so Figure 12.20 shows allele frequency data over 60 generations. O’Hara (2005) estimated the population size to be roughly a few hundred, so Mathieson and McVean (2013) set Nev = 500 and then used a maximum-likelihood framework (Appendix B) they had developed to estimate a significant selection coefficient of s = 0.114 under a codominant model (h = ½) using the fitness parameterization of Eq. (12.5). They also found that a recessive lethal (or infertile) model fit the data slightly better. Regardless, these estimates indicate strong negative selection against the medionigra allele. Foll et al. (2015) developed a Bayesian approach based on Approximate Bayesian Computation (ABC, see Chapter 7), and used this approach to analyze the Panaxia data. 
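The logic of comparing observed allele frequency changes with a pure drift model can be sketched with a small Wright-Fisher simulation in Python. This is only an illustration under stated assumptions: the fitnesses are taken to be 1, 1 − hs, and 1 − s for the three genotypes (which may not match the parameterization of Eq. (12.5)), the starting frequency of 0.1 is a hypothetical value, and neither the maximum-likelihood method of Mathieson and McVean (2013) nor the ABC method of Foll et al. (2015) is reproduced here.

```python
import random


def next_freq_selection(q, s, h):
    """Frequency of the disfavored allele after one round of viability selection.

    Assumed fitnesses (an illustrative parameterization): 1 for the favored
    homozygote, 1 - h*s for the heterozygote, 1 - s for the disfavored homozygote.
    """
    p = 1.0 - q
    w_bar = p * p + 2 * p * q * (1 - h * s) + q * q * (1 - s)
    return (p * q * (1 - h * s) + q * q * (1 - s)) / w_bar


def simulate(q0, n_ev, generations, s=0.0, h=0.5, rng=None):
    """Wright-Fisher trajectory with selection followed by binomial drift."""
    rng = rng or random.Random()
    q = q0
    trajectory = [q]
    for _ in range(generations):
        q_sel = next_freq_selection(q, s, h)
        # Drift: sample 2 * n_ev gametes from the post-selection gene pool.
        copies = sum(1 for _ in range(2 * n_ev) if rng.random() < q_sel)
        q = copies / (2 * n_ev)
        trajectory.append(q)
    return trajectory


if __name__ == "__main__":
    rng = random.Random(2021)
    q0, n_ev, gens, reps = 0.1, 500, 60, 20  # q0 and reps are hypothetical
    for label, s in (("neutral", 0.0), ("s = 0.114, h = 0.5", 0.114)):
        finals = [simulate(q0, n_ev, gens, s=s, rng=rng)[-1] for _ in range(reps)]
        print(label, "mean final frequency:", sum(finals) / reps)
```

An observed trajectory that lies far outside the spread of the neutral replicates, but within the spread of replicates simulated with selection, is the kind of pattern that leads to rejection of pure drift.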
Using Nev = 500, their posterior distribution on both h and s had the highest joint probability near h ≈ 0 and s ≈ 1, the same recessive lethal/infertile model estimated by Mathieson and McVean (2013). As shown in Chapter 6, Nev can also be estimated from temporal data, but this would be best estimated from marker data scattered throughout the genome and not just the candidate gene for selection (Foll et al. 2015). Unfortunately, such data do not exist for this classic study on Panaxia. One advantage of the Bayesian approach is that a prior can be assigned to Nev that reflects the poor knowledge about this parameter (Appendix B). Foll et al. assigned a uniform prior over the interval from 50 to 5000 for Nev and got even stronger support for the recessive lethal/infertile model. As mentioned in Chapter 7, uniform priors of restricted range are generally not a good choice in a Bayesian analysis. A better choice would have been a gamma distribution that concentrated its probability mass between 100 and 1000, which the prior knowledge indicated was the

Table 12.8 Allele frequencies associated with lighter skin color in ancient and modern human populations.

           Sample
Locus      Europe        Africa        Modern Ukraine   Ancient Ukraine
HERC2      0.710 [758]   0.000 [370]   0.651 [86]       0.160 [94]
SLC45A2    0.970 [758]   0.000 [370]   0.927 [82]       0.432 [44]
TYR        0.368 [758]   0.000 [370]   0.367 [98]       0.043 [92]

Note: Numbers in square brackets are the sample size of genes (two times the number of individuals).
Source: Data from Wilde et al. (2014).

most likely range, or even a modified beta distribution over the range of 50–5000 that also concentrated its probability mass between 100 and 1000; see Appendix B for a discussion on how to choose appropriate priors. With the advent of ancient DNA studies, it is now feasible to perform temporal analyses of long-generation species, such as humans, and over longer periods of time (Malaspinas 2016). For example, Wilde et al. (2014) retrieved aDNA samples from 63 individuals in modern-day Ukraine that dated back to 6500–000 years ago. These and modern samples were surveyed at three candidate skin color loci. Light skin color is believed to have been strongly selected for as humans expanded into high absolute latitudes because of the need for UV light to penetrate the skin for the formation of vitamin D (Jablonski et al. 2006). This will be discussed in more detail in Chapter 14, but for now what is important is that all three of these genes have alleles that lighten skin color, and hence are excellent candidates for selection in the low UV environments associated with northern Eurasia. Indeed, we have already seen strong evidence of a selective sweep in northern Eurasia at one of these loci based on contemporary samples only (Figure 12.18). Table 12.8 shows the frequencies of the lighter skin alleles at these three loci for the ancient DNA sample as well as various modern samples. As can be seen, these light skin alleles are absent in sub-Saharan Africa but common in modern Europeans and Ukrainians. They are present in the aDNA samples, but at much lower frequencies than today in Europe and Ukraine. They also surveyed mtDNA in these samples and found very little differentiation between the ancient and modern Ukrainians (fst = 0.005), indicating a high degree of regional population continuity.
To test for selection, they ran computer simulations of drift with various degrees of selection, accommodating uncertainty in both the ancient and modern allele frequencies, population sizes, and ancient sample age. All of their simulations strongly rejected neutrality and indicated strong positive selection for the lighter skin alleles at all three loci.

Quantitative Genetic Approaches to Detecting Selection

Most fitness traits are polygenic (Chapter 11), and alleles with strong marginal effects and those with minor effects can contribute to polygenic adaptation (Jain and Stephan 2017). The methods discussed above on selective sweeps are appropriate for detecting alleles of strong effect, but not necessarily for alleles of weak effect (Jain and Stephan 2017). Do these genes of small effect also leave detectable genomic signatures of selection? The answer turns out to be yes. Recent selection also often involves more subtle changes in allele frequency than long-term selection. Recall that the SDS was designed to detect such recent selection in genome scans
(Figure 12.14). Field et al. (2016) also modified SDS to study polygenic adaptation. One trait they examined was human height, which has been increasing in European populations. Instead of using all SNPs as test SNPs as in Figure 12.14, they used previous GWAS studies on human height to identify 551 test SNPs associated with height. They also modified the sign of the test SNP alleles from positive for derived alleles to positive for test SNP alleles that increased the trait value in the GWAS; in this case, positive SNP alleles were those that tended to increase height. They called such a modified SDS derived from GWAS a trait-SDS (tSDS). Although any one test SNP was unlikely to have a significant tSDS, the mean tSDS for all 551 height-associated test SNPs was highly significantly positive. Hence, overall, the test SNPs associated with increased height also showed increased allele frequencies, indicating positive selection for height. This conclusion was reinforced by another genome scan that calculated tSDSs for all SNPs. They then calculated the rank correlation between the tSDSs and the GWAS scores for the same trait, after adjusting for linkage disequilibrium. They obtained an extremely significant and positive Spearman rank correlation of 0.08 for height. They applied this technique to several other human quantitative traits and got many highly significant correlations, although height gave the strongest signal of polygenic adaptation. Others have also used temporal changes in allele frequency, even minor ones, to detect polygenic adaptation. Buffalo and Coop (2019) noted that directional selection on an allele should cause the change in allele frequency between two time points to tend to be in the same direction as the change occurring in another time interval if the allele is selected or is in strong linkage disequilibrium with a selected allele. This will create a temporal autocovariance in allele frequency change across time intervals.
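The idea of a temporal autocovariance in allele frequency change can be illustrated with a short Python sketch. It computes a plain sample covariance, across loci, between the frequency changes in two time intervals. This is an assumption-laden simplification: it is not the estimator of Buffalo and Coop (2019), it makes no correction for sampling noise, and the frequencies below are hypothetical.

```python
def frequency_changes(freqs_by_time):
    """Per-locus allele frequency changes for each pair of adjacent time points.

    freqs_by_time[t][i] is the frequency of the tracked allele at locus i
    at time point t (earliest time point first).
    """
    return [
        [later - earlier for earlier, later in zip(f0, f1)]
        for f0, f1 in zip(freqs_by_time, freqs_by_time[1:])
    ]


def covariance(x, y):
    """Unbiased sample covariance of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def temporal_autocovariance(freqs_by_time, i, j):
    """Covariance across loci between the changes in interval i and interval j."""
    deltas = frequency_changes(freqs_by_time)
    return covariance(deltas[i], deltas[j])


# Hypothetical frequencies for four loci at three time points; each locus
# keeps moving in the same direction, as expected under directional selection.
freqs = [
    [0.10, 0.50, 0.30, 0.80],
    [0.15, 0.45, 0.32, 0.78],
    [0.21, 0.41, 0.35, 0.75],
]
print(temporal_autocovariance(freqs, 0, 1))
```

Under pure drift the changes in non-overlapping intervals are expected to be uncorrelated, so a consistently positive value across many loci is the signal of interest.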
Buffalo and Coop performed simulations that showed the temporal autocovariance in allele frequency change can detect polygenic directional selection under an additive model and used this model to estimate additive variance for fitness and variance effective population size. Michalak et al. (2019) established several laboratory populations of D. melanogaster from a common founder population and subjected different replicate populations to artificial selection for five different traits (one trait per replicate) and unselected controls, with replicate populations for each selective regime. They detected polygenic selection within a selective regime by temporal autocorrelation of allele frequency changes. The SNP patterns differed substantially across selective regimes, indicating a different set of genes were being selected with only a small amount of overlap. They also found considerable variation even among replicates within a selective regime, indicating multiple adaptive peaks. Chen et al. (2019a) were able to study changes in allele frequency in a population of the Florida Scrub-Jay (Aphelocoma coerulescens) that has been the subject of long-term study. This population has been studied so exhaustively that its pedigree structure is known. The distribution of neutral alleles given this pedigree structure was simulated with the gene-drop algorithm of MacCluer et al. (1986), the same algorithm used to estimate the variance effective size of the pedigreed captive population of Speke’s gazelle described in Chapter 4. These simulations revealed some expected allele frequency changes under neutrality due to different genetic contributions of founders and to migrants. They then scanned the genome for allele frequency shift outliers from the neutral distribution in this population from 1999 to 2013, finding 18 SNPs with significant allele frequency changes.
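Gene dropping through a known pedigree can be sketched in a few lines of Python. The version below is a minimal illustration in the spirit of the approach, not the implementation of MacCluer et al. (1986): each founder is given two unique alleles, every other individual receives one randomly chosen allele from each parent, and the toy pedigree and function names are hypothetical.

```python
import random
from collections import Counter


def gene_drop(pedigree, rng):
    """Drop founder alleles through a pedigree once under neutral Mendelian rules.

    pedigree: list of (individual, sire, dam) with parents listed before
    their offspring; founders have None for both parents.
    Returns a dict mapping each individual to a pair of founder alleles.
    """
    genotypes = {}
    for ind, sire, dam in pedigree:
        if sire is None and dam is None:
            # Each founder carries two unique, labeled alleles.
            genotypes[ind] = ((ind, 0), (ind, 1))
        else:
            # One allele from each parent, each chosen with probability 1/2.
            genotypes[ind] = (rng.choice(genotypes[sire]), rng.choice(genotypes[dam]))
    return genotypes


def founder_allele_frequencies(pedigree, cohort, replicates, seed=0):
    """Average frequency of each founder allele in a cohort over many drops."""
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(replicates):
        genotypes = gene_drop(pedigree, rng)
        for ind in cohort:
            totals.update(genotypes[ind])
    n_genes = 2 * len(cohort) * replicates
    return {allele: count / n_genes for allele, count in totals.items()}


# Hypothetical pedigree: three founders, two offspring, one grandoffspring.
PEDIGREE = [
    ("F1", None, None), ("F2", None, None), ("F3", None, None),
    ("A", "F1", "F2"), ("B", "F1", "F3"), ("C", "A", "B"),
]
print(founder_allele_frequencies(PEDIGREE, cohort=["C"], replicates=1000))
```

Repeating the drop many times gives the neutral distribution of allele frequencies expected from the pedigree alone, against which observed frequency shifts can be compared.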
This indicates some limited natural selection in this population, but genetic drift and gene flow dominated allele frequency shifts during the study’s time period. For many systems, it is not possible to obtain temporal samples, but as shown in Chapters 5 and 7, coalescence can be used to gain insight into the past from a current genetic sample. This is the approach taken by Edge and Coop (2019). Suppose a GWAS on some phenotypic trait that is potentially under selection has identified k alleles with significant additive effects on that trait. The additive effects of each locus can be combined into a polygenic score:

\[ Z(t) = 2 \sum_{i=1}^{k} \beta_i \, p_i(t) \tag{12.21} \]

where Z(t) is the polygenic score at time t, pi(t) is the frequency of allele i at time t, and βi is the additive effect of allele i that has been rescaled such that the other allele (biallelic SNPs are assumed) has a rescaled additive effect of 0. Polygenic scores are commonly used in human genetics, particularly as risk scores for clinical traits (Templeton 2018a). Before proceeding, we note some limitations of these scores. First, note that βi is not a function of time, but rather is treated as a fixed constant estimated from a current GWAS even though allele frequencies are allowed to change with time. Recall that both average effects and average excesses are functions of allele frequencies and are not constant when allele frequencies change (Chapters 8–10). Hence, Eq. (12.21) would only be approximately true if all allele frequencies change little over the time period being studied. Second, Eq. (12.21) assumes additivity throughout all times based on the additivity observed in the current GWAS. As shown in Chapters 8 and 10, epistatic effects are hidden in the current additive effects and can greatly change the apparent additive effects as allele frequencies change. This reinforces the need to limit Eq. (12.21) to situations when allele frequency changes are modest at best. Given this limitation, Eq. (12.21) can be implemented over time if we can estimate allele frequencies at different time periods. However, we are assuming only a sample from the current time period. If a time-calibrated haplotype tree can be estimated containing allele i, then the simplest estimator of the frequency of allele i at time t in the past is the proportion of DNA lineages at time t that contain allele i. As one goes deeper into the past during the coalescent process, the number of total DNA lineages goes down (Chapter 5), so the error in estimating allele frequencies in the past increases with increasing time into the past, which limits how far into the past one can look. 
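The polygenic score and the simplest estimator of a past allele frequency can be written out as a brief Python sketch. The score is twice the sum over loci of the additive effect times the allele frequency, and the past frequency is taken as the proportion of lineages in a time-calibrated tree that carry the allele at the chosen time. The effect sizes, frequencies, and lineage states used here are hypothetical.

```python
def polygenic_score(betas, freqs):
    """Polygenic score: twice the sum of additive effect times allele frequency."""
    return 2.0 * sum(b * p for b, p in zip(betas, freqs))


def lineage_frequency(lineage_alleles, allele):
    """Proportion of the lineages present at some past time that carry `allele`."""
    carriers = sum(1 for a in lineage_alleles if a == allele)
    return carriers / len(lineage_alleles)


# Hypothetical example with two loci.
betas = [0.2, -0.1]            # rescaled additive effects (assumed values)
present_freqs = [0.5, 0.25]    # current allele frequencies (assumed values)
print(polygenic_score(betas, present_freqs))

# Past frequency at locus 1 from four lineages surviving at time t.
print(lineage_frequency(["A", "G", "A", "A"], "A"))
```

Because the number of lineages shrinks further back in time, the second function is averaging over fewer and fewer observations, which is why the estimates become less reliable for older time points.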
Edge and Coop (2019) give two more complicated estimators of past allele frequencies based on coalescent waiting times and the local rates of coalescence that allow past population sizes to change. Given coalescent estimators of allele frequencies at various times in the past (somewhat similar to obtaining estimators of population size in Skyline plots, as discussed in Chapter 7), Eq. (12.21) can be estimated for these times. The normalized change in the polygenic score between two adjacent time points is then calculated as

\[ X_j = \frac{Z(t_j) - Z(t_{j-1})}{\sqrt{2 V_A(t_{j-1})}} \tag{12.22} \]

where tj−1 is the time closer to the present and VA is the additive variance calculated from the βi’s and the pi(t)’s. Under neutrality, the sum of the squares of Eq. (12.22) over all w time points has a chi-square distribution with w degrees of freedom (Appendix 2). Edge and Coop applied this method to the same height data analyzed by Field et al. (2016) as described earlier but without making use of ancient DNA information. Now, there was no significant evidence of selection on height or change in Z over time with all three estimators of past allele frequencies. However, when just the present and most recent time point in the past are used (when past allele frequencies would be estimated most accurately and the restriction of using Z only for modest allele frequency changes would most likely be true), then there was a significant chi-square of 7.0 with 1 degree of freedom (p = 0.0112). GWAS can be applied to any phenotype, and that would include fitness since fitness itself is a phenotype (Chapter 11). Barban et al. (2016) therefore performed a GWAS in the human samples of European ancestry on the fitness traits of age at first birth and lifetime reproductive success (the
total number of children one has), a common measure of fitness (see Chapter 15). They identified 12 independent loci that were associated with these fitness traits, and these loci are likely under selection. Indeed, one of these loci had a significant signal of positive selection as detected by iHS. Chaves et al. (2016) performed GWAS on beak and body size in three species of Darwin’s finches. These phenotypes are known to be strongly selected in this extensively studied group of birds (Grant and Grant 2014). Chaves et al. uncovered 11 SNPs for these fitness-related traits. Six of these SNPs explained over 80% of the variation in beak size and were dispersed over several chromosomes. The well-documented selection on beak size therefore had a polygenic response.
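Returning to the polygenic score test built on Eqs. (12.21) and (12.22), the calculation can be sketched in Python under explicit assumptions. The sketch assumes that the normalization divides the change in score by the square root of twice the additive variance at the more recent time point, and that the additive variance takes the standard form of twice the sum of βi² pi(1 − pi); both are assumptions of this illustration and may differ in detail from Edge and Coop (2019). All numerical inputs are hypothetical, and the p-value helper covers only the one-degree-of-freedom case.

```python
import math


def polygenic_score(betas, freqs):
    """Twice the sum of additive effect times allele frequency."""
    return 2.0 * sum(b * p for b, p in zip(betas, freqs))


def additive_variance(betas, freqs):
    """Assumed additive variance: 2 * sum(beta_i^2 * p_i * (1 - p_i))."""
    return 2.0 * sum(b * b * p * (1.0 - p) for b, p in zip(betas, freqs))


def normalized_changes(betas, freqs_by_time):
    """Normalized score changes between adjacent time points.

    freqs_by_time[0] holds the frequencies closest to the present;
    later entries go progressively further into the past.
    """
    changes = []
    for j in range(1, len(freqs_by_time)):
        recent, older = freqs_by_time[j - 1], freqs_by_time[j]
        delta = polygenic_score(betas, older) - polygenic_score(betas, recent)
        changes.append(delta / math.sqrt(2.0 * additive_variance(betas, recent)))
    return changes


def q_statistic(changes):
    """Sum of squared normalized changes (compared to a chi-square distribution)."""
    return sum(x * x for x in changes)


def chi2_sf_1df(q):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))


# Hypothetical effects and frequencies at the present and one past time point.
betas = [0.2, -0.1, 0.05]
freqs_by_time = [[0.50, 0.25, 0.60], [0.45, 0.30, 0.55]]
q = q_statistic(normalized_changes(betas, freqs_by_time))
print(q, chi2_sf_1df(q))
```

With more than one interval, the summed statistic would be compared with a chi-square distribution whose degrees of freedom equal the number of intervals, which requires a general chi-square distribution function rather than the one-degree-of-freedom shortcut used here.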

The Neutralist/Selectionist Debate

The articulation of the neutral theory began an intense debate within population genetics (Chapter 5). The neutral theory predicted that much molecular evolution was governed by genetic drift and mutation, with selection being mostly confined to eliminating deleterious alleles with an occasional and rare selective sweep. Others reacted strongly against this model of evolution and argued that selection, including balancing selection, was important. We can now see that both schools were right. Background selection, based on neutral and deleterious mutations, is now well documented. The many methods that we now have for detecting selection have revealed that positive selection is also a common force of evolutionary change and for maintenance of polymorphism. For example, 8% of human protein-coding genes have significant signals of balancing or diversifying selection (Bitarello et al. 2018), 19% of all fixed substitutions between Drosophila melanogaster and D. simulans appear to be fixed by natural selection (Lange and Pool 2018), 45% of all amino-acid substitutions may have been fixed by natural selection in Drosophila (Smith and Eyre-Walker 2002), and only 20% of amino-acid substitutions in humans are estimated to be neutral (Fay et al. 2001). Both background selection and positive selection, particularly diversifying selection for local adaptation, shape genomic diversity patterns (Rettelbach et al. 2019). Neutrality versus selection is no longer a central question in population genetics. The more interesting questions are now about how various evolutionary forces act and interact upon this genetic diversity in the gene pool.

13 Units and Targets of Selection

Most genetic models used in this book have been single-locus models. This emphasis upon single-locus models is typical of much of population genetics. One of the main reasons for the dominance of single-locus models is mathematical and computational tractability. Such rationales do not necessarily justify biologically the dominance of single-locus models. The biological adequacy of a single-locus model in describing evolution is a troubling issue because qualitatively new biological features can emerge as soon as we go beyond the single-locus models. For example, we saw in Chapter 2 that the important conclusion of “no evolution” under the assumptions of the single-locus Hardy–Weinberg model is not necessarily true for a two-locus model. This reliance upon single-locus models greatly bothered Ernst Mayr (1959, 1970), one of the major architects of the neo-Darwinian theory of evolution. He called such single-locus population genetic models “beanbag genetics” in which each “bean” (locus) is studied independently and then added together in a beanbag to reconstruct the whole. Mayr pointed out that the individual is the “target of selection” and not a single locus. It is the individual that lives or dies, mates or fails to mate, is fertile or sterile. Any single locus may contribute to the individual’s fitness, but because of epistasis and coadapted complexes, that single-locus contribution must be placed into the context of the genotype as a whole. Mayr was strongly influenced by the earlier work of Chetverikov (1926) who suggested that the individual is not divisible into discrete traits coded for by individual genes, but instead the individual is a product of the genotype as a whole.
Mayr called this idea “the unity of the genotype.” Mayr (1970) pointed out an important complication to the concept of the unity of the genotype as applied to evolution; namely, that the individual’s unified genotype is broken apart in meiosis and fertilization. As a result, although there is the unity of the genotype at the individual level, there is no continuity of an individual’s genotype across generations. Indeed, we now know that in many species each individual genotype is a unique event, never to be replicated in the history of the species (Chapter 1). Evolutionary predictions can only be made when there is genetic continuity over space and time (premise 1 in Chapter 1). An allele at a single locus displays such continuity (Chapter 1) even though the genetic background upon which that allele can be placed is unique for each individual and is constantly changing across generations. Therefore, Mayr realized that despite the individual being the “target of selection,” the individual is not the meaningful genetic unit for measuring the evolutionary response to natural selection. Instead, Mayr argued that fitness must be used in a “statistical” sense at the level of a reproducing population in the context of its gene pool. This same idea was expressed somewhat differently in Chapter 11. In that chapter, the fundamental equation of natural selection for a measured genotype and the fundamental theorem of natural
selection for unmeasured genotypes both reveal that the only fitness effects that influence the response to natural selection are those transmissible through a gamete. We showed repeatedly in Chapters 11 and 12 that understanding the evolutionary response to natural selection requires taking the gamete’s perspective, not the individual’s. This is also why we measured fitness in Mayr’s “statistical” sense as a genotypic value, the mean phenotype of a genetically defined group of individuals that automatically averages over all genetic backgrounds that are not used to define the group. The gametic perspective for the response to natural selection potentially undermines Mayr’s use of the “unity of the genotype” as an argument against “beanbag” genetics. Just as meiosis and fertilization thoroughly break down and rearrange the individual’s total genotype, meiosis and fertilization also ensure that the meaningful genetic unit for predicting the response to selection is something far less than an individual’s genotype at all loci in the genome. Could this genetic unit be as small as a single locus? The pervasive occurrence of epistasis noted by Mayr and pointed out in Chapter 10 does not necessarily imply that a single locus is not a meaningful genetic unit for the response to natural selection. Recall from Chapter 10 the example of epistasis between the ApoE and LDLR loci for the phenotype of total serum cholesterol. Despite strong epistasis in this case, it made little difference whether or not this was treated as a two-locus system or a one-locus system in the context of a particular gene pool (Figure 10.19). As pointed out in Chapter 10, epistasis contributes to the additive (i.e. “beanbag”) variance; and indeed in many possible gene pools virtually all of the epistasis appears as an additive effect associated with just one locus (Figure 10.19). 
In the case given in Figure 10.19a, selection on the phenotype of total serum cholesterol would induce a response at the ApoE locus and virtually none at LDLR. In that case, the selective response could be predicted quite well just using the ApoE locus alone in the context of the gene pool with allele frequencies close to those observed in the study populations of Pedersen and Berg (1989) (Figure 10.19a). Mayr correctly argued that the genetic response to selection must be placed in the context of the gene pool, but once placed in such a gene pool context, it is possible that virtually all of the epistasis found in a multigenic complex may be statistically allocated to the average excesses or effects of single loci. Thus, the “unity of the genotype” does not necessarily undermine the single-locus approach of beanbag genetics. Mayr’s discussion of these issues brings into focus two separate but often confused issues in population genetics. The first is the unit of selection, the level of genetic organization that allows the prediction of the genetic response to selection. As shown in Chapter 11, fitnesses in population genetics are assigned to a genotypic class of individuals rather than to individuals themselves (the “statistical” fitness of Mayr). What was not addressed in Chapter 11 was the level of genetic organization that defines these genotypic classes; the genotypic classes can be single-locus genotypes, or two-locus genotypes, etc. The unit of selection is the level of genetic organization to which a fitness phenotype can be assigned that allows the response to selection to be accurately predicted. This means that the unit of selection must have genetic continuity across the generations. For example, if the unit of selection is a multi-locus unit, then the combinations of alleles across the loci in this complex must recur in evolutionary time and not be unique events. 
The requirement for continuity over the generations limits units of selection to a number of loci that must be orders of magnitude less than the total number of loci in the whole genome, at least in outbreeding populations. The second issue raised by Mayr is the target of selection, the level of biological organization that displays the phenotype under selection. Mayr only discussed one target of selection, the individual. Individuals were the target of selection in the models given in Chapters 11 and 12. However, we shall see in this chapter that fitness phenotypes can be assigned to biological levels both below and above the level of the individual.

Units and targets of selection should never be confused. The unit of selection is always some level of genetic organization that recurs over time and space. A target of selection is some level of biological organization that displays a phenotype that influences the probability of the unit of selection’s recurrence over time and space. Sometimes the unit and target of selection can be the same. For example, a transposable element is a level of genetic organization, but it also displays a phenotype (transposition) that influences its chances for recurrence and is therefore a target of selection. In general, however, units and targets of selection are different entities. In this chapter, we will discuss both units and targets of selection, starting with the problem of the unit of selection.

The Unit of Selection

In Chapter 11, we derived the fundamental equation for natural selection for a measured genotype (Eq. 11.5) using a single-locus genetic architecture. This equation can be generalized to genotypes defined by two or more loci. We consider now a two-locus, two-allele model in a random mating population of the sort used in Chapter 2. As shown in Chapter 10 (e.g. Table 10.6), genotypic values can be assigned to genotypes defined by two loci. In our current model, we will let $w_{im/jn}$ be the genotypic value of the fitness phenotype of individuals (our target of selection) sharing the two-locus genotype im/jn where im denotes the two-locus gamete type received from one parent and jn the gamete type from the other parent. Hence, i and j denote the genotype at the first locus (with alleles A or a), and m and n the genotype at the second locus (with alleles B and b). We also assume that cis and trans double heterozygotes have the same fitness, that is, $w_{AB/ab} = w_{Ab/aB}$. Using these two-locus genotypic fitnesses as weights in the same manner done in deriving Eq. (11.4) but with a two-locus genotype model like that given in Figure 2.4 from Chapter 2, the change in gamete frequencies can be derived (a useful exercise) as follows:

$$
\begin{aligned}
\Delta g_{AB} &= \frac{g_{AB}}{\bar{w}}\,a_{AB} - rD\,\frac{w_{AB/ab}}{\bar{w}} \\
\Delta g_{Ab} &= \frac{g_{Ab}}{\bar{w}}\,a_{Ab} + rD\,\frac{w_{AB/ab}}{\bar{w}} \\
\Delta g_{aB} &= \frac{g_{aB}}{\bar{w}}\,a_{aB} + rD\,\frac{w_{AB/ab}}{\bar{w}} \\
\Delta g_{ab} &= \frac{g_{ab}}{\bar{w}}\,a_{ab} - rD\,\frac{w_{AB/ab}}{\bar{w}}
\end{aligned}
\tag{13.1}
$$

where $g_{im}$ is the frequency of gamete type im in the gene pool, r is the recombination frequency between the two loci, D is the linkage disequilibrium between the loci in the gene pool, and $a_{im}$ is the average excess for fitness of the two-locus gamete im. For example, the two-locus average excess of the gamete bearing the AB allelic combination in a random mating population is:

$$
a_{AB} = g_{AB}\left(w_{AB/AB} - \bar{w}\right) + g_{Ab}\left(w_{AB/Ab} - \bar{w}\right) + g_{aB}\left(w_{AB/aB} - \bar{w}\right) + g_{ab}\left(w_{AB/ab} - \bar{w}\right)
\tag{13.2}
$$
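Equations (13.1) and (13.2) can be checked numerically. The following is a minimal Python sketch, not part of the original text. It assumes the usual Chapter 2 style definition of disequilibrium, D = g_AB g_ab − g_Ab g_aB, and equal cis and trans double-heterozygote fitnesses as stated above. The function names, the starting gamete frequencies, and the fitness table are hypothetical illustrations.

```python
from itertools import product

GAMETES = ("AB", "Ab", "aB", "ab")
# Sign of the rD term for each gamete type in Eq. (13.1).
RD_SIGN = {"AB": -1, "Ab": +1, "aB": +1, "ab": -1}


def fitness(w, x, y):
    """Fitness of genotype x/y; w may store either ordering of the gamete pair."""
    return w[(x, y)] if (x, y) in w else w[(y, x)]


def two_locus_delta(g, w, r):
    """Return (delta_g, average_excess, w_bar, D) for one generation of Eq. (13.1)."""
    # Mean fitness under random union of gametes.
    w_bar = sum(g[x] * g[y] * fitness(w, x, y)
                for x, y in product(GAMETES, repeat=2))
    # Eq. (13.2): average excess of each two-locus gamete type.
    a = {x: sum(g[y] * (fitness(w, x, y) - w_bar) for y in GAMETES)
         for x in GAMETES}
    # Assumed definition of linkage disequilibrium (not restated in this section).
    D = g["AB"] * g["ab"] - g["Ab"] * g["aB"]
    w_dh = fitness(w, "AB", "ab")  # double heterozygote, cis = trans assumed
    delta = {x: g[x] * a[x] / w_bar + RD_SIGN[x] * r * D * w_dh / w_bar
             for x in GAMETES}
    return delta, a, w_bar, D


# Hypothetical starting gene pool with strong coupling disequilibrium.
g0 = {"AB": 0.4, "Ab": 0.1, "aB": 0.1, "ab": 0.4}
# Neutral case: every two-locus genotype has fitness 1.
w_neutral = {(x, y): 1.0 for x, y in product(GAMETES, repeat=2)}
delta, a, w_bar, D = two_locus_delta(g0, w_neutral, r=0.2)
print(delta["AB"], -0.2 * D)  # with no selection, the change in g_AB is -rD
```

In the neutral case all average excesses vanish, so the first terms of Eq. (13.1) drop out and only the ±rD terms remain. Replacing `w_neutral` with a table of unequal fitnesses shows how the two kinds of terms combine.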

Note that the first terms in Eq. (13.1) are similar to Eq. (11.5). This first term states that the response to natural selection for this two-locus system is driven by the gametic perspective, in this case, gametes defined by two loci. However, unlike Eq. (11.5), Eq. (13.1) contains a second term that reflects evolution not driven by the gamete’s perspective as measured by average excess. This reflects the fact, pointed out in Chapter 2, that recombination and linkage disequilibrium are also forces of evolutionary change in a two-locus system. This insight goes back to Mayr’s point that the individual’s total genotype is broken apart and scrambled by meiosis and fertilization. Meiosis (including recombination and assortment) destroys the multi-locus genetic continuity that can
be passed on to the next generation through gametes. Hence, the meiotic factors that break down multi-locus genotypes are a major determinant of the unit of selection. The more recombination, the less likely it is to have a multi-locus unit of selection. Interestingly, the antagonistic relationship between linkage disequilibrium and recombination may frequently make the role of the second terms of Eq. (13.1) less important in the long run. In equilibrium models, D only achieves large absolute values when r is small, and when r is large, D tends to be small in absolute value. Hence, over a broad range of conditions, the product rD will be small and the dynamics of the two-locus system will be dominated by the first terms in the right-hand sides of Eq. (13.1). However, important exceptions exist. When a mutation first occurs, it displays maximal disequilibrium with previously existing polymorphisms (Chapter 2). Similarly, founder events, bottleneck events, and admixture events can all generate extensive linkage disequilibrium. In such cases, the second term in Eq. (13.1) may have a large effect and the two-locus system could evolve in ways unpredictable from natural selection alone (the average excesses of fitness). Note also that the second terms in the right-hand sides of Eq. (13.1) are a product of the neutral evolutionary impact of recombination and disequilibrium revealed in Chapter 2 (rD) times a weight determined by genotypic fitnesses ($w_{AB/ab}/\bar{w}$). Note that this weighting contains the genotypic value $w_{AB/ab}$. What is a specific genotypic fitness doing in an equation about changes in gamete frequencies? The answer lies in the special role that the double heterozygote class plays in evolutionary change driven by recombination and linkage disequilibrium. As pointed out in Chapter 2, recombination is a force for change in two-locus gamete frequencies only in double heterozygotes.
Therefore, any factor that influences the frequency of double heterozygotes plays a direct role in modulating the evolutionary impact of recombination, particularly under non-equilibrium conditions that induce linkage disequilibrium. Among those factors is the fitness of the double heterozygote class relative to the average fitness of the population as a whole. If the double heterozygote is more fit than the average individual, selection accentuates the effect of recombination in breaking down linkage disequilibrium; on the other hand, if the double heterozygotes are less fit than average, selection reduces the importance of recombination. The second terms in Eq. (13.1) reveal that qualitatively new features, including genotypic as well as gametic measures of fitness, influence multi-locus evolution. Thus, the average excess is not always the sole arbiter of multi-locus natural selection as it was for single-locus natural selection. This makes the problem of the unit of selection even more important, because if multi-locus complexes are the true units of selection, then recombination, linkage disequilibrium, and the fitnesses of specific genotypes modulate the impact of natural selection in addition to average excess.

For the two-locus model given in Eq. (13.1), the question of the unit of selection becomes the question of whether the course of selective response can be adequately described by looking at each locus separately (Eq. 11.5) or whether we must consider the entire two-locus complex as a unit (Eq. 13.1). Consider just the first locus. Let p be the frequency of the A allele at this locus. Since $p = g_{AB} + g_{Ab}$, the two-locus system described by Eq. (13.1) implies that the dynamics of this single locus, assuming rD ≈ 0, is given by:

$$
\Delta p = \Delta g_{AB} + \Delta g_{Ab} = \frac{g_{AB}}{\bar{w}}\,a_{AB} + \frac{g_{Ab}}{\bar{w}}\,a_{Ab} = \frac{p}{\bar{w}}\left(\frac{g_{AB}}{p}\,a_{AB} + \frac{g_{Ab}}{p}\,a_{Ab}\right)
\tag{13.3}
$$

The term $g_{AB}/p$ is the conditional probability of an A allele being coupled with a B allele given that the gamete carries an A allele, and similarly $g_{Ab}/p$ is the conditional probability of an A allele being coupled with a b allele given that the gamete carries an A allele. Hence, the term in parentheses in Eq. (13.3) is an extension of the concept of average excess (a conditional genotypic deviation given a gamete bearing the A allele) that includes the genetic background defined by the second locus (B or b). If we assume that there is no Mendelian epistasis between the two loci such that the fitness of a two-locus genotype is the sum of the marginal fitnesses of the two loci (e.g., $w_{AB/AB} = w_{A/A} + w_{B/B}$), then, after a little algebra, we have that $a_{AB} = a_A + a_B$ and $a_{Ab} = a_A + a_b$. Substituting these additive fitnesses into Eq. (13.3) yields:

$$
\Delta p = \frac{p}{\bar{w}}\left(a_A + \frac{g_{AB}\,a_B + g_{Ab}\,a_b}{p}\right)
\tag{13.4}
$$

Recall that, under the Fisherian model (Chapter 8), the sum of the average excesses over all alleles at a locus weighted by their allele frequencies must be zero. This implies that the second term in the parentheses of Eq. (13.4) will often be small because the average excess of the B allele must be of opposite sign to the average excess of the b allele. Indeed, if there is no linkage disequilibrium (D = 0), then $g_{AB}/p = v$, the frequency of the B allele, $g_{Ab}/p = (1 - v)$, and the second term in the parentheses of Eq. (13.4) is exactly 0. Under these conditions of no Mendelian epistasis and no linkage disequilibrium, Eq. (13.4) reduces to:

$$
\Delta p = \frac{p}{\bar{w}}\,a_A
\tag{13.5}
$$
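The reduction from Eq. (13.3) to Eq. (13.5) can be verified numerically. The sketch below is an illustration under stated assumptions, not the author's code: two-locus fitness is taken to be the sum of one component per locus, gamete frequencies are parameterized as g_AB = pv + D and so on, and all function names and numerical values are hypothetical.

```python
def two_locus_delta_p(p, v, D, wA, wB):
    """Delta p at the A locus from Eq. (13.3) with additive two-locus fitnesses.

    wA and wB map the number of A (or B) alleles in a genotype (0, 1, 2)
    to that locus's fitness component; genotype fitness is wA[.] + wB[.].
    """
    # Assumed parameterization of gamete frequencies in terms of p, v, and D.
    g = {"AB": p * v + D, "Ab": p * (1 - v) - D,
         "aB": (1 - p) * v - D, "ab": (1 - p) * (1 - v) + D}

    def w(x, y):
        n_A = (x[0] == "A") + (y[0] == "A")  # count of A alleles
        n_B = (x[1] == "B") + (y[1] == "B")  # count of B alleles
        return wA[n_A] + wB[n_B]

    w_bar = sum(g[x] * g[y] * w(x, y) for x in g for y in g)
    # Two-locus average excesses, as in Eq. (13.2).
    a = {x: sum(g[y] * (w(x, y) - w_bar) for y in g) for x in g}
    return (g["AB"] * a["AB"] + g["Ab"] * a["Ab"]) / w_bar


def single_locus_delta_p(p, v, wA, wB):
    """Delta p from Eq. (13.5), using only the average excess of allele A."""
    q, u = 1 - p, 1 - v
    w_bar_A = p * p * wA[2] + 2 * p * q * wA[1] + q * q * wA[0]
    w_bar_B = v * v * wB[2] + 2 * v * u * wB[1] + u * u * wB[0]
    a_A = p * wA[2] + q * wA[1] - w_bar_A
    return p * a_A / (w_bar_A + w_bar_B)


# Hypothetical additive fitness components for each locus.
wA = {2: 0.50, 1: 0.45, 0: 0.40}
wB = {2: 0.50, 1: 0.40, 0: 0.30}
print(two_locus_delta_p(0.3, 0.6, 0.0, wA, wB), single_locus_delta_p(0.3, 0.6, wA, wB))
print(two_locus_delta_p(0.3, 0.6, 0.05, wA, wB))  # D > 0 adds a hitchhiking contribution
```

With D = 0 the two functions agree, so the single locus is an adequate unit of selection. With D ≠ 0 the two-locus result departs from the single-locus prediction by the second term of Eq. (13.4).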

Eq. (13.5) is identical to Eq. (11.5), the fundamental equation of natural selection for a measured genotype at a single locus (Chapter 11). Hence, in the absence of epistasis and linkage disequilibrium, the unit of selection is exactly a single locus. When there is some linkage disequilibrium but no Mendelian epistasis, Eq. (13.4) predicts some deviations from single-locus dynamics at the A locus due to hitchhiking effects from selection at the B locus. Without epistasis favoring certain A/B combinations, we would expect linkage disequilibrium to decay, and Eq. (13.4) would approach Eq. (13.5), ultimately yielding a single-locus unit of selection. When there is much Mendelian epistasis, the fitness effects associated with A are increasingly context dependent upon the alleles at the B/b locus. This makes Eqs. (13.4) and (13.5) less likely to be valid. Accordingly, Mendelian fitness epistasis between two loci is necessary to have a unit of selection beyond the single-locus level. However, as we saw in Chapter 10, even when epistasis is present, it can exist mostly in the form of single-locus additivity, as shown in Figure 10.19. In the ApoE/LDLR example, the response to selecting on cholesterol level would be highly predictable as a single-locus response of the ApoE ε4 allele despite the fact that epistasis with the LDLR A1 allele is essential for affecting cholesterol levels. This example shows that epistasis is necessary but not sufficient to have a multi-locus unit of selection. Another simple way of predicting multi-locus fitnesses from single-locus fitnesses occurs when the fitnesses of each locus are statistically independent and the fitnesses are rescaled probabilities (such as the viability fitness components in Chapter 11). Under these conditions, the multi-locus fitnesses are the products of the single-locus fitnesses. Multiplicative fitnesses do generate deviations from additivity.
However, under the ideal Fisherian conditions in which beneficial fitness deviations are small (Chapter 12, Figure 12.11), multiplicative effects can be adequately approximated by additive effects. This approximation breaks down as the intensity of selection increases. Hence, another factor influencing the unit of selection is the intensity or magnitude of selection.

Neher et al. (2013) have extended these insights to a multi-locus model with genetic drift. They assume weak selection on many loci on a single chromosome. L is the length of the chromosome as measured by the number of sites/loci, with a constant r being the recombination rate between sites. The total map length of the chromosome is R = rL. To measure genetic continuity across generations, Neher et al. measure the impact of recombination by the average number of sites that have not been disrupted by recombination after t generations, ξ(t), which is:

$$
\xi(t) = \frac{L}{1 + Lrt} \approx \frac{1}{rt}
\tag{13.6}
$$
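As a small numerical illustration of Eq. (13.6), the following Python sketch compares the exact expression with its large-Lrt approximation. The function names and the chromosome length, recombination rate, and generation counts are hypothetical values chosen only for illustration.

```python
def undisrupted_length(L, r, t):
    """Eq. (13.6): average number of sites not disrupted after t generations."""
    return L / (1.0 + L * r * t)


def undisrupted_length_approx(r, t):
    """Approximation of Eq. (13.6), reasonable when L*r*t is much larger than 1."""
    return 1.0 / (r * t)


# Hypothetical chromosome: one million sites, r = 1e-8 per adjacent pair of sites.
L, r = 1_000_000, 1e-8
for t in (100, 1_000, 10_000, 100_000):
    print(t, undisrupted_length(L, r, t), undisrupted_length_approx(r, t))
```

The approximation always overestimates the exact value by the factor 1 + 1/(Lrt), so the two converge as Lrt grows.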

We can see from Eq. (13.6) that the average length of a chromosome segment that has not been disrupted by recombination decreases with increasing recombination rate and increasing time. Selection is then measured by $\sigma_b^2$, the variance in fitness associated with local chromosome block b. Fitness epistasis within this block means that some multi-site combinations have high fitness while other combinations have low fitness, a pattern that increases the variance of fitness in the block. Therefore, $\sigma_b^2$ is a measure of fitness epistasis, with $\sigma_b^2$ increasing as epistasis increases. Neher et al. also include the impact of genetic drift through a coalescent model, and by combining some of the equations found in Neher et al. (2013), we can derive the expected length of a recurring fitness block that constitutes a unit of selection, $\xi_b$, as:

$$
\xi_b = \frac{\sigma_b}{r\sqrt{2\log N_{ec}}}
\tag{13.7}
$$

where $N_{ec}$ is the coalescent effective size, which is often closely related to the inbreeding effective size (Chapter 5). The unit of selection, $\xi_b$, increases in size with increasing fitness epistasis ($\sigma_b^2$), decreasing recombination rate (r), and increasing genetic drift (decreasing $N_{ec}$). The impacts of epistasis and recombination upon the unit of selection are as expected from our previous discussion, but the role of genetic drift is novel. Recall from Chapter 4 that genetic drift induces linkage disequilibrium (e.g. Figure 4.8), reduces the rate of decay of disequilibrium with recombination distance (Colonna et al. 2013), and creates longer, stable haplotypes (Panoutsopoulou et al. 2014). All of these drift factors increase the length of the unit of selection.

Despite the importance of the concept of the unit of selection in population genetics, there have been few experiments investigating the unit of selection. One of the few experiments was conducted on the fruit fly Drosophila mercatorum (Templeton et al. 1976). This fly is normally a sexually reproducing species, like most other species of Drosophila. However, when virgin females are isolated from some strains, they lay unfertilized eggs that can successfully develop into viable adult females (Carson 1967), a phenomenon known as parthenogenesis (virgin birth). In D. mercatorum, the unfertilized eggs undergo a normal meiosis (and hence segregation and recombination) followed by the mitotic duplication of a haploid nucleus. Two such haploid nuclei fuse to form a diploid nucleus, which then undergoes development. The resulting adults are all female (as they have two X chromosomes) and are homozygous at all loci. Effectively, these females are doubled haploids, which means that haplotype blocks on chromosomes can be observed directly. Because the parthenogenetic females are diploid and retain normal meiosis, they can reproduce sexually if given the opportunity to mate with a male.
Because there is no recombination in males in this species and visible markers exist for the chromosomes, it is possible to breed a male that is homozygous for the same autosomes as a parthenogenetic strain and has its X chromosome from the parthenogenetic strain, with only a single Y chromosome being introduced from a sexual strain (Templeton 1983b). Using such males, different parthenogenetic strains were crossed, and the resulting F1 females were isolated as virgins to produce a parthenogenetic population through a normal meiosis with full segregation and recombination. These parthenogenetic F2 females are totally homozygous, so chromosome blocks could be directly observed using a combination of visible and isozyme markers (Appendix A). Fitness was estimated as the egg to adult viability of multi-locus genetic
marker combinations. Finally, the intensity of selection was manipulated in these experimental populations to three different levels with absolute viability ranging over an order of magnitude. These experiments revealed much epistasis for viability, and the unit of selection was typically at the multi-locus level, with little selection detected by single loci. There was a strong interaction between the intensity of selection and recombination in determining the size of the unit of selection. The unit of selection increased in size over larger recombinational distances with the increasing intensity of selection, whereas when selection was at its least absolute intensity, only the closest linked markers behaved as units of selection. Hence, all the basic predictions about the impacts of epistasis, recombination, and selective intensity upon the unit of selection were empirically confirmed. These experiments also give support to the weak selection with epistasis model of Neher et al. (2013). These experiments revealed that single crossover events resulted in a more drastic reduction in viability than double crossover events. This observation makes sense in the model of Neher et al. because a single crossover event actually yields a chromosome with more disruption via recombination than a double crossover. The second crossover undoes much of the disruption caused by the first crossover, resulting in a chromosome that has a smaller recombination block on the average than a single crossover would produce.

The theoretical work of Michalakis and Slatkin (1996) also shows the impact of genetic drift, epistasis, the intensity of selection, and recombination on the unit of selection. They investigated a two-locus model of natural selection on a single chromosome, starting in a population initially fixed for the a and b alleles, respectively, at the two loci.
Suppose mutations can occur to yield the A and B alleles at these two loci such that fixation for A and B takes the population to the highest adaptive peak, but negative epistasis exists in this case such that the aB and Ab gametes are associated with deleterious fitness consequences, that is, there is a fitness valley. Under Fisher’s fundamental theorem of natural selection (Chapter 11), the population would have to remain on the lower adaptive peak. However, Michalakis and Slatkin showed that selection and drift can interact to cause a shift from the lower to higher adaptive peak in this case, consistent with shifting balance theory (Chapter 12). They investigated the impact of selective intensity and amount of recombination upon the evolutionary trajectory taken during this peak shift. When selection was intense and recombination was low, populations generally remained at the ab peak until mutations had occurred to produce the double mutant gamete AB. Then, natural selection would drive the AB gamete to fixation. In this case, the unit of selection was the two-locus supergene AB. In contrast, when selective intensities were weaker and/or recombination was stronger, Michalakis and Slatkin found that other evolutionary trajectories to the new adaptive peak usually occurred. For example, we saw in Chapter 12 that even a deleterious mutant has a finite probability of fixation in a local population with small variance effective size. Once fixed for one of the mutants at one of the loci, selection will now favor in this new genetic background the fixation of the mutant allele at the other locus once it is created by mutation or enters the population via gene flow. 
In general, Michalakis and Slatkin found that the evolutionary trajectories were such that the population will be monomorphic for one of the two loci most of the time, which implies that at any given time the unit of selection is a single locus because only one locus is responding to selection most of the time. Moreover, although epistasis was critical in defining the fitness surface and driving the evolutionary process, there would be little opportunity to detect the presence of epistasis in these evolutionary transitions because, as shown in Chapter 10, epistasis is not apparent when a critical allele at one locus is very common. This theoretical work reinforces our earlier conclusion that epistasis is necessary, but not sufficient, to result in multi-locus units of selection. Note also that in this case a coadapted complex (Chapter 12) has evolved characterized by much epistasis, but the evolution of this complex can occur primarily through single loci being the units of selection.

The work of Michalakis and Slatkin (1996) also suggests that another factor can influence the unit of selection: population subdivision. As discussed in Chapter 12, shifting balance works best in a subdivided population with local demes having a small variance effective size but the total population having a large variance effective size. As pointed out in Chapter 6, population subdivision with limited amounts of gene flow and admixture itself can induce linkage disequilibrium. Moreover, subdivided populations experience a Wahlund effect (Chapter 6), which reduces the overall frequency of double heterozygotes and thereby effectively reduces recombination rates. Thus, linkage blocks built up by drift and selection are less disrupted by recombination and have greater stability through time in a subdivided population. Because of the special role played by double (or higher) heterozygotes (Eq. 13.1), the unit of selection is also sensitive to the system of mating. Populations with an inbreeding system of mating ( f > 0) have fewer double heterozygotes and therefore less “effective” recombination. Thus, the unit of selection should be broader in an inbreeding population than in a population with f ≤ 0. Such large, multi-locus units of selection have indeed been demonstrated in plant populations with much selfing (Jain and Allard 1966; Weir et al. 1974). Bürger and Akerman (2011) showed that subdivided populations can have high levels of linkage disequilibrium even between weakly linked loci, allowing the preservation of locally adapted haplotypes. Because of their persistence through time, such adaptive haplotypes can acquire slightly beneficial modifier alleles at other linked loci, resulting in the emergence of clusters of linked adaptive genes. Note once again that the accumulation of these modifiers through selection on their epistatic effects on the original adaptive haplotype occurs with the modifier loci acting as units of selection. 
As we saw in the previous paragraph, coadapted complexes can evolve through single-locus units of selection, although once they have evolved, the whole coadapted complex is maintained as a multi-locus unit of selection. Selection on these multi-locus adaptive clusters is difficult to detect with the standard single-marker assays for selection and local adaptation, such as fst outlier analysis (Chapter 12), so multi-locus approaches that focus on the covariances or correlations across loci are needed (Le Corre and Kremer 2012). An example of such an analysis based on CCC networks (Chapter 2) will be given in Chapter 14.

Overall, epistasis, strong selection, genetic drift, inbreeding (or assortative mating), and population subdivision favor the build-up of multi-locus adaptive complexes. These complexes are then partially broken down during the process of meiosis and the transmission of genetic material to the next generation through the gametes. The unit of selection emerges from the balance between natural selection, drift, system of mating, and limited gene flow working on epistatic systems to build up multi-locus complexes versus the factors of meiosis and fertilization that break apart these same complexes.

We have already seen an example of a higher-order unit of selection in Chapter 12: the supergene involving the multi-locus β-γ gene region (Figure 2.7) in humans living in some malarial regions. Supergenes are not a rare phenomenon, and there are good theoretical reasons for this. One common method for the origin of a new gene is as a tandem duplicate of another gene. In some cases, the duplicated locus retains the same function as the original gene, such as the two copies of the α-Hb genes discussed in Chapter 11. In other cases, the duplicated gene diverges from the original gene, acquiring new or modified functions, as occurred in the β-γ gene region shown in Figure 2.7.
This results in a set of functionally related but divergent genes that are in tight linkage. This combination is found in many multi-gene families and creates ideal conditions for the evolution of a supergene. The physical clustering of functionally related genes displaying epistasis may be a general feature of many genomes. For example, Costanzo et al. (2016) tested 90% of the 6000 genes or so in the genome of the yeast Saccharomyces cerevisiae for all possible pairwise genetic interactions. They found that genes exhibiting more similar interaction profiles are located closer to each other in the genome, whereas genes with less similar interaction profiles are positioned farther apart. This
genomic structure predisposes to the evolution of supergenes, although Yang et al. (2017) argue that gene order itself could be under natural selection, and selection would favor the type of gene order found by Costanzo et al. (2016). The model of Bürger and Akerman (2011) discussed earlier also shows that selection would favor the evolution of recombination and inversions that act as crossover suppressors to help maintain the supergene complex. The association of supergenes and inversions is commonplace (Llaurens et al. 2017). Singer et al. (2005) analyzed chromosome breakpoints during mammalian evolution, and found a nonrandom distribution indicating that clusters of coexpressed genes are being held together by natural selection. Many examples of supergenes are now known. For example, we have discussed in Chapters 10 and 12 the genes controlling wing color and pattern in the Müllerian mimicry complexes found in the genus Heliconius. Many of these genes are in reality supergene complexes (Pardo-Diaz and Jiggins 2014; Kronforst and Papa 2015; Saenko et al. 2019), including ones protected by an inversion (Edelman et al. 2019). Supergenes have also been found in many other species, both invertebrates (Kunte et al. 2014; Nishikawa et al. 2015), vertebrates (Kupper et al. 2016; Lamichhaney et al. 2016a; Taylor and Campagna 2016), and plants (Gould et al. 2017). The unit of selection arises from the interaction of selective intensity and genetic architecture (primarily recombination and epistasis) in the context of population properties (e.g. system of mating, effective sizes, degree of subdivision, etc.). The unit of selection is not just a function of the “unity of the genotype”; rather, the unit of selection is a dynamic compromise between selection building up coadapted complexes and recombination breaking them down. 
Frequently, particularly in the context of local adaptation in a subdivided population, a coadapted gene complex arises piecemeal from single loci acting as units of selection, but once evolved selection maintains the coadapted complex as a higher order unit of selection. Thus, the unit of selection can change as the population evolves or experiences altered demographic conditions.

Targets of Selection Below the Level of the Individual

In Chapters 11 and 12, fitness phenotypes were always assigned to individual organisms. However, we saw that the fitness values of specific individuals were important only through their contribution to genotypic values, that is, the average fitness phenotype for a group of individuals sharing a common genotype for a given genetic unit of selection. Consequently, to study natural selection, all we need is a group that shares a common genetic state. We can then assign a “genotypic value” of fitness to that group. There is nothing about the theory of natural selection that requires the biological units being pooled together and assigned an average fitness value to be individual organisms. Natural selection can occur as long as the group we define has some replicable genetic identity that displays a fitness phenotype, and there is heritable variation in the population for the fitness phenotypes. The genetic identities do not have to correspond to individual level genotypes, although the response to selection will still be modulated by what fitness variation can be passed through time via gametes. In this section, we will discuss targets of selection that are at levels of biological organization below the level of individuals (that is, nested within individuals).

The Genome

As pointed out in Chapter 1, genomes are the physical–chemical structures in which the processes of mutation and recombination are carried out. When we add natural selection, genomes are the physical–chemical structures in which the processes of adaptive evolution are manifested. For


example, when there is a hard selective sweep (Chapter 12), a mutation occurs at a site within the genome and then sweeps to fixation, thereby altering the genome of that species for that nucleotide site and often nearby sites as well. However, the target of selection in the cases of the hard sweeps given in Chapter 12 was not the genome, but rather traits expressed by individuals. Another example emerges from our discussion of the unit of selection, where selection for coadapted gene complexes and supergenes can also lead to selection for altering gene order in the genome or structural changes like inversions that suppress recombination. Once again, genomes are being altered by adaptive evolution, but the target of selection can be an individual butterfly and its wing colors and patterns in the context of a Müllerian mimicry complex. To show that the genome is a target of selection, it is not enough to show that genomic properties have been shaped by natural selection; rather, we must identify genetic units of selection within the genome that have selectable phenotypes expressed at the genomic level that can alter genome structure and thereby be passed on to future generations. As already hinted at in Chapter 1, genomes do contain elements within them that make the genome a legitimate target of selection. One such class of elements consists of mutational motifs. As discussed in Chapter 1, even single nucleotide site mutation is often nonrandom at the molecular level. The phenotype of mutation at a site is highly sensitive to the genomic environmental context as determined by its nucleotide neighbors. Such nonrandom mutagenesis is a selective force at the genome level because certain sequence motifs are destroyed by mutations while others are favored. For example, CG dinucleotides are hypermutagenic when the cytosine is methylated, making a C to T transition highly likely, and mutability can be further enhanced by the 5′ nucleotide adjacent to a CG pair (Baele et al. 2008).
As a result, CG dinucleotides are selected against at the genomic level through the phenotype of hypermutation. The impact of this genomic selection is shown in Table 13.1 for the human LPL gene (Templeton et al. 2000a). All 16 dinucleotide pairs are shown along with their observed numbers and frequencies. The null hypothesis is that dinucleotide pairs are placed together at random, that is, their expected frequency is just the product of the two nucleotide frequencies. As can be seen in Table 13.1, the most severe deviation from this null hypothesis is for CG dinucleotides, which are much rarer than expected. The hypermutagenesis associated with this DNA motif strongly selects against this pairing and thereby greatly affects the dinucleotide composition of the human genome. Methylated cytosines in CG dinucleotides also have a functional role in gene expression (Chapter 1), creating the possibility that the destruction of some CG dinucleotides by hypermutagenesis could have phenotypic consequences at the individual level, and often deleterious ones. Moreover, some C to T transitions occur in coding regions and can lead to amino acid changes, which could also induce selection at the individual level. Huttley (2004) examined CG evolution in the tumor suppressor gene BRCA1 in primates. Overall, he found that CG dinucleotides have the largest substitution rate of all dinucleotides, leading to CG’s being underrepresented in the genomes. However, CG dinucleotides in the protein coding region that would cause nonsynonymous changes had a significant reduction in substitution rate, implying that selection at the individual level was favoring the retention of these CG sites. The overall composition of CG dinucleotides in the genome was affected by the balance of genomic versus individual selection. This illustrates a general theme that will recur in this chapter. Pleiotropy is not limited just to one target of selection, but often affects two or more targets. 
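The null hypothesis behind Table 13.1 (expected dinucleotide frequency equals the product of the two single-nucleotide frequencies) can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used by Templeton et al. (2000a): the function name `dinucleotide_obs_exp` is hypothetical, dinucleotides are counted in overlapping windows along one strand, and the input is assumed to contain only the letters A, C, G, and T.

```python
from collections import Counter


def dinucleotide_obs_exp(seq):
    """Return {dinucleotide: observed/expected frequency ratio} for a DNA string.

    Expected frequency under the null hypothesis is the product of the two
    single-nucleotide frequencies (hypothetical helper, for illustration only).
    """
    seq = seq.upper()
    n = len(seq)
    # Single-nucleotide frequencies in the sequence.
    base_freq = {base: count / n for base, count in Counter(seq).items()}
    # Overlapping dinucleotide counts: positions (0,1), (1,2), ...
    pair_counts = Counter(seq[i:i + 2] for i in range(n - 1))
    total_pairs = n - 1
    ratios = {}
    for first in base_freq:
        for second in base_freq:
            observed = pair_counts.get(first + second, 0) / total_pairs
            expected = base_freq[first] * base_freq[second]
            ratios[first + second] = observed / expected
    return ratios


# Toy sequence (not real data): a ratio well below 1 flags an
# underrepresented dinucleotide, as CG is in Table 13.1.
toy_ratios = dinucleotide_obs_exp("ACGT" * 5)
```

A ratio near 1 is what random pairing predicts; the CG ratio of 0.22 reported in Table 13.1 is the kind of strong deficit this calculation is meant to expose.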
A full understanding of the evolutionary trajectory of the underlying unit of selection often requires studies at multiple biological levels. Transposons (Chapter 1) are both a unit of selection and a target of selection within the genome as they can replicate themselves within the genome. There are two broad classes of transposons: DNA transposons and retrotransposons. DNA transposons generally move within the genome


Table 13.1 The number and frequencies of all possible dinucleotides in the LPL reference sequence, and the ratio of observed to expected dinucleotide frequencies.

Dinucleotide   Number (Frequency)   Observed/Expected Frequencies
TT             1024 (0.105)         1.15
TC             594 (0.061)          0.98
TA             640 (0.066)          0.75
TG             684 (0.070)          1.16
CT             697 (0.072)          1.15
CC             511 (0.052)          1.23
CA             716 (0.074)          1.23
CG*            88 (0.009)           0.22
AT             739 (0.076)          0.87
AC             483 (0.050)          0.83
AA             890 (0.091)          1.09
AG             709 (0.073)          1.25
GT             481 (0.049)          0.81
GC             424 (0.044)          1.05
GA             575 (0.059)          1.01
GG             478 (0.049)          1.21

Note: The expected dinucleotide frequencies are the product of the two respective nucleotide frequencies. The results for the CG dinucleotide (bold in the original) are marked with an asterisk. Source: Templeton et al. (2000a) © 2000, Elsevier.

as pieces of DNA, cutting and pasting themselves into new genomic locations. Retrotransposons duplicate through an RNA intermediate, usually with the original transposon remaining at its original site where it is transcribed. The resulting RNA transcript is then reverse transcribed into DNA, which then can integrate into new genomic locations. In either event, this phenotype of transposition is expressed within genomes and can be a target of selection. A transposon that can make many copies of itself and disperse throughout the genome has a much greater chance of being passed on through a gamete to the next generation than another transposon that has poor replicative abilities. For example, a transposon that exists as a single copy on an autosome will be passed on to the next generation in only half of the gametes. However, a transposon that has produced many copies that are dispersed across many locations ensures that virtually all gametes will carry multiple copies to the next generation. In this manner, those transposons most successful at the genomic level also have greater success in spreading throughout the population. As a result, the genomes of many organisms are filled with many different transposons, with the copy number of particular types of transposons sometimes going into the millions. For example, about two-thirds of the human genome consists of repetitive elements, most of which are transposons (Gorbunova et al. 2014; Rishishwar et al. 2018). Indeed, 17% of the human genome consists of just one retrotransposon family, LINE-1. There is no doubt that transposition is a major selective force that strongly shapes the genomes of many species.
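The transmission advantage of a multicopy transposon can be made concrete with a small calculation. The sketch below rests on simplifying assumptions that are not stated in the text: every copy is heterozygous (present on only one homolog), the copies sit at unlinked sites, and they segregate independently, so each copy enters a given gamete with probability 1/2. The function name `prob_gamete_carries` is hypothetical.

```python
def prob_gamete_carries(copies, p_per_copy=0.5):
    """Probability that a gamete carries at least one of `copies` insertions.

    Assumes independent segregation of heterozygous, unlinked copies, each
    transmitted with probability `p_per_copy` (an illustrative assumption).
    """
    # A gamete lacks the transposon only if it misses every copy.
    return 1.0 - (1.0 - p_per_copy) ** copies


# Transmission probability for 1, 5, 10, and 20 dispersed copies.
for k in (1, 5, 10, 20):
    print(k, prob_gamete_carries(k))
```

Under these assumptions a single copy is transmitted in half of the gametes, whereas ten dispersed copies are missed by fewer than one gamete in a thousand, which is the sense in which "virtually all gametes" carry the element.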


There is much variation in the ability to transpose even within a family of transposons that are related by descent within a genome, so the potential for genomic-level selection is high. For example, only about 100 copies of LINE-1 transposons can transpose at all (Nee 2016). There are many factors that contribute to this silencing of transposition. Neutral evolution alone explains some of this inactivation. Suppose a transposon is located in a part of the genome such that it has no effect on the individuals who bear it. It then behaves just like a pseudogene (Chapter 5) and accumulates neutral mutations at a rapid rate, including loss-of-transposition mutations that convert it into an inert element. Selection at the individual level is another explanation. At the individual level, the insertion of a transposon is akin to an insertional mutation, and like most mutations, a newly located transposon can have deleterious effects. A transposon that inserts itself into a part of the genome that leads to individual-level deleterious effects will be selected against just like any other deleterious mutation. This in turn results in selection on the transposon to preferentially insert into parts of the genome where it would have no individual-level effects. Transposons in such locations are subject to neutral evolution deactivation. The deleterious effects of transposition have also resulted in many organisms evolving mechanisms to deactivate transposons. For example, animals have the Piwi-piRNA pathway to silence transposons through a small interfering RNA response (Khurana et al. 2011; Bagijn et al. 2012), and the genes involved in the piRNA pathway themselves are subject to extensive positive selection, indicating continual adaptation to new transposons (Simkin et al. 2013). 
Other mechanisms for silencing transposition are epigenetic modifications of transposon DNA to prevent transcription, such as methylating CG dinucleotides, blocking the integration of DNA copies of transposons into the genome, TRIM proteins that are part of innate immunity against retroviruses that can also act against retrotransposons, and the APOBEC proteins that can edit the DNA of retroviruses and retrotransposons to damage their open reading frames (Zamudio and Bourc’his 2010; Knisbacher and Levanon 2016; Song and Schaack 2018). Once a transposon has been silenced by any or a combination of these host defenses, it becomes more susceptible to innate inactivation through neutral evolution. The overall amount of transposition in the genome reflects a dynamic balance between these antagonistic targets of selection. Transposon fitness effects at the individual level are not limited to the loci that experience an insertion in or near them. Because transposons often have multiple copies within the genome, transposon insertions can mediate microhomology-driven DNA breaks, recombination, and repair that result in genomic structural variation and copy-number variants (Szafranski et al. 2018). The retrotransposon insertions are also bringing in a unit of functional DNA that encodes the ability of transposons to move, replicate, and control transcription. Although often harmful at the individual level, these features of transposon insertions can sometimes have beneficial effects as well. Indeed, transposons have played a major role in adaptive evolution at both the microevolutionary and macroevolutionary levels. First, transposons can be used to rewire and fine-tune the transcriptome (Cowley and Oakey 2013). For example, different types of transposons have inserted into the promoter of the hsp70Ba gene that codes for the stress-inducible molecular chaperone Hsp70 in Drosophila melanogaster (Lerman et al. 2003).
These transposon insertions underlie the natural variation found in the expression of this gene, and this in turn directly alters two components of individual fitness, inducible thermo-tolerance and female reproductive success. Another example is provided by the Alu transposon that accounts for about 11% of the human genome (Mustafina 2013). Alu elements frequently insert into non-coding regions and modify the expression of nearby genes (Cooper 1999). For example, an Alu sequence in the last intron of the human CD8A gene modulates the activity of an adjacent T lymphocyte-specific enhancer. This particular Alu sequence differs at seven nucleotides from its probable source Alu sequence. Two of these nucleotide changes are in an area of the derived sequence that acts as a transcription factor binding site, and site-directed mutagenesis indicates that both nucleotide substitutions are necessary for this function. These results suggest that these nucleotide changes were due to selection at the individual level directed at this specific inserted Alu sequence. Thus, this Alu unit of selection seems to have been shaped by positive selection for its phenotypic impact at the individual level. Of greater macroevolutionary importance is that transposons can bring a common transcriptional control to a new set of genes, thereby creating a novel network of coordinated gene expression. Such novel networks may often not function well initially, but several have been fine-tuned to help create novel phenotypes of great evolutionary importance (Oliver and Greene 2011; Chuong et al. 2017). For example, transposons have played a critical role in such fundamental adaptations as the vertebrate immune system and mammalian placenta (Koonin and Krupovic 2015; Chuong et al. 2016). The evolution of the vertebrate immune system was of particular macroevolutionary importance. Vertebrates generally have much longer generation lengths than the infectious agents that attack them, yet the vertebrate immune system effectively allows genetic diversity to be generated and selected on a rapid time scale within individuals, which are the very attributes of transposons. This nongermline genetic diversity can be generated because our antigen receptor genes are divided into gene segments, called V and J, and a third segment called D at some loci. DNA rearrangements, called V(D)J recombination, of these segments can be generated within the cells of our immune system. This combinatorial mechanism generates huge amounts of variation in the antigen recognition portion of the receptor, and mechanisms exist to preferentially select at the cellular level within individuals those combinations that are most effective in dealing with a particular infectious agent.
Note that our immune response represents a type of selection at a level below the individual and involves the movement of DNA elements. These features suggest that V(D)J recombination has evolved from a transposable element, and recent studies on the molecular details of this recombination mechanism strongly indicate that this unique feature of the jawed vertebrate immune system evolved from a transposable element called the RAG transposon. This novel immune system, co-opted from a transposon, constitutes one of the most important adaptive breakthroughs in the jawed vertebrates, an adaptation that arose 450 million years ago and retains its critical adaptive significance to the present. Indeed, one can reasonably speculate that humans, along with many other jawed vertebrates, could have never evolved if it had not been for this RAG transposon. Some transposons display a qualitatively different aspect to their evolution not seen in other targets of selection discussed so far. In all previous cases, no matter how intense the selection is below the level of the individual, the selective response of the unit of selection was always constrained and shaped by the necessity of passing on to the next generation through a gamete. However, some transposons have the ability to “infect” a new individual in a manner independent of gametic transmission. This infectious type of transmission is called horizontal transmission, whereas the transmission to new individuals through a gamete is called vertical transmission. The ability of some transposons for horizontal transmission blurs the line between viruses and transposons, and indeed in many cases no such line is readily discernable. This means that to some extent many transposons evolve as an independent organism and to some extent as a genetic element imbedded within the genome of the host. The most dramatic cases of horizontal transmission are those in which a transposable element infects individuals from a different species. 
Interspecific horizontal transmission can be detected by constructing the molecular phylogeny of a transposon sequence found in many different species and comparing it to the molecular phylogeny of some single-copy gene from the same species. If all transposon transmission is vertical, then the two phylogenies should be the same. Horizontal transmission will create topological incongruence between the two phylogenies. Such topological incongruence is shown in Figure 13.1 for the P element transposon found in several species of Drosophila. The topological incongruence shown in that figure requires a minimum of 11 horizontal transfer events among the 18 species surveyed.
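The idea of topological incongruence can be illustrated with a minimal Python sketch that lists the clades found in one rooted tree but not in the other. Everything here is an illustrative assumption rather than the method of Silva and Kidwell (2000): the function names `clades` and `incongruent_clades` are hypothetical, trees are written as nested tuples of species labels, each species contributes exactly one transposon sequence, and the species names A through D are placeholders. The count of mismatched clades is not the minimum number of horizontal transfers, which requires a more elaborate reconciliation analysis.

```python
def clades(tree):
    """Collect the leaf set of every internal node of a nested-tuple tree."""
    found = set()

    def leaves(node):
        # A leaf is a species label; an internal node is a tuple of children.
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def incongruent_clades(host_tree, element_tree):
    """Clades present in exactly one of the two rooted trees."""
    return clades(host_tree) ^ clades(element_tree)


# Placeholder species labels, not taken from Figure 13.1.
species_tree = (("A", "B"), ("C", "D"))
transposon_tree = (("A", "C"), ("B", "D"))
mismatches = incongruent_clades(species_tree, transposon_tree)
```

With purely vertical transmission the two trees share every clade and the mismatch set is empty; in the toy example the transposon tree groups A with C and B with D, so four clades disagree.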


[Figure 13.1: two panels, "Drosophila Phylogeny" and "P-element Phylogeny".]

Figure 13.1 Comparison of the Drosophila species and P-element phylogenetic histories. Double-headed arrows unite P-element clades with the Drosophila species from which they were sampled. Source: Modified from Silva and Kidwell (2000).

Once a transposon has invaded a new species via horizontal transmission, it can rapidly spread through transposition within the novel genome and by vertical transmission, particularly if the species has a population structure characterized by a random or outbreeding system of mating and much gene flow. For example, prior to 1949, the transposable P elements were not found in strains of D. melanogaster collected throughout the world (Anxolabehere et al. 1988). Starting in the 1950s, a few strains collected in the Americas and in the Pacific and Australia began to have P elements (Table 13.2). When P elements first infect a naïve D. melanogaster, the piRNA pathway is ineffective against them, resulting in P elements spreading across the genome and sometimes leading to sterility (Khurana et al. 2011). However, some P elements inserted into piRNA clusters, allowing them to produce novel and inherited piRNAs that could silence P-transposition. These piRNAs are also maternally transmitted into the cytoplasm of the egg, so offspring of such females were not affected by an outburst of P-element transposition. In contrast, if an infected male mated with a naïve female, the offspring would suffer from the sterility syndrome. Even so, such a fly could recover its fertility as it aged and began to express the novel piRNAs inherited from the father. These outbursts of transposition ensured that virtually all gametes produced by a fly with P-elements would have many copies of P-elements, resulting in highly effective vertical transmission. Over time, the incidence of strains bearing P elements tended to increase in these geographical areas, and moreover P elements spread to populations in Europe, Asia, and Africa. Thus, after the initial horizontal transfer around 1950, it took only about 20 years for P elements to spread throughout D. melanogaster on a global basis.
By escaping the constraints of only gametic transmission, some transposable elements have acquired a remarkable strategy for evolutionary success. However, their evolution as an


Table 13.2 Number and percentage(a) of tested strains of Drosophila melanogaster collected in four major geographical regions during five time periods without (Pneg) and with (P) P-elements.

             Americas             Europe and Asia      Africa               Orient and Australia
Period       Pneg       P         Pneg       P         Pneg       P         Pneg       P
1920–1949    11 (100)   0 (0)     10 (100)   0 (0)     3 (100)    0 (0)
1950–1959    11 (85)    2 (15)    11 (100)   0 (0)     4 (100)    0 (0)     11 (92)    1 (8)
1960–1969    6 (32)     13 (68)   24 (86)    4 (14)    4 (80)     1 (20)    9 (75)     3 (25)
1970–1979    4 (8)      49 (92)   35 (51)    33 (49)   4 (68)     2 (33)    9 (39)     14 (61)
1980–1986    1 (4)      27 (96)   50 (56)    40 (44)   11 (37)    19 (63)   16 (43)    21 (57)

(a) Percentages are numbers in parentheses. Source: Anxolabehere et al. (1988). © 1988, Oxford University Press.

independent infectious agent still interacts with targets of selection at and below the level of individuals after horizontal transfer has occurred. These multiple levels of selection are not mutually exclusive, but rather are interactive in how they shape the response of these remarkable and highly successful units of selection that numerically dominate the genomes of many species.

Gametes

Nonrandom mutagenesis and transposition are common in both somatic and germline cells, but their primary evolutionary impact occurs when they are in the germline. In addition, the germline experiences some processes that are normally rare or absent in somatic cells. For example, recombination and gene conversion occur primarily in meiosis and not mitosis. Consequently, we now focus on the germline as it progresses from stem cells to mature gametes as a target of selection.

Recombination and Gene Conversion

Just as mutagenic motifs exist in the genome, so do recombination motifs (Chapter 1). Moreover, just as mutagenic motifs tend to destroy themselves by the act of mutation, recombination motifs also tend to destroy themselves by the act of recombination. If this was the only selective force operating, then recombination motifs should become extinct; yet, recombination hotspots are abundant in many genomes (about 25 000 in the human genome). This is called the recombination hotspot paradox (Ubeda et al. 2019). One possible contributor to the resolution of this paradox is when a recombination motif also has fitness effects at the individual level. Crossing over (not always resolved as a recombination event) favors the proper pairing of homologous chromosomes during meiosis and hence normal segregation. Without normal chromosomal segregation, abnormal chromosome numbers can appear in the offspring, which are typically strongly selected against. Hence, natural selection at the individual level could be favoring the evolution of new recombination hotspots and resurrection of old ones. Consistent with this hypothesis, recombination hotspots are very dynamic over evolutionary time, but such models do not appear to fully resolve the paradox (Ubeda et al. 2019). An additional individual-level selection that helps resolve the paradox focuses on a different unit of selection, at least in mammals—the gene coding for the protein PRDM9 that binds a specific sequence at a target recombination hotspot. Binding specificity between PRDM9 and its target site is required for the initiation of recombination. As an old binding motif is destroyed at an existing recombination hotspot, selection could occur at PRDM9-like genes for a new motif, thereby creating new recombination hotspots or resurrecting old ones. The model of Ubeda


et al. (2019) reveals that even low levels of viability selection at the individual level to preserve recombination would resolve the paradox and result in hard selective sweeps at PRDM9-like genes. Indeed, the signatures of such sweeps are found at the PRDM9 locus. Recombination plays a different but important role in multigene families. As transposons spread throughout a genome, they form what is known as a multigene family, that is, many copies of what was originally a single DNA element now coexist at different locations within a single genome. In the case of transposons, this multigene family is frequently dispersed, meaning that the copies are not necessarily found next to one another. Another major type of multigene family is a tandem family in which the copies tend to exist adjacent to one another on the same chromosome, such as the cluster of globin genes shown in Figure 2.7. These are not mutually exclusive categories because some multigene families consist of several dispersed tandem clusters throughout the genome. Within tandem families, many mechanisms of unequal exchange exist that allow what is originally a single copy or mutation in this family to duplicate itself and occupy more than one position across the tandem family. Among the mechanisms of unequal exchange in tandem families is gene conversion, which can operate on both tandem and dispersed multigene families. For example, we discussed above how transposons can unite a dispersed set of genes into an integrated expression network. The common DNA motifs found in these transposons can allow gene conversion to occur even between widely dispersed elements.
Fawcett and Innan (2019) showed that gene conversion acting between different transposon copies provides a mechanism to spread beneficial mutations that improve the network, allows multiple mutations to be combined and transferred together, and allows natural selection to work efficiently in spreading beneficial mutations and removing disadvantageous mutations. Another common mechanism that works in tandem families is unequal crossing over, as shown in Figure 13.2. To understand better the evolutionary dynamics of a multigene family, we need to extend our concept of genetic homology. Traditionally, genetic homology refers to all the copies of a gene that exist at a particular locus, literally a position in the genome. From coalescent theory (Chapter 5), we expect all such copies to be descendants of a single common ancestral gene, which ties in genetic homology to the more general idea of homologous traits being derived from a common ancestral trait. Multigene families create problems with the traditional definition of genetic homology. As can be seen in Figure 13.2, unequal crossing over can cause a gene originally at just one position or locus to have descendant copies that occupy different positions or loci on the same chromosome. These copies are also homologous in the fundamental sense of being descended from a common ancestor, but they violate the usual concept of genetic homology by occupying different loci or positions in the genome. To accommodate this problem, the concept of genetic homology has been extended from orthology (the original definition of genetic homology of all the copies of a gene occupying the same locus) to include paralogy, sets of genes related by descent from a common ancestral gene but that occupy different loci in the genome. The concept of paralogy is also applicable to dispersed multigene families, such as the LINE-1 and Alu transposon families. 
Unequal crossover, paralogous gene conversion, and transposition are just three mechanisms that can generate paralogous copies of an originally orthologous set of genes or motifs within a gene (e.g. the paralogous gene conversion model of Fawcett and Innan 2019). Thus, genetic elements affected by such processes represent another target of selection below the level of the individual as they allow the spread of what was originally a single DNA segment to multiple locations in the genome. As with the other targets of selection below the level of the individual, genes in tandem families are often associated with multiple targets of selection. Note from Figure 13.2 that unequal crossing over generates variation in the number of copies of the gene in the tandem family, and this is also true for other mechanisms of unequal exchange. However, there is often selection at the level of the


[Figure 13.2: diagram of tandem copies numbered 1 to 4, with the two crossover products labeled "Deletion" and "Duplication".]

Figure 13.2 Unequal crossing-over in a tandem, multigene family. The numbers represent tandem copies of a repeating DNA unit.

individual for the number of copies in the multigene family. We already have seen an example of this in Chapter 11 with respect to the genes coding for the α-chain of hemoglobin. Normally, this is a simple multigene family consisting of just two tandem copies (Hbα1 and Hbα2) on chromosome 16. As discussed in Chapter 11, it is important for the health of the adult individual that the production of α-chains matches that of the β-chains; otherwise an anemic condition known as thalassemia develops. In humans, this production is normally balanced when a person has two orthologous copies of the β-chain locus and the four orthologous/paralogous copies of the duplicated α-chain locus. The greater the discrepancy in the number of copies of the α-chain family from the normal diploid copy number of four, the more clinically severe is the thalassemia (Chapter 11). Although a mild thalassemic condition can be favored by natural selection in a malarial environment (Chapter 11), natural selection at the individual level usually favors those gametes bearing just two paralogous copies of the α-chain genes because of thalassemia. Thus, although a process of reduction and increase in copy number is constantly occurring within these genes, stabilizing selection ensures that most chromosome 16s in humans bear exactly two paralogous copies. Once again, to understand the evolution of even this simple tandem family, we have to consider the effects of multiple targets of selection. Mechanisms of unequal exchange also interact with other population level evolutionary forces, such as genetic drift. For example, Weir et al. (1985) examined the joint effects of unequal exchange and genetic drift upon shaping the amount and pattern of genetic variation within a tandem multigene family. Recall from Chapter 5 that the overall rate of fixation of neutral, orthologous alleles at a single locus is given by (2Nμ)[1/(2N)] = μ, where μ is the neutral mutation rate and N is the population size. 
Now consider a tandem multigene family with n tandem repeats per chromosome. We will assume that n is a constant, thereby mimicking the stabilized situation seen with the α-globin genes in which natural selection maintains a nearly constant n in the population. However, we will assume neutrality at the individual level of the mutants arising within the multigene family. Because unequal exchange allows a gene to spread to paralogous positions, there are now two components to coalescence: coalescence of all orthologous copies at a particular locus (or in the coalescent sense, descent of all orthologous copies from a common ancestral gene), and coalescence of all paralogous copies on the same chromosome to the same ancestral form (in the coalescent sense, the descent of all paralogous copies on the chromosome to a common ancestral gene). Thus, we have a combination of coalescence at the population level of all orthologous


copies and coalescence at the level of the chromosome of all paralogous copies. However, ultimately all genes in the tandem family are homologous, so all orthologous and paralogous copies will eventually undergo coalescence to a common ancestral gene. The total number of genes in the tandem family is 2Nn, and under neutrality all of these copies are equally likely to become the common ancestor of all future genes in this family. Thus, under neutrality, the probability of coalescence to any particular ancestral gene in this family is 1/(2Nn). Retaining μ as the neutral mutation rate per locus, the rate of input of new mutations into the entire family is 2Nnμ. In analogy to Eq. (5.3), the rate of neutral evolution in the multigene family is:

$$\text{Rate of Neutral Evolution} = \frac{1}{2Nn} \times 2Nn\mu = \mu \tag{13.8}$$
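The cancellation in Eq. (13.8) can be checked with exact rational arithmetic. The short sketch below is illustrative only: the function name `neutral_rate` and the parameter values are hypothetical, and μ is passed as a `Fraction` so that no rounding error obscures the result.

```python
from fractions import Fraction


def neutral_rate(N, n, mu):
    """Rate of neutral evolution in a tandem family of n repeats, as in Eq. (13.8).

    N is the population size, n the number of tandem repeats per chromosome,
    and mu the neutral mutation rate per locus (pass a Fraction for exactness).
    """
    total_copies = 2 * N * n              # gene copies in the whole family
    p_ancestor = Fraction(1, total_copies)  # chance a given copy becomes the ancestor
    mutation_input = total_copies * mu    # new neutral mutations per generation
    return p_ancestor * mutation_input


# Hypothetical values: the rate equals mu whether n is 1 or 50.
mu = Fraction(1, 10**6)
rates = [neutral_rate(1000, n, mu) for n in (1, 2, 50)]
```

Every entry of `rates` equals μ, because the family size 2Nn enters both the probability of becoming the common ancestor and the mutational input, and so drops out of the product.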

Equation (13.8) reveals a rather startling conclusion: the evolutionary dynamics of a tandem multigene family under genetic drift and neutral mutation when coupled with a molecular mechanism(s) of paralogous spread are the same as that of a single locus over long periods of evolutionary time. Thus, the entire family, regardless of n, evolves under neutrality as if it were a single locus. The evolutionary impact of unequal exchange is strong but invisible in the right-hand side of Eq. (13.8). Equations (5.3) and (13.8) only refer to long-term evolutionary dynamics and are based on the assumption of ultimate coalescence of all orthologous and paralogous copies to a common ancestral gene, no matter how long that coalescence may take. However, on shorter time scales, the time to coalescence, both in the orthologous and paralogous senses, does matter. As we saw in Chapter 5, it takes an average of 4N generations to go to population coalescence to the common ancestral gene for all orthologous copies at a particular locus. Weir et al. (1985) showed that the expected time to coalescence of all orthologous and paralogous copies is given by:

$$\text{Expected Time to Coalescence} = \begin{cases} 4N & \text{when } \alpha > \dfrac{1}{2N} \\[1ex] \dfrac{2}{\alpha} & \text{when } \alpha \le \dfrac{1}{2N} \end{cases} \tag{13.9}$$

where α is the probability of a gene converting a paralogous gene to its state by any applicable molecular mechanism of paralogous spread. In the top case of inequalities (13.9), evolution proceeds with the same coalescent dynamics as a single-locus system. In particular, the α parameter, which measures the strength of the molecular level forces causing paralogous exchange, has no impact on the expected time to coalescence. Note that the molecular level forces become irrelevant to coalescent time when α is strong relative to the population level force of genetic drift, whose strength as we saw in Chapter 4 is measured by 1/(2N). When α > 1/(2N), a gene spreads to paralogous positions within chromosomes more rapidly than drift causes orthologous coalescence between chromosomes. As a result, by the time orthologous coalescence has occurred for a particular chromosome, paralogous coalescence has already occurred on that chromosome. Thus, orthologous coalescence is the limiting step to global coalescence in the multigene family when drift is a weak evolutionary force compared with the molecular forces of paralogous conversion. Because mutations are occurring throughout this process, the rapid intra-chromosomal coalescent dynamics of paralogous copies relative to population coalescence of orthologous copies means that more of the genetic variation in the multigene family exists as differences between chromosomes at the population level rather than among paralogous copies within a chromosome at the genome level. In the bottom case of inequalities (13.9), the molecular level force of paralogous exchange is only as strong as or weaker than genetic drift at the population level. This means that orthologous coalescence will tend to occur within the population at any given locus as or more rapidly than coalescence of the paralogous copies within a chromosome. Hence, the time to total coalescence of all

orthologous and paralogous copies is limited by the rate of paralogous coalescence. Now, the time to total coalescence is a function of α, and genetic drift does not influence this expected time. In this case, there will be much variation among the paralogous copies within a chromosome, with the level of variation increasing as α decreases. Note that the molecular phenotype (measured by α) is important in determining the evolutionary dynamics of fixation and patterns of neutral variation only when the molecular phenotype is weak relative to the population-level evolutionary force of genetic drift. This at first may seem counterintuitive, and indeed some have verbally argued that strong molecular level forces will override population level processes in multigene families (Dover 1982). However, the model of Weir et al. (1985) shows that just the opposite happens; strong forces at the molecular level in this case accentuate the importance of population-level evolutionary forces. The resolution of this paradox is to focus upon genetic variation at the population level and the dynamics of a population through time; the standard focus of microevolutionary theory. A strong force that results in rapid coalescence will have little impact on levels of variation over long periods of time; a weak force in contrast can make a substantial contribution to variation during that long time period. Indeed, we have seen this phenomenon before. In Chapter 5, we investigated the level of neutral genetic variation found in a population as a function of μ and genetic drift. One of the fundamental breakthroughs of Kimura’s neutral theory was the realization that large amounts of genetic variation can be maintained in a population whose only population-level evolutionary force is drift. As shown by Eq. (5.7), the levels of neutral variation go up as drift becomes weaker and weaker (N increases). 
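The two regimes of inequalities (13.9) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `expected_coalescence_time` and the example parameter values are assumptions made here, not part of the model of Weir et al. (1985).

```python
def expected_coalescence_time(N, alpha):
    """Expected generations to coalescence of all orthologous and
    paralogous copies, following inequalities (13.9).

    N     -- population size (2N gene copies per locus)
    alpha -- probability of a gene converting a paralogous gene to its state
    """
    if N <= 0 or alpha <= 0:
        raise ValueError("N and alpha must be positive")
    drift_strength = 1.0 / (2 * N)
    if alpha > drift_strength:
        # Paralogous spread is faster than drift: single-locus dynamics,
        # and alpha has no effect on the expected time.
        return 4.0 * N
    # Paralogous spread is the limiting step: time depends only on alpha.
    return 2.0 / alpha


if __name__ == "__main__":
    N = 1000  # hypothetical population size
    for alpha in (1e-2, 1.0 / (2 * N), 1e-4, 1e-5):
        t = expected_coalescence_time(N, alpha)
        print(f"alpha = {alpha:.1e}: expected time = {t:.0f} generations")
```

Note that the two branches agree at the boundary α = 1/(2N), where 2/α = 4N, so the expected time is continuous in α.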
The startling conclusion of the neutral theory was that genetic drift was an important evolutionary force even in large populations in which it is extremely weak, not just small populations, and indeed large populations have more genetic variation influenced by drift than a small population (Chapter 5). So saying an evolutionary force is weak is not the same as saying it is unimportant in evolution; often, it means just the opposite. Inequalities (13.9) have another important implication when the molecular processes of unequal exchange are stronger than drift. Suppose a multigene family is created by a gene duplication event in an ancestral species that subsequently gives rise to two or more present-day species (Figure 13.3). If no molecular mechanisms of unequal exchange existed, a mutation could not spread from its original locus of origin to paralogous locations. Thus, the paralogous copies within a species would not coalesce until sometime before the original gene duplication in the ancestral species (Figure 13.3a), implying they should be very divergent from one another. In contrast, orthologous genes between the current species should coalesce to the time of the original duplication event (Figure 13.3a). Thus, without a mechanism for paralogous spread of new mutants, the orthologous comparisons between species should be more similar than the paralogous comparisons within species (Figure 13.3a). However, when the molecular forces for paralogous spread are strong, all orthologous and paralogous genes should coalesce within a species with an expected time of 4N generations. Generally, 4N is going to be much smaller than the time at which speciation occurred in the past. Because of the convergence of the entire multigene family to single-locus coalescent dynamics under strong mechanisms of unequal exchange, the orthologous comparisons between species should be much more divergent than the paralogous comparisons within species (Figure 13.3b). 
This pattern is called concerted evolution because all copies of the multigene family evolve together, in concert, sharing the same mutational substitutions that discriminate one species from another. Many multigene families do indeed display the pattern of concerted evolution, as expected from inequalities (13.9). Unequal exchange can be an even more powerful selective agent in the germline when it is biased. Figure 13.2 shows that both deletions and duplications result from unequal exchange. Bias


[Figure 13.3 image: panels (a) and (b). Each panel shows a species tree, a haplotype tree for the gene copies α1, α2, β1, and β2, and the two trees combined for species α and β. Labeled events: gene duplication and speciation in both panels, plus paralogous coalescence within each species in panel (b).]

Figure 13.3 Evolution of a tandem duplicated locus without mechanisms of exchange between paralogous copies (a) and with mechanisms of paralogous exchange (b). In both panels, a species tree is shown on the left and a haplotype tree on the right. The two trees are combined in the middle, with the thick lines outlining the species tree and thin diagonal lines showing the haplotype tree embedded within the species tree. Thin horizontal lines indicate a portion of the chromosome on which the genes reside. An ancestral gene is shown in gray, which is then duplicated into two paralogous copies, shown in black and white, in the common ancestral species of the descendant species α and β.

occurs when just one of these products is preferentially passed on to the gamete. For example, there are genetic diseases in humans that are associated with trinucleotide repeats (Rubinsztein 1999). Huntington’s disease is one such trinucleotide disease, associated with a CAG repeat. Huntington’s disease is inherited as an autosomal dominant genetic disease, and is associated with a late age of onset degeneration of the central nervous system (usually after 40 years of age, but sometimes earlier) that ultimately causes death. Alleles with 35 or fewer CAG repeats are normal (h alleles) and are not associated with the disease. New alleles for Huntington’s disease (H alleles) arise from h alleles with between 29 and 35 CAG repeats that expand on transmission through the paternal

germline to 36 CAGs or greater (Chong et al. 1997). Changes in CAG number occur in only 0.68% of normal chromosomes, but once the threshold of 36 is past, this increases to 70% in male germlines (Kremer et al. 1995). The expansion bias increases with increasing repeat number once past the 36 CAG threshold, reaching 98% in males with at least 50 repeats. The extraordinarily high expansion rates are most consistent with an expansion process that occurs throughout male germline mitotic divisions, rather than resulting from a single meiotic event (Leeflang et al. 1999). Both the direction and rate of expansion are nonrandom in this case. Gene conversion is related molecularly to recombination (Chapter 1), and some of its roles as a selective agent in the germline have already been noted, such as its role in concerted evolution. As with unequal exchange, its selective strength can be greatly enhanced when it is biased. Unequal gene conversion occurs in meiosis when an allele or stretch of DNA converts its homolog to its own genetic state in a heterozygous individual in a manner that results in non-Mendelian segregation ratios. Usually unequal gene conversion is symmetric, that is, it is equally likely for either homologous stretch of DNA to be converted. However, sometimes the genetic state of an allele or a stretch of DNA has the property of preferentially converting its homolog to its own state when gene conversion occurs. Such biased gene conversion is a phenotype with an underlying genetic basis and therefore constitutes yet another target of selection below the level of the individual. Walsh (1983) examined a one-locus, two-allele model (A, a) of biased gene conversion. Let A and a have different phenotypes during meiosis within a heterozygous individual such that γ is the probability of an unequal gene conversion event, and β is the conditional probability that a converts to A given an unequal conversion occurs. 
Note that the above two parameters describe the meiotic “phenotype” of gene conversion at the molecular level within heterozygous individuals. Now, 1 − γ is the probability that no unequal gene conversion event occurred, that is, the probability of getting a 1 : 1 ratio with Mendelian segregation in an Aa heterozygote. With probability γβ conversion is biased in favor of A, yielding a segregation only of A alleles in that meiotic event from an Aa heterozygote. Finally, with probability γ(1 − β), conversion is biased in favor of a, yielding a segregation only of a alleles in that meiotic event from an Aa heterozygote. Hence, the overall segregation ratio from Aa heterozygotes is [½(1 − γ) + γβ] A alleles to [½(1 − γ) + γ(1 − β)] a alleles rather than the normal 1 : 1 segregation. Note that this biased segregation ratio can be expressed as κ : (1 − κ) where κ = ½(1 − γ) + γβ. Letting Gij be the frequency of genotype ij in the adult breeding population, then the frequency of allele A in the next generation’s gene pool, p′, is shown in Figure 13.4 and in the following equation:

$$p' = G_{AA} + \kappa G_{Aa} = G_{AA} + \tfrac{1}{2}G_{Aa} - \tfrac{1}{2}G_{Aa} + \kappa G_{Aa} = p + G_{Aa}\left(\kappa - \tfrac{1}{2}\right) \qquad (13.10)$$

The change in allele frequency is:

$$\Delta p = p' - p = G_{Aa}\left(\kappa - \tfrac{1}{2}\right) \qquad (13.11)$$

From Eq. (13.11), it is clear that biased gene conversion (κ ≠ ½) is a selective force that alters the gene pool. Moreover, if biased gene conversion is the sole target of selection, Eq. (13.11) indicates that we should get fixation of A when κ > ½, and fixation of a when κ < ½. Lartillot (2013) has extended this model to include genetic drift and the association of gene conversion with recombination to yield fixation probabilities when κ > ½ (that is, the A allele is favored by biased gene conversion) in a random mating population to be:

$$u_A \approx \frac{\kappa r}{1 - e^{-4N_{ev}\kappa r}}, \qquad u_a \approx \frac{-\kappa r}{1 - e^{4N_{ev}\kappa r}} \qquad (13.12)$$


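[Figure 13.4 image: an adult population with genotypes AA, Aa, and aa at frequencies GAA, GAa, and Gaa produces gametes in violation of Mendel’s first law. AA contributes A with probability 1, Aa contributes A with probability κ and a with probability (1 − κ), and aa contributes a with probability 1. The resulting gene pool (population of gametes) has p′ = GAA + κGAa and q′ = Gaa + (1 − κ)GAa.]

Figure 13.4 Biased gene conversion at a single locus with two alleles, A and a.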

Table 13.3 shows some numerical results for κr = 0.01, a scaled selection parameter of the same size as s = 0.01, the selection parameter for individual-level selection that was used in Tables 12.2 and 12.3. The results shown in Table 13.3 for biased gene conversion are similar to those for individual-level selection shown in Tables 12.2 and 12.3 in that genetic drift still causes much loss of favored mutants and results in small but positive fixation probabilities for disfavored mutants. But there are some differences. Individual selection appears to be better at increasing the fixation probabilities of a favored mutant than biased gene conversion, which is not surprising since individual selection can act on all individuals in this model but biased gene conversion can act only on heterozygotes. However, biased gene conversion appears slightly better at preventing fixation of disfavored mutants. Note that the evolutionary dynamics of gene conversion (Eq. 13.11) depend not only upon κ, the selected phenotype, but also upon GAa, the frequency of heterozygotes. This dependency is expected because selection for biased gene conversion can only occur within heterozygotes. The dependency upon GAa ensures that other population-level forces will influence the evolutionary process. For example, suppose we have a nonrandom mating population such that GAa = 2pq(1 − f) (see Eq. (3.2)). Then any system of mating that results in a positive f will diminish the selective response to this target of selection, whereas any system of mating that results in a negative f will enhance the response. Thus, biased gene conversion would be expected to play a lesser role in a population of obligate selfers because such populations have very few heterozygotes, and therefore there is little opportunity for selection within heterozygotes. In contrast, a population that is actively avoiding inbreeding would create optimal conditions for selective responses to biased gene conversion.
Another population-level factor is population subdivision, which can also diminish the frequency of heterozygotes through the Wahlund effect (Eq. 6.26). Therefore, biased gene conversion is less effective in subdivided populations. If the population is further subdivided into local demes with small variance and inbreeding effective sizes, heterozygosity is further reduced and biased gene conversions have less opportunity to operate. A contrast of Tables 13.3 and 12.2 reveals that selection at the individual level is unlikely to be overwhelmed or rendered irrelevant by biased gene conversion. Thus, although the target of selection may be below the level of individuals, selection at the individual level and population structure (system of mating, gene flow patterns, and genetic drift) all strongly modulate the evolutionary dynamics of targets of selection below the level of the individual. Selection at the molecular level does not overwhelm but rather interacts with selection at the individual level and with population processes.
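The dependence of Eq. (13.11) on heterozygote frequency can be illustrated with a short deterministic sketch in Python. It assumes GAa = 2pq(1 − f) holds in every generation and ignores drift and individual-level selection; the function names and parameter values are hypothetical choices made for illustration.

```python
def kappa_from_conversion(gamma, beta):
    """Segregation fraction of A from Aa heterozygotes:
    kappa = (1/2)(1 - gamma) + gamma * beta."""
    return 0.5 * (1.0 - gamma) + gamma * beta


def next_allele_frequency(p, kappa, f=0.0):
    """One generation of Eq. (13.11) with G_Aa = 2pq(1 - f)."""
    g_Aa = 2.0 * p * (1.0 - p) * (1.0 - f)
    return p + g_Aa * (kappa - 0.5)


def generations_to_reach(p0, target, kappa, f=0.0, max_generations=100_000):
    """Count generations until the frequency of A first reaches `target`."""
    p = p0
    generations = 0
    while p < target and generations < max_generations:
        p = next_allele_frequency(p, kappa, f)
        generations += 1
    return generations


if __name__ == "__main__":
    kappa = kappa_from_conversion(gamma=0.2, beta=0.75)  # kappa = 0.55
    for f in (-0.2, 0.0, 0.5, 0.9):
        g = generations_to_reach(0.01, 0.99, kappa, f)
        print(f"f = {f:+.1f}: {g} generations for A to go from 0.01 to 0.99")
```

Under these assumptions a positive f slows the spread of the favored allele and a negative f speeds it up, and with f = 1 (no heterozygotes at all) the allele frequency does not change.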

Table 13.3 The fixation probabilities of a new mutant allele A favored by biased gene conversion (uA), a new mutant allele a disfavored by biased gene conversion (ua), and a neutral mutant (u0).

| N | uA | ua | u0 = 1/(2N) | Percent change from neutrality, A | Percent change from neutrality, a |
|---|---|---|---|---|---|
| 50 | 0.0116 | 0.0016 | 0.0100 | 16% | −84% |
| 100 | 0.0102 | 0.0002 | 0.0050 | 104% | −96% |
| 500 | 0.0100 | 0.0000 | 0.0010 | 900% | −100% |
| 1000 | 0.0100 | 0.0000 | 0.0005 | 1900% | −100% |

Note: The parameter κr = 0.01, N is the variance effective population size, and random mating is assumed. The fixation probabilities of a neutral mutant in a population of the same size are also shown, as well as the percent increase in the fixation probability in the selected case compared with the neutral case, 100(u − u0)/u0.
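The entries of Table 13.3 can be recomputed from Eq. (13.12) with a short Python sketch. It treats κr as a single scalar parameter (called `kappa_r` here, a name chosen for illustration) and uses N as the variance effective size, as stated in the table note.

```python
import math


def fixation_probabilities(N, kappa_r):
    """Approximate fixation probabilities from Eq. (13.12).

    Returns (u_A, u_a, u_0): favored mutant, disfavored mutant,
    and neutral mutant in a population of variance effective size N.
    """
    x = 4.0 * N * kappa_r
    u_A = kappa_r / (1.0 - math.exp(-x))
    u_a = -kappa_r / (1.0 - math.exp(x))
    u_0 = 1.0 / (2.0 * N)
    return u_A, u_a, u_0


def percent_change(u, u_0):
    """Percent change in fixation probability relative to neutrality."""
    return 100.0 * (u - u_0) / u_0


if __name__ == "__main__":
    for N in (50, 100, 500, 1000):
        u_A, u_a, u_0 = fixation_probabilities(N, kappa_r=0.01)
        print(f"N = {N:5d}  uA = {u_A:.4f}  ua = {u_a:.4f}  u0 = {u_0:.4f}  "
              f"A: {percent_change(u_A, u_0):+.0f}%  "
              f"a: {percent_change(u_a, u_0):+.0f}%")
```

Because u0 = 1/(2N) shrinks as N grows while uA stays close to κr, the percent gain for the favored allele keeps increasing with population size.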

One of the most commonly observed cases of biased gene conversion is GC-biased gene conversion that favors G over A and C over T at heterozygous nucleotide sites. This type of gene conversion promotes variation in GC content within the genome, with GC rich clusters occurring at locations that have high levels of gene conversion, such as recombinational hotspots (Koester et al. 2012; Capra et al. 2013). Note that this bias in gene conversion can directly counteract the mutational bias associated with methylated cytosine mutagenesis found at CG sites that favors C to T transitions. Hence, antagonistic selective forces can exist even at the molecular level within an individual. In addition, there can be pleiotropic effects of a site subject to biased gene conversion upon individual-level fitness. This is made probable because GC rich areas are also found in and near promoter regions (Koester et al. 2012). Some coding regions are also GC rich, and biased gene conversion in such genes can interact with selection at the individual level. For example, the homeobox gene Pdx1 in the sand rat Psammomys obesus and close relatives has become uniquely divergent and GC rich, probably due to GC-biased gene conversion. Dai and Holland (2019) find that amino acid changes driven by the GC skew resulted in altered protein stability, with a significantly longer protein half-life for sand rat Pdx1. Sand rat Pdx1 is degraded through the ubiquitin proteasome pathway, and these amino acid changes caused the loss of a key ubiquitination site, otherwise conserved throughout vertebrate evolution. The sand rat Pdx1 has evolved to counter this maladaptive change driven by a strong GC skew by developing a new compensatory ubiquitination site. Bittihn and Tsimring (2017) developed a shifting balance model of selection that incorporates gene conversion. In their model, a functional gene is duplicated to create a non-functional pseudogene copy.
As pointed out in Chapter 5, pseudogenes tend to evolve more rapidly than their functional ancestral gene and show higher levels of polymorphism. In their model, the pseudogene serves as a reservoir of new genetic variants, and paralogous gene conversion can sometimes place these new variants and variant combinations back into the functional gene. Their models show that gene conversion with a passive duplicate gene can help circumvent valleys of low fitness and cause a significant speedup of adaptation in a rugged adaptive landscape. This scenario has indeed played out in high-altitude adaptations of the Tibetan wolf (Signore et al. 2019). This wolf population has high hypoxia tolerance due to an unusual β-hemoglobin with high O2 affinity. The critical affinity-enhancing mutations first evolved in a tandemly linked β-hemoglobin pseudogene and were subsequently transferred to the functional β-Hb gene by interparalog gene conversion. Mutations arising in a non-expressed pseudogene have therefore played an important adaptive role in the Tibetan wolf through gene conversion.


Meiotic Drive

Meiotic drive or segregation distortion is another form of selection that targets gametes by distorting the normal 50 : 50 Mendelian segregation ratio. Let the distorted segregation ratio be k : (1 − k) where k = ½ corresponds to normal Mendelian segregation. Note that the selection induced by meiotic drive is exactly the same as that shown in Figure 13.4 and Eqs. (13.11) and (13.12) simply by substituting k for κ. The evolutionary dynamics are similar, but the biological meaning is different. One important difference is that gene conversion often affects only a short stretch of DNA, whereas a locus subject to meiotic drive can influence the entire chromosome on which it resides, although linked loci can be separated from its effects by recombination. There are several distinct mechanisms that can lead to meiotic drive. Female meiosis is inherently asymmetrical, resulting in three haploid nuclei that become polar bodies and one that develops into the oocyte. Underlying this meiotic asymmetry is an asymmetry in the spindle apparatus that segregates the chromosomes in mice (Akera et al. 2017, 2019). Some centromeres have increased their transmission to the gamete by preferentially attaching to the spindle fibers that lead to the oocyte. Males can also show meiotic drive. In the zebra fish Danio rerio, Alavioon et al. (2017) found much phenotypic variation in sperm within a single ejaculate for motility and life-span (as a sperm) and identified several sites throughout the genome that affect these sperm phenotypes. These phenotypes in turn can influence the chances for fertilizing an egg, thereby altering k for some of these genetic variants. There are also killer meiotic drive loci in many eukaryotes that destroy the meiotic products that do not inherit them (Nunez et al. 2018; Eickbush et al. 2019; Keais et al. 2020; Wong and Holman 2020). Just like biased gene conversion, meiotic drive is an evolutionary force that will change allele frequencies. 
The (k − ½) component in Eq. (13.11) (with k instead of κ) is a measure of the fitness deviation of the gamete, that is, the actual amount of meiotic drive relative to normal segregation. Thus, (k − ½) plays a role similar to that of average excess in our standard equations for natural selection at the individual level in Eq. (11.5). In this case, the measure of fitness deviation is assigned directly to the gamete rather than by statistical averaging over individuals. However, the common theme between Eqs. (13.11) and (11.5) is that the response induced by natural selection is funneled through the gametes. Once again, selection operates from the gamete’s perspective, and in this case the gamete has its own direct phenotype. When meiotic drive is the sole source of selection, Eq. (13.11) indicates that an allele favored by meiotic drive (k > ½) should always go to fixation. If the only phenotype shown by an allele were meiotic drive, it should go to fixation after it originated by mutation. Hence, even if alleles with meiotic drive arise commonly, we do not expect to find many meiotic-drive polymorphisms as such alleles should go rapidly to fixation. Yet, we do find several polymorphisms showing meiotic drive, and often the rarer allele is the one favored by meiotic drive. Why do these meiotic drive alleles stay rare but polymorphic? The answer to this question lies in our recurring observation that a single unit of selection can have more than one target of selection, that is, the same genetic unit that is responding to selection can have phenotypic manifestations at more than one level of biological organization. The t complex in mice provides an example of antagonistic selection for different targets of selection (Redkar et al. 2000; Schimenti 2000). The t complex is located in a 20 cM region of chromosome 17 of the mouse genome that constitutes about 1% of the mouse genome.
The t complex normally differs from the non-t state in this region by four different inversions that suppress most but not all recombination in this region. A small central region is not spanned by the inversions. This region contains a large number of candidate genes for sperm motility, capacitation (a process which includes changes in sperm membranes, motility, and metabolism that is essential for subsequent

fertilization), binding to the zona pellucida of the oocyte, binding to the oocyte membrane, and penetration of the oocyte. Extensive epistasis exists between subregions of this complex for many of these sperm-related phenotypes that are subject to intense selective pressures, as we will soon see. The combination of low recombination, extensive epistasis, and intense selection make this large, 20 cM genomic region behave as a single unit of selection. For our purposes, we can treat this complex, multi-locus unit of selection as if it were a single supergene with two alleles, t and T. Male mice that are heterozygous T/t show extreme segregation distortion favoring the t allele, with k values going as high as 0.99 for some t alleles. The t/t homozygotes are frequently lethal, and if they live, t/t males are invariably sterile. Thus, t alleles are strongly favored by meiotic drive within T/t heterozygous males, but t alleles are strongly selected against at the individual level due to their lethality and sterility effects in homozygotes. Note that meiotic drive alone should result in fixation of the t allele through Eq. (13.11), but selection on lethality and sterility of diploid individuals alone should result in fixation of the T allele through Eq. (11.5). Natural populations of mice generally go to neither of these fixation points, instead remaining highly polymorphic even for lethal t alleles. To understand this polymorphic balance of antagonistic targets of selection, we need to see how both these phenotypic levels are filtered through the gamete in going from one generation to the next. As shown in Chapter 11, we measure that aspect of fitness differences among individuals that is transmissible through a gamete by the average excess.
However, meiotic drive alters the allele frequencies even before the fitness effects measured by the average excess occur (remember, the average excess measures the average individual fitness effects the gamete is expected to have in the next generation). In Eq. (13.11) and in Figure 13.4, meiotic drive would alter the allele frequency of the t allele from p to p′. However, that p′ was derived under the assumption that all heterozygotes experience meiotic drive. In the case of the t complex, only male heterozygotes display meiotic drive, so only half of the heterozygotes are affected. Accordingly, p′ in this case equals p + ½GTt(k − ½). Assuming random mating, meiotic drive should change the frequency of t from p to p′ = p + pq(k − ½). Now suppose the relative individual-level fitness of T/T is 1, T/t is 1 − s, and t/t is 0 (a lethal/sterile genotype). Given random draws of gametes from the gene pool already altered to p′ by meiotic drive, the average excess of the t allele for individual-level fitnesses is:

$$a_t = q'\left(1 - s - \bar{w}\right) + p'\left(-\bar{w}\right) \qquad (13.13)$$

where $\bar{w} = q'^2 + 2p'q'(1 - s)$. Selection at the diploid level further alters the allele frequency as given by Eq. (11.5) to $p'' = p' + p'a_t/\bar{w}$. The total change in allele frequency can be written as:

$$\Delta p = p'' - p = (p'' - p') + (p' - p) = \frac{p' a_t}{\bar{w}} + pq\left(k - \tfrac{1}{2}\right) \qquad (13.14)$$

Equation (13.14) clearly reflects the impacts of selection at two levels of biological organization such that one unit of selection (the t complex) has two targets of selection. The first term in Eq. (13.14) reflects selection at the individual level, and as before is proportional to the average individual-level fitness deviation (average excess) of the gamete of interest, the t allele in this case. The second term reflects the impact of selection among gametes within male heterozygotes, that is, meiotic drive. However, one basic property is preserved in both Eqs. (13.14) and (11.5) (the equation incorporating only individual-level selection), namely, the response to natural selection is determined by the fitness effects that are transmissible through a gamete. The first term in Eq. (13.14) has its sign determined by the average excess of the t allele, the phenotypic measure of the fitnesses of diploid bearers that is transmissible by a t bearing gamete. The second


term in Eq. (13.14) has its sign determined by k, a phenotypic measure of meiotic drive that is assigned directly to the t bearing gamete. Hence, both terms of Eq. (13.14) are gametic measures of fitness phenotypes, albeit at two distinct levels of biological organization. We still need to take the gamete’s perspective in order to understand natural selection. Even when natural selection involves multiple levels of selection, the response to selection is always filtered through a gamete. Equation (13.14) also makes clear the antagonistic nature of selection at the gametic and individual levels in this case. For t alleles, k > 1/2, so the second term of Eq. (13.14) is positive for all p between 0 and 1. At the individual level, w is a strictly decreasing function of p, so the only adaptive peak at the individual level is at p = 0. This means that the first term in Eq. (13.14) is always negative for all p between 0 and 1. Hence, selection at the t complex is always going in opposite directions for the two different targets of selection. The equilibrium occurs when the magnitudes of these opposing selective forces are equal. This is shown graphically in Figure 13.5, which plots the two components of selection and reveals the equilibrium as the intersection point of the second component (meiotic drive) with the negative of the first component (individual selection). Figure 13.5 also shows that this equilibrium point is stable. Note that below the equilibrium point the magnitude of meiotic drive increasing p is greater than the magnitude of individual selection decreasing p. Hence, below the equilibrium point meiotic drive overpowers individual selection and increases the frequency of the t allele. However, when above the equilibrium point, the opposite is true, so individual selection overpowers meiotic drive and decreases the frequency of the t allele. Hence, the equilibrium is selectively stable. 
Thus, the widespread polymorphisms of t alleles are no longer a mystery. These polymorphisms simply reflect the balance between the two targets of selection. Note that this polymorphic balance is inexplicable if one knew only of selection at just one biological level. Seemingly nonsensical evolutionary outcomes are possible with a unit of selection that has multiple targets of selection if one pays attention to only a single target of selection.

[Figure 13.5 image: plot with x-axis "Frequency of t Allele" (0 to 1.0) and y-axis "Δp or −Δp" (0 to 0.5). Two curves are shown: the change in allele frequency due to meiotic drive with k = 0.9, and the negative of the change in allele frequency due to individual selection. Their intersection is labeled "Stable Polymorphic Point".]

Figure 13.5 A plot of the two components of change in t allele frequency from Eq. (13.14) with k = 0.9 and s = 0.1. Δp is plotted only for either the meiotic target or the negative of the individual target; hence, the two separate lines. The intersection of these two lines occurs when each component has the same magnitude, and since they are of opposite sign, the combined Δp = 0 at that point.
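The balance between the two terms of Eq. (13.14) can be explored numerically. The following Python sketch iterates the recursion built from Eqs. (13.13) and (13.14) with the parameter values of Figure 13.5 (k = 0.9, s = 0.1), assuming random mating and an infinitely large population; the function names are illustrative assumptions.

```python
def t_allele_step(p, k, s):
    """One generation of Eq. (13.14) for the t allele at frequency p.

    Returns (p_next, selection_term, drive_term), where the two terms
    are the individual-level and meiotic-drive components of delta p.
    Requires 0 < p < 1.
    """
    q = 1.0 - p
    drive_term = p * q * (k - 0.5)           # drive in male heterozygotes only
    p1 = p + drive_term                      # p' after meiotic drive
    q1 = 1.0 - p1
    w_bar = q1 ** 2 + 2.0 * p1 * q1 * (1.0 - s)         # t/t has fitness 0
    a_t = q1 * (1.0 - s - w_bar) + p1 * (0.0 - w_bar)   # Eq. (13.13)
    selection_term = p1 * a_t / w_bar
    return p1 + selection_term, selection_term, drive_term


def iterate_t_frequency(p0, k, s, generations=2000):
    """Iterate the recursion from a starting frequency p0."""
    p = p0
    for _generation in range(generations):
        p, _sel, _drive = t_allele_step(p, k, s)
    return p


if __name__ == "__main__":
    for p0 in (0.01, 0.5, 0.9):
        p_end = iterate_t_frequency(p0, k=0.9, s=0.1)
        _, sel, drive = t_allele_step(p_end, k=0.9, s=0.1)
        print(f"start {p0:.2f} -> {p_end:.4f}  "
              f"(selection {sel:+.4f}, drive {drive:+.4f})")
```

Starting the recursion from frequencies on either side of the polymorphic point and seeing both runs settle on the same value is a simple numerical check of the stability argument made in the text.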

Germline Selection

Mutations can accumulate in the genomes of cells as the germline progresses from one cell generation to the next until culminating in meiosis and the production of gametes. This is particularly true for mtDNA that has a high mutation rate and many copies per cell. As discussed in Chapter 1, mtDNA is maternally inherited in many species, and effectively as a haploid, that is, only one mtDNA type is generally passed on through the oocyte. Yet, because of mutation of mtDNA within an individual, some 45% of human individuals carry heteroplasmic mtDNA sequences at levels greater than 1% of their total mtDNA (Wei et al. 2019). These variants can be subject to selection within the germline cells, an example of germline selection. In addition, there is a severe bottleneck of 7–10 mtDNA copies during the progression to the oocyte endpoint (Zaidi et al. 2019), leading to a severe genetic bottleneck of mtDNA diversity in the oocyte and allowing intracellular genetic drift to have a large impact on what mtDNA haplotype(s) are actually transmitted via the egg. Because of these drift effects, a mother can pass on a different mtDNA genome to her offspring than she has, and different mtDNA genomes to different offspring (Zaidi et al. 2019). Despite this large drift effect, selection also occurs during germline development among these mtDNA variants (Zaidi et al. 2019), such as selection against variants in mitochondrial ribosomal DNA and replacement mutations in protein coding loci, whereas there can be positive selection for noncoding D-loop variants (Wei et al. 2019).
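To get a feel for how strongly a bottleneck of 7–10 mtDNA copies can reshuffle heteroplasmy, the sketch below computes the distribution of variant copies that pass through the bottleneck. It rests on a simplifying assumption made here, not stated in the studies cited: copies are drawn independently (binomial sampling) from the mother's mtDNA pool, with no selection. The function names and the 1% starting frequency are illustrative.

```python
from math import comb


def bottleneck_distribution(h, n):
    """Probability that j of n sampled mtDNA copies carry a variant
    present at heteroplasmy fraction h (assumed binomial sampling)."""
    return [comb(n, j) * h ** j * (1.0 - h) ** (n - j) for j in range(n + 1)]


def probability_variant_lost(h, n):
    """Probability that none of the n sampled copies carries the variant."""
    return (1.0 - h) ** n


if __name__ == "__main__":
    h = 0.01  # variant present at 1% of the mother's mtDNA
    for n in (7, 10):
        dist = bottleneck_distribution(h, n)
        print(f"n = {n}: P(lost) = {dist[0]:.4f}, "
              f"P(at least one copy) = {1.0 - dist[0]:.4f}")
```

Under this assumption, a variant that does pass through the bottleneck jumps from 1% to at least 1/n of the sampled copies, which is one way to see how mother and offspring can differ in their mtDNA.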

Somatic Cells

We have often emphasized that what is important in evolution are the genetic units that can be transmitted through a gamete. Selection among mutational variants arising in germline cells is therefore relevant to evolution, as illustrated by the mtDNA examples given above. Mutations and chromosomal variation also arise in somatic cells, but since these cannot directly contribute to a gamete, should not somatic cell evolution within an individual’s body be irrelevant to evolution? The answer is no for two reasons. First, the trajectories of somatic cell evolution can be strongly influenced by genes inherited through the germline. Second, just as we have noted repeatedly, there are often multiple targets of selection, and somatic cell selection can strongly influence an individual’s fitness. These two factors combine to make intra-individual somatic cell evolution very relevant to a population’s evolution. This conclusion will be illustrated by one of the more dramatic phenotypes that can arise from somatic cell evolution: cancer.

Somatic mutations can occur at each of the trillions of cell divisions in going from a fertilized egg to an adult, large multi-cellular animal. Given that many of these cells are undergoing rapid cell-population growth during much of development, somatic mutations have a high probability of persisting and accumulating, resulting in extensive somatic mosaicism (Shendure and Akey 2015). Inheritance is usually clonal in somatic cells, so hitchhiking affects the entire genome and not just closely linked variants, resulting in mosaic cell lineages embedded in a treelike genealogy (Lodato et al. 2015). Different cell types come to express many diverse phenotypes, but all cells normally must have strictly controlled cell growth, cell death, and differentiation for a multicellular individual to be viable and fertile.
When somatic cell evolution results in a cell lineage that escapes these controls, uncontrolled cell proliferation can occur—the unifying phenotype of all cancers. Mutants leading to cancer are often selected at the cellular level since the cancer cells out-reproduce the controlled cells. However, the fitness consequences of uncontrolled cell growth at the individual level are often extremely deleterious. This in turn leads to strong individual-level selection on genes for redundant cellular control mechanisms and tumor suppression. As a result, selection on somatic cells to escape control is typically a multistage evolutionary process


in which multiple mutations (typically two to eight) are accumulated in a cell lineage to overcome these redundant control mechanisms. In this manner, a cell lineage progresses stepwise toward cancer (Shpak and Lu 2016). This mutational accumulation is sometimes favored by positive somatic cell selection on these mutations even before the development of the cancer phenotype (Choi et al. 2012; Martincorena et al. 2018; Yizhak et al. 2019; Watson et al. 2020). Somatic mutations accumulate more rapidly in sun-exposed skin, esophagus mucosa, and lung, suggesting environmental exposure can make cancer evolution more likely (Yizhak et al. 2019). Inherited germline mutations in the genes that control somatic cells can greatly increase the risk of cancer by reducing the number of somatic mutations needed to induce cancer. These inherited variants are in two classes: loss-of-function mutations in tumor suppressor genes that have an inhibitory role in the cell cycle; and gain-of-function mutations in oncogenes that regulate the mitotic cycle (Shpak and Lu 2016). Loss of DNA repair mechanisms due to either inherited mutations or somatic mutations can increase somatic cell mutation rates and are found in 40–50% of many cancers (Higgins and Boulton 2018). Once a cell lineage has evolved a cancer phenotype, clonal inheritance creates the ideal condition for the evolution of coadapted complexes in the tumor because new somatic mutations are selected specifically on a cancer genetic background. One class of somatic mutations strongly favored by selection on a cancer genetic background are mutations that allow cancer cells from the original tumor to migrate to new locations, forming secondary tumors—a phenotype called metastasis. Metastasis can evolve either early or late or not at all in the original tumor, and it can evolve only once or multiple times in the cancer cell lineage (Turajlic and Swanton 2016). 
Metastasis greatly increases the deleterious effects on individual fitness and is associated with 90% of cancer-related deaths in humans (Gundem et al. 2015). The death of the individual with cancer typically means the death of the cancer cell lineage as well, but there are rare exceptions. Canine transmissible venereal tumor (CTVT) is a cancer that arose in a dog about 11 000 years ago (Murchison et al. 2014; Strakova and Murchison 2015). This cancer affects the genitalia of male and female dogs and the cancer cells themselves can act as infectious agents during coitus in dogs. Effectively, a somatic cell lineage has become an independent infectious organism that is now evolving independently of its host species of origin (Baez-Ortega et al. 2019). Another infectious cancer evolved in the 1990s in Tasmanian devils (Sarcophilus harrisii) that infects the face and is transmitted by bites. A second independent transmissible facial cancer evolved in 2014 in Tasmanian devils. These transmissible cancers have caused an 80% population decline in Tasmanian devils and are a major concern for the conservation of this endangered species (James et al. 2019). Transmissible cancers have also evolved in four bivalve species (Yonemitsu et al. 2019).

Overview of Selection Below the Level of an Individual

In all of the examples of targets of selection below the level of the individual, we see that it is the balance between molecular and population-level forces that determines the evolutionary outcome. A single unit of selection can have targets of selection both at the genomic or cellular levels and at the individual level. All the targets of selection must be considered at the population level to determine the fate of the unit of selection over evolutionary time. Even in those cases where the only target of selection is below the level of the individual, the within-individual processes leading to this target of selection are often constrained by the evolutionary outcomes shaped by population-level evolutionary forces such as system of mating, population subdivision, or genetic drift, as we saw for concerted evolution, gene conversion, and meiotic drive. It is misleading and inappropriate to think of selection below the level of the individual as separate from or independent of evolution at other biological levels.


Targets of Selection Above the Level of the Individual

Many phenotypes emerge at the level of interactions between two or more individuals. If the same types of interactions are recurrent across generations, they will have continuity over time and can be targets of selection. Targets emerging from interactions among individuals are not difficult to find. For example, of the three major fitness components outlined in Chapter 11, two in general are more appropriately assigned to interacting individuals: mating success and fertility/fecundity. An individual in a dioecious species does not truly have the phenotypes of mating success and fertility/fecundity; such phenotypes take on biological reality only in the context of an interaction with another individual. Another common target of selection is intraspecific competition, a phenotype emphasized by Darwin as being important in his theory of natural selection. But competition also takes on biological reality only in the context of interactions among individuals. It is always possible to assign an average marginal phenotype to the individuals engaging in such interactions, but it is more accurate biologically to assign such phenotypes to the interacting individuals. Similarly, individuals who are relatives, particularly in species that have family structures, often interact in complex ways that can affect each other’s fitnesses. As we shall see in this section, qualitatively new properties of natural selection emerge when we assign such interaction phenotypes directly to the set of interacting individuals rather than treating each individual as a separate entity without the context of an interaction.

Sexual Selection

Sexual selection refers to the selection targeting the events that lead up to successful mating or its failure. Many of these events emerge from interactions among individuals and thereby constitute targets of selection above the level of the individual. Such targets under sexual selection are often split into two types:

• Intrasexual selection that arises out of competition between individuals of the same sex for mates, and
• Intersexual selection that arises out of the interactions between individuals of opposite sex that lead to mating or its failure.

As we saw in the previous section, a single unit of selection can have multiple targets of selection, and the same is true for traits under sexual selection. For example, males of the cricket, Gryllus integer, produce a trilled calling song that attracts females, resulting in enhanced mating success (Gray and Cade 1999). Female crickets prefer male calling songs that are close to the mean of the population for the number of pulses per trill. Thus, the sexual selection that emerges from this male/female interaction is stabilizing upon the phenotype of number of pulses. However, gravid females of the parasitoid fly Ormia ochracea use this same song to localize new victims, and indeed prefer the same number of pulses as the female crickets. Once parasitized, the males die in about seven days, so those males with songs that are preferred by females also have lowered viabilities, which selects for males with songs away from the average. As a consequence, the trait of male calling song has targets of selection at the level of mating pairs and at the level of individual viability. In this case, the selective targets are antagonistic and together maintain high levels of genetic variation for male calling song in this species, just as antagonistic targets of selection maintain the polymorphic t complex in mice. Not only can a single unit of selection have multiple targets of selection, but in addition a single target of selection can induce selective responses at multiple units of selection. Targets of sexual


selection frequently are aimed at multiple units of selection because intrasexual and intersexual selection are often interwoven, resulting in conflicts that direct selective responses to different units of selection in the two sexes. For example, males of the butterfly Pieris napi attempt to mate with many females, and male reproductive success typically increases with the number of matings obtained (Wedell 2001). Females also can mate more than once during a reproductive cycle, and if ejaculates overlap, the reproductive success of one or more of the males that mated with a multiply-mated female can be decreased relative to what it would have been with no ejaculate overlap, a phenomenon known as sperm competition and a type of male–male competition. Females of this species undergo a period of nonreceptivity for remating, and the duration of this period has a genetic component. Another component of phenotypic variation in the nonreceptive period is the degree to which the female’s sperm storage organ is filled; the more full this organ, the less willing the female is to remate. Such a response by females normally insures that they mate often enough to always have sperm available. Thus, the genetic component for the duration of the female nonreceptivity period as a response to the amount of stored sperm can serve as a unit of selection arising out of intersexual mating interactions. The phenotypic variation in female receptivity period also creates variation in the amount of intrasexual sperm competition. Male butterflies produce two types of sperm; normal, eupyrene sperm capable of fertilization, and anucleate, apyrene sperm that are incapable of fertilization. Up to 90% of the sperm can be apyrene. Such sperm can provide nutrients to the female that can affect her reproductive success, and such nonfunctional sperm also fill the sperm storage organ. These nonfunctional sperm have a major impact on decreasing the female’s receptiveness to other males. 
Thus, the ejaculate, a major indicator of male success in intersexual selection, has also been strongly shaped by male–male intrasexual competition, but the target of this selection has been the female’s remating behavior. Therefore, the genes underlying the production of nonfunctional sperm in males are one unit of selection and the genes influencing female remating behavior are a second unit of selection.

In Chapter 10, we saw that when phenotypes arise from genetic architectures characterized by epistasis (interactions between loci), the apparent causation of phenotypic variation becomes confounded with frequency (recall Figure 10.19). The same is true when the phenotype of fitness emerges from the interactions of two or more individuals. In general, selection arising out of interactions among individuals is frequency dependent when fitness is averaged over interactors in order to assign a fitness measure to individuals. With sexual selection, the other interactors are other individuals with a given genotype. Hence, the mating success or fertility of an individual is expected to depend upon the attributes of the individual and the frequencies of genotypes in the population. This dependence upon genotype frequency was shown in one of the early empirical estimates of male mating success (Ehrman 1966, 1967). Two inversions, AR and CH, behave as polymorphic alleles in the fruit fly Drosophila pseudoobscura. Experiments were conducted to estimate the male mating success of the three genotypes, AR/AR, AR/CH, and CH/CH. Male mating success was estimated by the number of matings each male genotype achieved divided by the expected number of matings for that genotype under a neutral, random mating model. Mating success was measured in many different populations over a range of genotype frequencies. Anderson (1969) found that the mating success in Ehrman’s data fit well to equations of the form:

$$0.6 + \frac{\alpha_{ij}}{G_{ij}} \tag{13.15}$$

where Gij is the frequency of genotype ij and αij is an empirically derived constant that scales the frequency dependent mating success of males with genotype ij. Note that the mating success of any


given male genotype declines as that genotype becomes more common, because the frequency-dependent term αij/Gij is inversely proportional to genotype frequency; this result is often described as a rare male mating advantage. The α values obtained for these three inversion genotypes were 0.2 for AR/AR, 0.1 for AR/CH, and 0.05 for CH/CH. We will generalize these α values to x, y, and z, respectively, for the three genotypes. Note that the AR/CH heterozygote α value appears intermediate between the α values of the two homozygotes (codominance), and it seems that there ought to be directional selection favoring the AR/AR genotype because it has the largest α term. A graph of male mating success using these values also shows that, at any given genotype frequency, the AR/AR genotype has the highest fitness and the CH/CH genotype has the lowest (Figure 13.6). This graph also seems to imply that there should be directional selection favoring the AR/AR genotype. However, the three male genotypes rarely have equal frequencies in a population, so this seeming superiority of AR/AR can be violated in many populations.

To see this, consider the following model. We assume that female fitness is not affected by this polymorphism. Thus, we set the relative fitness of all female genotypes equal to one. We assume the three male genotype fitnesses are as given above. Then, the average female fitness is one, and the average male fitness (noting that the three male genotype frequencies must sum to one) is

$$\bar{w}_{\text{males}} = G_{AR/AR}\left(0.6 + \frac{x}{G_{AR/AR}}\right) + G_{AR/CH}\left(0.6 + \frac{y}{G_{AR/CH}}\right) + G_{CH/CH}\left(0.6 + \frac{z}{G_{CH/CH}}\right) = 0.6 + x + y + z \tag{13.16}$$

which equals 0.95 for the empirical values of x, y, and z given earlier. Assuming a 50 : 50 sex ratio, the average fitness for the population over these three genotypes is 1/2(1) + 1/2(0.95) = 0.975. Note that in this case the average fitness is a constant over all possible genotype frequencies. Recall from Chapter 11 Fisher’s fundamental theorem of natural selection and the concept of an adaptive landscape. Because the average fitness is a constant over the entire genotypic space, there is no fitness peak or optimal “target,” and hence there is no possibility that selection of any sort can increase average fitness. Does this mean that there is no selection, or that selection exists in this case but in a manner that violates, or at least makes irrelevant, Fisher’s fundamental theorem?

Figure 13.6 A plot of male reproductive success for the three genotypes AR/AR, AR/CH, and CH/CH in Drosophila pseudoobscura as a function of male genotype frequency (x-axis: male genotype frequency; y-axis: male mating success). Source: Based on Anderson (1969).
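The constancy of average male fitness in Eq. (13.16) can be checked numerically. The following is a minimal sketch, assuming the α values quoted in the text (0.2, 0.1, and 0.05); the function names `mating_success` and `mean_male_fitness` and the sample genotype frequencies are hypothetical choices made only for illustration.

```python
# Illustrative sketch of Eqs. (13.15)-(13.16).
# Function names and sample frequencies are hypothetical, not from the source.

ALPHA = {"AR/AR": 0.2, "AR/CH": 0.1, "CH/CH": 0.05}  # alpha values quoted in the text


def mating_success(alpha, freq):
    """Frequency-dependent male mating success, 0.6 + alpha / G (Eq. 13.15)."""
    return 0.6 + alpha / freq


def mean_male_fitness(freqs):
    """Average male fitness: sum over genotypes of G_ij * (0.6 + alpha_ij / G_ij)."""
    return sum(g * mating_success(ALPHA[k], g) for k, g in freqs.items())


if __name__ == "__main__":
    # Arbitrary genotype frequencies (each set sums to one).
    samples = [
        {"AR/AR": 0.5, "AR/CH": 0.4, "CH/CH": 0.1},
        {"AR/AR": 0.1, "AR/CH": 0.3, "CH/CH": 0.6},
        {"AR/AR": 0.98, "AR/CH": 0.01, "CH/CH": 0.01},
    ]
    for freqs in samples:
        # Individual genotype fitnesses change a great deal; their mean does not.
        per_genotype = {k: round(mating_success(ALPHA[k], g), 3) for k, g in freqs.items()}
        print(per_genotype, "mean =", round(mean_male_fitness(freqs), 6))
```

Although the fitness assigned to each genotype varies widely across the sample frequencies, the weighted mean stays at 0.6 + x + y + z, which is the flat adaptive landscape described above.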


To answer these questions, we will simplify the model by assuming Hardy–Weinberg genotype frequencies within each generation (although we do not assume constant allele frequencies across generations). Obviously, mating is not totally random in this case, but the large deviations from random mating expectations occur only for very rare genotypes (Figure 13.6), so we can approximate the overall population genotype frequencies within each generation by the Hardy–Weinberg expectations. Despite the flat adaptive landscape, we can see if evolution is occurring due to sexual selection by seeing if the allele frequency is constant. Because there are no fitness differences among the females, there is no change in allele frequency in the half of the gametes contributed by females. Using the model shown in Figure 11.1 with all fitness components set to one except mating success, where we use mij = 0.6 + αij/Gij, and using Hardy–Weinberg genotype frequencies, the allele frequency p′ of AR in the male half of the gamete pool, starting from an initial frequency of p, is:

$$p' = \frac{p^2\left(0.6 + \dfrac{x}{p^2}\right) + pq\left(0.6 + \dfrac{y}{2pq}\right)}{\bar{w}_{\text{males}}} = \frac{0.6p + x + y/2}{0.6 + x + y + z} \tag{13.17}$$

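Hence, the change in the allele frequency in the male-derived gametes is given by:

$$\begin{aligned}
\Delta p = p' - p &= \frac{0.6p + x + y/2}{0.6 + x + y + z} - p \\
&= \frac{x + y/2 - p(x + y + z)}{0.6 + x + y + z} \\
&= \frac{p}{\bar{w}_{\text{males}}}\left[ p\left(0.6 + \frac{x}{p^2}\right) + q\left(0.6 + \frac{y}{2pq}\right) - \bar{w}_{\text{males}} \right] = \frac{p\, a_{AR}}{\bar{w}_{\text{males}}}
\end{aligned} \tag{13.18}$$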

where aAR is the male average excess of the AR allele for the phenotype of male mating success (fitness), as given in the brackets of Eq. (13.18). Note that in general Δp ≠ 0, so the fitness differences associated with male mating success are indeed inducing evolution via natural selection. Hence, Fisher’s fundamental theorem is violated in this case because selection is inducing evolutionary change but with no change in average fitness. Such a violation of Fisher’s theorem is a general property of frequency-dependent fitness models, and this constitutes a major limitation of this theorem as frequency-dependent processes arise commonly. However, note that Eq. (13.18) is simply a special case of Eq. (11.5), so the fundamental theorem of natural selection for measured genotypes is not violated. Once again we see that selection is driven from the gamete’s point of view even though the target of selection is at the level of interacting individuals. Hence, the measured genotype approach to selection described by Eq. (11.5) is applicable to a broader array of selective situations than Fisher’s fundamental theorem.

Because Eq. (13.18) is a special case of Eq. (11.5) and because no allele frequency change is induced in gametes derived from females in this model, the selective equilibrium defined by the variation in male mating success occurs when the average excess of male mating success is 0. Setting the bracketed term in Eq. (13.18) to 0, which is equivalent to setting x + y/2 − p(x + y + z) = 0, and solving for p, the equilibrium allele frequency is:

$$p_{eq} = \frac{x + y/2}{x + y + z} \tag{13.19}$$

Using the empirically derived values given earlier, the selective equilibrium in the D. pseudoobscura population should be 0.71. To see if this equilibrium is stable, Eq. (13.18) tells us we need to look at


the sign of the average excess of mating success above and below the equilibrium frequency. If p = 0.75, then the male aAR = −0.017 and the “AR” allele decreases back to the equilibrium point. Likewise, when p = 0.6, then the male aAR = 0.067 and selection increases the “AR” allele up toward the equilibrium point. Hence, there is a stable, balanced equilibrium associated with this selection. From the gamete’s point of view as measured by average excesses, there is an adaptive peak (Curtsinger 1984), and the gametes are on this peak when p = 0.71. However, this peak is not “visible” when we assign the fitnesses to the individuals, a process which results in a flat adaptive surface with a constant average fitness of 0.975. Indeed, this selective equilibrium that has no impact on average fitness seems even more peculiar when we look at the actual fitness values assigned to individuals. Substituting the Hardy–Weinberg genotype frequencies at p = 0.71 into Eq. (13.15) with the empirical α values, the equilibrium fitnesses assigned to the three male genotypes are 0.997 for AR/AR, 0.843 for AR/CH, and 1.195 for CH/CH. Note that at equilibrium the male mating success of the heterozygote is lower than that of either homozygote. Normally, this would suggest an unstable equilibrium state, yet the average excess calculations show that this is a stable equilibrium. The average fitness of the population and the marginal fitnesses of individual genotypes are either uninformative or, worse, misleading when trying to understand the selective equilibrium and dynamics that emerges from frequency-dependent mating success. Targets of selection above the level of the individual lead to many paradoxes when we try to assign the fitnesses to individuals rather than interacting sets of individuals. The target of selection is not at the individual level in this case, and this is what results in the strange behavior of marginal fitnesses assigned to the individual level. 
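The numbers quoted in this discussion can be reproduced with a short calculation. This is a sketch only, assuming the Hardy–Weinberg approximation and the α values used in the text; the function names are hypothetical, and the iteration rule in `iterate` (the zygote allele frequency moves by half of the male-gamete change because females transmit p unchanged) is a modeling assumption added here for illustration.

```python
# Sketch of Eqs. (13.17)-(13.19) under the Hardy-Weinberg approximation.
# Function names are hypothetical; alpha values are those quoted in the text.

X, Y, Z = 0.2, 0.1, 0.05       # alpha values for AR/AR, AR/CH, CH/CH
W_MALES = 0.6 + X + Y + Z      # constant average male fitness, Eq. (13.16)


def genotype_fitnesses(p):
    """Male mating success of AR/AR, AR/CH, CH/CH at allele frequency p (Eq. 13.15)."""
    q = 1.0 - p
    return (0.6 + X / p**2, 0.6 + Y / (2 * p * q), 0.6 + Z / q**2)


def average_excess(p):
    """Bracketed term of Eq. (13.18): average excess of AR for male mating success."""
    q = 1.0 - p
    w_arar, w_arch, _ = genotype_fitnesses(p)
    return p * w_arar + q * w_arch - W_MALES


def delta_p_males(p):
    """Change in AR frequency among male-derived gametes, Eq. (13.18)."""
    return p * average_excess(p) / W_MALES


# Allele frequency at which the average excess is zero.
P_EQ = (X + Y / 2) / (X + Y + Z)


def iterate(p, generations):
    """Assumption: females transmit p unchanged, so zygotes move by half the male change."""
    for _ in range(generations):
        p += 0.5 * delta_p_males(p)
    return p


if __name__ == "__main__":
    print("equilibrium p:", round(P_EQ, 3))
    for p in (0.6, 0.75):
        print("p =", p, "average excess =", round(average_excess(p), 3))
    print("fitnesses at p = 0.71:", [round(w, 3) for w in genotype_fitnesses(0.71)])
    print("p after 200 generations from 0.1:", round(iterate(0.1, 200), 3))
```

The sign of the average excess flips on either side of the equilibrium, which is why the equilibrium is stable even though the heterozygote has the lowest assigned fitness there.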
Only by taking the gamete’s perspective as measured by average excesses can we successfully predict selective equilibria and stability properties. Curtsinger (1984) shows that these roles of the average excess are true in general for frequency-dependent selection, as well as for lower level targets of selection such as meiotic drive. Thus, the key to understanding the evolutionary response to natural selection remains in taking a gametic perspective regardless of the target of selection. When a target of selection is below the level of the individual, gametes can have selectable phenotypes directly or can be the direct bearers of the consequences of this lower level selection. Only when a DNA element can be transmitted to future generations in a vehicle other than a gamete (such as some transposons) can the importance of the gamete be circumvented. Chapters 11 and 12 showed that individual-level selection is determined by the gamete’s average excess and that selective responses cannot be made just from a knowledge of individual fitnesses. Now we see that even when fitness emerges from interacting sets of individuals, the average excess still retains its selective importance. However, even average excesses have limits on evolutionary predictability when there is frequency dependent selection. For example, Priklopil (2012) modeled a mate choice model of sexual selection, and as sexual preferences became strong, the evolutionary dynamics became chaotic. Chaotic behavior can arise in nonlinear systems such as frequency-dependent systems. Although chaos is deterministic, its unpredictable behavior arises from even extremely minor differences in initial conditions and the exact values of subsequent states, which cannot be measured accurately enough to avoid unpredictability. Another consideration is pleiotropy. Traits under selection often affect other traits that influence an individual’s fitness. 
As we saw with the sickle-cell example in Chapter 11, it is essential to consider all of the pleiotropic fitness effects in order to predict or account for the evolutionary trajectory. For example, Johnston et al. (2013) found strong sexual selection for horn size in wild Soay sheep such that large horns confer an advantage in male–male competition. Most of the genetic variation in horn size in this population is associated with a single gene, the relaxin-like receptor 2, which has two alleles: Ho+, associated with larger horns, and HoP, associated with smaller horns.


Their surveys revealed that Ho+ is also associated with higher male reproductive success, as expected from the observed pattern of male–male sexual competition. However, they also found that the HoP allele is significantly associated with increased survival, such that smaller horns increase male viability. The result of these two antagonistic pleiotropic effects is the maintenance of the Ho+/HoP polymorphism in this population, which could not be explained by either sexual selection or viability selection alone. This serves as a warning that sexually selected traits must always be studied in a broader context. Wright’s long-standing dictum that pleiotropy is universal should never be forgotten.

Fertility/Fecundity

Fertility and fecundity are fitness components that emerge inherently from the interaction of a mating pair of individuals. A non-selfing individual cannot display these phenotypes except in the context of an interaction with a mate. For example, in humans, the ABO blood group locus has three common alleles that determine one’s ABO blood type. Knowing one’s ABO blood type is clinically important because transfusing donor blood bearing an antigen (A and/or B) not found in the recipient can create a potentially lethal immunological reaction in the recipient. If there is leakage of blood across the placenta, a mother can mount an immunological reaction against a developing fetus that bears antigens not found in the mother (recall Table 12.1). This results in a significant increase in the rate of spontaneous abortions in those couples that have ABO genotypes that yield ABO maternal–fetal incompatible combinations (Takano and Miller 1972). In the ABO system, the same individual genotype can have drastically different fertilities in the context of different mates. For example, a type O woman married to a type O man would not be at risk for any ABO induced spontaneous abortions, but a type O woman married to a type AB man would have every pregnancy at risk. Therefore, fertility and fecundity are not individual attributes but rather are attributes of a mating pair of individuals.

To model fertility or fecundity as a target of selection on mating pairs, we simplify our model as before by assuming random mating and a within-generation approximation to Hardy–Weinberg genotype frequencies. Table 13.4 gives a one locus, two-allele (A and a) model in a random mating population by inserting a fertility component into the original Weinberg model given in Table 2.3. Note that fertility is attributed directly to a mating pair rather than an individual in Table 13.4.
Note further that the fertilities assigned to the true target of selection are not frequency dependent, but rather are constants. However, at the bottom of the table, effective marginal fertilities (the bij’s) are assigned to the individual genotype ij (i and j can equal A or a) in order to predict the genotype frequencies in the next generation. These individual-level or marginal genotype “fertilities” are frequency dependent. Table 13.4 makes it explicit that the appearance of frequency-dependent fitness at the level of the individual is a result of mapping the constant fitnesses of a target of selection at mating pairs onto the individual level. In such a fitness mapping, the frequencies of the interactors become confounded with apparent individual-level fitness causation. Once the frequency dependent bij’s have been defined and assigned to individual-level genotypes, the selective response shown in the row of genotype frequencies in the next generation looks similar to the constant fitness models given in Chapter 11, so once again we can use the normal average excesses of the bij’s to calculate the change in allele frequency and determine potential equilibria and their stabilities. However, a closer examination of the forms of these bij’s in Table 13.4 reveals another complexity that can arise when the target of selection is above the level of the individual. Biologically, the bij’s are the average fertilities of the mating pairs (the targets of selection) that can give rise to an offspring of genotype ij weighted by the probability of that mating pair times the

Table 13.4 A model of fertility in a random mating population with two alleles (A and a) at an autosomal locus with an initial frequency p of the A allele.

                                                      Mendelian Probabilities of Offspring (Zygotes)
Mating Pair    Frequency                 Fertility    AA      Aa      aa
AA × AA        p² × p² = p⁴              b1           1       0       0
AA × Aa        2(p² × 2pq) = 4p³q        b2           1/2     1/2     0
AA × aa        2(p² × q²) = 2p²q²        b3           0       1       0
Aa × Aa        2pq × 2pq = 4p²q²         b4           1/4     1/2     1/4
Aa × aa        2(2pq × q²) = 4pq³        b5           0       1/2     1/2
aa × aa        q² × q² = q⁴              b6           0       0       1

Genotype Frequency in Next Generation:

$$\text{AA:}\quad \frac{p^4 b_1 + 2p^3 q\, b_2 + p^2 q^2 b_4}{\bar{b}} = \frac{p^2\, b_{AA}}{\bar{b}}$$

$$\text{Aa:}\quad \frac{2p^3 q\, b_2 + 2p^2 q^2 b_3 + 2p^2 q^2 b_4 + 2pq^3 b_5}{\bar{b}} = \frac{2pq\, b_{Aa}}{\bar{b}}$$

$$\text{aa:}\quad \frac{p^2 q^2 b_4 + 2pq^3 b_5 + q^4 b_6}{\bar{b}} = \frac{q^2\, b_{aa}}{\bar{b}}$$

Where the bij =

$$b_{AA} = p^2 b_1 + 2pq\, b_2 + q^2 b_4$$

$$b_{Aa} = p^2 b_2 + 2pq\left(\tfrac{1}{2} b_3 + \tfrac{1}{2} b_4\right) + q^2 b_5$$

$$b_{aa} = p^2 b_4 + 2pq\, b_5 + q^2 b_6$$

Note: Sex-dependent effects are ignored, so matings such as AA × Aa and Aa × AA are regarded as equivalent and hence their frequencies are added. The average fertility of the mating pairs as weighted by their frequencies is given by

$$\bar{b} = p^4 b_1 + 4p^3 q\, b_2 + 2p^2 q^2 b_3 + 4p^2 q^2 b_4 + 4pq^3 b_5 + q^4 b_6 .$$


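Mendelian probability of that mating pair producing genotype ij. In contrast, consider the average fertilities of the individuals with genotype ij under random mating, written here with an overbar as $\bar{b}_{ij}$ to distinguish them from the $b_{ij}$ of Table 13.4:

$$\begin{aligned}
\bar{b}_{AA} &= p^2 b_1 + 2pq\, b_2 + q^2 b_3 \\
\bar{b}_{Aa} &= p^2 b_2 + 2pq\, b_4 + q^2 b_5 \\
\bar{b}_{aa} &= p^2 b_3 + 2pq\, b_5 + q^2 b_6
\end{aligned} \tag{13.20}$$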
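The difference between the marginal fertilities of Table 13.4 and the individual average fertilities of Eqs. (13.20) can be seen with a small numerical sketch. The fertility values b1 to b6 and the allele frequency used below are hypothetical, chosen only for illustration, and the function names are assumptions of this sketch rather than anything defined in the text.

```python
# Sketch of Table 13.4 and Eqs. (13.20).
# Fertilities b1..b6 and all function names are hypothetical illustrations.

MENDEL = [  # offspring probabilities (AA, Aa, aa) for each mating pair
    (1.0, 0.0, 0.0),    # AA x AA
    (0.5, 0.5, 0.0),    # AA x Aa
    (0.0, 1.0, 0.0),    # AA x aa
    (0.25, 0.5, 0.25),  # Aa x Aa
    (0.0, 0.5, 0.5),    # Aa x aa
    (0.0, 0.0, 1.0),    # aa x aa
]


def pair_frequencies(p):
    """Random-mating frequencies of the six mating pairs (sex ignored)."""
    q = 1.0 - p
    return [p**4, 4 * p**3 * q, 2 * p**2 * q**2, 4 * p**2 * q**2, 4 * p * q**3, q**4]


def mean_pair_fertility(p, b):
    """Average fertility of mating pairs weighted by their frequencies."""
    return sum(f * bk for f, bk in zip(pair_frequencies(p), b))


def next_genotype_frequencies(p, b):
    """Genotype frequencies (AA, Aa, aa) among offspring, following fertility through pairs."""
    pairs = pair_frequencies(p)
    mean_b = mean_pair_fertility(p, b)
    return tuple(
        sum(f * bk * m[g] for f, bk, m in zip(pairs, b, MENDEL)) / mean_b
        for g in range(3)
    )


def marginal_fertilities(p, b):
    """The b_ij of Table 13.4: fertilities of pairs that can produce offspring genotype ij."""
    q = 1.0 - p
    b1, b2, b3, b4, b5, b6 = b
    return (
        p * p * b1 + 2 * p * q * b2 + q * q * b4,
        p * p * b2 + 2 * p * q * (0.5 * b3 + 0.5 * b4) + q * q * b5,
        p * p * b4 + 2 * p * q * b5 + q * q * b6,
    )


def individual_average_fertilities(p, b):
    """Eqs. (13.20): average fertility of parents of genotype ij over their random mates."""
    q = 1.0 - p
    b1, b2, b3, b4, b5, b6 = b
    return (
        p * p * b1 + 2 * p * q * b2 + q * q * b3,
        p * p * b2 + 2 * p * q * b4 + q * q * b5,
        p * p * b3 + 2 * p * q * b5 + q * q * b6,
    )


if __name__ == "__main__":
    b = (1.0, 0.8, 0.5, 1.2, 0.9, 1.0)  # hypothetical pair fertilities
    p = 0.3
    print("marginal b_ij:      ", [round(v, 3) for v in marginal_fertilities(p, b)])
    print("individual averages:", [round(v, 3) for v in individual_average_fertilities(p, b)])
    print("next generation:    ", [round(v, 4) for v in next_genotype_frequencies(p, b)])
```

Only the marginal values reproduce the offspring genotype frequencies obtained by following each mating pair through Mendelian segregation; the individual averages describe the parents but not what they transmit.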

Note that the average fertility of an individual genotype is the fertility of that genotype when mated to another individual times the genotype frequency of that other individual. By comparing Eqs. (13.20) with the equations for the bij’s in Table 13.4, there is not a single case in which bij = bij. Thus, the fertilities that we assign to individual genotypes to predict the response to fertility selection are not the average fertilities of the individual genotypes. It is only the average fertilities of the mating pairs that matter, and this can involve mating pairs that do not even include the focal genotype (e.g. note in Table 13.4 that bAA is a function in part of b4, the fertility of the Aa × Aa mating pair, because this mating pair can produce AA offspring). Unfortunately, almost all empirical measurements of fertility are based on average individual fertilities, which are not the relevant measures of the selective response of this target of selection. Virtually the entire literature on fertility or fecundity as a fitness component makes this mistake. The discrepancy between the bij’s and the bij’s illustrates the difficulty in assigning an individual fitness measure when the target of selection is above the level of the individual. This problem cannot in general be addressed simply by looking at the average attribute of an individual as it interacts with other individuals in the population. Sometimes that works, but sometimes it does not, as shown by the important fitness component of fertility. The heart of this discrepancy relates to a common theme we have noted for selection in many different contexts: the necessity of taking the perspective of a gamete in predicting the response to selection. Note that the bij’s are defined in terms of the mating pairs that yield a particular type of offspring in the next generation, that is, the bij’s measure fertility in a manner that requires the passage of a gamete from one generation to the next. 
In contrast, the $\bar{b}_{ij}$’s of Eqs. (13.20) are the average fertilities of the genotypes in the parental generation and do not take into account what will be passed on to the next generation. Only the $b_{ij}$’s of Table 13.4 are relevant for natural selection. Note that we have a further weakening of the idea that “natural selection is survival of the fittest”: natural selection on fertility does not necessarily favor the most fertile individuals because of the discrepancy between the $b_{ij}$’s and the $\bar{b}_{ij}$’s. Eq. (11.5) tells us that selection favors the fitter gametes, not the fittest individuals. At least when the individual was the target of selection, it was the average fitness of individuals that contributed to the average excess (Chapter 11). However, for fertility the average individual fertility of a genotypic class does not even contribute to the relevant average excess; rather, the $b_{ij}$’s are the genotypic values that predict the response to selection. Therefore, it is critical that the fitnesses first be assigned to the true target and then followed through the gametes to the next generation, as is done in Table 13.4. As shown in Eq. (11.5), the ultimate gametic measure for determining selective response is the average excess. Table 13.4 shows that the selective response is driven by the average excesses of the $b_{ij}$’s, not the $\bar{b}_{ij}$’s.

Figure 13.7 shows a plot of the average excess for a special case of the model given in Table 13.4. As can be seen, the average excess of the A allele is zero at three intermediate values of p. An examination of the sign of the average excess on either side of these three potential equilibrium points shows that only the middle one, with an equilibrium p at 0.6, is stable. Moreover, both loss (p = 0) and fixation (p = 1) are stable equilibria, for a total of three stable equilibria. In general, frequency dependent selection results in many potential equilibrium points, some of which are stable and some of which are not.
Therefore, the outcome of natural selection is often sensitive to the initial conditions in frequency dependent models.

[Figure 13.7, plot not reproduced: average fertility over all mating pairs (axis labeled W, 0.75 to 1.0) and the average excess of the A allele (axis labeled aA) against p from 0 to 1, with one stable and two unstable intermediate equilibria marked.]

Figure 13.7 A plot of average fertility over all mating pairs and average excess across p for the special case of b1 = 1, b2 = 0.7, b3 = 0.8, b4 = 1.3, b5 = 0.7, and b6 = 0.75 based on the model given in Table 13.4.

Figure 13.7 also shows another attribute of frequency dependent selection that has already been mentioned: average fitness is not necessarily maximized. The average fertility of all mating pairs is also plotted across p in Figure 13.7. As can be seen, average fertility is maximized in this special case at p = 1, one of the potential equilibrium points. Note that average fertility is not maximized at the stable polymorphic equilibrium of p = 0.6, and average fertility is close to its minimum value at the stable equilibrium of p = 0. This means that selection can take the population away from the p that maximizes average fertility and instead take the population to a much lower average fertility. Once again, selection favors neither the fittest individuals nor the fittest populations, and this is particularly true when dealing with targets of selection above the level of the individual. What remains true is that selection favors those gametes with positive average excesses, although the phenotypic deviations that go into the average excess must be defined with great care when dealing with targets above the level of the individual.
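The qualitative features described for Figure 13.7 can be checked numerically. The sketch below is a minimal illustration, assuming that Table 13.4 is the standard random-mating fertility recursion (mating-pair frequency times Mendelian probability times pair fertility, normalized by the average fertility of all mating pairs); the function names are hypothetical, and the fertilities are those listed in the Figure 13.7 caption.

```python
# Mating-pair fertilities from the Figure 13.7 caption (assumed pair order):
# b1: AA x AA, b2: AA x Aa, b3: AA x aa, b4: Aa x Aa, b5: Aa x aa, b6: aa x aa
B = (1.0, 0.7, 0.8, 1.3, 0.7, 0.75)


def offspring_weights(p, b=B):
    """Unnormalized AA, Aa, aa offspring contributions under random mating."""
    q = 1.0 - p
    b1, b2, b3, b4, b5, b6 = b
    AA = p**4 * b1 + 2 * p**3 * q * b2 + p**2 * q**2 * b4
    Aa = 2 * p**3 * q * b2 + 2 * p**2 * q**2 * (b3 + b4) + 2 * p * q**3 * b5
    aa = p**2 * q**2 * b4 + 2 * p * q**3 * b5 + q**4 * b6
    return AA, Aa, aa


def mean_fertility(p, b=B):
    """Average fertility over all mating pairs at allele frequency p."""
    return sum(offspring_weights(p, b))


def delta_p(p, b=B):
    """Change in the frequency of A over one generation of fertility selection."""
    AA, Aa, aa = offspring_weights(p, b)
    return (AA + Aa / 2) / (AA + Aa + aa) - p


def interior_sign_changes(b=B, n=100):
    """Locate interior equilibria as sign changes of delta_p on a grid.

    A change from positive to negative marks a stable point; the reverse
    marks an unstable point. Returns (midpoint, kind) pairs.
    """
    grid = [i / n for i in range(1, n)]
    found = []
    for lo, hi in zip(grid, grid[1:]):
        d_lo, d_hi = delta_p(lo, b), delta_p(hi, b)
        if d_lo * d_hi < 0:
            found.append(((lo + hi) / 2, "stable" if d_lo > 0 else "unstable"))
    return found


if __name__ == "__main__":
    for point, kind in interior_sign_changes():
        print(f"p near {point:.3f}: {kind}, mean fertility {mean_fertility(point):.3f}")
    print("mean fertility at p = 0 and p = 1:", mean_fertility(0.0), mean_fertility(1.0))
```

Under these assumptions the scan reports an unstable, a stable, and an unstable interior equilibrium in that order, with the stable one close to p = 0.6 and with lower average fertility there than at p = 1.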

Competition and Cooperation

Darwin identified competition in the “struggle for existence” as a major source of natural selection. Competition inherently involves interactions among individuals who are competing for some resource, and indeed we already saw a form of this in the competition for mates. When two or more individuals compete with one another, what matters is their competitive abilities relative to one another. Indeed, the very phenotype of competitive ability cannot exist at all except in the context of interacting individuals. Hence, the phenotype of competitive ability constitutes another target of selection above the level of the individual.

Cockerham et al. (1972) presented a simple one-locus, two-allele model of competition involving pairs of individuals that randomly encounter one another in a random mating population. This model can also be used for sexual selection arising from competition for mates. The genotype frequencies within a generation are assumed to be in Hardy–Weinberg proportions, and the frequency of an individual competing with another individual of a specified genotype is directly proportional to the frequency of that genotype under the random encounter assumption. A marginal fitness effect emerges out of these competitive interactions, as given in Table 13.5. As in the previous models, the fitness components emerging from the interaction of competing individuals are constants when assigned to the proper target of selection. To predict the evolutionary response to competitive selection, we need to put this model into the terms found in Eq. (11.5); that is, we need to take the gamete’s perspective. Unlike the case of fertility, Cockerham et al. (1972) show that the average genotypic values of the competitive fitness interactions that an individual experiences are the relevant projections to individual-level fitness in this case. As shown in Table 13.5, these marginal genotypic values are frequency dependent (the wk’s in Table 13.5). Thus, competitive selection is inherently frequency dependent. As a consequence, selection under competition has the same properties we saw before: average fitness is not maximized, Fisher’s fundamental theorem is violated, projections to individual-level fitness can be misleading, and multiple equilibria exist, some stable and some unstable.

Table 13.5 The competition model of Cockerham et al. (1972).

Genotype              AA     Aa     aa
Competing with AA     w22    w12    w02
Competing with Aa     w21    w11    w01
Competing with aa     w20    w10    w00

Marginal fitness:
  AA: w2 = p²w22 + 2pqw21 + q²w20
  Aa: w1 = p²w12 + 2pqw11 + q²w10
  aa: w0 = p²w02 + 2pqw01 + q²w00

Note: A random mating population is assumed with one locus with two alleles (A and a) with p = the frequency of A. When individuals of genotype i (i = 0 is aa, i = 1 is Aa, and i = 2 is AA) compete with individuals of genotype j, the fitness consequence to genotype i is given by wij. Marginal fitness, wk, is the average fitness of genotype k over all competitive interactions, not the average population fitness. Source: Cockerham et al. (1972). © 1972, University of Chicago Press.
Indeed, frequency dependent models can yield such complex dynamics for allele frequency changes that analytical solutions are extremely difficult even for relatively simple models (Gavrilets and Hastings 1995). Thus, Cockerham et al. (1972) analyzed their model using the concept of a protected polymorphism. A protected polymorphism occurs when at least one allele at a locus is favored by natural selection when very rare and selected against when very common. The conditions for protection can be derived from either Eq. (11.5) or the average excess of fitness by taking the limits as the allele frequency approaches 0 and 1. For protection, the change in allele frequency predicted from Eq. (11.5) or the average excess must be positive as p → 0 (that is, selection causes the allele frequency to increase when it is very rare), and it must be negative as p → 1 (that is, selection causes the allele frequency to decrease when it is very common). In such a case, natural selection protects the allele from going to fixation (p = 1) or loss (p = 0) and thereby favors the maintenance of a polymorphism (and in a model with no drift such as given here, ensures polymorphism).

For example, in Chapter 11 we discussed the balanced polymorphism of sickle-cell anemia in a malarial region. As can be seen from Figure 11.2, the average excess of fitness of the S allele is positive when the frequency of the S allele is small, but it is negative when that frequency is large. Hence, this balanced polymorphism is also a protected polymorphism. However, the two concepts are not identical. A balanced polymorphism requires a stable intermediate allele frequency such that Δp > 0 when p is just below the equilibrium allele frequency and Δp < 0 when p is just above the equilibrium allele frequency (as in Figure 11.2). In contrast, the protected polymorphism concept focuses exclusively on Δp for p close to 1 and 0; it is not concerned with potential intermediate equilibrium points and their stability. With protected polymorphisms, we look exclusively at the instability of the potential equilibrium points of p = 0 and p = 1.

The differences between balanced and protected polymorphisms can be illustrated with the Cockerham et al. (1972) competitive model. First, consider the conditions for protection in this model. All the information needed to calculate the average excesses and Eq. (11.5) is given in Table 13.5. The limits of Eq. (11.5) are then taken. These limits are easily determined when there is no dominance or recessiveness. When p is close to 0, the random mating population will consist almost exclusively of aa homozygotes, so the average fitness of the population will converge to w00, the fitness of aa individuals competing with other aa individuals.
Similarly, the marginal fitness of aa individuals is w00 because virtually all of the competitive interactions of aa individuals are with other aa individuals in such a population. As long as p is extremely small but not equal to 0, the few A alleles in this population under random mating will be found almost exclusively in Aa heterozygotes, who in turn will be competing almost always with aa individuals. Hence, the marginal fitness of Aa individuals for p close to 0 is w10. The average excess of A when rare is therefore:

$$
a_A \approx w_{10} - w_{00}
\tag{13.21}
$$

that is, only the heterozygote fitness genotypic deviation of Aa is important when p → 0. When p → 1, it is easier to look at the average excess of the a allele (p → 1 means q → 0). The average excess of the a allele under these conditions depends only upon the heterozygote fitness genotypic deviation, which is w12 when the population consists almost entirely of AA individuals. Similarly, the average fitness of the population converges to w22 when almost all individuals are AA. The average excess of the a allele when A is close to fixation is therefore:

$$
a_a \approx w_{12} - w_{22}
\tag{13.22}
$$

The conditions for protection are that Eqs. (13.21) and (13.22) both be positive (note that if (13.22) is positive, then Δp < 0). Thus, in the case of no dominance or recessiveness, the polymorphism is protected whenever:

$$
w_{10} > w_{00} \quad \text{and} \quad w_{12} > w_{22}
\tag{13.23}
$$

Inequalities (13.23) mean that the polymorphism is protected whenever the rare heterozygotes are better competitors than the most common homozygote when both are competing against the most common homozygote. Note that only four of the nine competitive coefficients in Table 13.5 influence the conditions for protection. The other fitness coefficients obviously influence the selective dynamics of this system, but they play no role in the narrower problem of protected polymorphism.
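The limits behind Eqs. (13.21) to (13.23) can be illustrated numerically. The sketch below is a minimal example, assuming the marginal fitnesses of Table 13.5 and the usual random-mating form of the average excess, in which a gamete bearing A is found in an AA individual with probability p and in an Aa individual with probability q. The function names and the fitness values are hypothetical and chosen only for illustration.

```python
def marginal_fitnesses(p, w):
    """Marginal fitnesses w2, w1, w0 of Table 13.5.

    w[(i, j)] is the fitness of genotype i (2 = AA, 1 = Aa, 0 = aa)
    when competing with genotype j, under random encounters.
    """
    q = 1.0 - p
    freq = {2: p * p, 1: 2 * p * q, 0: q * q}
    return {i: sum(freq[j] * w[(i, j)] for j in (2, 1, 0)) for i in (2, 1, 0)}


def average_excesses(p, w):
    """Average excesses of fitness for the A and a alleles under random mating."""
    q = 1.0 - p
    m = marginal_fitnesses(p, w)
    wbar = p * p * m[2] + 2 * p * q * m[1] + q * q * m[0]
    a_A = p * (m[2] - wbar) + q * (m[1] - wbar)
    a_a = p * (m[1] - wbar) + q * (m[0] - wbar)
    return a_A, a_a


def is_protected(w):
    """Inequalities (13.23): only four of the nine coefficients are used."""
    return w[(1, 0)] > w[(0, 0)] and w[(1, 2)] > w[(2, 2)]


# Hypothetical competitive coefficients, for illustration only.
W_EXAMPLE = {
    (2, 2): 0.90, (2, 1): 1.00, (2, 0): 0.70,
    (1, 2): 1.00, (1, 1): 0.80, (1, 0): 1.05,
    (0, 2): 1.20, (0, 1): 0.60, (0, 0): 0.95,
}

if __name__ == "__main__":
    a_A_rare, _ = average_excesses(1e-6, W_EXAMPLE)      # A very rare
    _, a_a_rare = average_excesses(1 - 1e-6, W_EXAMPLE)  # a very rare
    print("a_A as p -> 0:", a_A_rare, "vs w10 - w00 =", 1.05 - 0.95)
    print("a_a as p -> 1:", a_a_rare, "vs w12 - w22 =", 1.00 - 0.90)
    print("protected:", is_protected(W_EXAMPLE))
```

Changing any of the five coefficients that do not appear in Inequalities (13.23) leaves the two limits, and therefore the verdict on protection, unchanged.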


As an example of protection, consider the competitive fitness matrix shown in Figure 13.8, along with a plot of Eq. (11.5) over p from 0 to 1, using the wk’s in Table 13.5 for the genotypic values in the average excess calculation. The fitness elements that contribute to inequalities 13.23 are boxed, and as can be seen the conditions for protection are satisfied in this case. The plot of Δp against p shows this protection, as Δp > 0 when p is small, and Δp < 0 when p is large. Note in this case, Δp = 0 only at p = 0.5, and Δp > 0 when p < 0.5, and Δp < 0 when p > 0.5. Hence, this is a balanced polymorphism with a unique, stable equilibrium. Now consider the situation portrayed in Figure 13.9. Note that fitness elements for protection are identical in this case, but three of the other five elements are not. In this case, the plot of Δp against p looks the same as in Figure 13.8 when p is close to 0 or 1, reflecting the identity of the conditions for protection. However, note in this case there are three potential equilibria, with the one at p = 0.5 being unstable and the other two being stable. Hence, there are two local balanced polymorphic points in this case, but selection can drive the system to very different states depending upon the initial conditions. The selective dynamics shown in Figure 13.9 are obviously more complex than that shown in Figure 13.8, even though both are identical with respect to being protected polymorphisms. Even the simple competitive model of Cockerham et al. (1972) can result in such complex selective dynamics that in many cases we settle for knowing only the conditions for protection. Inequalities 13.23 show that most competitive parameters are not needed for protection, so the conditions for protection are very broad in this frequency-dependent competitive model. 
In particular, unlike the constant fitness models of Chapter 11 and illustrated in Figure 11.2, global heterozygote superiority is not needed for polymorphism under frequency dependent competition. Although the model of Cockerham et al. (1972) was formulated in terms of competition among individuals, it can be used as a general model for social interactions. Indeed, we will now use it to examine the evolution of cooperation among individuals—seemingly the opposite of competition. Consider the model illustrated in Figure 13.10 that assumes social interactions in a random-mating, random-encountering population. The fitness matrix in that figure shows that the A allele behaves as a dominant allele such that A- individuals interact with one another in a fair and equitable fashion. Both interacting A-s receive the same relative fitness benefit of 1. However, the recessive phenotype associated with aa is an exploiter. When aa individuals interact with A- individuals, they receive a fitness advantage of 1.1 but at the expense of a fitness disadvantage to the A- individual of 0.9. The aa individuals do not cooperate with one another, so an aa by aa interaction results in the reduced fitness of 0.9 for both individuals. A population fixed for the A allele in many ways seems optimal: all individuals share equitably in a fitness of 1, and the average fitness of the population is at its maximum of 1. In contrast, a population fixed for the exploiter allele ( p = 0) would have an average fitness of 0.9, as would all individuals. Hence, if one equates adaptive evolution to a process that optimizes average population fitness, natural selection should drive the A allele to fixation. However, we have learned that selection can only be understood from a gamete’s perspective. The solid line in Figure 13.10 plots Δp as a function of the average excess, and Δp is negative for all intermediate p values. 
Hence, natural selection strongly drives the fixation of the exploiter phenotype, thereby consistently reducing the average fitness of the population to its minimal value. Selection among interacting individuals cannot be modeled as a simple optimization process, and the gametic perspective of expected fitness as measured by the average excess is the only reliable guide to the evolutionary outcome. In general, evolutionary models with well-mixed individuals show that exploiters or cheaters will displace cooperators (Allen et al. 2013a), so the results shown by the solid line in Figure 13.10 are not surprising.

There are many solutions to how selection could come to favor cooperators over exploiters. One solution is to focus on interactions between close relatives rather than on random encounters. This solution will be explored in the next section. Another solution is to introduce some degree of nonrandomness in the social encounters, even if the encounters are between non-relatives. Consider a simple extension of the random encounter model of Cockerham et al. (1972). Instead of one round of random encounters, let there be a second round of encounters following the first round of random encounters (Templeton 2018a). During this second round, the A- individuals who happened to have interacted with aa individuals during the first round remember this exploitation such that they refuse to interact with any aa individuals during the second round; rather, they choose to interact only with other A- individuals. The total fitness is the product of the fitnesses of the two encounters. The dashed line in Figure 13.10 shows the resulting evolutionary dynamics of this model that includes two rounds of social interaction with memory and learning. As can be seen, the cooperative A- individuals are no longer eliminated by fixation of the a allele, but rather a protected, balanced, stable polymorphism evolves. Consequently, the evolution of cooperative phenotypes does not require kin or family selection, but rather can occur with interactions with non-relatives as long as there is memory and learning. With more encounters, fewer and fewer individuals remain naïve, and selection favors cooperation even more strongly (see also the model of multiple encounters of Dridi and Akçay 2018).

In general, factors that increase the nonrandomness of encounters (such as memory), even between non-relatives, make the evolution of cooperative behavior more and more likely. These factors include population and/or social subdivision (Kulich and Flegr 2011; Powers et al. 2011; Rand et al. 2011, 2014; Allen et al. 2013a, 2017), enhanced discrimination (Sibly and Curnow 2012), and selection favoring individual recognition and long-term memory (Miller et al. 2020). Uyenoyama (1979) has shown that a shifting balance type of population structure (Chapter 12) with a large number of small local demes can favor the increase of alleles associated with cooperative or altruistic behavior. Hence, there are many conditions under which natural selection favors the evolution of cooperation even between individuals that are not closely related. With close relatives, such evolution is even more likely.

[Figure 13.8, plot not reproduced: Δp against p from 0 to 1.]

Competitive interaction matrix for Figure 13.8 (rows: genotype; columns: competing with AA, Aa, aa):

        AA     Aa     aa
AA      0.9    1.0    1.0
Aa      1.0    1.0    1.0
aa      1.0    1.0    0.9

Figure 13.8 A plot of Eq. (11.5) (Δp) versus p for a special case of the competition model of frequency dependent selection given in Table 13.5. The specific fitness values used are indicated in the figure.

[Figure 13.9, plot not reproduced: Δp against p from 0 to 1.]

Competitive interaction matrix for Figure 13.9 (rows: genotype; columns: competing with AA, Aa, aa):

        AA     Aa     aa
AA      0.9    1.0    0.6
Aa      1.0    0.8    1.0
aa      0.6    1.0    0.9

Figure 13.9 A plot of Eq. (11.5) (Δp) versus p for a second special case of the competition model of frequency dependent selection given in Table 13.5. The specific fitness values used are indicated in the figure.

[Figure 13.10, plot not reproduced: Δp against p from 0 to 1, solid and dashed lines.]

Social interaction matrix for Figure 13.10 (rows: genotype; columns: interacting with AA, Aa, aa):

        AA     Aa     aa
AA      1.0    1.0    0.9
Aa      1.0    1.0    0.9
aa      1.1    1.1    0.9

Figure 13.10 Plot of Eq. (11.5) (Δp) versus p for the social interaction matrix. The solid line shows the dynamics of a single-interaction model, and the dashed line shows the dynamics of a model with two rounds of interaction in which A- individuals who interacted with aa individuals in the first round avoid them in the second round and interact only with other A- individuals.
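The single-round dynamics discussed in this section (Figures 13.8 and 13.9, and the solid line of Figure 13.10) can be explored with a short script. This is a sketch under stated assumptions: Eq. (11.5) is taken to have the form Δp = p·aA/w̄, the marginal fitnesses follow Table 13.5, and the matrix entries are read from the figures with rows as the focal genotype. The function names are hypothetical, and the two-round memory model is not implemented here.

```python
# w[i][j]: fitness of genotype i (2 = AA, 1 = Aa, 0 = aa) when interacting with j.
FIG_13_8 = {2: {2: 0.9, 1: 1.0, 0: 1.0},
            1: {2: 1.0, 1: 1.0, 0: 1.0},
            0: {2: 1.0, 1: 1.0, 0: 0.9}}
FIG_13_9 = {2: {2: 0.9, 1: 1.0, 0: 0.6},
            1: {2: 1.0, 1: 0.8, 0: 1.0},
            0: {2: 0.6, 1: 1.0, 0: 0.9}}
FIG_13_10 = {2: {2: 1.0, 1: 1.0, 0: 0.9},
             1: {2: 1.0, 1: 1.0, 0: 0.9},
             0: {2: 1.1, 1: 1.1, 0: 0.9}}


def delta_p(p, w):
    """Return (delta_p, mean fitness) for one round of random encounters."""
    q = 1.0 - p
    freq = {2: p * p, 1: 2 * p * q, 0: q * q}
    marginal = {i: sum(freq[j] * w[i][j] for j in freq) for i in freq}
    wbar = sum(freq[i] * marginal[i] for i in freq)
    a_A = p * (marginal[2] - wbar) + q * (marginal[1] - wbar)  # average excess of A
    return p * a_A / wbar, wbar


def interior_equilibria(w, n=1000):
    """Find interior equilibria as sign changes of delta_p on an offset grid.

    The grid points (i + 0.5) / n avoid landing exactly on p = 0.5.
    """
    grid = [(i + 0.5) / n for i in range(n)]
    found = []
    for lo, hi in zip(grid, grid[1:]):
        d_lo, d_hi = delta_p(lo, w)[0], delta_p(hi, w)[0]
        if d_lo * d_hi < 0:
            found.append(((lo + hi) / 2, "stable" if d_lo > 0 else "unstable"))
    return found


if __name__ == "__main__":
    for name, w in (("13.8", FIG_13_8), ("13.9", FIG_13_9), ("13.10", FIG_13_10)):
        summary = [(round(x, 3), kind) for x, kind in interior_equilibria(w)]
        print(f"Figure {name}: interior equilibria {summary}")
```

Under these assumptions the first matrix gives a single stable equilibrium at p = 0.5, the second gives an unstable equilibrium at p = 0.5 flanked by two stable ones, and the social interaction matrix gives no interior equilibrium because Δp is negative at every intermediate p.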

Kin/Family Selection

In both the fertility model and in the competition model, fitness phenotypes emerged from the interactions of individuals who encountered one another (or mated) at random. However, there are many targets of selection that emerge from individuals whose encounters are highly nonrandom from a genetic perspective. For example, genetically related individuals aggregate in many species, at least for some portion of their lives. This occurs in species with some sort of family structure, which places parents and offspring and/or siblings into a potentially interactive context. Table 13.6 presents a simple one-locus, two-allele (A and a) family model in a random mating population (Templeton 1979a). There are two simultaneous targets of selection in this model, the individual and the family. Thus, the fitnesses that we assign to individuals in the context of a family are constants (the wij’s, where i indexes the genotype and j the family type in Table 13.6). As before, when we assign a marginal fitness to just an individual genotype with no


family context (wAA etc. at the bottom of Table 13.6), the marginal fitnesses become frequency dependent, reflecting the fact that the targets of selection include a level above the individual. Hence, all the properties of the earlier frequency dependent model are equally applicable here. Kin or family selection therefore does not maximize average fitness, evolutionary outcomes are not predictable from marginal individual-level fitness projections, and the course of evolution violates Fisher’s fundamental theorem. However, as the bottom of Table 13.6 shows, just as with fertility selection, we can produce a version of Eq. (11.5) using the wAA etc. as genotypic values that allows us to investigate selective dynamics, or at least protected polymorphisms. Darwin stated that “one of the most serious” problems to his theory of natural selection was the evolution of sterile workers in several groups of insects. In general, this leads to the problem of the evolution of altruism, that is, traits that appear deleterious at the individual level but beneficial for a group of individuals. The dual target model given in Table 13.6 can accommodate these two levels of selection and can therefore be used to investigate the evolution of altruism in the context of interactions with kin in a family structure. For example, suppose a population is initially fixed for the A allele, but a new allele, a, mutates that influences an altruistic phenotype in a recessive mode. That is, within a family, any individual expressing this recessive phenotype has lower fitness than his/her sibs that do not express this phenotype, but the presence of altruists in a sibship increases the overall fitness of all their nonaltruistic sibs, with the benefit increasing with an increasing proportion of altruistic individuals. This recessive case is modeled by assuming w15 > w24 = w14 > w21 = w22 = w12 = w13 > w04, w05, w06. 
The equalities reflect the dominance of the nonaltruist phenotype, and the inequalities ensure that the altruist phenotype always has lower fitness than the nonaltruist in every family, but the more altruists, the higher the fitness of the nonaltruist. Because a is initially a new mutation, we start close to p = 1. The average excess equation is more complex in this case, but it is still a straightforward exercise to take the limit as q → 0 to examine the properties of Δq when a is rare. Doing so, the condition for Δq > 0 when a is rare is either w24 > 2w21 − w04 or, if w24 = 2w21 − w04, then 2w24 + w21 < 2w05 + w15 (Templeton 1979a). An examination of Table 13.6 reveals why these particular fitness components determine whether or not a increases when rare. Most families in such a population consist of family type 1, which only has AA offspring with fitness w21. When a is rare, the most common family type in which the altruist phenotype will appear is family type 4, in which case the nonaltruist dominant phenotype has fitness w24 and the altruist phenotype has fitness w04. Hence, just these three fitnesses determine the fate of the altruistic allele when rare, and the other seven components are irrelevant unless w24 = 2w21 − w04. This equality makes the contribution of family type 4 by itself negligible for the selective dynamics of q when a is rare, so now the dynamics become influenced by both the first and second most common family types in which the altruist phenotype appears when a is rare, family types 4 and 5. Hence, the conditions for increase of the a allele now include the fitness of the nonaltruist phenotype in family type 5, w15, and the fitness of the altruist in family type 5, w05. When one of these conditions is satisfied, the altruist allele will increase in the population through its beneficial effects on kin even though the altruist itself has lower individual fitness than the nonaltruist in every family context.
Similarly, by looking at Δp when p is close to 0, the fitness inequalities given above ensure that Δp > 0. Therefore, the original inequalities combined with the conditions to ensure the increase of the a allele when rare result in a protected polymorphism. If we change the inequalities to now have w06 > w15 (that is, the fitness of a pure brood of altruists is higher than the nonaltruist fitness in family type 5, the most common family type in which nonaltruists exist when p is close to 0), w05 > (w21 + w24)/2, and w04 > w21 (conditions that ensure that altruists are more fit on the average than nonaltruists at intermediate allele frequencies even though altruist sibs are still less fit than their nonaltruist sibs within families), then fixation of the altruist allele is the only globally stable solution (Templeton 1979a). Hence, an altruistic trait can indeed evolve through the action of natural selection when the target of selection is a mixture of individual-level selection and selection among interacting related individuals in the context of a family. Hence, selection emerging from interacting biological relatives can solve Darwin’s major difficulty with his theory of natural selection.

Table 13.6 A model of family selection in a random mating population with two alleles (A and a) at an autosomal locus with an initial frequency p of the A allele.

                               Mendelian probabilities of offspring (zygotes) times wij
Mating pair      Frequency     AA           Aa           aa
1. AA × AA       p⁴            1 w21        0            0
2. AA × Aa       4p³q          1/2 w22      1/2 w12      0
3. AA × aa       2p²q²         0            1 w13        0
4. Aa × Aa       4p²q²         1/4 w24      1/2 w14      1/4 w04
5. Aa × aa       4pq³          0            1/2 w15      1/2 w05
6. aa × aa       q⁴            0            0            1 w06

Frequency in next generation:

$$
\begin{aligned}
AA &: \frac{p^4 w_{21} + 2p^3 q\,w_{22} + p^2 q^2 w_{24}}{\bar{w}} = \frac{p^2 w_{AA}}{\bar{w}} \\
Aa &: \frac{2p^3 q\,w_{12} + 2p^2 q^2 w_{13} + 2p^2 q^2 w_{14} + 2pq^3 w_{15}}{\bar{w}} = \frac{2pq\,w_{Aa}}{\bar{w}} \\
aa &: \frac{p^2 q^2 w_{04} + 2pq^3 w_{05} + q^4 w_{06}}{\bar{w}} = \frac{q^2 w_{aa}}{\bar{w}}
\end{aligned}
$$

Where:

$$
\begin{aligned}
w_{AA} &= p^2 w_{21} + 2pq\,w_{22} + q^2 w_{24} \\
w_{Aa} &= p^2 w_{12} + 2pq\left(\tfrac{1}{2} w_{13} + \tfrac{1}{2} w_{14}\right) + q^2 w_{15} \\
w_{aa} &= p^2 w_{04} + 2pq\,w_{05} + q^2 w_{06}
\end{aligned}
$$

Note: Sex-dependent effects are ignored, so matings such as AA × Aa and Aa × AA are regarded as equivalent and hence their frequencies are added. The terms wij are the fitnesses of individuals with genotype i (2 = AA, 1 = Aa, and 0 = aa) in the context of family type (mating-pair type) j. The term $\bar{w} = p^2 w_{AA} + 2pq\,w_{Aa} + q^2 w_{aa}$ is the average fitness of the offspring from the mating pairs as weighted by their family fitness context and Mendelian probabilities.

The evolution of altruism within a family is an example of kin selection, in which natural selection is treated as an optimization process on inclusive fitness, that is, the fitness of an individual plus the fitnesses of relatives as weighted by their degree of genetic relatedness (Hamilton 1964). However, as has been illustrated many times starting in Chapter 11, natural selection is not an optimization process of individual or population fitness. The gametic perspective is needed, and that is absent in the work of Hamilton (1964). Moreover, optimization models fail to capture the complex dynamics associated with frequency-dependent selection as shown in many of the graphs given in this chapter. Indeed, the dynamical complexity can be extreme. For example, Table 13.7 shows a specific fitness possibility for individual/family-level selection. Figure 13.11 shows the plot of Δp against p for this fitness matrix. The dynamics are exceedingly complex and chaotic (the exact result will depend upon how many digits your computer keeps in memory). Finally, inclusive fitness is simply not needed as the same phenomena can be analyzed using targets of selection at multiple levels (Templeton 1979a; Uyenoyama and Feldman 1980; Uyenoyama 1984; Marshall 2011; Allen et al. 2013b). When discussing selection below the level of the individual, we noted that a mixture of targets below the individual and at the individual level frequently occurred (e.g. the t locus in mice).
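As a computational aside on the family-selection recursions, the sketch below implements the “Frequency in Next Generation” expressions of Table 13.6 with exact rational arithmetic (Python’s `fractions` module), so that no rounding enters the calculation. The function names and the perturbed fitness set are illustrative assumptions, not part of the original analysis. With the Table 13.7 fitnesses, this reading of the recursion returns Δp equal to zero exactly at every p tried, which suggests that values on the order of 10⁻¹⁶, the scale of the vertical axis in Figure 13.11, are at the level of double-precision rounding and would indeed change with the number of digits kept. The perturbed set raises w24 and w14 so that w24 > 2w21 − w04, the condition quoted above for the a allele to increase when rare.

```python
from fractions import Fraction as F

# Fitnesses w[(i, j)]: genotype i (2 = AA, 1 = Aa, 0 = aa) in family type j (1..6).
TABLE_13_7 = {
    (2, 1): F(1), (2, 2): F(1), (1, 2): F(1), (1, 3): F(1),
    (2, 4): F(11, 10), (1, 4): F(11, 10), (0, 4): F(9, 10),
    (1, 5): F(12, 10), (0, 5): F(1), (0, 6): F(12, 10),
}

# Hypothetical variant: a larger benefit to nonaltruists in family type 4.
PERTURBED = dict(TABLE_13_7)
PERTURBED[(2, 4)] = F(23, 20)
PERTURBED[(1, 4)] = F(23, 20)


def next_generation(p, w):
    """Genotype frequencies after one generation of the Table 13.6 model."""
    q = 1 - p
    AA = p**4 * w[2, 1] + 2 * p**3 * q * w[2, 2] + p**2 * q**2 * w[2, 4]
    Aa = (2 * p**3 * q * w[1, 2] + 2 * p**2 * q**2 * (w[1, 3] + w[1, 4])
          + 2 * p * q**3 * w[1, 5])
    aa = p**2 * q**2 * w[0, 4] + 2 * p * q**3 * w[0, 5] + q**4 * w[0, 6]
    wbar = AA + Aa + aa  # average fitness of the offspring
    return AA / wbar, Aa / wbar, aa / wbar


def delta_p(p, w):
    """Exact change in the frequency of the A allele over one generation."""
    AA, Aa, _ = next_generation(p, w)
    return AA + Aa / 2 - p


if __name__ == "__main__":
    for i in (1, 5, 10, 15, 19):
        p = F(i, 20)
        print(f"p = {float(p):.2f}: exact delta_p = {delta_p(p, TABLE_13_7)}")
    rare_a = F(999, 1000)  # the a allele at frequency 0.001
    print("perturbed set, a rare: delta_p =", float(delta_p(rare_a, PERTURBED)))
```

The same functions accept ordinary floats, which makes it easy to compare the exact result with the rounding-level values obtained in floating point.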
Kin and related family selection models that combine individual-level targets of selection with targets above the level of the individual therefore continue this theme of mixed levels of selection.

Table 13.7 Fitnesses of the offspring in the six family types defined by a single locus with two alleles, as shown in Table 13.6.

                 Fitnesses of offspring with genotype i from family j, wij
Parents          AA            Aa            aa
1. AA × AA       w21 = 1       -             -
2. AA × Aa       w22 = 1       w12 = 1       -
3. AA × aa       -             w13 = 1       -
4. Aa × Aa       w24 = 1.1     w14 = 1.1     w04 = 0.9
5. Aa × aa       -             w15 = 1.2     w05 = 1.0
6. aa × aa       -             -             w06 = 1.2

[Figure 13.11, plot not reproduced: Δp (vertical axis, from −1.0 × 10⁻¹⁶ to 1.5 × 10⁻¹⁶) against p from 0 to 1.]

Figure 13.11 Plot of Eq. (11.5) (Δp) versus p, the frequency of the A allele, for the fitness model shown in Table 13.7.

We end this chapter with an example that combines targets of selection at all three levels: below, at, and above the level of the individual. We previously mentioned the human genetic disease Huntington’s disease, a trinucleotide disease associated with a CAG repeat. We discussed how new mutants for Huntington’s disease (H alleles) arise from alleles with between 29 and 35 CAG repeats that expand on transmission through the paternal germline to 36 CAGs or greater. Once the threshold of 36 repeats has been achieved or surpassed, there is strong and biased selection in the male germline to increase the number of repeats. Hence, one target of selection in this case is the genetically influenced biased expansion of repeat number, with the target of selection being the male germline.

The germline selective processes have a direct effect at the level of individual phenotype. Because germline selection operates to increase the number of repeats, there is a preferential transition from normal to disease alleles, and once a disease allele exists, it is extremely unlikely to back mutate to a normal allele. Thus germline selection is biased toward producing individuals with the disease phenotype. Moreover, the age of onset of the disease is highly correlated with repeat number (Rubinsztein et al. 1997; Figure 13.12). Combining germline selection with this individual-level phenotypic consequence, most newly arisen disease alleles start with low copy numbers and therefore late ages of onset. Indeed, the age of onset is so high in individuals with low copy number alleles that many of the heterozygotes have completed their reproduction or die from other causes before the disease is expressed (Falush et al. 2001). Hence, new disease alleles with late age of onset are expected to be neutral at the individual level. As the alleles go through more and more male generations, germline selection causes their copy number to increase and thereby decreases the age of onset. As can be seen from Figure 13.12, once the repeat number is greater than 46, most individuals develop the disease in their 30s or younger, the prime reproductive ages in developed countries. By the time germline selection drives the copy number above 60, virtually all disease phenotypes develop while the individual is a teenager or younger, making the disease virtually a lethal in a reproductive sense (the relationship between fitness and age of onset of a deleterious genetic disease will be examined in detail in Chapter 15). Hence, the germline selection goes in a direction that creates increasingly strong individual-level selection in the opposite direction. Putting the two levels of selection together, the expected trajectory of any newly arisen disease allele is to increase repeat number during an initial phase of individual selective neutrality, and therefore the fate of the disease allele is strongly influenced by genetic drift.
As the allelic lineage becomes older and older, individual selection against the alleles becomes stronger and stronger, eventually eliminating that lineage of disease alleles completely. However, the disease persists in the population as a whole because germline selection is constantly recreating new lineages of the disease allele. However, even this is not the whole story for Huntington’s disease. One common design in human genetics in estimating the fitness of a genetic disease is to use the normal sibs as controls. When this was done for Huntington’s disease, a fitness advantage of diseased individuals was found relative to their unaffected sibs in many studies (Reed and Neel 1959; Wallace 1976). For example, Reed and Neel (1959) found that the relative fitness of affected sibs is 1.12 when the fitness


[Figure 13.12 plot: y-axis, Age of Onset (20 to 70); x-axis, Number of CAG Repeats (36 to 58).]

Figure 13.12 The relationship between CAG repeat number in Huntington’s disease and the median age of onset of the disease. For each repeat number, the median age of onset is indicated by an open bar, and the 95% confidence interval for age of onset is given by a solid bar. Source: Rubinsztein et al. (1997). © 1997, National Academy of Sciences.

of non-affected sibs is set to 1. Hence, both germline selection and individual selection operating within families seem to favor this disease. Perhaps this within-family selection for Huntington’s disease is due to more subtle neurological effects that appear before the age of onset of the disease phenotype, such as caring less about transmitting the disease to the next generation. Reproductive decisions in such families are indeed influenced by fear of transmission to potential offspring and/or social ostracism of families with Huntington’s disease (Wallace 1976), so a neurological change that made an individual care less about transmitting the disease or less sensitive to social pressure could account for the within-family advantage of Huntington heterozygotes (Neel, personal communication). These same factors of fear of transmission and social pressures also create family-level selection. Because Huntington’s disease is an autosomal dominant, it tends not to skip any generations, at least once the copy number is sufficiently high. Hence, certain families are known by the general community to have this disease “run” in their families, and the individuals born into these families often know (even before the genetic basis of the disease was established) that they were at high risk, as were their potential children. Together, these social factors can and did influence all individuals from Huntington families to lower their overall reproductive output relative to individuals from non-Huntington families. The within-family advantage to affected individuals is weaker than the family-level disadvantage (Reed and Neel 1959; Wallace 1976). This type of selective situation can be modeled as a special case of Table 13.6, with Table 13.8 giving a simplified version of Table 13.6 that is relevant when the disease alleles are very rare in the general population (a more exact treatment is given in Yokoyama and Templeton 1980).
In this example, the relative fitness of hh sibs in families segregating for Huntington’s disease is set to 1, whereas the fitness of the Hh sibs in these families is set to 1.12. This reflects the within family advantage in reproductive success found in sib control studies of this disease. In contrast, hh offspring from hh × hh parents are


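Table 13.8 A model of family and individual-level selection associated with Huntington’s disease. The Hh and hh columns give the Mendelian probabilities of offspring times the fitness in the context of family j.

| Mating Pair and Number | Frequency of Mating Pair | Hh | hh |
|---|---|---|---|
| 5. Hh × hh | $4pq^3$ | $\tfrac{1}{2}(1.12)$ | $\tfrac{1}{2}(1)$ |
| 6. hh × hh | $q^4$ | 0 | $1(1.2)$ |
| Offspring Frequency in Next Generation: | | $\dfrac{2pq^3(1.12)}{\bar{w}} \approx 2pq\left[\dfrac{q^2(1.12)}{1.2}\right]$ | $\dfrac{2pq^3(1) + q^4(1.2)}{\bar{w}} \approx q^2\left[\dfrac{2pq + q^2(1.2)}{1.2}\right]$ |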

Note: The dominant allele that causes the disease is indicated by H with frequency p, and the normal allele by h with frequency q = 1 − p. Because the H allele is generally rare in a population, the selective dynamics when p is close to 0 are dominated by family types 5 and 6 in Table 13.6, so only that subset of Table 13.6 is shown in this special case. When H is rare, the average fitness is approximated by 1.2, the fitness of hh offspring from mating type 6, the most common mating type under these conditions.

assigned a fitness of 1.2, much larger than the average Huntington family fitness of 1.06 in this example. Letting $p \to 0$ ($q \to 1$) in the equations at the bottom of Table 13.8, the marginal fitness assigned to Hh is $q^2(1.12)/1.2 \to 1.12/1.2 = 0.93$ as $q \to 1$. The marginal fitness assigned to hh is $[2pq + q^2(1.2)]/1.2 \to 1.2/1.2 = 1$ as $p \to 0$ ($q \to 1$). Hence, selection near this allele frequency boundary strongly favors the hh genotype despite within family, individual selection favoring Hh over hh. In the case of Huntington’s disease, the family-level selection strongly works against this disease despite its late age of onset in many individuals and its within family advantage. Thus, natural selection upon this one unit of selection involves strong germline selection below the level of the individual, increasing individual selection operating in opposition to germline selection, individual-level fertility selection within families favoring Hh, and strong family-level selection against families with the H allele. One unit of selection, multiple targets of selection. As with other frequency-dependent models of selection, selection at the level of families or kin does not maximize average fitness even locally, can result in seemingly bizarre fitnesses assigned to individuals at equilibrium, violates Fisher’s fundamental theorem and Wright’s peak climbing metaphor (unless the landscape is redefined in terms of a fitness function determined by average effects/excesses, Curtsinger 1984), and frequently has multiple equilibria and complex selective dynamics. Despite this complexity, one common theme emerges from this model, and indeed all the other models discussed in this chapter with the possible exception of transposons capable of horizontal transmission: all targets of selection have their selective impact filtered by a unit of selection that must be transmitted through a gamete to the next generation.
Hence, although many of the features of natural selection discussed in Chapter 11 fail to hold true for targets of selection below and above the level of the individual, the importance of the gametic perspective on natural selection remains. To understand natural selection on most targets of selection, it is essential to examine the process from the gamete’s point of view. Natural selection favors those gametes with above average fitness effects, regardless of whether or not those effects are direct phenotypes (e.g. meiotic drive) or average excesses statistically assigned to a gamete from targets of selection at or above the level of the individual. The key to understanding natural selection is therefore to take a gametic perspective.
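Returning to the Huntington’s disease model of Table 13.8, the rare-allele arithmetic can be checked numerically. The following is a minimal sketch: the fitness values 1.12, 1, and 1.2 come from the text, while the function and constant names are illustrative assumptions, and the average fitness is held at the table’s rare-allele approximation of 1.2.

```python
# Rare-allele check of Table 13.8 (Huntington's disease model).
# Fitness values 1.12, 1 and 1.2 are from the text; all names are illustrative.

W_AFFECTED_SIB = 1.12   # Hh sibs in Hh x hh families
W_NORMAL_SIB = 1.0      # hh sibs in Hh x hh families
W_NORMAL_FAMILY = 1.2   # hh offspring of hh x hh families


def marginal_fitnesses(p, mean_fitness=W_NORMAL_FAMILY):
    """Return (Hh, hh) marginal fitnesses from the bottom row of Table 13.8.

    mean_fitness defaults to 1.2, the table's approximation when H is rare.
    """
    q = 1.0 - p
    w_Hh = q**2 * W_AFFECTED_SIB / mean_fitness
    w_hh = (2 * p * q * W_NORMAL_SIB + q**2 * W_NORMAL_FAMILY) / mean_fitness
    return w_Hh, w_hh


# Average fitness within a segregating (Hh x hh) family: half Hh, half hh sibs.
family_mean = 0.5 * W_AFFECTED_SIB + 0.5 * W_NORMAL_SIB

w_Hh, w_hh = marginal_fitnesses(p=1e-6)
print(f"Huntington family mean fitness: {family_mean:.2f}")
print(f"marginal fitness of Hh when H is rare: {w_Hh:.3f}")
print(f"marginal fitness of hh when H is rare: {w_hh:.3f}")
```

Under these assumptions the within-family advantage of Hh (1.12 versus 1) is outweighed by the family-level disadvantage (1.06 versus 1.2), so the marginal fitness of Hh falls below that of hh.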


14 Selection in Heterogeneous Environments

Premise three states that phenotypes emerge out of a genotype-by-environment interaction. In Chapter 11, we saw that natural selection arises from this premise when there is genetic variation in the population to produce heritable variation in the phenotype of fitness. Often there is not only variation in genotypes but also in environments. We have already seen this. For example, in Chapter 8, Fisher’s basic quantitative genetic model has an environmental deviation that is modeled as a random variable assigned independently to each individual. Thus, there is both genetic and potentially environmental variation in Fisher’s model. We have also seen examples of environmental variation that are not random for each individual. For example, we discussed how the environment changed in wet, tropical Africa after the introduction of the Malaysian agricultural complex and how this environmental change altered the phenotype of fitness associated with genetic variation at the human β-Hb locus (Table 11.1).

This chapter will focus on how natural selection operates when there is both genetic and environmental heterogeneity influencing the interaction of genotypes and environments in producing the phenotype of fitness. Just as we modeled genotypic variation to examine natural selection, we will now need to model environmental heterogeneity in order to study how populations adapt to changing environments. In particular, we now consider two dimensions of environmental heterogeneity. One dimension refers to the physical source of environmental heterogeneity: spatial versus temporal. Environments can vary over space and over time. For example, some regions of the world have environmental conditions conducive to the existence of the malarial parasite whereas others do not. Therefore, there is spatial heterogeneity at any given time for humans living in malarial versus non-malarial geographical areas.
However, as seen with the introduction of the Malaysian agricultural complex in wet, tropical Africa, there is also temporal heterogeneity in malarial versus nonmalarial environments. The same region in space can be a non-malarial environment during some times and a malarial environment at other times. The other dimension is environmental grain, how organisms experience environmental heterogeneity. Species differ in size, their ability to move, and generation times. This creates differences in scale in the experience of environmental heterogeneity, both spatially and temporally. For example, a grizzly bear can move over large areas throughout its life, and thereby experience a variety of different environments that vary spatially. In contrast, a tree cannot move through space and hence must deal with the particular spatial environment in which it originally germinated. Thus, spatial variation is experienced differently by grizzly bears and trees. As another example, humans are a long-lived species, and as such we experience seasonal variation within our lives. In contrast, the Zebra Swallowtail butterfly (Eurytides marcellus) has two generations a year (spring and summer) in Missouri. The environmental conditions during these two seasons are quite


distinct and interact with the swallowtail’s basic developmental program to yield two distinct forms. The Spring form is smaller, less melanic, and has proportionally shorter swallowtails (extensions from the hind wings) than the Summer form. In this case, seasonal variation is not experienced by individual butterflies but is only experienced across the generations in this species. Thus, the temporal variation of the seasons is experienced very differently by humans and by Zebra Swallowtails. The grain of the environment can take on two extreme values. A fine-grained environment occurs when the individual experiences environmental heterogeneity within its own lifetime, and a coarse-grained environment occurs when an individual remains in a single environment throughout its lifetime, but the environment varies between demes occupying different spatial locations or across generations. The individual experiences the coarse-grained environment as a constant, but from the gamete’s perspective, the gametes of the individual may experience a different environment either spatially (via gene flow) or temporally (across the generations). Since the gamete’s perspective is the one that counts with respect to natural selection and adaptation, coarse-grained heterogeneity can have a great evolutionary impact even though no individual experiences this heterogeneity. Fine and coarse grains are extremes in a continuum. In many cases, individuals experience environmental variation, but the average environment experienced by all individuals in a deme also varies spatially among demes and temporally across generations. However, it is convenient to organize our exploration of environmental grain through these two extremes. Coupling fine and coarse grain with spatial and temporal variation leads to a total of four combinations of environmental heterogeneity. 
However, because an individual can only be in one place at one time, the individual always experiences fine-grained spatial heterogeneity as a sequence of temporal changes. For example, we can describe the environmental heterogeneity of a grizzly bear wandering through a series of heterogeneous spatial patches as a temporal sequence within that bear’s lifetime. Therefore, we need only consider three types of environmental heterogeneity: coarse-grained spatial, coarse-grained temporal, and fine grained. There is one other type of environmental heterogeneity that deserves special consideration. An important aspect of the environment of any species is the other species with which it coexists and interacts. Once again, we have already seen this in previous chapters in our discussion of the impact of the malarial parasite upon the selective environment that it induces in its human host. There is nothing unusual about one species constituting an important part of the environment of another. When two or more species are interacting with one another, there is the possibility that all species constitute a part of each other’s environment. Thus, when one species evolves in response to its interspecific interactions, that evolutionary change constitutes a changed environment for the other species. The other species can then evolve in response to its altered biological environment, which in turn constitutes a changed environment for the species of interest. Coevolution occurs when two or more species mutually adapt to one another through interspecific interactions. This is a special form of environmental heterogeneity because the “environment” is changing because it is capable of evolving.

Coarse-grained Spatial Heterogeneity

Consider a landscape subdivided into spatial patches or habitats that induce different fitness responses from specific genotypes. We further assume that these genotypes experience only one of these patches in their lifetime (or at least, the selectively relevant portion of their lifetime for the unit of selection of interest). As an example, consider the northern acorn barnacle (Semibalanus


balanoides) on the northeastern coast of the United States (Schmidt and Rand 2001). This, as well as many other marine species, has planktonic larval dispersal, resulting in high levels of gene flow and negligible population subdivision at the larval stage. The larvae then settle and enter a completely sessile stage for the remainder of their lives. The intertidal region that they settle in is a mosaic of habitats that greatly differ in environmental parameters that affect barnacle survivorship. A specific individual only experiences the habitat in which it settled, so this is coarse-grained, spatial heterogeneity. Schmidt and Rand (2001) performed genetic surveys on larvae and upon different aged adults found in four habitat types (Table 14.1). They estimated habitat-specific viabilities by measuring how genotype frequencies change with age within a particular habitat type. When genetic surveys were performed on mtDNA and upon an isozyme marker Gpi that codes for the enzyme glucose-6-phosphate isomerase, they could detect no heterogeneity either through space or age. These markers indicate the panmictic nature of this population and confirm its high level of gene flow and lack of population subdivision. However, a third genetic marker, the Mpi locus that codes for the enzyme mannose-6-phosphate isomerase, showed considerable heterogeneity. In particular, the age-specific surveys indicated a pulse of genotype-specific mortality that occurred over a two-week interval subsequent to metamorphosis from the larval to the adult form, leading to the estimated viabilities given in Table 14.1, although only the viabilities in the high intertidal zones deviated significantly from neutrality. This variation in viability associated with this locus arises in part from the differential ability to process mannose-6-phosphate through the glycolytic pathway. 
When mannose was supplemented in the diet and barnacles were exposed to temperature and desiccation stress, the FF genotype grew faster than the SS genotype, with SF being intermediate. This result is consistent with the pattern of viabilities observed in Table 14.1, as the exposed habitats experience more stressful temperature and desiccation conditions. Applying Eq. (11.5) to the viabilities from just one habitat leads to the prediction of fixation for the F allele in the exposed, high intertidal habitat, and fixation for the S allele in the algal, high intertidal habitat, the only two habitats with significant evidence for selection. Despite no single habitat having selective forces that would maintain a balanced polymorphism, this polymorphism seems to be stable. How can this be? We will answer this question with a model known as the Levene model after the person who did the original work (Levene 1953). There are many variants of this model, but Levene’s original model starts with a single-locus, two-allele unit of selection with the fitnesses in habitat i being:

| Genotype: | AA | Aa | aa |
|---|---|---|---|
| Fitness in habitat i: | $v_i$ | 1 | $w_i$ |

Table 14.1 Habitat-specific viability estimates for the Mpi genotypes in the northern acorn barnacle.

| Habitat | SS | SF | FF |
|---|---|---|---|
| Exposed Substrate in High Intertidal Zone | 0.696 | 1 | 1.424 |
| Exposed Substrate in Low Intertidal Zone | 0.898 | 1 | 1.012 |
| Under Algal Canopy in High Intertidal Zone | 1.519 | 1 | 0.880 |
| Under Algal Canopy in Low Intertidal Zone | 0.913 | 1 | 0.976 |

Note: All viabilities are measured relative to the viability of the heterozygote. Source: Schmidt and Rand (2001). © 2001 John Wiley & Sons.
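The single-habitat prediction of fixation can be illustrated with the viabilities in Table 14.1. The sketch below assumes the form Δp = p·a/w̄ with the heterozygote fitness set to 1 (the form the text attributes to Eq. (11.5)); the function name `delta_p` and the frequency grid are illustrative choices, not part of the source.

```python
# Single-habitat selection using the viabilities of Table 14.1.
# delta_p implements dp = p * a / w_bar with heterozygote fitness fixed at 1;
# function and variable names are illustrative, not from the source.

def delta_p(p, v, w):
    """Change in frequency p of an allele whose homozygote has fitness v.

    w is the fitness of the alternative homozygote; the heterozygote is 1.
    """
    q = 1.0 - p
    w_bar = p**2 * v + 2 * p * q + q**2 * w   # average fitness in the habitat
    excess = p * v + q - w_bar                # average excess of the allele
    return p * excess / w_bar


# (FF viability, SS viability) for the two habitats with significant selection
habitats = {
    "exposed, high intertidal": (1.424, 0.696),
    "algal canopy, high intertidal": (0.880, 1.519),
}

grid = [i / 20 for i in range(1, 20)]  # interior frequencies of F: 0.05 ... 0.95
signs = {}
for name, (v_FF, w_SS) in habitats.items():
    changes = [delta_p(p, v_FF, w_SS) for p in grid]
    if all(change > 0 for change in changes):
        signs[name] = "F increases"
    elif all(change < 0 for change in changes):
        signs[name] = "F decreases"
    else:
        signs[name] = "mixed"
    print(f"{name}: {signs[name]} at every tested frequency")
```

Because the sign of the change never reverses within a habitat, selection in either habitat alone is directional, which is why neither habitat by itself can maintain the polymorphism.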


Levene next let $c_i$ be the proportion of the total population that comes from habitat i. Note that $c_i$ is a function only of the habitat type and not of the genotypic composition within the habitat. This is an example of soft selection in which some factor not related to the genotypes of interest is density limiting in the habitat. In the Levene model, any genotypic composition can survive and reproduce in niche i, producing the same absolute number of offspring. The fitness differences shown above are therefore only relative differences among the genotypes within a habitat and are in no way measures of an absolute ability to produce offspring. Levene next assumed a completely panmictic, random mating global population. The zygotes produced by this single random mating population are then randomly distributed over the available habitats. Essentially, the population structure of the original Levene model is an island model with m = 1 (see Chapter 6). Therefore, the initial genotype frequencies found in all habitats before selection are given by the Hardy–Weinberg law: $p^2$ for AA, $2pq$ for Aa, and $q^2$ for aa, where p is the frequency of the A allele in the total population. Under these assumptions, we can now predict the change in allele frequency within habitat i from Eq. (11.5) as:

$$\Delta p_i = \frac{p\,a_{A,i}}{\bar{w}_i} \tag{14.1}$$

where $\bar{w}_i = p^2 v_i + 2pq + q^2 w_i$ and $a_{A,i} = p(v_i - \bar{w}_i) + q(1 - \bar{w}_i) = p v_i + 1 - p - \bar{w}_i$.

The quantity $\bar{w}_i$ is the average relative fitness within habitat i, and $a_{A,i}$ is the average excess of the A allele for fitness in habitat i. The change in the frequency of the A allele in the total population is the average over all habitats of the allele frequency changes within each habitat ($\Delta p_i$), weighted by that habitat’s output to the total population ($c_i$), that is:

$$\Delta p = \sum_i c_i\,\Delta p_i = p \sum_i c_i\,\frac{p v_i + 1 - p - \bar{w}_i}{\bar{w}_i} \tag{14.2}$$
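Equation (14.2) translates directly into a short function. The following is a sketch under stated assumptions: the name `levene_delta_p` and the two-habitat fitness values are invented for illustration and do not come from the source.

```python
# Sketch of Eq. (14.2): allele-frequency change under Levene's soft-selection model.
# The habitat weights and fitnesses below are made-up illustrative numbers.

def levene_delta_p(p, c, v, w):
    """Return the change in frequency of allele A over one generation.

    c[i]: proportion of the total population coming from habitat i (sums to 1)
    v[i]: fitness of AA in habitat i; w[i]: fitness of aa; Aa has fitness 1.
    """
    if abs(sum(c) - 1.0) > 1e-9:
        raise ValueError("habitat proportions c must sum to 1")
    q = 1.0 - p
    total = 0.0
    for c_i, v_i, w_i in zip(c, v, w):
        w_bar_i = p**2 * v_i + 2 * p * q + q**2 * w_i   # mean fitness in habitat i
        excess_i = p * v_i + 1 - p - w_bar_i            # average excess of A in habitat i
        total += c_i * excess_i / w_bar_i
    return p * total


c = [0.5, 0.5]   # each habitat contributes half of the population
v = [1.5, 0.6]   # AA favored in habitat 0, disfavored in habitat 1
w = [0.6, 1.5]   # aa shows the mirror-image pattern
for p in (0.1, 0.5, 0.9):
    print(f"p = {p:.1f}: delta p = {levene_delta_p(p, c, v, w):+.4f}")
```

With these hypothetical values, A increases when rare and decreases when common, even though selection within each habitat is purely directional.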

Even this simple model can generate considerable complexity, so Levene only evaluated the conditions for a protected polymorphism (Chapter 13). A polymorphism will be protected if Eq. (14.2) is positive when p is close to zero and negative when p is close to one. As p approaches 0, most individuals in the population are aa, and the average fitness $\bar{w}_i$ therefore approaches $w_i$. Taking the limit of the average excess in habitat i as p approaches 0 yields $a_{A,i} = 1 - w_i$. Hence, we have:

$$\lim_{p \to 0} \Delta p = p \sum_i c_i\,\frac{1 - w_i}{w_i} = p \sum_i c_i\left(\frac{1}{w_i} - 1\right) \tag{14.3}$$

Since p is small but positive, the sign of Eq. (14.3) is determined only by the summation. Hence, the condition for protecting the A allele when it is rare in the total population is:

$$\sum_i c_i\left(\frac{1}{w_i} - 1\right) > 0 \;\Rightarrow\; \sum_i \frac{c_i}{w_i} > 1 \;\Rightarrow\; \frac{1}{\sum_i c_i / w_i} < 1 \tag{14.4}$$

The left-hand side of the last representation of inequality (14.4) is the harmonic mean across habitats of the fitness of genotype aa. (The harmonic mean of a set of numbers is the reciprocal of the average of their reciprocals; here the average is weighted by the $c_i$, which sum to one.) When the harmonic mean fitness of aa is less than one, the fitness of the Aa genotype, then the A allele is protected against loss when rare. This makes sense because when A is rare, almost all copies of A-bearing gametes come to be in Aa individuals, with harmonic mean fitness of 1, and are being selected relative to aa individuals, with a harmonic mean


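fitness less than one. Similarly, by taking the limit of Eq. (14.2) as p goes to one, the condition for protecting the a allele when it is rare is

$$\sum_i c_i\left(\frac{1}{v_i} - 1\right) > 0 \;\Rightarrow\; \sum_i \frac{c_i}{v_i} > 1 \;\Rightarrow\; \frac{1}{\sum_i c_i / v_i} < 1$$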
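The two harmonic-mean conditions for a protected polymorphism in the Levene model can be checked with a few lines of code. This is a minimal sketch: the function names and the numerical fitnesses are hypothetical and chosen only to show that a polymorphism can be protected even when the arithmetic mean fitness of each homozygote exceeds that of the heterozygote.

```python
# Protected-polymorphism check for Levene's model via weighted harmonic means.
# All numbers are illustrative and not taken from the source.

def harmonic_mean(c, fitness):
    """Weighted harmonic mean: 1 / sum_i (c_i / fitness_i), assuming sum(c) == 1."""
    return 1.0 / sum(c_i / f_i for c_i, f_i in zip(c, fitness))


def is_protected(c, v, w):
    """True when both alleles increase when rare (both harmonic means below 1)."""
    return harmonic_mean(c, w) < 1.0 and harmonic_mean(c, v) < 1.0


c = [0.5, 0.5]   # habitat outputs to the total population
v = [1.5, 0.6]   # AA fitness by habitat
w = [0.6, 1.5]   # aa fitness by habitat

arithmetic_aa = sum(c_i * w_i for c_i, w_i in zip(c, w))
print(f"arithmetic mean fitness of aa: {arithmetic_aa:.3f}")
print(f"harmonic mean fitness of aa:   {harmonic_mean(c, w):.3f}")
print(f"polymorphism protected: {is_protected(c, v, w)}")
```

The harmonic mean weights low fitnesses more heavily than the arithmetic mean does, so a genotype that does poorly in even one habitat can fall below the heterozygote on this scale.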

Δ, the organism experiences the environment as an ecotone because the transition between the two environments occurs on a spatial scale less than individual gene flow relative to selection. If ℓc < Δ, the organism experiences the environmental transition as a gradient in which adaptation to transitional, intermediate environments is


possible. Either situation can result in a genetic cline. In the case of an ecotone, populations far away from the transitional zone will tend to be fixed for either the A or a alleles depending upon which environment they are in, but populations near the environmental transition zone will have intermediate allele frequencies due to the mixing via gene flow of gametes coming from the two alternative environments. The width of this cline is on the order of ℓc. In the case of a gradient, there is more potential for geographic differentiation for a given strength of selection s because populations in the transition zone can show local adaptation to the transitional environments. This means that the width of the cline is greater than ℓc and tends to approach Δ. Note that Eq. (14.14) tells us that there is no absolute difference between a gradient and an ecotone. The distinction between the two depends upon the physical scale of the environmental heterogeneity (Δ), the population attribute of the degree of isolation-by-distance (ℓ), and the locus-specific attribute of phenotypic response to the spatial heterogeneity (b). Hence, not only can different species experience the same physical heterogeneity (measured by Δ) in different fashions because they differ in ℓ, but the sensitivity to b means that even within a species one locus may respond to the spatial heterogeneity as if it were an ecotone, another locus respond to the same heterogeneity as if it were a gradient, and yet other loci may not respond to the spatial heterogeneity at all because they are neutral with respect to this environmental change. We now return to the genetic cline in the carbonaria allele associated with melanism in B. betularia shown in Figure 14.5. There is some dominance in this system, so the model shown in Eq. (14.12) is not completely applicable, but we will use it as an approximation. A one-dimensional transect of this cline is shown in Figure 14.7. Saccheri et al. 
(2008) estimated ℓ to be 13.6 km. Prior to 1975, the distance from the heavily polluted Liverpool area (position 120 km in Figure 14.7) to the unpolluted Welsh countryside was about 70 km. The parameter s was estimated to be 0.19 for the 2002 data, also shown in Figure 14.5, but was not estimated for the pre-1975 data. However, the steeper observed slope in the pre-1975 cline indicates a b ≈ 0.01 to yield s ≈ 0.70. Hence, for the pre-1975 cline, ℓc ≈ 16.3 < 70 = Δ, so prior to 1975 this was a gradient. In 2002, the cline was shallower and shifted far to the east, so Δ ≈ 170 km, and using the estimated s = 0.19 yields ℓc ≈ 31.2 < 170 = Δ, so the cline in 2002 also reflects a gradient.

[Figure 14.7 plot: y-axis, Carbonaria frequency (0.0 to 1.0); x-axis, Distance (km) (0 to 200).]

Figure 14.7 Clines of the dominant melanic phenotype carbonaria of the moth B. betularia on a WSW to ENE transect across a region spanning Abersoch on the western edge of north Wales to Leeds in northern England. Circles indicate the early cline up until 1975 (filled circles 1964–1969, open circles 1970–1975); squares show the cline in 2002. The lines running through the circles or squares are the 68.2% confidence intervals. The dashed line indicates the pre-1975 cline, and the solid curve the predicted 2002 cline with an estimated s = 0.19 and ℓ= 13.6 km.
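The gradient-versus-ecotone comparisons quoted for B. betularia can be reproduced numerically. The sketch below assumes that the characteristic length is ℓc = ℓ/√s; this relation is inferred from the values reported in the text (the defining equation is not reproduced in this excerpt), and the function names are illustrative.

```python
import math

# Assumption: characteristic length l_c = l / sqrt(s), inferred from the
# values quoted in the text rather than copied from the defining equation.

def characteristic_length(ell, s):
    """Characteristic length of a cline for gene-flow scale ell and selection s."""
    return ell / math.sqrt(s)


def classify(ell_c, delta):
    """Ecotone if the characteristic length exceeds the transition width delta."""
    return "ecotone" if ell_c > delta else "gradient"


ELL_KM = 13.6  # Saccheri et al. (2008) estimate of l

# label: (selection coefficient s, environmental transition width delta in km)
clines = {"early cline": (0.70, 70.0), "2002 cline": (0.19, 170.0)}

results = {}
for label, (s, delta) in clines.items():
    ell_c = characteristic_length(ELL_KM, s)
    results[label] = (ell_c, classify(ell_c, delta))
    print(f"{label}: l_c = {ell_c:.1f} km, delta = {delta:.0f} km -> {results[label][1]}")
```

Under this assumed relation both clines fall in the gradient regime, matching the conclusion in the text.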


Table 14.2 The relative fitnesses of the male and female genotypes created by the A+ and A− alleles at the human X-linked G-6-PD locus in malarial (coastal) and non-malarial (mountains) regions on the island of Sardinia.

| Environment | Male A+ | Male A− | Female A+/A+ | Female A+/A− | Female A−/A− |
|---|---|---|---|---|---|
| Coastal | 1 | 0.97 | 1 | 1.07 | 0.97 |
| Mountains | 1 | 0.90 | 1 | 0.98 | 0.90 |

Source: Livingstone (1973).

Recall from Chapter 11 that alleles resulting in G-6-PD deficiency at the X-linked G-6-PD locus in humans are involved with malarial resistance but are deleterious due to anemia and favism in nonmalarial environments. The island of Sardinia lies off the west coast of Italy. The coastal areas of Sardinia historically have had a high incidence of malaria, but the central mountainous region does not. The estimated relative fitnesses of the genotypes associated with the active (A+) and deficient (A−) alleles at this locus in Sardinia are given in Table 14.2 (from Livingstone 1973). Figure 14.8 shows the frequencies of the A− allele along a 130 km transect bisecting the island going from the east coast, through the central mountains, to the west coast. The actual transition from the lowland malarial areas to the highland non-malarial region occurs in just a few kilometers on both the east and west sides of this transect. People have inhabited this area for over 2000 years with relatively constant densities. Livingstone (1973) simulated the evolutionary response to the fitness values shown in Table 14.2 in a manner that tried to mimic the movements of peasant populations in Europe. A good fit to the observed pattern shown in Figure 14.8 was obtained by assuming that the environmental transition was sharp with Δ ≤ 2.65 km (the average distance between adjacent villages in the simulations) and that 25% of the people left their village of birth, with those dispersing going primarily to adjacent villages and other nearby villages, yielding an average dispersal distance of d = 3.34 km, which yields $\ell = 3.34\sqrt{1/4} = 1.67$ km. The cline in this case is driven by the fitness effects of A− in hemizygous males and in heterozygous females (Table 14.2). From Table 14.2, s = 0.97 − 0.90 = 0.07 in males, and the s for females is 1.07 − 0.98 = 0.09. Averaging across the two sexes, s = 0.08. Then, from Eq. (14.12), ℓc = 5.91 km.
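The arithmetic behind the Sardinian estimate can be laid out step by step. This is a sketch under two assumptions that are inferred from the reported numbers rather than stated in this excerpt: that ℓ is the dispersal distance times the square root of the dispersing fraction, and that ℓc = ℓ/√s. The variable names are illustrative. The result is about 5.90 km, which agrees with the reported 5.91 km to within rounding.

```python
import math

# Assumptions (inferred from the quoted values, not from the source equations):
#   l   = d * sqrt(fraction dispersing)
#   l_c = l / sqrt(s)

d_km = 3.34                  # average dispersal distance of those who move
fraction_dispersing = 0.25   # 25% leave their village of birth
delta_km = 2.65              # upper bound on the environmental transition width

ell = d_km * math.sqrt(fraction_dispersing)

# Fitness differences between coastal and mountain environments (Table 14.2)
s_male = 0.97 - 0.90         # A- hemizygous males
s_female = 1.07 - 0.98       # A+/A- heterozygous females
s = (s_male + s_female) / 2  # average across the sexes

ell_c = ell / math.sqrt(s)
regime = "ecotone" if ell_c > delta_km else "gradient"

print(f"l = {ell:.2f} km, s = {s:.2f}, l_c = {ell_c:.2f} km")
print(f"l_c / delta = {ell_c / delta_km:.2f} -> {regime}")
```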
Hence, the characteristic length of this cline is more than twice the transitional distance Δ, implying that the ecotone model explains the cline observed in Figure 14.8. In Chapter 15, we will give a Drosophila example of a genetic cline over 1 km that defines a gradient. Yet, the larger physical distance in this human example defines an ecotone. The distinction between an ecotone and a gradient is not a function of absolute distance, but rather depends upon the balance among physical distance, gene flow, and selective strength. Because of differences in gene flow and selection, 1 km can define a gradient, and 2.65 km can define an ecotone, as these examples show. We saw earlier how genetic architecture is important in adapting to discrete habitat heterogeneity, and the same is true for adapting to environmental gradients. Humans display a cline in skin color with absolute latitude, with darker skins near the equator and lighter skins at the higher absolute latitudes (Relethford 2002, 2012). This cline arises from two selective forces related to ultraviolet B radiation (UVB), which in turn is highly correlated with absolute latitude. High UVB environments at low absolute latitudes select for dark pigmentation as protection from UVB

[Figure 14.8 plot: y-axis, Frequency of the A− Allele (0.00 to 0.35); x-axis, transect position (West Coast, Mountains, East Coast).]

Figure 14.8 The frequencies of the A− alleles at the X-linked G-6-PD locus in human populations along an east–west transect of the island of Sardinia. Source: Livingstone (1973).

damage, whereas low UVB environments at high absolute latitudes select for light skin to sustain cutaneous photosynthesis of vitamin D mediated through UVB (Jablonski 2010). Tiosano et al. (2016) surveyed SNPs in several skin color genes and the vitamin D receptor (VDR) gene in 10 human populations that spanned a large range of absolute latitudes. These genes, all unlinked, influence vitamin D metabolism and pigmentation in a manner characterized by extensive epistasis for the trait of skin pigmentation (Pośpiech et al. 2014). Recall from Chapter 13 that coadapted complexes can arise from even unlinked genes within local populations when there is restricted gene flow leading to some degree of subdivision, which is provided primarily by isolation-by-distance in humans (Templeton 2018a). This type of coadaptation shows up primarily as inter-populational linkage disequilibrium (Eq. (3.6)) that contributes to the covariance of allelic effects at different loci shown in equation 14.9. Accordingly, Tiosano et al. pooled all the samples into a global sample to capture inter-populational linkage disequilibrium (Eq. (3.6)) and measured allele-specific disequilibrium with CCC (Eq. (2.17)). Permutation testing was used to determine a threshold of 0.65 for retained CCC values that would yield a false positive rate of less than 0.001 in the entire data set, and seven distinct allelic networks were defined by the retained CCC edges between allelic pairs, as

[Figure 14.9 network diagram: allelic nodes labeled by gene, SNP rs number, and nucleotide state, grouped into networks 65_1 through 65_7.]

Figure 14.9 Allelic networks for VDR and skin color genes. Each dot represents an allelic node, identified by its SNP (the rs number) and nucleotide state. Out of the total of 128 nodes in the data set, only the nodes that had an edge with CCC values ≥ 0.65 connecting them to another node are retained. The 51 retained edges defined 7 discrete networks of alleles, labeled 65_1 through 65_7. Source: Tiosano et al. (2016).

shown in Figure 14.9. Five of these networks defined haplotypes within a single gene region, but two (65_1 and 65_2 in Figure 14.9) contained SNPs from VDR and various skin color genes. Both of these networks showed a significant, linear association of their frequencies with absolute latitude (Figure 14.10). These patterns were replicated in an independent HapMap data set (Tiosano et al. 2016), and HapMap data were also used for a genome scan, which indicated that having so many significant CCC edges in this small number of genes was an extreme outlier. One explanation of these results that would not involve epistasis is simply adding up parallel clines of the single-locus components of these multi-locus networks. However, the single-locus components of these networks show highly diverse latitudinal patterns, with only a few showing latitudinal clines. Figure 14.11 shows the extremes of the patterns shown by the elements of 65_2, the multi-locus network with the strongest latitudinal association that spans virtually the entire frequency range of 0–1. One extreme pattern is non-clinal: an in-versus-out-of-Africa frequency change in a haplotype in the VDR promoter that is known to have a large effect on the transcription of the VDR gene. Most of the other elements showed this same non-clinal pattern, albeit with less extreme frequency shifts. The skin color gene SLC45A2 was the only component to display a linear latitudinal clinal pattern. However, even when the elements that display clines were excluded, the remaining multi-locus networks still displayed significant clinal associations with latitude. Obviously, the whole is greater than the sum of

[Figure 14.10: y-axis, Frequency of Individuals with All Alleles in the Network (0 to 1); x-axis, Absolute Latitude (0 to 60); series, Frequency 65_1 and Frequency 65_2.]

Figure 14.10 A plot of the frequencies of individuals bearing all the alleles in network 65_1 (solid line) and 65_2 (dashed line) versus absolute latitude. The lines were estimated with weighted least squares, and nonlinear alternatives were not significantly better. Source: Based on Tiosano et al. (2016).

[Figure 14.11: y-axis, Frequency of Network Element (0 to 1); x-axis, Absolute Latitude (0 to 60); series, VDR promoter and SLC45A2.]

Figure 14.11 A plot of the frequencies of two of the six single-locus components contained in network 65_2 versus absolute latitude. The VDR promoter haplotype pattern is shown by the dashed lines, and the clinal pattern of the skin color gene SLC45A2 is shown by a solid line. Source: Based on Tiosano et al. (2016).


its parts, as expected under coadaptation with a known epistatic system. The failure of most of the single-locus elements to have significant latitudinal clines reveals that most of the loci involved in these multi-locus clines would not have been discoverable by single-locus analyses. This is true even of VDR: its promoter haplotype displays the strongest single-locus frequency shift, but not in a clinal fashion. This pattern, plus information from ancient DNA studies on when some of the other skin color gene elements began to increase in frequency, indicates that the 65_2 coadapted complex first arose by selection favoring near fixation of the high activity VDR promoter haplotype when humans expanded to higher latitudes from equatorial Africa. The other loci then helped fine-tune the system to the local UVB environments that were subsequently colonized, ultimately forming a strong clinal pattern of the coadapted complex (Figure 14.10) on the UVB environmental gradient.

In addition to genetic clines, there are also phenotypic clines, gradual shifts of phenotypic frequencies or mean phenotypes over geographical space. The genetic clines shown in Figure 14.10 also correspond to phenotypic clines for human skin color, but phenotypic clines do not always correspond to genetic clines. Premise three states that phenotypes arise from the interaction of genotypes and environments. We have already seen with the yeast (Herrera et al. 2012) and lizard (Wogan et al. 2020) examples described earlier in this chapter how organisms can sometimes adapt to discrete environmental heterogeneity via phenotypic plasticity mediated by environmentally associated changes in DNA methylation. Phenotypic plasticity can also occur in response to an environmental gradient, so phenotypic clines could be due to phenotypic plasticity when that plasticity is in response to the environments on the gradient or ecotone.
However, sometimes the phenotypic plasticity is maladaptive, that is, the plastic response of the organism to the environment diminishes its ability to survive or reproduce (Huang and Agrawal 2016). In these cases, genetic clines often arise that affect phenotypes in the opposite direction of plasticity, yielding a situation known as counter-gradient selection (Schmid and Guillaume 2017). For example, populations of the green frog Rana clamitans live on the east coast of the United States and into the Appalachian Mountains (Berven et al. 1979). Populations were sampled in ponds along an elevational transect from 10 to 1250 m above sea level. The growth rates of the tadpoles were measured, and it was discovered that tadpoles in the lowland ponds had the largest growth rates and tadpoles in the montane ponds had the smallest growth rates. However, growth rates in amphibians are strongly influenced by temperature, and the average temperature varies greatly over this transect. In particular, low temperatures decrease the growth rate whereas warm temperatures increase it. Consequently, the phenotypic cline in growth rates along this elevational transect could be due just to the temperature effects with no genetic component at all. To test this hypothesis, Berven et al. (1979) performed laboratory experiments under controlled temperature conditions on egg masses sampled from both lowland and montane ponds. They discovered that for a given temperature, the montane forms had higher growth rates than the lowland forms, exactly the opposite of the observed phenotypic cline!

De Jong (1988) produced a simple model to show how genetic and phenotypic clines can go in opposite directions. Consider first a one locus model with two alleles, A and a. Let all genotypes have linear norms of reaction (Chapter 10) to temperature such that the genotypic values of growth are

$$G_{AA} = a + cT, \qquad G_{Aa} = cT, \qquad G_{aa} = -a + cT \tag{14.15}$$

where T is the temperature, c is the slope of the linear norm of reaction to temperature shared in common by all genotypes, and a > 0 defines the genotypic-specific intercepts of the norm of reaction (note that the heterozygote is always intermediate between the two homozygotes such that no heterozygote superiority for growth rate is possible in this model). For simplicity, we also assume

[Figure 14.12: left, fitness curve plotted sideways against mean growth rate; right, parallel norms of reaction G_AA, G_Aa, and G_aa plotted against temperature, with the genotypic fitnesses w_AA(T_i), w_Aa(T_i), and w_aa(T_i) marked at temperatures T1, T2, and T3.]

Figure 14.12 Hypothetical norms of reaction over temperature and fitnesses assigned to the phenotype of growth rate for a one-locus, two-allele model. The fitness curve assigned to the phenotype of growth rate is shown on the left of the figure, plotted sideways. The right side plots the growth rate response of each genotype to temperature. Three points on the temperature gradient are indicated, along with their effects on the fitnesses assigned to genotypes.

that the environmental variance is zero for any given temperature, that is, all individuals sharing the same genotype have the same phenotype at a specific temperature. Under Eqs. (14.15), the norms of reaction of the genotypes are all parallel lines when plotted against temperature, but with some genotypes having uniformly higher or lower growth rates for a specific temperature, as shown in the right side of Figure 14.12. In this simple model, the AA genotype always has the highest growth rate at any given temperature, the aa genotype the lowest, and the Aa genotype an intermediate growth rate. Now suppose that the fitnesses assigned to individuals on the basis of their actual growth rates are constant throughout the transect with a single optimal growth rate, say β, being associated with the highest fitness. A fitness function that has a single optimal value β with fitness dropping off symmetrically about β is given by:

$$w(P) = 1 - \alpha(\beta - P)^2 \tag{14.16}$$

where P is the individual’s phenotype (in this case growth rate) and α determines how rapidly fitness drops off in individuals with growth rates that deviate from β. A sideways plot of Eq. (14.16) is shown on the left side of Figure 14.12, whose maximum occurs at the growth rate β. Note that this is a fitness assigned to an individual’s phenotype, and it is not a fitness assigned to genotypes. However, in order to predict the selective response at this locus, we need to assign genotypic values of fitness. This is done by using the norm of reaction for frogs reared at any particular temperature. Figure 14.12 shows three possible temperatures. Even though fitness for growth rates is constant regardless of temperature, the genotypic values of fitness vary as temperature changes. The AA genotype has the highest fitness at temperature 1 in Figure 14.12, the Aa genotype at temperature 2, and the aa genotype at temperature 3. Hence, the relative genotypic fitnesses vary dramatically over this temperature gradient. This discrepancy between phenotypic and genotypic fitnesses shows why it is essential to distinguish the cases in which fitness is assigned to a phenotype or trait versus the cases in which fitness is assigned to genotypes. Fitnesses alone do not determine the evolutionary response to natural selection. Another important factor in determining that response is population structure. We will assume that the species is


distributed over this temperature regime such that there is random mating in any local area, the local densities are large so that drift can be ignored, and there is substantial isolation-by-distance such that local allele frequencies reflect the local selective conditions (an extreme gradient model). Under these assumptions, selection is expected to drive the local allele frequency to one (fixation for A) whenever the fitness of AA is greater than the fitness of Aa and aa. Combining Eqs. (14.15) and (14.16), this occurs whenever:

$$1 - \alpha(\beta - a - cT)^2 > 1 - \alpha(\beta - cT)^2 \;\Rightarrow\; T < \frac{2\beta - a}{2c} \tag{14.17}$$

Similarly, selection will cause the local fixation of the a allele whenever:

$$1 - \alpha(\beta + a - cT)^2 > 1 - \alpha(\beta - cT)^2 \;\Rightarrow\; T > \frac{2\beta + a}{2c} \tag{14.18}$$
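As a numerical check on inequalities (14.17) and (14.18), the following Python sketch evaluates the genotypic growth rates of Eq. (14.15) and the fitness function of Eq. (14.16) at several temperatures. The parameter values (a = 2, c = 1, α = 0.01, β = 10) and all function names are illustrative assumptions; they are not taken from De Jong (1988) or Berven et al. (1979).

```python
def growth_rates(T, a, c):
    """Genotypic values of growth rate at temperature T, Eq. (14.15)."""
    return {"AA": a + c * T, "Aa": c * T, "aa": -a + c * T}


def phenotype_fitness(P, alpha, beta):
    """Fitness of an individual with growth rate P, Eq. (14.16)."""
    return 1.0 - alpha * (beta - P) ** 2


def genotype_fitnesses(T, a, c, alpha, beta):
    """Genotypic fitnesses at temperature T (zero environmental variance)."""
    return {g: phenotype_fitness(P, alpha, beta)
            for g, P in growth_rates(T, a, c).items()}


def fixation_thresholds(a, c, beta):
    """Temperature below which A is fixed, and above which a is fixed."""
    return (2 * beta - a) / (2 * c), (2 * beta + a) / (2 * c)


if __name__ == "__main__":
    a, c, alpha, beta = 2.0, 1.0, 0.01, 10.0  # hypothetical values
    low, high = fixation_thresholds(a, c, beta)
    print(f"A fixed for T < {low}; a fixed for T > {high}")
    for T in (8.0, 10.0, 12.0):
        w = genotype_fitnesses(T, a, c, alpha, beta)
        print(f"T = {T}: fitnesses {w}, fittest genotype {max(w, key=w.get)}")
```

With these assumed values the fittest genotype shifts from AA to Aa to aa as temperature rises, even though the phenotypic fitness function itself never changes.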

[…] $T > \frac{2\beta + a}{2c}$, the phenotypic response follows the norm of reaction for the aa genotype, the sole genotype in the population under those conditions. When a balanced polymorphism is possible, there are both genetic and phenotypic clines but in opposite directions. The norms of reaction for all the genotypes are indicated by the thin dashed lines. Source: De Jong (1988). © Springer Nature.
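The opposite directions of the genetic and phenotypic clines in the polymorphic zone can be illustrated numerically. The sketch below is not taken from De Jong (1988). It assumes the standard constant-fitness result that, under random mating with heterozygote superiority, the equilibrium frequency of A is (w_Aa − w_aa)/(2w_Aa − w_AA − w_aa), and it uses hypothetical parameter values and function names.

```python
def equilibrium_p(T, a, c, alpha, beta):
    """Local equilibrium frequency of allele A at temperature T."""
    def w(P):  # phenotypic fitness function of Eq. (14.16)
        return 1.0 - alpha * (beta - P) ** 2

    w_AA, w_Aa, w_aa = w(a + c * T), w(c * T), w(-a + c * T)
    if w_AA >= w_Aa:   # AA at least as fit as Aa: A goes to fixation
        return 1.0
    if w_aa >= w_Aa:   # aa at least as fit as Aa: a goes to fixation
        return 0.0
    # Heterozygote superiority: standard balanced-polymorphism equilibrium
    return (w_Aa - w_aa) / (2 * w_Aa - w_AA - w_aa)


def mean_growth_rate(T, a, c, alpha, beta):
    """Mean growth rate of the local population at its equilibrium."""
    p = equilibrium_p(T, a, c, alpha, beta)
    q = 1.0 - p
    return p * p * (a + c * T) + 2 * p * q * (c * T) + q * q * (-a + c * T)


if __name__ == "__main__":
    a, c, alpha, beta = 2.0, 1.0, 0.01, 10.0  # hypothetical values
    for T in (8.0, 9.0, 9.5, 10.0, 10.5, 11.0, 12.0):
        p = equilibrium_p(T, a, c, alpha, beta)
        g = mean_growth_rate(T, a, c, alpha, beta)
        print(f"T = {T:5.1f}  p(A) = {p:.3f}  mean growth = {g:.3f}")
```

Under these assumptions the frequency of A falls from one to zero across the polymorphic zone, and the mean growth rate there declines with temperature (it works out to 2β − cT), even though every genotype's own norm of reaction rises with temperature.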

Coarse-grained Temporal Heterogeneity

Just as a population can move through space via the dispersal of its constituent individuals, a population can also move through time via acts of reproduction to create generation after generation. The environment can change as a population moves through time such that different generations experience different environments, and because fitness is a genotype-by-environment interaction, the fitnesses associated with particular genotypes can also change across the generations. Gene pools do not change instantaneously in response to natural selection in a changed environment, but rather at a rate proportional to the magnitude of the average excesses of fitness of the gametes. As a result, whenever there is a significant environmental change over time, there is usually a time lag before the gamete frequencies can fully adjust. These time lags in turn are strongly influenced by the genetic architecture. We saw in Chapter 11 that there was a rapid increase in S alleles at the β-Hb locus after the introduction of the Malaysian agricultural complex, but there was hardly any initial response at all in the C allele frequency. As discussed in Chapter 11, this difference in the relative time lags to the altered environment was due to the initial allele frequencies, population structure, and the details of genetic architecture. In this particular case, the critical


feature of the genetic architecture emerges from the fact the S allele behaves as a dominant allele for malarial resistance, whereas the C allele behaves as a recessive allele. These features of genetic architecture, when coupled with initial rare allele frequencies and random mating, lead to orders of magnitude differences in the initial adaptive response to the temporal change to a malarial endemic environment associated with the introduction of the Malaysian agricultural complex. Recall also that genetic architecture includes the nature of the genotype to phenotype relationship, which itself can be directly altered by an environmental change. Thus, we already saw in Chapter 8 that the S allele is a recessive allele for viability in the non-malarial environment, but an overdominant allele for viability in the malarial environment. Evolution under coarse-grained temporal heterogeneity can become complex and difficult to predict because the adaptive response depends upon a potentially changing genetic architecture, population structure, and historical factors (e.g. the initial composition of the gene pool). The ever-present time lags in evolutionary response to temporal changes in the environment also mean that the current genetic state of a population is not always well adapted to the current environment. A past environment can continue to affect the genetic composition of a population for long periods of time. We also see this with the S allele. Most African–Americans are no longer subject to death via malaria nor have they been for centuries, yet the S allele still persists in high frequency, although its frequency has been reduced (Table 12.1). Thus, the key to understanding the current high frequency of the S alleles in African–Americans lies in an understanding of the past environments experienced by this population and not just the current environment. 
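The contrast between an initially rare allele that is dominant for a fitness advantage and one that is recessive can be made concrete with the standard one-locus recursion for viability selection under random mating. The sketch below is a generic two-allele illustration, not a model of the three-allele β-Hb system; the selection coefficient (s = 0.1), starting frequency (0.001), and function names are hypothetical.

```python
def next_allele_frequency(p, w_AA, w_Aa, w_aa):
    """Frequency of A after one generation of viability selection
    followed by random mating."""
    q = 1.0 - p
    w_bar = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return (p * p * w_AA + p * q * w_Aa) / w_bar


def generations_to_reach(p0, target, fitnesses, max_generations=1_000_000):
    """Generations needed for A to rise from p0 to at least target."""
    p, t = p0, 0
    while p < target and t < max_generations:
        p = next_allele_frequency(p, *fitnesses)
        t += 1
    return t


if __name__ == "__main__":
    s, p0 = 0.1, 0.001                   # hypothetical values
    dominant = (1 + s, 1 + s, 1.0)       # advantage expressed in AA and Aa
    recessive = (1 + s, 1.0, 1.0)        # advantage expressed in AA only
    for label, w in (("dominant", dominant), ("recessive", recessive)):
        dp = next_allele_frequency(p0, *w) - p0
        t = generations_to_reach(p0, 0.1, w)
        print(f"{label}: first-generation change {dp:.2e}, "
              f"generations to reach 0.1: {t}")
```

Because a rare recessive advantage is expressed only in the very rare homozygotes, its initial response is smaller by roughly a factor of 1/p than that of a dominant advantage of the same size.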
Another difficulty in predicting evolution under coarse-grained temporal variation is that there are many different types of temporal variation that can have distinct evolutionary impacts. Therefore, we will examine several types of temporal variation, starting with seasonal or cyclical variation between generations.

Seasonal and Cyclical Variation

Time lag effects can be particularly strong for coarse-grained cyclical temporal variation. For example, the twin-spotted ladybug beetle, Adalia bipunctata, has at least two generations per year in Germany (Timofeef-Ressovsky 1940). One generation hibernates over winter as adults and comes out in the spring. The second generation lives over the summer and into the autumn. There is also a genetically based color polymorphism in this species, with red and black forms. Populations of these beetles were monitored near Berlin, Germany, to reveal that the black forms survive better in the summer than the red forms, but the red forms survive hibernation much better than the black forms. This seasonal reversal of viabilities results in an annual cycle such that the red forms constitute 63.4% of the population in April (the beetles emerging from hibernation), whereas the black forms predominate by autumn, being some 58.7% of the population in October. Note that the red form is most common in the spring, just as the environmental conditions favoring the black forms are beginning. By autumn, the black forms predominate, yet it is the red form that is better adapted to the hibernation phase that will soon commence. Thus, the time lags inherent in any evolutionary response can yield maladaptive consequences. Insight into the evolutionary implications of coarse-grained seasonal selection can be obtained through a simple one-locus, two-allele model (Hoekstra 1975). In most models in population genetics, the basic temporal unit is a point in the life cycle at one generation to the corresponding point in the next generation. However, for a cyclical selection model, Hoekstra chose as his basic unit one complete cycle of the environmental changes, which corresponds to more than one generation in the coarse-grained case. The special case of a cycle of two environments and two generations (such as with the ladybug beetles) is shown in Table 14.3.
Notice that the frequencies of the genotypes


Table 14.3 Hoekstra’s (1975) model of coarse-grained cyclical selection at one locus with two alleles (A and a) over a two-environment cycle, with each environment experienced by a different generation.

| Genotype | AA | Aa | aa |
| --- | --- | --- | --- |
| Zygotic frequency at beginning of cycle | $p^2$ | $2pq$ | $q^2$ |
| Fitness in environment 1 | $v_1$ | $1$ | $w_1$ |
| Genotype frequency after selection | $p^2 v_1/\bar{w}_1$ | $2pq/\bar{w}_1$ | $q^2 w_1/\bar{w}_1$ |
| Zygotic frequency at second generation | $p^2 (v_1 p + q)^2/\bar{w}_1^2$ | $2pq\,(v_1 p + q)(w_1 q + p)/\bar{w}_1^2$ | $q^2 (w_1 q + p)^2/\bar{w}_1^2$ |
| Fitness in environment 2 | $v_2$ | $1$ | $w_2$ |
| Genotype frequency after one cycle | $p^2 w_{AA}/\bar{w}$ | $2pq\,w_{Aa}/\bar{w}$ | $q^2 w_{aa}/\bar{w}$ |

Where:

$$\bar{w}_1 = p^2 v_1 + 2pq + q^2 w_1$$

$$w_{AA} = v_2 (v_1 p + q)^2 = p^2 v_2 v_1^2 + 2pq\,v_2 v_1 + q^2 v_2$$

$$w_{Aa} = p^2 v_1 + 2pq\,\frac{v_1 w_1 + 1}{2} + q^2 w_1$$

$$w_{aa} = p^2 w_2 + 2pq\,w_2 w_1 + q^2 w_2 w_1^2$$

$$\bar{w} = p^2 w_{AA} + 2pq\,w_{Aa} + q^2 w_{aa}$$

Source: Modified from Hoekstra (1975).

after one complete cycle are of the same form as the standard single generation models if we use the cycle fitnesses $w_{AA}$, $w_{Aa}$, and $w_{aa}$. However, these cycle fitnesses are not the standard single generation fitnesses, but rather are nonlinear functions of the fitnesses that occur in both environments in the cycle as weighted by the zygotic genotype frequencies at the beginning of the cycle. Hence, much biological complexity is buried in these seemingly simple equations. Fortunately, we already have the tools to reveal that complexity. Note that the cycle fitnesses are all of the form $w_i = p^2\omega_{i2} + 2pq\,\omega_{i1} + q^2\omega_{i0}$ when we let i = 2 correspond to AA, i = 1 to Aa, and i = 0 to aa. This mathematical form is identical to that of the model of competitive selection of Cockerham et al. (1972) given in Table 13.5. Thus, although these two models deal with different biological situations, they end up having identical mathematical forms. This means that all the results inferred from the frequency-dependent model of competition can be applied to this model of cyclical selection. For example, by simply equating the fitness components in the cyclical model shown at the bottom of Table 14.3 to the corresponding fitness components shown at the bottom of Table 13.5, we have from inequality (13.23) that the conditions for protecting a polymorphism when there is no dominance or recessiveness are:

$$v_1 > v_2 v_1^2 \;\Rightarrow\; v_1 v_2 < 1, \qquad w_1 > w_2 w_1^2 \;\Rightarrow\; w_1 w_2 < 1 \tag{14.21}$$

Inequalities (14.21) mean that the polymorphism is protected when the geometric mean of the homozygote fitnesses over the environment cycle is less than that of the heterozygote. Because cyclical selection is inherently frequency dependent, it also has the potential for multiple equilibria (and hence, the initial state of the gene pool can influence the evolutionary outcome), for violating Fisher’s fundamental theorem, and for displaying chaotic dynamic behavior. Bergland et al. (2014) studied a population of Drosophila melanogaster in Pennsylvania in the United States with spring and fall collections over three consecutive years (2009–2011).


D. melanogaster has about 10 generations over the summer and about 1 or 2 over the winter, so seasonal variation is coarse grained in this species. Bergland et al. surveyed their samples for about half a million SNPs and identified about 1750 SNPs that showed large and consistent seasonal shifts in frequency. These seasonal SNPs were scattered throughout the genome and not strongly associated with inversions, implying a polygenic basis to seasonal adaptation. Simulations of this population indicated that seasonal SNPs are much more likely to maintain the polymorphic state than neutral loci, and comparisons with the sister species Drosophila simulans indicated an enrichment of old polymorphisms. These observations support the conclusion of Hoekstra (1975) that seasonal selection broadens the range of conditions for protecting polymorphisms.
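The two-generation cycle of Table 14.3 and the protection conditions of inequalities (14.21) can be checked numerically. The following Python sketch compares the cycle-fitness formulation with two successive single-generation steps and then tests whether each allele increases when rare. The fitness values (v1 = 1.5, w1 = 0.6, v2 = 0.6, w2 = 1.5) and the function names are hypothetical and were chosen so that neither generation shows heterozygote superiority.

```python
def select_and_mate(p, v, w):
    """Frequency of A after one generation of viability selection with
    fitnesses v (AA), 1 (Aa), and w (aa), followed by random mating."""
    q = 1.0 - p
    w_bar = p * p * v + 2 * p * q + q * q * w
    return (p * p * v + p * q) / w_bar


def cycle_fitnesses(p, v1, w1, v2, w2):
    """Cycle fitnesses w_AA, w_Aa, w_aa from the bottom of Table 14.3."""
    q = 1.0 - p
    w_AA = v2 * (v1 * p + q) ** 2
    w_Aa = p * p * v1 + p * q * (v1 * w1 + 1.0) + q * q * w1
    w_aa = w2 * (w1 * q + p) ** 2
    return w_AA, w_Aa, w_aa


def one_cycle(p, v1, w1, v2, w2):
    """Frequency of A after one complete two-environment cycle."""
    q = 1.0 - p
    w_AA, w_Aa, w_aa = cycle_fitnesses(p, v1, w1, v2, w2)
    w_bar = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return (p * p * w_AA + p * q * w_Aa) / w_bar


def is_protected(v1, w1, v2, w2):
    """Inequalities (14.21): both homozygote fitness products below one."""
    return v1 * v2 < 1.0 and w1 * w2 < 1.0


if __name__ == "__main__":
    params = (1.5, 0.6, 0.6, 1.5)  # hypothetical v1, w1, v2, w2
    v1, w1, v2, w2 = params
    stepwise = select_and_mate(select_and_mate(0.3, v1, w1), v2, w2)
    print("cycle formula:", one_cycle(0.3, *params), "stepwise:", stepwise)
    print("protected:", is_protected(*params))
    print("A rare:", one_cycle(0.01, *params), "a rare:", one_cycle(0.99, *params))
```

In this example AA is favored in the first environment and aa in the second, yet both alleles increase when rare because each homozygote's product of fitnesses over the cycle is less than one.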

Random or Frequent Temporal Variation

Grant and Grant (2002) executed a long-term study from 1972 to 2001 on two populations of Darwin’s finches, Geospiza fortis and G. scandens, on the Galapagos island of Daphne Major. They performed detailed studies that allowed them to estimate standardized selection differentials based on viability for several important morphological traits, as shown in Figure 14.14. Although these fitnesses are assigned to phenotypes, we now know the genetic architecture underlying many of these traits. For example, beak size is largely controlled by the HMGA2 gene in these species (Lamichhaney et al. 2016a), so selection on beak size would be expected to translate into selection on the HMGA2 gene. As can be seen in Figure 14.14, there is much fluctuation in both the intensity and direction of selection on all phenotypes in both species. Indeed, Grant and Grant (2002, p. 707) concluded “the long-term evolution is unpredictable because environments, which determine the directions and magnitudes of selection coefficients, fluctuate unpredictably.” One might think that unpredictable fitness fluctuations with no direction or trend would “average out” over many generations and therefore have little or no evolutionary impact. However, this is not the case. Consider the fixation probability of a new mutation with positive selection coefficient s given by Eq. (12.17). In a fluctuating environment, Danino and Shnerb (2018) have shown that the probability of fixation of a new mutant is affected in a complex manner, sometimes increasing the probability of fixation and sometimes decreasing it. In general, the fixation probability of a mutant that on the average is beneficial tends to decrease with increasing environmental stochasticity. Cvijovic et al. (2015) explain this in terms of increasing stochasticity bringing the mutant more into the domain of an effectively neutral mutation.
Recall from Chapter 5 that Ohta’s (1976) nearly neutral mutation theory showed that a mutant with a selection coefficient sufficiently small relative to the force of genetic drift behaves effectively like a neutral mutation, and hence selection does not enhance its fixation probability. Cvijovic et al. show a similar phenomenon with random fluctuations in selection, which also make selection less effective. This also means that mutants subject to random selective fluctuations often tend to persist longer in the population, increasing the level of polymorphism, as also shown by Dean et al. (2017). Selective fluctuations also help maintain polymorphisms when alleles are sometimes favored and sometimes selected against. Troth et al. (2018) showed that in populations of monkey flowers (Mimulus guttatus) there are fitness fluctuations and trade-offs between seed-set and the timing of reproduction depending upon the amount of rainfall, which varies from year to year. These fluctuations helped maintain polymorphisms, particularly in larger populations. Haldane and Jayakar (1963) derived some conditions for protected polymorphisms in a temporally fluctuating environment. Let A be a dominant allele in a random mating population such that the relative fitness of the dominant phenotype for A− is set to 1 every generation. Let the relative fitness of the recessive phenotype for aa be wi for generation i, which can fluctuate with each

[Figure 14.14: six panels of standardized selection differentials plotted against year (axis ticks 73 to 98). Panels (a) to (c), G. fortis; panels (d) to (f), G. scandens. Panel rows: Body Size Selection and Beak Size Selection (axis ends labeled L and S), and Beak Shape Selection (axis ends labeled Pointed and Blunt).]

Figure 14.14 Standardized selection differentials over time for two populations of Darwin’s finches, calculated for each sample surviving from one year to the next. Positive values indicate selection for large size or pointed beaks. Source: Grant and Grant (2002).© 2002 The American Association for the Advancement of Science.


generation. They showed that the approximate conditions for a protected polymorphism in this case are:

$$\frac{1}{n}\sum_{i=1}^{n} w_i > 1 \quad \text{and} \quad \prod_{i=1}^{n} w_i < 1 \tag{14.22}$$

The first condition is that the arithmetic mean fitness of the recessive phenotype must be greater than that of the dominant phenotype, and the second condition is that the geometric mean fitness of the recessive phenotype must be less than that of the dominant phenotype. Dropping the assumption of dominance, they then let vi be the fitness for AA at generation i, wi the fitness for aa at generation i, and 1 the fitness of the Aa genotype for every generation. The approximate conditions for a protected polymorphism are now:

$$\prod_{i=1}^{n} v_i < 1 \quad \text{and} \quad \prod_{i=1}^{n} w_i < 1 \tag{14.23}$$

Both of the above conditions are broader than those needed to maintain a polymorphism under a constant fitness model. Indeed, it is impossible for natural selection to maintain a polymorphism under complete dominance in a constant fitness model, but condition (14.22) shows that polymorphism can be maintained under complete dominance with coarse-grained temporal fluctuations. In general, a polymorphism in a constant fitness model requires that the arithmetic mean fitness of the homozygotes be less than the arithmetic mean fitness of the heterozygote (e.g. the balanced polymorphism of the A and S alleles discussed in Chapter 11). Inequalities (14.23) state that the geometric mean fitnesses over time of the homozygotes must be less than one (the relative fitness of the heterozygote) for a polymorphism to be protected, which is broader than the arithmetic mean condition. Carja et al. (2013) show that this geometric mean condition is generally valid for coarse-grained temporal heterogeneity. Condition (14.23) can be satisfied even if there is not a single generation that displays heterozygote superiority over both homozygotes. Hence, coarse-grained temporal heterogeneity tends to broaden the conditions for polymorphism.
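Conditions (14.22) and (14.23) are simple to evaluate for any sequence of generation-specific fitnesses. The Python sketch below does so for a few hypothetical fitness sequences; the function names and numerical values are illustrative assumptions, not data from Haldane and Jayakar (1963).

```python
import math


def protected_with_dominance(w_aa):
    """Condition (14.22): fitness of A- is 1 every generation and w_aa
    lists the fitness of the recessive phenotype in each generation."""
    arithmetic_mean = sum(w_aa) / len(w_aa)
    return arithmetic_mean > 1.0 and math.prod(w_aa) < 1.0


def protected_without_dominance(v_AA, w_aa):
    """Condition (14.23): fitness of Aa is 1 every generation; v_AA and
    w_aa list the homozygote fitnesses in each generation."""
    return math.prod(v_AA) < 1.0 and math.prod(w_aa) < 1.0


if __name__ == "__main__":
    # Recessive phenotype alternately disfavored and favored (hypothetical)
    fluctuating = [0.5, 1.6] * 10
    print("fluctuating, dominance:", protected_with_dominance(fluctuating))

    # Constant fitnesses never satisfy (14.22)
    print("constant 1.05:", protected_with_dominance([1.05] * 20))
    print("constant 0.95:", protected_with_dominance([0.95] * 20))

    # No generation has both homozygotes below the heterozygote (hypothetical)
    v = [1.5, 0.6] * 10
    w = [0.6, 1.5] * 10
    any_overdominant = any(vi < 1.0 and wi < 1.0 for vi, wi in zip(v, w))
    print("any overdominant generation:", any_overdominant)
    print("protected:", protected_without_dominance(v, w))
```

The last example satisfies condition (14.23) even though no single generation shows heterozygote superiority, because each homozygote's product of fitnesses over time is less than one.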

Sporadic, Recurrent Environments

Some environments are relatively constant for many generations, but then an extreme environmental change occurs such as a drought, famine, or flood. Such sporadic but extreme conditions can still have a large evolutionary impact because of their selective intensity. For example, there was an extreme cold event during the winter of 2013–2014 in the southeastern United States due to a weakening of the polar vortex. Campbell-Staton et al. (2017) had taken samples from southern Texas populations of the green anole lizard, Anolis carolinensis, before and after this cold event. They found a significant increase in cold tolerance and identified 14 genomic regions that had differentiated over this short time interval, indicating a strong selective response in this generation exposed to an extreme environment. Sporadic events can have a lasting impact on a population when selective severity is coupled with time lag effects and weak selection under normal conditions. Haldane and Jayakar (1963) used their models to show how a trait that is normally mildly selected against but that is strongly selected for about once in every 20 generations could persist in high frequencies in a population. For example, Anolis lizards living on Caribbean islands are sporadically hit by powerful hurricanes. Studies on populations before and after hurricanes indicate that there is strong selection for larger toepads that help prevent the lizards from being blown away by the hurricane-force winds (Donihue et al. 2020).


Donihue et al. also performed an extensive survey of 188 Anolis species throughout the Neotropics to demonstrate that toepad area positively correlates with hurricane activity over the past 70 years, thereby indicating a long-term evolutionary effect of these short-term but intense episodes of selection. Another possible example of intense, sporadic selection relates to the trait of type 2 diabetes mellitus in humans. Type 2 diabetes is an adult onset alteration in insulin secretion and insulin resistance (that is, cells do not respond effectively to insulin, a hormone responsible for mediating the uptake by cells of glucose from the blood). Adult onset diabetes is one of the more common diseases affecting humanity, with a global prevalence among adults over 18 years of age of 8.5% in 2014 (http://www.who.int/news-room/fact-sheets/detail/diabetes). Both candidate loci and GWAS approaches have shown that genetic variation at many loci influences the risk for adult onset diabetes. Many of these loci display the signatures associated with positive selection (Ayub et al. 2014; Chang et al. 2011; Fraser 2013; Fullerton et al. 2002; Klimentidis et al. 2011; Minster et al. 2016; Segurel et al. 2013; Stead et al. 2003; Vander Molen et al. 2005; Vatsiou et al. 2016). Interestingly, there is no global signal of enrichment for positive selection when the risk loci are considered collectively. Instead, the signatures of selection at particular loci tend to be local, indicating that much of the positive selection for risk loci occurred only in recent human evolutionary history at a local level (Ayub et al. 2014; Minster et al. 2016; Sandor et al. 2017). This evidence for intense selection is not expected on a late age-of-onset disease such as type 2 diabetes, for reasons that will be detailed in Chapter 15. Basically, the deleterious effects of this disease do not occur until after most reproduction has occurred, so its selective impact should be weak. 
So why are the alleles that contribute to a deleterious disease so common in human populations and show evidence for positive selection despite being mildly selected against nearly every generation? Neel (1962) suggested a possible answer to this question: the thrifty genotype hypothesis. This hypothesis postulates that the same genetic states that predispose one to diabetes also result in a quick insulin trigger even when the phenotype of diabetes is not expressed. Such a quick trigger is advantageous when individuals suffer periodically from famines since it would minimize renal loss of precious glucose and result in more efficient food utilization. For example, Minster et al. (2016) discovered a variant in the gene CREBRF in Samoans that increases the risk for diabetes and displays a signature of positive selection. This risk variant also decreased energy use and increased fat storage in an adipocyte cell model, supporting the thrifty genotype hypothesis. Famines increase mortality mostly in children and the elderly, and death in children is often associated with intense selective forces (Chapter 15). Moreover, even the survivors of a famine suffer from reduced female fertility, increased risk for infectious diseases, and epigenetic changes that result in health risks long after the famine is over (Qasim et al. 2018). Hence, famines would represent an intense selective force on the genes responsible for the pre-diabetic “thrifty” phenotype. When food is more plentiful, selection against these genotypes would be mild because the age of onset of the diabetic phenotype is typically after most reproduction (see Chapter 15) and because the high-sugar, high-calorie diets found in modern societies that help trigger the diabetic phenotype are very recent in human evolutionary history. 
Sporadic intense selection favoring alleles associated with the thrifty phenotype coupled with many generations of weak selection against these alleles are exactly the conditions that Haldane and Jayakar (1963) explicitly showed would maintain polymorphisms even though these genes appear to be deleterious in most generations. We now know that famines have been a sporadic but recurring occurrence in human evolution. Famine or extreme hunger leads to specific epigenetic changes in the human genome that are marked by persistent DNA methylation patterns that are detectable even with ancient DNA. These epigenetic signals have been found in the DNA from ancient hunter–gatherers (Gokhman et al. 2017). Interestingly, Neandertals also had diabetes risk haplotypes, and at least one has introgressed
into modern humans and is in high frequency in some current populations subject to historic famines (Williams et al. 2014). Famines likely became more common with the transition to agriculture that began about 12 000 years ago (Cochran and Harpending 2009), explaining the recency of many of the positive selection signatures for diabetes risk alleles. These observations indicate that famines have long been a sporadic selective force in humans and have maintained “thrifty” polymorphisms since at least the time of the Neandertals. The impact of this selection is particularly apparent in current human populations that have a history of famines over the last several centuries (Neel et al. 1998; Chen et al. 2012; Diamond 2003; Minster et al. 2016). The Pima Indians in Arizona are one such population. The Pimas were formerly hunter–gatherers and farmers who used irrigation to raise a variety of crops, but principally maize. However, they were living in an arid part of the country, and their maize-based agricultural system was subject to periodic failures during times of drought. This was accentuated in the late nineteenth century when European–American immigrants diverted the headwaters of the rivers used by the Pimas for irrigation, resulting in widespread starvation. With the collapse of their agricultural system, the surviving Pimas were dependent on a government-dispensed diet that consisted of high-fat, highly refined foods. By the 1950s, 37% of the men and 54% of the women among adult Pima Indians suffered from type 2 diabetes, one of the highest incidences known in human populations. Under the thrifty genotype hypothesis, the extremely high incidence of diabetes in the Pima Indians and other populations is due to their recent evolutionary history of high mortality from starvation.
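The Haldane and Jayakar (1963) argument invoked above for thrifty alleles can be illustrated numerically. The sketch below assumes the simplest case usually associated with their result: a fully recessive genotype aa whose fitness fluctuates across generations relative to a fitness of 1 for AA and Aa, with polymorphism protected when the arithmetic mean of the aa fitness exceeds 1 while its geometric mean is below 1. The fitness schedule (19 mildly unfavorable generations followed by one "famine" generation), the numerical values, and the function names are all hypothetical choices made for illustration; they are not estimates for any human population.

```python
import math

def next_q(q, w_aa):
    """One generation of viability selection at a locus with alleles A and a.

    AA and Aa have fitness 1; the recessive genotype aa has fitness w_aa.
    Returns the frequency of allele a after selection.
    """
    p = 1.0 - q
    mean_w = 1.0 - q * q + q * q * w_aa
    return (p * q + q * q * w_aa) / mean_w

def simulate(q0, fitnesses, cycles):
    """Iterate the recursion through repeated copies of a fitness cycle."""
    q = q0
    for _ in range(cycles):
        for w in fitnesses:
            q = next_q(q, w)
    return q

# Hypothetical schedule: aa is mildly disadvantaged in 19 ordinary
# generations and strongly favored in one famine generation.
CYCLE = [0.95] * 19 + [2.2]

arithmetic_mean = sum(CYCLE) / len(CYCLE)
geometric_mean = math.exp(sum(math.log(w) for w in CYCLE) / len(CYCLE))

# Start the a allele either rare or nearly fixed and follow it for 200 cycles.
q_from_rare = simulate(0.01, CYCLE, cycles=200)
q_from_common = simulate(0.99, CYCLE, cycles=200)

print(f"arithmetic mean = {arithmetic_mean:.4f}, geometric mean = {geometric_mean:.4f}")
print(f"q after 200 cycles: from 0.01 -> {q_from_rare:.4f}, from 0.99 -> {q_from_common:.4f}")
```

Under these assumed values the allele moves away from both loss and fixation: it increases when rare and decreases when nearly fixed, even though aa is at a disadvantage in 19 of every 20 generations. Recovery from rarity is slow because a rare recessive allele is seldom exposed to selection.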
However, most human populations have experienced some famine over a time scale of centuries, so the thrifty genotype hypothesis also explains why diabetes is so common in human populations in general, although not at the rates seen in the Pima Indians. Moreover, many of the same genes associated with diabetes risk are also associated through pleiotropy with risk for hypertension (high blood pressure) and obesity (Neel et al. 1998), so the thrifty genotype hypothesis also provides an explanation for these other common human maladies. A variant of the thrifty genotype hypothesis also explains why humans are so prone to coronary artery disease (CAD). CAD is initiated by injuries to the endothelial lining of the coronary arteries, followed by the deposition of lipids from low-density lipoprotein particles. This results in an atherosclerotic plaque. As the plaque grows, it restricts blood flow and changes the mechanical characteristics of the artery wall. These events facilitate plaque rupture, which in turn induces clotting and partial or total blockage of the flow of blood to some heart muscle cells. Depending upon the extent and location of the blockages, symptoms range from mild pain to sudden death. CAD accounts for about one-third of total human mortality in western, developed societies, making it the most common cause of death. Both genetic and environmental factors contribute to this disease (e.g. the ApoE locus as discussed in Chapter 8). The lateness in life with which CAD typically occurs and the recentness of the environmental situation in which it is common (Western, developed societies) imply that it is unlikely that CAD itself has been subject to strong natural selection during human evolution. Rather, it is more likely that the genes that predispose one to CAD have effects on other traits that were subject to natural selection in past environments. 
Past human evolution has been characterized by the rapid and dramatic expansion of our brain and cognitive abilities. The development and maintenance of a large brain creates a high demand for cholesterol. Because the diet of early humans had much less cholesterol than the current diets of people living in developed countries, selection would favor those genotypes that were “thrifty” in their absorption and production of cholesterol (Mann 1998), and specifically the ApoE ε4 allele (Chapter 8) is hypothesized to be associated with such thrifty lipid genotypes (Corbo and Scacchi 1999). When human lifespan increased and the diet became high in fat, the thrifty ApoE ε4 genotypes led to increased risk for CAD (Stengard et al. 1996) and for other lipid-associated maladies such as Alzheimer’s disease (Reiman et al. 2001) and Parkinson’s disease (Zareparsi et al. 2002). Overall, we see that many of the
most common systemic diseases afflicting modern humans all seem to be due to natural selection operating on past sporadic but selectively intense environmental conditions. The incidences of many of the diseases and conditions mentioned in the previous paragraph are increasing at an alarming rate. For example, the incidence of type 2 diabetes has risen from 108 million in 1980 to 422 million in 2014 (http://www.who.int/news-room/fact-sheets/detail/diabetes). Although the thrifty genotype hypothesis explains the high risk that humans have for these conditions, such a large increase in only a few decades cannot possibly be due to an evolutionary response in the human gene pool. The answer is that during this time period, the eating and physical activity habits of many people changed in such a manner that more individuals with the risky genotypes were exposed to the environmental conditions that led to the expression of diabetes and obesity (Friedman 2003; Wahl et al. 2017). Diabetes and obesity, like other phenotypes, emerge from how genotypes interact with the environment (premise 3). When environments change over time, such interactions may lead to direct phenotypic alterations without any evolution in the gene pool. Such interactions are explicitly acknowledged in the concepts of norm of reaction and phenotypic plasticity (Chapter 10). We have already seen how such phenotypic plasticity can influence both the genetic and phenotypic response to a spatially varying environment with the example of the green frog cline and the model of de Jong (1988). Indeed, the de Jong model shown in Figure 14.12 is equally applicable to coarse-grained temporal heterogeneity. For example, suppose that global warming causes an increase in temperature over time for a frog population at a particular location. Then Figure 14.13 shows the genetic trajectory of the evolution of that population as it adapts to an increasing temperature.
Recall how this model warns us to be cautious in interpreting phenotypic clines over space. The same warning is applicable to interpreting phenotypic changes over time. For example, Figure 14.13 shows that it is possible for the most rapid phenotypic evolution to have occurred when there is no genetic evolution at all (when populations are fixed for one allele or the other), and the slowest phenotypic evolution to have occurred when there is rapid genetic evolution (when the populations are polymorphic in Figure 14.13). One should never automatically equate phenotypic change over time to evolutionary change. Unfortunately, such an equation is common in the evolutionary literature. For example, Eldredge and Gould (1972) noted fossil evidence for some organisms showing periods of morphological stasis that were punctuated by short bursts of rapid morphological change. Their theory of punctuated equilibrium states that the periods of rapid phenotypic change in otherwise phenotypically static lineages are due to rapid bursts of evolutionary change. However, if something similar to the situation shown in Figure 14.13 were occurring, the periods of rapid phenotypic change would happen when there is little to no evolutionary change, and periods of relative phenotypic stasis would have occurred due to continual evolutionary adjustments. This by no means implies that all cases of rapid phenotypic change are associated with no evolutionary change, but it does serve to warn us that patterns are insufficient to infer processes. This occurs because there are often several different processes that can generate the same patterns. In this case, one could have rapid phenotypic change due to either rapid evolutionary change or due to rapid environmental change coupled with phenotypic plasticity. Both are biologically plausible explanations that can yield indistinguishable patterns at the phenotypic level.

Transitions to a New Long-term Environment

Sometimes a population experiences a transition to a new environment that represents a new long-term state to which the population must now adapt. This change is sometimes relatively abrupt, taking place over just one or a few generations. For example, the transition to the Malaysian agriculture complex (Chapter 11) could have been very rapid and created a new environment in tropical
Africa characterized by malaria becoming an endemic and common disease. Humans had to adapt to this new environment and are probably continuing to adapt even after about 100 generations as shown by the transient polymorphism of Hb-C (Figure 11.4). Sometimes the changes occur more slowly as a cumulative trend. This occurred for industrial melanism. As air pollution became increasingly worse in England after the start of the Industrial Revolution, many moth species gradually adapted to its consequences by increasing the frequency of melanic genes to near fixation in heavily polluted areas (Cook 2003), as shown for Leeds in the pre-1975 cline in Figure 14.7. More recently, starting in the latter half of the twentieth century, air pollution laws and their enforcement significantly reduced air pollution, and then the moths began to adapt to this second change in air quality by reducing the frequency of melanic moths (the 2002 cline in Figure 14.7). There is great concern about this type of coarse-grained temporal heterogeneity because human activities are making such environmental transitions common across the entire globe. First, humans are translocating, either intentionally or inadvertently, many species to new areas well outside their original species range where the species often encounters a radically new environment for the first time in its evolutionary history (Capinha et al. 2015). From the perspective of the translocated population, this is a coarse-grained temporal change. The outcomes of such introductions range from extinction to the translocated population becoming invasive, either immediately or through subsequent evolution. One highly successful invader is the fire ant, Solenopsis invicta. This species was inadvertently introduced to Alabama in the 1930s from its native range in South America and subsequently spread throughout much of the southeastern United States. Privman et al.
(2018) sampled native and invasive populations and used high-resolution genome scans for an FST outlier analysis to identify positively selected genomic regions, focusing especially upon regions positively selected in the invasive populations. They found many such regions, indicating that there has been much positive selection in the invasive population since its introduction. They specifically found recent positive selection on putative ion channel genes, which are implicated in neurological functions, on vitellogenin, which is a key regulator of development and caste determination, and on genes implicated in pheromonal signaling. All of these genes are candidates for influencing social behavior. Genes with signatures of positive selection were significantly more often those overexpressed in workers compared with queens and males, suggesting that worker traits are under stronger selection than queen and male traits. These results support the hypothesis that enhanced social cooperation may facilitate invasiveness in this ant species. Second, human activities are changing our global climate and transforming many ecosystems into agricultural and urban landscapes, which induce novel habitats, habitat loss, and habitat fragmentation. The transformations by human land use became global in impact by 3000 years ago (Stephens et al. 2019), and global climate change has become apparent more recently (Vince 2011). These consequences of human activities have the potential of changing the environment for virtually all species on this planet (Vince 2011). Some species respond to these temporal changes by range shifts to avoid or minimize the environmental changes. Although human activities have increased the ability of some species to disperse around the world (Capinha et al. 2015), for many other species human activities reduce the ability to move and shift their range in a meaningful fashion (Tucker et al. 2018).
In those cases, the possible outcomes are extinction (Román-Palacios and Wiens 2020) or adaptation, both epigenetic and genetic via natural selection. Adaptive evolution can also help mediate climate-driven range shifts (Diamond 2018). Hence, evolution is playing, and will continue to play, a critical role in the survival of species during the Anthropocene (Nadeau and Urban 2019). One of the many examples of adaptive evolution in response to human-induced environmental change is the work of Bay et al. (2018) on the genomic basis of adaptation to climate change in 21 populations of the North American migratory bird, the yellow warbler (Setophaga petechia).
They discovered isolation-by-distance in this species, but also many regions scattered across the genome that were strongly associated with environmental variables after accounting for population structure. The associations are particularly strong with precipitation, which is changing rapidly under climate change. Some of the strongest associations between genotype and climate were upstream of genes with a known function in avian behavior, dispersal, and migration. Bay et al. (2018) also used the current genomic associations with climate to create a genomic vulnerability metric that measures the mismatch between a population’s current gene pool and its predicted ideal gene pool under future climatic scenarios. These genomic vulnerabilities of populations were found to be correlated with current population status as determined by the North American Breeding Bird Survey. Populations with the greatest mismatch to climate change have experienced the largest population declines in the recent past, indicating the important role of genetic variation in the gene pool in mediating adaptive responses to climate change. Phenotypic plasticity often plays an important role when a population first encounters a novel environment. We saw this with the plastic response of humans to an environment of high calories and low activity that has resulted in an increased incidence of type 2 diabetes. When maladaptive phenotypes are induced by the novel environment, selection in the new environment may favor alleles that ameliorate or compensate for the initial maladaptive norm of reaction in the new environment, a process called genetic compensation (Grether 2014; Schlichting and Wund 2014). Counter-gradient selection is one example of genetic compensation. However, not all plastic responses are deleterious, and beneficial ones can play a critical role in a population’s ability to survive in the new environment.
Plasticity can allow the individuals to survive and reproduce in the novel environment and thereby establish a population that might not otherwise be able to persist. Given that we are now focusing on environmental changes that are long-term and persistent, evolutionary processes mediated by natural selection can genetically mold this initial plastic response. Recall that a phenotype is any measurable trait (Chapter 8). Since we can measure norms of reaction, phenotypic plasticity itself is a phenotype. Like any other phenotype, there can be genetic variation for the norm of reaction and resulting degree of plasticity among individuals in a population, and hence the trait of phenotypic plasticity can evolve. One example of such a phenomenon is genetic assimilation (Waddington 1957) in which selection acts upon heritable variation in phenotypic plasticity to turn a phenotype directly stimulated by an altered environment (plasticity) into a fixed phenotypic response no longer sensitive to the ancestral environmental triggers (assimilation). An example of genetic assimilation is found in ambystomid salamanders, such as Ambystoma tigrinum whose phylogeography was discussed in Chapter 7. Most amphibians have an aquatic larval phase followed by metamorphosis to a terrestrial adult phase. Metamorphosis in many amphibians is under the control of hormones from the hypothalamus, pituitary, and thyroid glands, with the hormone thyroxin (TH) being the primary, but not exclusive, mediator of the tissue responses that lead to metamorphosis through TH receptor proteins on the cells of responsive tissues (Crowner et al. 2019; Rose 1999). Nutrition, photoperiod, and temperature all affect endocrine activity. In particular, poor nutrition, darkness, and low temperature all tend to reduce the production of TH in ambystomid salamanders, resulting in phenotypic plasticity for the timing of metamorphosis. 
Indeed, metamorphosis can be completely prevented under appropriate environmental conditions, resulting in aquatic larval forms that become sexually mature and thereby bypass the terrestrial adult phase completely. Such sexually mature aquatic salamanders are called paedomorphs (Gould 1977). Many tiger salamanders are phenotypically plastic for the phenotypes of paedomorphic versus metamorphic adults.
The climatic changes associated with the end of the last glacial period not only profoundly influenced the phylogeography of the tiger salamander (Figure 7.12) but also probably influenced this salamander’s phenotypes through time. During the glacial period, fossils from the Kansas–Oklahoma area consist of giant paedomorphs of tiger salamanders, but as the temperature increased following the Pleistocene, metamorphic forms became more common. This change in phenotype frequency may or may not be due to genetic evolution; it certainly could be explained entirely in terms of phenotypic plasticity with no genetic evolution at all. Current populations of the tiger salamander from clade 4–2 (Figure 7.12) are still plastic for the phenotype of paedomorphy. In the Rocky Mountains, where ponds are permanent and have low temperatures, paedomorphs are frequent, and clade 4–2 represents a recent range expansion from this environment that favors the production of paedomorphs. Clade 4–2 produces paedomorphs throughout its current geographic range (Figure 7.12) in large, permanent ponds (Templeton 1994). However, the salamanders from clade 4–1 had a very different recent evolutionary history (Figure 7.12). They were fragmented from the western populations in the Pleistocene and lived in the Ozarks. The Ozarks are not high enough to induce cool temperatures and have few permanent ponds. Under these environmental conditions, paedomorphs are not expected nor are they found. However, the range of this clade has also expanded and now overlaps with that of clade 4–2 (Figure 7.12). When both types of salamanders are taken from permanent ponds, all paedomorphs turn out to be from clade 4–2 (Templeton 1994). Consequently, we see a puzzling pattern.
The clade 4–1 salamanders recently lived in environments that would make paedomorphy unlikely, but now even when they are living in large, permanent ponds that induce paedomorphy in clade 4–2 salamanders, they are incapable of becoming paedomorphs. In contrast, consider the closely related species, Ambystoma mexicanum. This species lives in permanent lakes in the mountainous region of Mexico. This environment favors paedomorphy, but even when these salamanders are placed in environments that favor metamorphosis in other tiger salamanders, A. mexicanum normally fails to undergo metamorphosis. We seem to have a strange, almost Lamarckian phenomenon: paedomorphy and metamorphosis are phenotypically plastic in some salamanders, but when one salamander population is placed in an environment that favors metamorphosis, it becomes genetically incapable of paedomorphy, whereas a second population found in an environment that favors paedomorphy becomes genetically incapable of metamorphosis. Somehow, prolonged exposure to the environment favoring a particular phenotypic response has become “genetically assimilated” and is now expressed (or not expressed) regardless of the environment. There is no strange evolutionary force working here as long as there is genetic variation for the degree of plasticity in the paedomorphic/metamorphic phenotype, that is, there is genetic variation in the norm of reaction. Voss et al. (2003) used a candidate locus approach (Chapter 10) to study the genetic basis of metamorphosis in hybrid populations made from laboratory crosses of A. mexicanum with A. tigrinum tigrinum (clade 4–1, the population that cannot produce paedomorphs) and discovered that two TH receptor loci were associated with these phenotypes, although in a manner that suggested strong epistasis with other, unmeasured loci. Voss et al. (2012) used TH treatments on paedomorphic A. 
tigrinum to identify three QTLs that vary in their responsiveness to TH and that affect the timing of metamorphosis. By delaying metamorphosis, body size is increased, which in turn often augments adult fitness. Hence, delayed metamorphosis can be selected under some environmental circumstances. This same pathway can be used to delay metamorphosis indefinitely, that is, paedomorphy. Suppose, for example, that most salamanders can only find temporary ponds, as is the case in the Ozarks. In such an environment, any salamander that failed to undergo metamorphosis would die
when the pond dried up. In a population that was genetically variable in its phenotypic responses to environmental cues, some salamanders would respond to a particular environment by developing into paedomorphs or delaying metamorphosis, whereas others would respond to the same environment by undergoing metamorphosis. In populations living in temporary ponds, there would be selection against all animals that give a paedomorphic or delayed-metamorphosis response to their immediate environment. Therefore, evolution should shift the responsiveness to the environment in the direction favoring metamorphosis (Eq. (9.22)). As long as there was heritable variation, the population would become less and less likely to produce paedomorphs even under environmental conditions that induced paedomorphs with a high probability in the ancestral population. A similar selective scheme could have been happening in A. mexicanum, but in this case living in a permanent lake favors the paedomorphs. Another possible evolutionary mechanism of genetic assimilation in these salamanders is the accumulation of neutral mutations. A. mexicanum now lives in an environment that always leads to paedomorphy. Hence, any genes that deal exclusively with metamorphosis could, under these environmental conditions, be neutral because they are not expressed in any functional fashion. A neutral locus still evolves (Chapter 5), and new, neutral mutations should eventually go to fixation. Since random mutations on average diminish functional capability, the metamorphic genes that have been rendered neutral are expected to eventually become fixed for non-functional and diminished functional mutations. Hence, when the animal is now put into an environment that would favor metamorphosis in the ancestral population, it can no longer do so because it lacks one or more functional genes for metamorphosis. 
These two hypotheses (selection against plasticity or reduced environmental sensitivity, and neutral evolution at loci rendered functionless by long-term non-expression of specific phenotypes) are not mutually exclusive, and both could contribute to the process of genetic assimilation. A new environment can induce novel phenotypic responses from the norms of reaction of the genotypes in the population that initially encounters this novel environment. Pál and Miklós (1999) have shown that even if the environment initially induces random phenotypic variants, the increased phenotypic variance nevertheless makes it more likely that some favorable phenotype will exist in the novel environment. This in turn can initiate the process of genetic assimilation and trigger a shift to a new adaptive peak that appeared in this novel environment. Bódi et al. (2017) demonstrate that phenotypic plasticity in laboratory microbial populations can induce much phenotypic heterogeneity in response to a deteriorating environment. As predicted by the model of Pál and Miklós (1999), this phenotypic heterogeneity promotes genetic adaptation over many generations, that is, phenotypic plasticity increases evolvability in a changing environment (Sommer 2020). These evolutionary tendencies are strengthened when the environmentally induced variation is itself subject to trans-generational epigenetic inheritance (Boskovic and Rando 2018). For example, suppose a novel environment induces new methylation patterns in the DNA that in turn affect DNA expression, which in turn affects phenotypic variation. Methylation patterns can sometimes be passed on to the next generation, and this initial non-genetic inheritance actually makes the process of genetic assimilation even more likely (Pál and Miklós 1999).
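The selective route to genetic assimilation described above for salamanders in temporary ponds can be sketched with the constant-fitness recursion of Chapter 11. The sketch assumes a single hypothetical locus with alleles P and M that shifts the norm of reaction; the genotype-specific probabilities of a paedomorphic response, the starting allele frequency, and the function name are illustrative inventions, not values taken from Voss et al. or from the text. It is a single-locus stand-in for the quantitative argument the text makes with Eq. (9.22), not a reproduction of that equation.

```python
def assimilate(p0, generations, risk, induce):
    """Follow allele P under selection against non-metamorphosing animals.

    risk[g]   : assumed probability that genotype g (PP, PM, MM) fails to
                metamorphose in a temporary pond and therefore dies.
    induce[g] : assumed probability that genotype g becomes a paedomorph in
                an inducing environment such as a cold, permanent pond.
    Returns a list of (frequency of P, mean inducible paedomorph rate).
    """
    p = p0
    history = []
    for _ in range(generations):
        q = 1.0 - p
        freqs = (p * p, 2.0 * p * q, q * q)  # random-mating zygotes
        inducible = sum(f * r for f, r in zip(freqs, induce))
        history.append((p, inducible))
        fitness = [1.0 - r for r in risk]  # survivors must metamorphose
        mean_w = sum(f * w for f, w in zip(freqs, fitness))
        p = (freqs[0] * fitness[0] + 0.5 * freqs[1] * fitness[1]) / mean_w
    return history

# Hypothetical norms of reaction for genotypes PP, PM, MM.
RISK_IN_TEMPORARY_POND = (0.20, 0.10, 0.00)
PAEDOMORPHY_WHEN_INDUCED = (0.90, 0.50, 0.10)

history = assimilate(0.9, 200, RISK_IN_TEMPORARY_POND, PAEDOMORPHY_WHEN_INDUCED)
first_p, first_rate = history[0]
last_p, last_rate = history[-1]
print(f"start: freq(P) = {first_p:.3f}, inducible paedomorph rate = {first_rate:.3f}")
print(f"end:   freq(P) = {last_p:.2e}, inducible paedomorph rate = {last_rate:.3f}")
```

With these assumed numbers the population starts out mostly able to produce paedomorphs when the environment calls for it, and ends up largely unable to do so, even though selection acted only in temporary ponds and never in the inducing environment itself.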

Fine-grained Heterogeneity

An individual often experiences environmental heterogeneity within its own lifetime. Because an individual can only be at one place at any given time, an individual experiences both spatial and temporal heterogeneity within its own lifetime as a temporal sequence. Hence, for purposes of
microevolutionary modeling, no distinction is necessary between fine-grained spatial and temporal heterogeneity. In many situations, fine-grained heterogeneity needs no special consideration as it can be folded into the constant fitness models given in Chapter 11. To see this, consider two extreme situations: the case in which every individual in the population experiences the same temporal sequence of fine-grained heterogeneity, and the case in which every individual experiences an independent sample of temporal sequences of environments within its lifetime. The first case would apply to the situation in which an organism has one generation per year, but in which seasonal variation influences the viabilities of the genotypes within the population. Thus, every individual in the population experiences the seasonal variation within its own lifetime, and every individual experiences exactly the same sequence of this seasonal heterogeneity. In the previous section, Hoekstra’s (1975) model of coarse-grained seasonal variation was given, and Hoekstra also modeled the case of fine-grained seasonal variation. Table 14.4 gives the two-season fine-grained model in which the fitness effects within a season are identical to those given in Table 14.3, the two-season coarse-grained model. Note that Table 14.4 is just the standard, constant fitness model (Figure 11.1) with the constant viabilities of $\ell_{AA} = w_{AA} = v_1 v_2$ and $\ell_{aa} = w_{aa} = w_1 w_2$. Hence, all of the equations and conclusions of Chapter 11 apply to this case of fine-grained seasonal variation. In particular, the frequency-dependent dynamics that emerge from the coarse-grained analogue of this model do not appear in the fine-grained case. Even for this simple two-season model, we get qualitatively different evolutionary dynamics for the coarse-grained and fine-grained versions. This shows that environmental grain greatly influences the microevolutionary process.
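The claim that the fine-grained two-season model of Table 14.4 collapses to a constant fitness model can be checked numerically. The sketch below applies viability selection twice within one generation, using the season-specific fitnesses of the table, and compares the result with a single round of selection using the product fitnesses $v_1 v_2$ and $w_1 w_2$. The allele frequency and fitness values are arbitrary illustrative numbers, and the function names are hypothetical.

```python
def select(freqs, fitness):
    """Apply one round of viability selection to genotype frequencies (AA, Aa, aa)."""
    weighted = [f * w for f, w in zip(freqs, fitness)]
    total = sum(weighted)  # mean fitness for this round
    return [x / total for x in weighted]

def two_seasons(p, v1, w1, v2, w2):
    """Fine-grained model: every individual passes through both seasons."""
    q = 1.0 - p
    zygotes = [p * p, 2.0 * p * q, q * q]
    after_season_1 = select(zygotes, [v1, 1.0, w1])
    return select(after_season_1, [v2, 1.0, w2])

def constant_fitness(p, w_AA, w_aa):
    """Standard one-round model with heterozygote fitness set to 1."""
    q = 1.0 - p
    return select([p * p, 2.0 * p * q, q * q], [w_AA, 1.0, w_aa])

# Arbitrary illustrative values: AA is favored in season 1, aa in season 2.
p, v1, w1, v2, w2 = 0.3, 1.2, 0.7, 0.6, 1.4

sequential = two_seasons(p, v1, w1, v2, w2)
collapsed = constant_fitness(p, v1 * v2, w1 * w2)

for name, a, b in zip(("AA", "Aa", "aa"), sequential, collapsed):
    print(f"{name}: two seasons = {a:.6f}, product fitnesses = {b:.6f}")
```

The two sets of genotype frequencies agree to rounding error because the season 1 mean fitness cancels when the season 2 frequencies are normalized, which is why only the products of the seasonal fitnesses matter in the fine-grained case.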
Now consider the second case in which each individual independently samples its own fine-grained temporal sequence of environments. This will induce fitness differences among individuals sharing the same genotype within a generation. Thus, we need to replace a constant fitness for genotype ij with a random fitness with mean $w_{ij}$ (now the average fitness of all individuals with genotype ij) and a variance of $\sigma_{ij}^2$. However, we have already modeled this situation as well in our constant fitness models of Chapter 11. Recall that the fundamental definition of fitness in population genetics is that of a genotypic value assigned to all individuals sharing a common genotype

Table 14.4 Hoekstra’s (1975) model of fine-grained cyclical selection at one locus with two alleles (A and a) over a two-environment cycle, with each environment experienced by all individuals within a single generation.

Genotype                                       AA                        Aa                 aa
Zygotic frequency at beginning of cycle        $p^2$                     $2pq$              $q^2$
Fitness in environment 1                       $v_1$                     $1$                $w_1$
Genotype frequency after selection in 1        $p^2 v_1/\bar{w}_1$       $2pq/\bar{w}_1$    $q^2 w_1/\bar{w}_1$
Fitness in environment 2                       $v_2$                     $1$                $w_2$
Genotype frequency after selection in 1 & 2    $p^2 w_{AA}/\bar{w}$      $2pq/\bar{w}$      $q^2 w_{aa}/\bar{w}$

Where: $\bar{w}_1 = p^2 v_1 + 2pq + q^2 w_1$; $w_{AA} = v_1 v_2$; $w_{aa} = w_1 w_2$; $\bar{w} = p^2 w_{AA} + 2pq + q^2 w_{aa}$.

Source: Modified from Hoekstra (1975).

(Chapter 11). The genotypic value for the phenotype of fitness is simply the average fitness for all individuals who share a common genotype. But there is nothing about the concept of genotypic value that requires that all individuals share exactly the same phenotypic value. Quite the contrary, in Chapter 8, we explicitly assumed that the individuals that share a common genotype do not have the same phenotype but rather are characterized by a mean phenotype (the genotypic value) coupled with a random, individual environmental deviation (Eq. (8.6)). In Chapter 8, we assumed that the distribution of environmental deviations was identical for all genotypes and had a variance of $\sigma_e^2$, the “environmental variance” (Eq. (8.16)). Because fitness is thought of as a genotypic value in population genetics, there has never been the assumption that every individual with the same genotype has to have exactly the same fitness phenotype; only the averages matter in the models developed in Chapter 11. Hence, the environmental variance term can accommodate any fitness fluctuations among individuals with the same genotype that are induced by sampling fine-grained environmental variation. Moreover, since only the genotypic values enter into the equations given in Chapter 11, we do not even require the assumption that each genotype has the same environmental variance as we did in Chapter 8; rather, different genotypes can display different environmental variances to fine-grained heterogeneity, but it is still only their genotypic values of fitness that drive the evolutionary response under most conditions. However, there are some situations for which the within-genotype environmental variance of fitness does matter to the evolutionary outcome. One such exception is for newly arisen mutations. As we saw in Chapter 5, genetic drift has a major impact on the survival of a newly arisen neutral mutation even in an effectively infinite-sized population.
We then saw in Chapter 12 how genetic drift still has a major impact on the survival of a newly arisen selected mutation even in an effectively infinite-sized population. For example, if a new allele, A, mutates from the ancestral allele a, such that the relative fitnesses are 1 + s (s > 0) and 1 for Aa and aa, respectively, we showed in Chapter 12 that the probability of survival of the selectively favored A allele in an ideal population of large size is approximately 2s, which implies that the majority of selectively favored alleles are lost due to genetic drift. The large impact of genetic drift on newly arisen mutants even in effectively infinite-sized populations stems from the fact that a newly arisen mutation is initially found in only one copy, so that finite sampling cannot be ignored. Therefore, the random force of genetic drift is powerful regardless of the total population size because the fate of all new mutations initially depends upon a small, finite number of copies. Similarly, random forces generated by fine-grained heterogeneity in fitness also play a powerful role in influencing the fate of a new mutation (Templeton 1977b). To model fine-grained heterogeneity, let the number of offspring of an individual bearing the new mutation (an Aa individual) have a mean of 2(1 + s) and a variance of 2(1 + s) + $\sigma_s^2$ in a large, stable population consisting mostly of aa individuals with a mean and variance of offspring number of two. Recall from Chapter 5 that the offspring number distribution in our ideal population is Poisson, which implies that the mean and variance in the number of offspring are the same. The quantity $\sigma_s^2$ therefore measures the variance in the selection coefficient s among Aa individuals that is induced by fine-grained heterogeneity, which creates a variance in offspring numbers beyond 2(1 + s) (the Poisson assumption). When s, the average selection coefficient, is close to 0, the survival probability of the A allele is

$$\Pr(A \text{ survives}) = \frac{2s}{1 + s + \sigma_s^2} \tag{14.24}$$

Note that when there is no fine-grained heterogeneity in fitness ($\sigma_s^2 = 0$), Eq. (14.24) reduces to 2s/(1 + s) ≈ 2s since we assumed s was close to 0. Thus, the constant fitness case given in Chapter 12 is just a special case of Eq. (14.24). When fine-grained heterogeneity exists, $\sigma_s^2 > 0$ and the probability of a new mutant surviving always decreases relative to the case with no fine-grained heterogeneity.
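The effect of fine-grained heterogeneity on Eq. (14.24) can be illustrated numerically. The following is a minimal sketch only: the function name `survival_probability` and the parameter values are hypothetical choices made for illustration, and the formula is the small-s approximation of Eq. (14.24), not an exact result.

```python
def survival_probability(s, var_s=0.0):
    """Approximate survival probability of a new favorable allele, Eq. (14.24).

    s     : average selection coefficient of Aa (assumed small and positive)
    var_s : variance in s induced by fine-grained heterogeneity (sigma_s^2)
    """
    return 2.0 * s / (1.0 + s + var_s)


if __name__ == "__main__":
    s = 0.01  # hypothetical average selection coefficient
    for var_s in (0.0, 0.5, 1.0, 2.0):
        # Larger variance in fitness lowers the chance that the allele survives.
        print(f"sigma_s^2 = {var_s:3.1f}  Pr(A survives) = {survival_probability(s, var_s):.5f}")
```

With the variance set to zero the function returns 2s/(1 + s), the constant-fitness case, and every positive variance gives a smaller value.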

Selection in Heterogeneous Environments

This observation has some important implications for how populations evolve in response to fine-grained heterogeneity. Consider two mutations, say A1 and A2, each with identical average fitnesses (both 1 + s), but with $\sigma_{s_1}^2 < \sigma_{s_2}^2$, that is, the A1a genotype is more buffered against the fitness fluctuations caused by fine-grained environmental heterogeneity than is the A2a genotype. Then, Eq. (14.24) implies that A1 has a greater chance of surviving and going to ultimate fixation in the population than A2. Hence, natural selection in this case favors those mutations that are associated with fitness phenotypes that are more buffered against responding to fine-grained heterogeneity. This example also challenges our notion of selective neutrality. Since these two mutants have identical average fitness effects, they are neutral alleles with respect to one another. However, they are not equivalent in their evolutionary dynamics and survival probabilities under selection in a fine-grained environment. Once again, selection favors the genotypes best buffered against fitness fluctuations. Selection can even favor such fine-grained buffering over the mean selective value. For example, let si now be the average selection coefficient associated with the Aia genotype. Now let us assume that s1 > s2. Normally, we would expect A1 to have a higher probability of survival than A2 because it is associated with a higher average fitness. But Eq. (14.24) implies that A2 is more likely to survive than A1 whenever $2s_2/(1 + s_2 + \sigma_{s_2}^2) > 2s_1/(1 + s_1 + \sigma_{s_1}^2)$, which rearranges to:

$$\sigma_{s_2}^2 < \frac{s_2}{s_1}\left(1 + \sigma_{s_1}^2\right) - 1$$
[…] $d_{ij0}$ […] $y - d_{ij1}$ when $y > d_{ij1}$ (14.35)

In this model, the d’s measure the short-term buffering mechanisms that usually have no or low physiological costs. Thus, each genotype ij can endure a run of 0’s of length dij0 and a run of 1’s of length dij1 without any deleterious consequences at all. The larger the value of d, the better is the short-term buffering capacity. However, physiological constraints ensure that the d’s cannot be too large, and when exposure to these environments lasts longer than the relevant d, the short-term buffering mechanisms break down and fitness begins to decline at an exponential rate measured by the relevant λ. Hence, λ measures the long-term buffering capacity of the organism, with small λ’s corresponding to better long-term buffering capabilities. The expected value of the logarithm of fitness under fitness model (14.35) is (Templeton and Rothman 1978):

$$E[\ln w_{ij}] \approx \ln c_{ij} - L\left[ f_0 (1-\alpha)^{d_{ij0}} \lambda_{ij0} + f_1 (1-\beta)^{d_{ij1}} \lambda_{ij1} \right] \tag{14.36}$$

Hence, fine-grained environmental heterogeneity tends to select those genotypes that minimize the quantity in brackets in Eq. (14.36). This bracketed quantity can tell us much about how organisms adapt to fine-grained environmental runs. First, we see that the selective impacts of the environmental states 0 and 1 depend upon their frequencies, f0 and f1. Hence, the more an organism encounters a particular environmental state, the more important it is to have a high fitness response to that state. Second, the impact of an environmental state also depends upon how likely it is to generate long runs. Consider state 0. The quantity 1 − α is the probability of remaining in state 0 given the environment is in that state already. If 1 − α is small, even modest values of dij0 ensure that environmental state 0 has little

overall fitness effect. Hence, a short-term buffering strategy is effective. However, if 1 − α is large, long runs of 0’s are likely to be encountered, and the quantity $(1-\alpha)^{d_{ij0}}$ could still be substantial even when dij0 is at its maximum physiological limit. In this case, the long-term buffering parameter $\lambda_{ij0}$ becomes an important contributor to fitness. These predictions are consistent with observations on natural populations. Plants that live in upland environments far from rivers have little chance of encountering prolonged flooding conditions, and most such plants only have the short-term physiological buffering mechanisms mentioned above. In contrast, plants that live in riparian habitats are much more likely to encounter prolonged floods and tend to have both the short-term and long-term buffering mechanisms against flooding (Kozlowski and Pallardy 2002). These examples and Eq. (14.36) tell us that fine-grained environmental heterogeneity cannot be adequately described just by the frequencies of the various environmental states, but rather it is necessary to know something about the temporal sequence of states that individuals encounter during their lifetime. A given environmental state can be either benign, mildly deleterious, or lethal to an individual depending upon its temporal context, so evolution in fine-grained environments must also be sensitive to this context. Equation (14.36) provides insight into why plasticity evolves in the first place. The parameters d and λ buffer the phenotype of fitness, but as illustrated by the plant example, this is typically accomplished through plasticity of other traits. Because fine-grained heterogeneity is experienced by individuals and because individuals themselves do not evolve, plasticity at the individual level is needed to achieve fitness buffering.
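The way run persistence changes the weight of long-term buffering in Eq. (14.36) can be sketched numerically. This is an illustration under stated assumptions, not the authors' implementation: the function names and all parameter values are hypothetical, L is simply treated as a positive scaling constant, and state 0 is arbitrarily labeled as the stressful state.

```python
import math


def run_load(freq, stay_prob, d, lam):
    """One term of the bracket in Eq. (14.36): f * (stay_prob ** d) * lambda.

    freq      : frequency of the environmental state (f0 or f1)
    stay_prob : probability of remaining in that state (1 - alpha or 1 - beta)
    d         : run length tolerated by short-term buffering
    lam       : rate of fitness decline once short-term buffering fails
    """
    return freq * stay_prob ** d * lam


def expected_log_fitness(c, L, f0, f1, alpha, beta, d0, d1, lam0, lam1):
    """Approximate E[ln w] as in Eq. (14.36); L is an assumed positive constant."""
    bracket = run_load(f0, 1 - alpha, d0, lam0) + run_load(f1, 1 - beta, d1, lam1)
    return math.log(c) - L * bracket


if __name__ == "__main__":
    # Hypothetical contrast: short runs of state 0 versus persistent runs of state 0.
    for label, stay in (("short runs", 0.2), ("long runs", 0.9)):
        print(label, run_load(freq=0.5, stay_prob=stay, d=5, lam=1.0))
```

When the stay probability is small, the term is negligible even for a modest d, so λ hardly matters; when the stay probability is large, the same d leaves a substantial term and λ becomes important.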
In a population with genetic variation in the d and λ parameters, natural selection would favor those genotypes with sufficient buffering capacities/phenotypic plasticity to deal with the types of environments and run lengths the population has encountered in the past (Eq. (14.36)). Hence, fine-grained heterogeneity is the key to understanding the initial evolution of plasticity. Behavior in animals is one trait that can display much plasticity that can result in fitness buffering (Renn and Schumer 2013). For example, many organisms do not randomly disperse in a spatially variable landscape, but instead often adjust their dispersal decisions according to their phenotype and the environmental conditions (Jacob et al. 2015). Such nonrandom choices can result in fitness buffering. Templeton and Rothman (1981) extended the Levene model to include nonrandom dispersal. The conditions for protection in a one-locus, two-allele model under soft selection are

$$\sum_i c_i \frac{\gamma_{i,Aa}\, w_{i,Aa}}{\gamma_{i,aa}\, w_{i,aa}} > 1 \qquad \sum_i c_i \frac{\gamma_{i,Aa}\, w_{i,Aa}}{\gamma_{i,AA}\, w_{i,AA}} > 1 \tag{14.37}$$

where ci is the proportional output from niche i, as in the original Levene model (inequalities (14.4) and (14.5)), γi,jk is the actual proportion of individuals with genotype jk that dispersed into niche i, and wi,jk is the fitness of genotype jk in niche i. An interesting special case arises when wi,jk = wi for all jk, that is, all genotypes have the same relative fitness within each niche but absolute fitnesses (the c’s) can vary between niches. In this case, inequalities (14.37) reduce to

$$\sum_i c_i \frac{\gamma_{i,Aa}}{\gamma_{i,aa}} > 1 \qquad \sum_i c_i \frac{\gamma_{i,Aa}}{\gamma_{i,AA}} > 1 \tag{14.38}$$

Thus, even with identical relative fitness responses within all niches, a polymorphism can be maintained with habitat selection when the heterozygotes are the most efficient genotype at choosing those habitats with the higher carrying capacities (c’s). For hard selection, the conditions for protection are:

$$\sum_i \gamma_{i,Aa}\, w_{i,Aa} > \sum_i \gamma_{i,aa}\, w_{i,aa} \qquad \sum_i \gamma_{i,Aa}\, w_{i,Aa} > \sum_i \gamma_{i,AA}\, w_{i,AA} \tag{14.39}$$
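The protection conditions for soft selection (inequalities (14.37)) and hard selection (inequalities (14.39)) can be checked with a short sketch. The function names, the dictionary layout, and the two-niche numbers below are hypothetical and serve only as an illustration of how the inequalities are evaluated.

```python
def soft_selection_protected(c, gamma, w):
    """Check inequalities (14.37).

    c     : list of proportional outputs c_i, one per niche
    gamma : dict mapping genotype ('AA', 'Aa', 'aa') to per-niche dispersal proportions
    w     : dict mapping genotype to per-niche fitnesses
    """
    def ratio_sum(homozygote):
        return sum(
            ci * (gamma["Aa"][i] * w["Aa"][i]) / (gamma[homozygote][i] * w[homozygote][i])
            for i, ci in enumerate(c)
        )
    return ratio_sum("aa") > 1 and ratio_sum("AA") > 1


def hard_selection_protected(gamma, w):
    """Check inequalities (14.39)."""
    def total(genotype):
        return sum(g * f for g, f in zip(gamma[genotype], w[genotype]))
    return total("Aa") > total("aa") and total("Aa") > total("AA")


if __name__ == "__main__":
    # Hypothetical case: identical fitnesses across genotypes within each niche,
    # but heterozygotes disperse preferentially into the better niche.
    c = [0.7, 0.3]
    gamma = {"AA": [0.5, 0.5], "Aa": [0.8, 0.2], "aa": [0.5, 0.5]}
    w = {g: [1.0, 0.6] for g in ("AA", "Aa", "aa")}
    print(soft_selection_protected(c, gamma, w), hard_selection_protected(gamma, w))
```

Because the fitnesses are identical across genotypes within each niche in this example, any protection comes from habitat selection alone, which is the special case described by inequalities (14.38).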

Note once again that even with identical relative fitness responses to all niches, a polymorphism can be maintained with habitat selection when the heterozygotes are the most efficient at choosing the niches with higher absolute fitness. Templeton and Rothman (1981) next added habitat selection onto the fitness model shown in Eq. (14.33) by allowing different habitats in the landscape to have different values for α and β in the environmental matrix 14.31 under both hard and soft selection. This model also introduced a genotype-specific cost to the act of habitat selection through the c parameter in Eq. (14.33). In this model, selection can favor fitness buffering through habitat selection in addition to the physiological buffering parameters (d and λ) in Eq. (14.33). The relations between habitat selection and physiological buffering can be quite complex and variable depending upon the costs of habitat selection, its effectiveness, and whether there is hard or soft selection, but the models do show that effective habitat selection alone can protect a polymorphism. The models also show that natural selection can favor an organism to preferentially shift to a niche before it has become physiologically adapted to the niche. Hence, a plastic behavior can result in much secondary adaptive evolution, a phenomenon found in Anolis lizards living in colder habitats (Muñoz and Losos 2018).

Coevolution

Fitness arises from how genotypes interact with their environments. The environment for any one species often includes other species. We have already seen how interactions among species can define the adaptive environment for a particular species. This was shown in Chapter 11 with the discussion of sickle-cell anemia and malarial resistance in Africa. An environmental transition was associated with a radical shift in fitnesses for the genotypes associated with the human β-Hb locus (Table 11.1). This environmental transition was defined by the interactions among three species; humans changed their agricultural practices that provided breeding sites and habitats for the mosquito Anopheles gambiae, and increases in the densities of humans and mosquitoes provided increased host resources for a third species, Plasmodium falciparum, that parasitizes both humans and mosquitoes. Much of the environment for any organism is defined by its interactions with individuals of other species. But species are not static; they are all capable of evolving. Hence, when the environment of one species is defined by other species, evolution in these other species can create a changing environment for the species of interest. As one species adapts to the “environment” defined by the other species, the other species in turn can adapt to the changing “environment” created by evolution in the first species. We already saw in Chapter 11 how humans adapted to the malarial parasite, but the malarial parasite in turn has adapted to humans. The human parasite, P. falciparum, is closely related phylogenetically to P. praefalciparum, a malarial parasite in gorillas. The human parasite has low genetic diversity compared to the gorilla parasite (Molina-Cruz et al. 2016), and all extant human lineages of P. falciparum appear to be derived from a single lineage of P. praefalciparum that is nearly identical to human P. falciparum (Liu et al. 2010a, b).
These observations suggest a recent origin of P. falciparum that was accompanied by a severe genetic bottleneck. The malarial parasite enters human erythrocytes (red blood cells) by binding to glycophorin A and B on the erythrocyte surface. Chowdhury et al. (2018) inferred positive selection from McDonald–Kreitman tests (Chapter 12) on the P. falciparum gene that influences the binding affinity with these human glycophorins. P. falciparum was also subject to strong selection induced by the evolution of human resistance mechanisms. For example, the apical membrane antigen 1 (AMA1) locus codes for a surface-accessible protein in P. falciparum that can serve as a target for the human immune response to malarial infection. The

McDonald–Kreitman and Tajima’s D tests (Chapter 12) indicated strong selection to maintain polymorphism in this malarial gene, presumably driven by selection induced by the human immune response (Polley and Conway 2001). Such high levels of polymorphism are selected for in the parasite because of the memory component of acquired immune responses in the human host that induces frequency-dependent selection for diversity in the parasite. Genome scans of the malarial genome reveal the strongest selection on genes with peak expression at the stage of the initial invasion of the human erythrocyte, and these P. falciparum genes have high levels of polymorphism despite P. falciparum having low levels of overall polymorphism (Amambua-Ngwa et al. 2012). Indeed, many other regions of the P. falciparum genome bear the signature of recent and strong selection since the parasite has begun specializing on humans (Conway et al. 2000; Volkman et al. 2001; Mobegi et al. 2014). Humans have also interacted with the malarial parasite by altering the environment, in this case through the development and use of anti-malarial drugs. These human-produced drugs have also invoked selective sweeps (Chapter 12) in specific genes in P. falciparum for drug resistance (Wootton et al. 2002; Mobegi et al. 2014; Amambua-Ngwa et al. 2019). Hence, malaria adapted to humans, humans are adapting to malaria, and malaria is adapting to the human adaptations to malaria, and so on. Mode (1958) first coined the term coevolution to describe the situation when two or more species mutually adapt to one another through interspecific interactions. Coevolution is simply natural selection operating within each of the interacting species, recognizing that each species constitutes part of the environment of the other species. Mode’s original models dealt only with host–parasite interactions (such as humans and malaria). 
The strong evolutionary dynamics induced by antagonistic interactions between species sometimes result in the Red Queen process (Van Valen 1973), named after the literary character that had to keep running just to stay in place in Through the Looking-Glass. Such Red Queen coevolution has been documented in other host–parasite pairs, such as rabbits and myxoma virus (Alves et al. 2019). As noted in previous chapters, the MHC complex is one of the primary regions in the genome that modulates the response to pathogens in many vertebrate species, and this region has extraordinary levels of genetic diversity (Stefan et al. 2019). Both theoretical (Ejsmond and Radwan 2015; Stefan et al. 2019) and experimental work in model organisms (Kubinak et al. 2012) indicate that the Red Queen process is the primary driver of MHC diversity through frequency-dependent selection and selection for divergent alleles. Ehrlich and Raven (1964) generalized the concept of coevolution to include any potential interaction among individuals of different species (Table 14.5). Of the terms listed in Table 14.5, only the bottom three are true interactions, so the primary focus of coevolutionary models is upon those last three. The term “species interaction” has different meanings in ecology and population genetics. In ecology, the traditional meaning of an interspecific interaction is an interaction that influences population dynamics (size, density, and/or growth rates). However, in evolutionary models, the relevant criterion of an interaction is that the relative fitnesses within a species are influenced by the interactions of individuals with other species. Such relative fitness interactions can have a strong evolutionary effect even if they have no impact at all on population dynamics. A further complication arises from consideration of units and targets of selection.
The target of selection induced by an interspecific interaction may well be the individual, but in any model of evolutionary response, the unit of selection is generally much smaller than the individual’s intact, multi-locus genotype. Indeed, many different units of selection can be influenced by interactions with individuals of different species, and each unit of selection may respond to that interaction in a qualitatively different fashion. Consequently, the appropriate focus for models of coevolution is at the level of traits that influence fitness through interactions with individuals of another species and their underlying units of selection.

Table 14.5  Types of interspecific interactions.

Species 1    Species 2    Type of Interaction
0            0            Neutralism
+            0            Commensalism
−            0            Amensalism
+            −            Predator–Prey, Pathogen–Host
−            −            Competition
+            +            Mutualism

Note: A plus sign means that the indicated species benefits (either in terms of increased growth of population size and/or individual fitness) from the interaction, and a negative sign means that the indicated species is harmed (either in terms of a reduction in population size and/or reduced individual fitness) from the interaction. A zero means no effect at all upon the indicated species from the presence of the other species.

Heliconius butterflies illustrate that different traits can simultaneously exist within the same individuals that have qualitatively different coevolutionary responses (Templeton and Gilbert 1985). Heliconius is a genus of New World butterflies found mostly in the tropics. The larvae of these butterflies feed on various species of the plant genus Passiflora. The relationship between the butterflies and Passiflora is predator (butterfly) to prey (the plant), and this antagonistic interaction has induced an intense Red Queen arms race (de Castro et al. 2018). Passiflora species have evolved unusual variation of leaf shape within the genus; the occurrence of yellow structures mimicking heliconiine eggs, an extensive diversity of defense compounds such as cyanogenic glucosides, alkaloids, flavonoids, saponins, tannins and phenolics; trichomes; mimicry of pathogen infection through variegation; and production of extrafloral nectar to attract ants and other predators of heliconiine larvae. Heliconiines can synthesize cyanogenic glucosides themselves, and their ability to handle these compounds was probably one of the most crucial adaptations that allowed the ancestor of these butterflies to feed on Passiflora plants. Heliconius larvae can sequester cyanogenic glucosides and alkaloids derived from their host plants and utilize them for their own protection against their predators, such as birds. Heliconius adults have highly accurate visual and chemosensory systems and one of the largest brain-to-body ratios in insects, allowing them to memorize shapes and to display elaborate pre-oviposition behavior in order to defeat the visual defenses evolved by Passiflora species. Generally, sympatric species (those living in the same area) of Heliconius use nonoverlapping sets of host Passiflora species. Hence, from the point of view of larval feeding traits, there is often no interspecific interaction among sympatric Heliconius species. 
Moreover, the population sizes of the various species are probably determined by density-dependent factors acting on the larval stages (Gilbert 1983). Hence, the most likely population dynamic interaction between these species is neutralism. The adult Heliconius butterflies use the cucurbit vines from the genera Gurania and Anguria as a source of nectar and pollen. The butterflies and these plants are in a mutualistic relationship as the butterflies provide pollination services that benefit the plants while the nectar and pollen are critical for maintaining adult viability and in producing eggs for the butterflies. Usually sympatric heliconiid species are able to use the same plants as pollen and nectar sources, and these resources are often sparsely distributed. Hence, any trait that increases adult foraging efficiency or competitive

ability against other butterflies for these resources should be strongly selected. The nature of the coevolutionary interaction for such traits would be one of competition, both interspecifically and intraspecifically. To see this, consider a two-species model with the following fitness function for such traits within species 1:

$$w(\theta_{12}) = 1 + a_1\left(K_1 - N_1 - \theta_{12} N_2\right) \tag{14.40}$$

where a1 is a constant for the focal species 1, K1 is the amount of resource (say pollen) available to species 1 (as well as to the other species), N1 is the density of conspecific individuals from species 1 that are using this resource, N2 is the density of individuals from competing species 2, and θ12 measures the competitive impact of individuals of species 2 upon individuals of species 1 in obtaining the resource. For simplicity, Eq. (14.40) regards all individuals within a species as having the same competitive abilities, so there is no intraspecific selection in this model. However, the fitnesses of individuals in species 1 are affected by species 2 through the phenotype θ12. We now assume that there is some heritability to θ12 in species 1, and the response to selection is given by Eq. (11.22), Fisher’s fundamental theorem of natural selection. Note also that this is a linear fitness model with respect to θ12, so the complications due to nonlinearity are not applicable in this model. As can be seen from Eq. (14.40), fitness is a decreasing function of θ12. Therefore, S in Eq. (11.22) is negative, and given some heritability for this measure of competition, the intraspecific response to selection, R, is also negative. That is, natural selection in this model tends to reduce θ12. Similar selective forces would be operating in species 2. Thus, selection is operating in the same direction within both species, even though the nature of the interaction between the two species is antagonistic. A reduction in θ12 can be achieved in many different ways, such as increasing competitive ability against the other species (which in turn increases the θ term in the other species, and thereby could induce more intense selection in that species) or by specializing on a part of the resource that the other species does not use well or efficiently (which reduces the θ terms for both species).
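The claim that fitness in Eq. (14.40) is a linear, decreasing function of θ12 can be checked with a minimal sketch. The function name and the numerical values are hypothetical; a1 and N2 are assumed to be positive, which is what makes the slope negative.

```python
def competition_fitness(theta12, a1, K1, N1, N2):
    """Fitness in species 1 under Eq. (14.40).

    theta12 : competitive impact of species 2 individuals on species 1 individuals
    a1      : constant for species 1 (assumed positive here)
    K1      : amount of resource available
    N1, N2  : densities of species 1 and species 2
    """
    return 1.0 + a1 * (K1 - N1 - theta12 * N2)


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    a1, K1, N1, N2 = 0.001, 500.0, 200.0, 150.0
    for theta12 in (0.0, 0.5, 1.0):
        print(theta12, competition_fitness(theta12, a1, K1, N1, N2))
```

Each unit increase in θ12 changes fitness by −a1·N2, so the selective pressure to reduce θ12 grows with the density of the competing species.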
Traits in Heliconius that can be explained by such competition selection for a sparsely distributed resource include their highly developed visual system and learning ability that allows them to find and remember the location of these sparsely distributed resources (both of these traits are also involved in their coevolution with Passiflora), early morning flight to avoid being followed by another butterfly, trap-lining behavior to efficiently use the nectar/pollen resources they have found in space and time (flowering time varies across the plant species used), and the fact that different sympatric species have significantly differing abilities to utilize small-grained or large-grained pollen particles. The adult Heliconius butterflies are distasteful and poisonous to bird predators because of amino acids derived from pollen and from allelochemicals derived from Passiflora and stored from the larval stage. Hence, the coevolution of heliconiid butterflies as predators on Passiflora is tied into their coevolution as prey with birds. Heliconius butterflies are brightly colored and have wing patterns that attract attention. Such wing patterns are regarded as an example of aposematic coloration or warning coloration that warns potential predators that the individual is distasteful and/or poisonous. Although there is much diversity in coloration pattern in this genus, and even between different geographical populations within a species, many sympatric Heliconius species or geographical races tend to look alike with respect to wing coloration and pattern. This is an example of Müllerian mimicry in which two or more aposematic species share a similar warning pattern (Anderson and de Jager 2020). Such common warning signals allow potential predators to learn the pattern more efficiently and thereby avoid individuals displaying that pattern. Hence, it is to the mutual benefit of all potential prey to share a common warning pattern. 
Much is now known about the genetic architectures underlying wing pattern and color in these butterflies. Often, only about four or five major genes or supergenes are involved, frequently with some dominance relationships

among the alleles, interacting enhancers, and modifiers to fine tune dominance and the wing pattern (Concha et al. 2019; Hoyal Cuthill and Charleston 2015; Lewis et al. 2019; Naisbit et al. 2003; Saenko et al. 2019). Moreover, hybridization and introgression occur in this group, and selection on introgressed genes and supergenes can also facilitate convergence to a common pattern (Edelman et al. 2019; Moest et al. 2020). The Müllerian mimics in many cases use different genetic architectures to achieve the same mimetic patterns, although many homologous loci are used in common. Templeton and Gilbert (1985) developed the following model of coevolution of Müllerian mimicry between two species, say 1 and 2, that focuses upon a single locus or supergene. As before, let Ni be the density of species i, and we will regard these densities as fixed constants (determined by larval food resource availability). In species 1, let there be a recessive wing pattern phenotype associated with the genotype aa at a gene/supergene, and likewise assume that there is a recessive phenotype associated with the genotype bb at a gene/supergene in species 2. It makes no difference whether the gene/supergene in species 2 is homologous or not to that found in species 1, as both supergenes evolve independently in the two species, which are assumed to be reproductively isolated in this model. Within each species, assume that there is a dominant “allele” (A in species 1, and B in species 2). Potential predators are assumed to perceive some resemblance between the dominant phenotypes, a lesser degree of resemblance between the dominant and recessive phenotype within a species, and the least degree of resemblance between the dominant and recessive phenotypes between species. Predators are assumed to see no resemblance at all between the two recessive phenotypes found in the different species. 
The fitness effect associated with a given wing phenotype is assumed to be proportional to the number of individuals resembling that phenotype times a coefficient measuring the degree of resemblance. Thus, the fitnesses of the genotypes defined by a single gene/supergene in species 1 are:

$$\begin{aligned} w_{aa} &= G_{aa} N_1 + a\left(1 - G_{aa}\right) N_1 + b\left(1 - G_{bb}\right) N_2 \\ w_{A-} &= \left(1 - G_{aa}\right) N_1 + a\,G_{aa} N_1 + c\left(1 - G_{bb}\right) N_2 + d\,G_{bb} N_2 \end{aligned} \tag{14.41}$$

where Gij is the genotype frequency of genotype ij, and a, b, c, and d are constants that measure the degree of perceived resemblance of the phenotypes by potential predators such that 1 > a, c > b, d > 0 where a perfect perceptional match is given a score of “1.” Note that the fitnesses within species are both frequency and density dependent, and moreover they depend upon the genotype frequencies and density of the other species. The average excess of the A allele for fitness is given by:

$$a_A = G_{aa}\left[\left(1 - a\right)\left(1 - 2G_{aa}\right) N_1 + \left(c - b\right)\left(1 - G_{bb}\right) N_2 + d\,G_{bb} N_2\right] \tag{14.42}$$
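As a consistency check, the average excess in Eq. (14.42) can be compared numerically with the fitnesses in Eq. (14.41). The sketch below assumes that, because every A-bearing gamete ends up in an individual with the dominant phenotype, the average excess of A equals the fitness of the dominant phenotype minus the mean fitness of species 1. The function names and parameter values are hypothetical.

```python
def dominant_model_fitnesses(G_aa, G_bb, N1, N2, a, b, c, d):
    """Genotype fitnesses in species 1 under Eq. (14.41); returns (w_aa, w_A)."""
    w_aa = G_aa * N1 + a * (1 - G_aa) * N1 + b * (1 - G_bb) * N2
    w_A = (1 - G_aa) * N1 + a * G_aa * N1 + c * (1 - G_bb) * N2 + d * G_bb * N2
    return w_aa, w_A


def average_excess_dominant(G_aa, G_bb, N1, N2, a, b, c, d):
    """Average excess of the A allele as written in Eq. (14.42)."""
    bracket = (1 - a) * (1 - 2 * G_aa) * N1 + (c - b) * (1 - G_bb) * N2 + d * G_bb * N2
    return G_aa * bracket


if __name__ == "__main__":
    # Hypothetical resemblance coefficients satisfying 1 > a, c > b, d > 0.
    pars = dict(N1=100.0, N2=50.0, a=0.6, b=0.2, c=0.5, d=0.1)
    for G_aa in (0.3, 0.7, 0.99):
        w_aa, w_A = dominant_model_fitnesses(G_aa, 0.9, **pars)
        mean_w = (1 - G_aa) * w_A + G_aa * w_aa  # mean fitness in species 1
        print(G_aa, w_A - mean_w, average_excess_dominant(G_aa, 0.9, **pars))
```

The two printed columns should agree up to rounding error, since w_A − mean fitness simplifies algebraically to G_aa(w_A − w_aa), which is the bracketed expression in Eq. (14.42) multiplied by G_aa.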

Given the assumed patterns of resemblance, 1 − a, (c − b), and d are all positive. Hence, the average excess of A is always positive (and hence A increases in frequency due to natural selection) when Gaa < 1/2. However, under these conditions, the increase in the frequency of the A allele occurs because the dominant phenotype is the most common warning pattern within species 1, and the A allele will be favored even in the absence of any interspecific interaction (e.g. N2 = 0 in Eq. (14.42)). When Gaa ≥ 1/2, the A allele can only increase in species 1 if the interspecific terms weighted by the density of the other species, N2, are sufficiently large to overcome the intraspecific liability of having the rarer phenotype within species 1. Since it is reasonable to assume that the two species did not initially resemble one another (assuming no introgression), it is only the interspecific interactions that allow the evolution of a common warning pattern in this model. When the genotype frequencies of both aa and bb are initially close to 1, the average excess in Eq. (14.42) will be positive only when:

$$d N_2 > \left(1 - a\right) N_1 \tag{14.43}$$

that is, when the interspecific resemblance of the dominant phenotype in species 1 to the recessive phenotype in species 2 as weighted by the density of species 2 outweighs the lack of resemblance (1 − a) of the dominant phenotype in species 1 to the recessive phenotype in species 1 as weighted by the density of species 1. Hence, the course of evolution within species 1 is determined by a combination of both interspecific and intraspecific factors, and coevolution toward a common warning pattern can only occur when the interspecific interactions are strong. Moreover, if one species is much more common than the other, an inequality of the form (14.43) is likely to be satisfied only for the rarer species, that is, the rarer species will be selected to resemble the more common species but not vice versa. True coevolution, where both species are evolving toward a common pattern, is more likely if their densities are comparable. The above model assumes that the mutualistic allele, A, is associated with a dominant wing phenotype. However, what if A were associated with a recessive phenotype and the population were randomly mating? The fitness model now becomes:

$$\begin{aligned} w_{a-} &= \left(1 - p^2\right) N_1 + a\,p^2 N_1 + b\left(1 - G_{bb}\right) N_2 \\ w_{AA} &= p^2 N_1 + a\left(1 - p^2\right) N_1 + c\left(1 - G_{bb}\right) N_2 + d\,G_{bb} N_2 \end{aligned} \tag{14.44}$$

where p is the frequency of the A allele. The average excess of the A allele now becomes:

$$a_A = p\left(1 - p\right)\left[\left(1 - a\right)\left(2p^2 - 1\right) N_1 + \left(c - b\right)\left(1 - G_{bb}\right) N_2 + d\,G_{bb} N_2\right] \tag{14.45}$$

The term in the brackets in Eq. (14.45) is similar to the bracketed term in Eq. (14.42). However, the condition for the bracketed term always being positive is now $p > 1/\sqrt{2} = 0.7071$. When the mutualistic phenotype was dominant, the corresponding condition was Gaa < 1/2, which under random mating implies $p > 1 - 1/\sqrt{2} = 0.2929$. Hence, the conditions under which natural selection always favors the mutualistic phenotype have become much narrower in the recessive model. Note also that the bracketed term in the average excess Eq. (14.42) is weighted by Gaa, whereas the bracketed term in the average excess Eq. (14.45) is weighted by p(1 − p). When the A allele is initially rare, Gaa is close to one, and if the interspecific components of the average excess are sufficiently strong to make the bracketed term positive, the magnitude of that positive effect is almost completely translated into increasing p through Eq. (11.5). In the recessive case, p(1 − p) is close to zero when p is very small, so even if the interspecific components make the bracketed term positive, the average excess of the A allele is still close to zero. Hence, regardless of the advantage of interspecific resemblance, Eq. (11.5) would predict little selective pressure to increase the frequency of the A allele. Even when positive, the average excesses for the mutualistic allele will differ by several orders of magnitude in the dominant versus recessive case, making it far less likely for mutualism to evolve when A is recessive. Hence, the genetic architecture within a species is an important constraint on the coevolution between species. Indeed, this simple model can explain why the coevolution of Müllerian mimicry in these butterflies has been primarily accomplished through the replacement of recessive alleles by dominant or co-dominant alleles (Sheppard et al. 1985) and that epistatic modifiers have been favored that strengthen the degree of dominance during this evolutionary process (Naisbit et al. 2003).
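The contrast between the dominant and recessive cases can be made concrete with a small numerical sketch of Eqs. (14.42) and (14.45) for a rare A allele. Random mating is assumed in both cases, so that Gaa = (1 − p)² in the dominant model; the function names and all parameter values are hypothetical.

```python
def average_excess_dominant(p, G_bb, N1, N2, a, b, c, d):
    """Eq. (14.42) with G_aa = (1 - p)**2 (random mating assumed)."""
    G_aa = (1 - p) ** 2
    bracket = (1 - a) * (1 - 2 * G_aa) * N1 + (c - b) * (1 - G_bb) * N2 + d * G_bb * N2
    return G_aa * bracket


def average_excess_recessive(p, G_bb, N1, N2, a, b, c, d):
    """Eq. (14.45): the mutualistic phenotype is recessive."""
    bracket = (1 - a) * (2 * p ** 2 - 1) * N1 + (c - b) * (1 - G_bb) * N2 + d * G_bb * N2
    return p * (1 - p) * bracket


if __name__ == "__main__":
    # Hypothetical values: A is rare, species 2 is fixed for bb and is ten times denser.
    pars = dict(p=0.001, G_bb=1.0, N1=100.0, N2=1000.0, a=0.6, b=0.2, c=0.5, d=0.1)
    dom = average_excess_dominant(**pars)
    rec = average_excess_recessive(**pars)
    print(dom, rec, dom / rec)
```

With these values the bracketed terms are nearly identical in the two models, so the ratio of the average excesses is governed almost entirely by Gaa / [p(1 − p)], which is roughly 1/p when A is rare.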
It is important to keep in mind that the selective processes defined by Eq. (14.40) on competitive traits and Eq. (14.42) or (14.45) on wing pattern operate simultaneously upon different units of selection. Adult sympatric individuals of different Heliconius species are simultaneously competitors for pollen and nectar resources and mutualists for wing color and patterns. These different traits coevolve in qualitatively and quantitatively different fashions in the same populations at the same time. It is erroneous and misleading to speak of a single interspecific interaction as characterizing the net

relationship among these butterflies with respect to coevolution. Because units of selection are generally less than an individual’s intact, total genotype, the very idea of a net or overall interaction among individuals makes no sense when modeling coevolution in population genetics. If the system of mating is modified to include inbreeding or assortative mating (f > 0, Chapter 3), then the average excess for a mutualistic, recessive A allele when mutualistic alleles are rare in both species is (Templeton and Gilbert 1985):

$$f\left[dN_2 - (1 - a)N_1\right] \tag{14.46}$$

Hence, the same condition as described in inequality (14.43) must still be satisfied for the evolution of a recessive mutualistic trait, but now we must also have f > 0. The intraspecific system of mating also constrains the course of coevolution. Assortative mating for wing patterns has not been reported, but disassortative mating that increases the heterozygosity at supergene complexes has been (Chouteau et al. 2017). This in turn results in antagonistic frequency-dependent selection on wing pattern variation between intraspecific sexual selection and interspecific coevolutionary pressures that can maintain some wing pattern forms that only provide moderate protection against predators. The work of Chouteau et al. (2017) serves to remind us that coevolution is just standard natural selection operating upon the intraspecific fitness effects of an interspecific interaction. Like all the intraspecific evolutionary processes discussed in this book, coevolution is constrained by intraspecific parameters, such as genetic architecture and population structure, including system of mating. In particular, the selective direction of coevolution is determined by the sign of the average excesses for intraspecific fitness for the intraspecific units of selection. The gamete’s perspective still rules in the selective response to environmental heterogeneity.


15 Selection in Age-Structured Populations

Up to now we have assumed discrete generations. Under this assumption, all individuals in the same generation are born at the same time and then reproduce at the same time, followed by complete reproductive senescence or death. Such a model approximates reality for some species. For example, many insects and plants have only one generation per year that is synchronized by the seasons, and the discrete generation model can approximate their evolution. However, as pointed out in Chapter 2, individuals in many species can reproduce at multiple times throughout their life, can mate with individuals of different ages, can survive beyond their age of reproduction, and can coexist with their offspring and other generations. We do not have to look far to find such a species; our own species falls into this category of overlapping generations. In species with overlapping generations, an important component of population structure is age structure, the distribution of the ages of the individuals found in the population at a given time. The age distribution can have a large impact on whose gametes are transmitted to the next generation, particularly when an individual’s chances of survival, mating, and reproducing are all influenced by age, as they are in humans. In this chapter, we will examine the evolutionary impact of age structure and age-dependent fitness components in demes with overlapping generations. Such models also lie at the interface of population genetics and population ecology, a field that deals extensively with age-structured populations. We therefore start this chapter with an examination of some of the fundamental demographic parameters that ecologists have used to characterize populations with overlapping generations.
We will then introduce genetic variation that influences the phenotypic variation in these demographic parameters to look at the impact of selection in age-structured populations, especially the evolution of senescence. Finally, a detailed example of such selection will be given, an example that in addition makes use of many of the concepts developed in previous chapters. This example is then used to provide an overview of population genetics as treated in this book.

Life History and Fitness

Life history is the progress of an individual throughout their life. An individual is conceived and sometimes survives to birth, then grows into an adult, or fails to survive to adulthood. If the individual survives to adulthood, then the individual perhaps mates and reproduces at specific ages, and finally dies at some age. In Chapter 11, we defined the fitness components of viability, mating success, and fertility/fecundity. To examine life history and its evolutionary implications, we must first make each of these fitness components an explicit function of age. We start with the fitness

Population Genetics and Microevolutionary Theory, Second Edition. Alan R. Templeton. © 2021 John Wiley & Sons, Inc. Published 2021 by John Wiley & Sons, Inc. Companion website: www.wiley.com/go/templeton/populationgenetics2

component of viability. Viability can be measured in an age-specific fashion by the age-specific survivorship, ℓx, the probability of an individual surviving to age x. Ideally, age should be measured from fertilization in order to cover the diploid individual’s entire life history, but, in practice, a time point well after conception is used in many species. For mammals, and humans in particular, the initial time point is usually birth. This obviously misses any deaths that occur between conception and birth, which can be substantial. For example, 70% of all human conceptions end in death before birth, most of which are not normally detectable (Larsen et al. 2013). Table 15.1 illustrates the concept of age-specific survivorship through the example of females from the United States as determined by the 2010 US census data (Martin et al. 2012). As shown in this table, age is often not treated as a continuous variable, but rather as a series of consecutive categories or ranges of ages. This represents the practical constraint of how such data are gathered, but, in theory, one could treat age as a continuous variable. However, in this chapter, we bow to the reality of actual data and will treat age as an ordered categorical variable. Table 15.1 shows the age ranges used for this human example. Although each category is a range of ages (generally five years

Table 15.1 The life history table for US females based on the 2010 census data.

Age Range (years) | Assigned Age, x | ℓx | bxmx | ℓxbxmx | xℓxbxmx

If ℜ > 1, the population is growing; if ℜ = 1, the population size is stable; and if ℜ < 1, the population is declining. Unlike R0, the net reproductive rate per lifetime, ℜ is measuring the stable growth of the population in absolute time. For example, using the life history parameters given in Table 15.2, Euler’s equation for phenotype 1 becomes 1 = ℜ₁⁻¹ × 2, where the subscript “1” indicates the phenotype. Solving this yields ℜ₁ = 2. In contrast, using the life parameters given in Table 15.2 for phenotype 2, Euler’s equation becomes 1 = ℜ₂⁻² × 2, so ℜ₂ = √2 = 1.414. Note that the ℜ’s correctly predict that phenotype 1 grows faster than phenotype 2, even though both phenotypes have identical net reproductive rates. In a population of genetically variable individuals in which different genotypes have different life history parameters, the ℜ defined by the life history parameters of a particular genotype no longer reflects the growth of that genotype in the population because, under Mendelian genetics, many of the offspring will have a genotype that differs from that of their parents. Rather, the ℜ for a particular genotype measures the reproductive output of that genotypic class and not the growth of that genotypic class. Fisher (1930) therefore concluded that ℜ is a better measure of fitness in a population with overlapping generations than is R0. It is also more convenient mathematically to measure fitness as an exponential rate rather than a multiplicative constant such as ℜ, so Fisher defined the Malthusian parameter r such that e^r = ℜ, with the cohorts being genotypic classes and not the entire population. With this exponential transformation, a genotype is producing an excess of offspring when r > 0, it is producing offspring at a replacement rate when r = 0, and it is producing a deficiency of offspring when r < 0. Substituting this exponential transformation of ℜ into Euler’s equation (Eq. 15.8) yields:

$$1 = \sum_{x=0}^{\text{max age}} e^{-rx}\,\ell_x m_x b_x \tag{15.9}$$

Equation (15.9) provides a method of implicitly calculating r for a genotype or any other cohort as characterized by a set of life history parameters. For example, using the data on US females given in Table 15.1, the implicit solution for r from Eq. (15.9) is −0.00176. An approximate solution to Eq. (15.9) can be derived using the linear Taylor’s series approximation e^(−rx) ≈ 1 − rx. Substituting this approximation into Eq. (15.9) yields:

$$1 = \sum_{x=0}^{\text{max age}} (1 - rx)\,\ell_x m_x b_x = \sum_{x=0}^{\text{max age}} \ell_x m_x b_x - r\sum_{x=0}^{\text{max age}} x\,\ell_x m_x b_x$$

$$r = \frac{\sum_{x=0}^{\text{max age}} \ell_x m_x b_x - 1}{\sum_{x=0}^{\text{max age}} x\,\ell_x m_x b_x} = \frac{1 - 1/R_0}{T} \quad \text{where} \quad T = \frac{\sum_{x=0}^{\text{max age}} x\,\ell_x m_x b_x}{\sum_{x=0}^{\text{max age}} \ell_x m_x b_x} \tag{15.10}$$

For example, applying Eq. (15.10) to the US female life history data given in Table 15.1 yields r = −0.00181, which is very close to the exact value of −0.00176. Equation (15.10) allows us to gain greater biological insight into the meaning of the Malthusian parameter. First, r is an increasing function of R0, that is, r increases as the number of offspring produced over a lifetime increases. Second, r is a decreasing function of T. The biological meaning of T becomes clear when we note that

$$\frac{\ell_x m_x b_x}{\sum_{x} \ell_x m_x b_x} \tag{15.11}$$

is the proportion of offspring borne to a female of age x relative to her total lifetime reproductive output. Hence, Eq. (15.11) represents a probability distribution over age of the number of offspring borne to a female in this cohort. T is the expected value of this distribution, that is, T is the average age at which a female gives birth and is called the average generation time. Therefore, r increases as the total number of births increases and r decreases as the average generation time increases. The Malthusian parameter is a measure of fitness that takes into account both the number of offspring produced and how rapidly they are produced. We now have two fitness measures for dealing with age structure; the net reproductive rate (Eq. 15.1) when all the genotypes in the population have similar generation times, and the Malthusian parameter (Eq. 15.9) when the genotypes differ in their average generation times. In deriving the Malthusian parameter, we assumed a stable age distribution, but there are many biologically realistic situations in which this assumption can be violated. For example, a survey of 27 insect species living in seasonal environments revealed that none of them ever experience a stable age distribution (Taylor 1979). More complicated measures of fitness can be derived from the life history parameters that take into account deviations from the stable age distribution (Demetrius 1975, 1985; Templeton 1980b), but these lie outside the scope of this book. In the remainder of this chapter, we will use either the net reproductive rate or the Malthusian parameter as our measure of fitness in age-structured populations.
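The relationship between the exact Malthusian parameter of Eq. (15.9) and the approximation of Eq. (15.10) can be illustrated with a short numerical sketch. The two schedules used here are assumptions inferred from the Euler equations quoted in the text for phenotypes 1 and 2: all reproduction concentrated at age 1 or at age 2, with ℓx mx bx = 2 at that age. The function names and the bisection bounds are illustrative choices, not part of the original treatment.

```python
import math

def net_reproductive_rate(lmb):
    """R0: sum over ages of the products l_x * m_x * b_x."""
    return sum(lmb)

def generation_time(ages, lmb):
    """T: average age at reproduction, weighted by l_x * m_x * b_x."""
    return sum(x * v for x, v in zip(ages, lmb)) / sum(lmb)

def euler_sum(r, ages, lmb):
    """Right-hand side of Eq. (15.9) for a trial value of r."""
    return sum(math.exp(-r * x) * v for x, v in zip(ages, lmb))

def malthusian_exact(ages, lmb, lo=-5.0, hi=5.0, tol=1e-12):
    """Solve Eq. (15.9) for r by bisection.

    Assumes reproduction occurs only at positive ages, so the sum
    decreases in r, and that the root lies between lo and hi.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if euler_sum(mid, ages, lmb) > 1.0:
            lo = mid  # sum still too large, so r must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

def malthusian_approx(ages, lmb):
    """Linear approximation of Eq. (15.10): r = (1 - 1/R0) / T."""
    return (1.0 - 1.0 / net_reproductive_rate(lmb)) / generation_time(ages, lmb)

# Assumed schedules: same R0 = 2, but reproduction at age 1 versus age 2.
for name, ages in (("phenotype 1", [1]), ("phenotype 2", [2])):
    r = malthusian_exact(ages, [2.0])
    print(name, "exact r:", r, "growth rate e^r:", math.exp(r),
          "approx r:", malthusian_approx(ages, [2.0]))
```

Both phenotypes have the same net reproductive rate, yet the one that reproduces earlier has the larger r. The linear approximation is crude here because r is far from zero in these toy schedules; the text reports that it is much closer for the US female data, where r is near zero.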

The Evolution of Senescence

Why do we grow old? Shouldn’t natural selection favor an ageless phenotype in which there is no decline of vigor or reproductive output with age? How could it be an evolutionary outcome for individuals to lose their vigor and reproductive capabilities as they age? In this section, we will see how the fitness measures derived above allow us to address these important questions. Let us start with a population of ageless individuals who show no senescence over their entire lifetime. Being ageless is not the same as being immortal. Individuals who do not age still can die through accidents, predation, disease, etc. They are ageless in the sense that their chances of dying in an interval of time do not depend upon their age. Let d be the probability of an individual dying in a unit of time. We regard d as being independent of age and a constant throughout the entire lifetime, reflecting the ageless phenotype of the individual. The individual is also regarded as being ageless with respect to reproduction by letting mb, the probability of having mated times the expected number of offspring in a time unit, also be independent of age and a constant throughout the entire lifetime. Given these ageless parameters, the probability of an individual living to age x is:

$$\ell_x = \prod_{i=1}^{x} (1 - d) = (1 - d)^x \tag{15.12}$$

Then the net reproductive rate of an ageless individual is:

$$R_0 = \sum_{x=0}^{\infty} \ell_x\, mb = mb \sum_{x=0}^{\infty} (1 - d)^x = \frac{mb}{d} \tag{15.13}$$


using the well-known formula for the sum of a geometric series [sₙ = a + ag + ag² + … + agⁿ⁻¹ = a(1 − gⁿ)/(1 − g) → a/(1 − g) as n → ∞, where a and g are constants with −1 < g < 1]. We can also apply the sum of a geometric series to Euler’s equation to obtain the Malthusian parameter for this ageless population as:

$$1 = \sum_{x=0}^{\infty} e^{-rx}\,\ell_x\, mb = mb \sum_{x=0}^{\infty} \left[(1 - d)e^{-r}\right]^x = \frac{mb}{1 - (1 - d)e^{-r}} \tag{15.14}$$

$$r = \ln(1 - d) - \ln(1 - mb)$$

When d and mb are small numbers, the Taylor’s series approximation of r from Eq. (15.14) is r ≈ mb − d. Now, suppose a mutation occurs in this ageless population such that the bearers of this mutation senesce and die at age n−1. The net reproductive rate of the mutant individuals is:

$$R_0' = mb \sum_{x=0}^{n-1} (1 - d)^x = \frac{mb}{d}\left[1 - (1 - d)^n\right] \tag{15.15}$$

For large n and any d with 0 < d < 1 (that is, some deaths occur from causes unrelated to age), the term (1−d)ⁿ goes to zero, and hence the term in brackets in Eq. (15.15) goes to one. Thus, if n is large enough (depending on d), then R0′ ≈ R0 and the mutation is selectively neutral as measured by the net reproductive rate. Similarly, one can show that the mutant’s Malthusian parameter obeys the implicit approximation:

$$r' \approx mb\left\{1 - \left[(1 - d)e^{-r'}\right]^n\right\} - d \approx mb - d \quad \text{for large } n \tag{15.16}$$
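The approach to neutrality described by Eqs. (15.13) to (15.16) can be made concrete with a small numerical sketch. The parameter values mb = 0.06 and d = 0.05 are hypothetical and chosen only for illustration, as are the function names; nothing here comes from data in the text.

```python
import math

def ageless_R0(mb, d):
    """Eq. (15.13): net reproductive rate of an ageless individual."""
    return mb / d

def ageless_r(mb, d):
    """Eq. (15.14): exact Malthusian parameter of the ageless population."""
    return math.log(1.0 - d) - math.log(1.0 - mb)

def mutant_R0(mb, d, n):
    """Eq. (15.15): net reproductive rate when reproduction covers ages 0..n-1 only."""
    return (mb / d) * (1.0 - (1.0 - d) ** n)

def mutant_R0_by_sum(mb, d, n):
    """Same quantity computed directly from the finite sum, as a cross-check."""
    return mb * sum((1.0 - d) ** x for x in range(n))

mb, d = 0.06, 0.05  # hypothetical per-time-unit reproduction and death parameters
print("ageless R0:", ageless_R0(mb, d))
print("exact r:", ageless_r(mb, d), "approximation mb - d:", mb - d)
for n in (10, 50, 100, 200):
    # Fraction of the ageless R0 retained by a mutant that dies at age n-1.
    print(n, mutant_R0(mb, d, n) / ageless_R0(mb, d))
```

The retained fraction equals 1 − (1 − d)ⁿ, so the fitness cost of the senescence mutation shrinks geometrically as the age of death is postponed.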

Once again, as long as senescence is delayed to an old age, the mutant phenotype is neutral. As we saw in Chapter 5, neutral and nearly neutral mutations will inevitably become fixed in a population over long periods of time. This means that if mutations can occur that kill their bearers at a sufficiently advanced age, such mutations are effectively neutral and some will go to fixation, thereby destroying the agelessness of the initial population. This model for the evolution of senescence is called mutational accumulation because senescence is driven by the accumulation of effectively neutral mutations that have late age-specific deleterious associations. An example of alleles with age-specific deleterious effects is found for Huntington’s disease, one of several neurodegenerative diseases in humans associated with trinucleotide repeats which have a late age of onset (Chapter 13). Langbehn et al. (2004) found that the empirical relationship between age of onset and CAG repeat number is well described by the equation:

$$S(\text{Age}, \text{CAG}) = \left[1 + \exp\left(\frac{\pi\left[-21.54 - \exp(9.56 - 0.146\,\text{CAG}) + \text{Age}\right]}{\sqrt{3}\,\sqrt{35.55 + \exp(17.72 - 0.327\,\text{CAG})}}\right)\right]^{-1} \tag{15.17}$$

where Age is the age of the individual, CAG is the number of CAG repeats (see Figure 13.12), and S(Age,CAG) is the probability of having no neurological symptoms to the given age with the given repeat number. If we make the assumption that all reproduction stops with the onset of the neurological symptoms, the net reproductive rate for a bearer of Huntington’s chorea is:

$$R_0(\text{CAG}) = \sum_{x=0}^{\text{max age}} \ell_x m_x b_x\, S(x, \text{CAG}) \tag{15.18}$$

Using the life history data in Table 15.1 in Eq. (15.18), we can calculate the net reproductive rate of bearers of a newly formed Huntington’s allele (one that just crossed the threshold repeat number

and reached a value of 36 repeats, as discussed in Chapter 13) to be 0.94936 versus the normal net reproductive rate from Table 15.1 of 0.95209. The neurological symptoms are relatively mild when they first occur and then get progressively worse, eventually resulting in death. The assumption that all reproduction stops with the onset of symptoms is therefore overly conservative, so the actual difference in net reproductive rates associated with a newly formed Huntington’s allele is even less than that indicated above. Hence, the lethal neurodegeneration of Huntington’s disease is essentially neutral with respect to natural selection when an allele first reaches the 36 repeat threshold. Of course, as discussed in Chapter 13, there are other targets of selection on Huntington’s disease, including the family and the repeats themselves in the male germline. Focusing just upon the selection on the repeats themselves and ignoring the family-level selection, we saw in Chapter 13 that selection at the genomic-level favors an increase in repeat number, which in turn is associated with an earlier age of onset (Eq. 15.18 and Figure 13.12). As the age of onset is lowered, there is now stronger individual-level selection against Huntington’s disease. For example, the net reproductive rate for bearers of a Huntington’s allele with a CAG repeat number of 56 using Eq. (15.18) and the life history data in Table 15.1 is 0.38298 versus the normal net reproductive rate of 0.95209, resulting in substantial selection at the individual level against the Huntington allele. These calculations show how important the age of onset is in determining the fitness impact of an allele that affects life history parameters. Even lethal genetic diseases are effectively neutral when the age of onset is old enough. 
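The way the age of onset controls the fitness cost can be sketched by coding Eqs. (15.17) and (15.18) directly. The coefficients are those printed in Eq. (15.17). The life table below is entirely hypothetical (it is not Table 15.1), so the resulting numbers only illustrate the direction of the effect and will not reproduce the values quoted in the text. The function names are illustrative.

```python
import math

def midpoint_age(cag):
    """Age at which S(Age, CAG) = 1/2, from the numerator of Eq. (15.17)."""
    return 21.54 + math.exp(9.56 - 0.146 * cag)

def spread(cag):
    """Square-root term in the denominator of Eq. (15.17)."""
    return math.sqrt(35.55 + math.exp(17.72 - 0.327 * cag))

def symptom_free_prob(age, cag):
    """Eq. (15.17): probability of no neurological symptoms up to the given age."""
    z = math.pi * (age - midpoint_age(cag)) / (math.sqrt(3.0) * spread(cag))
    return 1.0 / (1.0 + math.exp(z))

def net_reproductive_rate(ages, lmb, cag=None):
    """Eq. (15.18); with cag=None it reduces to the ordinary net reproductive rate."""
    total = 0.0
    for x, v in zip(ages, lmb):
        weight = 1.0 if cag is None else symptom_free_prob(x, cag)
        total += v * weight
    return total

# Hypothetical assigned ages and l_x * m_x * b_x products, for illustration only.
ages = [17, 22, 27, 32, 37, 42]
lmb = [0.10, 0.25, 0.30, 0.20, 0.10, 0.05]

print("no Huntington's allele:", net_reproductive_rate(ages, lmb))
for cag in (36, 46, 56):
    print("CAG =", cag, "R0 =", net_reproductive_rate(ages, lmb, cag))
```

Under these assumptions, an allele at the 36-repeat threshold leaves the net reproductive rate nearly unchanged, whereas higher repeat numbers pull the onset into the reproductive ages and reduce it substantially.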
As a result, individual selection alone cannot prevent the evolution of senescence via genetic drift leading to the fixation of nearly neutral alleles with deleterious effects of late age of onset. Reproductive traits can also show age-specific deleterious effects. For example, Gruhn et al. (2019) show that chromosomal errors associated with centromeres and chromosome cohesion loss in meiosis increase as women age in a highly non-linear fashion. These errors lead to pregnancy loss and congenital disorders in the children of older mothers. Such parental age effects are also found in plants and contribute to the evolution of senescence (Barks and Laird 2020). Wensink et al. (2017) have argued that external mortality is not the driver of senescence in this model, but rather the stable age distribution. They illustrate their argument with a counter example. Suppose that a population consists of individuals that are not only ageless but also truly immortal, that is, ℓx = 1 for every x. However, suppose mb > 0, that is, reproduction still occurs. Since there is no death, there is infinite population growth. As a result of this infinite capacity to grow, the adults born in the past contribute less and less proportionately to the newborn class as time progresses, making their relative reproductive contributions go to zero. Once again, one gets effective neutrality for senescence at old ages. Their conclusion that external mortality is not the cause of selective conditions favoring senescence is somewhat misleading, however. In deriving Eqs. (15.15) and (15.16), external mortality is definitely the cause of effective neutrality, and it is also the cause of the stable age distribution – not the other way around. 
Recall that we must always take the gamete’s perspective in making microevolutionary predictions, so the fitness of the bearers of the mutant after it first appears is the only criterion to consider for predicting its initial probability of survival and its ultimate probability of fixation, as shown in previous chapters. External mortality is definitely the driver of age distribution and of the fitnesses shown in Eqs. (15.15) and (15.16) that favor mutational accumulation (the same is true for the antagonistic pleiotropy model of senescence to be discussed shortly). Hence, external mortality can indeed drive the evolution of senescence. In this chapter, we will also give the example of selection on the aa super-allele in Drosophila mercatorum. Measured fitnesses and age structures clearly show that an increase in external mortality (due to desiccation in the D. mercatorum example) favors the aa allele that is associated with increased early female fecundity but decreased late age viability – a classic senescent allele. However, we will


also see that environmental conditions that greatly reduce external mortality result in explosive population growth that leads to a young age structure. These conditions also favor the aa allele through Eqs. (15.15) and (15.16), as predicted by the model of Wensink et al. (2017) but contrary to the mutational accumulation model. However, even in this case, it is the fitnesses assigned to the various genotypes that drive the evolutionary processes, so the gamete’s fitness perspective always rules. Nevertheless, the young age distribution model is a legitimate alternative for the evolution of senescence through the impact of the age distribution upon the fitnesses assigned to genotypes through Eq. (15.1) or (15.10) even when external mortality is reduced. We now turn our attention to another class of mutations with life history effects. Suppose, as before, a mutation occurs that kills its bearers at age n−1. However, we now assume that this same mutation increases earlier reproduction from mb to mb′ such that mb′ > mb. For example, suppose this mutation is associated with transferring the energy used in maintaining viability after age n−1 to reproduction at earlier ages. This mutation therefore has a pattern of antagonistic pleiotropy (Chapter 11) because it is associated with traits that have opposite effects on fitness. The net reproductive rate of the individuals with this antagonistic pleiotropic mutant is:

$$R_0' = mb' \sum_{x=0}^{n-1} (1 - d)^x = \frac{mb'}{d}\left[1 - (1 - d)^n\right] \tag{15.19}$$

As before, the term in brackets in Eq. (15.19) goes to one as n increases, so if the age of onset of the deleterious effects of this mutant is old enough, then its net reproductive rate is approximately mb’/d which is greater than the net reproductive rate of mb/d for the non-mutants. Similarly, one can show that the Malthusian parameter for this pleiotropic mutant is, for n large, approximately mb’−d > mb−d. Once again, by either fitness criterion, bearers of this pleiotropic mutant are actually favored by natural selection as long as the deleterious effects have a late age of onset. In this case, our initial ageless population will evolve senescence due to the positive action of natural selection, that is, it is adaptive to senesce. Many mutations have been found that are associated with beneficial effects early in life and deleterious effects later in life. For example, in Chapter 11, we discussed several gene loci associated with resistance to falciparum malaria. Since most of the mortality associated with this parasite occurs in childhood, the beneficial effects of these malarial resistance genes are primarily expressed at an early age. However, the deleterious effects of these same genes (often associated with the chronic effects of anemia) are often not clinically significant until later in life. Hence, we have a classic trade-off between early versus late viability. Another example of trade-offs is the work of Parker et al. (2020) that compared two experimental lines of Drosophila melanogaster: one selected for postponed reproductive senescence and the other an unselected control. Focusing on 57 candidate loci, they found that all but one candidate locus affected at least one life history trait in one sex- or age-specific reproductive productivity. Of these, 23 candidate genes had antagonistic pleiotropic effects on lifespan and productivity, indicating that antagonistic pleiotropy on life history traits is common. 
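The comparison built on Eq. (15.19) can also be turned around to ask how late the lethal effect must act before the pleiotropic mutant is favored. The sketch below searches for the smallest n at which the mutant's net reproductive rate exceeds mb/d. The parameter values (mb = 0.06, a 10% increase to mb′ = 0.066, d = 0.05) and the function names are hypothetical.

```python
def pleiotropic_R0(mb_prime, d, n):
    """Eq. (15.19): net reproductive rate of a mutant that reproduces at ages 0..n-1."""
    return (mb_prime / d) * (1.0 - (1.0 - d) ** n)

def min_n_favored(mb, mb_prime, d, max_n=10000):
    """Smallest n for which the mutant's R0 exceeds the ageless mb / d.

    Returns None if no such n exists up to max_n (for example when
    mb_prime <= mb, so the early benefit can never outweigh the cost).
    """
    baseline = mb / d
    for n in range(1, max_n + 1):
        if pleiotropic_R0(mb_prime, d, n) > baseline:
            return n
    return None

mb, mb_prime, d = 0.06, 0.066, 0.05  # hypothetical illustrative values
n_star = min_n_favored(mb, mb_prime, d)
print("mutant favored once n reaches:", n_star)
print("R0 of ageless type:", mb / d, "R0 of mutant at that n:", pleiotropic_R0(mb_prime, d, n_star))
```

The threshold follows from rearranging Eq. (15.19): the mutant is favored when (1 − d)ⁿ < 1 − mb/mb′, so a larger early-reproduction benefit or a higher external mortality d allows an earlier age of death to be selected.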
One popular hypothesis is that antagonistic pleiotropy is inherent in those multicellular organisms that have separate germlines versus somatic lines (Kirkwood 1977). The basic idea is that there is an inherent energetic trade-off between allocating energy to maintain and increase the productivity of the germline versus allocating energy to maintain and grow the somatic line. Chen et al. (2020a) tested this hypothesis experimentally with zebrafish (Danio rerio) males that continue somatic and germline proliferation throughout life. They used a split-clutch design in which they eliminated the germline from some of the males by microinjection of antisense oligonucleotides to

knock down the germline-specific gene dead end. This had no effect on somatic development, including the development of testes and male behavior, but eliminated the germline. Males with and without germlines were then exposed to a variety of stressful environments or manipulations, and somatic recovery (and where appropriate, germline recovery) was measured. They found that somatic recovery occurred substantially faster in germline-free fish versus germline-carrying fish, but germline-carrying fish showed recovery in several traits related to offspring number and fitness. Overall, their results support the hypothesis that germline maintenance is costly and directly trades off with somatic maintenance. Govindaraju et al. (2020) focus on another inherent source of antagonistic pleiotropy: antagonistic effects between different targets of selection. They point out that aging entails an irreversible deceleration of physiological processes at the cellular level that leads to altered metabolic activity and a decline of the integrity of tissues, organs, and organ systems – both germline and somatic. DNA damage and mutations during mitosis can be repaired, albeit never perfectly; ordinary metabolism in cells generates various toxins that have to be removed or otherwise detoxified; proteins in the cells can be damaged and have to be broken down or removed and replaced; and cell growth and development must be tightly controlled to insure the viability and fertility of the individual. All of these processes take energy and are influenced by genes, so the speed of aging can be adjusted through evolution at those genes controlling these cellular repair and control processes. However, the immediate phenotypic expression of many of these genes is at the cellular level, with potential cascading effects up to the individual level. 
As we saw in Chapter 13, somatic cells are a legitimate target of selection with their own evolutionary processes that can sometimes be in conflict with selection at the individual level. We already discussed this issue in Chapter 13 with respect to cancer. Cancer arises out of the accumulation of mutations in cell lines (both somatic and germline) that eventually allows the cells to grow in an uncontrolled fashion that can kill the individual. Govindaraju et al. (2020) generalized this process to the accumulation in cell lineages of mutations that lead to an irreversible deceleration of physiological processes at the cellular level – and therefore aging and senescence at the individual level. They used the theory of Muller’s ratchet (Muller 1932) on the inevitable step-wise accumulation of even deleterious mutations under asexual reproduction. Because both germline and somatic cells reproduce asexually through mitosis within a multicellular individual, both cell types are susceptible to Muller’s ratchet – as we have already seen with the cellular evolution of cancer (Chapter 13). This Muller’s ratchet theory of the evolution of senescence is different from the mutational accumulation theory of inherited, late-acting mutations discussed earlier because it deals with the accumulation of mutations in cell lineages within a single individual that leads to antagonistic effects at the level of the individual, which in turn induces selection on the inherited genes that influence cellular repair and control. As seen with cancer (Chapter 13), sometimes the accumulation of mutations in a cell lineage leads to overcoming the mechanisms that control and limit cell growth that are essential for a healthy multicellular individual. One mechanism that controls and limits cell growth in many species is telomere shortening. Telomeres are the nucleoprotein structures at the ends of eukaryotic chromosomes. Telomeres typically have tandem repeats. 
For example, human telomeres contain tandem repeats of the hexanucleotide TTAGGG that span between 2 and 20 kb in somatic cells and greater than 20 kb in germline cells (Riethman 2008). Normal cells undergo progressive telomere shortening with cell division, and sufficient shortening can stop cell division (Ofir et al. 2002; Shay and Wright 2019). Many cancers become immortal by activating the enzyme telomerase to lengthen the telomeres, although other cancers lengthen telomeres through a DNA recombination mechanism (similar to what is shown in Figure 13.2) (Shay and Wright 2019). Because of the tendency of telomeres to shorten with each mitotic division, they may also serve as a biomarker for age


or senescence. Telomere length can also be influenced by environmental factors (e.g. Shlush et al. 2011), but even a rough measure of age or senescence would be valuable for studying life history in many species. Whittemore et al. (2019) reviewed the literature and did show there is an overall relationship between lifespan and telomere length across many species, but they also showed the relationship is even stronger with the telomere shortening rate. Telomere length is also correlated with individual age within a species and an individual’s remaining lifespan (Bichet et al. 2020). It is important to note that mutational accumulation, young age distributions, Muller’s ratchet on cellular evolution, and antagonistic pleiotropy are not mutually exclusive. All of these processes could contribute to the evolution of aging and senescence, but their exact mixture could vary from species to species. For example, the mutational accumulation model depends upon the concept of effective neutrality, which in turn depends both upon the age of onset as shown above, and upon the strength of genetic drift in the population. For example, Lohr et al. (2014) studied life history and aging in small populations of Daphnia magna and found no trade-offs but found much support for the mutational accumulation model. These results suggest that the evolution of lifespan and aging can be strongly affected by genetic drift – an expected result when fusing Ohta’s nearly neutral theory (Chapter 5) with the theory of mutational accumulation. Everman and Morgan (2018) used GWAS on experimental lines of D. melanogaster of large population size and measured various stress responses in flies of different ages. Their results revealed many genes that fit both the mutational accumulation model and the antagonistic pleiotropy model (they did not test the young age distribution model nor the Muller’s ratchet model). 
Clearly, these are not alternative models of senescence but all can jointly contribute to the evolution of senescence. So why do we grow old? Because it is evolutionarily inevitable under this mixture of models for nearly all multicellular species.

Abnormal Abdomen: An Example of Selection in an Age-Structured Population

In Chapter 6, we discussed population structure and gene flow in populations of the fruit fly Drosophila mercatorum living near the town of Kamuela (also known as Waimea) on the Island of Hawaii (see Figure 6.23). This population is also polymorphic for a supergene complex called abnormal abdomen (aa) that is under strong selection. We now return to this system for a detailed examination because it is an excellent example of natural selection in an age-structured population. Moreover, the abnormal abdomen story will illustrate many of the other themes about natural selection that have been developed in the last several chapters.

Genetic Architecture and Units and Targets of Selection Below the Level of the Individual

The abnormal abdomen supergene in the fly D. mercatorum gets its name from the fact that it is often associated with the retention of juvenile cuticle (larval type cuticle) on the adult abdomen. Another phenotype associated with this genetic syndrome is a slowdown in egg-to-adult developmental time. Both of these phenotypes are similar to phenotypes associated with the bobbed (bb) locus in D. melanogaster and D. hydei. Bobbed is known to be due to deletions in the X-linked 18S/28S ribosomal DNA (rDNA) (Ritossa 1976), and the aa supergene also maps to the mercatorum X chromosome in a region associated with the nucleolar organizer, the location of the 18S/28S rDNA (Templeton et al. 1985). Normally, the phenotypic expression of aa is limited to females,


but there is a Y-linked modifier that allows expression in males (Templeton et al. 1985). One of the few functional regions on the Drosophila Y chromosome is another cluster of 18S/28S rDNA. For all of these reasons, the 18S/28S rDNA became a candidate locus (Chapter 10) for this phenotypic syndrome. The 18S/28S rDNA in Drosophila exists as a tandem, multigene family of a repeating unit containing the DNA that codes for the 18S, 5.8S, and 28S RNA subunits of the ribosome, all separated by short, transcribed spacer sequences that are removed during the processing of the primary transcript (Figure 15.3a). Each unit is separated from adjacent units by a non-transcribed spacer (NTS). There are normally about 200–300 repeats in the rDNA cluster on the X, and a smaller cluster of rDNA units is found on the Y chromosome (Figure 15.3f) (Ritossa 1976). The bb mutations of D. melanogaster represent deletions in the rDNA on the X that severely reduce the number of 18S/28S units in the cluster. DeSalle et al. (1986) therefore examined the amount of rDNA in flies displaying the abnormal abdomen (aa) phenotype and flies without the abnormal abdomen phenotype. They found no deficiency of X-linked rDNA in either group of flies. However, they did find that the Y chromosomes that allowed expression of aa in males (Yaa) did indeed have a severe deletion of the Y-linked rDNA, thereby indicating that rDNA was involved in the aa syndrome. Because the quantity of X-linked rDNA was normal in aa flies, DeSalle et al. (1986) next turned their attention to the quality of the rDNA. They discovered that all lines showing the aa phenotype had at least a third of their 28S genes disrupted by a 5-kilobase (kb) intervening sequence, now known to be an R1 insert (Figure 15.3b–d). R1 inserts are a type of retrotransposon (Chapter 13) found in arthropods that transpose into a particular site in the rDNA 28S subunit (Figure 15.3b–d).
Genetic variation at an EcoR1 restriction site (Appendix A) exists among the R1 elements even within a single rDNA cluster of D. mercatorum (Figure 15.3b and c). Another type of retrotransposon, the R2 insert, also has this specificity for 28S rDNA, and the R2 elements insert in a site just 5′ of the R1 insertion site in D. mercatorum (Figure 15.3d and e). As shown in Figure 15.3, both types of inserts, singly and in combination, exist in D. mercatorum, even on a single X chromosome (Malik and Eickbush 1999). Such inserts functionally inactivate the 18S/28S unit in which they are embedded by disrupting normal transcription and processing (DeSalle et al. 1986), so the presence of these inserts functionally inactivates much of the rDNA, thereby explaining the similarities of aa to bb even though there is no physical deletion of rDNA. Such inserts have two ways of spreading within the rDNA cluster of the X chromosome. First, they can transpose. The act of transposition often creates some genetic heterogeneity at the 5′ end of the R1 elements and the 3′ end of the R2, and such heterogeneity is found among different R1 and R2 copies, thereby indicating that these elements have been actively transposing in D. mercatorum (Malik and Eickbush 1999). Second, the R1 and R2 elements can spread within a tandem, multigene family via mechanisms of unequal exchange (Chapter 13). Unequal exchange spreads inserts to adjacent subunits and therefore creates blocks of inserted versus non-inserted 28S subunits (see Figure 13.2), whereas transposition should lead to a non-clustered distribution within the rDNA. The inserted and non-inserted 28S repeats are highly clustered within the rDNA (DeSalle et al. 1986), indicating that unequal exchange is quantitatively more important than transposition as a mechanism for spread within X chromosomes.
Because the R1 and R2 elements can spread within the multigene family by a combination of transposition and unequal exchange, these retrotransposons are both units and targets of selection below the level of the individual (Chapter 13). If the molecular mechanisms of spread were strong, we would expect little variation among the paralogous copies within a chromosome but much variation among homologous rDNA families on different X chromosomes at the population level. In contrast, if the molecular mechanisms of


[Figure 15.3 panels and their labeled restriction fragments: (a) Uninserted X (A, B); (b) R1 Inserted EcoRI+ X (G/F, I, C, B); (c) R1 Inserted EcoRI− X (9.4 kb, C, B); (d) R1 and R2 Inserted X (5.3 kb, I', C, B); (e) R2 Inserted X (5.3 kb, 3.8 kb, B); (f) Uninserted Y (A, D).]

Figure 15.3 18S/28S rDNA repeat types found in Drosophila mercatorum. Thin lines indicate spacers and blocks coding regions. A whole repeat is shown connected to the 18S rDNA of the next repeat. Blocks labeled R1 indicate R1 type retrotransposon insertions, and blocks labeled R2 indicate R2 type retrotransposon insertions. The block coding for 5.8S rRNA between the 18S and 28S coding regions is not labeled. EcoR1 restriction sites are shown by vertical lines connected to a triangle, and the restriction fragments are designated by letters (DeSalle et al. 1986). Those fragments labeled with their length in kilobases are fragments expected to appear on Southern blots using a complete uninserted 18S/28S repeat as a probe. Source: Based on data from DeSalle et al. (1986) and Malik and Eickbush (1999).

spread were weak, we would expect much variation among the paralogous copies within a chromosome as well (Chapter 13, Weir et al. 1985). Of course, flies that had all their 28S genes bearing an insert would be nonviable, so selection at the individual level would prevent inserted repeats being in every paralogous copy. Genetic variation among inserted 28S repeats certainly exists (Figure 15.3


and R1 elements with large 5′ truncations reported in Malik and Eickbush 1999), so if molecular mechanisms of spread are strong, we would expect only one insert type to dominate the inserted subset of repeats on a single chromosome. Hence, the strength of the molecular mechanisms for insert spread can be assessed by examining the patterns of variation of inserted repeats both within and among chromosomes. All the variants shown in Figure 15.3 were isolated using PCR-based techniques. Such techniques can find even extremely rare variants, but because of biases in the PCR procedure, relative abundance of the alternative forms cannot be estimated reliably. However, for the standard aa test strain (derived from the Kamuela population but subjected to intense selection in the laboratory for morphological expression of juvenilized abdominal cuticle), Malik and Eickbush (1999) estimated that 86% of the X-linked 28S repeats had inserts with any of the insert patterns shown in Figure 15.3. Much of the original work on aa was performed in the 1980s and utilized Southern blot techniques (Appendix A). This technique has the advantage that quantitative information exists about the relative abundance of various repeat types if they produce different band lengths that hybridize with the probe being used. Using this technique, the banding pattern associated with Figure 15.3b seemed to be dominant among the inserted class. However, band intensity in a Southern blot depends upon many factors in addition to the relative abundance of various DNA fragments, such as the size of the fragment, what proportion of the fragment hybridizes with the probe being used, and how far the fragment migrates in the agarose gel. Taking all these factors into account, Hollocher et al. (1992) and Templeton et al.
(1993) developed an unbiased estimator with high replicability for the proportion of X-linked repeats bearing the EcoR1+ R1 insert as shown in Figure 15.3b from densitometry scans of Southern blots on D. mercatorum males (which have only one X chromosome):

$$\text{Prop}(\text{EcoR1}^{+}\ \text{R1}) = \frac{G}{\dfrac{A(B-C)}{B-C+D} + G} \tag{15.20}$$

where the italicized letters refer to the densitometry readings on the Southern blot of the various bands indicated in Figure 15.3. Because of length variation in the 5′ end of the R1 sequence, several different bands are often seen close to one another (indicated by the G/F notation in Figure 15.3), so the “G” in Eq. (15.20) refers to the sum of the densitometry readings of all these bands. Applying Eq. (15.20) to replicate scans from the aa stock reveals that 76% of the X-linked repeats bear the EcoR1+ R1 insert, with a standard deviation of 7%. The value of 76% is not statistically different from the value of 86% reported by Malik and Eickbush (1999) (t = 1.36 with 9 degrees of freedom), implying that the vast majority of inserted repeats have just one type of insert, the EcoR1+ R1 element. This means that the insert types shown in panels (c), (d), and (e) of Figure 15.3 and large 5′ R1 truncations are minor contributors to inserted rDNA elements in the aa stock. The more important question is the contribution of the various insert types to variation among X chromosomes from the natural population. Figure 15.4 shows the distribution of the proportion of 28S rDNA repeats that bear the EcoR1+ R1 insert in 1036 X chromosomes extracted from the Kamuela natural population of D. mercatorum. As is readily seen, there is extensive variation among X chromosomes for the proportion of EcoR1+ R1 inserted repeats, a result consistent with the strong degree of within-chromosome dominance of the EcoR1+ R1 insert found in the aa stock, implying strong molecular mechanisms for the spread of the EcoR1+ R1 insert. As shown by the model of Weir et al. (1985), the same selective forces at the molecular level that would lead to the within-chromosome dominance of the EcoR1+ R1 insert are also expected to lead to much inter-chromosomal variation (Chapter 13). As Figure 15.4 shows, there is indeed much


[Figure 15.4: histogram. x-axis: Proportion of X-Linked rDNA Repeats with the EcoR1+ R1 Insert (bins centered at 0.075 through 0.975); y-axis: Frequency in Population (0 to 0.16).]

Figure 15.4 The proportion of 28S rDNA repeats that bear the EcoR1+ R1 insert in 1036 X chromosomes surveyed from the natural population of Drosophila mercatorum living in the vicinity of Kamuela, Hawaii.

interchromosomal variation in EcoR1+ R1 insert proportion. However, Figure 15.4 does not directly address the relative abundance of all the insert types shown in Figure 15.3 in the natural population. To test the prediction that the EcoR1+ R1 insert dominates overall on X chromosomes extracted from the natural population, consider another estimator of the proportion of inserted repeats:

$$\text{Prop}(\text{inserted repeats}) = \frac{C}{B} \tag{15.21}$$

This estimator uses fragments near the 3′ end of each repeat element, and thereby detects the inserted elements shown in panels (b), (c), and (d) of Figure 15.3, but fails to detect those in panel (e) (R2 alone inserted). However, the exclusion of R2-only repeats is expected to produce little error because Malik and Eickbush (1999) noted that most elements bearing an R2 insert also bore an R1 insert. Moreover, elements bearing just an R2 insert should yield a 3.8 kb band on a Southern blot (see Figure 15.3), but such a band was absent from almost all autoradiographs and was at most very faint on a few. Hence, most R2 inserts occur in conjunction with an R1 insert in the natural population. Finally, because estimator 15.21 uses a 3′ fragment, it would not be affected by 5′ truncations of the R1 elements. Hence, estimator 15.21 should capture virtually all of the heterogeneity in inserted elements described by Malik and Eickbush (1999). Estimator 15.21 was not used in the original analysis because the C band has a weak hybridization signal and is more diffuse because of the greater distance it must travel through the agarose gel (Hollocher et al. 1992). These properties cause estimator 15.21 to be severely biased. However, Malik and Eickbush (1999) made a suggestion that allows us to use estimator 15.21 to test whether there is significant variation in nature for the insert types other than EcoR1+ R1. As will be discussed shortly, there is another X-linked element that is necessary for expression of the aa phenotype called the under-replication (ur) locus, which has two alleles in the natural population, ur+ and uraa, that can be scored by testcrosses with the aa stock. Hollocher et al.


Table 15.3 Numbers of ur+ and uraa X chromosomes with their percentage of inserted 28S genes as estimated from Eq. (15.20) that are above and below the overall median value for all chromosomes.

Allele at ur Locus on X    Below Insert Median    Above Insert Median
ur+                        281                    251
uraa                       137                    167

Source: Hollocher et al. (1992). © 1992, Genetics Society of America.

(1992) measured the degree of association (a type of linkage disequilibrium) between the EcoR1+ R1 insert proportions of each male shown in Figure 15.3 and the ur allele borne by each male as determined by a testcross with aa females. The significance and strength of the association were measured through a median test (Appendix B). In this test, all the X chromosomes scored for both insert proportion and ur are ranked by insert proportion and divided into those above and below the median. The X chromosomes are then subdivided into those that are ur+ and those that are uraa to yield a 2 by 2 categorical table. The degree of association between insert proportion and the allelic state at the ur locus can then be measured by a standard chi-square homogeneity test of the null hypothesis of no association. Table 15.3 gives the results when insert proportion is estimated using Eq. (15.20), which yields a chi-square of 4.65 with 1 degree of freedom (p = 0.03). Malik and Eickbush (1999) predicted that this statistic of association should be strongly influenced if the inserted types ignored by Eq. (15.20) also show significant variation in natural populations. Although estimator 15.21 is biased, it can still be used in statistics that are invariant to shifts in mean, which includes the median test. Therefore, the median test statistic of association was recalculated using Eq. (15.21) to yield the results shown in Table 15.4. The overall numbers show minor differences between Tables 15.3 and 15.4 because each table includes only those individuals who were successfully scanned for the bands relevant to a particular estimator. As a consequence, there are a few individuals included in one analysis that are not in the other. Despite these minor differences in the samples, the results in both tables are virtually identical. In Table 15.4, the chi-square statistic is 5.58 with 1 degree of freedom (p = 0.02).
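The chi-square values reported for the two median tests can be checked directly from the published counts. The following is a minimal sketch in Python, not the original analysis code: the function name `chi_square_2x2` is hypothetical, no continuity correction is applied (an assumption), and the p-value uses the identity for a chi-square variate with 1 degree of freedom, p = erfc(sqrt(chi-square / 2)).

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square homogeneity test (no continuity correction)
    for a 2 x 2 table given as [[a, b], [c, d]].
    Returns (chi_square, p_value) with 1 degree of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of a chi-square variate with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Rows: ur+, uraa; columns: below median, above median.
table_15_3 = [[281, 251], [137, 167]]  # insert proportion from Eq. (15.20)
table_15_4 = [[287, 254], [136, 169]]  # insert proportion from Eq. (15.21)

for name, table in [("Table 15.3", table_15_3), ("Table 15.4", table_15_4)]:
    chi2, p = chi_square_2x2(table)
    print(f"{name}: chi-square = {chi2:.2f}, p = {p:.2f}")
```

With these counts the uncorrected statistic comes out at 4.65 (p = 0.03) and 5.58 (p = 0.02), matching the values quoted in the text. Because a median split fixes the column totals at half the sample, only the row in which a chromosome falls carries information about the association.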
The near identity of the results using these two extremely different estimators indicates that the inserts other than the EcoR1+ R1 insert make no significant contribution to variation in the X-linked rDNA found in the natural population near Kamuela. The overall pattern is therefore one of homogeneity among the insert types found within a chromosome (dominated by the EcoR1+ R1 insert) and heterogeneity among X chromosomes in the proportion of EcoR1+ R1 inserts (Figure 15.4). This is the pattern expected when the molecular mechanisms that allow the spread of the EcoR1+ R1 insert are strong (Chapter 13). This conclusion was empirically confirmed by the direct monitoring of strains. For example, a parthenogenetic strain of D. mercatorum (Chapter 13) was established that had only a few of its 28S genes bearing the EcoR1+ R1 insert. Within three years, a parthenogenetic sublineage derived from this strain had

Table 15.4 Numbers of ur+ and uraa X chromosomes with their percentage of inserted 28S genes as estimated from Eq. (15.21) above and below the overall median value for all chromosomes.

Allele at ur Locus on X    Below Insert Median    Above Insert Median
ur+                        287                    254
uraa                       136                    169


developed the aa phenotype, and upon examination, it was found that most of its 28S X-linked repeats now had the EcoR1+ R1 insert, indicating a drastic turnover in its rDNA complement in a relatively short time interval (DeSalle et al. 1986). These direct observations confirm that strong forces for paralogous spread exist for the EcoR1+ R1 insert, and this conclusion is compatible with the population-level observations on intra- and inter-chromosomal patterns of variation. Hence, the EcoR1+ R1 insert is a unit of selection in itself and is also a target of strong selective forces below the level of the individual because it has the ability to replicate and spread within the genome. Evidence of strong selection directly on the insert below the level of the individual does not mean that molecular-level forces overwhelm the evolutionary forces occurring at the level of individuals in a reproducing population, a frequent misinterpretation (Dover 1982; Malik and Eickbush 1999). Rather, as we saw in Chapter 13, the strong molecular-level selection means that this multigene system behaves as a single locus with respect to genetic drift at the population level. Moreover, the strong molecular-level forces create much genetic variation among chromosomes at the population level (Figure 15.4), which enhances the response to natural selection for targets of selection at the individual level and above. Thus, the strong molecular-level forces seen in aa augment, not diminish, the ability of this system to be subject to population-level evolutionary forces, such as genetic drift and selection targeted at the level of individuals. This is true not just for the R1 retrotransposons but for many other transposable elements as well (Kidwell and Lisch 2000).

Genetic Architecture and Units of Selection at the Level of the Individual

Surveys of several strains displaying the aa phenotype revealed that they all had one-third or more of their 28S genes with the aa insert (DeSalle et al. 1986). Thus, having a third or more of aa inserted 28S genes appears to be necessary for the aa syndrome. However, these same surveys revealed that not all strains with more than a third of aa inserted 28S genes display the aa phenotype. Hence, having a third or more of the 28S genes bearing the aa insert is necessary but not sufficient for the syndrome. As discussed in Chapter 8, when the causes of phenotypic variation are not both necessary and sufficient, interactions among variable, causative factors are frequently implicated. Indeed, DeSalle and Templeton (1986) discovered that a second, interactive criterion must also be satisfied: there must be no preferential under-replication of inserted rDNA repeats in the polytene tissues of the larval fat body. To understand the molecular nature of this second necessary, interactive factor, we must first briefly review what happens to rDNA in somatic tissues in the genus Drosophila. The rDNA codes for components of the ribosomes that in turn are necessary for translating messenger RNA into proteins. The ability to synthesize proteins is such a critical and necessary function of most cells that organisms have evolved several mechanisms to buffer themselves against physical or functional deletions of rDNA. In Drosophila, diploid cells display a type of somatic amplification of their rDNA in response to physical or functional deletions of rDNA repeats known as compensatory response (Tartof 1971). Drosophila mercatorum flies with the aa syndrome also show compensatory response in their diploid tissues, indicating that the aa insert is indeed causing a functional deficiency of ribosomal RNA, but one that can be compensated for in diploid somatic cells.
Drosophila also have polytene tissues in which there is much endoreplication of DNA, creating the giant polytene chromosomes of Drosophila and many other insects (an example is shown in Figure 1.4). Euchromatic DNA is greatly amplified in polytene tissues, but the rDNA tends to be under-replicated relative to the euchromatin. For example, in the polytene salivary glands of D. melanogaster larvae, most euchromatic DNA is at a level of polytenization of 1024 copies per cell, but the 18S/28S rDNA is present at only about 128 copies (Ritossa 1976). This is also true for D. mercatorum, and Figure 15.5 shows a cartoon version of this rDNA under-replication. One way


[Figure 15.5 panels: (a) aa; (b) Non-aa. Legend: Euchromatic DNA; Intergenic Spacer; External Transcribed Spacer; 18S rRNA Gene; Internal Transcribed Spacer (with 5.8S and 2S rRNA Genes); 28S rRNA Gene; aa Insert (R1 Retrotransposon).]

Figure 15.5 A cartoon of the X-linked rDNA multigene family showing the lack of selective under-replication of 28S repeats bearing the aa R1 retrotransposon in X chromosomes bearing the uraa allele (a) and the occurrence of selective under-replication in X chromosomes bearing the ur+ allele (b).

to compensate for a deficiency in rDNA in polytene tissues is to have less under-replication of the rDNA overall, and this occurs in D. mercatorum (Malik and Eickbush 1999). However, the degree of under-replication need not be uniform over all rDNA repeats. It has been well documented in D. melanogaster that selective replication can favor certain repeat types within the rDNA (Spradling 1987), and the same is true for D. mercatorum (DeSalle and Templeton 1986). In flies that do not express aa but nevertheless have a large portion of their 28S genes bearing inserts, there is preferential under-replication of the inserted 28S repeats (DeSalle and Templeton 1986). Because of this preferential under-replication coupled with diminished overall under-replication, the uninserted functional 28S repeats are effectively over-replicated relative to the nonfunctional inserted repeats (Figure 15.5b), and the resulting tissue seems not to be affected by any functional deficiency of ribosomes. In contrast, aa flies have a uniform under-replication across the rDNA cluster in the polytene fat body tissue (Figure 15.5a), although Malik and Eickbush (1999) report preferential under-replication of the inserted 28S repeats in polytene salivary glands of aa flies. However, as will soon be apparent, the phenotypic significance of aa at the individual level arises from its effects in the fat body, not the salivary gland, so it is the absence of preferential under-replication in this tissue that is critical to the aa phenotype. The presence or absence of preferential under-replication in the fat body is controlled by an X-linked locus (the under-replication or ur locus) with two alleles, ur+ (which allows for preferential under-replication) and uraa (which leads to uniform under-replication). Hence, the expression of aa requires two molecular conditions: (i) a third or more of the 28S repeats must bear the aa insert and (ii) uniform under-replication must occur in the fat body.
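The interaction between these two conditions can be illustrated numerically. The sketch below is purely illustrative and is not taken from the studies cited: the function `functional_fraction` and the replication levels passed to it are hypothetical, chosen only to contrast uniform with preferential under-replication. It computes the fraction of 28S copies in a polytene cell that lack an insert, given the proportion of inserted repeats on the chromosome and the number of polytene copies made of each repeat class.

```python
def functional_fraction(prop_inserted, copies_uninserted, copies_inserted):
    """Fraction of 28S copies in a polytene cell that lack an insert.

    prop_inserted     -- proportion of 28S repeats on the X bearing an insert
    copies_uninserted -- polytene copies made of each uninserted repeat
    copies_inserted   -- polytene copies made of each inserted repeat
    """
    functional = (1.0 - prop_inserted) * copies_uninserted
    nonfunctional = prop_inserted * copies_inserted
    return functional / (functional + nonfunctional)

# Hypothetical chromosome with half of its 28S repeats inserted.
prop_inserted = 0.5

# Uniform under-replication (uraa-like): both classes reach the same level.
uniform = functional_fraction(prop_inserted, 128, 128)

# Preferential under-replication (ur+-like): inserted repeats are copied
# less often than uninserted ones (32 versus 128 is an arbitrary choice).
preferential = functional_fraction(prop_inserted, 128, 32)

print(f"uniform: {uniform:.2f}, preferential: {preferential:.2f}")
```

Under uniform under-replication the functional fraction simply equals the proportion of uninserted repeats on the chromosome, so the polytene tissue inherits any deficiency in full. Under preferential under-replication the same chromosome yields a pool of rDNA copies that is mostly functional, which is why both conditions are needed for the syndrome.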
Both of these molecular elements show genetic variation in a natural population living near Kamuela, Hawaii (Hollocher et al. 1992). Although these X-linked components are separable by recombination, the linkage is tight with a recombination frequency of 0.004 (Templeton et al. 1985). As indicated above, these two elements also display extensive epistasis at the molecular level, and if the aa syndrome is selected at the


individual level, this should translate into strong fitness epistasis as well. That such is the case is indicated by the linkage disequilibrium between these two elements in nature (Tables 15.3 and 15.4). In particular, the nature of the disequilibrium is such that X chromosomes bearing the uraa allele have higher proportions of inserted rDNA. As a consequence, virtually every X chromosome in the natural population with a uraa allele is also above the one-third threshold level of inserted repeats. Thus, in nature, both of the necessary elements for the aa syndrome co-segregate because of disequilibrium and tight linkage. As a consequence of this disequilibrium and the strong molecular-level forces, this multigene system is effectively a supergene complex that displays inheritance patterns close to that of a single Mendelian locus (Hollocher et al. 1992). Hence, at the level of a population of genetically variable individuals, the combination of tight linkage and strong epistasis makes our unit of selection for the phenotypic consequences of aa the entire ur/rDNA supergene complex (Chapter 13) on the X chromosome when the individual is the target of selection. For much of the subsequent discussion, we will treat this supergene complex as if it were a single X-linked locus with two alleles, aa, which has both the uraa allele and a third or more of the 28S genes with the aa insert, and +, which does not satisfy one or both of these conditions. However, toward the end of the discussion, we will look more carefully at how selection has shaped the different components of this supergene complex.
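To see why a recombination frequency of 0.004 helps keep the two elements together, recall the standard result that, under random mating and in the absence of selection, linkage disequilibrium decays as D_t = D_0(1 - r)^t. The sketch below applies that textbook recursion. It is an illustration under stated assumptions rather than an analysis from the studies cited: it treats 0.004 as the effective per-generation recombination rate and ignores the fact that X chromosomes recombine only in females, which would slow the decay further. The function names are hypothetical.

```python
import math

def ld_after(d0, r, generations):
    """Linkage disequilibrium after a number of generations of random
    mating with recombination rate r and no selection: D_t = D_0 (1 - r)^t."""
    return d0 * (1.0 - r) ** generations

def ld_half_life(r):
    """Generations needed for D to fall to half its starting value."""
    return math.log(0.5) / math.log(1.0 - r)

# Recombination frequency between ur and the rDNA (Templeton et al. 1985).
r = 0.004

print(f"fraction of D remaining after 100 generations: {ld_after(1.0, r, 100):.2f}")
print(f"half-life of D: {ld_half_life(r):.0f} generations")
```

Under these assumptions about two-thirds of the initial disequilibrium remains after 100 generations, and recombination alone needs roughly 173 generations to halve it. The association is therefore eroded only slowly, which fits the observation that the two elements behave almost as a single Mendelian locus.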

Phenotypes and Potential Targets of Selection at the Level of the Individual

The tissue level consequence of aa in the fat body is to produce a deficiency of functional rDNA, which in turn can lead to a deficiency of ribosomes in the fat body. This in turn can lead to reduced rates of protein translation in the fat body. However, not all proteins are sensitive to translational control from a deficiency of ribosomes. The proteins that are expected to be most sensitive to a ribosomal deficiency are those that are synthesized in large quantities over a short time interval. One such protein is juvenile hormone (JH) esterase, which is synthesized in the fat body in large quantities during the late third larval instar (the last instar before pupation in Drosophila) and prepupal stages. To understand the significance of this esterase protein, we must first briefly discuss the role of juvenile hormone in Drosophila. JH is typically in high titers during the larval phase of life. The transition from the larval to the pupal stage of life is triggered by an increasing ratio of another insect hormone, ecdysone, relative to JH. Part of this changing ratio is due to the production of JH esterase by the fat body starting in the late third instar stage. This enzyme degrades JH, thereby causing the JH titer to decline and the ecdysone/JH ratio to increase. JH esterase is a common physiological mechanism in holometabolous insects (those undergoing full metamorphosis) to reduce JH titers at critical developmental times. The production of JH esterase should be sensitive to translational control as this is a protein that is produced in the fat body in large quantities during the late third instar stage.
Templeton and Rankin (1978) measured the activity of JH esterase in late third instar larvae reared at 25 °C by topically applying tritiated Cecropia JH (JH is readily absorbed through the cuticle) followed by cytosolic extractions three hours later that were then fractionated to separate the amount of labeled JH from its degradation products (Figure 15.6). By summing the total number of counts in the JH fractions and in the JH degradation fractions, the percent of labeled JH that was degraded in this three-hour period was estimated to be 10.4% in aa flies and 33.7% in non-aa flies, indicating that non-aa larvae degraded about 3.2-fold more JH than aa larvae in the late third instar. Hence, the aa syndrome is indeed characterized by a large reduction in the amount of JH esterase in late third instar larvae. This observation explains two of the phenotypic characteristics of the aa syndrome. First, reduced amounts of JH esterase in late third instar larvae would lead to a slower decay of JH,


[Figure 15.6: line plot for aa and Non-aa stocks. x-axis: TLC Fraction Number (1 to 14), with fractions labeled Degradation Products of JH and JH; y-axis: Counts Per Minute (logarithmic scale, 100 to 10000).]

Figure 15.6 Juvenile hormone (JH) esterase activity after three hours of incubation of late third instar larvae with 3H-Cecropia JH in aa and non-aa stocks of Drosophila mercatorum. TLC (thin-layer chromatography) fraction numbers 10 through 14 correspond predominantly to intact JH, whereas numbers 1 through 9 correspond to JH degradation products.

which in turn would cause a prolongation of the larval phase. Such a prolongation of the larval phase is indeed observed in aa flies. Second, the abdominal histoblasts undergo their adult differentiation in the late third instar/prepupal phase, and hence are extremely sensitive to JH titer at this stage (Riddiford et al. 2003). Juvenilized adult abdominal cuticle can be induced in Drosophila as a phenocopy by topical application of JH at this stage of development (Oliveira Filho 1975). The juvenilized adult abdominal cuticle that gives the aa syndrome its name can therefore also be explained by high titers of JH in the late third instar that result from the reduced translation of JH esterase proteins in the fat body of aa flies. Premise three (Chapter 1) states that phenotypes arise from interactions between genotypes and environment. The phenotype of juvenilized adult abdominal cuticle is no exception. When reared in a laboratory environment at 25 °C, there is extensive expression of juvenilized abdominal cuticle in aa flies. However, rearing larvae at 22 °C slows down the rate of larval development and prolongs the larval phase. In this temperature-induced slowdown of larval development, aa flies rarely


display juvenilized cuticle. Whether or not the phenotype of juvenilized cuticle is expressed in natural populations is critical for understanding the role of natural selection on this syndrome because the phenotype of juvenilized cuticle induces strong negative selection. At the end of the pupal phase, flies inhale air to expand their abdomens to help them literally pop out of the pupal case. When flies with juvenilized abdominal cuticle do this, the top of the pupal case pops open as normal, but the juvenilized cuticle of the inflated abdomen often sticks to the inside of the pupal case, preventing the fly from emerging from the pupal case and causing it to die. Moreover, of those flies that successfully emerge from the pupal case, flies with juvenilized abdominal cuticle are more prone to death by desiccation. Indeed, the selective pressures are so intense against this phenotype in the laboratory at 25 °C that the phenotype of juvenilized cuticle is rapidly eliminated unless actively counteracted by artificial selection every generation favoring flies with juvenilized cuticle. Interestingly, many of the aa stocks that revert to a normal cuticle phenotype with the cessation of artificial selection rapidly evolve preferential under-replication of inserted 18S/28S rDNA units in the polytene fat body rather than any changes in the rDNA itself (DeSalle and Templeton 1986). As noted earlier, Malik and Eickbush (1999) found preferential under-replication of inserted rDNA units in the larval polytene salivary gland in an aa strain. One interpretation of this observation is that the failure of preferential under-replication in aa occurs in fat body but not in salivary gland polytene tissues. However, Malik and Eickbush (1999) failed to artificially select for juvenilized cuticle after receiving the aa strain, and when they examined the aa strain after their experiments, its abdominal cuticle was normal (Eickbush, personal communication).
Unfortunately, Malik and Eickbush did not monitor the cuticle as they bred the aa flies in their laboratory, so there is no way of knowing when the reversion to wild-type cuticle occurred relative to their experiments. Hence, their experiments could mean that there is preferential under-replication in salivary polytene tissues, or they could mean that preferential under-replication in the polytene tissues had evolved in their stock that was not subjected to artificial selection for juvenilized cuticle. The results of Malik and Eickbush (1999) are therefore biologically uninterpretable because they ignored the strength of selection against the phenotype of juvenilized cuticle in a laboratory environment. Because the phenotype of juvenilized cuticle can induce strong selection under suitable environmental conditions, it is critical to examine wild-caught individuals for this phenotype. As pointed out in Chapter 6, populations of D. mercatorum live in the Kohala Mountains on the Island of Hawaii (Figure 6.23). The sole larval food resource in this region is rotting cladodes of the prickly pear cactus Opuntia megacantha. Figure 15.7 shows the distribution of the cacti in this area at the time of these studies. The range of the cacti on the mountainside spans a dramatic humidity gradient. The windward side of Kohala has a rainforest existing at the top and extending down the slope and then rapidly transitioning into a desert. Site A in Figure 15.7 is close to the rainforest and is humid. However, site B, only 300 m downhill from site A, is much drier, and sites F and IV are extremely dry. There is also a temperature gradient on the mountainside (Figure 15.8). The cladodes of this cactus are large (Figure 15.9), and their large size should dampen considerably the temperature fluctuations experienced throughout the daily cycle. 
Consequently, a Drosophila larva in a cladode probably does not experience the maximum or minimum temperatures, but rather only intermediate temperatures. As can be seen from Figure 15.8, this implies that at all sites the larval rearing temperature is well below 25 C. Based upon the laboratory norm of reaction, this implies that juvenilized abdominal cuticle should rarely be expressed under natural environmental conditions. In addition, autosomal loci have been identified in the natural population that have alleles that suppress the juvenilized cuticle trait in aa/aa flies (Templeton et al. 1993). Given the low larval temperatures in nature and the existence of epistatic suppressors of juvenilized cuticle in the gene pool, it is not surprising that only a handful of wild-caught flies had small patches of


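[Figure 15.7: map of the study area showing elevation contours labeled from 2200 to 3600 (contour interval 40 feet, 12.2 m), collecting sites A, B, C, D, F, and IV, the road, true north, and scale bars of 1 km and 1 mile.]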

Figure 15.7 Map of the distribution of Opuntia megacantha near Kamuela (also known as Waimea), Hawaii. The dotted lines enclose the area in which the cactus was found during much of the 1980s. Transects of collecting sites on the slopes of the Kohalas are indicated by the letters A, B, C, D, and F, and a collecting site in the saddle at the base of the Kohalas is indicated by IV. A map of the Island of Hawaii is in the upper, right-hand corner, and the small rectangle on that map shows the position of the detailed larger map near Kamuela.

[Figure 15.8: plot with y-axis, Temperature in Degrees Celsius (14–26); x-axis, Collecting Site (A, B, C, F, IV); one series each for 6, 10, 14, and 18 hours.]

Figure 15.8 Mean temperatures as measured by hygro-thermographs placed within shaded cactus patches at the collecting sites shown in Figure 15.7 over the period 1980–1990. Four different times are plotted for each site: the temperature at 6:00 hours (close to the minimum for the entire day/night cycle), 10:00 hours, 14:00 hours (close to the maximum for the entire day/night cycle), and 18:00 hours. Over the nighttime hours, there is generally a steady decline of the temperature at 18:00 hours to that shown at 6:00 hours.


Figure 15.9 A picture of a small portion of a cactus patch at site B. The Drosophila collector is Dr. Bonnie Templeton, who is 1.57 m in height.

juvenilized abdominal cuticle out of tens of thousands examined even though they were drawn from populations with a high frequency of the aa super-allele. Thus, the phenotype of juvenilized cuticle and the strong selective forces associated with it appear to play no role in selection on aa in natural populations. The other phenotype that is predicted to occur from the low JH esterase activity in third instar larvae (Figure 15.6) is the prolongation of the larval phase, leading to an increase in egg-to-pupal developmental time. This phenotype is also sensitive to temperature and density conditions in the laboratory environment, but the slowdown of aa flies relative to non-aa flies is robust under a broad range of laboratory conditions, amounting to 1.28 days longer for flies from aa lines relative to non-aa lines when averaged across both sexes (Templeton et al. 1993). Only the average across sexes was observed in these experiments because pupae are difficult to sex. However, sex effects were expected. In general, males do not display the abnormal abdomen syndrome because there is a cluster of rDNA on the Y chromosome that appears to be immune to insertion by the R1 and R2 elements (Figure 15.3f). This Y-linked rDNA cluster suppresses the effects of aa in males, although some Y chromosomes exist in the natural population that have a deletion of this rDNA cluster, thereby allowing expression of aa in males (Hollocher et al. 1992; Hollocher and Templeton 1994). However, the laboratory lines of aa used in these experiments all had Y chromosomes with the rDNA cluster. Hence, males should not be affected, which implies that the average developmental delay in reaching the pupal stage in these experiments was actually around 2.56 days in aa females, twice the 1.28-day average across the two sexes.
Although the egg-to-pupa developmental delay appears to be a more robust phenotype of aa in the laboratory, the story of the juvenilized cuticle phenotype and premise three warn us that it is important to measure the phenotypes of interest under natural environmental conditions and


genetic backgrounds if possible. Unfortunately, the phenotype of egg-to-pupa developmental time is difficult to measure in nature, but it is feasible to measure relative egg-to-adult developmental time. Laboratory experiments indicate that this phenotype also differentiates aa from non-aa flies in a robust manner, with aa females taking an average of 0.52 days longer to reach the adult stage than non-aa females, but with no developmental slowdown in males. To see whether or not this developmental delay occurs in nature, patches of the cactus Opuntia megacantha at the sites shown in Figure 15.7 were inspected between 1982 and 1990 for rotting cladodes, the sole larval food resource for the Kamuela population of D. mercatorum. The rots were bagged and thereafter inspected daily, and all adult flies that emerged over the previous 24 hours were aspirated out and genotyped. To ensure that we sampled the entire emergence from a rot, we included in this analysis only those rots for which the first flies emerged three or more days after bagging, indicating that the rot was bagged early enough to capture the complete emergence profile. If there is a difference in emergence times between aa and non-aa females, it could either be due to a difference in egg-to-adult developmental time or to a difference between aa and non-aa females in the stage of the rot at which they tend to oviposit. Fortunately, the males serve as a control for this latter possibility. Because of the suppression of aa caused by most Y chromosomes, male emergence should not be affected by aa genotype if both aa and non-aa females tend to oviposit at the same times. If, however, aa and non-aa females tend to oviposit at different times, then male emergence should be associated with their aa genotype. Finally, because the rots differed greatly in size and in the state of the rot at the time of bagging, all comparisons on emergence time are among genotypes of the same sex within the same rot.
These observations on bagged rot emergence revealed that aa/− females take 0.92 days longer to emerge than +/+ females (significant at the 0.0001 level), whereas there was no effect of male genotype on emergence time (Templeton et al. 1993). Hence, the developmental slowdown associated with the aa super-allele is expressed under natural environmental conditions, and indeed, it is expressed more strongly in the field than in the laboratory. Moreover, the autosomal suppressors of juvenilized cuticle that are found in the natural populations have no effect on this life history trait (Templeton et al. 1993). This is an example of differential epistasis in which an epistatic modifier locus alters the phenotypic expression of another locus with respect to some pleiotropic traits but not all traits. This illustrates that the pattern of pleiotropy itself is a genetically variable trait and can evolve as part of an adaptive gene complex. Given the expression of the delay in the time from egg to adult under natural environmental conditions and genetic backgrounds, Eq. (15.10) implies that such a delay, by itself, decreases fitness. Hence, if the slowdown in egg-to-adult developmental time were the only phenotype associated with aa in this natural population, we would expect aa to be selected against. A slowdown in egg-to-adult developmental time and juvenilized cuticle are not the only phenotypes associated with the aa syndrome under laboratory conditions. The JH/JH-esterase system that is so strongly influenced by aa in the late third instar (Figure 15.6) is reactivated in the adult stage after successful pupation. JH is a primary coordinator of reproductive processes in insects (Wyatt 1997). All tissues that are directly or indirectly involved in reproduction can be targets for JH action. This reproductive role of JH probably preceded its metamorphic role in the course of insect evolution. D. 
mercatorum is part of the virilis/repleta radiation within the genus and subgenus Drosophila, and adult JH titer has been shown to be controlled by JH esterase in D. virilis (Khlebodarova et al. 1996). Thomas (1991) measured JH esterase activity in one-day-old adult D. mercatorum and found that non-aa flies metabolized JH 2.5-fold greater than aa flies, thus paralleling the effects seen in the third larval instar. JH has many functions in adult Drosophila, including influencing the onset of oviposition (Khlebodarova et al. 1996), controlling the transcription of specific proteins called vitellogenins that


are transported into eggs (Dubrovsky et al. 2002), and influencing the amount of egg production (Wilson et al. 1983). Thomas (1991) therefore measured the titer of JH bound in adult female ovariole tissue. As expected from the deficiency of JH esterase in young adults, the titer of bound JH was significantly higher in adult aa females over non-aa females within the first week after eclosion, but by two weeks, there was no significant difference between the genotypes. These differences in adult JH titer and JH esterase activity should result in increased fecundity in young adult females. This expectation is borne out in the laboratory environment as aa flies have increased egg laying output over non-aa females for the first 10 days after eclosion, but with the fecundity differences between the strains diminishing to non-significance by 11–14 days after eclosion (Table 15.5). We next need to see if this life history phenotype of increased early fecundity is also expressed under natural conditions. It is virtually impossible to count the number of eggs a wild female lays in nature on a day-by-day basis as a function of her age. However, it is still possible to test the hypothesis of increased early fecundity in nature (Templeton et al. 1990). We first need to be able to measure the age of wild-caught females. This can be done by examining apodemes, inner thoracic extensions of cuticle that serve as sites for muscle attachment (Johnston and Ellison 1982). Apodemes accumulate daily growth layers after eclosion (Figure 15.10). Hence, it is possible to count

Table 15.5 Mean number of eggs laid by aa and + (non-aa) females in a laboratory environment.

| Time Interval (from Eclosion) | + Females: Eggs/Female | + Females: Sample Size | aa Females: Eggs/Female | aa Females: Sample Size |
| --- | --- | --- | --- | --- |
| Days 2–4 | 71.05 | 19 | 72.90 | 20 |
| Days 5–7 | 174.53 | 19 | 213.75 | 20 |
| Days 8–10 | 174.89 | 19 | 210.84 | 19 |
| Days 11–14 | 244.00 | 19 | 242.88 | 17 |

Source: Modified from Templeton (1993).

Figure 15.10 Nomarski differential interference contrast picture of an apodeme from a female Drosophila mercatorum six days after eclosion showing the eclosion layer (E) and five subsequent growth layers (marked by white or black lines). The bar in the lower right corner is 10 μm.


the number of layers and thereby estimate the age of the fly in days since eclosion. Unfortunately, the bands become too small to accurately count after about 12–15 days, and this aging technique requires that the fly be killed. Nevertheless, it does allow the age structure of a sample of wild-caught flies to be determined. As will be described shortly, there is much spatial and temporal heterogeneity in the environment in this study area, and much of this heterogeneity can have a profound impact on the age structure of the population at a particular site and time. In particular, we often encounter conditions in which a majority of the wild-caught flies are less than a week of age from eclosion (which we will call “young” populations) versus conditions in which a majority of the adult flies are older than a week of age from eclosion (“old” populations). Recall now the bagged rot experiments that were used to examine the relative egg-to-adult developmental time in flies bearing aa versus those that were +/+. Rots were bagged at an early stage and almost always had many females ovipositing on them at the time of bagging. A collection of these females was taken at the time of bagging, and each wild-caught female was then placed in a vial and allowed to lay eggs on an artificial food. The male offspring emerging from these vials bear one X chromosome from their wild-caught mother, so by characterizing the genetic state of these lab-reared sons we can infer the genotype of the mother. Similarly, we scored the genetic state of the X chromosomes of the males emerging from the bagged rots. Given that the abnormal abdomen syndrome is generally suppressed in males, aa should be a neutral allele in its male carriers.
If females with and without aa X chromosomes have equal fecundity, the frequency of aa in the laboratory-reared sons of wild-caught females should be the same as the frequency of aa in the males that emerge from the rots bagged at the same site and time as when the females were captured. However, if fecundity differences exist, Eq. (11.5) predicts that the aa allele frequency should also differ. Moreover, Eq. (11.5) can be solved for the average excess of fecundity as

$$a_{aa} = \frac{\bar{w}\,\Delta p}{p} \tag{15.22}$$

where $a_{aa}$ is the average excess of fecundity for aa-bearing gametes and Δp is the difference in the frequency of aa in mothers (as estimated from their lab-reared sons) versus their male offspring in nature (as estimated from the males emerging from bagged rots). Since we are only interested in relative differences among the genotypes, we can always define the average fitness to be 1. Hence, we can estimate the average excess of relative fecundity associated with aa as

$$a_{aa} = \frac{\Delta p}{p} = \frac{p_{\text{males from rots}} - p_{\text{lab-reared sons}}}{p_{\text{lab-reared sons}}} \tag{15.23}$$
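As a minimal sketch of how Eq. (15.23) is applied, the following Python snippet recomputes the average excess from the aa frequencies reported in Table 15.6. The names `SAMPLES`, `average_excess`, and `excess_by_sample` are illustrative and not from the original study. Because the tabulated frequencies are rounded to two decimals, the recomputed values agree only approximately with the published column (for example, about +0.25 rather than +0.23 for site F in 1984).

```python
# Frequencies of aa copied from Table 15.6:
# (site, year, age structure, freq. in lab-reared sons, freq. in males from rots)
SAMPLES = [
    ("B", 1982, "old", 0.32, 0.22),
    ("IV", 1982, "old", 0.32, 0.31),
    ("C", 1983, "young", 0.38, 0.67),
    ("F", 1984, "young", 0.56, 0.70),
    ("IV", 1984, "old", 0.36, 0.31),
    ("B", 1987, "young", 0.30, 0.53),
    ("B", 1989, "old", 0.31, 0.36),
]


def average_excess(p_sons, p_rots):
    """Eq. (15.23) with mean fitness set to 1: (p_rots - p_sons) / p_sons."""
    return (p_rots - p_sons) / p_sons


def excess_by_sample(samples=SAMPLES):
    """Return {(site, year, age structure): average excess} for every sample."""
    return {
        (site, year, age): average_excess(p_sons, p_rots)
        for site, year, age, p_sons, p_rots in samples
    }


if __name__ == "__main__":
    for key, value in excess_by_sample().items():
        print(key, f"{value:+.2f}")
```

Every "young" sample yields a clearly positive average excess, whereas the "old" samples scatter around zero, which is the pattern the text describes.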

Finally, we relate the estimates of the average excess of fecundity to the age structure found in the sample of ovipositing females. Such data are presented in Table 15.6. As can be seen, when the females are primarily a week of adult age or less, there is a strong, positive, and statistically significant average excess of fecundity as estimated by Eq. (15.23), with an average value of 0.59. In contrast, when the females are primarily older than a week of adult age, the average excess of fecundity is slightly negative but is not significantly different from zero in any “old” sample (Templeton et al. 1993). Hence, the field data (Table 15.6) are concordant with the laboratory data (Table 15.5) and indicate that flies bearing the aa allele have a fecundity advantage during early adulthood that is lost as they age. The laboratory data indicate yet another life history phenotype: aa females have significantly decreased adult survivorship relative to non-aa flies (Figure 15.11), but aa/Y and +/Y males show no difference in survivorship (Templeton et al. 1993). The molecular and developmental mechanisms for this difference in adult longevity are unknown, but the work of Brandt et al. (2005) indicates


Table 15.6 Estimates of the average excess of female fecundity associated with the aa super-allele in natural populations with differing age structures.

| Cactus Site | Year | Age Structure | Freq. aa in Sons | Freq. aa in Males from Rots | $a_{aa} = \Delta p/p$ |
| --- | --- | --- | --- | --- | --- |
| B | 1982 | Old | 0.32 | 0.22 | −0.32 |
| IV | 1982 | Old | 0.32 | 0.31 | −0.04 |
| C | 1983 | Young | 0.38 | 0.67 | +0.78 |
| F | 1984 | Young | 0.56 | 0.70 | +0.23 |
| IV | 1984 | Old | 0.36 | 0.31 | −0.15 |
| B | 1987 | Young | 0.30 | 0.53 | +0.77 |
| B | 1989 | Old | 0.31 | 0.36 | +0.15 |
| Pooled for “Old” Age Structures | | | | | −0.09 |
| Pooled for “Young” Age Structures | | | | | +0.59 |

Note: “Young” refers to an age structure in which a majority of the flies are less than one week of age since eclosion, and “old” is an age structure in which a majority of the flies are older than one week of age since eclosion.

[Figure 15.11: plot with y-axis, Probability of Survival (0.0–1.0); x-axis, Adult Age in Weeks From Eclosion (0–15); one curve each for aa/aa females and +/+ females.]

Figure 15.11 Probability of survival of aa/aa females versus +/+ females under laboratory conditions. Source: Based on Templeton (1993).

that the vitellogenin gene family plays a central role in regulating lifespan in a variety of organisms, including both humans and insects. As mentioned above, the aa system does influence the expression of these genes through its impact on JH titer, and this in turn may influence adult longevity through a traditional trade-off of increased early fecundity (Tables 15.5 and 15.6) versus viability at older ages.


As with the other traits associated with the aa syndrome, we would like to measure these longevity effects in the natural population. Unfortunately, it was impossible to simultaneously score the age and genotype of wild-caught flies with the techniques available in the 1980s, so we could not directly test whether or not this life history phenotype was also expressed in nature. However, as we will soon see, in some cases, the natural environment is very harsh for the adult Drosophila, with few living past a week of adult age. This would imply that the longevity differences shown in Figure 15.11 are not important in nature when the natural conditions are harsh. The above considerations about the phenotypes associated with the abnormal abdomen syndrome indicate an abundance of potential targets of selection at the level of the individual (female egg-to-adult developmental time, female fecundity, and female survivorship) in addition to the ones previously discussed below the level of the individual (e.g. the ability of the aa R1 insert to spread to other 28S units). We will now turn our attention to whether or not selection is indeed occurring on this genetic syndrome, with our initial focus being upon the target at the individual level by regarding the unit of selection as the supergene, that is, the aa versus + super-alleles. We will then look at the R1 insert as the unit of selection, with possible targets at both the individual and genomic levels.

Natural Selection on the aa Supergene in a Spatially and Temporally Heterogeneous Environment Natural selection always occurs in the context of an organism interacting with its environment, so we must first examine the environment experienced by the populations of D. mercatorum that live in the Kohala Mountains of Hawaii. The environment in the Kohalas is highly heterogeneous, both spatially and temporally. As already mentioned, there is normally a strong temperature gradient as one goes down the mountainside (Figure 15.8) and a strong humidity/rainfall gradient, with the mean annual precipitation averaging 160 mm near the coast and rising to more than 3000 mm near the summit (Chadwick et al. 2003). The rainfall/humidity gradient has a large impact on the growth form of the cactus, Opuntia megacantha, which in turn defines other critical aspects of the fly’s environment. The rotting cladodes of this cactus are the sole larval food resource for D. mercatorum in this area, and adult flies spend most of their time within cactus patches, which offer protection against desiccation and the wind. As shown in Figure 15.9, the cacti that grow in the upper, more humid parts of the gradient are extremely large, with individual rotting cladodes able to support hundreds of larvae. As one goes down the mountainside, the cacti become much smaller (Figure 15.12). As a result, the population sizes per cactus (estimated by mark/recapture) decline as one goes down the mountainside by up to three orders of magnitude, and the flies in the drier parts of the humidity gradient have less protection against desiccation in smaller cacti. The ecological impact of this environmental gradient is also evident from the relative proportions of the species of Drosophila collected in cacti along this transect (Figure 15.13). The two species that tend to dominate these collections are the repleta-group species D. mercatorum and D. 
hydei, and mark/recapture experiments indicate that the relative proportions of these two species to one another in the collections accurately reflect their relative proportions in abundance in the cactus patches. There is a dramatic shift in species composition across this transect, with D. hydei dominating at the top but being increasingly replaced by D. mercatorum as one goes down the mountainside. The environmental changes on this gradient also strongly affect adult survivorship and age structure within D. mercatorum. Mark/recapture experiments on natural populations (Johnston and Templeton 1982) reveal that adult survivorship is strongly affected by humidity, with the per


Figure 15.12 A picture of a cactus patch at site IV. The Drosophila collector is Dr. Rob DeSalle.

day survivorship being only 0.81 under the driest conditions where cacti are small (normally, sites F and IV in Figure 15.7) and increasing to 0.97 as one goes up the mountainside to site B. These survivorship estimates result in extremely different predicted age structures across this transect. Using the observed survivorships as estimates of the quantity (1−d) in Eq. (15.12), the predicted age-specific survivorship curves under these two different environments are shown in Figure 15.14. As can be seen, we expect few adult flies to live longer than one week after eclosion under the dry conditions, whereas under the wet conditions found at the top of the transect, we expect most adult flies to survive longer than two weeks from eclosion. These expectations based upon field survivorship data are concordant with the observed field age structures estimated by apodeme band counting, as shown in Figure 15.15. Although the physical distances are small to a human (Figure 15.7), there is a dramatic shift in the age structure of the population over this humidity gradient, with most flies too old to age by apodeme ridge counting at the humid top of the transect (ages greater than 15 days from eclosion) to almost all flies being less than a week of age at the lower dry elevations.


[Figure 15.13: four bar charts, one each for Site A, Site B, Site C, and Site IV; y-axis, Proportion of Collection; x-axis, Drosophila species, with labels recovered as mercatorum, simulans, hydei, immigrans, and buskii.]

Figure 15.13 Drosophila species composition of collections from cactus patches at sites A, B, C, and IV in 1980.


[Figure 15.14: plot with y-axis, Probability of Surviving (0–1); x-axis, Adult Age in Days From Eclosion (0–15); one curve each for Dry and Wet conditions.]

Figure 15.14 The predicted adult survivorship curves based upon the observed daily survivorship rates of 0.81 at the dry sites (F and IV) and of 0.97 at a wet site (B).
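The predicted curves can be sketched numerically under the assumption that adult survival is constant per day, so that the probability of surviving to adult age a is (1 − d)^a. This geometric form is an assumption about how the daily rates enter Eq. (15.12), and the function name `survivorship` is illustrative only.

```python
def survivorship(daily_survival, max_age_days):
    """Probability of surviving to each adult age 0..max_age_days (in days),
    assuming a constant daily survival probability."""
    return [daily_survival ** age for age in range(max_age_days + 1)]


# Daily survivorships reported from the mark/recapture experiments.
dry = survivorship(0.81, 15)  # sites F and IV
wet = survivorship(0.97, 15)  # site B

if __name__ == "__main__":
    print(f"dry, alive at day 7:  {dry[7]:.2f}")
    print(f"wet, alive at day 14: {wet[14]:.2f}")
```

Under this assumption, roughly 23% of flies at the dry sites are still alive one week after eclosion, whereas about 65% of flies at the wet site are still alive after two weeks, in line with the expectations stated in the text.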

This area of Hawaii is notoriously windy, as noted in a traditional Hawaiian chant about this locale: Hole waimea ika ihe a ka makani (“Tousled is Waimea by spear sharp thrusts of wind”). Under normal conditions, the wind blows down the mountainside shown in Figure 15.7 between 15 and 35 km/hour, occasionally gusting higher. The flies live inside patches of the prickly pear cactus Opuntia megacantha (Figure 15.9), and these patches block the wind. Both laboratory and field observations reveal that the flies will not disperse between these cactus patches unless the wind speed drops and remains below 10 km/hr (Johnston and Templeton 1982). Days with such low winds are relatively rare in this area, occurring on average only about 4–5 days per month. When these relatively still days occur, 31% of the adult D. mercatorum population disperse, and the dispersing adults move an average of 43 m/day, roughly the average distance between neighboring large cactus patches. This nearest neighbor pattern of dispersal could generate a pattern of isolation by distance. The amount of gene flow associated with this dispersal is also a function of how long the flies live as adults, for it is only in the adult stage that flies can disperse. Adult age structure data from the field indicate that most flies will have only one or two days on average in their lifetime in which dispersal is possible. Putting all these data together yields ℓ = 28.8 m (Eq. 14.13). In terms of the two-dimensional neighborhood model in Eq. (6.40), this means that the radius of the neighborhood area in which parents can be treated as if drawn at random is 2σ = 2ℓ = 57.6 m. Hence, there is isolation by distance over the 1 km transect. This restricted dispersal implies that even the spatial heterogeneity on the kilometer scale of the transect shown in Figure 15.7 is experienced by the flies as coarse-grained (Chapter 14). 
However, the densities are sufficiently high that $N_{ev}m$ = 8.5 (Chapter 6; recall that $f_{st}$ emerges from the balance of gene flow to drift and not just gene flow alone, as shown by Eq. 6.39). With such a high $N_{ev}m$, there is no expected nor observed significant subdivision over the transect for most nuclear loci (DeSalle et al. 1987). However, recall from Levene’s model (Chapter 14) that even higher levels of gene flow than this do not preclude local adaptation. We now have sufficient information to estimate the fitnesses associated with the aa genotypes. Let x be the age from hatching, e the age at eclosion, a = x − e the adult age from eclosion, n the maximal adult age, and $\ell_a = \ell_{a+e}/\ell_e$ the adult survivorship to age a given eclosion. With these


[Figure 15.15: four histograms, one each for Site A, Site B, Site C, and Site IV; y-axis, Number of Flies; x-axis, Adult Age in Days from Eclosion, binned as 0–3, 4–6, 7–9, 10–12, 13–15, and >15.]

Figure 15.15 Ages of field-captured D. mercatorum from sites A, B, C, and IV in 1980.

definitions and noting that there is no reproduction until after eclosion, Euler’s equation (Eq. 15.9) becomes:

$$1 = \sum_{x=0}^{n} e^{-rx}\,\ell_x m_x b_x = \ell_e \sum_{a=0}^{n} e^{-r(a+e)}\,\ell_a m_a b_a \tag{15.24}$$


and the approximation to the Malthusian parameter (Eq. 15.10) becomes:

$$r \approx \frac{\displaystyle\sum_{a=0}^{n} \ell_a m_a b_a - \frac{1}{\ell_e}}{\displaystyle\sum_{a=0}^{n} \ell_a m_a b_a\,(a+e)} \tag{15.25}$$

with approximation (15.25) holding whenever r is close to zero in magnitude (Templeton et al. 1990). Equation (15.25) shows that the developmental slowdown associated with aa (an increase in the age of eclosion e) tends to reduce fitness, whereas the increased early fecundity associated with aa (Tables 15.5 and 15.6) tends to increase fitness. Hence, aa is characterized by antagonistic pleiotropy. Exactly how these two opposing fitness effects are weighted into overall fitness depends critically upon adult survivorship. To evaluate Eq. (15.25), we can use the laboratory data showing that the time to eclosion for female flies bearing the aa super-allele, e(aa/−), was 12.2 days (1.74 weeks), whereas e(+/+) = 10.5 days or 1.50 weeks for +/+ females. These laboratory-based figures are concordant with the developmental slowdown observed under field conditions, as noted above. We have no information on larval viability in the field ($\ell_e$), so we will assume that there are no genotypic differences in this trait. Also, to ensure that r is close to zero for both aa/− and +/+ genotypes, we also assume that

$$\frac{1}{\ell_e} = \frac{R_a(aa/-) + R_a(+/+)}{2} \tag{15.26}$$

where

$$R_a(j) = \sum_{a=0}^{n} \ell_a(j)\, m_a(j)\, b_a(j), \qquad j = aa/-,\ +/+$$
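The way Eqs. (15.25) and (15.26) combine can be illustrated with a rough Python sketch for the dry sites. It is not the authors' calculation and is not expected to reproduce their published estimates of r; it only shows how the sign of r follows from the inputs. Several simplifying assumptions are made here: the fecundity in each interval of Table 15.5 is assigned to a single midpoint age in days, $m_a$ = 1, adult survivorship is geometric with the dry-site daily survivorship of 0.81 for both genotypes, and time is measured in days. All function and variable names are illustrative.

```python
# Eggs per female in each interval after eclosion (Table 15.5).
EGGS = {
    "+/+": [71.05, 174.53, 174.89, 244.00],
    "aa/-": [72.90, 213.75, 210.84, 242.88],
}
# ASSUMPTION: each interval (days 2-4, 5-7, 8-10, 11-14) is represented
# by its midpoint adult age in days.
MID_AGES = [3.0, 6.0, 9.0, 12.5]
# Age at eclosion in days for females of each genotype (laboratory values).
ECLOSION = {"+/+": 10.5, "aa/-": 12.2}
# ASSUMPTION: the same dry-site daily adult survivorship for both genotypes.
DRY_DAILY_SURVIVAL = 0.81


def net_reproduction(eggs, ages, daily_survival):
    """R_a(j): sum over ages of l_a * m_a * b_a, with m_a = 1."""
    return sum(daily_survival ** a * b for a, b in zip(ages, eggs))


def malthusian_r(eggs, ages, daily_survival, e, inv_larval_viability):
    """Approximation (15.25): (sum l*m*b - 1/l_e) / sum l*m*b*(a + e)."""
    numerator = net_reproduction(eggs, ages, daily_survival) - inv_larval_viability
    denominator = sum(
        daily_survival ** a * b * (a + e) for a, b in zip(ages, eggs)
    )
    return numerator / denominator


def dry_site_r():
    """Per-day r for each genotype, with 1/l_e fixed by Eq. (15.26)."""
    big_r = {
        g: net_reproduction(EGGS[g], MID_AGES, DRY_DAILY_SURVIVAL) for g in EGGS
    }
    inv_le = sum(big_r.values()) / 2  # Eq. (15.26): average of the two R_a(j)
    return {
        g: malthusian_r(EGGS[g], MID_AGES, DRY_DAILY_SURVIVAL, ECLOSION[g], inv_le)
        for g in EGGS
    }


if __name__ == "__main__":
    print(dry_site_r())
```

Because Eq. (15.26) centers the two numerators on zero, the genotype with the larger survivorship-weighted fecundity gets a positive r and the other a negative r. With high early mortality, the early fecundity advantage of aa/− dominates in this sketch.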

Given that virtually all wild-caught females that are not newly eclosed (observable from their cuticle) are inseminated, we set ma(j) = 1 for both females with genotype j = aa/− and j = +/+. We use the laboratory-based fecundities (Table 15.5) to estimate the ba’s for the aa/− and +/+ genotypes as these values are also concordant with field data, as noted above. The only component remaining that is needed to evaluate Eq. (15.25) is adult survivorship. Consider first the fitnesses of the genotypes under the “wet” conditions found at the upper sites. These are optimal conditions for adult survivorship, as indicated by the old age distribution (Figure 15.15) and high daily survivorships (Figure 15.14). Accordingly, we estimate the genotypic-specific survivorships by the laboratory survivorships shown in Figure 15.11, curves that are concordant with the field data. Inserting all these numbers into Eq. (15.25) yields r(aa/−) = −0.012 and r(+/+) = +0.010. Under these wet conditions with an old age structure, the +/+ genotype has a strong fitness advantage. This is not surprising. The only fitness advantage associated with the aa/− genotype is increased early fecundity, but with an old age structure, early fecundity is given less weight. Moreover, there is a survivorship disadvantage for older aa/− flies, and this disadvantage is given increased weight by an old age structure. Now, consider the dry conditions found at the lower sites that result in a low daily survivorship probability (the “dry” curve in Figure 15.14). Under these conditions, few adult flies live longer than a week, and any difference in survivorship in old adults is probably irrelevant. In this case, desiccation-driven mortality dominates adult survivorship for all genotypes, so we now use the “dry” survivorship curve in Figure 15.14 in Eq. (15.25), with all other numbers being the same as before. This survivorship curve yields r(aa/−) = +0.012 and r(+/+) = −0.014. 
Hence, there is a complete reversal of the fitnesses in sign in going from the top to the bottom of the transect. Note that this


reversal of fitnesses in going from humid to dry conditions fits well into the antagonistic pleiotropy model of senescence. External mortality is greatly increased under dry conditions, and this leads to natural selection favoring a genotype with high fecundity at young ages but low innate viability at older ages. The adaptive response to natural selection is influenced only in part by fitness differences; we must also consider the interactions with other evolutionary forces (Chapter 12) and whether this spatial heterogeneity is a gradient or ecotone (Chapter 14). We have already shown that dispersal and gene flow are restricted in this area such that ℓ = 28.8 m (Eq. 14.13). The significant differences in the fitnesses of aa homozygotes yield s = 0.0245 across the transect extremes (Templeton et al. 1990). Hence, the characteristic length for the aa locus over this transect is, from Eq. (14.14), $\ell_c = \ell/\sqrt{s} = 28.8/\sqrt{0.0245} = 184$ m, which is much less than the Δ = 1000 m length of the transect. With respect to the aa locus, these populations of D. mercatorum experience this environmental transition not as an ecotone, but as a gradient in which local adaptation should result in a genetic cline within the transitional zone. Indeed, there is a significant genetic cline for aa across this humid/dry environmental transition (Figure 15.16). However, there was no significant cline for the allozymes, either considered together or individually. One of the isozyme loci is another X-linked locus, glucose-6-phosphate dehydrogenase (G-6-PD), which is located at the other end of the X from aa. The allele frequency changes for the G-6-PD S allele are also shown in Figure 15.16. In this case, there is no cline and no statistically significant differentiation across this transect, illustrating how gradients are locus specific, as expected from Eq. (14.14) because of its dependence on s.
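The characteristic-length arithmetic above can be checked in a few lines. This is a minimal sketch that assumes Eq. (14.14) has the form ℓc = ℓ/√s, which is the form consistent with the numbers quoted in the text. The function names and the crude rule of comparing ℓc directly with Δ are illustrative assumptions, and the second value of s is a hypothetical weakly selected locus, not a value from the study.

```python
import math

def characteristic_length(ell, s):
    """Characteristic length of a cline, assuming l_c = ell / sqrt(s)."""
    if s <= 0:
        raise ValueError("s must be positive")
    return ell / math.sqrt(s)

def is_gradient(ell, s, delta):
    """Crude illustrative rule: treat the transition as a gradient
    when the characteristic length is shorter than the transition width."""
    return characteristic_length(ell, s) < delta

ell = 28.8      # m, dispersal parameter quoted from Eq. (14.13)
delta = 1000.0  # m, length of the transect

s_aa = 0.0245   # selection on aa across the transect extremes (from the text)
s_weak = 0.0005 # hypothetical locus under much weaker selection

l_c_aa = characteristic_length(ell, s_aa)
l_c_weak = characteristic_length(ell, s_weak)

print(f"aa locus:   l_c = {l_c_aa:.0f} m, gradient: {is_gradient(ell, s_aa, delta)}")
print(f"weak locus: l_c = {l_c_weak:.0f} m, gradient: {is_gradient(ell, s_weak, delta)}")
```

Because ℓc depends on s, the same transect can be short relative to ℓc for one locus and long for another, which is the sense in which gradients are locus specific.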
The environment near Kamuela is characterized not only by extreme spatial heterogeneity but also temporal heterogeneity that is coarse grained to these short-lived flies. For example, 1980 was a “normal” year in the sense that the overall weather that year deviated little from the long-term averages. The weather was dominated by the trade winds, and the spatial heterogeneity was

[Figure 15.16 plot: allele frequency (y-axis, 0.2 to 0.6) of aa and G-6-PD S versus site on transect (x-axis: A&B, C&D, E&F, IV).]

Figure 15.16 The frequencies of the aa allele at the X-linked abnormal abdomen locus and the S allele at the X-linked G-6-PD locus in populations of Drosophila mercatorum over a transect on the leeward side of Kohala in Hawaii. The position of the sites is indicated in Figure 15.7. Vertical solid lines indicate ± one standard deviation above and below the estimated frequency of the S allele, and vertical dotted lines indicate ± one standard deviation above and below the estimated frequency of the aa allele.


characterized by the strong humidity/rainfall gradient described above. However, 1981 was the third driest year recorded up to that time on the Island of Hawaii, and that year, all sites on the transect experienced low humidity and rainfall. This drought had barely ended when, on April 4, 1982, Mexico’s El Chichón volcano had an explosive eruption that ejected the largest plume of dust into the atmosphere since 1912. The plume-cloud covered more than a quarter of the earth’s surface and blocked out as much as 10% of the sun’s total radiation between the equator and 30° north, including Hawaii. This induced one of the wettest springs and summers in Hawaii’s recorded history up to that time. Far from going back to normal after the summer of 1982, 1982–1983 turned out to be an El Niño year, a recurrent fluctuation of currents and temperatures in the eastern equatorial Pacific that induces major dislocations of the rainfall regimes in the tropics (Cane 1983). This El Niño event induced another dry year in the Kohalas. It was not until 1984 that the weather returned to its normal, long-term average pattern. These weather fluctuations occurred on a time scale of many generations from the perspective of a fly, so there was time to adaptively respond to these altered weather regimes (Templeton et al. 1987b; Templeton and Johnston 1988). This is shown in Figure 15.17, which graphs the frequency of the aa super-allele over the mountainside transect for the normal years (1980, 1984, and 1985), the two drought years (1981 and 1983), and for the wet year of 1982. During the drought years, there was no significant change in aa allele frequency over the mountainside, in contrast to the normal years. All sites during the drought years had a uniformly high frequency of aa that was not significantly different from the frequency of aa at the lower dry sites during normal years.
There was also no significant cline during the wet year of 1982, but now the frequency of aa was uniformly low and not significantly different from the frequency of aa at the upper wet sites during normal years. During the normal years, the cline shown in Figure 15.16 was re-established. Hence, these populations of D. mercatorum are adaptively tracking these coarse-grained weather fluctuations in a manner consistent with the estimated fitnesses under wet and dry conditions.

[Figure 15.17 plot: frequency of the aa allele (y-axis, 0.1 to 0.45) versus sites on transect (x-axis: A&B, C&D, E&F), with separate curves for Normal, Drought, and Wet years.]

Figure 15.17 The frequency of aa on the mountainside transect shown in Figure 15.7 for the years 1980–1985. The years 1980, 1984, and 1985 are pooled together as they all experienced normal (near long-term averages) weather, and the years 1981 and 1983 are pooled together because they were both drought years. The year 1982 was an abnormally wet year due to the explosive eruption of a volcano in Mexico.


Although the weather was normal on the mountainside in 1984 and 1985, it was not at site IV. As can be seen from Figure 15.7, sites A through F are located on the slope of the Kohalas, whereas IV is located on the relatively flat saddle between Kohala and the volcano Mauna Kea. Winds normally blow down the mountainside from A to F, but winds also blow through the saddle from the wet windward coast to the dry leeward coast. In 1984 and 1985, the wind blowing through the saddle was stronger than normal, causing site IV to be more humid and to have more rainfall than normal. Consistent with the pattern shown in Figure 15.17, site IV had a significantly higher frequency of aa during the dry years of 1980, 1981, and 1983 ( paa = 0.47) than the wet years of 1982, 1984, and 1985 ( paa = 0.33). Hence, there was adaptive tracking of coarse-grained spatial variation induced by coarse-grained temporal variation. Most of the collections between 1980 and 1985 had been done in the spring or summer months, but in 1986 and 1987, the collections were made in December. The native Hawaiians recognized only two seasons. Kau is the fruitful season, when the weather is warmer and when the trade winds are most reliable. The collections up to 1986 were made during Kau. The other season is Hoo-ilo, when the weather is cooler and when the trade winds are most often interrupted by other winds. The 1986 and 1987 collections were made in Hoo-ilo, and hence had the potential of revealing the impact of coarse-grained (to a fly) seasonal variation. Kona wind is one of the alternative winds that is more likely in Hoo-ilo. Normally, the trade winds blow from the north-east to the south-west, but during Kona weather, the winds tend to blow from the west to the east. Such episodes of Kona weather often last from one to several days, and during that time, there is often much rain over the entire leeward slopes of the Kohalas, where all the collecting sites are located. 
There are typically one to two Kona storms per year, sometimes more. Both the 1986 and the 1987 collections were made shortly after major Kona storms. From an ecological perspective, the wet Kona weather in 1986 and 1987 created an extremely different environment for the Drosophila than that found in the wet year of 1982. First, as can be seen from Figure 15.13, for a normal year, the two most abundant species of Drosophila overall are D. mercatorum and D. hydei, the only two repleta group Drosophila (the repleta group as a whole is more adapted to using cacti than other Drosophila). Of these two repleta group species, D. hydei dominates at the top of the transect where conditions are most humid and drops out as one goes down the mountainside, being found only occasionally at the lower sites, whereas D. mercatorum is relatively more abundant at the drier sites. During 1982 when conditions were wet over the entire transect for several months, D. hydei was able to expand down the mountainside. For example, in 1980, D. mercatorum constituted 58% of the repleta group flies captured at site B and 99% at site IV. But during the prolonged wet year of 1982, D. mercatorum was only 7% of the repleta group flies at site B and 27% at site IV. 1982 was the only year in which D. hydei dominated the collections at all sites. However, during and immediately after the wet episodes of Kona weather, D. mercatorum constituted 99% of the repleta group collection at site B and 100% at site IV, and the respective figures for the Kona year 1987 were 99.6% at B and 100% at IV. Consequently, from the perspective of the Drosophila community, the prolonged wet year of 1982 is not at all comparable to the wet seasonal Kona episodes of 1986 or 1987. The reason for these contrasting ecological responses to the two different types of wet weather lies in the fact that the Kona weather is only a wet episode (impacting only one or two generations of flies) in an otherwise normal weather pattern. 
Hence, this is an example of a sporadic, coarse-grained temporal change (Chapter 14). As noted earlier, the sole larval food resource in this area for D. mercatorum (and D. hydei as well) is rotting cladodes of the prickly pear cactus. The extremely wet conditions associated with a Kona storm create a superabundance of larval food resource, both in the number of cladodes simultaneously rotting and in the size of each individual rot. As a result,


there is the potential for a population explosion. For example, the number of adult D. mercatorum living in cactus B-1 (the one shown in Figure 15.9) was estimated through mark/recapture experiments to be 971 ± 76 in 1980 and 420 ± 58 in 1984, the only two normal years in which the size was estimated for the flies living in this cactus patch. In the drought year of 1981, the B-1 size was 180 ± 73, and it was 543 ± 121 in the El Niño year of 1983. The B-1 population size was 61 ± 15 in the wet year of 1982, reflecting the dominance of D. hydei that year. In the generation after the 1988 Kona storm, the estimated size was 8133 ± 1120 in B-1 (Templeton et al. 1989), by far the largest size ever estimated for B-1 during the entire course of the study. Indeed, the most divergent population sizes in B-1 are between the two wet conditions: the prolonged wet year of 1982 versus the Kona episode in 1988. Hence, these two “wet” years are quite distinct ecologically. Much of this distinction between the ecological impact of prolonged wet versus episodic Kona wet conditions appears to be due to interactions between the two repleta group species, D. mercatorum and D. hydei. D. mercatorum, with its shorter generation time and higher fecundities, can better exploit the temporary abundance of larval food sources associated with a Kona episode than D. hydei. Since the wet conditions disappear shortly after the Kona storm, and the larval food resources decline back to normal levels, D. hydei never has the chance to expand under these wet conditions as it did under the prolonged wet weather of 1982. This pattern is clearly shown by monitoring the flies that emerge from the bagged rots. Figure 15.18a shows the fly species that emerged from bagged rots from site B cacti during the collecting periods of normal years. As can be seen, D. mercatorum is the most common fly to emerge from the rots, with D. hydei being a close second. However, during the wet year of 1982, D. 
hydei strongly dominates the rots (Figure 15.18b), and few D. mercatorum emerged, a fact consistent with the extremely low population size of D. mercatorum in B-1 that year (61 adult flies). The wet weather associated with the Kona storm creates exactly the opposite situation: now D. mercatorum dominates and D. hydei is almost eliminated from the rots (Figure 15.18c). Not only are there more rots available after the Kona storm, but the rots are almost exclusively used by D. mercatorum. This explains the change in the average number of D. mercatorum to emerge from a rot at site B from 14.3 flies/rot during the normal years in the Kau season, down to 0.7 flies/rot during the wet year of 1982, and up to 283 flies/rot after the Kona storm of 1987. Hence, the Kona weather episode was characterized by a temporary but very dramatic population explosion of D. mercatorum. Such a temporary abundance of larval food resources and its attendant population explosion result in a very young age structure, which in turn would be expected to favor the aa/+ genotypes with their higher early fecundity through Eq. (15.25). This temporary pulse of wet, Kona conditions also interacts with the physical structure and location of the cacti in influencing the environment of the Drosophila. As noted earlier, the cacti grow to huge sizes in the upper part of the transect (Figure 15.9), but are small in the lower sections (Figure 15.12). The large cacti accumulate many fallen cladodes, and many of these begin to rot simultaneously as a result of the Kona storm. Moreover, each rotting cladode can support hundreds of larvae because of its large size, as noted above. In contrast, the small cacti at the lower sites do not accumulate many fallen pads, and each rot is smaller. Hence, the potential for population growth is not as great at the sites with small cacti (sites F and IV) as compared to the sites with large cacti (sites A–D). A further complicating factor is shown in Figure 15.18d–f.
These figures show the emergence of adult flies from rots bagged at sites F and IV. During the normal years, D. mercatorum strongly dominates the use of the rots at these normally dry lower sites so that even though the rots are smaller, the number of D. mercatorum that emerge from these small rots is actually larger on the average (34.6 flies/rot) than the average number emerging from site B rots (14.3 flies/rot) during the same time periods. However, during the wet year of 1982, D. hydei swept through all sites and became the most common Drosophila to emerge even from the rots at sites


[Figure 15.18 plot: proportion of flies of a given species emerging from rot (y-axis, 0 to 1) by Drosophila species. Panels for Site B rots: (a) Normal Years, (b) Wet Year of 1982, (c) Kona Weather. Panels for Sites F and IV rots: (d) Normal Years, (e) Wet Year of 1982, (f) Kona Weather.]

Figure 15.18 The relative emergence of adult flies of various species of Drosophila from rots bagged at sites B versus F and IV as a function of the weather conditions at the time of bagging.

F and IV, and other, non-repleta group Drosophila also became common. As a consequence, only an average of 2.0 D. mercatorum/rot emerged at sites F and IV during 1982. After the Kona weather, there was an explosive exploitation of the rots by the non-repleta species D. simulans, and very few D. mercatorum emerged from the rots at the lower sites. Accordingly, the average emergence of D. mercatorum at sites IV and F was 6.8 flies/rot, well below the normal year average of 34.6 flies/rot. During the normal year of 1984, the population size of cactus IV-1 (shown in Figure 15.12) was estimated to be 51 ± 14.7, but, during the Kona weather collections, it was impossible to estimate the size because so few D. mercatorum could be collected at this site. Hence, in great contrast to the upper sites with the large cacti that had explosive population growth and a super-abundance of


larval food resources for D. mercatorum, the lower sites experienced a reduction in population size and fewer larval food resources for D. mercatorum. The humid conditions associated with the Kona weather would reduce the adult death rate from desiccation even at sites F and IV. These ecological conditions would lead to an older age structure. So immediately after the Kona weather, despite the wet, humid conditions, we expect a reversal of the cline in aa frequency observed in the normal years; now, aa/+ should have higher fitness at the upper sites that have a young age-structure due to explosive population growth and lower fitness at the lower sites due to an older adult age-structure associated with high humidity. These altered age-structures should reverse the cline observed under normal weather conditions. Indeed, this is just what happened, as shown in Figure 15.19. The reversal of the cline after the Kona weather relative to the normal weather cline clearly shows that aa is not an adaptation to low humidity conditions per se, but rather an adaptation to a young age structure of adults. The humid conditions that exist after a Kona storm should reduce external mortality due to desiccation (Figure 15.14), yet these benign conditions for adult survivorship favor the aa allele when accompanied by a population explosion, contrary to the predictions of the mutational accumulation and antagonistic pleiotropy models but consistent with the young age structure model (Wensink et al. 2017). The aa syndrome is favored whenever the ecological conditions are such that the adult age structure is young, and it is selected against when the adult age structure is old, regardless of the humidity conditions.

Natural Selection on the Components of the aa Supergene

As explained earlier, the aa supergene consists of two molecular components: the presence of R1 inserts in a third or more of the 28S ribosomal genes on the X chromosome and the presence of the X-linked uraa allele that codes for the failure of selective under-replication of inserted rDNA repeats in the fat body polytene tissue. Because of the linkage disequilibrium between the rDNA and the ur locus described earlier, virtually all X chromosomes with the uraa allele also have a third or more of their 28S genes bearing the R1 insert. Hence, the testcross procedure used to score X chromosomes by their phenotypic effect on adult abdominal cuticle effectively only scores for the presence or absence of the uraa on the X, with variation in the insert proportion only affecting the degree of

[Figure 15.19 plot: frequency of the aa allele (y-axis, 0.2 to 0.5) versus sites (x-axis: A&B, C&D, F & IV).]

Figure 15.19 The cline in the frequency of aa during the years when the collection was made shortly after Kona storms.


morphological expression in the testcross progeny (Templeton et al. 1989). To examine the effects of the insert proportion separately from uraa, Templeton et al. (1989) also scored the proportion of inserted 28S genes using the Southern blot procedure from wild-caught males from the 1986 collection who were also testcrossed. In this manner, a sample of X chromosomes was obtained for which information existed on both uraa and insert proportion. As mentioned in the previous section, 1986 was a Kona year, and the aa supergene cline was reversed from the normal pattern (Figure 15.19). Confining inference just to the 1986 sample, Figure 15.20 shows a statistically significant cline in the frequency of uraa, with this allele declining in frequency as we went from the upper sites with the large cacti to the lower sites with the small cacti. There was also a statistically significant cline in the average proportion of inserted 28S genes that paralleled the uraa cline. This cline in insert proportions is expected just from the linkage disequilibrium that exists in this supergene. In order to see if there were any patterns that could not be explained by this disequilibrium, Templeton et al. (1989) separated the X chromosomes into those bearing the uraa allele and those bearing the ur+ allele and tested each subset of X chromosomes for significant site variation in the insert proportion. A dramatic contrast was revealed in the geographical pattern of insert proportion between uraa and ur+ X chromosomes. There was no significant geographical heterogeneity in insert proportion in the ur+ X chromosomes, whereas there was an even sharper and more significant cline in the insert proportions in the uraa X chromosomes (Figure 15.20). This cline in insert proportions is within uraa X chromosomes and therefore cannot be explained simply by the disequilibrium

[Figure 15.20 plot: frequency of uraa and average insert proportion (y-axis, 0.3 to 0.65) versus sites (x-axis: A&B, C&D, F & IV), with separate curves for the frequency of uraa and the insert proportion on uraa X chromosomes.]

Figure 15.20 The frequency of the uraa allele at the under-replication locus and the mean proportion of inserted 28S in the 1986 sample of D. mercatorum X chromosomes plotted against collecting site location.


between the two major genetic components of the aa syndrome. Instead, this pattern suggests that natural selection is operating upon the insert proportion at the individual level as a function of the coarse-grained spatial heterogeneity across the transect but only in the context of an X chromosome that codes for uniform under-replication. When selective or preferential under-replication occurs, there is no cline in insert proportion, indicating that the inserts are selectively neutral at the individual level on this genetic background. However, when preferential under-replication fails to occur (on uraa X chromosomes), selection favors increased insert proportions in the same environmental contexts that favor the aa syndrome. Note that the selection on the R1 inserts depends both upon the genetic background and upon the external environment. We have previously discussed evidence that the EcoR1+ R1 elements are favored targets of selection below the level of the individual. They have the ability to spread throughout the rDNA complex via both transposition and tandem duplication and dominate intrachromosomally over alternative transposable elements. The results shown in Figure 15.20 indicate that individual-level, interchromosomal selection also occurs on the EcoR1+ R1 elements, but this selection on the EcoR1+ R1 elements is strictly modulated by fitness epistasis with the ur locus and occurs only when there is uniform under-replication. This result is consistent with the molecular biology of the syndrome since preferential under-replication buffers the organism against the impact of having a large proportion of the 28S genes inactivated by the insert. When there is uniform under-replication, the severity of the syndrome in the laboratory environment increases with increasing insert proportion (Templeton et al. 1989), so it is not surprising that individual-level selection could act upon insert proportion in the molecular context of uniform under-replication. 
Because of this epistasis and its demonstrable impact on joint allele frequency patterns, the aa supergene represents an example of a coadapted gene complex (Chapter 12). These results also make it clear that an understanding of the evolutionary significance of these R1 transposable elements requires studies at the molecular, tissue, physiological, individual, and population levels, as well as monitoring of environmental factors and population structure. These R1 elements are targets of selection both below the level of the individual and at the level of the individual, and studies directed at only one of these biological levels will always yield an incomplete picture.

Overview

The aa story touches upon many of the major points made in this book about natural selection. Moreover, it illustrates many of the major methodological approaches used in population genetics. Evolutionary processes have produced an immense array of biological diversity on this planet, with species displaying complex and intricate adaptations to their environments. Understanding this diversity and complexity, its origins, and its implications ranging from the molecular through ecological levels is a daunting challenge. To meet this challenge, the study of population genetics requires an appreciation of a broad range of scientific approaches, as illustrated by the studies on aa. In this book, we made use of four approaches, all of which were used in the aa studies:

• Comparative analysis
• Reductionism
• Holism
• Monitoring of populations

We now discuss each of these approaches, illustrating their use with aa and other systems.


Comparative Analysis

An evolutionary process occurs over time; therefore, evolving populations (and the genes contained within those populations) have a history. The comparative approach to biological science makes active use of this history. This is a scientific method used extensively in biology, mostly at the species level and above. Traditionally, an evolutionary tree is constructed for a group of species. Then, other data about these organisms (anatomy, developmental pathways, behavior, etc.) are overlaid upon the evolutionary tree. In this manner, it is possible to infer how many evolutionary transitions occurred in characters of interest, the locations of transitions within the evolutionary tree, and patterns of evolutionary associations among characters. Contrasts between those organisms close together on an evolutionary tree are generally those that are most informative about the character of interest because the sharing of evolutionary history for all other traits is maximized by this contrast. A comparative contrast bears some similarity to a controlled experiment in reductionist empirical science because the contrast is chosen to minimize confounding factors. For example, this comparative approach was used to infer natural selection on primate mtDNA (Figure 12.12). Comparative approaches are also used to make hypotheses about a species for which critical information is lacking. Information from comparative analysis was crucial for the aa studies. The similarity between the aa morphological phenotype and the “bobbed” phenotype in the related species of D. melanogaster and the even more closely related species D. hydei provided the information that made the rDNA tandem cluster on the X chromosome the initial genetic candidate for aa. This comparative information was partially correct in that aa was indeed greatly affected by the rDNA, but unlike bobbed, there was no deletion of many tandem copies. However, this clue also led to the discovery of the R1 inserts in many of the tandem rDNA copies. R1 inserts are not limited to D. mercatorum in the genus Drosophila, so once again, information from comparative analyses was crucial in deciding to investigate transcriptional inactivation of inserted repeats and preferential under-replication in the larval fat body. These studies on other Drosophila and insects also suggested a role for JH metabolism. In particular, the induction of aa-like phenocopies by topical application of JH analogs in late third instar larvae of D. hydei was a major motivating factor for the JH studies performed on aa in D. mercatorum. Although the focus of population genetics has traditionally been on populations within a species, all species have an evolutionary history, and the wise population geneticist makes full use of this historical information. The studies on the genetic and physiological basis of aa could not have occurred without using this comparative information from other evolutionarily related species. One of the more exciting developments in population genetics during the last part of the twentieth century was the development of coalescent theory and of molecular techniques that have allowed the application of comparative approaches within species. Indeed, coalescent theory and haplotype trees often blur the distinction between an intra- versus inter-specific analysis and can integrate them into one. For example, we saw this blending in contingency tests of natural selection (Figure 12.12) and in the phylogeographic analyses of African elephants (Figures 7.10 and 7.23). Given the recent advances in theory, statistics, and molecular genetics, such blended studies will undoubtedly become more and more common in population genetics.

Reductionism

Reductionism seeks to break down phenomena from a complex whole into simpler, more workable parts to find underlying rules, laws, and explanations. The reductionist approach is based upon the assumption that many complex features of a system can be explained in terms of a few components


or rules contained within the system itself, that is, the explanation for the observed complexity lies within the content of the system. In this manner, simplicity (the parts contained within the system) generates complexity (the attributes of the whole system). Reductionism seeks necessary and sufficient explanations for the phenomenon under study. Such content-oriented explanations based upon reductionism are said to be proximate causes for the phenomenon of interest. To understand the proximate cause of aa, it was necessary to perform reductionistic molecular, physiological, and developmental studies. Information from comparative analyses indicated that ribosomal DNA is a candidate locus (Chapter 10) for aa; hence, the reductionistic studies initially focused on rDNA. These studies were generally done in the laboratory using controlled experiments in which ideally all potential variables save one are fixed, thereby allowing strong inference about how the single remaining variable factor causes effects of interest in the system under study. The controlled experiment fixes the context to allow inference about the content of a system varying with respect to a single factor. The experimental approach has been widely applied in population genetics and has proven to be a powerful tool in elucidating causal factors in microevolution. Note, however, that the strong inferences made possible by this approach are limited by the fixed contexts of the experiment, so generalizations outside of that context need to be made with great caution and confirmed when possible by field studies. Moreover, potential interactions with variables that have been experimentally fixed lie outside the domain of inference of the experimental approach unless experiments are explicitly designed to study interactions. 
These laboratory experiments on the aa syndrome ultimately led to defining a variety of targets and units of selection (Chapter 13), from the R1 elements being both a unit and target of selection with respect to spreading within a chromosome below the level of the individual, to the developmental delays, increased early fecundity, and decreased survivorship at older ages found at the level of the individual. By defining the genetic elements, their linkage relationships, and strong epistasis, the reductionistic studies indicate that the appropriate unit of selection for the individual-level phenotypes is not a single locus, but rather a supergene consisting of the ur locus and the multigene family of X-linked rDNA. Overall, the studies on proximate causation set the groundwork for the investigations on the ultimate causation of why aa is polymorphic and shows dynamic temporal and spatial patterns. Reductionism in population genetics is not limited to experiments but is also the basis of much theoretical population genetics. In modeling microevolution, the complexity of an evolving population is often simplified by reducing the number of variables and ignoring many biological details. With such simplification, laws and complex evolutionary patterns can be elucidated from a few components or factors that are contained within the population itself. The studies on aa used many of these equations that arose from reductionistic models of microevolution.

Holism

The holistic approach is based upon the assumption that simple patterns exist in nature that emerge when underlying complex systems are placed into a particular context (simplicity emerges from complexity). The explanation of these emergent patterns often does not depend upon knowing the detailed content or proximate causes of the component complex systems, but rather depends upon the context in which these components are placed in a higher level interacting whole. These context-dependent explanations that do not depend upon detailed content reveal what is commonly called ultimate causation. For example, why do people die? A reductionist approach would look at each instance of death and attempt to describe why that particular person died at that particular time in terms of the status

of that particular individual’s health at the time of death. Taking such a reductionist approach, the three leading proximate causes of death in the year 2000 in the United States are (i) heart disease (29.6% of all deaths that year), (ii) cancer (23%), and (iii) cerebrovascular disease (7%) (Mokdad et al. 2004). In contrast, a holistic approach looks at multiple external variables that define the health context of a population of individuals. One would try to assess the importance of context variables as predictors of death at the level of the whole population. Taking such a holistic approach, the three leading ultimate causes of death in the year 2000 in the United States are (i) tobacco consumption (18.1%), (ii) being overweight (poor diet and physical inactivity, 16.6%), and (iii) alcohol consumption (3.5%) (Mokdad et al. 2004). The ultimate explanation of causes of death does not depend upon the proximate cause of death of any particular individual. The ultimate answers as to why people die depend upon the environmental context (tobacco, diet, physical activity, alcohol, etc.) into which individuals have been placed. It is critical to note that reductionist and holistic approaches are complementary, not antagonistic. Both approaches provide answers that are meaningful, albeit at different biological levels. A practicing physician would be most concerned with the particular health status of their patients. Such a physician would be prescribing specific treatments for specific individuals based on studies and knowledge of proximate causation. However, a public health official would focus more on ultimate causation and would try to augment the health of the US population by encouraging less tobacco and alcohol use and reducing the number of overweight people through diet and exercise. Both answers to why people die are valid, and both answers can be used in making health-related decisions.
The reductionist and holistic answers each lead to insights and details that are not addressed by the other. This was certainly true for the aa studies. The reductionistic, proximate findings shaped and directed the studies on ultimate causation at the evolutionary level – studies seeking to answer why aa is present in this natural population and why it displays a dynamic pattern of spatial and temporal shifts in frequency. The proximate studies revealed a spectrum of pleiotropic effects associated with aa, which in turn suggested candidate phenotypes that could influence various components of fitness (Chapter 11), particularly those related to viability and fecundity. Holistic field studies were then executed to put the phenotypic expression into an environmental context, showing which phenotypes were irrelevant in the natural environment (e.g. juvenilized cuticle) and which were important (e.g. egg-to-adult slowdown and early female fecundity) in the context of age-structured natural populations and the environmental factors that influenced that age structure. The reductionistic modeling of theoretical population genetics that produced Eq. (15.25) was then used as a holistic tool to assemble all these diverse phenotypic components into a single emergent property: fitness in the context of the environmental variation experienced by this population over space and time. As shown in Chapter 12, even a complete knowledge of the relationship between genotype and phenotype and its fitness consequences is insufficient for ultimate causation of allele frequency patterns in space and time. As shown in Chapters 8 and 11, the evolutionary effects of fitness differences are channeled through gametes through the average excess, which is a function not only of fitness but also of gamete frequencies and system of mating. Moreover, Chapter 12 emphasized that multiple evolutionary forces operate in an interactive fashion upon a genetic system.
As a consequence, ultimate causation of temporal and spatial patterns for aa could not be understood except in the context of population structure (Chapters 2 through 7) and spatial and temporal environmental heterogeneity (Chapter 14). One of the most important interactions influencing population structure is the balance between genetic drift and gene flow (Chapter 6). The balance between these two evolutionary forces was measured in Chapter 6 for the D. mercatorum population living near

Kamuela such that the population would approximate panmixia for neutral loci but would still be capable of local adaptation under natural selection. This knowledge of population structure provides a critical context for interpreting the spatial and temporal heterogeneities observed in the aa supergene allele frequencies. These emergent properties of population structure and of fitness in the context of environmental variation provided insight into the ultimate causation of the existence and persistence of aa in this population. All too often, reductionism and holism are presented as alternative, antagonistic approaches in biology. The holistic, field experiments performed on natural populations of D. mercatorum could not have been designed or executed without reductionistic laboratory experiments on proximate causation and without the integrative power of reductionistic population genetic theory to reveal emergent properties. Population genetic studies benefit by using both reductionism and holism. The aa studies are an example of such an integrated reductionist/holistic approach.

Monitoring Populations

Many hypotheses in population genetics can be tested by monitoring populations, both experimental and natural. One of the simplest types of monitoring is a one-time sample of individuals of unknown relationship coupled with some sort of genetic survey (using one or more of the techniques described in Appendix A). Such simple genetic surveys allow one to estimate and test many of the evolutionary forces described in Section 1 of this book. Many current genetic survey techniques of present-day genes and/or populations allow inferring evolutionary history of those genes and/or populations. Moreover, the genetic survey data can be overlaid with phenotypic data to test hypotheses about how genetic variation influences phenotypic variation, as shown in Section 2. Finally, Section 3 shows that many tests for the presence or past operation of natural selection are possible from such genetic survey data. The monitoring of populations can be extended beyond a simple one-time survey of genetic variation of individuals of unknown relationship. For example, one can sample families (parents and offspring) instead of individuals, or follow a population longitudinally through time to obtain multigeneration data. Population genetics is concerned with the fate of genes over space and time within a species, and this fate can be observed or estimated by monitoring populations over space and time. Such monitoring over space and time also allows population geneticists to make use of natural experiments, and natural experiments were a critical component of the aa studies. Use was made of droughts, volcanic eruptions, El Niño weather events, seasonal variation, and storms coupled with spatial heterogeneity.
The monitoring included not only genetic surveys but also extensive monitoring of environmental variables such as temperature, humidity, and wind along with ecological monitoring of population and community variables such as population size, food resource availability, dispersal, and community species composition. By making use of these natural experiments coupled with genetic, environmental, and ecological monitoring, the ultimate adaptive significance of aa was revealed to be as an adaptation to young age structure. These studies revealed that the fitness consequences of aa were strongly interacting with environmental factors that varied over both space and time, creating a highly heterogeneous selective environment (Chapter 14). The ultimate causation of the spatial and temporal dynamics of aa arose from the interactions among the fitness effects of aa interacting with spatial heterogeneity in the environment upon the substrate of population structure that modulated the fitness response into spatial clines that tracked temporal changes in the environment (Figures 15.17, 15.19, and 15.20). As shown by the aa system, population genetic studies are inherently a cross-disciplinary endeavor

that integrates multiple methods of inference that are complementary and reinforcing to one another. The great population geneticist Theodosius Dobzhansky (1962) stated that “Nothing in biology makes sense except in the light of evolution.” Dobzhansky’s statement emphasizes that evolution is the core of biology. All aspects of biology are informed by evolution because evolutionary processes operating in conjunction with physico-chemical properties have shaped all attributes of living organisms. Population genetics is at the core of evolution. All evolutionary change ultimately traces to evolution within populations – the domain of population genetics. Population genetics describes the mechanisms by which evolutionary change occurs within populations as emerging from just three, well-established properties of DNA (or occasionally RNA): DNA can replicate, DNA can mutate and recombine, and the information in DNA interacts with the environment to produce phenotypes. This is an amazing accomplishment.

Appendix A Genetic Survey Techniques

Population genetics deals with the distribution of genetic variation through space and time. As a consequence, the specific questions addressed by population genetic studies have always been influenced and constrained by the techniques used to measure genetic variation within populations. This appendix will not present a comprehensive discussion of all the techniques used in population genetics, but rather will focus on those techniques currently used or used in the recent past. Moreover, this will not be a cookbook of how to implement these survey techniques, particularly given the rapidity with which many of the techniques are being refined and altered. Instead, the purpose of this appendix is to familiarize the reader with the properties and limitations of the basic genetic survey techniques. The 1960s marked a major change in population genetic survey techniques with the advent of protein electrophoresis, as discussed in Chapter 5. We begin our survey of techniques for measuring genetic variation with protein electrophoresis, and then work our way to the present, when studies of genetic variation at the DNA level are now dominating the field.

Protein Electrophoresis

Protein electrophoresis was first introduced as a major genetic survey technique in the 1960s (Harris 1966; Johnson et al. 1966; Lewontin and Hubby 1966) and is currently rarely used as a tool for screening populations for genetic variation. Protein electrophoresis is based on the fact that non-denatured proteins with different net charges at a specified pH migrate at different rates through various media (paper, cellulose acetate, starch gels, polyacrylamide gels, etc.). Because the proteins must be non-denatured, great care is needed in collecting and preparing samples. In many cases, fresh samples are used, such as a blood sample (often separated into the white and red blood cells), whole organisms (such as individual Drosophila), or tissues (liver, root tips, etc.). If fresh samples cannot be used, the samples must be quickly frozen after removal from the organism and maintained at low temperatures, typically −70 to −80 °C. The samples are homogenized in a buffer solution to release the proteins, and multiple individual samples (often around 20–25 individuals) are generally placed along a line on the buffered medium (Figure A.1). A power supply is then used to run an electrical current through the medium, and the proteins will migrate into this medium, with the distance of the migration being a function of the running conditions (the medium itself, the buffer used and its pH, and the length of time allowed for the proteins to migrate) and the charge on the protein. Not all amino acids are charged under the pH ranges most commonly used. Most of the charge differences are due to 5 of the 20 amino acids: lysine, arginine, and histidine, which tend to have a positive charge, and glutamic acid and aspartic acid, which tend to have a negative charge.
Companion website: www.wiley.com/go/templeton/populationgenetics2

Moreover, substitutions involving these amino acids are more effective at changing the charge of the molecule if they are located on or near the external surfaces of the protein. However, other amino acid changes can also alter the rate at which the protein moves through a medium through conformational changes. Overall, only about a third of all amino acid substitutions alter electrophoretic mobility, so protein electrophoresis only detects a subset of the non-synonymous variants in protein-coding genes. Once the proteins have migrated to their final positions in the medium, they need to be visualized in order to be scored by a human observer. Sometimes, the protein acts as its own stain. For example, hemoglobin molecules are a bright red, so their position in the medium is indicated by a red band. More generally, the gel or other media is immersed in a histochemical stain that singles out a particular protein from among the thousands of other proteins that have migrated in the same medium. Such stains often take advantage of the fact that many proteins are enzymes that catalyze a specific biochemical reaction. The substrates and cofactors for a specific enzymatic reaction can be added to a staining solution along with a stain that is chemically coupled to the enzymatic reaction such that the stain is deposited only where that reaction is occurring. Some media, such as starch gels, can be sliced into multiple thin sheets, and each sheet can be immersed into a different staining solution. In this manner, a single gel can be used to score many different loci. Once stained, bands appear on the medium that indicate the positions to which a specific protein and its variants have migrated (Figure A.1). The different allelic protein variants revealed by this technique are often called allozymes. The resulting banding patterns typically define a codominant Mendelian system of variation. 
Figure A.1 shows the typical pattern obtained for a monomer protein at a locus polymorphic for two electrophoretic alleles. More complicated patterns can be obtained for proteins that are dimers or tetramers, but, in general, a set of distinct bands is associated with all the electrophoretic alleles and genotypes. Another advantage of protein electrophoresis is that one chooses the protein-coding loci to be scored on the basis of the stains used, and not necessarily on those loci being polymorphic. Thus, one can survey a population for a specific protein and conclude that there was no electrophoretically detectable genetic variation at that locus, allowing one to estimate the proportion of loci that are polymorphic versus monomorphic. The implications of this feature were discussed in Chapter 5. One limitation of this method is that the evolutionary relationships among the alleles cannot be reliably inferred from the banding patterns. Hence, protein electrophoretic data cannot generally be used to construct haplotype trees (Chapter 5). However, population genetic distances (Box 6.1) can be calculated from these data, so trees of populations can be estimated using algorithms such as neighbor-joining (Box 5.2), although one should check whether a tree of populations is a biologically plausible possibility (Chapter 7).
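Because allozyme banding patterns are codominant, allele frequencies can be obtained by simply counting alleles in the scored genotypes. The sketch below does this for the nine lanes drawn in Figure A.1; the function names are illustrative assumptions, and the expected heterozygosity assumes random mating.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Count alleles directly from codominant two-letter genotypes such as 'SF'."""
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def observed_heterozygosity(genotypes):
    """Proportion of individuals scored with two different bands."""
    return sum(g[0] != g[1] for g in genotypes) / len(genotypes)

# Inferred genotypes of the lanes in Figure A.1, left to right.
lanes = "SS SF SS FF SS FF SF SF FF".split()

freqs = allele_frequencies(lanes)
h_obs = observed_heterozygosity(lanes)
# Expected heterozygosity under random mating (an assumption of this sketch).
h_exp = 1 - sum(p ** 2 for p in freqs.values())

print(freqs, round(h_obs, 3), h_exp)
```

With only nine individuals, the difference between observed and expected heterozygosity carries little statistical weight; the point is only that codominant markers make both quantities directly countable.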

Restriction Endonucleases

The 1960s also marked the discovery of restriction endonucleases (commonly called restriction enzymes), enzymes that cleave duplex DNA at particular oligonucleotide sequences, usually of four, five, or six base pairs in length (Linn and Arber 1968; Meselson and Yuan 1968). For example, the restriction enzyme EcoR1 (named for the bacterium, Escherichia coli, from which it was isolated) cuts double-stranded DNA where the non-methylated sequence 5′-GAATTC-3′ occurs. Hundreds of other restriction enzymes have been isolated from bacteria, each with a specific but often different recognition sequence. These enzymes have played and continue to play an important role in population genetic surveys. Many different techniques have been developed that use restriction enzymes for genetic surveys, as will now be outlined.

Figure A.1 Protein electrophoresis. The results shown are for a monomer protein coded for by a locus with two alleles, S and F, standing for slow and fast (referring to the relative mobilities of the proteins they code). [Schematic: a power supply (volts, amps) drives the gel; samples start near the negative (−) pole and migrate toward the positive (+) pole. Inferred genotypes of the lanes, left to right: SS, SF, SS, FF, SS, FF, SF, SF, FF.]

Restriction Fragment Length Polymorphisms of Purified DNA

The earliest genetic surveys using restriction enzymes began with highly purified DNA, usually mtDNA (Avise et al. 1979). The purified DNA is cut with one or more restriction enzymes, and the resulting fragmented DNA is placed in wells in an agarose or polyacrylamide gel. The gel is then subjected to electrophoresis, and the negatively charged DNA fragments move toward the anode. Such gels are a complex molecular network, so smaller DNA fragments move through it more rapidly than large fragments; hence, the fragments are separated by their molecular weight. The positions of the fragments within the gel are visualized by stains such as ethidium bromide or by using radioactively labeled DNA. In the latter case, the gel is dried after being run and an X-ray film is overlaid upon it within a light-proof container. After exposing the film to the radioactive gel, the film is developed into an autoradiograph to reveal the position of the fragments. The film technology has now largely been replaced by other visualization methods that avoid radiation. Variation is observed as different banding patterns, as shown in Figure A.2, and such variation is known as a restriction fragment length polymorphism (RFLP). Because the recognition sequence is a specific nucleotide sequence, any nucleotide polymorphism or insertion/deletion polymorphism whose alternative states are associated with the presence versus absence of the recognition sequence will define an RFLP. RFLPs can be located in both protein-coding and non-coding DNA, and within coding regions can be due to both synonymous and non-synonymous mutations.

Figure A.2 Restriction fragment length polymorphism. A population is polymorphic for a single restriction recognition sequence in a specified purified piece of DNA, such that those molecules with the recognition sequence are designated by A and those molecules without the recognition sequence are designated by a. Upon cutting with the restriction enzyme, molecules of type A yield two fragments, whereas molecules of type a yield a single large fragment. The resulting banding patterns are shown in the bottom half of the figure. [Schematic: molecule A carries the cut site and molecule a lacks it; samples start near the negative (−) pole and fragments migrate toward the positive (+) pole; lanes are labeled with the inferred genotypes AA, Aa, and aa.]

Hence, a broader class of polymorphisms can be detected with this technique than with protein electrophoresis. As with allozymes, RFLPs are codominant Mendelian markers, and also like protein electrophoresis, DNA regions can be surveyed to reveal no polymorphism.
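The logic of Figure A.2 can be sketched as an in silico digest. The example assumes a linear molecule, a single enzyme recognizing GAATTC and cutting after the first base (as EcoR1 does), and two made-up sequences that differ by one nucleotide within the recognition sequence; all names and sequences are hypothetical.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear molecule at every recognition site.

    cut_offset is the position within the site where the strand is cut
    (1 corresponds to G^AATTC).
    """
    fragments, start = [], 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)
    return fragments

def band_pattern(*molecules):
    """Distinct fragment sizes seen in one lane, largest first."""
    return sorted({size for m in molecules for size in digest(m)}, reverse=True)

# Hypothetical alleles: 'A' carries the site, 'a' differs by a single base within it.
allele_A = "ATGC" * 5 + "GAATTC" + "TTAGC" * 6
allele_a = allele_A.replace("GAATTC", "GAGTTC")

print(band_pattern(allele_A, allele_A))  # AA homozygote: two small fragments
print(band_pattern(allele_A, allele_a))  # Aa heterozygote: all three bands
print(band_pattern(allele_a, allele_a))  # aa homozygote: one large fragment
```

The three genotypes give three distinguishable lane patterns, which is what makes the marker codominant.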

Southern Blots

The initial limitation of RFLPs to purified DNA greatly narrowed its applicability because few classes of DNA could be purified at that time in sufficient quantities. For example, although nuclear DNA as a whole could be purified, small homologous regions within the nuclear DNA could not. Consequently, one could digest the entire nuclear genome with one or more restriction enzymes, but this resulted in a highly heterogeneous pool of DNA fragments that would yield a continuous

smear in an autoradiograph. Southern (1975) developed a technique that would allow a specific DNA region to be scored even when the initial sample consisted of the entire nuclear genome. Southern’s technique begins with the same procedures described above. After gel electrophoresis, the DNA fragments in the gel are denatured in a basic salt solution and then transferred as single-stranded DNA fragments to a nylon or nitrocellulose membrane through capillary action by blotting the salt solution through the gel and membrane into paper towels placed on top; hence, this technique is known as Southern blotting. The next step requires that the specific region of DNA to be surveyed had previously been cloned or otherwise isolated and purified. This cloned or purified DNA is then radioactively labeled (non-radioactive labels were later developed) and denatured into single-stranded copies. This labeled, single-stranded DNA is known as the “probe,” and the membrane with the single-stranded fragments transferred from the gel is incubated with the probe under conditions in which strands that are complementary to those of the probe will hybridize and thereby form labeled duplexes in the membrane. Thus, the probe picks out those sequences that are homologous to it and ideally no others. The fragments that bind to the probe are then visualized by autoradiography or some other appropriate technique for non-radioactive labels. The development of the Southern blotting technique greatly expanded the utility of RFLPs as a means of surveying for polymorphisms. The entire genome was now theoretically open for investigation.

Restriction Site Mapping

A restriction site map shows the physical position of all the restriction sites relative to one another. Sometimes, these relative positions can be inferred just from the digestion profiles and the estimated fragment sizes, but often this is not possible. For example, Figure A.3 shows a hypothetical region of DNA that is cut once by the enzyme EcoR1, and once by the enzyme BamH1. There are two ways in which these two restriction sites can be oriented relative to one another that are both consistent with the single-enzyme sized-fragment patterns. Hence, the restriction site map cannot be inferred from the single-digest fragment data. Additional data are required to infer the map in such cases. Often, the needed information can be generated by doing a double digest. Figure A.3 shows the results of digesting the DNA with both EcoR1 and BamH1 simultaneously, and as can be seen, the two possible restriction site maps produce different double-digest patterns. Hence, the restriction site map can be generated by producing different kinds of digestion profiles. All the genetic information available in the RFLP analysis is still present when one goes to the extra effort of producing a map, but, in addition, the map generates haplotype data, that is, the simultaneous genetic state at two or more polymorphic sites on the same DNA molecule. Haplotype trees and linkage disequilibrium can therefore be estimated from the restriction site haplotypes.
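The double-digest reasoning can be sketched for the simplest case: a linear molecule cut once by each of two enzymes. The fragment sizes (in kb) and function names below are hypothetical and are not taken from Figure A.3; the sketch only shows how the double digest chooses between the two candidate maps.

```python
def double_digest(length, site1, site2):
    """Sorted fragment sizes when a linear molecule is cut at both positions."""
    first, second = sorted([site1, site2])
    return sorted([first, second - first, length - second])

def candidate_maps(length, frag1, frag2):
    """Two possible maps given one fragment size from each single digest.

    Enzyme 1's site is fixed at frag1 from the left end (this removes the
    mirror-image ambiguity); enzyme 2's site is then at frag2 or length - frag2.
    """
    return [(frag1, frag2), (frag1, length - frag2)]

def consistent_maps(length, frag1, frag2, observed_double):
    """Keep only the maps whose predicted double digest matches the observed one."""
    return [m for m in candidate_maps(length, frag1, frag2)
            if double_digest(length, *m) == sorted(observed_double)]

# Hypothetical 10 kb region: enzyme 1 gives 3 + 7 kb, enzyme 2 gives 4 + 6 kb.
print(candidate_maps(10, 3, 4))               # both maps fit the single digests
print(consistent_maps(10, 3, 4, [3, 3, 4]))   # double digest selects one map
print(consistent_maps(10, 3, 4, [1, 3, 6]))   # a different result selects the other
```

With more sites per enzyme the number of candidate maps grows quickly, so this enumeration is only a toy version of the inference.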

DNA Fingerprinting

Jeffreys et al. (1985) isolated some DNA probes from humans that had a conserved core sequence of 10–100 bp that was widely scattered throughout the human genome, although it was soon discovered that many of these probes worked well on a variety of organisms. At each genomic location, the core sequence tended to exist as tandem repeats, but with the number of tandem repeats varying from location to location and even from individual to individual at the same location. These tandem repeat regions were called minisatellites or VNTRs (variable number of tandem repeats). When nuclear DNA is digested with restriction enzymes that cut outside the core region and then subjected to Southern blotting, many fragments of diverse lengths will hybridize with the probe as a function of

Figure A.3 Restriction site mapping through double digests. The left two lanes show the fragment length profiles with single digestions of a region of DNA with either EcoR1 or BamH1. All the fragments are indicated by capital letters. The single digestion profiles are compatible with two different restriction site maps, as shown in the right two lanes. The two possible maps are readily distinguished by their double digestion profiles. [Schematic: possible map 1 and possible map 2 of the EcoR1 and BamH1 cut sites, with fragments labeled A through F; samples start near the negative (−) pole and fragments migrate toward the positive (+) pole.]

variation both in the number of tandem repeats and in the locations of the cut sites outside of the core region. This results in a complex gel profile influenced by many loci scattered throughout the genome, and such multi-locus profiles are called DNA fingerprints. DNA fingerprinting reveals extensive genetic variation in most populations, often to the extent that no two individuals are the same in their gel profiles (except for identical twins). Hence, DNA fingerprinting was used extensively in forensics. However, DNA fingerprinting has some serious disadvantages for many population genetic studies. The gel profile is a multi-locus phenotype, and there is generally no way of knowing how many loci are involved and which bands correspond to homologous alleles and which are associated with paralogous loci. Hence, one cannot even determine allele or genotype frequencies from such data, much less more complicated population genetic parameters.

Polymerase Chain Reaction

The original RFLP analyses required purified DNA. At that time, one could purify mtDNA by centrifugation and some other DNA regions by laborious processes. The development of the polymerase chain reaction (PCR) allowed the purification of small, well-defined regions of the genome

Figure A.4 The polymerase chain reaction. Steps shown: 1. Denature DNA; 2. Anneal primers; 3. Extend primers; 4. Repeat cycle.

through amplification that could then be subjected to restriction site analysis (Saiki et al. 1985). PCR involves four main steps (Figure A.4). First, the double-stranded DNA from the sample is denatured (made into single strands) by heating. Second, single-stranded DNA primers are annealed to complementary DNA from the sample that flank the region to be amplified. Third, the now double-stranded primer regions are extended by use of the thermostable Taq DNA polymerase to synthesize complementary strands from the primers. Fourth, the first through third steps are repeated multiple times to produce a large quantity of the DNA in the region flanked by the primers. Once amplified, the DNA can be subjected to restriction site analysis and mapping, as discussed above. Saiki et al. (1985) pointed out that the PCR technique was not necessarily limited to restriction enzyme analysis. They were certainly right. PCR revolutionized genetics. It soon became obvious that this technique could be applied in many diverse ways in both molecular and population genetics. Several genetic survey techniques were soon developed that greatly expanded our abilities to survey genetic variation in natural populations. Most of these PCR-based techniques reveal variation at the DNA level with and without the use of restriction enzymes.
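The four steps can be mimicked with a toy in silico PCR: locate where two primers anneal, return the region they bound, and compute the idealized copy number after repeated cycles. The sequences, function names, and the `efficiency` parameter are all hypothetical, and perfect doubling per cycle is an idealizing assumption rather than a property of real reactions.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def amplicon(template, forward, reverse):
    """Region bounded by the two primers, or None if either primer fails to anneal.

    The forward primer matches the given strand; the reverse primer matches the
    complementary strand, so its reverse complement is searched for downstream.
    """
    start = template.find(forward)
    if start == -1:
        return None
    end = template.find(reverse_complement(reverse), start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse)]

def copies_after(cycles, initial=1, efficiency=1.0):
    """Idealized copy number; efficiency=1.0 means every copy doubles each cycle."""
    return initial * (1 + efficiency) ** cycles

# Hypothetical template assembled from labeled pieces so the layout is explicit.
fwd_site, target, rev_site = "GACCTGAAGC", "TAGGCATCGATCGGATT", "ACAGGCTTAA"
template = "TT" + fwd_site + target + rev_site + "CG"
forward_primer = fwd_site
reverse_primer = reverse_complement(rev_site)

product = amplicon(template, forward_primer, reverse_primer)
print(product, len(product))
print(copies_after(30))  # about a billion copies from one template under ideal doubling
```

A primer that does not match the template returns `None`, a failure mode that is taken up again in the discussion of allelic dropout for microsatellites.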

Amplified Fragment-length Polymorphisms (AFLPs)

This technique begins with a restriction digest of genomic DNA, usually by a pair of enzymes with one having a four base recognition sequence and the other having a six base recognition sequence. Such a double digestion of genomic DNA would create a complex mix of many different genomic

fragments that would yield a smear after gel electrophoresis. To produce a readable gel, the number of visualized bands has to be greatly reduced. The Southern blotting technique solved this problem with the use of a probe that would anneal to only a specific subset of the DNA. A PCR primer can also be thought of as a probe that will anneal to only a subset of the DNA. The task is to choose a primer that will anneal to many fragments in order to detect much polymorphism, but not so many fragments as to produce an unreadable gel. This balancing act is achieved by attaching short synthetic DNA sequences known as adaptors to the cut ends of the restriction fragments. The primers are then designed to match the known synthetic adaptor sequence plus some additional nucleotides. Hence, the only fragments that will amplify under PCR are those that had these additional nucleotide sequences next to the recognition sequences of the restriction enzymes used in the original genomic digestion. Thus, only a small subset of the restriction fragments will amplify, thereby producing a readable gel. As was the case with DNA fingerprinting, this primer strategy generally will amplify DNA fragments from multiple locations within the genome, so the resulting bands on the gels represent a multi-locus phenotype. Most of the polymorphism observed with this technique is nucleotide polymorphism in the bases adjacent to the restriction cut sites. A fragment amplifies when its sequence matches the primer, but if an alternative polymorphic state in that set of nucleotides does not match the primer, then no amplification occurs. As a result, the polymorphisms are exhibited as the presence or absence of specific bands, and the presence of a band is a dominant phenotype. Because of the dominant, multi-locus nature of the polymorphisms, AFLPs cannot be used to estimate many genetic parameters, such as heterozygosity. 
As was the case with DNA fingerprinting, AFLPs are most useful for identifying individuals genetically and estimating relatedness between individuals.
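The consequence of dominance can be illustrated with a toy example in which "+" denotes an allele that amplifies and "-" one that does not. The two populations below are hypothetical and were chosen so that their band phenotypes are identical while their true heterozygosities differ.

```python
def band_phenotype(genotype):
    """A band appears whenever at least one amplifying '+' copy is present."""
    return "band" if "+" in genotype else "no band"

def heterozygosity(genotypes):
    """True proportion of heterozygotes, which the gel does not reveal."""
    return sum(g[0] != g[1] for g in genotypes) / len(genotypes)

# Two hypothetical populations scored at one AFLP band.
pop1 = ["++", "++", "+-", "--"]
pop2 = ["+-", "+-", "+-", "--"]

phen1 = sorted(band_phenotype(g) for g in pop1)
phen2 = sorted(band_phenotype(g) for g in pop2)

print(phen1 == phen2)                              # identical gel data
print(heterozygosity(pop1), heterozygosity(pop2))  # different heterozygosity
```

Because "++" and "+-" individuals look the same on the gel, the band data alone cannot separate these two populations.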

Randomly Amplified Polymorphic DNAs (RAPDs)

The RAPD technique uses short PCR primers of about 10–20 arbitrary base pairs in length. Such short sequences will have matching complementary sequences at multiple places within a genome just by chance alone, that is, primers will anneal at places that just happen to match their sequence and not necessarily because of underlying homology (identity-by-descent). Hence, many DNA fragments are randomly amplified. Polymorphisms are frequently revealed when these randomly amplified products are separated by size with electrophoresis, usually due to underlying polymorphisms in the primer recognition sites, although the precise nature of these polymorphisms at the DNA level is not defined. This technique results in multi-locus, dominant gel pattern phenotypes, so it bears many similarities to AFLPs and also shares its constraints.
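A rough calculation shows why short primers match by chance. The sketch assumes a random genome with equal base frequencies and independent sites, which real genomes violate, and the genome size is hypothetical. It counts perfect matches only, whereas an amplified product additionally requires two sites in opposite orientation close enough to each other.

```python
def expected_sites(genome_size, primer_length, both_strands=True):
    """Approximate expected number of perfect chance matches for one primer.

    Assumes equal base frequencies (1/4 each) and ignores edge effects,
    so the number of possible positions is taken as genome_size.
    """
    per_position = 0.25 ** primer_length
    strands = 2 if both_strands else 1
    return genome_size * per_position * strands

# Hypothetical genome of 100 million base pairs.
for k in (10, 15, 20):
    print(k, expected_sites(1e8, k))
```

Under these assumptions a 10-base primer is expected to find on the order of a couple of hundred chance sites in such a genome, whereas a 20-base primer is expected to find essentially none, which is why longer primers are used when a single specific region is targeted.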

Microsatellites or Short Tandem Repeats (STRs) Another PCR-based technique that is extensively used in population genetics is to assay short tandem repeats or microsatellites. Scattered throughout the genome of most organisms are tandem repeats of short sequences, often only two to four nucleotides long (Hamada et al. 1984). Many of these sites have polymorphic variation in the number of tandem copies. Primers are developed for invariant regions in the DNA that flank such a microsatellite region, and PCR amplification with such flanking primers will produce DNA products that differ in size as a function of the number of repeats. Fragments of different sizes can be separated by gel electrophoresis, with different sizes corresponding to different alleles. Many such microsatellite regions are highly polymorphic with many alleles. The advantage of this system is that high levels of genetic variation can be detected in such a manner that individual, codominant alleles can be scored at individual loci,


Appendix A Genetic Survey Techniques

in great contrast to other high variation alternatives such as DNA fingerprinting, AFLPs, or RAPDs. In this regard, microsatellites are more similar to allozymes, although the levels of variation tend to be much higher. Because changes in repeat number are so common and the number of size classes is finite and small (usually less than 20), the same size-class allele can originate from multiple, independent mutations. Hence, the infinite alleles model is generally inapplicable to microsatellite data. Moreover, sometimes, PCR does not amplify one or both of the allelic copies at a locus, especially for samples of poor DNA quality. This phenomenon is called allelic dropout and can bias estimates of heterozygosity and inbreeding (Wang et al. 2012). Replicate genotyping can be used to reduce this problem, and a maximum likelihood (Appendix B) approach can be used to estimate the rate of allelic dropout to help correct the bias it causes (Wang et al. 2012).
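The downward bias that allelic dropout causes in heterozygosity can be illustrated with a minimal sketch. It assumes a deliberately simple toy model that is not the likelihood model of Wang et al. (2012): each allelic copy fails to amplify independently with the same probability, a genotype with both copies missing is left unscored, and a heterozygote with one copy missing is scored as a homozygote. The function name and parameter values are hypothetical.

```python
def observed_heterozygosity(true_het, dropout):
    """Expected heterozygosity among scored genotypes under the toy model.

    true_het : true proportion of heterozygotes in the sample
    dropout  : probability that a single allelic copy fails to amplify
               (assumed independent across copies; an illustrative assumption)
    """
    if not 0.0 <= dropout < 1.0:
        raise ValueError("dropout must lie in [0, 1)")
    # A heterozygote is scored correctly only if both copies amplify.
    scored_het = true_het * (1.0 - dropout) ** 2
    # Any genotype is scored unless both of its copies drop out.
    scored_any = 1.0 - dropout ** 2
    return scored_het / scored_any


for d in (0.0, 0.05, 0.10, 0.20):
    h_obs = observed_heterozygosity(0.8, d)
    print(f"dropout={d:.2f}  observed heterozygosity={h_obs:.3f}")
```

Under these assumptions the observed heterozygosity simplifies to the true value multiplied by (1 − d)/(1 + d), so even a modest dropout rate produces an apparent heterozygote deficiency that could be mistaken for inbreeding.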

DNA Polymorphisms DNA is the primary genetic material of most organisms, so scoring genetic variation directly at the level of DNA is ideal. Moreover, reverse transcriptase can be used to make DNA copies of RNA, so RNA organisms and the transcriptome are also amenable to DNA sequencing. All the methods of surveying genetic variation given above ultimately score genetic variation at the level of DNA, but often with some imprecision as to the nature of the polymorphism at the nucleotide sequence level. For example, when a restriction enzyme cuts a specific region of DNA, we know that the DNA sequence at the cut site must match the recognition sequence of the enzyme. However, given that the site is polymorphic, we do not know the precise nature of the polymorphism. For example, consider a six-base cutter. Wherever it cuts, we know the precise state of six nucleotides, but in those homologous pieces of DNA in which it does not cut, we do not know the nucleotide state. Indeed, a mutation to any other of the three possible nucleotides at any one of the six nucleotides in the recognition site would destroy the ability of the restriction enzyme to cut at that site. Hence, there are 18 different possible single nucleotide substitutions that could underlie the RFLP. Moreover, the recognition sequence could be either created or destroyed by an appropriate insertion or deletion, so there are many ways at the DNA level to yield an RFLP. There are now many PCR-based techniques that allow us to survey polymorphisms directly at the level of DNA. Many of these techniques reveal the precise molecular nature of the polymorphism.
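The count of 18 possible substitutions for a six-base cutter can be checked by direct enumeration. The sketch below uses the EcoRI recognition sequence GAATTC purely as an example; the function name is hypothetical.

```python
def single_substitutions(site):
    """Return every sequence that differs from `site` by exactly one nucleotide."""
    variants = []
    for i, original in enumerate(site):
        for base in "ACGT":
            if base != original:
                variants.append(site[:i] + base + site[i + 1:])
    return variants


ecori = "GAATTC"  # EcoRI recognition sequence, chosen only for illustration
variants = single_substitutions(ecori)
print(len(variants))   # 6 positions x 3 alternative nucleotides
print(variants[:3])
```

Each of these variants would abolish cutting at the site, and all would be scored identically as the absence of the restriction site, which is why an RFLP alone does not reveal the underlying nucleotide change.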

DNA Resequencing The most direct and complete method for measuring genetic variation is to completely sequence the DNA section that has been amplified by a pair of PCR primers or otherwise enriched. Moreover, by designing a set of PCR primers that anneal to conserved regions of the DNA in a manner that creates overlaps in the amplified or enriched pieces, it is possible to sequence long stretches of DNA. There are many ways of sequencing DNA, but as the technology in this area changes so rapidly, no attempt will be made to describe the details other than to note that such a sequencing ability exists. Polymorphisms are discovered by resequencing the homologous pieces of DNA from multiple individuals. Resequencing reveals all the polymorphisms in the sampled individuals in the DNA region being sequenced: coding and non-coding, synonymous and non-synonymous, single nucleotide substitutions and insertions/deletions. Sometimes, special techniques are needed to score insertions and deletions. Moreover, the precise molecular nature of the mutation is identified; e.g. an A to G single nucleotide substitution, or a specific insertion/deletion polymorphism of a specified


sequence of nucleotides. The primary limitations of this genetic survey technique are labor (mostly in getting all the primer reactions to work), cost, and quality (to be discussed later). It is important to keep in mind that many resequencing studies do not produce an actual DNA sequence. DNA sequences can be directly obtained for haploid (or effectively haploid) DNA regions such as sperm, mitochondrial DNA (mtDNA), most Y-chromosomal DNA (Y-DNA), or X-chromosomal DNA scored in males in XY sex-determining species. Also, certain types of data sets allow sequence inferences, such as extensive pedigree data and admixed populations with information on the ancestral populations. The problem arises when surveying diploid genes from diploid tissues without this external information. For example, suppose an individual is heterozygous at two nucleotide sites in an autosomal region. Many DNA sequencing methods infer heterozygous sites as double nucleotide scores, say nucleotides A and T at one site and G and C at the second site. However, the sequencing technique often does not indicate if the A at site one is located on the same DNA molecule as the G or the C at site two. As a result, a double heterozygote individual such as A/T, G/C could either have the haplotype pair AG and TC (where underlining indicates the pair of polymorphic nucleotides found on a single DNA molecule) or AC and TG. Thus, many sequencing techniques when applied to diploid DNA do not yield the sequence of any single DNA molecule, but rather reveal the diploid genotype at each heterozygous site but with no phase information between sites. There are some molecular techniques that yield phased or partially phased inferences (Kuleshov et al. 2014; Chaisson et al. 2015; Shin et al. 2019; Stergachis et al. 2020; Soifer et al. 2020), but these can be expensive and labor intensive. Most phasing at the time of this writing is done using statistical and algorithmic approaches. Templeton et al. 
(1988) developed one of the first statistical phasing methods with an algorithm known as expectation-maximization (EM), a type of maximum likelihood procedure (Appendix B) to statistically phase DNA sequences or haplotypes from diploid data, and the EM algorithm is still used for this purpose. The phase of individuals that are homozygous or heterozygous at a single site is known, so the phasing problem arises with individuals who are multiple heterozygotes. Here, the lack of phased data means that their heterozygous genotypes are compatible with multiple haplotype states, only one pair of which is correct. In general, if an individual is heterozygous at n sites, there are 2^n possible haplotypes compatible with the unphased data, only two of which are true. The EM algorithm is based on wishful thinking – we do not have the data that we want, so we will pretend that we do by assuming some initial haplotypes for all multiple heterozygotes! This initial guess-work takes advantage of the known haplotypes from homozygous and single heterozygous individuals. This exercise in wishful thinking actually works in many situations, and EM is a type of iterative algorithm to obtain maximum likelihood estimators (Appendix B) of the haplotypes and their frequencies. The EM algorithm works well when the number of heterozygous sites borne by individuals is small and there are many individuals homozygous for all sites or heterozygotes at a single site in the sample. These conditions are rarely satisfied with many current data sets that extend over a large number of polymorphic sites. Stephens et al. (2001) use a Bayesian (Appendix B) algorithm, implemented in the program PHASE, to estimate phase, using a neutral coalescent to generate the prior probabilities. Basically, their prior probability distribution gives most weight to resolutions based on unambiguous haplotypes found in homozygotes or single heterozygotes (like the EM algorithm). 
For ambiguous genotypes that cannot be resolved with unambiguous haplotypes, it gives more weight to those potential haplotypes most similar to a known haplotype with high frequency (under neutral coalescent theory, most rare haplotypes are 1-step or a few-step mutational derivatives of a common haplotype because common haplotypes are more likely to be hit by a mutational event than a rare haplotype simply because there are more copies at risk for mutation). In general, because PHASE uses prior information from evolutionary theory, it does a better job than EM in resolving the most ambiguous genotypes.
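The "wishful thinking" logic of the EM approach can be sketched for the simplest case of two biallelic sites, where only double heterozygotes are ambiguous (each is either AB/ab or Ab/aB). This is a minimal illustration, not the implementation of Templeton et al. (1988); it assumes random mating so that the probability of each resolution is proportional to the product of its two haplotype frequencies, and the function name, haplotype labels, and counts are hypothetical.

```python
def em_two_site(known_counts, n_double_het, n_iter=100):
    """Estimate frequencies of haplotypes AB, Ab, aB, ab by EM.

    known_counts : dict of haplotype -> number of copies of known phase
                   (from homozygotes and single heterozygotes)
    n_double_het : number of double heterozygotes, each AB/ab or Ab/aB
    """
    haps = ("AB", "Ab", "aB", "ab")
    known = {h: float(known_counts.get(h, 0)) for h in haps}
    total = sum(known.values()) + 2 * n_double_het
    if total == 0:
        raise ValueError("no haplotypes in the sample")
    share_cis = 0.5  # initial guess: half of the double heterozygotes are AB/ab
    freqs = {}
    for _ in range(n_iter):
        # M-step: count haplotypes as if the current resolution were real data.
        counts = dict(known)
        counts["AB"] += share_cis * n_double_het
        counts["ab"] += share_cis * n_double_het
        counts["Ab"] += (1.0 - share_cis) * n_double_het
        counts["aB"] += (1.0 - share_cis) * n_double_het
        freqs = {h: counts[h] / total for h in haps}
        # E-step: update the expected share of AB/ab among double heterozygotes.
        cis = freqs["AB"] * freqs["ab"]
        trans = freqs["Ab"] * freqs["aB"]
        if cis + trans == 0.0:
            break
        share_cis = cis / (cis + trans)
    return freqs


example = em_two_site({"AB": 60, "Ab": 10, "aB": 10, "ab": 20}, n_double_het=10)
for hap, freq in example.items():
    print(hap, round(freq, 4))
```

Because the unambiguous haplotypes in this hypothetical sample are mostly AB and ab, the iterations assign most double heterozygotes to the AB/ab resolution. With many sites and few unambiguous individuals there is far less information to guide the resolution, which is the situation in which the text notes that EM performs poorly.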


Both the EM and PHASE algorithms perform well when the phase of a substantial portion of the data set is known without ambiguity. Similarly, both algorithms perform poorly when the proportion of ambiguous genotypes is high. Accordingly, hybrid procedures that use external phasing information coupled with statistical or algorithmic inference tend to do better (Delaneau et al. 2019). Algorithmic approaches have also been used, such as SplittingHeirs that uses a novel integer linear programming algorithm to infer haplotypes (Climer et al. 2010). SplittingHeirs yields the highest accuracy over EM, PHASE, and several other methods for seven sets of haplotype data for which the true phase is known and that varied in the amount of recombination. SplittingHeirs outperformed other methods over all recombination rates, but particularly in areas of moderate to high recombination. A faster and efficient but less precise algorithmic phasing can be done with the CCC measure of linkage disequilibrium because the CCC vector approach provides information about the association between specific alleles at two different sites. For unphased data, the exact gamete types borne by double heterozygotes is not known for certain, but a computationally efficient assumption is simply to give equal weight to all four gamete types that can be produced by double heterozygotes. This will bias the results, but the actual phasing is done using multiple markers and not just a single pair. As a result, there is often little impact of this assumption on the ultimate phasing. Climer et al. (2014a) coupled the CCC measure with the program BLOCBUSTER to infer multi-site allele networks that often correspond to phased haplotypes. This was the method used to phase haplotypes in the Gephyrin region of the human genome, discovering a yin-yang pair defined by 284 SNPs spanning more than 1 Mb that included the Gephyrin locus plus about 300 kb upstream and downstream from that gene (Climer et al. 2015). 
However, not all allele networks inferred from CCC and BLOCBUSTER are haplotypes (e.g. 65_1 and 65_2 in Figure 14.9), so the networks need to be examined carefully and ideally confirmed. In the Gephyrin case, both the yin and yang haplotypes were predicted to be in high frequency in different populations, and the phasing could be confirmed by homozygous individuals and heterozygotes at only a handful of sites (Figure 12.17). Interestingly, an earlier genome-wide scan (Park 2012) using scalar measures of linkage disequilibrium (LD) identified this region as having an exceptionally strong block of linkage disequilibrium but did not detect the underlying yin-yang pattern responsible for this LD block. Ironically, this yin-yang pair was just too big to be seen, as most other phasing algorithms could not handle nearly 300 unphased SNPs.

Single Nucleotide Polymorphisms (SNPs) DNA resequencing and other techniques reveal polymorphisms in the surveyed region, and a common class of polymorphisms is single nucleotides with alternative polymorphic nucleotide states (single nucleotide polymorphisms, or SNPs). This class of polymorphisms is particularly valuable in population genomics because the scoring of genotypes at single nucleotides is amenable to rapid and massive automated screening. For example, one method of automated, massive screening is through the SNP chip (pronounced “snip chip”) that is a type of DNA microarray. A microarray is usually a rectangular array of units. A small DNA sequence of known state is affixed to a unit. These known DNA molecules are designed to anneal to the DNA surrounding a SNP that is amplified from a subject. A specific unit in the array will only anneal to subject DNA that has an exact nucleotide match, and the spots that anneal are coupled with a labeling technique (usually involving fluorescence) that can be read automatically. The SNP chip is designed to result in a unique labeling pattern for each potential genotype at the polymorphic nucleotide site. A single chip can score up to a few million SNP genotypes, although smaller SNP chips are more commonly used. Many other automated methods exist or are being developed, and this is an area of rapid technological development.


Next- and Third-Generation Sequencing Much of the initial DNA sequencing was done through a technique known as Sanger sequencing that was introduced in 1977 (Sanger et al. 1977). In the twenty-first century, several new sequencing platforms were developed that could orchestrate the multiple steps of Sanger sequencing into an automated flow, enabling sequence data to be generated from tens of thousands to billions of templates simultaneously in a massively parallel fashion (McCombie et al. 2019). The three basic types of next-generation sequencing (NGS) at the time of this writing are short read technologies with about 150 bp reads with error rates in the range of 0.1–0.5%; long-read, single-molecule technologies with 10–100 kb reads or longer with error rates in the range of 10–15% but have an advantage in genome assembly, and linked-read technologies that generate short-reads from longer molecules (Lappalainen et al. 2019; McCombie et al. 2019). NGS was initially dominated by the short read technologies, so the long read technologies are sometimes called third-generation sequencing (van Dijk et al. 2018). The higher error rates associated with NGS can cause problems in population genetic inference, particularly when there is low coverage (Vieira et al. 2016). Higher coverage (resequencing the same segment of DNA multiple times) is therefore an important quality control factor in NGS. Quality issues are even more severe when dealing with sequencing DNA from museum specimens or aDNA because the DNA itself is often damaged or degraded (Billerman and Walsh 2019) and reference sequences are incomplete (Günther and Nettelblad 2019). It cannot be emphasized enough that all sequencing methods, both old and new, do not generate sequences of nucleotides as their raw output data. Depending upon the technology, the actual data output can be bands or peaks on a gel, fluorescence at a given wavelength, etc. 
These raw output data are then interpreted as nucleotides or SNPs by software, and this interpretation is often subject to error (Korneliussen et al. 2013). For example, in generating SNP genotypes from resequencing data in the study of Coventry et al. (2010), it was noticed that there was a severe deficiency of heterozygotes at many called SNPs in a human population that was expected to be in Hardy–Weinberg frequencies – and indeed was in Hardy–Weinberg when these same SNPs were scored by an alternative procedure. The original software used was obviously calling many heterozygotes as homozygotes from double peak observations. As detailed in the “Genotype calling” section of the methods section of Coventry et al. (2010), the original software was modified and cross-validated with other data, and ultimately an entirely new Bayesian calling program was developed. These modifications and the new software greatly reduced the errors found with the original software. An examination of quality (in this case via testing for Hardy–Weinberg) should be a part of any sequencing study. Another type of self-inflicted source of error is the common use of imputation. Scientists hate missing data, and missing data are commonplace in sequence studies. Imputation is a commonly used method for filling in missing data, thereby avoiding the statistical difficulties associated with missing observations (Das et al. 2018). Moreover, SNPs that were not directly scored are sometimes imputed to increase the coverage of the genome. Imputation usually starts with a reference panel of known haplotypes, although Choudhury et al. (2019) describe how imputation can be implemented through a statistical inference method for non-model organisms that lack a reference panel and reference genome. 
Phasing SNPs into haplotypes can be done more accurately in areas of the genome with little or no recombination, but phasing is more difficult and subject to more errors in areas with recombination and weak to moderate linkage disequilibrium (Climer et al. 2010). SplittingHeirs did better than other commonly used phasing programs in both areas of no to low recombination and areas of high recombination, but unfortunately most phasing for imputation has been done with other programs, which can have up to 45% of the SNPs incorrectly phased in areas of high


recombination (Climer et al. 2010). Incorrect phasing in the reference panel will lead to imputation errors, so we expect much variation across the genome in the accuracy of imputation as a function of local recombination rates. The second step in imputation is to phase the genotypes in the study data set around missing data sites. This step is also subject to the same type of phasing errors mentioned in the previous paragraph. The third step is to match the haplotypes in the reference panel to those in the study data set. This matching can also be in error or have some ambiguities. The matched haplotypes in the reference panel are then used to fill in the missing genotypes in the study data set and to infer unscored SNP sites found in the reference haplotype, so the reference panel is playing a critical role in filling in the missing data and extending genomic coverage. Serious errors can be introduced at this stage if the reference panel has a different gene pool or haplotype structure than the sample being studied. For example, in Chapter 4, we pointed out that the two X-linked loci, G6PD and color blindness, have opposite allelic phases of linkage disequilibrium and therefore haplotype frequencies in two nearby human populations – one in Sardinia and one in Southern Italy. Ideally, the reference panel is sampled from exactly the same population of inference as the sample under study. However, there are often few reference panels available, so imputation errors often increase as the population genetic distance between the reference sample and the data sample increase. For example, Chundru et al. (2019) evaluated the imputation error rate on a data set for two different reference panels that were supposed to be similar to each other and to the data sample (primarily Europeans or people of European ancestry). 
Using the 1000 Genome reference panel, they found the imputation error rate to be 3.2%, and using the Haplotype Reference Consortium reference panel, they found the error rate to be 1.6%. Whether or not these error rates would be of biological significance would depend upon the question being addressed. But we do note that these are large reference panels closely related to the sample, so these error rates were obtained under ideal circumstances for imputation. Imputation errors emerged as a major problem in the skin color/vitamin D receptor study (Tiosano et al. 2016) described in Chapter 14. Imputation was not used on the data set our group created nor on the HapMap data that we used for confirmation (imputation is used in HapMap, but it is possible to exclude imputed genotypes). We avoided imputed data because the focus of our study was allelic networks that are greatly affected by haplotype structure and linkage disequilibrium, and hence imputation could severely bias the allelic networks we were estimating. Also, our surveys included many human populations for which no closely related reference panel exist, which also greatly increases imputation error. Another possible reference panel source is the 1000 Genomes Project (http://www.1000genomes.org). When we used 1000 Genome data, we obtained results that were discrepant with both our own data and the non-imputed HapMap data. We wished to exclude imputed genotypes from the 1000 Genome data, but could not because we discovered that imputation was used extensively in creating these data and, as the 1000 Genome website at that time warned, “… we are unable to precisely identify which sites used imputation to generate their genotype.” Because of the biases and errors that imputation can cause for our type of analysis, and because of the inability to identify and exclude the imputed nucleotides, the 1000 Genome data were useless for our study. 
Since then, there has been improvement on documenting imputation, but this database still has many errors (Belsare et al. 2019). Researchers using any DNA database, either their own or a public database, need to carefully examine that database and its possible errors with respect to the goals of their study. We concluded that the 1000 Genomes database was actively misleading for the goals of our skin color/VDR study, but this database could be useful for other purposes.
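The Hardy–Weinberg quality check recommended earlier in this section can be sketched as a simple chi-square goodness-of-fit test at a single biallelic SNP. This is a standard textbook test chosen here for illustration and is not claimed to be the procedure used by Coventry et al. (2010); the function name and genotype counts are hypothetical.

```python
import math


def hardy_weinberg_check(n_AA, n_Aa, n_aa):
    """Chi-square test (1 degree of freedom) for Hardy-Weinberg proportions.

    Returns the chi-square statistic, its p-value, and the relative
    heterozygote deficit 1 - observed/expected heterozygotes.
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)   # estimated frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square with 1 df, via the normal distribution.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    het_deficit = 1.0 - n_Aa / expected[1] if expected[1] > 0 else 0.0
    return chi2, p_value, het_deficit


chi2, p_value, het_deficit = hardy_weinberg_check(300, 400, 300)
print(f"chi-square={chi2:.1f}, p-value={p_value:.2e}, deficit={het_deficit:.2f}")
```

A strong heterozygote deficit repeated across many called SNPs, in a population otherwise expected to be in Hardy–Weinberg proportions, suggests that the calling software is scoring heterozygotes as homozygotes rather than revealing a biological signal.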


RAD Sequencing NGS platforms can be coupled with other genetic survey techniques to create hybrid methods of surveying genetic variation in populations. One method of this type is restriction site-associated DNA sequencing or RADseq (Andrews et al. 2016). RADseq is one of several methods for performing a reduced-representation sequence that only targets a subset of the genome. RADseq has been particularly popular in population genetic studies of non-model organisms because it does not require any prior genomic information for the species being studied. There are several related methods for executing RADseq, but all use restriction enzymes to obtain DNA sequence data at both coding and non-coding positions scattered throughout the entire genome. The sample DNA is first digested with one or more restriction enzymes. The resulting fragments are reduced in length to the appropriate length for sequencing by mechanical shearing, by double digests, by PCR preferentially amplifying short fragments, or by a size-selection step. Sequencing adaptors (double-stranded oligonucleotides) are added that are required by NGS platforms. Most RADseq techniques produce sequence read lengths between 30 and 300 bp, although sometimes longer. Genotyping performed by sequencing allows SNP discovery. With some techniques, one can also identify phased SNPs and haplotypes in sequences 400–800 bp long (Rochette et al. 2019). Further filtering after genotyping is usually performed to remove loci or samples with large proportions of missing data, and the extent and type of filtering can affect the ultimate analytical results. Researchers must put much thought into these filtering parameters and the other steps with respect to the goals of their studies. Polymorphisms in restriction sites can cause errors, as well as the genotyping errors associated with NGS. 
Indeed, RADseq often results in a high rate of erroneous genotyping calls, particularly calling heterozygotes as homozygotes, so genotyping error rates should be assessed and adjustments made (Bresadola et al. 2020). These errors can bias population genetic statistics, such as underestimating genomic diversity, overestimating fst, and increasing false positives and false negatives in fst outlier analyses. Filtering can reduce these biases. RADseq is generally an efficient and cost-effective way of obtaining thousands of SNPs even in non-model species. This is usually not adequate for GWAS or comprehensive genome-wide scans for natural selection (Lowry et al. 2017) but is useful for studies on population structure and history, levels of genetic variation, and genetic relatedness between individuals.


Appendix B Probability and Statistics Population genetics deals with sampling genetic variation from populations, and, hence, the fundamental observations gathered in population genetics are subject to sampling error. Moreover, many aspects of the evolutionary process itself are subject to random processes, such as mutation (Chapter 1) and genetic drift (Chapters 4 and 5). Hence, random factors and uncertainty often need to be incorporated even into theoretical population genetics. Randomness in both biological processes and sampling error can be described by probability measures. Therefore, this appendix begins with a brief overview of probability theory. Population geneticists also use their data and models to make inferences about the evolutionary process, and this requires the use of statistics – functions of data that represent realizations of random processes. Therefore, this appendix will also provide a brief outline of some essential statistical concepts and tools used in population genetics.

Probability A probability is a measure assigned to an event that is not certain to occur. There are many definitions of exactly what a probability measures, but the most common and straightforward interpretation is that the probability of an event represents the frequency with which the event would occur in a large number of independent trials. Probabilities have some well-defined mathematical properties that can be used to combine simple events into more complex events. Consider two events, called A and B. These events are said to be independent if

$$P(A \text{ and } B) = P(A \cap B) = P(A)\,P(B) \tag{B.1}$$

where P(A and B) is the probability that events A and B are simultaneously true. Two events are said to be mutually exclusive if

$$P(A \text{ and } B) = 0 \quad \text{and} \quad P(A \text{ or } B) = P(A \cup B) = P(A) + P(B) \tag{B.2}$$

A set of events A_i is said to be mutually exclusive and exhaustive if each pair of events satisfies Eq. (B.2) and

$$P(A_1 \text{ or } A_2 \text{ or } A_3 \text{ or} \ldots A_n) = P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i) = 1 \tag{B.3}$$

where n is the number of events in the set (either finite or infinite).

Population Genetics and Microevolutionary Theory, Second Edition. Alan R. Templeton. © 2021 John Wiley & Sons, Inc. Published 2021 by John Wiley & Sons, Inc. Companion website: www.wiley.com/go/templeton/populationgenetics2


Now, consider the probability that event A occurs given that event B is known to have occurred. Then,

$$P(A \text{ given } B) = P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B \mid A)\,P(A)}{P(B)} \tag{B.4}$$

P(A|B) is known as the conditional probability of A given B. Equation (B.4) is known as Bayes’ Theorem. Consider the special case of Bayes’ Theorem in which the two events are independent. In that case, Eq. (B.1) can be substituted into the numerator of Eq. (B.4) to yield:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)\,P(B)}{P(B)} = P(A) \tag{B.5}$$

that is, when two events are independent, the probability of one event is not affected at all by the occurrence of the other event.
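Equations (B.1), (B.4), and (B.5) can be verified by brute-force enumeration on a small sample space. The sketch below uses two fair dice, an example that is not in the text; the event definitions and helper names are hypothetical and chosen only so that one pair of events is independent and another is not.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event as the fraction of outcomes satisfying it."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def cond(event, given):
    """Conditional probability P(event | given), first equality of Eq. (B.4)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)


def first_even(o):       # event A
    return o[0] % 2 == 0


def sum_is_seven(o):     # event B, independent of A
    return o[0] + o[1] == 7


def sum_at_least_ten(o):  # event C, not independent of A
    return o[0] + o[1] >= 10


# Eq. (B.5): conditioning on an independent event changes nothing.
print(prob(first_even), cond(first_even, sum_is_seven))
# Eq. (B.4): both forms of Bayes' Theorem agree for dependent events.
print(cond(first_even, sum_at_least_ten),
      cond(sum_at_least_ten, first_even) * prob(first_even) / prob(sum_at_least_ten))
```

Exact fractions are used rather than floating-point numbers so that the equalities hold exactly instead of only to rounding error.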

Random Variables, Probability Distributions, and Expectation A random variable is simply a number that is assigned to an event. In many cases, a number naturally describes the event (such as an allele frequency in models of genetic drift, as in Chapter 4), but even in the case of discrete, qualitative events, it is always possible to assign a numerical value to describe the different event outcomes, as is the case in Box 3.1 in which a value of 1 was assigned to gametes bearing an A allele and a value of 0 was assigned to gametes bearing an a allele. A probability distribution is the set of probabilities assigned to a mutually exclusive and exhaustive set of random variables. In many cases, it is possible to describe the entire probability distribution as a closed mathematical function of the random variable (say x) given a set of one or more parameters (say ω). Such a closed form is symbolized by f(x|ω) in this appendix. If the random variable is discrete, f(x|ω) gives the probability of the event described by the random variable taking on the value x. If the random variable is continuous, then f(x|ω)dx is the probability of the event that the random variable takes on a value in the interval between x and x + dx as dx approaches zero. In general, for a continuous random variable,

$$\int_{X_1}^{X_2} f(x \mid \omega)\,dx \tag{B.6}$$

is the probability of the event that the random variable takes on a value in the interval between X1 and X2. By combining Eqs. (B.2) and (B.3), probability distributions always satisfy the following constraint:

$$\begin{aligned} \sum_{x} f(x \mid \omega) &= 1 && \text{for a discrete random variable} \\ \int_{x} f(x \mid \omega)\,dx &= 1 && \text{for a continuous random variable} \end{aligned} \tag{B.7}$$

where the summation or integration is over all possible values of the random variable x. This means that one of the possible values of the random variable will certainly occur (a probability of one).


Because random variables are numbers, they can be manipulated by other mathematical functions. Let g(x) be some function of the random variable x. Then, the expectation of g(x) is

$$\begin{aligned} E[g(x)] &= \sum_{x} g(x)\,f(x \mid \omega) && \text{for a discrete random variable} \\ E[g(x)] &= \int_{x} g(x)\,f(x \mid \omega)\,dx && \text{for a continuous random variable} \end{aligned} \tag{B.8}$$

where the summation or integration is over all possible values of the random variable x. The expectation of g(x) represents the average value of g(x) over all possibilities of x weighted by the probability of x. Two special cases of expectations are frequently used to characterize probability distributions. The first is called the mean, and it is the expectation of g(x) = x and is symbolized by μ:

$$\begin{aligned} \mu = E[x] &= \sum_{x} x\,f(x \mid \omega) && \text{for a discrete random variable} \\ \mu = E[x] &= \int_{x} x\,f(x \mid \omega)\,dx && \text{for a continuous random variable} \end{aligned} \tag{B.9}$$

where the summation or integration is over all possible values of the random variable x. The mean μ is also called the first moment of the probability distribution. The second special case is called the variance, and it is the expectation of g(x) = (x − μ)² and is symbolized by σ²:

$$\begin{aligned} \sigma^2 = E\left[(x-\mu)^2\right] &= \sum_{x} (x-\mu)^2 f(x \mid \omega) && \text{for a discrete random variable} \\ \sigma^2 = E\left[(x-\mu)^2\right] &= \int_{x} (x-\mu)^2 f(x \mid \omega)\,dx && \text{for a continuous random variable} \end{aligned} \tag{B.10}$$

where the summation or integration is over all possible values of the random variable x. The variance σ² is also called the second central moment (or second moment around the mean) of the probability distribution. Note that the mean is the average value of the random variable itself, whereas the variance is the average value of the squared deviation of the random variable from its own average value. The mean measures the central tendency of the probability distribution, whereas the variance measures the tendency of the random variable to deviate from the mean. There are a large number of probability distributions, but only a few are widely used in population genetics and in commonly used statistics. These basic probability distributions are described below, starting with discrete random variables and then moving on to continuous random variables.
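The discrete forms of Eqs. (B.7), (B.9), and (B.10) translate directly into a few lines of code. The sketch below applies them to the gamete coding mentioned earlier (1 for a gamete bearing the A allele, 0 for the a allele); the allele frequency of 0.3 and the helper name are hypothetical choices for illustration.

```python
def expectation(g, dist):
    """Discrete expectation E[g(x)]: sum of g(x) weighted by f(x)."""
    return sum(g(x) * fx for x, fx in dist.items())


p = 0.3                    # hypothetical frequency of the A allele
dist = {1: p, 0: 1.0 - p}  # random variable x: 1 for an A gamete, 0 for an a gamete

total = sum(dist.values())                          # Eq. (B.7): must equal 1
mu = expectation(lambda x: x, dist)                 # Eq. (B.9): the mean
var = expectation(lambda x: (x - mu) ** 2, dist)    # Eq. (B.10): the variance

print(total, mu, var)
```

For this two-valued random variable the mean equals p and the variance equals p(1 − p), which is the n = 1 case of the binomial mean and variance.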

Discrete Probability Distributions Hypergeometric Distribution

Suppose there are N objects divided into two types, with a of the objects being of type 1, and b of the objects of type 2. Note that a + b = N. Now, suppose that n objects are drawn from this population of N objects without replacement, that is, if one draws an object of type 1 on the first draw, then there are only a − 1 objects of type 1 left on the second draw. Now, let the random variable be the number of objects of type 1 in the sample of n objects. The probability distribution that describes this situation is called the hypergeometric distribution and has the form:

$$f(x \mid a, b, n) = \frac{\dbinom{a}{x}\dbinom{b}{n-x}}{\dbinom{a+b}{n}} \tag{B.11}$$

where

$$\binom{c}{d} = \frac{c!}{d!\,(c-d)!}, \qquad y! = y(y-1)(y-2)\cdots 1$$

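Note that a, b, and n are the parameters of this probability distribution. The random variable can take on any integer value between 0 and the minimum of n and a, inclusively (when n exceeds b, the smallest possible value is n − b rather than 0). The mean and variance of the hypergeometric are:

$$
\begin{aligned}
\mu &= n\,\frac{a}{a+b} = np && \text{where } p = \frac{a}{a+b} \\
\sigma^2 &= \frac{a+b-n}{a+b-1}\,npq && \text{where } q = 1-p
\end{aligned}
\tag{B.12}
$$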
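As a check on the hypergeometric form and its moments, the following sketch evaluates the probability function with exact rational arithmetic and compares the summed mean and variance with the closed-form expressions. The function name `hypergeom_pmf` and the parameter values a = 6, b = 14, n = 5 are hypothetical choices made only for this illustration.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, a, b, n):
    """f(x | a, b, n) = C(a, x) * C(b, n - x) / C(a + b, n)."""
    if x < 0 or x > n:
        return Fraction(0)
    return Fraction(comb(a, x) * comb(b, n - x), comb(a + b, n))

# Hypothetical parameter values, chosen only for illustration.
a, b, n = 6, 14, 5
support = range(0, min(n, a) + 1)

total = sum(hypergeom_pmf(x, a, b, n) for x in support)
mu = sum(x * hypergeom_pmf(x, a, b, n) for x in support)
var = sum((x - mu) ** 2 * hypergeom_pmf(x, a, b, n) for x in support)

p = Fraction(a, a + b)
q = 1 - p
print(total == 1)                                            # probabilities sum to one
print(mu == n * p)                                           # mean equals np
print(var == Fraction(a + b - n, a + b - 1) * n * p * q)     # variance formula
```

Because `Fraction` arithmetic is exact, the comparisons test the identities themselves rather than floating-point approximations to them.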

Binomial Distribution

If the size of the population being sampled (N) is much larger than the sample size (n) (formally, in the limit as N/n → ∞ with p = a/(a + b) held fixed), the hypergeometric distribution converges to the form:

$$
f(x \mid p, n) = \binom{n}{x} p^x q^{n-x} \tag{B.13}
$$

where the random variable can take on any integer value between 0 and n inclusively. The parameters of this distribution are n and p, and the mean and variance are given by:

$$
\mu = np, \qquad \sigma^2 = npq \tag{B.14}
$$
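The convergence of the hypergeometric to the binomial can be seen numerically. The sketch below holds the sample size and p fixed while the population size N grows, and records the largest absolute difference between the two probability functions. The function names and the values n = 10, p = 0.3, and the list of population sizes are assumptions chosen for illustration.

```python
from math import comb

def hypergeom_pmf(x, a, b, n):
    """Hypergeometric probability of x type-1 objects in a sample of n."""
    return comb(a, x) * comb(b, n - x) / comb(a + b, n)

def binom_pmf(x, p, n):
    """Binomial probability of x type-1 objects in a sample of n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Hypothetical values: fixed sample size and type-1 proportion.
n, p = 10, 0.3

def max_abs_diff(N):
    """Largest pointwise gap between the two distributions for population size N."""
    a = round(p * N)   # number of type-1 objects
    b = N - a          # number of type-2 objects
    return max(abs(hypergeom_pmf(x, a, b, n) - binom_pmf(x, p, n))
               for x in range(n + 1))

diffs = [max_abs_diff(N) for N in (20, 200, 2000, 20000)]
print(diffs)  # the gap shrinks as N/n grows
```

With N only twice the sample size, sampling without replacement matters a great deal; by N = 20000 the two distributions are nearly indistinguishable for this sample size.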

Poisson Distribution

A Poisson distribution arises from the binomial when the sample size of the binomial becomes very large (formally, n → ∞) but the probability of drawing an object of type 1 becomes very small (p → 0) with their product constant (np = λ). Under these limiting conditions, the binomial takes on the Poisson form:

$$
f(x \mid \lambda) = \frac{\lambda^x e^{-\lambda}}{x!} \tag{B.15}
$$

where the random variable x can take on any integer value between 0 and infinity. The Poisson has only a single parameter, λ, and its mean and variance are:

$$
\mu = \lambda, \qquad \sigma^2 = \lambda \tag{B.16}
$$
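The Poisson limit of the binomial can also be examined numerically. In this sketch, λ is held fixed while n increases and p = λ/n decreases. The function names and the value λ = 2 are hypothetical, and the comparison is restricted to x from 0 to 10 for convenience.

```python
from math import comb, exp, factorial

def binom_pmf(x, p, n):
    """Binomial probability of x successes in n draws."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Poisson probability of x events with parameter lam."""
    return lam**x * exp(-lam) / factorial(x)

lam = 2.0  # hypothetical value of the product np
diffs = []
for n in (10, 100, 1000, 10000):
    p = lam / n  # p shrinks as n grows, keeping np = lam
    diffs.append(max(abs(binom_pmf(x, p, n) - poisson_pmf(x, lam))
                     for x in range(0, 11)))
print(diffs)  # the gap shrinks as n grows with np fixed

# Mean and variance from a truncated sum; the tail beyond x = 60 is negligible here.
xs = range(0, 61)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
var = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)
print(mean, var)  # both close to lam
```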


Negative Binomial Distribution

The form of this distribution is:

$$
f(x \mid P, n) = \binom{n+x-1}{n-1} \left(\frac{P}{Q}\right)^{x} \left(1 - \frac{P}{Q}\right)^{n} \quad \text{where } Q = P + 1 \tag{B.17}
$$

where the random variable can take on any integer value between 0 and infinity. This distribution also yields the Poisson distribution in the limit as n → ∞, P → 0, and nP = λ. The mean and variance of the negative binomial are:

$$
\mu = nP, \qquad \sigma^2 = nPQ \tag{B.18}
$$
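The negative binomial form and its moments can be checked with a short numerical sketch. It assumes that n is a positive integer, so that the binomial coefficient can be computed with `math.comb`, and it truncates the infinite sum at a point where the remaining tail is negligible. The function name `negbinom_pmf` and the values P = 1.5, n = 4 are hypothetical.

```python
from math import comb

def negbinom_pmf(x, P, n):
    """f(x | P, n) = C(n + x - 1, n - 1) * (P/Q)^x * (1 - P/Q)^n, with Q = P + 1."""
    Q = P + 1
    return comb(n + x - 1, n - 1) * (P / Q) ** x * (1 - P / Q) ** n

# Hypothetical parameter values; n is assumed to be a positive integer here.
P, n = 1.5, 4
Q = P + 1

# Truncate the infinite support; with P/Q = 0.6 the tail beyond x = 400 is negligible.
xs = range(0, 400)
total = sum(negbinom_pmf(x, P, n) for x in xs)
mean = sum(x * negbinom_pmf(x, P, n) for x in xs)
var = sum((x - mean) ** 2 * negbinom_pmf(x, P, n) for x in xs)

print(total)            # close to 1
print(mean, n * P)      # summed mean versus nP
print(var, n * P * Q)   # summed variance versus nPQ
```

Since Q = P + 1 is greater than 1, the variance nPQ always exceeds the mean nP, in contrast to the binomial, whose variance npq is always below its mean.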

The four discrete distributions discussed above span increasing amounts of variance for a given mean. To see this, note that all the expressions for the variance of these distributions include the mean. Now, substitute μ for the parameters that define the mean within the equations for the variances to yield the relative rankings of the variances for a given mean as:

$\frac{a+b-n}{a+b-1}(1-p)\mu < (1-p)\mu$