English Pages 816 Year 2014
LELAND H. HARTWELL
MICHAEL L. GOLDBERG
JANICE A. FISCHER
LEROY HOOD
CHARLES F. AQUADRO
Chapter 1: Genetics - The Study of Biological Information
Chapter 2: Mendel's Principles of Heredity
Chapter 3: Extensions to Mendel's Laws
Chapter 4: The Chromosome Theory of Inheritance
Chapter 5: Linkage, Recombination, and the Mapping of Genes on Chromosomes
Chapter 6: DNA Structure, Replication, and Recombination
Chapter 7: Anatomy and Function of a Gene: Dissection Through Mutation
Chapter 8: Gene Expression: The Flow of Information from DNA to RNA to Protein
Chapter 9: Digital Analysis of Genomes
Chapter 10: Analyzing Genomic Variation
Chapter 11: The Eukaryotic Chromosome
Chapter 12: Chromosomal Rearrangements and Changes in Chromosome Number
Chapter 13: Bacterial Genetics
Chapter 14: Organellar Inheritance
Chapter 15: Gene Regulation in Prokaryotes
Chapter 16: Gene Regulation in Eukaryotes
Chapter 17: Manipulating the Genomes of Eukaryotes
Chapter 18: The Genetic Analysis of Development
Chapter 19: The Genetics of Cancer
Chapter 20: Variation and Selection in Populations
Chapter 21: Genetics of Complex Traits
Guidelines for Gene Nomenclature
Brief Answer Section
Glossary
Index
Dr. Leland Hartwell is President and Director of Seattle's Fred Hutchinson Cancer Research Center and Professor of Genome Sciences at the University of Washington.
Dr. Hartwell's primary research contributions were in identifying genes that control cell division in yeast, including those necessary for the division process as well as those necessary for the fidelity of genome reproduction. Subsequently, many of these same genes have been found to control cell division in humans and often to be the site of alteration in cancer cells. Dr. Hartwell is a member of the National Academy of Sciences and has received the Albert Lasker Basic Medical Research Award, the Gairdner Foundation International Award, the Genetics Society Medal, and the 2001 Nobel Prize in Physiology or Medicine.
Dr. Michael Goldberg is a professor at Cornell University, where he teaches introductory genetics and human genetics. He was an undergraduate at Yale University and received his Ph.D. in biochemistry from Stanford University. Dr. Goldberg performed postdoctoral research at the Biozentrum of the University of Basel (Switzerland) and at Harvard University, and he received an NIH Fogarty Senior International Fellowship for study at Imperial College (England) and fellowships from the Fondazione Cenci Bolognetti for sabbatical work at the University of Rome (Italy). His current research uses the tools of Drosophila genetics and the biochemical analysis of frog egg cell extracts to investigate the mechanisms that ensure proper cell cycle progression and chromosome segregation during mitosis and meiosis.
Dr. Janice Fischer is a Professor at The University of Texas at Austin, where she is an award-winning teacher of genetics and Director of the Biology Instructional Office. She received her Ph.D. in biochemistry and molecular biology from Harvard University, and did postdoctoral research at The University of California at Berkeley and The Whitehead Institute at MIT. In her current research, Dr. Fischer uses Drosophila to examine the roles of ubiquitin and endocytosis in cell signaling during development.
Dr. Lee Hood received an M.D. from the Johns Hopkins Medical School and a Ph.D. in biochemistry from the California Institute of Technology. His research interests include immunology, cancer biology, development, and the development of biological instrumentation (for example, the protein sequencer and the automated fluorescent DNA sequencer). His early research played a key role in unraveling the mysteries of antibody diversity. More recently he has pioneered systems approaches to biology and medicine. Dr. Hood has taught molecular evolution, immunology, molecular biology, genomics, and biochemistry and has co-authored textbooks in biochemistry, molecular biology, and immunology, as well as The Code of Codes, a monograph about the Human Genome Project. He was one of the first advocates for the Human Genome Project and directed one of the federal genome centers that sequenced the human genome. Dr. Hood is currently the president (and co-founder) of the cross-disciplinary Institute for Systems Biology in Seattle, Washington.
About the Authors
Dr. Hood has received a variety of awards, including the Albert Lasker Award for Medical Research (1987), the Distinguished Service Award from the National Association of Teachers (1998), and the Lemelson/MIT Award for Invention (2003). He is the 2002 recipient of the Kyoto Prize in Advanced Biotechnology, an award recognizing his pioneering work in developing the protein and DNA synthesizers and sequencers that provide the technical foundation of modern biology. He is deeply involved in K-12 science education. His hobbies include running, mountain climbing, and reading.
Dr. Charles Aquadro (Chip) is Professor of Population Genetics, the Charles A. Alexander Professor of Biological Sciences, and Director of the Center for Comparative and Population Genomics at Cornell University. He obtained his Ph.D. in genetics from the University of Georgia, was a postdoc at the National Institute for Environmental Health Sciences/NIH, and joined the faculty at Cornell University in 1985, where he is now a professor. He has served as President of the Society of Molecular Biology and Evolution, is an elected Fellow of the AAAS, is a member of the Scientific Advisory Board for National Geographic Society's Genographic Project, was a member of the Scientific Advisory Board for the WGBH/NOVA TV series "Evolution," and has been a visiting scholar at Cambridge University (England, 1993) and Harvard University (2007). His research and teaching focus on molecular population genetics, molecular evolution, and comparative genomics. While Drosophila is his primary research system, recent work has also involved yeast, humans, and plants. At Cornell, he teaches a university-wide course to nonmajors on personal genomics and medicine, and a majors' course in population genetics.
Digital Author
In today's world of learning through technology, it is important that the content of the text be mirrored and delivered in a digital format that leads to classroom success for students while simultaneously delivering individual and classroom progress information to instructors. Enter the digital author. With this fifth edition we are pleased to add Professor Bruce Bejcek from Western Michigan University to the Hartwell Genetics team.
Dr. Bruce Bejcek received his Ph.D. from St. Louis University. After postdoctoral fellowships at the Jewish Hospital of St. Louis and the University of Minnesota, he joined the faculty at Western Michigan University in the Department of Biological Sciences. Currently a professor, his research interests have focused on the establishment and maintenance of tumors, particularly those that involve the expression of platelet-derived growth factor and, more recently, herpes simplex virus. His research also includes the discovery of anti-tumor compounds from plants native to Michigan. He has taught a variety of courses including cancer biology, cell biology, and general genetics.
Contributors
Genetics research tends to proceed down highly specialized paths. A number of experts in specific areas generously provided information in their areas of expertise. We thank them for their contributions to this edition of our text.
Jody Larson, Instructional Designer, Textbook Development
1 Genetics: The Study of Biological Information

Basic Principles: How Traits Are Transmitted
2 Mendel's Principles of Heredity
3 Extensions to Mendel's Laws
4 The Chromosome Theory of Inheritance
5 Linkage, Recombination, and the Mapping of Genes on Chromosomes

What Genes Are and What They Do
6 DNA Structure, Replication, and Recombination
7 Anatomy and Function of a Gene: Dissection Through Mutation
8 Gene Expression: The Flow of Information from DNA to RNA to Protein

Analysis of Genetic Information
9 Digital Analysis of Genomes
10 Analyzing Genomic Variation

How Genes Travel on Chromosomes
11 The Eukaryotic Chromosome
12 Chromosomal Rearrangements and Changes in Chromosome Number
13 Bacterial Genetics
14 Organellar Inheritance

How Genes Are Regulated
15 Gene Regulation in Prokaryotes
16 Gene Regulation in Eukaryotes

Using Genetics
17 Manipulating the Genomes of Eukaryotes
18 The Genetic Analysis of Development
19 The Genetics of Cancer

Beyond the Individual Gene and Genome
20 Variation and Selection in Populations
21 Genetics of Complex Traits
About the Authors
Preface
Acknowledgements
Introduction: Genetics in the Twenty-First Century

Chapter 1 Genetics: The Study of Biological Information
1.1 DNA: The Fundamental Information Molecule of Life
1.2 Proteins: The Functional Molecules of Life Processes
1.3 Molecular Similarities of All Life-Forms
1.4 The Modular Construction of Genomes
1.5 Modern Genetic Techniques
1.6 Human Genetics and Society
Genetics and Society: Disease Prevention Versus the Right to Privacy

Chapter 4 The Chromosome Theory of Inheritance
4.1 Chromosomes: The Carriers of Genes
4.2 Sex Chromosomes and Sex Determination
4.3 Mitosis: Cell Division That Preserves Chromosome Number
4.4 Meiosis: Cell Divisions That Halve Chromosome Number
4.5 Gametogenesis
4.6 Validation of the Chromosome Theory
Powered by an intelligent diagnostic and adaptive engine, SmartBook facilitates the reading process by identifying what content a student knows and doesn't know through adaptive assessments.
The reports in SmartBook help identify topics where you need more work.
The SmartBook experience starts by previewing key concepts from the chapter and ensuring that you understand the big ideas.
SmartBook asks you questions that identify gaps in your knowledge. The reading experience then continuously adapts in response to the assessments, highlighting the material you need to review based on what you don't know.
Online Teaching and Learning Resources
at all, as in Fig. 2.23, but in other cases have a superscript other than + that signifies a particular type of abnormal allele. No effective treatment yet exists for Huntington disease, and because of its late onset, there was until the 1980s no way for children of a Huntington parent to know before middle age (usually until well after their own childbearing years) whether they carried the Huntington disease allele (HD). Most people with the disease allele are HD HD+ heterozygotes, so their children would have a 50% probability of inheriting HD and, before they are diagnosed, a 25% probability of passing the defective allele on to one of their children.
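The 50% and 25% figures follow from multiplying one-half for each generation of transmission. A minimal Python sketch of that arithmetic is given here; the function name `risk_after_generations` is hypothetical, and the sketch assumes that the other parent in each generation is an HD+ HD+ homozygote and that no intermediate relative has been tested.

```python
from fractions import Fraction

# An HD HD+ heterozygote places HD in half of its gametes.
P_TRANSMIT = Fraction(1, 2)

def risk_after_generations(generations: int) -> Fraction:
    """Chance that a descendant this many generations below a known
    HD HD+ heterozygote carries HD (assumes every partner is HD+ HD+
    and nobody in between has been tested)."""
    return P_TRANSMIT ** generations

child_risk = risk_after_generations(1)       # child of the affected parent
grandchild_risk = risk_after_generations(2)  # that child's own child

print(child_risk, grandchild_risk)
```

Once the at-risk child is tested, the grandchild's risk is no longer one-quarter: it becomes one-half if the child carries HD and zero if not.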
In the mid-1980s, with new knowledge of the gene, molecular geneticists developed a DNA test that determines whether an individual carries the HD allele. (This test will be explained in detail in Chapter 10.) Because of the lack of effective treatment for the disease, some young adults whose parents died of Huntington disease prefer not to be tested so that they will not prematurely learn their own fate. However, other at-risk individuals employ the test for the HD allele to guide their decisions about having children. If someone whose parent had Huntington disease does not have HD, he or she has no chance of developing the disease or of transmitting it to offspring. If the test shows the presence of HD, the at-risk person and his or her partner might choose to conceive a child, obtain a prenatal diagnosis of the fetus, and then, depending on their beliefs, elect an abortion if the fetus is affected. The Genetics and Society box "Developing Guidelines for Genetic Screening" on the next page discusses significant social and ethical issues raised by information obtained from family pedigrees and molecular tests.
2.3 Mendelian Inheritance in Humans
GENETICS AND SOCIETY
Developing Guidelines for Genetic Screening

In the early 1970s, the United States launched a national screening program for carriers of sickle-cell anemia, a recessive genetic disease that afflicts roughly 1 in 600 African-Americans. The disease is caused by a particular allele, called HbβS, of the β-globin gene; the dominant normal allele is HbβA. The protein determined by the β-globin gene is one component of the oxygen-carrying hemoglobin molecule. HbβS HbβS homozygotes have sickle-shaped red blood cells; these patients suffer a decrease in oxygen supply, tire easily, and often develop heart failure from stress on the circulatory system.

The national screening program for sickle-cell anemia was based on a simple test of hemoglobin mobility: Normal and "sickling" hemoglobins move at different rates in a gel. People who participated in the screening program and found they were carriers could use the test results to make informed reproductive decisions. A healthy man, for example, who learned he was a carrier (that is, that he was a HbβS HbβA heterozygote), would not have to worry about having an affected child if his mate was a noncarrier.

The original sickle-cell screening program, based on detection of the abnormal hemoglobin protein, was not an unqualified success, largely because of insufficient educational follow-through. Many who learned they were carriers mistakenly thought they had the disease. Moreover, employers and insurance companies that obtained access to the information denied jobs or health insurance to some heterozygotes for no acceptable reason. Problems of public relations and education thus made a reliable screening test into a source of dissent and alienation.

Today, at-risk families may be screened for a growing number of genetic disorders, thanks to the ability to evaluate genotypes directly. The need to establish guidelines for genetic screening thus becomes more and more pressing. Several related questions reveal the complexity of the issue.

• Why carry out genetic screening at all? The first reason for screening is to obtain information that will benefit individuals. For example, if you learn at an early age that you have a genetic predisposition to heart disease, you can change your lifestyle to improve your chances of staying healthy. You can also use the results from genetic screening to make informed reproductive decisions. The second reason for genetic screening, which often conflicts with the first, is to benefit groups within society. Insurance companies and employers, for example, would like to know who is at risk for various genetic conditions.
• Should I be screened if a test is available? For most inherited diseases, no cures currently exist. The psychological burden of anticipating a fatal late-onset disease for which there is no treatment could be devastating, and therefore some people might decide not to be tested. Others may object to testing for religious reasons, or because of confidentiality concerns.
• If a screening program is established, who should be tested? The answer depends on what the test is trying to accomplish as well as on its expense. Ultimately, the cost of a procedure must be weighed against the usefulness of the data it provides. In the United States, for example, only one-tenth as many African-Americans as Caucasians are affected by cystic fibrosis, and Asians almost never exhibit the disease. Should all racial groups be tested for cystic fibrosis, or only Caucasians?
• Should private employers and insurance companies be allowed to test their clients and employees? Some employers advocate genetic screening to reduce the incidence of occupational disease, arguing that they can use genetic test results to make sure employees are not assigned to environments that might cause them harm. Critics of this position say that screening violates workers' rights, including the right to privacy, increases racial and ethnic discrimination in the workplace, and provides insurers with an excuse to deny coverage. In 2008, President George W. Bush signed into law the Genetic Information Nondiscrimination Act, which prohibits insurance companies and employers in the United States from discriminating on the basis of information derived from genetic tests.
• Finally, how should people be educated about the meaning of test results? In one small-community screening program, people identified as carriers of the recessive, life-threatening blood disorder known as β-thalassemia were ostracized; as a result, carriers ended up marrying one another, only making medical matters worse. By contrast, in Ferrara, Italy, where 30 new cases of β-thalassemia had been reported every year, extensive screening combined with education was so successful that the 1980s passed with only a few new cases of the disease.

Given all of these considerations, what kind of guidelines would you like to see established to ensure that genetic screening reaches the right people at the right time, and that information gained from such screening is used for the right purposes?

A Horizontal Pattern of Inheritance Indicates a Rare Recessive Trait
Unlike Huntington disease, most confirmed single-gene diseases in humans are caused by recessive alleles. One reason is that, with the exception of late-onset traits, deleterious dominant alleles are unlikely to be transmitted to the next generation. For example, if people affected with Huntington disease all died by the age of 10, the disease would disappear from the population. In contrast, individuals can carry one allele for a recessive disease without ever being affected by any symptoms.

Figure 2.24 shows three pedigrees for cystic fibrosis (CF), the most commonly inherited recessive disease among Caucasian children in the United States. A double dose of the recessive CF allele causes a fatal disorder in which the lungs, pancreas, and other organs become clogged with a thick, viscous mucus that can interfere with breathing and digestion. One in every 2000 white Americans is born with cystic fibrosis, and only 10% of them survive into their 30s.

There are two salient features of the CF pedigrees. First, the family pattern of people showing the trait is often horizontal: The parents, grandparents, and great-grandparents of children born with CF do not themselves manifest the disease, while several brothers and sisters in a single generation may. A horizontal pedigree pattern is a strong indication that the trait is recessive. The unaffected parents are heterozygous carriers: They bear a dominant normal allele that masks the effects of the recessive abnormal one. An estimated 12 million Americans are carriers of a recessive CF allele. Table 2.2 summarizes some of the clues found in pedigrees that can help you decide whether a trait is caused by a dominant or a recessive allele.

Figure 2.24 Cystic fibrosis: A recessive condition. In (a), the two affected individuals (VI-4 and VII-1) are CF CF; that is, homozygotes for the recessive disease allele. Their unaffected parents must be carriers, so V-1, V-2, VI-1, and VI-2 must all be CF CF+. Individuals II-2, II-3, III-2, III-4, IV-2, and IV-4 are probably also carriers. We cannot determine which of the founders (I-1 or I-2) was a carrier, so we designate their genotypes as CF+ –. Because the CF allele is relatively rare, it is likely that II-1, II-4, III-1, III-3, IV-1, and IV-3 are CF+ CF+ homozygotes. The genotype of the remaining unaffected people (VI-3, VI-5, and VII-2) is uncertain (CF+ –). (b and c) These two families demonstrate horizontal patterns of inheritance. Without further information, the unaffected children in each pedigree must be regarded as having a CF+ – genotype.

TABLE 2.2 How to Recognize Dominant and Recessive Traits in Pedigrees
Dominant Traits
1. Affected children always have at least one affected parent.
2. As a result, dominant traits show a vertical pattern of inheritance: The trait shows up in every generation.
3. Two affected parents can produce unaffected children, if both parents are heterozygotes.
Recessive Traits
1. Affected individuals can be the children of two unaffected carriers, particularly as a result of consanguineous matings.
2. All the children of two affected parents should be affected.
3. Rare recessive traits show a horizontal pattern of inheritance: The trait first appears among several members of one generation and is not seen in earlier generations.
4. Recessive traits may show a vertical pattern of inheritance if the trait is extremely common in the population.

The second salient feature of the CF pedigrees is that many of the couples who produce afflicted children are blood relatives; that is, their mating is consanguineous (as indicated by the double line). In Fig. 2.24a, the consanguineous mating in generation V is between third cousins. Of course, children with cystic fibrosis can also have unrelated carrier parents, but because relatives share genes, their offspring have a much greater than average chance of receiving two copies of a rare allele. Whether or not they are related, carrier parents are both heterozygotes. Thus among their offspring, the proportion of unaffected to affected children is expected to be 3:1. To look at it another way, the chances are that 1 out of 4 children of two heterozygous carriers will be homozygous CF sufferers.

You can gauge your understanding of this inheritance pattern by assigning a genotype to each person in Fig. 2.24 and then checking your answers against the caption. Note that for several individuals, such as the generation I individuals in part (a) of the figure, it is impossible to assign a full genotype. We know that one of these people must be the carrier who supplied the original CF allele,
but we do not know if it was the male or the female. As with an ambiguous dominant phenotype in peas, the unknown second allele is indicated by a dash.
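The 3:1 expectation for the children of two carriers can be checked by enumerating the four equally likely gamete combinations. The following is a minimal Python sketch of that enumeration; the function name `cross` and the string labels for the alleles are illustrative choices, not notation from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def cross(parent1, parent2):
    """Return {genotype: probability} for the offspring of two parents.
    Each parent is a pair of alleles; each allele enters a gamete
    with probability 1/2, and gametes combine at random."""
    counts = Counter(tuple(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}

carrier = ("CF+", "CF")               # unaffected heterozygous carrier
offspring = cross(carrier, carrier)

affected = offspring[("CF", "CF")]    # homozygous for the disease allele
unaffected = 1 - affected             # CF+ CF+ and CF CF+ together

print(offspring)
print("unaffected : affected =", unaffected / affected, ": 1")
```

The same function applied to a carrier and a CF+ CF+ noncarrier returns no CF CF offspring, which is why a carrier whose partner is a noncarrier is not expected to have an affected child.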
In Fig. 2.24a, a mating between the unrelated carriers VI-1 and VI-2 produced a child with cystic fibrosis. How likely is such a marriage between unrelated carriers for a recessive genetic condition? The answer depends on the gene in question and the particular population into which a person is born. As Table 2.1 shows, the incidence of genetic diseases (and thus the frequency of their carriers) varies markedly among populations. Such variation reflects the distinct genetic histories of different groups. The area of genetics that analyzes differences among groups of individuals is called population genetics, a subject we cover in detail in Chapter 20.

Notice that in Fig. 2.24a, several unrelated, unaffected people, such as II-1 and II-4, married into the family under consideration. Although it is highly probable that these individuals are homozygotes for the normal allele of the gene (CF+ CF+), there is a small chance (whose magnitude depends on the population) that any one of them could be a carrier of the disease.

Figure 2.25 Why the allele for cystic fibrosis is recessive. The CFTR protein regulates the passage of chloride ions through the cell membrane. People who are homozygous for a cystic fibrosis disease allele (CF CF) have the disease because recessive disease alleles either specify no CFTR protein as shown, or encode abnormal CFTR proteins that do not function at all or do not function as well as the normal protein (not shown). Disease alleles (CF) are recessive because CF CF+ heterozygotes produce CFTR from the normal (CF+) allele, and this amount of CFTR is sufficient for normal lung function.
[Figure panels: CF+ CF+ or CF CF+ (Normal); CF CF (Cystic fibrosis). Labels: outside of the cell, mucus, lipid bilayer of cell membrane, CFTR protein, Cl- ions, inside of the cell.]

Genetic researchers identified the cystic fibrosis gene in 1989, soon after the Huntington disease gene was identified. The normal, dominant CF+ allele makes a protein called cystic fibrosis transmembrane conductance regulator (CFTR). CFTR protein forms a channel in the cell membranes in the lungs, and controls the flow of chloride ions and thus the flow of water through lung cells. Recessive CF disease alleles either produce no CFTR or produce nonfunctional or less functional versions of the protein (Fig. 2.25). Lung cells without CFTR retain water, and a thick, dehydrated mucus builds up outside the cells. Thus, CF CF homozygotes have no functional CFTR (or not enough) and exhibit cystic fibrosis. Gene therapy (insertion of a normal CF gene into lung cells of patients) has been tried to ameliorate the disease's debilitating symptoms, but so far without success. However, identification of the gene responsible for
cystic fibrosis has very recently led to an effective treatment for the disease in patients with a specific, rare mutant allele of CF. In 2012 the U.S. Food and Drug Administration approved a drug that helps the particular defective form of CFTR specified by this allele to function properly. These varied approaches to the treatment of CF and other inherited diseases will be discussed later in the book.
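The earlier question of how likely a mating between two unrelated carriers is can be given a rough number from the incidence quoted in this section (1 in 2000 births). The Python sketch below is only a back-of-the-envelope estimate and rests on assumptions that are not stated in the text: mates are chosen at random with respect to carrier status, and essentially all affected children are born to two unaffected carriers. The variable names are illustrative. The rigorous treatment belongs to population genetics (Chapter 20).

```python
import math

# Incidence quoted in the text: 1 in 2000 births.
incidence = 1 / 2000

# A child of two carriers is affected with probability 1/4.
p_affected_given_carrier_parents = 1 / 4

# Assumption: affected children come only from carrier x carrier matings,
# so incidence = P(both parents are carriers) * 1/4.
p_both_carriers = incidence / p_affected_given_carrier_parents

# Assumption: random mating, so P(both carriers) = carrier_frequency ** 2.
carrier_frequency = math.sqrt(p_both_carriers)

print(f"P(both parents carriers) ~ 1 in {1 / p_both_carriers:.0f}")
print(f"Estimated carrier frequency ~ 1 in {1 / carrier_frequency:.0f}")
```

Under these assumptions roughly 1 couple in 500 would consist of two carriers. The estimate changes with the population considered, which is the point the text makes about differing genetic histories.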
essential concepts
• In a vertical pattern of transmission, a trait that appears in an affected individual also appears in at least one parent, one of the affected parent's parents, and so on. If a trait is rare, a pedigree with a vertical pattern usually indicates that the disease-causing allele is dominant.
• In a horizontal pattern of transmission, a trait that appears in an affected individual may not appear in any ancestors, but it may appear in some of the person's siblings. A pedigree with a horizontal pattern usually indicates a rare recessive disease-causing allele. Affected individuals are often products of consanguineous mating.
• Recessive disease alleles, like the CF alleles that cause cystic fibrosis, usually specify either no protein, or less-functional versions of the protein that the normal, dominant allele produces.
• Dominant disease alleles may have a number of biochemical explanations. In the case of Huntington disease, the disease-causing HD allele specifies an abnormal, deleterious version of the protein produced by the normal, recessive allele.
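The first clue for a vertical pattern (every affected individual has at least one affected parent) is mechanical enough to check in code. The Python sketch below uses a hypothetical pedigree encoding, a dictionary mapping each person to a pair of (parents, affected flag), and two small made-up families; neither the encoding nor the families come from the figures in this chapter. Passing the check is consistent with a dominant trait but does not prove it, since a very common recessive trait can also appear vertical.

```python
def every_affected_has_affected_parent(pedigree):
    """Return True if each affected non-founder has an affected parent.
    `pedigree` maps person -> (parents, affected), where parents is a
    pair of person ids, or None for founders and people marrying in."""
    for person, (parents, affected) in pedigree.items():
        if not affected or parents is None:
            continue  # unaffected people and founders tell us nothing here
        if not any(pedigree[parent][1] for parent in parents):
            return False
    return True

# Made-up family in which the trait appears in every generation.
vertical_family = {
    "I-1": (None, True),
    "I-2": (None, False),
    "II-1": (("I-1", "I-2"), True),
    "II-2": (None, False),
    "III-1": (("II-1", "II-2"), True),
}

# Made-up family in which unaffected parents have affected children.
horizontal_family = {
    "I-1": (None, False),
    "I-2": (None, False),
    "II-1": (("I-1", "I-2"), True),
    "II-2": (("I-1", "I-2"), True),
    "II-3": (("I-1", "I-2"), False),
}

print(every_affected_has_affected_parent(vertical_family))
print(every_affected_has_affected_parent(horizontal_family))
```

A pedigree that fails the check, like the second family, contains affected children of two unaffected parents, which points toward a recessive allele carried by both parents.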
Mendel answered the three basic questions about heredity as follows: To "What is inherited?" he replied, "alleles of genes." To "How is it inherited?" he responded, "according to the principles of segregation and independent assortment." And to "What is the role of chance in heredity?" he said, "for each individual, inheritance is determined by chance, but within a population, this chance operates in a context of strictly defined probabilities." Within a decade of the 1900 rediscovery of Mendel's work, numerous breeding studies had shown that Mendel's laws hold true not only for seven pairs of antagonistic characteristics in peas, but for an enormous diversity of traits in a wide variety of sexually reproducing plant and animal species, including four-o'clock flowers, beans, corn, wheat, fruit flies, chickens, mice, horses, and humans. Some of these same breeding studies, however, raised a challenge to the new genetics. For certain traits in certain species, the studies uncovered unanticipated phenotypic ratios, or the results included F1 and F2 progeny with novel phenotypes that resembled those of neither pure-breeding parent. These phenomena could not be explained by Mendel's hypothesis that for each gene, two alternative alleles, one completely dominant, the other recessive, determine a single trait. We now know that most common traits, including skin color, eye color, and height in humans, are determined by interactions between two or more genes. We also know that within a given population, more than two alleles may be present for some of those genes. Chapter 3 shows how the genetic analysis of such complex traits, that is, traits produced by complex interactions between genes and between genes and the environment, extended rather than contradicted Mendel's laws of inheritance.
Solving Genetics Problems
The best way to evaluate and increase your understanding of the material in the chapter is to apply your knowledge in solving genetics problems. Genetics word problems are like puzzles. Take them in slowly; don't be overwhelmed by the whole problem. Identify useful facts given in the problem, and use the facts to deduce additional information. Use genetic principles and logic to work toward the solutions. The more problems you do, the easier they become. In doing problems, you will not only solidify your understanding of genetic concepts, but you will also develop basic analytical skills that are applicable in many disciplines.

Note that some of the problems at the end of each chapter are designed to introduce supplementary but important concepts that you can infer from your reading even if you have not encountered this specific information explicitly in the chapter. Solving genetics problems requires much more than simply plugging numbers into formulas. Each problem is unique and requires thoughtful evaluation of the information given and the question being asked. The following are general guidelines you can follow in approaching these word problems:
a. Read through the problem once to get some sense of the concepts involved.
b. Go back through the problem, noting all the information supplied to you. For example, genotypes or phenotypes of offspring or parents may be given to you or implied in the problem. Represent the known information in a symbolic format: assign symbols for alleles; use these symbols to indicate genotypes; make a diagram of the crosses including genotypes and phenotypes given or implied. Be sure that you do not assign different letters of the alphabet to two alleles of the same gene, as this can cause confusion. Also, be careful to discriminate clearly between the upper- and lowercases of letters, such as C(c) or S(s).
c. Now, reassess the question and work toward the solution using the information given. Make sure you answer the question being asked!
d. When you finish the problem, check to see that the answer makes sense. You can often check solutions by working backwards; that is, see if you can reconstruct the data from your answer.
e. After you have completed a question and checked your answer, spend a minute to think about which major concepts were involved in the solution. This is a critical step for improving your understanding of genetics.

For each chapter, the logic involved in solving two or three types of problems is described in detail.

Solved Problems
I. In cats, white patches are caused by the dominant allele P, while pp individuals are solid-colored. Short hair is caused by a dominant allele S, while ss cats have long hair. A long-haired cat with patches whose mother was solid-colored and short-haired mates with a short-haired, solid-colored cat whose mother was long-haired and solid-colored. What kinds of kittens can arise from this mating, and in what proportions?

Answer
The solution to this problem requires an understanding of dominance/recessiveness, gamete formation, and the independent assortment of alleles of two genes in a cross.
First make a representation of the known information:

Mothers:   solid, short-haired          solid, long-haired
Cross:     cat 1                  ×     cat 2
           patches, long-haired         solid, short-haired

What genotypes can you assign? Any cat showing a recessive phenotype must be homozygous for the recessive allele. Therefore the long-haired cats are ss; solid cats are pp. Cat 1 is long-haired, so it must be homozygous for the recessive allele (ss). This cat has the dominant phenotype of patches and could be either PP or Pp, but because the mother was pp and could only contribute a p allele in her gametes, cat 1 must be Pp. Cat 1's full genotype is Pp ss. Similarly, cat 2 is solid-colored, so it must be homozygous for the recessive allele (pp). Because this cat is short-haired, it could have either the SS or Ss genotype. Its mother was long-haired (ss) and could only contribute an s allele in her gamete, so cat 2 must be heterozygous Ss. The full genotype is pp Ss.
The cross is therefore between Pp ss (cat 1) and pp Ss (cat 2). To determine the types of kittens, first establish the types of gametes that can be produced by each cat and then set up a Punnett square to determine the genotypes of the offspring. Cat 1 (Pp ss) produces P s and p s gametes in equal proportions. Cat 2 (pp Ss) produces p S and p s gametes in equal proportions. Four types of kittens can result from this mating with equal probability: Pp Ss (patches, short-haired), Pp ss (patches, long-haired), pp Ss (solid, short-haired), and pp ss (solid, long-haired).

                   Cat 1
                   P s        p s
Cat 2     p S      Pp Ss      pp Ss
          p s      Pp ss      pp ss
You could also work through this problem using the product rule of probability instead of a Punnett square, as will be shown on the next page. The principles are the same: Gametes produced in equal amounts by either parent are combined at random.
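The gamete-by-gamete reasoning used for the cat cross can also be sketched in a few lines of Python. This is only an illustration under the assumptions of the problem (two independently assorting genes, gametes combined at random); the function names `gametes` and `cross` are hypothetical helpers introduced here, not part of the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """Map each possible gamete to its probability.

    genotype is a list of two-letter strings, one per gene,
    for example ["Pp", "ss"] for cat 1.
    """
    combos = list(product(*genotype))  # pick one allele from each gene
    probs = Counter()
    for combo in combos:
        probs[combo] += Fraction(1, len(combos))
    return probs


def cross(parent1, parent2):
    """Combine the gametes of two parents at random (product rule)."""
    offspring = Counter()
    for (g1, p1), (g2, p2) in product(gametes(parent1).items(),
                                      gametes(parent2).items()):
        # Sorting puts the uppercase (dominant) allele first, e.g. "Pp".
        genotype = " ".join("".join(sorted(a + b)) for a, b in zip(g1, g2))
        offspring[genotype] += p1 * p2
    return offspring


# Cat 1 is Pp ss and cat 2 is pp Ss, as worked out in the answer.
for genotype, prob in sorted(cross(["Pp", "ss"], ["pp", "Ss"]).items()):
    print(genotype, prob)
```

Each of the four kitten genotypes appears with probability 1/4, matching the Punnett square.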
Cat 1 gamete        Cat 2 gamete        Progeny
1/2 P s      ×      1/2 p S      →      1/4 Pp Ss   patches, short-haired
1/2 P s      ×      1/2 p s      →      1/4 Pp ss   patches, long-haired
1/2 p s      ×      1/2 p S      →      1/4 pp Ss   solid-colored, short-haired
1/2 p s      ×      1/2 p s      →      1/4 pp ss   solid-colored, long-haired

II. In tomatoes, red fruit is dominant to yellow fruit, and purple stems are dominant to green stems. The progeny from one mating consisted of 305 red fruit, purple stem plants; 328 red fruit, green stem plants; 110 yellow fruit, purple stem plants; and 97 yellow fruit, green stem plants. What were the genotypes of the parents in this cross?

Answer
This problem requires an understanding of independent assortment in a dihybrid cross as well as the ratios predicted from monohybrid crosses. Designate the alleles:

R = red, r = yellow
P = purple stems, p = green stems

In genetics problems, the ratios of offspring can indicate the genotype of parents. You will usually need to total the number of progeny and approximate the ratio of offspring in each of the different classes. For this problem, in which the inheritance of two traits is given, consider each trait independently. For red fruit, there are 305 + 328 = 633 red-fruited plants out of a total of 840 plants. This value (633/840) is close to 3/4. About 1/4 of the plants have yellow fruit (110 + 97 = 207/840). From Mendel's work, you know that a 3:1 phenotypic ratio results from crosses between plants that are hybrid (heterozygous) for one gene. Therefore, the genotype for fruit color of each parent must have been Rr.

For stem color, 305 + 110 or 415/840 plants had purple stems. About half had purple stems, and the other half (328 + 97) had green stems. A 1:1 phenotypic ratio occurs when a heterozygote is mated to a homozygous recessive (as in a testcross). The parents' genotypes must have been Pp and pp for stem color. The complete genotype of the parent plants in this cross was Rr Pp × Rr pp.

III. Tay-Sachs is a recessive lethal disease in which there is neurological deterioration early in life. This disease is rare in the population overall but is found at relatively high frequency in Ashkenazi Jews from Eastern Europe. A woman whose maternal uncle had the disease is trying to determine the probability that she and her husband could have an affected child. Her father does not come from a high-risk population. Her husband's sister died of the disease at an early age.
a. Draw the pedigree of the individuals described. Include the genotypes where possible.
b. Determine the probability that the couple's first child will be affected.

Answer
This problem requires an understanding of dominance/recessiveness and probability. Designate the alleles:

T = normal allele; t = Tay-Sachs allele

[Pedigree: the woman's maternal grandparents are both Tt; her uncle (II-1) is tt (affected); her father is TT; her husband's parents are both Tt; her husband's sister (III-3) is tt (affected).]

The genotypes of the two affected individuals, the woman's uncle (II-1) and the husband's sister (III-3), are tt. Because the uncle was affected, his parents must have been heterozygous. There was a 1/4 chance that these parents had a homozygous recessive (affected) child, a 2/4 chance that they had a heterozygous child (carrier), and a 1/4 chance they had a homozygous dominant (unaffected) child. However, you have been told that the woman's mother (II-2) is unaffected, so the mother could only have had a heterozygous or a homozygous dominant genotype. Consider the probability that these two genotypes will occur. If you were looking at a Punnett square, there would be only three combinations of alleles possible for the normal mother. Two of these are heterozygous combinations and one is homozygous dominant. There is a 2/3 chance (2 out of the 3 possible cases) that the mother was a carrier. The father was not from a high-risk population, so we can assume that he is homozygous dominant. There is a 2/3 chance that the wife's mother was heterozygous and, if so, a 1/2 chance that the wife inherited a recessive allele from her mother. Because both conditions are necessary for inheritance of a recessive allele, the individual probabilities are multiplied, and the probability that the wife (III-1) is heterozygous is 2/3 × 1/2.

The husband (III-2) has a sister who died from the disease; therefore, his parents must have been heterozygous. The probability that he is a carrier is 2/3 (using the same rationale as for II-2). The probability that the man and woman are both carriers is 2/3 × 1/2 × 2/3. Because there is a 1/4 probability that a particular child of two carriers will be affected, the overall probability that the first child of this couple (III-1 and III-2) will be affected is 2/3 × 1/2 × 2/3 × 1/4 = 4/72, or 1/18.
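The chain of probabilities in the Tay-Sachs answer can be checked with exact fractions. The sketch below is only an illustration of the product rule and of conditioning on "unaffected"; the variable and function names are hypothetical, and it relies on the same assumption the answer makes, namely that the woman's father is TT.

```python
from fractions import Fraction

# Genotype probabilities among children of a Tt x Tt mating.
carrier_x_carrier = {
    "TT": Fraction(1, 4),
    "Tt": Fraction(1, 2),
    "tt": Fraction(1, 4),
}


def p_carrier_given_unaffected(dist):
    """Probability of being Tt, given that the person is not tt."""
    unaffected = dist["TT"] + dist["Tt"]
    return dist["Tt"] / unaffected


# The woman's mother (II-2) is an unaffected child of two carriers.
p_mother_carrier = p_carrier_given_unaffected(carrier_x_carrier)

# Assumption from the answer: the father is TT, so the wife can only
# receive t from her mother, with probability 1/2 if the mother is Tt.
p_wife_carrier = p_mother_carrier * Fraction(1, 2)

# The husband (III-2) is an unaffected child of two carriers.
p_husband_carrier = p_carrier_given_unaffected(carrier_x_carrier)

# Both must be carriers, and then the child must be tt.
p_child_affected = p_wife_carrier * p_husband_carrier * carrier_x_carrier["tt"]

print(p_mother_carrier, p_wife_carrier, p_husband_carrier, p_child_affected)
```

Using `Fraction` rather than floating-point numbers keeps every intermediate value in the same form as the worked answer.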
Problems

Vocabulary
1. For each of the terms in the left column, choose the best matching phrase in the right column.

a. phenotype                  1. having two identical alleles of a given gene
b. alleles                    2. the allele expressed in the phenotype of the heterozygote
c. independent assortment     3. alternate forms of a gene
d. gametes                    4. observable characteristic
e. gene                       5. a cross between individuals both heterozygous for two genes
f. segregation                6. alleles of one gene separate into gametes randomly with respect to alleles of other genes
g. heterozygote               7. reproductive cells containing only one copy of each gene
h. dominant                   8. the allele that does not contribute to the phenotype of the heterozygote
i. F1                         9. the cross of an individual of ambiguous genotype with a homozygous recessive individual
j. testcross                  10. an individual with two different alleles of a gene
k. genotype                   11. the heritable entity that determines a characteristic
l. recessive                  12. the alleles an individual has
m. dihybrid cross             13. the separation of the two alleles of a gene into different gametes
n. homozygote                 14. offspring of the P generation

Section 2.1
2. During the millennia in which selective breeding was practiced, why did breeders fail to uncover the principle that traits are governed by discrete units of inheritance (that is, by genes)?
3. Describe the characteristics of the garden pea that made it a good organism for Mendel's analysis of the basic principles of inheritance. Evaluate how easy or difficult it would be to make a similar study of inheritance in humans by considering the same attributes you described for the pea.

Section 2.2
4. An albino corn snake is crossed with a normal-colored corn snake. The offspring are all normal-colored. When these first-generation progeny snakes are crossed among themselves, they produce 32 normal-colored snakes and 10 albino snakes.
a. How do you know that only a single gene is responsible for the color differences between these snakes?
b. Which of these phenotypes is controlled by the dominant allele?
c. A normal-colored female snake is involved in a testcross. This cross produces 10 normal-colored and 11 albino offspring. What are the genotypes of the parents and the offspring?
5. Two short-haired cats mate and produce six short-haired and two long-haired kittens. What does this information suggest about how hair length is inherited?
6. Piebald spotting is a condition found in humans in which there are patches of skin that lack pigmentation. The condition results from the inability of pigment-producing cells to migrate properly during development. Two adults with piebald spotting have one child who has this trait and a second child with normal skin pigmentation.
a. Is the piebald spotting trait dominant or recessive? What information led you to this answer?
b. What are the genotypes of the parents?
7. As a Drosophila research geneticist, you keep stocks of flies of specific genotypes. You have a fly that has normal wings (dominant phenotype). Flies with short wings are homozygous for a recessive allele of the wing-length gene. You need to know if this fly with normal wings is pure-breeding or heterozygous for the wing-length trait. What cross would you do to determine the genotype, and what results would you expect for each possible genotype?
8. A mutant cucumber plant has flowers that fail to open when mature. Crosses can be done with this plant by manually opening and pollinating the flowers with pollen from another plant. When closed × open crosses were done, all the F1 progeny were open. The F2 plants were 145 open and 59 closed. A cross of closed × F1 gave 81 open and 77 closed. How is the closed trait inherited? What evidence led you to your conclusion?
9. In a particular population of mice, certain individuals display a phenotype called "short tail," which is inherited as a dominant trait. Some individuals display a recessive trait called "dilute," which affects coat color. Which of these traits would be easier to eliminate from the population by selective breeding? Why?
10. In humans, a dimple in the chin is a dominant characteristic.
a. A man who does not have a chin dimple has children with a woman with a chin dimple whose mother lacked the dimple. What proportion of their children would be expected to have a chin dimple?
b. A man with a chin dimple and a woman who lacks the dimple produce a child who lacks a dimple. What is the man's genotype?
c. A man with a chin dimple and a nondimpled woman produce eight children, all having the chin dimple. Can you be certain of the man's genotype? Why or why not? What genotype is more likely, and why?
11. Among Native Americans, two types of earwax (cerumen) are seen, dry and sticky. A geneticist studied the inheritance of this trait by observing the types of offspring produced by different kinds of matings. He observed the following numbers:

                      Number of           Offspring
Parents               mating pairs     Sticky     Dry
Sticky × sticky       10               32         6
Sticky × dry          8                21         9
Dry × dry             12               0          42

a. How is earwax type inherited?
b. Why are there no 3:1 or 1:1 ratios in the data shown in the chart?
12. Imagine you have just purchased a black stallion of unknown genotype. You mate him to a red mare, and she delivers twin foals, one red and one black. Can you tell from these results how color is inherited, assuming that alternative alleles of a single gene are involved? What crosses could you do to work this out?
13. If you roll a die (singular of dice), what is the probability you will roll: (a) a 6? (b) an even number? (c) a number divisible by 3? (d) If you roll a pair of dice, what is the probability that you will roll two 6s? (e) an even number on one and an odd number on the other? (f) matching numbers? (g) two numbers both over 4?
14. In a standard deck of playing cards, there are four suits (red suits = hearts and diamonds, black suits = spades and clubs). Each suit has 13 cards: Ace (A), 2, 3, 4, 5, 6, 7, 8, 9, 10, and the face cards Jack (J), Queen (Q), and King (K). In a single draw, what is the probability that you will draw a face card? A red card? A red face card?
15. How many genetically different eggs could be formed by women with the following genotypes?
a. Aa bb CC DD
b. AA Bb Cc dd
c. Aa Bb cc Dd
d. Aa Bb Cc Dd
16. What is the probability of producing a child that will phenotypically resemble either one of the two parents in the following four crosses? How many phenotypically different kinds of progeny could potentially result from each of the four crosses?
a. Aa Bb Cc Dd × aa bb cc dd
b. aa bb cc dd × AA BB CC DD
c. Aa Bb Cc Dd × Aa Bb Cc Dd
d. aa bb cc dd × aa bb cc dd
17. A mouse sperm of genotype a B C D E fertilizes an egg of genotype a b c D e. What are all the possibilities for the genotypes of (a) the zygote and (b) a sperm or egg of the baby mouse that develops from this fertilization?
18. Galactosemia is a recessive human disease that is treatable by restricting lactose and glucose in the diet. Susan Smithers and her husband are both heterozygous for the galactosemia gene.
a. Susan is pregnant with twins. If she has fraternal (nonidentical) twins, what is the probability both of the twins will be girls who have galactosemia?
b. If the twins are identical, what is the probability that both will be girls and have galactosemia?
For parts c-g, assume that none of the children is a twin.
c. If Susan and her husband have four children, what is the probability that none of the four will have galactosemia?
d. If the couple has four children, what is the probability that at least one child will have galactosemia?
e. If the couple has four children, what is the probability that the first two will have galactosemia and the second two will not?
f. If the couple has three children, what is the probability that two of the children will have galactosemia and one will not, regardless of order?
g. If the couple has four children with galactosemia, what is the probability that their next child will have galactosemia?
19. Albinism is a condition in which pigmentation is lacking. In humans, the result is white hair, nonpigmented skin, and pink eyes. The trait in humans is caused by a recessive allele. Two normal parents have an albino child. What are the parents' genotypes? What is the probability that the next child will be albino?
20. A cross between two pea plants, both of which grew from yellow round seeds, gave the following numbers of seeds: 156 yellow round and 54 yellow wrinkled. What are the genotypes of the parent plants? (Yellow and round are dominant traits.)
21. A third-grader decided to breed guinea pigs for her school science project. She went to a pet store and bought a male with smooth black fur and a female with rough white fur. She wanted to study the inheritance of those features and was sorry to see that the first litter of eight contained only rough black animals. To her disappointment, the second litter from those same parents contained seven rough black animals. Soon the first litter had begun to produce F2 offspring, and they showed a variety of coat types. Before long, the child had 125 F2 guinea pigs. Eight of them had smooth white coats, 25 had smooth black coats, 23 were rough and white, and 69 were rough and black.
a. How are the coat color and texture characteristics inherited? What evidence supports your conclusions?
b. What phenotypes and proportions of offspring should the girl expect if she mates one of the smooth white F2 females to an F1 male?
22. The self-fertilization of an F1 pea plant produced from a parent plant homozygous for yellow and wrinkled seeds and a parent homozygous for green and round seeds resulted in a pod containing seven F2 peas. (Yellow and round are dominant.) What is the probability that all seven peas in the pod are yellow and round?
23. The achoo syndrome (sneezing in response to bright light) and trembling chin (triggered by anxiety) are both dominant traits in humans.
a. What is the probability that the first child of parents who are heterozygous for both the achoo gene and trembling chin will have achoo syndrome but lack the trembling chin?
b. What is the probability that the first child will not have achoo syndrome or trembling chin?
24. A pea plant from a pure-breeding strain that is tall, has green pods, and has purple flowers that are terminal is crossed to a plant from a pure-breeding strain that is dwarf, has yellow pods, and has white flowers that are axial. The F1 plants are all tall and have purple axial flowers as well as green pods.
a. What phenotypes do you expect to see in the F2?
b. What phenotypes and ratios would you predict in the progeny from crossing an F1 plant to the dwarf parent?
25. The following chart shows the results of different matings between jimsonweed plants that had either purple or white flowers and spiny or smooth pods. Determine the dominant allele for the two traits and indicate the genotypes of the parents for each of the crosses.

                                              Offspring
                                     Purple   White    Purple   White
Parents                              Spiny    Spiny    Smooth   Smooth
a. purple spiny × purple spiny       94       32       28       11
b. purple spiny × purple smooth      40       0        38       0
c. purple spiny × white spiny        34       30       0        0
d. purple spiny × white spiny        89       92       31       27
e. purple smooth × purple smooth     0        0        36       11
f. white spiny × white spiny         0        45       0        16

26. A pea plant heterozygous for plant height, pod shape, and flower color was selfed. The progeny consisted of 272 tall, inflated pods, purple flowers; 92 tall, inflated, white flowers; 88 tall, flat pods, purple; 93 dwarf, inflated, purple; 35 tall, flat, white; 31 dwarf, inflated, white; 29 dwarf, flat, purple; 11 dwarf, flat, white. Which alleles are dominant in this cross?
27. In the fruit fly Drosophila melanogaster, the following genes and mutations are known:
Wing size: recessive allele for tiny wings t; dominant allele for normal wings T.
Eye shape: recessive allele for narrow eyes n; dominant allele for normal (oval) eyes N.
For each of the following crosses, give the genotypes of each of the parents.

        Male                    Female
     Wings    Eyes           Wings    Eyes      Offspring
1    tiny     oval      ×    tiny     oval      78 tiny wings, oval eyes
                                                24 tiny wings, narrow eyes
2    normal   narrow    ×    tiny     oval      45 normal wings, oval eyes
                                                40 normal wings, narrow eyes
                                                38 tiny wings, oval eyes
                                                44 tiny wings, narrow eyes
3    normal   narrow    ×    normal   oval      35 normal wings, oval eyes
                                                29 normal wings, narrow eyes
                                                10 tiny wings, oval eyes
                                                11 tiny wings, narrow eyes
4    normal   narrow    ×    normal   oval      62 normal wings, oval eyes
                                                19 tiny wings, oval eyes

28. Based on the information you discovered in Problem 27, answer the following:
a. A female fruit fly with genotype Tt nn is mated to a male of genotype Tt Nn. What is the probability that any one of their offspring will have normal phenotypes for both characters?
b. What phenotypes would you expect among the offspring of this cross? If you obtained 200 progeny, how many of each phenotypic class would you expect?
29. Considering the yellow and green pea color phenotypes studied by Gregor Mendel:
a. What is the biochemical function of the protein that is specified by the gene responsible for the pea color phenotype?
b. A null allele of a gene is an allele that does not specify any of the biochemical function that the gene normally provides. Of the two alleles Y and y, which is more likely to be a null allele?
c. In terms of the underlying biochemistry, why is the Y allele dominant to the y allele?
d. Why are peas that are yy homozygotes green?
e. The amount of the protein specified by a gene is roughly proportional to the number of functional copies of the gene carried by a cell or individual. What do the phenotypes of YY homozygotes, Yy heterozygotes, and yy homozygotes tell us about the amount of the Sgr enzyme (the product of the pea color gene) needed to produce a yellow color?
f. The Sgr enzyme is not needed for the survival of a pea plant, but the genomes of organisms contain many so-called essential genes needed for an individual's survival. For such genes, heterozygotes for the normal allele and the null allele survive, but individuals homozygous for the null allele die soon after the male and female gametes, each with a null allele, come together at fertilization. In light of your answer to part (e), what does this tell you about the advantage to an organism of having two copies of their genes?
g. Do you think that a single pea pod could contain peas with different phenotypes? Explain.
h. Do you think that a pea pod could be of one color (say, green) while the peas within the pod could be of a different color (say, yellow)?
Explain.
30. What would have been the outcome (the genotypic and phenotypic ratios) in the F2 of Mendel's dihybrid cross shown in Fig. 2.15 on p. 25 if the alleles of the pea color gene (Y, y) and the pea shape gene (R, r) did not assort independently and instead the alleles inherited from a parent always stayed together as a unit?
31. Recall that Mendel obtained pure-breeding plants with either long or short stems and that hybrids had long stems (Fig. 2.8). Monohybrid crosses produced an F2 generation with a 3:1 ratio of long stems to short stems, indicating that this difference in stem length is governed by a single gene. The gene that likely controlled this trait in Mendel's plants has been discovered, and it specifies an enzyme called G3BH, which catalyzes the reaction shown below. The product of the reaction, gibberellin, is a growth hormone that makes plants grow tall. What is the most likely hypothesis to explain the difference between the dominant allele (L) and the recessive allele (l)?

[Reaction diagram: Precursor → Gibberellin, catalyzed by G3BH]

32. The gene that likely controlled flower color (purple or white) in Mendel's pea plants has also been identified. The flower color gene encodes a protein called bHLH that cells require to make three different enzymes (DFR, ANS, and 3GT) that function in the pathway shown above, leading to synthesis of the purple pigment anthocyanin.
a. What is the most likely explanation for the difference between the dominant allele (P) and the recessive allele (p) of the gene responsible for these flower colors?
b. Given the biochemical pathway shown above, could a different gene have been the one governing Mendel's flower colors?

Section 2.3
33. For each of the following human pedigrees, indicate whether the inheritance pattern is recessive or dominant. What feature(s) of the pedigree did you use to determine the mode of inheritance? Give the genotypes of affected individuals and of individuals who carry the disease allele but are not affected.

[Pedigrees (a), (b), and (c)]

34. Consider the pedigree that follows for cutis laxa, a connective tissue disorder in which the skin hangs in loose folds.
a. Assuming that the trait is rare, what is the apparent mode of inheritance?
b. What is the probability that individual II-2 is a carrier?
c. What is the probability that individual II-3 is a carrier?
d. What is the probability that individual III-1 is affected by the disease?

[Pedigree for Problem 34]

35. A young couple went to see a genetic counselor because each had a sibling affected with cystic fibrosis. (Cystic fibrosis is a recessive disease, and neither member of the couple nor any of their four parents is affected.)
a. What is the probability that the female of this couple is a carrier?
b. What are the chances that their child will be affected with cystic fibrosis?
c. What is the probability that their child will be a carrier of the cystic fibrosis disease allele?
36. Huntington disease is a rare fatal, degenerative neurological disease in which individuals start to show symptoms in their 40s. It is caused by a dominant allele. Joe, a man in his 20s, just learned that his father has Huntington disease.
a. What is the probability that Joe will also develop the disease?
b. Joe and his new wife have been eager to start a family. What is the probability that their first child will eventually develop the disease?
37. Is the disease shown in the following pedigree caused by a dominant or a recessive allele? Why? Based on this limited pedigree, do you think the disease allele is rare or common in the population? Why?

[Pedigree for Problem 37]

38. Figure 2.22 on p. 32 shows the inheritance of Huntington disease in a family from a small village near Lake Maracaibo in Venezuela. The village was founded by a small number of immigrants, and generations of their descendants have remained concentrated in this isolated location. The allele for Huntington disease has remained unusually prevalent there.
a. Why could you not conclude definitively that the disease is the result of a dominant or a recessive allele solely by looking at this pedigree?
b. Is there any information you could glean from the family's history that might imply the disease is due to a dominant rather than a recessive allele?
39. The common grandfather of two first cousins has hereditary hemochromatosis, a recessive condition causing an abnormal buildup of iron in the body. Neither of the cousins has the disease nor do any of their relatives.
a. If the first cousins mated with each other and had a child, what is the chance that the child would have hemochromatosis? Assume that the unrelated, unaffected parents of the cousins are not carriers.
b. How would your calculation change if you knew that 1 out of every 10 unaffected people in the population (including the unrelated parents of these cousins) was a carrier for hemochromatosis?
40. People with nail-patella syndrome have poorly developed or absent kneecaps and nails. Individuals with alkaptonuria have arthritis as well as urine that darkens when exposed to air. Both nail-patella syndrome and alkaptonuria are rare phenotypes. In the following pedigree, vertical red lines indicate individuals with nail-patella syndrome, while horizontal green lines denote individuals with alkaptonuria.
a. What are the most likely modes of inheritance of nail-patella syndrome and alkaptonuria? What genotypes can you ascribe to each of the individuals in the pedigree for both of these phenotypes?
b. In a mating between IV-2 and IV-5, what is the chance that the child produced would have both nail-patella syndrome and alkaptonuria? Nail-patella syndrome alone? Alkaptonuria alone? Neither defect?

[Pedigree for Problem 40]

41. Midphalangeal hair (hair on top of the middle segment of the fingers) is a common phenotype caused by a dominant allele M. Homozygotes for the recessive allele (mm) lack hair on the middle segment of their fingers. Among 1000 families in which both parents had midphalangeal hair, 1853 children showed the trait while 209 children did not. Explain this result.
42. A man with Huntington disease (he is heterozygous HD HD+) and a normal woman have two children.
a. What is the probability that only the second child has the disease?
b. What is the probability that only one of the children has the disease?
c. What is the probability that none of the children has the disease?
d. Answer (a) through (c) assuming that the couple had 10 children.
e. What is the probability that 4 of the 10 children in the family in (d) have the disease?
43. Explain why disease alleles for cystic fibrosis (CF) are recessive to the normal alleles (CF+), yet the disease alleles responsible for Huntington disease (HD) are dominant to the normal alleles (HD+).
chapter 3
Extensions to Mendel's Laws

In this array of green, brown, and red lentils, some of the seeds have speckled patterns, while others are clear.

chapter outline
3.1 Extensions to Mendel for Single-Gene Inheritance
3.2 Extensions to Mendel for Multifactorial Inheritance

UNLIKE THE PEA traits that Mendel examined, most human characteristics do not fall neatly into just two opposing phenotypic categories. These complex traits, such as skin and hair color, height, athletic ability, and many others, seem to defy Mendelian analysis. The same can be said of traits expressed by many of the world's food crops; their size, shape, succulence, and nutrient content vary over a wide range of values.

Lentils (Lens culinaris) provide a graphic illustration of this variation. Lentils, a type of legume, are grown in many parts of the world as a rich source of both protein and carbohydrate. The mature plants set fruit in the form of diminutive pods that contain two small seeds. These seeds can be ground into meal or used in soups, salads, and stews. Lentils come in an intriguing array of colors and patterns (Fig. 3.1), and commercial growers always seek to produce combinations to suit the cuisines of different cultures. But crosses between pure-breeding lines of lentils result in some startling surprises. A cross between pure-breeding tan and pure-breeding gray parents, for example, yields an all-brown F1 generation. When these hybrids self-pollinate, the F2 plants produce not only tan, gray, and brown lentils, but also green.

Beginning in the first decade of the twentieth century, geneticists subjected many kinds of plants and animals to controlled breeding tests, using Mendel's 3:1 phenotypic ratio as a guideline. If the traits under analysis behaved as predicted by Mendel's laws, then they were assumed to be determined by a single gene with alternative dominant and recessive alleles. Many traits, however, did not behave in this way. For some, no definitive dominance and recessiveness could be observed, or more than two alleles could be found in a particular cross. Other traits turned out to be multifactorial, that is, determined by two or more genes, or by the interaction of genes with the environment. The seed color of lentils is a multifactorial trait.

Because such traits arise from an intricate network of interactions, they do not necessarily generate straightforward Mendelian phenotypic ratios. Nonetheless, simple extensions of Mendel's hypotheses can clarify the relationship between genotype and phenotype, allowing explanation of the observed deviations without challenging Mendel's basic laws.
Figure 3.1 Some phenotypic variation poses a challenge to Mendelian analysis. Lentils show complex speckling patterns that are controlled by a gene that has more than two alleles.
One general theme stands out from these breeding studies: To make sense of the enormous phenotypic variation of the living world, geneticists usually try to limit the number of variables under investigation at any one time. Mendel did this by using pure-breeding, inbred strains of peas that differed from each other by one or a few traits, so that the action of single genes could be detected. Similarly, twentieth-century geneticists used inbred populations of fruit flies, mice, and other experimental organisms to study specific traits. Of course, geneticists cannot study people in this way. Human populations are typically far from inbred, and researchers cannot ethically perform breeding experiments on people. As a result, the genetic basis of much human variation remained a mystery. The advent of molecular biology in the 1970s provided new tools that geneticists now use to unravel the genetics of complex human traits, as described later in Chapters 9 and 10.
3.1 Extensions to Mendel for Single-Gene Inheritance

learning objectives
1. Categorize allele interactions as completely dominant, incompletely dominant, or codominant.
2. Recognize progeny ratios that imply the existence of recessive lethal alleles.
3. Predict from the results of crosses whether a gene is polymorphic or monomorphic in a population.
William Bateson, an early interpreter and defender of Mendel, who coined the terms genetics, allelomorph (later shortened to allele), homozygote, and heterozygote, entreated the audience at a 1908 lecture: "Treasure your exceptions! . . . Keep them always uncovered and in sight. Exceptions are like the rough brickwork of a growing building which tells that there is more to come and shows where the next construction is to be."
Consistent exceptions to simple Mendelian ratios revealed unexpected patterns of single-gene inheritance. By distilling the significance of these patterns, Bateson and other early geneticists extended the scope of Mendelian analysis and obtained a deeper understanding of the relationship between genotype and phenotype. We now look at the major extensions to Mendelian analysis elucidated over the last century.
Dominance Is Not Always Complete
A consistent working definition of dominance and recessiveness depends on the F1 hybrids that arise from a mating between two pure-breeding lines. If a hybrid is identical to one parent for the trait under consideration, the allele carried by that parent is deemed dominant to the allele carried by the parent whose trait is not expressed in the hybrid. If, for example, a mating between a pure-breeding white line and a pure-breeding blue line produces F1 hybrids that are white, the white allele of the gene for color is dominant to the blue allele. If the F1 hybrids are blue, the blue allele is dominant to the white one (Fig. 3.2). Mendel described and relied on complete dominance in sorting out his ratios and laws, but it is not the only kind of dominance he observed. Figure 3.2 diagrams two situations in which neither allele of a gene is completely dominant. As the figure shows, crosses between true-breeding strains can produce hybrids with phenotypes that differ from both parents. We now explain how these phenotypes arise.
Figure 3.2 Different dominance relationships. The phenotype of the heterozygote defines the dominance relationship between two alleles of the same gene (here, A1 and A2). Dominance is complete when the hybrid resembles one of the two pure-breeding parents. Dominance is incomplete when the hybrid resembles neither parent; its novel phenotype is usually intermediate. Codominance occurs when the hybrid shows the traits from both pure-breeding parents.
Figure 3.2 panel labels: Type of dominance (complete, complete, incomplete, codominant); A1A1 and A2A2 parents; A1A2 hybrids; "A1 is dominant to A2; A2 is recessive to A1."
Incomplete dominance: The F1 hybrid resembles neither pure-breeding parent
A cross between pure late-blooming and pure early-blooming pea plants results in an F1 generation that blooms in between the two extremes. This is just one of many examples of incomplete dominance, in which the hybrid does not resemble either pure-breeding parent. F1 hybrids that differ from both parents often express a phenotype that is intermediate between those of the pure-breeding parents. Thus, with incomplete dominance, neither parental allele is dominant or recessive to the other; both contribute to the F1 phenotype. Mendel observed plants that bloomed midway between two extremes when he cultivated various types of pure-breeding peas for his hybridization studies, but he did not pursue the implications. Blooming time was not one of the seven characteristics he chose to analyze in detail, almost certainly because in peas, the time of bloom was not as clear-cut as seed shape or flower color.
In many plant species, flower color serves as a striking example of incomplete dominance. With the tubular flowers of four-o'clocks or the floret clusters of snapdragons, for instance, a cross between pure-breeding red-flowered parents and pure-breeding white yields hybrids with pink blossoms, as if a painter had mixed red and white pigments to get pink (Fig. 3.3a). If allowed to self-pollinate, the F1 pink-blooming plants produce F2 progeny bearing red, pink, and white flowers in a ratio of 1:2:1 (Fig. 3.3b). This is the familiar genotypic ratio of an ordinary single-gene F1 self-cross. What is new is that because the heterozygotes look unlike either homozygote, the phenotypic ratios are an exact reflection of the genotypic ratios.
The simplest biochemical explanation for this type of incomplete dominance is that each allele of the gene under analysis specifies an alternative form of a protein molecule with an enzymatic role in red pigment production. The "white" allele does not give rise to a functional enzyme, but the "red" allele does. Thus, in snapdragons and four-o'clocks, two "red" alleles per cell produce a double dose of a red-producing enzyme, which generates enough pigment to make the flowers look fully red. In the heterozygote, one copy of the "red" allele per cell results in only enough pigment to make the flowers look pink. In the homozygote for the "white" allele, where there is no functional enzyme and thus no red pigment, the flowers appear white.
Codominance: The F1 hybrid exhibits traits of both parents
A cross between pure-breeding spotted lentils and pure-breeding dotted lentils produces heterozygotes that are both spotted and dotted (Fig. 3.4a). These F1 hybrids illustrate a second significant departure from complete dominance. They look like both parents, which means that neither the "spotted" nor the "dotted" allele is dominant or recessive to the other. Because both traits show up equally in the heterozygote's phenotype, the alleles are termed codominant. Self-pollination of the spotted/dotted F1 generation generates F2 progeny in the ratio of 1 spotted : 2 spotted/dotted : 1 dotted. The Mendelian 1:2:1 ratio among these F2 progeny establishes that the spotted and dotted traits are determined by alternative alleles of a single gene. Once again, because the heterozygotes can be distinguished from both homozygotes, the phenotypic and genotypic ratios coincide.
In humans, some of the complex membrane-anchored molecules that distinguish different types of red blood cells exhibit codominance. For example, one gene (I) with alleles IA and IB controls the presence of a sugar polymer that protrudes from the red blood cell membrane. Each of the alternative alleles encodes a slightly different form of an enzyme that causes production of a slightly different form of the complex sugar. In heterozygous individuals, the red blood cells carry both the IA-determined and the IB-determined sugars on their surface, whereas the cells of homozygous individuals display the products of either IA or IB alone (Fig. 3.4b). As this example illustrates, when both alleles produce a functional gene product, they are usually codominant for phenotypes analyzed at the molecular level.
Figure 3.2 on p. 46 summarizes the differences between complete dominance, incomplete dominance, and codominance for phenotypes reflected in color variations. Determinations of dominance relationships depend on the phenotype that appears in the F1 generation. With complete dominance, F1 progeny look like one of the true-breeding parents. Complete dominance, as we saw in Chapter 2, results in a 3:1 ratio of phenotypes in the F2. With incomplete dominance, hybrids resemble neither of the parents and thus display neither pure-breeding trait. With codominance, the phenotypes of both pure-breeding lines show up simultaneously in the F1 hybrid. Both incomplete dominance and codominance yield 1:2:1 F2 ratios.
Figure 3.3 Pink flowers are the result of incomplete dominance. (a) Color differences in these snapdragons reflect the activity of one pair of alleles. (b) The F1 hybrids from a cross of pure-breeding red and white strains of snapdragons have pink blossoms. Flower colors in the F2 appear in the ratio of 1 red : 2 pink : 1 white. This ratio signifies that the alleles of a single gene determine these three colors.
(a) Antirrhinum majus (snapdragons)
(b) A Punnett square for incomplete dominance. P: AA × aa. F1 (all identical): Aa. F2: 1 AA (red) : 2 Aa (pink) : 1 aa (white).
Figure 3.4 In codominance, F1 hybrids display the traits of both parents. (a) A cross between pure-breeding spotted lentils and pure-breeding dotted lentils produces heterozygotes that are both spotted and dotted. Each genotype has its own corresponding phenotype, so the F2 ratio is 1:2:1. (b) The IA and IB blood group alleles are codominant because the red blood cells of an IAIB heterozygote have both kinds of sugars at their surface.
(a) Codominant lentil coat patterns. P: CSCS × CDCD. F1 (all identical): CSCD. F2: 1 CSCS (spotted) : 2 CSCD (spotted/dotted) : 1 CDCD (dotted).
(b) Codominant blood group alleles. Red blood cells of blood types A, B, and AB.
Mendel's law of segregation still holds
The dominance relations of a gene's alleles do not affect the alleles' transmission. Whether two alternative alleles of a single gene show complete dominance, incomplete dominance, or codominance depends on the kinds of proteins determined by the alleles and the biochemical function of those proteins in the cell. These same phenotypic dominance relations, however, have no bearing on the segregation of the alleles during gamete formation. As Mendel proposed, cells still carry two copies of each gene, and these copies (a pair of either similar or dissimilar alleles) segregate during gamete formation. Fertilization then restores two alleles to each cell without reference to whether the alleles are the same or different. Variations in dominance relations thus do not detract from Mendel's law of segregation. Rather, they reflect differences in the way gene products control the production of phenotypes, adding a level of complexity to the tasks of interpreting the visible results of gene transmission and inferring genotype from phenotype.
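The point that segregation is the same under every dominance relationship can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the allele names A1 and A2 follow Figure 3.2, the function names and phenotype labels are hypothetical, and in the "complete" case A1 is arbitrarily assumed to be the dominant allele.

```python
from collections import Counter
from itertools import product

def f2_genotypes(f1=("A1", "A2")):
    """Genotype counts from crossing two identical F1 hybrids (a Punnett square)."""
    # Each parent contributes one allele; sorting makes (A2, A1) equal to (A1, A2).
    return Counter(tuple(sorted(pair)) for pair in product(f1, repeat=2))

def phenotype(genotype, mode):
    """Phenotype label under an assumed dominance relationship between A1 and A2."""
    first, second = genotype
    if first == second:
        return f"{first} parental trait"
    if mode == "complete":          # assumption: A1 is the dominant allele
        return "A1 parental trait"
    if mode == "incomplete":
        return "intermediate trait"
    if mode == "codominant":
        return "both parental traits"
    raise ValueError(f"unknown dominance mode: {mode}")

def f2_phenotypes(mode):
    """Phenotype counts in the F2 for the given dominance mode."""
    counts = Counter()
    for genotype, n in f2_genotypes().items():
        counts[phenotype(genotype, mode)] += n
    return dict(counts)

print(dict(f2_genotypes()))
for mode in ("complete", "incomplete", "codominant"):
    print(mode, f2_phenotypes(mode))
```

The genotype counts come out 1:2:1 in every case because the cross itself never consults the dominance mode. Only the mapping from genotype to phenotype changes, which collapses the ratio to 3:1 under complete dominance and leaves it at 1:2:1 otherwise.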
A Gene May Have More Than Two Alleles
Mendel analyzed "either-or" traits controlled by genes with two alternative alleles, but for many traits, there are more than two alternatives. Here, we look at three such traits: human ABO blood types, lentil seed coat patterns, and human histocompatibility antigens.
ABO blood types
If a person with blood type A mates with a person with blood type B, it is possible in some cases for the couple to have a child that is neither A nor B nor AB, but a fourth blood type called O. The reason? The gene for the ABO blood types has three alleles: IA, IB, and i (Fig. 3.5a). Allele IA gives rise to blood type A by specifying an enzyme that adds sugar A; IB results in blood type B by specifying an enzyme that adds sugar B; i does not produce a functional sugar-adding enzyme. Alleles IA and IB are both dominant to i, and blood type O is therefore a result of homozygosity for allele i. Note in Fig. 3.5a that the A phenotype can arise from two genotypes, IAIA or IAi. The same is true for the B blood type, which can be produced by IBIB or IBi. But a combination of the two alleles IAIB generates blood type AB.
We can draw several conclusions from these observations. First, as already stated, a given gene may have more than two alleles, or multiple alleles; in our example, the series of alleles is denoted IA, IB, i. Second, although the ABO blood group gene has three alleles, each person carries only two of the alternatives: IAIA, IBIB, IAIB, IAi, IBi, or ii. There are thus six possible ABO genotypes. Because each individual carries no more than two alleles for each gene, no matter how many alleles there are in a series, Mendel's law of segregation remains intact, because in a sexually reproducing organism, the two alleles of a gene separate during gamete formation.
Third, an allele is not inherently dominant or recessive; its dominance or recessiveness is always relative to a second allele. In other words, dominance relations are unique to a pair of alleles. In our example, IA is completely dominant to i, but it is codominant with IB. Given these dominance relations, the six genotypes possible with IA, IB, and i generate four different phenotypes: blood groups A, B, AB, and O. With this background, you can understand how a type A and a type B parent could produce a type O child: The parents must be IAi and IBi heterozygotes, and the child receives an i allele from each parent.
An understanding of the genetics of the ABO system has had profound medical and legal repercussions. Matching ABO blood types is a prerequisite of successful blood transfusions, because people make antibodies to foreign blood cell molecules. A person whose cells carry only A molecules, for example, produces anti-B antibodies; B people manufacture anti-A antibodies; AB individuals make neither type of antibody; and O individuals produce both anti-A and anti-B antibodies (Fig. 3.5b). These antibodies cause coagulation of cells displaying the foreign molecules (Fig. 3.5c). As a result, people with blood type O have historically been known as universal donors because their red blood cells carry no surface molecules that will stimulate an antibody attack in a transfusion recipient. In contrast, people with blood type AB are considered universal recipients, because they make neither anti-A nor anti-B antibodies, which, if present, would target the surface molecules of incoming blood cells.
Figure 3.5 ABO blood types are determined by three alleles of one gene. (a) Six genotypes produce the four blood group phenotypes. (b) Blood serum contains antibodies against foreign red blood cell molecules. (c) If a recipient's serum has antibodies against the sugars on a donor's red blood cells, the blood types of recipient and donor are incompatible, and coagulation of red blood cells will occur during transfusions. In this table, a plus (+) indicates compatibility, and a minus (-) indicates incompatibility. Antibodies in the donor's blood usually do not cause problems because the amount of transfused antibody is small.
(a) Genotypes | Corresponding phenotypes: type(s) of molecule on cell
IAIA, IAi | A
IBIB, IBi | B
IAIB | AB
ii | O
(b) Blood type | Antibodies in serum
A | Antibodies against B
B | Antibodies against A
AB | No antibodies against A or B
O | Antibodies against A and B
(c) Blood type of recipient | Donor blood type (red cells): A | B | AB | O
A | + | - | - | +
B | - | + | - | +
AB | + | + | + | +
O | - | - | - | +
Information about ABO blood types can also be used as legal evidence in court, to exclude the possibility of paternity or criminal guilt. In a paternity suit, for example, if the mother is type A and her child is type B, logic dictates that the IB allele must have come from the father, whose genotype may be IAIB, IBIB, or IBi. In 1944, the actress Joan Barry (phenotype A) sued Charlie Chaplin (phenotype O) for support of a child (phenotype B) whom she claimed he fathered. The scientific evidence indicated that Chaplin could not have been the father, since he was apparently ii and did not carry an IB allele. This evidence was admissible in court, but the jury was not convinced, and Chaplin had to pay. Today, the molecular genotyping of DNA (DNA fingerprinting; see Chapter 10) provides a powerful tool to help establish paternity, guilt, or innocence, but juries still often find it difficult to evaluate such evidence.
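The reasoning about ABO genotypes and paternity exclusion can be written out as a short Python sketch. The allele spellings "IA", "IB", and "i" and all function names here are assumptions made for illustration. The sketch considers only the ABO gene and ignores new mutations and rare exceptions.

```python
from itertools import product

def abo_phenotype(genotype):
    """Blood type for a pair of alleles drawn from 'IA', 'IB', and 'i'."""
    alleles = set(genotype)
    if {"IA", "IB"} <= alleles:
        return "AB"          # IA and IB are codominant
    if "IA" in alleles:
        return "A"           # IA is dominant to i
    if "IB" in alleles:
        return "B"           # IB is dominant to i
    return "O"               # only ii gives type O

# The six possible genotypes for a gene with three alleles.
GENOTYPES = [("IA", "IA"), ("IA", "i"), ("IB", "IB"),
             ("IB", "i"), ("IA", "IB"), ("i", "i")]

def genotypes_for(blood_type):
    """All genotypes consistent with an observed blood type."""
    return [g for g in GENOTYPES if abo_phenotype(g) == blood_type]

def possible_children(type_1, type_2):
    """Every blood type that parents with these phenotypes could produce."""
    children = set()
    for g1, g2 in product(genotypes_for(type_1), genotypes_for(type_2)):
        # Each parent passes on one of its two alleles.
        for allele_1, allele_2 in product(g1, g2):
            children.add(abo_phenotype((allele_1, allele_2)))
    return children

print(sorted(possible_children("A", "B")))   # type A parent x type B parent
print("B" in possible_children("A", "O"))    # type A mother, type O alleged father
```

The first call lists all four blood types, including O, because IAi and IBi parents are among the possibilities. The second call prints False: a type A mother and a type O man cannot have a type B child, which is the logic behind the blood-type evidence in the Chaplin case.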
Lentil seed coat patterns
Lentils offer another example of multiple alleles. A gene for seed coat pattern has five alleles: spotted, dotted, clear (pattern absent), and two types of marbled. Reciprocal crosses between pairs of pure-breeding lines of all patterns (marbled-1 × marbled-2, marbled-1 × spotted, marbled-2 × spotted, and so forth) have clarified the dominance relations of all possible pairs of the alleles to reveal a dominance series in which alleles are listed in order from most dominant to most recessive. For example, crosses of marbled-1 with marbled-2, or of marbled-1 with spotted or dotted or clear, produce the marbled-1 phenotype in the F1 generation and a ratio of three marbled-1 to one of any of the other phenotypes in the F2. This indicates that the marbled-1 allele is completely dominant to each of the other four alleles. Analogous crosses with the remaining four phenotypes reveal the dominance series shown in Fig. 3.6. Recall that dominance relations are meaningful only when comparing two alleles: An allele, such as marbled-2, can be recessive to a second allele (marbled-1) but dominant to a third and fourth (dotted and clear). The fact that all tested pairings of lentil seed coat pattern alleles yielded a 3:1 ratio in the F2 generation (except for spotted × dotted, which yielded the 1:2:1 phenotypic ratio reflective of codominance) indicates that these lentil seed coat patterns are determined by different alleles of the same gene.
Figure 3.6 How to establish the dominance relations between multiple alleles. Pure-breeding lentils with different seed coat patterns are crossed in pairs, and the F1 progeny are self-fertilized to produce an F2 generation. The 3:1 or 1:2:1 F2 monohybrid ratios from all of these crosses indicate that different alleles of a single gene determine all the traits. The phenotypes of the F1 hybrids establish the dominance relationships (bottom). Spotted and dotted alleles are codominant, but each is recessive to the marbled alleles and is dominant to clear.
Parental seed coat patterns in cross (Parent 1 × Parent 2) | F1 phenotype | Apparent F2 phenotypic ratio
marbled-1 × clear | marbled-1 | 3:1
marbled-2 × clear | marbled-2 | 3:1
spotted × clear | spotted | 3:1
dotted × clear | dotted | 3:1
marbled-1 × marbled-2 | marbled-1 | 3:1
marbled-1 × spotted | marbled-1 | 3:1
marbled-1 × dotted | marbled-1 | 3:1
marbled-2 × dotted | marbled-2 | 3:1
spotted × dotted | spotted/dotted | 1:2:1
Dominance series: marbled-1 > marbled-2 > spotted = dotted > clear
Histocompatibility in humans
In some multiple allelic series, each allele is codominant with every other allele, and every distinct genotype therefore produces a distinct phenotype. This happens particularly with traits defined at the molecular level. An extreme example is the group of three major genes that encode a family of related cell surface molecules in humans and other mammals known as histocompatibility antigens. Carried by all of the body's cells except the red blood cells and sperm, histocompatibility antigens play a critical role in facilitating a proper immune response that destroys intruders (viral or bacterial, for example) while leaving the body's own tissues intact. Because each of the three major histocompatibility genes (called HLA-A, HLA-B, and HLA-C in humans) has between 400 and 1200 alleles, the number of possible allelic combinations creates a powerful potential for the phenotypic variation of cell surface molecules. Other than identical (that is, monozygotic) twins, no two people are likely to carry the same array of histocompatibility antigens on the surfaces of their cells.
Mutations Are the Source of New Alleles
How do the multiple alleles of an allelic series arise? The answer is that chance alterations of the genetic material, known as mutations, arise spontaneously in nature. Once they occur in gamete-producing cells, they are faithfully inherited. Mutations that have phenotypic consequences can be counted, and such counting reveals that they occur at low frequency. The frequency of gametes carrying a new mutation in a particular gene varies anywhere from 1 in 10,000 to 1 in 1,000,000. This range exists because different genes have different mutation rates.
Mutations make it possible to follow gene transmission. If, for example, a mutation specifies an alteration in an enzyme that normally produces yellow so that it now makes green, the new phenotype (green) will make it possible to recognize the new mutant allele. In fact, it takes at least two alleles, that is, some form of variation, to "see" the transmission of a gene. Thus, in segregation studies, geneticists can analyze only genes with variants; they have no way of following a gene that comes in only one form. If all peas were yellow, Mendel would not have been able to decipher the transmission patterns of the gene for the seed color trait. We discuss mutations in greater detail in Chapter 7.
Allele frequencies and monomorphic genes
Because each organism carries two copies of every gene, you can calculate the number of copies of a gene in a given population by multiplying the number of individuals by 2. Each allele of the gene accounts for a percentage of the total number of gene copies, and that percentage is known as the allele frequency. The most common alleles in a population are usually called the wild-type alleles, often designated by a superscript plus sign (+). An allele is considered wild-type if it is present in the population at a frequency greater than 1%. A rare allele in the same population is considered a mutant allele. (A mutation is a newly arisen mutant allele.)
In mice, for example, one of the main genes determining coat color is the agouti gene. The wild-type allele (A) produces fur with each hair having yellow and black bands that blend together from a distance to give the appearance of dark gray, or agouti. Researchers have identified in the laboratory 14 distinguishable mutant alleles for the agouti gene. One of these (at) is recessive to the wild type and gives rise to a black coat on the back and a yellow coat on the belly; another (a) is also recessive to A and produces a pure black coat (Fig. 3.7). In nature, wild-type agoutis (AA) survive to reproduce, while very few black-backed or pure black mutants (atat or aa) do so because their dark coat makes it hard for them to evade the eyes of predators. As a result, A is present at a frequency of much more than 99% and is thus the only wild-type allele in mice for the agouti gene. A gene with only one common, wild-type allele is monomorphic.
Figure 3.7 The mouse agouti gene: One wild-type allele, many mutant alleles. (a) Black-backed, yellow-bellied (top left); black (top right); and agouti (bottom) mice. (b) Genotypes and corresponding phenotypes for alleles of the agouti gene. (c) Crosses between pure-breeding lines reveal a dominance series. Interbreeding of the F1 hybrids (not shown) yields 3:1 phenotypic ratios of F2 progeny, indicating that A, at, and a are in fact alleles of one gene.
(a) Mus musculus (house mouse) coat colors
(b) Alleles of the agouti gene. Genotype | Phenotype
A_ | agouti
atat, ata | black back/yellow belly
aa | black
(c) Evidence for a dominance series
agouti (AA) × black back/yellow belly (atat) yields agouti (Aat)
black back/yellow belly (atat) × black (aa) yields black back/yellow belly (ata)
agouti (AA) × black (aa) yields agouti (Aa)
Dominance series: A > at > a
Allele frequencies and polymorphic genes
In contrast, some genes have more than one common allele, which makes them polymorphic. For example, in the ABO blood type system, all three alleles (IA, IB, and i) have appreciable frequencies in most human populations. Although all three of these alleles can be considered to be wild-type, geneticists instead usually refer to the high-frequency alleles of a polymorphic gene as common variants. Certain rare genes are so polymorphic that hundreds of allelic variants can be found in populations. We have already discussed the case of the HLA histocompatibility genes in humans, which encode cell surface proteins that help the immune system deal with pathogenic invaders such as bacteria and viruses. Some scientists think that evolution favors the emergence of new HLA gene alleles to ensure that no single pathogen among the many to which we are exposed in the environment could destroy the entire human population. That is, at least a few individuals with particular histocompatibility gene alleles would be protected from any given pathogen.
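The definitions of allele frequency, wild-type allele, and monomorphic versus polymorphic gene translate directly into a small calculation. The Python sketch below uses hypothetical function names and invented genotype counts chosen only for illustration. It applies the 1% threshold stated in the text for calling an allele wild-type.

```python
from collections import Counter

def allele_frequencies(genotype_counts):
    """Return each allele's share of all gene copies in a diploid population."""
    copies = Counter()
    for (allele_1, allele_2), n_individuals in genotype_counts.items():
        # Every individual contributes two gene copies, one per allele slot.
        copies[allele_1] += n_individuals
        copies[allele_2] += n_individuals
    total_copies = sum(copies.values())   # equals 2 x the number of individuals
    return {allele: n / total_copies for allele, n in copies.items()}

def is_polymorphic(frequencies, threshold=0.01):
    """True if more than one allele exceeds the 1% wild-type threshold."""
    return sum(f > threshold for f in frequencies.values()) > 1

# Invented counts, for illustration only (not real population data).
mice = {("A", "A"): 9990, ("A", "a"): 10}
people = {("IA", "IA"): 70, ("IA", "i"): 370, ("IB", "i"): 110,
          ("IA", "IB"): 30, ("i", "i"): 420}

for name, population in (("agouti gene", mice), ("ABO gene", people)):
    freqs = allele_frequencies(population)
    label = "polymorphic" if is_polymorphic(freqs) else "monomorphic"
    print(name, freqs, label)
```

With these made-up counts, allele A accounts for 19,990 of 20,000 gene copies in the mouse sample, so the agouti gene comes out monomorphic, whereas all three ABO alleles exceed 1% and the gene is classified as polymorphic.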
One Gene May Contribute to Several Characteristics
Mendel derived his laws from studies in which one gene determined one trait; but, always the careful observer, he himself noted possible departures. In listing the traits selected for his pea experiments, Mendel remarked that specific seed coat colors are always associated with specific flower colors.
The phenomenon of a single gene determining a number of distinct and seemingly unrelated characteristics is known as pleiotropy. Because geneticists now know that each gene determines a specific protein and that each protein can have a cascade of effects on an organism, we can understand how pleiotropy arises. Among the aboriginal Maori people of New Zealand, for example, many of the men develop respiratory problems and are also sterile. Researchers have found that the fault lies with the recessive allele of a single gene. The gene's normal dominant allele specifies a protein necessary for the action of cilia and flagella, both of which are hairlike structures extending from the surfaces of some cells. In men who are homozygous for the recessive allele, cilia that normally clear the airways fail to work effectively, and flagella that normally propel sperm fail to do their job. Thus, one gene determines a protein that affects both respiratory function and reproduction. Because most proteins act in a variety of tissues and influence multiple biochemical processes, mutations in almost any gene may have pleiotropic effects.
Recessive lethal alleles
A significant variation of pleiotropy occurs in alleles that not only produce a visible phenotype but also affect viability. Mendel assumed that all genotypes are equally viable; that is, they have the same likelihood of survival. If this were not true and a large percentage of, say, homozygotes for a particular allele died before germination or birth, you would not be able to count them after birth, and this would alter the 1:2:1 genotypic ratios and the 3:1 phenotypic ratios predicted for the F2 generation.
Consider the inheritance of coat color in mice. As mentioned earlier, wild-type agouti (AA) animals have black hairs with a yellow stripe that appear dark gray to the eye. One of the 14 mutant alleles of the agouti gene gives rise to mice with a much lighter, almost yellow color. When pure-breeding AA mice are mated to yellow mice, one always observes a 1:1 ratio of the two coat colors among the offspring (Fig. 3.8a). From this result, we can draw three conclusions: (1) All yellow mice must carry the agouti allele even though they do not express the agouti phenotype; (2) yellow is therefore dominant to agouti; and (3) all yellow mice are heterozygotes.
Note again that dominance and recessiveness are defined in the context of each pair of alleles. Even though, as previously mentioned, agouti (A) is dominant to the at and a mutations for black coat color, it can still be recessive to the yellow coat color allele. If we designate the allele for yellow as AY, the yellow mice in the preceding cross are AYA heterozygotes, and the agoutis, AA homozygotes. So far, no surprises. But a mating of yellow to yellow produces a skewed phenotypic ratio of two yellow mice to one agouti (Fig. 3.8b). Among these progeny, matings between agouti mice show that the agoutis are all pure-breeding and therefore AA homozygotes as expected. There are, however, no pure-breeding yellow mice among the progeny. When the yellow mice are mated to each other, they unfailingly produce 2/3 yellow and 1/3 agouti offspring, a ratio of 2:1, so they must therefore be heterozygotes. In short, one can never obtain pure-breeding yellow mice.
How can we explain this phenomenon? The Punnett square in Fig. 3.8b suggests an answer. Two copies of the AY allele prove fatal to the animal carrying them, whereas one copy of the allele produces a yellow coat. This means that the AY allele affects two different traits: It is dominant to A in the determination of coat color, but it is recessive to A in the production of lethality. An allele, such as AY, that negatively affects the survival of a homozygote is known as a recessive lethal allele. Note that the same two alleles (AY and A) can display different dominance relationships when looked at from the point of view of different phenotypes; we return later to this important point.
Figure 3.8 AY: A recessive lethal allele that also produces a dominant coat color phenotype. (a) A cross between inbred agouti mice and yellow mice yields a 1:1 ratio of yellow to agouti progeny. The yellow mice are therefore AYA heterozygotes, and for the trait of coat color, AY (for yellow) is dominant to A (for agouti). (b) Yellow mice do not breed true. In a yellow × yellow cross, the 2:1 ratio of yellow to agouti progeny indicates that the AY allele is a recessive lethal.
(a) All yellow mice are heterozygotes. P: AA × AYA. F1: 1 AYA (yellow) : 1 AA (agouti).
(b) Two copies of AY cause lethality. P: AYA × AYA. F1: AYAY (not born); 2 AYA (yellow) : 1 AA (agouti).
Because the AY allele is dominant for yellow coat color, it is easy to detect carriers of this particular recessive lethal allele in mice. Such is not the case, however, for the vast majority of recessive lethal mutations, as these usually do not simultaneously show a visible dominant phenotype for some other trait. Lethal mutations can arise in many different genes, and as a result, most animals, including humans, carry some recessive lethal mutations. Such mutations usually remain "silent," except in rare cases of homozygosity, which in people are often caused by consanguineous matings (that is, matings between close relatives). If a mutation produces an allele that prevents production of a crucial molecule, homozygous individuals will not make any of the vital molecule and will not survive. Heterozygotes, by contrast, with only one copy of the deleterious mutation and one wild-type allele, can produce 50% of the wild-type amount of the normal molecule; this is usually sufficient to sustain normal cellular processes such that life goes on.
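The way a recessive lethal allele turns the expected ratio into 2:1 can be checked with a short enumeration. In this Python sketch the allele label "AY" stands for the yellow allele of the agouti gene, and the function and variable names are hypothetical. It simply drops the lethal genotype from the Punnett square before counting coat colors.

```python
from collections import Counter
from itertools import product

def surviving_progeny(parent_1, parent_2, lethal_genotypes):
    """Count genotypes among live-born progeny of a single-gene cross."""
    counts = Counter()
    for pair in product(parent_1, parent_2):
        genotype = tuple(sorted(pair))      # ("A", "AY") regardless of order
        if genotype not in lethal_genotypes:
            counts[genotype] += 1
    return counts

LETHAL = {("AY", "AY")}                     # AY AY homozygotes are not born
COAT = {("A", "AY"): "yellow",              # AY is dominant to A for coat color
        ("A", "A"): "agouti"}

def coat_counts(parent_1, parent_2):
    """Coat color counts among the surviving progeny of a cross."""
    counts = Counter()
    for genotype, n in surviving_progeny(parent_1, parent_2, LETHAL).items():
        counts[COAT[genotype]] += n
    return dict(counts)

print(coat_counts(("AY", "A"), ("AY", "A")))   # yellow x yellow
print(coat_counts(("AY", "A"), ("A", "A")))    # yellow x agouti
```

The yellow × yellow cross gives two yellow for every agouti because one of the four Punnett square cells never appears among the live-born progeny. The yellow × agouti cross gives equal numbers of the two colors, matching the 1:1 ratio described for Fig. 3.8a.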
Delayed lethality In the preceding discussion, we have described recessive alleles that result in the death of homozygotes prenatally; that is, in utero. With some mutations, however, homozygotes may survive beyond birth and die later from the deleterious consequences of the genetic defect. An example is seen in human infants with Tay-Sachs disease. The seemingly normal newborns remain healthy for five to six months but then develop blindness, paralysis, mental retardation, and other symptoms of a deteriorating nervous system; the disease usually proves fatal by the age of six. Tay-Sachs disease results from the absence of an active lysosomal enzyme called hexosaminidase A, leading to the accumulation of a toxic waste product inside nerve cells' The approximate incidence of Tay-Sachs among live births is 1/35,000 worldwide, but it is 1/3000 among Jewish people of Eastern European descent. Reliable tests that detect carriers, in combination with genetic counseling and educational programs, have all but eliminated the disease in the United States.
Recessive alleles causing prenatal or early childhood lethality can only be passed on to subsequent generations by heterozygous carriers, because affected homozygotes die before they can mate. However, for late-onset diseases causing death in adults, homozygous patients can pass on the lethal allele before they become debilitated. An example is provided by the degenerative disease Friedreich ataxia: Some homozygotes first display symptoms of ataxia (loss of muscle coordination) at age 30-35 and die about five years later from heart failure. Dominant alleles causing late-onset lethality can also be transmitted to subsequent generations; Figure 2.22 on p. 32 illustrates this fact for the inheritance of Huntington disease. By contrast, if the lethality caused by a dominant allele occurs instead during fetal development or early childhood, the allele will not be passed on, so all dominant early lethal mutant alleles must be new mutations.
Table 3.1 summarizes Mendel's basic assumptions about dominance, the number and viability of one gene's alleles, and the effects of each gene on phenotype, and then compares these assumptions with the extensions contributed by his twentieth-century successors. Through carefully controlled monohybrid crosses, these later geneticists analyzed the transmission patterns of the alleles of single genes, challenging and then confirming the law of segregation.
A Comprehensive Example: Sickle-Cell Disease Illustrates Many Extensions to Mendel's Analysis
Sickle-cell disease is the result of a faulty hemoglobin molecule. Hemoglobin is composed of two types of polypeptide chains, alpha (α)-globin and beta (β)-globin, each specified by a different gene: Hbα for α-globin and Hbβ for β-globin. Normal red blood cells are packed full of millions upon millions of hemoglobin molecules, each of which picks up oxygen in the lungs and transports it to all the body's tissues.
Table 3.1
What Mendel Described | Extension | Extension's Effect on Heterozygous Phenotype | Extension's Effect on Ratios Resulting from an F1 x F1 Cross
Complete dominance | Incomplete dominance; Codominance | Unlike either homozygote | Phenotypes coincide with genotypes in a ratio of 1:2:1
Two alleles | Multiple alleles | Multiplicity of phenotypes | A series of 3:1 ratios
All alleles are equally viable | Recessive lethal alleles | Heterozygotes survive but may have visible phenotypes | 2:1 instead of 3:1
One gene determines one trait | Pleiotropy: One gene influences several traits | Several traits affected in different ways, depending on dominance relations | Different ratios, depending on dominance relations for each affected trait
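The 1:2:1 and 2:1 entries of Table 3.1 can be checked by enumerating a cross between two heterozygotes. The following is a minimal sketch, not part of the original text; the allele symbols A and a and the function name `monohybrid_f2` are hypothetical, chosen only for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def monohybrid_f2(lethal_genotype=None):
    """Genotype proportions among surviving progeny of an Aa x Aa cross.

    lethal_genotype: optional genotype (for example "aa") that dies
    before it can be scored, as with a recessive lethal allele.
    """
    gametes = ["A", "a"]
    # Each parent contributes one gamete; sort so "aA" and "Aa" are pooled.
    counts = Counter("".join(sorted(pair)) for pair in product(gametes, repeat=2))
    if lethal_genotype is not None:
        counts.pop(lethal_genotype, None)
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


all_viable = monohybrid_f2()          # 1 AA : 2 Aa : 1 aa
with_lethal = monohybrid_f2("aa")     # 2 Aa : 1 AA among survivors
```

With incomplete dominance or codominance each genotype has its own phenotype, so the 1:2:1 genotypic proportions are also the phenotypic proportions. When one homozygous class dies, two-thirds of the survivors are heterozygotes, which is the 2:1 ratio in the table.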
Multiple alleles
The β-globin gene has a normal wild-type allele (Hbβ^A) that gives rise to fully functional β-globin, as well as close to 400 mutant alleles that have been identified so far. Some of these mutant alleles result in the production of hemoglobin that carries oxygen only inefficiently. Other mutant alleles prevent the production of β-globin, causing a hemolytic (blood-destroying) disease called β-thalassemia. Here, we discuss the most common mutant allele of the β-globin gene, Hbβ^S, which specifies an abnormal polypeptide that causes sickling of red blood cells (Fig. 3.9a).
Pleiotropy
The Hbβ^S allele of the β-globin gene affects more than one trait (Fig. 3.9b). Hemoglobin molecules in the red blood cells of homozygous Hbβ^S Hbβ^S individuals undergo an aberrant transformation after releasing their oxygen. Instead of remaining soluble in the cytoplasm, they aggregate to form long fibers that deform the red blood cell from a normal biconcave disk to a sickle shape (see Fig. 3.9a). The deformed cells clog small blood vessels, reducing oxygen flow to the tissues and giving rise to muscle cramps, shortness of breath, and fatigue. The sickled cells are also fragile and easily broken. Consumption of fragmented cells by phagocytic white blood cells leads to a low red blood cell count, a condition called anemia.
On the positive side, Hbβ^S Hbβ^S homozygotes are resistant to malaria, because the organism that causes the disease, Plasmodium falciparum, can multiply rapidly in normal red blood cells, but cannot do so in cells that sickle. Infection by P. falciparum causes sickle-shaped cells to break down before the malaria organism has a chance to multiply.
Recessive lethality
People who are homozygous for the recessive Hbβ^S allele often develop heart failure because of stress on the circulatory system. Many sickle-cell sufferers die in childhood, adolescence, or early adulthood.
Figure 3.9 Pleiotropy of sickle-cell anemia: Dominance relations vary with the phenotype under consideration. (a) A normal red blood cell (top) is easy to distinguish from the sickled cell in the scanning electron micrograph at the bottom. (b) Different levels of analysis identify various phenotypes. Dominance relationships between the Hbβ^S and Hbβ^A alleles of the Hbβ gene vary with the phenotype and sometimes even change with the environment.
(b) Phenotypes at Different Levels of Analysis, listed for Normal (Hbβ^A Hbβ^A), Carrier (Hbβ^A Hbβ^S), and Diseased (Hbβ^S Hbβ^S) individuals, with the Dominance Relations at Each Level of Analysis:
β-globin polypeptide production: Hbβ^A and Hbβ^S are codominant.
Red blood cell shape at sea level: normal; normal; sickled cells present. Hbβ^A is dominant, Hbβ^S is recessive.
Red blood cell concentration at sea level: normal; normal; lower. Hbβ^A is dominant, Hbβ^S is recessive.
Red blood cell shape at high altitudes: normal; sickled cells present; severe sickling. Hbβ^A and Hbβ^S show incomplete dominance.
Red blood cell concentration at high altitudes: normal; lower; very low, anemia. Hbβ^A and Hbβ^S show incomplete dominance.
Susceptibility to malaria: normal susceptibility; resistant; resistant. Hbβ^S is dominant, Hbβ^A is recessive.
Different dominance relations
Comparisons of heterozygous carriers of the sickle-cell allele (individuals whose cells contain one Hbβ^A and one Hbβ^S allele) with homozygous Hbβ^A Hbβ^A (normal) and homozygous Hbβ^S Hbβ^S (diseased) individuals make it possible to distinguish different dominance relationships for different phenotypic aspects of sickle-cell anemia (Fig. 3.9b). At the molecular level (the production of β-globin), both alleles are expressed such that Hbβ^A and Hbβ^S are codominant. At the cellular level, in their effect on red blood cell shape, the Hbβ^A and Hbβ^S alleles show either complete dominance or codominance depending on altitude. Under normal oxygen conditions, the great majority of a heterozygote's red blood cells have the normal biconcave shape (Hbβ^A is dominant to Hbβ^S). When oxygen levels drop, however, sickling occurs in some Hbβ^A Hbβ^S cells (Hbβ^A and Hbβ^S are codominant). During World War II, certain soldiers who were heterozygous carriers and who were airlifted in transport planes to cross the Pacific experienced sickling crises for this reason. Considering the trait of resistance to malaria, the Hbβ^S allele is dominant to the Hbβ^A allele. The reason is that Hbβ^A Hbβ^S cells are resistant to malaria because they break down before the malarial organism has a chance to reproduce, just like the Hbβ^S Hbβ^S cells described previously. But luckily for the heterozygote, for the phenotypes of anemia or death, Hbβ^S is recessive to Hbβ^A. A corollary of this observation is that in its effect on general health under normal environmental conditions and its effect on red blood cell count, the Hbβ^A allele is dominant to Hbβ^S.
Thus, for the β-globin gene, as for other genes, dominance and recessiveness are not an inherent quality of alleles in isolation; rather, they are specific to each pair of alleles and to the level of physiology at which the phenotype is examined. When discussing dominance relationships, it is therefore essential to define the particular phenotype under analysis.
The complicated dominance relationships between the Hbβ^A and Hbβ^S alleles help explain the puzzling observation that the normally deleterious allele Hbβ^S is widespread in certain populations. In areas where malaria is endemic, heterozygotes are better able to survive and pass on their genes than are either type of homozygote. Hbβ^S Hbβ^S individuals often die of sickle-cell disease, while those with the genotype Hbβ^A Hbβ^A often die of malaria. Heterozygotes, however, are relatively immune to both conditions, so high frequencies of both alleles persist in tropical environments where malaria is found. We explore this phenomenon in more quantitative detail in Chapter 20 on population genetics.
essential concepts
• Two alleles of a single gene may exhibit complete dominance, in which heterozygotes resemble the homozygous dominant parent; incomplete dominance, in which heterozygotes have an intermediate phenotype; and codominance, in which heterozygotes display aspects of each homozygous phenotype.
• New alleles of a gene arise by mutation. Alleles with a frequency greater than 1% in a population are wild-type; alleles that are less frequent are mutant.
• When two or more wild-type alleles (common variants) exist for a gene, the gene is polymorphic; a gene with only one wild-type allele is monomorphic.
• In pleiotropy, one gene contributes to multiple traits. The dominance relationship between any two alleles can vary depending on the trait.
• Homozygotes for a recessive lethal allele that fails to provide an essential function will die. If a recessive lethal allele has dominant effects on a visible phenotype, two-thirds of the surviving progeny of a cross between heterozygotes will display this phenotype.
3.2 Extensions to Mendel for Multifactorial Inheritance
learning objectives
1. Conclude from the results of crosses whether a single gene or multiple genes control a trait.
2. Infer from the results of crosses the existence of interactions between alleles of different genes including: complementation, epistasis, and redundancy.
3. Recognize when a given genotype in different individuals does not correspond to the same phenotype.
4. Explain how continuous traits, like human height and skin color, are controlled by multiple alleles of multiple genes.
Although some traits are indeed determined by allelic variations of a single gene, the vast majority of common traits in all organisms are multifactorial, arising from the action of two or more genes, or from interactions between genes and the environment. In genetics, the term environment has an unusually broad meaning that encompasses all aspects of the outside world an organism comes into contact with. These include temperature, diet, and exercise as well as the uterine environment before birth. In this section, we examine how geneticists used breeding experiments and the guidelines of Mendelian ratios to analyze the complex network of interactions that give rise to multifactorial traits.
Two Genes Can Interact to Determine One Trait
Two genes can interact in several ways to determine a single trait, such as the color of a flower, a seed coat, a chicken's feathers, a dog's fur, or the shape of a plant's leaves. In a dihybrid cross like Mendel's, each type of interaction produces its own signature of phenotypic ratios. In the following examples, the alternate alleles of each of the two genes are completely dominant (such as A and B) and recessive (a and b). For simplicity, we sometimes refer to a gene name using the symbol for the dominant allele, for example, gene A. In addition, we refer to the protein product of allele A as protein A (no italics), and when appropriate, that of allele a as protein a (no italics).
Novel phenotypes resulting from gene interactions
In the chapter opening, we described a mating of tan and gray lentils that produced a uniformly brown F1 generation and then an F2 generation containing brown, tan, gray, and green lentil seeds. An understanding of how this can happen emerges from experimental results demonstrating that the ratio of the four F2 colors is 9 brown : 3 tan : 3 gray : 1 green (Fig. 3.10a). Recall from Chapter 2 that this is the same ratio Mendel observed in his analysis of the F2 generations from dihybrid crosses following two independently assorting genes. In Mendel's studies, each of the four classes consisted of plants that expressed a combination of two unrelated traits. With lentils, however, we are looking at a single trait: seed color. The simplest explanation for the parallel ratios is that a combination of genotypes at two independently assorting genes interacts to produce the phenotype of seed color in lentils.
Results obtained from self-crosses with the various types of F2 lentil plants support this explanation. Self-crosses of F2 green individuals show that they are pure-breeding, producing an F3 generation that is entirely green. Tan individuals generate either all tan offspring, or a mixture of tan offspring and green offspring. Grays similarly produce either all gray, or gray and green. Self-crosses of brown F2 individuals can have four possible outcomes: all brown, brown plus tan, brown plus gray, or all four colors (Fig. 3.10b). The two-gene hypothesis explains why there is only one green genotype: pure-breeding aa bb, but two types of tans: pure-breeding AA bb as well as tan- and green-producing Aa bb, and two types of grays: pure-breeding aa BB and gray- and green-producing aa Bb, yet four types of browns: true-breeding AA BB, brown- and tan-producing AA Bb, brown- and gray-producing Aa BB, and Aa Bb dihybrids that give rise to plants producing lentils of all four colors.
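The genotype bookkeeping behind the two-gene hypothesis can be reproduced by enumerating the 16 gamete combinations of an Aa Bb x Aa Bb cross. This is an illustrative sketch rather than anything from the original text; the function names `lentil_color` and `dihybrid_f2` are hypothetical, and the color rule simply encodes the hypothesis as stated (A- B- brown, A- bb tan, aa B- gray, aa bb green).

```python
from collections import Counter, defaultdict
from itertools import product


def lentil_color(gene_a, gene_b):
    """Seed color under the two-gene hypothesis with complete dominance."""
    has_a = "A" in gene_a
    has_b = "B" in gene_b
    if has_a and has_b:
        return "brown"
    if has_a:
        return "tan"
    if has_b:
        return "gray"
    return "green"


def dihybrid_f2():
    """Count F2 genotypes from selfing an Aa Bb dihybrid."""
    gametes = [a + b for a, b in product("Aa", "Bb")]  # AB, Ab, aB, ab
    counts = Counter()
    for g1, g2 in product(gametes, repeat=2):
        gene_a = "".join(sorted(g1[0] + g2[0]))  # "AA", "Aa", or "aa"
        gene_b = "".join(sorted(g1[1] + g2[1]))  # "BB", "Bb", or "bb"
        counts[(gene_a, gene_b)] += 1
    return counts


f2 = dihybrid_f2()

# Phenotypic ratio: sum the genotype counts within each color class.
phenotype_counts = Counter()
genotypes_per_color = defaultdict(set)
for (gene_a, gene_b), n in f2.items():
    color = lentil_color(gene_a, gene_b)
    phenotype_counts[color] += n
    genotypes_per_color[color].add((gene_a, gene_b))
```

The tallies give 9 brown : 3 tan : 3 gray : 1 green, and the sets in `genotypes_per_color` contain four brown genotypes, two tan, two gray, and one green, matching the counts in the paragraph above.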
Figure 3.10 How two genes interact to produce seed colors in lentils. (a) In a cross of pure-breeding tan and gray lentils, all the F1 hybrids are brown, but four different phenotypes appear among the F2 progeny. The 9:3:3:1 ratio of F2 phenotypes suggests that seed coat color is determined by two independently assorting genes. (b) Expected results of selfing individual F2 plants of the indicated phenotypes to produce an F3 generation, if seed coat color results from the interaction of two genes. The third column shows the proportion of the F2 population that would be expected to produce the observed F3 phenotypes. (c) Other two-generation crosses involving pure-breeding parental lines also support the two-gene hypothesis. In this table, the F1 hybrid generation has been omitted.
(a) A dihybrid cross with lentil coat colors
P: AA bb (tan) x aa BB (gray). Gametes: Ab and aB. F1 (all identical): Aa Bb (brown). F2: 9 A- B- (brown) : 3 A- bb (tan) : 3 aa B- (gray) : 1 aa bb (green).
(b) Self-pollination of the F2 to produce an F3
Phenotypes of F2 Individual | Observed F3 Phenotypes | Expected Proportion of F2 Population*
Green | Green | 1/16
Tan | Tan | 1/16
Tan | Tan, green | 2/16
Gray | Gray, green | 2/16
Gray | Gray | 1/16
Brown | Brown | 1/16
Brown | Brown, tan | 2/16
Brown | Brown, gray | 2/16
Brown | Brown, gray, tan, green | 4/16
*This 1:1:2:2:1:1:2:2:4 F2 genotypic ratio corresponds to a 9 brown : 3 tan : 3 gray : 1 green F2 phenotypic ratio.
(c) Sorting out the dominance relations by select crosses
Seed Coat Color of Pure-Breeding Parents | F2 Phenotypes and Frequencies | Ratio
Tan x green | 231 tan, 85 green | 3:1
Gray x green | 2586 gray, 867 green | 3:1
Brown x gray | 964 brown, 312 gray | 3:1
Brown x tan | 255 brown, 76 tan | 3:1
Brown x green | 57 brown, 18 gray, 13 tan, 4 green | 9:3:3:1
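The expected proportions in Fig. 3.10b follow from selfing each F2 genotype and recording which colors appear among its F3 offspring. The sketch below works under the two-gene hypothesis as stated in the text; the helper names `color`, `cross`, and `f3_table` are hypothetical.

```python
from collections import Counter
from itertools import product


def color(gene_a, gene_b):
    """Seed color: A- B- brown, A- bb tan, aa B- gray, aa bb green."""
    has_a, has_b = "A" in gene_a, "B" in gene_b
    if has_a and has_b:
        return "brown"
    return "tan" if has_a else ("gray" if has_b else "green")


def cross(parent1, parent2):
    """Offspring genotype counts; a genotype is a pair such as ("Aa", "Bb")."""
    def gametes(parent):
        return [a + b for a, b in product(parent[0], parent[1])]

    offspring = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        gene_a = "".join(sorted(g1[0] + g2[0]))
        gene_b = "".join(sorted(g1[1] + g2[1]))
        offspring[(gene_a, gene_b)] += 1
    return offspring


f1 = ("Aa", "Bb")
f2 = cross(f1, f1)

# Key: (F2 color, set of F3 colors seen on selfing); value: sixteenths of the F2.
f3_table = Counter()
for genotype, n in f2.items():
    f3_colors = frozenset(color(*g) for g in cross(genotype, genotype))
    f3_table[(color(*genotype), f3_colors)] += n
```

Each key of `f3_table` corresponds to one row of Fig. 3.10b; for example, brown F2 plants whose F3 shows all four colors account for 4/16 of the F2.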
In short, for the two genes that determine seed color, both dominant alleles must be present to yield brown (A- B-); the dominant allele of one gene produces tan (A- bb); the dominant allele of the other specifies gray (aa B-); and the complete absence of dominant alleles (that is, the double recessive) yields green (aa bb). Thus, the four color phenotypes arise from four genotypic classes, with each class defined in terms of the presence or absence of the dominant alleles of two genes: (1) both present (A- B-), (2) one present (A- bb), (3) the other present (aa B-), and (4) neither present (aa bb). Note that the A- notation means that the second allele of this gene can be either A or a, while B- denotes a second allele of either B or b. Note also that only with a two-gene system in which the dominance and recessiveness of alleles at both genes is complete can the nine different genotypes of the F2 generation be categorized into the four phenotypic classes described. With incomplete dominance or codominance, the F2 genotypes could not be grouped together in this simple way, as they would give rise to more than four phenotypes.
Further crosses between plants carrying lentils of different colors confirmed the two-gene hypothesis (Fig. 3.10c). Thus, the 9:3:3:1 phenotypic ratio of brown to tan to gray to green in an F2 descended from pure-breeding tan and pure-breeding gray lentils tells us not only that two genes assorting independently interact to produce the seed color, but also that each genotypic class (A- B-, A- bb, aa B-, and aa bb) determines a particular phenotype.
The genes controlling lentil seed color have not been identified at the molecular level, and the biochemical pathways in which they function are not known. However, information available about the mechanisms of seed color inheritance in other plant species allows us to formulate a plausible model for this system in lentils (Fig. 3.11). The model illustrates the important point that the 9:3:3:1 ratio generally implies that the two genes controlling the same trait not only assort independently, but they also function in independent biochemical pathways.
Figure 3.11 A biochemical model for the inheritance of lentil seed colors. The seed has an opaque outer layer (the seed coat) and an inner layer (the cotyledon). The green chlorophyll in the cotyledon is not visible if the seed coat is colored. Allele A encodes enzyme A. Allele a does not produce this enzyme. Allele B of a second gene encodes a different enzyme; b produces none of this enzyme. Seeds appear brown if the tan and gray pigments are both present. In the absence of both enzymes (aa bb), the seed coat is unpigmented, so the green chlorophyll in the cotyledon will show through. The 9:3:3:1 ratio implies that the A and B genes operate in independent biochemical pathways.
[Diagram: in AA or Aa plants, enzyme A converts colorless precursor 1 to tan pigment; in aa plants, no tan pigment is made. In BB or Bb plants, enzyme B converts colorless precursor 2 to gray pigment; in bb plants, which have no enzyme B, no gray pigment is made.]
Complementary gene action
In some two-gene interactions, the four F2 genotypic classes produce fewer than four observable phenotypes, because some of the phenotypes include two or more genotypic classes. For example, in the first decade of the twentieth century, William Bateson conducted a cross between two lines of pure-breeding white-flowered sweet peas (Fig. 3.12). Quite unexpectedly, all of the F1 progeny were purple. Self-pollination of these novel hybrids produced a ratio of 9 purple : 7 white in the F2 generation. The explanation? Two genes work in tandem to produce purple sweet-pea flowers, and a dominant allele of each gene must be present to produce that color.
A simple biochemical hypothesis for this type of complementary gene action is shown in Fig. 3.13. Because it takes two enzymes catalyzing two separate biochemical reactions to change a colorless precursor into a colorful pigment, only the A- B- genotypic class, which produces active forms of both required enzymes, can generate colored flowers.
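The 9:7 ratio is the 9:3:3:1 ratio with the three classes lacking one or both dominant alleles pooled as white. A minimal sketch follows, assuming (as a labeled assumption, since the cross diagram is not reproduced here) that the two white parental lines are AA bb and aa BB; the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def flower_color(gene_a, gene_b):
    """Purple requires at least one dominant allele of each gene."""
    return "purple" if "A" in gene_a and "B" in gene_b else "white"


def cross(parent1, parent2):
    """Offspring genotype counts; a genotype is a pair such as ("Aa", "Bb")."""
    def gametes(parent):
        return [a + b for a, b in product(parent[0], parent[1])]

    offspring = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        gene_a = "".join(sorted(g1[0] + g2[0]))
        gene_b = "".join(sorted(g1[1] + g2[1]))
        offspring[(gene_a, gene_b)] += 1
    return offspring


# Assumed parental genotypes: two different white lines, AA bb and aa BB.
white_line_1 = ("AA", "bb")
white_line_2 = ("aa", "BB")

f1 = cross(white_line_1, white_line_2)
f1_colors = {flower_color(*g) for g in f1}      # every F1 plant is Aa Bb

f2_colors = Counter()
for genotype, n in cross(("Aa", "Bb"), ("Aa", "Bb")).items():
    f2_colors[flower_color(*genotype)] += n     # sixteenths of the F2
```

Both parents are white because each lacks one of the two required dominant alleles, yet every F1 plant carries both and is purple; selfing the F1 then yields 9 purple : 7 white.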
Figure 3.12 Complementary gene action generates color in sweet peas. (a) Lathyrus odoratus (sweet peas): white and purple sweet pea flowers. (b) A dihybrid cross involving complementary gene action. The 9:7 ratio of purple to white F2 plants indicates that at least one dominant allele of each gene is necessary for the development of purple color.
Table 3.3
Gene | Dominance Series | Coat Color and Pattern Phenotypes of Alleles
Gene E | E^m > E > e | E^m: Black mask on fawn or brindle. E: Eumelanin (dark) and pheomelanin (yellow) pigments. e: Only pheomelanin (cream, tan, red).
Gene B | B > b | B: Black: eumelanin deposited densely. b: Brown: eumelanin deposited less densely.
Gene A | A^y > a^w > a^t > a | A^y: Fawn (lots of light pigment on hair). a^w: Agouti (light stripe on dark hair). a^t: Tan belly (only hairs on belly have some light pigment). a: Black or brown (no light stripe on hairs).
Gene K | K^B > k^br > k^y | K^B: Solid color. k^br: Brindled. k^y: Gene A markings expressed normally.
Gene D | D > d | D: Colors not dilute. d: Colors dilute.
Gene S | S > s^p | S: No white markings. s^p: Colored patches on white background.
Gene M | M^1 = M^2 | M^1: Coat color diluted (homozygote has various health problems). M^2: Normal color.
pigment called pheomelanin in each hair, and also where on the body light-colored hairs appear. Gene A specifies a protein that antagonizes the function of protein E; in the absence of protein E activity, pigment production switches from eumelanin to pheomelanin. In dogs, four alleles of the A gene form a dominance series (Table 3.3). The different alleles direct the pigment switch with different efficiencies or in different parts of the body. Unlike mice, dogs may be homozygous A^y A^y, and they are an overall light brown color called fawn, because the hairs contain a lot of pheomelanin in addition to eumelanin. The a^w allele (like the A allele in mice) gives the dogs an overall gray (agouti) color. In an agouti dog (or mouse), the hairs are mainly black with a single yellow stripe. The a^t allele results in lighter hair on the belly and solid dark hairs on the back. The a allele results in no pheomelanin in the hair at all. Interestingly, ee (with no eumelanin) is epistatic to A^y, a^w, and a^t because these A gene phenotypes require the hair to have some dark pigment. Dogs that are aa ee are white, as neither eumelanin nor pheomelanin is produced.
Three alleles of the K gene determine whether or not the A gene markings are visible, and we will consider two of them here. Gene K encodes a protein required for protein E to work. The K^B allele specifies a version of protein K that causes protein E to make eumelanin all of the time, so that pheomelanin is never produced regardless of whether or not protein A is present. If an E- dog also has a K^B allele, the dog is solid black regardless of its A gene alleles: K^B is epistatic to all alleles of gene A. In contrast, the protein made by the k^y allele allows protein A to inhibit protein E sometimes, permitting pheomelanin production. As a result, k^y k^y homozygotes express the A gene markings normally.
Figure 3.29 Dog color pattern is a polygenic trait. Three kinds of color patterns are shown along with the major alleles that determine the phenotype.
Gene D controls deposition of all pigments
Recessive alleles of gene D (dd) dilute the colors dictated by the other genes, while the dominant allele D does not. For example, an E- B- dd dog would be light black, and an E- bb dd dog would be light brown.
Genes S and M control spotting
Dogs homozygous for the recessive allele of gene S (that is, s^p s^p) are white with areas of color (controlled by other genes) that appear as large spots; this pattern is called piebald (Fig. 3.29). As long as a dog has one dominant S allele, it will not be piebald. A second gene called M controls the patterning of pigmentation, and it has codominant alleles M^1 and M^2. M^1 M^2 heterozygotes, called merle dogs, have patches of diluted color (the M^1 phenotype) and patches of normal color (the M^2 phenotype) (Fig. 3.29). Breeders would never mate two merle dogs because the M gene is pleiotropic. The M^1 allele has recessive deleterious effects: So-called double merle dogs (M^1 M^1) usually have serious health problems, including defects in hearing and vision.
This example of coat color in dogs gives some idea of the potential for variation from just half of the genes known to affect this phenotype. Amazingly, this is just the tip of the iceberg. When you realize that both dogs and humans carry roughly 25,000 genes, the number of interactions that connect the various alleles of these genes in the expression of phenotype is in the millions, if not the billions. The potential for variation and diversity among individuals is staggering indeed.
essential concepts
• Two or more genes may interact to affect a single trait; these interactions may be detected by ratios that can be predicted from Mendelian principles.
• Retention of the 9:3:3:1 phenotypic ratio usually indicates that two genes function in independent pathways.
• In complementary gene action, dominant and normally functioning alleles of two or more genes are required to generate a normal phenotype.
• In epistasis, an allele at one gene can hide traits otherwise caused by alleles at another gene.
• When genes are redundant for a trait, one dominant and normally functioning allele of either gene is sufficient to generate the normal phenotype.
• In incomplete penetrance, a phenotype is expressed in fewer than 100% of individuals having the same genotype. In variable expressivity, a phenotype is expressed at different levels among individuals with the same genotype.
• A continuous trait can have any value of expression between two extremes. Most traits of this type are polygenic, that is, determined by the interactions of multiple genes.
Part of Mendel's genius was to look at the genetic basis of variation through a very narrow window, focusing his first glimpse of the mechanisms of inheritance on simple yet fundamental phenomena. Mendel worked on just a handful of traits in inbred populations of one species. For each trait, he manipulated one gene with one completely dominant and one recessive allele that determined two distinguishable, or discontinuous, phenotypes. Both the dominant and recessive alleles showed complete penetrance and negligible differences in expressivity.
In the first few decades of the twentieth century, many questioned the general applicability of Mendelian analysis, for it seemed to shed little light on the complex inheritance patterns of most plant and animal traits or on the mechanisms producing continuous variation. Simple embellishments, however, clarified the genetic basis of continuous variation and provided explanations for other apparent exceptions to Mendelian analysis as described in this chapter. Each embellishment extends the range of Mendelian analysis and deepens our understanding of the genetic basis of variation. And no matter how broad the view, Mendel's basic conclusions, embodied in his first law of segregation, remain valid.
But what about Mendel's second law that genes assort independently? As it turns out, its application is not as universal as that of the law of segregation. Many genes do assort independently, but some do not; rather, they appear to be linked and transmitted together from generation to generation. An understanding of this fact emerged from studies that located Mendel's hereditary units, the genes, in specific cellular organelles, the chromosomes. In describing how researchers deduced that genes travel with chromosomes, Chapter 4 establishes the physical basis of inheritance, including the segregation of alleles, and clarifies why some genes assort independently while others do not.
Solved Problems
I. Imagine you purchased an albino mouse (genotype cc) in a pet store. The c allele is epistatic to other coat color genes. How would you go about determining the genotype of this mouse at the brown locus? (In pigmented mice, BB and Bb are black, bb is brown.)
Answer
This problem requires an understanding of gene interactions, specifically epistasis. You have been placed in the role of experimenter and need to design crosses that will answer the question. To determine the alleles of the B gene present, you need to eliminate the blocking action of the cc genotype. Because only the recessive c allele is epistatic, when a C allele is present, no epistasis will occur. To introduce a C allele during the mating, the test mouse you mate to your albino can have the genotype CC or Cc. (If the mouse is Cc, half of the progeny will be albino and will not contribute useful information, but the nonalbinos from this cross would be informative.) What alleles of the B gene should the test mouse carry? To make this decision, work through the expected results using each of the possible genotypes.
Test mouse genotype | Albino mouse | Expected progeny
BB | x BB | all black
BB | x Bb | all black
BB | x bb | all black
Bb | x BB | all black
Bb | x Bb | 3/4 black, 1/4 brown
Bb | x bb | 1/2 black, 1/2 brown
bb | x BB | all black
bb | x Bb | 1/2 black, 1/2 brown
bb | x bb | all brown
From these hypothetical crosses, you can see that a test mouse with either the Bb or bb genotype would yield distinct outcomes for each of the three possible albino mouse genotypes. However, a bb test mouse would be more useful and less ambiguous. First, it is easier to identify a mouse with the bb genotype because a brown mouse must have this homozygous recessive genotype. Second, the results are completely different for each of the three possible genotypes when you use the bb test mouse. (In contrast, a Bb test mouse would yield both black and brown progeny whether the albino mouse was Bb or bb; the only distinguishing feature is the ratio.) To determine the full genotype of the albino mouse, you should cross it to a brown mouse (which could be CC bb or Cc bb).
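The table of hypothetical crosses in Solved Problem I can be generated by pairing one B-gene allele from each parent. The sketch below considers only the pigmented (non-albino) progeny; the function names `progeny_colors` and `distinguishes_all` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def progeny_colors(test_mouse, albino):
    """Expected coat colors among pigmented progeny.

    Both arguments are B-gene genotypes: "BB", "Bb", or "bb".
    """
    counts = Counter()
    for allele1, allele2 in product(test_mouse, albino):
        # B is completely dominant: one B allele is enough for black.
        counts["black" if "B" in (allele1, allele2) else "brown"] += 1
    total = sum(counts.values())
    return {coat: Fraction(n, total) for coat, n in counts.items()}


def distinguishes_all(test_mouse):
    """True if the colors present (ignoring ratios) differ for every albino genotype."""
    outcomes = [frozenset(progeny_colors(test_mouse, albino))
                for albino in ("BB", "Bb", "bb")]
    return len(set(outcomes)) == 3


table = {(test, albino): progeny_colors(test, albino)
         for test in ("BB", "Bb", "bb")
         for albino in ("BB", "Bb", "bb")}
```

Only the bb test mouse gives a qualitatively different set of progeny colors for each albino genotype, which is why the answer favors it over a Bb test mouse.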
II. In a particular kind of wildflower, the wild-type flower color is deep purple, and the plants are true-breeding. In one true-breeding mutant stock, the flowers have a reduced pigmentation, resulting in a lavender color. In a different true-breeding mutant stock, the flowers have no pigmentation and are thus white. When a lavender-flowered plant from the first mutant stock was crossed to a white-flowered plant from the second mutant stock, all the F1 plants had purple flowers. The F1 plants were then allowed to self-fertilize to produce an F2 generation. The 277 F2 plants were 157 purple : 71 white : 49 lavender. Explain how flower color is inherited. Is this trait controlled by the alleles of a single gene? What kinds of progeny would be produced if lavender F2 plants were allowed to self-fertilize?
Answer
Are there any modes of single-gene inheritance compatible with the data? The observations that the F1 plants look different from either of their parents and that the F2 generation is composed of plants with three different phenotypes exclude complete dominance. The ratio of the three phenotypes in the F2 plants has some resemblance to the 1:2:1 ratio expected from codominance or incomplete dominance, but the results would then imply that purple plants must be heterozygotes. This conflicts with the information provided that purple plants are true-breeding.
Consider now the possibility that two genes are involved. From a cross between plants heterozygous for two genes (W and P), the F2 generation would contain a 9:3:3:1 ratio of the genotypes W- P-, W- pp, ww P-, and ww pp (where the dash indicates that the allele could be either a dominant or a recessive form). Are there any combinations of the 9:3:3:1 ratio that would be close to that seen in the F2 generation in this example? The numbers seem close to a 9:4:3 ratio. What hypothesis would support combining two of the classes (3 + 1)? If w is epistatic to the P gene, then the ww P- and ww pp genotypic classes would have the same white phenotype. With this explanation, 1/3 of the F2 lavender plants would be WW pp, and the remaining 2/3 would be Ww pp. Upon self-fertilization, WW pp plants would produce only lavender (WW pp) progeny, while Ww pp plants would produce a 3:1 ratio of lavender (W- pp) and white (ww pp) progeny.
III. Huntington disease (HD) is a rare dominant condition in humans that results in a slow but inexorable deterioration of the nervous system. HD shows what might be called "age-dependent penetrance," which is to say that the probability that a person with the HD genotype will express the phenotype varies with age. Assume that 50% of those inheriting the HD allele will express the symptoms by age 40. Susan is a 35-year-old woman whose father has HD. She currently shows no symptoms. What is the probability that Susan will show symptoms in five years?
Answer
This problem involves probability and penetrance. Two conditions are necessary for Susan to show symptoms of the disease. There is a 1/2 (50%) chance that she inherited the mutant allele from her father and a 1/2 (50%) chance that she will express the phenotype by age 40. Because these are independent events, the probability is the product of the individual probabilities, or 1/4.
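The arithmetic in Solved Problems II and III can be checked directly. The sketch below is only a numerical illustration; the names `expected_counts` and `total_deviation` are hypothetical, and the simple sum of absolute deviations is an informal measure of fit chosen for brevity, not a statistical test described in this chapter.

```python
from fractions import Fraction

# Solved Problem II: observed F2 counts (277 plants in total).
observed = {"purple": 157, "white": 71, "lavender": 49}
total = sum(observed.values())


def expected_counts(ratio, n):
    """Expected number in each class for a phenotypic ratio and n progeny."""
    parts = sum(ratio.values())
    return {phenotype: Fraction(share * n, parts) for phenotype, share in ratio.items()}


def total_deviation(expected):
    """Sum of |observed - expected| over all classes (informal fit measure)."""
    return sum(abs(observed[p] - expected[p]) for p in observed)


# Two genes with recessive epistasis: 9 purple : 4 white : 3 lavender.
two_gene = expected_counts({"purple": 9, "white": 4, "lavender": 3}, total)
# One gene with incomplete dominance: purple would be the 2/4 heterozygote class.
one_gene = expected_counts({"purple": 2, "white": 1, "lavender": 1}, total)

# Solved Problem III: inherit the allele (1/2) and express it by age 40 (1/2).
p_symptoms = Fraction(1, 2) * Fraction(1, 2)
```

Under the 9:4:3 hypothesis the expected counts are about 155.8 purple, 69.3 white, and 51.9 lavender, much closer to the observed numbers than the 138.5 : 69.3 : 69.3 expected from a single gene with a 1:2:1 ratio.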
Vocabulary

1. For each of the terms in the left column, choose the best matching phrase in the right column.

a. epistasis
b. modifier gene
c. conditional lethal
d. permissive condition
e. reduced penetrance
f. multifactorial trait
g. incomplete dominance
h. codominance
i. mutation
j. pleiotropy
k. variable expressivity

1. one gene affecting more than one phenotype
2. the alleles of one gene mask the effects of alleles of another gene
3. both parental phenotypes are expressed in the F1 hybrids
4. a heritable change in a gene
5. genes whose alleles alter phenotypes produced by the action of other genes
6. less than 100% of the individuals possessing a particular genotype express it in their phenotype
7. environmental condition that allows conditional lethals to live
8. a trait produced by the interaction of alleles of at least two genes or from interactions between gene and environment
9. individuals with the same genotype have related phenotypes that vary in intensity
10. a genotype that is lethal in some situations (for example, high temperature) but viable in others
11. the heterozygote resembles neither homozygote

Problems

Section 3.1

2. In four-o'clocks, the allele for red flowers is incompletely dominant to the allele for white flowers, so heterozygotes have pink flowers. What ratios of flower colors would you expect among the offspring of the following crosses: (a) pink × pink, (b) white × pink, (c) red × red, (d) red × pink, (e) white × white, and (f) red × white? If you specifically wanted to produce pink flowers, which of these crosses would be most efficient?

3. The Aa heterozygous snapdragons in Fig. 3.3 are pink, while AA homozygotes are red. However, Mendel's Pp heterozygous pea flowers were every bit as purple as those of PP homozygotes. Assuming that the A allele and the P allele encode normally functional enzymes, and the a and p alleles encode no protein at all, explain why the alleles of gene A and the alleles of gene P interact so differently.

4. In the fruit fly Drosophila melanogaster, very dark (ebony) body color is determined by the e allele. The e+ allele produces the normal wild-type, honey-colored body. In heterozygotes for the two alleles (but not in e+ e+ homozygotes), a dark marking called the trident can be seen on the thorax, but otherwise the body is honey-colored. The e+ allele is thus considered to be incompletely dominant to the e allele.
a. When female e+ e+ flies are crossed to male e+ e flies, what is the probability that progeny will have the dark trident marking?
b. Animals with the trident marking mate among themselves. Of 300 progeny, how many would be expected to have a trident, how many ebony bodies, and how many honey-colored bodies?

5. A cross between two plants that both have yellow flowers produces 80 offspring plants, of which 38 have yellow flowers, 22 have red flowers, and 20 have white flowers. If one assumes that this variation in color is due to inheritance at a single locus, what is the genotype associated with each flower color, and how can you describe the inheritance of flower color?

6. In radishes, color and shape are each controlled by a single locus with two incompletely dominant alleles. Color may be red (RR), purple (Rr), or white (rr) and shape can be long (LL), oval (Ll), or round (ll). What phenotypic classes and proportions would you expect among the offspring of a cross between two plants heterozygous at both loci?

7. A wild legume with white flowers and long pods is crossed to one with purple flowers and short pods. The F1 offspring are allowed to self-fertilize, and the F2 generation has 301 long purple, 99 short purple, 612 long pink, 195 short pink, 295 long white, and 98 short white. How are these traits being inherited?

8. Describe briefly:
a. The genotype of a person who has sickle-cell anemia.
b. The genotype of a person with a normal phenotype who has a child with sickle-cell anemia.
c. The total number of different alleles of the β-globin gene that could be carried by five children with the same mother and father.

9. Assuming no involvement of the Bombay phenotype:
a. If a girl has blood type O, what could be the genotypes and corresponding phenotypes of her parents?
b. If a girl has blood type B and her mother has blood type A, what genotype(s) and corresponding phenotype(s) could the other parent have?
c. If a girl has blood type AB and her mother is also AB, what are the genotype(s) and corresponding phenotype(s) of any male who could not be the girl's father?

10. Several genes in humans in addition to the ABO gene (I) give rise to recognizable antigens on the surface of red blood cells. The MN and Rh genes are two examples. The Rh locus can contain either a positive or a negative allele, with positive being dominant to negative. M and N are codominant alleles of the MN gene. The following chart shows several mothers and their children. For each mother-child pair, choose the father of the child from among the males in the right column, assuming one child per male.

   Mother            Child            Males
a. O M Rh pos        B MN Rh neg      O M Rh neg
b. B MN Rh neg       O N Rh neg       A M Rh pos
c. O M Rh pos        A M Rh neg       O MN Rh pos
d. AB N Rh neg       B MN Rh neg      B MN Rh pos
11. Alleles of the gene that determines seed coat patterns in lentils can be organized in a dominance series: marbled > spotted = dotted (codominant alleles) > clear. A lentil plant homozygous for the marbled seed coat pattern allele was crossed to one homozygous for the spotted pattern allele. In another cross, a homozygous dotted lentil plant was crossed to one homozygous for clear. An F1 plant from the first cross was then mated to an F1 plant from the second cross.
a. What phenotypes in what proportions are expected from this mating between the two F1 types?
b. What are the expected phenotypes of the F1 plants from the two original parental crosses?

12. One of your fellow students tells you that there is no way to know that the spotted and dotted patterns on the lentils in Fig. 3.4a (p. 48) are due to codominant alleles (CS and CD) of a single gene C. He claims that spotting could be controlled by gene S, with a completely dominant allele S that directs spotting and a recessive allele s that directs no spots. Likewise, he claims that dotting could be controlled by a separate gene D, with a completely dominant allele D that directs dotting and a recessive allele d that directs no dots. Is he correct, or does the information in Fig. 3.4a argue against this idea? Explain.

13. In a population of rabbits, you find three different coat color phenotypes: chinchilla (C), himalaya (H), and albino (A). To understand the inheritance of coat colors, you cross individual rabbits with each other and note the results in the following table.

Cross number    Parental phenotypes    Phenotypes of progeny
 1              H × H                  3/4 H : 1/4 A
 2              H × A                  1/2 H : 1/2 A
 3              C × C                  3/4 C : 1/4 H
 4              C × H                  all C
 5              C × C                  3/4 C : 1/4 A
 6              H × A                  all H
 7              C × A                  1/2 C : 1/2 A
 8              A × A                  all A
 9              C × H                  1/2 C : 1/2 H
10              C × H                  1/2 C : 1/4 H : 1/4 A

a. What can you conclude about the inheritance of coat color in this population of rabbits?
b. Ascribe genotypes to the parents in each of the 10 crosses.
c. What kinds of progeny would you expect, and in what proportions, if you crossed the chinchilla parents in crosses 9 and 10?

14. In clover plants, the pattern on the leaves is determined by a single gene with multiple alleles that are related in a dominance series. Seven different alleles of this gene are known; an allele that determines the absence of a pattern is recessive to the other six alleles, each of which produces a different pattern. All heterozygous combinations of alleles show complete dominance.
a. How many different kinds of leaf patterns (including the absence of a pattern) are possible in a population of clover plants in which all seven alleles are represented?
b. What is the largest number of different genotypes that could be associated with any one phenotype? Is there any phenotype that could be represented by only a single genotype?
c. In a particular field, you find that the large majority of clover plants lack a pattern on their leaves, even though you can identify a few plants representative of all possible pattern types. Explain this finding.

15. Fruit flies with one allele for curly wings (Cy) and one allele for normal wings (Cy+) have curly wings. When two curly-winged flies were crossed, 203 curly-winged and 98 normal-winged flies were obtained. In fact, all crosses between curly-winged flies produce nearly the same curly : normal ratio among the progeny.
a. What is the approximate phenotypic ratio in these offspring?
b. Suggest an explanation for these data.
c. If a curly-winged fly was mated to a normal-winged fly, how many flies of each type would you expect among 180 total offspring?

16. In certain plant species such as tomatoes and petunias, a very highly polymorphic "incompatibility" gene S with more than 100 known alleles prevents self-fertilization and promotes outbreeding. In this form of incompatibility, a plant cannot accept pollen carrying an allele identical to either of its own incompatibility alleles. If, for example, pollen carrying allele S1 of the incompatibility gene lands onto the stigma (a female organ) of a plant that also carries the S1 allele, the pollen cannot fertilize any ovules in that plant. (This phenomenon occurs because the pollen grain on the stigma cannot grow a pollen tube to allow it to unite with the ovule.) For the following crosses, indicate whether any progeny would be produced, and if so, list all possible genotypes of these progeny. Remember that pollen grains and ovules are gametes.
a. S1S2 × S1S2
b. S1S2 × S2S3
c. S1S2 × S3S4
d. Explain how this mechanism of incompatibility would prevent plant self-fertilization.
e. How does this incompatibility system ensure that all plants will be heterozygotes for different alleles of the S gene?
f. How do you know that peas are not governed by this incompatibility mechanism?
g. Explain why evolution would favor the emergence of new incompatibility alleles, making the gene increasingly polymorphic in populations of tomatoes or petunias.
17. In a species of tropical fish, a colorful orange and black variety called montezuma occurs. When two montezumas are crossed, 2/3 of the progeny are montezuma, and 1/3 are the wild-type, dark grayish green color. Montezuma is a single-gene trait, and montezuma fish are never true-breeding.
a. Explain the inheritance pattern seen here and show how your explanation accounts for the phenotypic ratios given.
b. In this same species, the morphology of the dorsal fin is altered from normal to ruffled by homozygosity for a recessive allele designated f. What progeny would you expect to obtain, and in what proportions, from the cross of a montezuma fish homozygous for normal fins to a green, ruffled fish?
c. What phenotypic ratios of progeny would be expected from the crossing of two of the montezuma progeny from (b)?

Section 3.2

18. A rooster with a particular comb morphology called walnut was crossed to a hen with a type of comb morphology known as single. The F1 progeny all had walnut combs. When F1 males and females were crossed to each other, 93 walnut and 11 single combs were seen among the F2 progeny, but there were also 29 birds with a new kind of comb called rose and 32 birds with another new comb type called pea.
a. Explain how comb morphology is inherited.
b. What progeny would result from crossing a homozygous rose-combed hen with a homozygous pea-combed rooster? What phenotypes and ratios would be seen in the F2 progeny?
c. A particular walnut rooster was crossed to a pea hen, and the progeny consisted of 12 walnut, 11 pea, 3 rose, and 4 single chickens. What are the likely genotypes of the parents?
d. A different walnut rooster was crossed to a rose hen, and all the progeny were walnut. What are the possible genotypes of the parents?

19. A black mare was crossed to a chestnut stallion and produced a bay son and a bay daughter. The two offspring were mated to each other several times, and they produced offspring of four different coat colors: black, bay, chestnut, and liver. Crossing a liver grandson back to the black mare gave a black foal, and crossing a liver granddaughter back to the chestnut stallion gave a chestnut foal. Explain how coat color is being inherited in these horses.

20. Filled-in symbols in the pedigree that follows designate individuals suffering from deafness.
a. Study the pedigree and explain how deafness is being inherited.
b. What is the genotype of the individuals in generation V? Why are they not affected?

[Pedigree for Problem 20: generations I through V]

21. You do a cross between two true-breeding strains of zucchini. One has green fruit and the other has yellow fruit. The F1 plants are all green, but when these are crossed, the F2 plants consist of 9 green : 7 yellow.
a. Explain this result. What were the genotypes of the two parental strains?
b. Indicate the phenotypes, with frequencies, of the progeny of a testcross of the F1 plants.

22. Two true-breeding white strains of the plant Illegitimati noncarborundum were mated, and the F1 progeny were all white. When the F1 plants were allowed to self-fertilize, 126 white-flowered and 33 purple-flowered F2 plants grew.
a. How could you describe inheritance of flower color? Describe how specific alleles influence each other and therefore affect phenotype.
b. A white F2 plant is allowed to self-fertilize. Of the progeny, 3/4 are white-flowered, and 1/4 are purple-flowered. What is the genotype of the white F2 plant?
c. A purple F2 plant is allowed to self-fertilize. Of the progeny, 3/4 are purple-flowered, and 1/4 are white-flowered. What is the genotype of the purple F2 plant?
d. Two white F2 plants are crossed with each other. Of the progeny, 1/2 are purple-flowered, and 1/2 are white-flowered. What are the genotypes of the two white F2 plants?

23. Explain the difference between epistasis and dominance. How many loci are involved in each case?

24. A dominant allele H reduces the number of body bristles in fruit flies, giving rise to a "hairless" phenotype. In the homozygous condition, H is lethal. A dominant allele S has no effect on bristle number except in the presence of H, in which case a single S allele suppresses the hairless phenotype, thus restoring the bristles. However, S is also lethal in homozygotes.
a. What ratio of flies with normal bristles to hairless individuals would we find in the live progeny of a cross between two normal flies both carrying the H allele in the suppressed condition?
b. When the hairless progeny of the previous cross are crossed with one of the parental normal flies from (a) (meaning a fly that carries H in the suppressed condition), what phenotypic ratio would you expect to find among their live progeny?
25. "Secretors" (genotypes SS and Ss) secrete their A and B blood group antigens into their saliva and other body fluids, while "nonsecretors" (ss) do not. What would be the apparent phenotypic blood group proportions among the offspring of an IAIB Ss woman and an IoIo Ss man if typing was done using saliva?

26. Normally, wild violets have yellow petals with dark brown markings and erect stems. Imagine you discover a plant with white petals, no markings, and prostrate stems. What experiment could you perform to determine whether the non-wild-type phenotypes are due to several different mutant genes or to the pleiotropic effects of alleles at a single locus? Explain how your experiment would settle the question.

27. The following table shows the responses of blood samples from the individuals in the pedigree to anti-A and anti-B sera. A "+" in the anti-A row indicates that the red blood cells of that individual were clumped by anti-A serum and therefore the individual made A antigens, and a "-" indicates no clumping. The same notation is used to describe the test for the B antigens.
a. Deduce the blood type of each individual from the data in the table.
b. Assign genotypes for the blood groups as accurately as you can from these data, explaining the pattern of inheritance shown in the pedigree. Assume that all genetic relationships are as presented in the pedigree (that is, there are no cases of false paternity).

[Pedigree and table of anti-A and anti-B reactions for individuals I-1, I-2, I-3, I-4, II-1, II-2, II-3, III-1, and III-2]

28. Three different pure-breeding strains of corn that produce ears with white kernels were crossed to each other. In each case, the F1 plants were all red, while both red and white kernels were observed in the F2 generation in a 9:7 ratio. These results are summarized in the following table.

Cross                  F1     F2
white-1 × white-2      red    9 red : 7 white
white-1 × white-3      red    9 red : 7 white
white-2 × white-3      red    9 red : 7 white

a. How many genes are involved in determining kernel color in these three strains?
b. Define your symbols and show the genotypes for the pure-breeding strains white-1, white-2, and white-3.
c. Diagram the cross between white-1 and white-2, showing the genotypes and phenotypes of the F1 and F2 progeny. Explain the observed 9:7 ratio.
29. In mice, the A^Y allele of the agouti gene is a recessive lethal allele, but it is dominant for yellow coat color. What phenotypes and ratios of offspring would you expect from the cross of a mouse heterozygous at the agouti gene (genotype A^Y A) and also at the albino gene (Cc) to an albino mouse (cc) heterozygous at the agouti gene (A^Y A)?

30. A student whose hobby was fishing pulled a very unusual carp out of Cayuga Lake: It had no scales on its body. She decided to investigate whether this strange nude phenotype had a genetic basis. She therefore obtained some inbred carp that were pure-breeding for the wild-type scale phenotype (body covered with scales in a regular pattern) and crossed them with her nude fish. To her surprise, the F1 progeny consisted of a 1:1 ratio of wild-type fish and fish with a single linear row of scales on each side.
a. Can a single gene with two alleles account for this result? Why or why not?
b. To follow up on the first cross, the student allowed the linear fish from the F1 generation to mate with each other. The progeny of this cross consisted of fish with four phenotypes: linear, wild type, nude, and scattered (the latter had a few scales scattered irregularly on the body). The ratio of these phenotypes was 6:3:2:1, respectively. How many genes appear to be involved in determining these phenotypes?
c. In parallel, the student allowed the phenotypically wild-type fish from the F1 generation to mate with each other and observed, among their progeny, wild-type and scattered carp in a ratio of 3:1. How many genes with how many alleles appear to determine the difference between wild-type and scattered carp?
d. The student confirmed the conclusions of (c) by crossing those scattered carp with her pure-breeding wild-type stock. Diagram the genotypes and phenotypes of the parental, F1, and F2 generations for this cross and indicate the ratios observed.
e. The student attempted to generate a true-breeding nude stock of fish by inbreeding. However, she found that this was impossible. Every time she crossed two nude fish, she found nude and scattered fish in the progeny, in a 2:1 ratio. (The scattered fish from these crosses bred true.) Diagram the phenotypes and genotypes of this gene in a nude × nude cross and explain the altered Mendelian ratio.
f. The student now felt she could explain all of her results. Diagram the genotypes in the linear × linear cross performed by the student in (b). Show the genotypes of the four phenotypes observed among the progeny and explain the 6:3:2:1 ratio.
31. You picked up two mice (one female and one male) that had escaped from experimental cages in the animal facility. One mouse is yellow in color, and the other is brown agouti. You know that this mouse colony has animals with different alleles at only three coat color genes: the agouti (A) or nonagouti (a) or yellow (A^Y) alleles of the A gene (A^Y > A > a; A^Y is a recessive lethal), the black (B) or brown (b) allele of the B gene (B > b), and the albino (c) or nonalbino (C) alleles of the C gene (C > c; cc is epistatic to all other phenotypes). However, you don't know which alleles of these genes are actually present in each of the animals that you've captured. To determine the genotypes, you breed them together. The first litter has only three pups. One is albino, one is brown (nonagouti), and the third is black agouti.
a. What alleles of the A, B, and C genes are present in the two mice you caught?
b. After raising several litters from these two parents, you have many offspring. How many different coat color phenotypes (in total) do you expect to see expressed in the population of offspring? What are the phenotypes and corresponding genotypes?

32. Figure 3.22 on p. 65 and Fig. 3.28b on p. 72 both show traits that are determined by two genes, each of which has two incompletely dominant alleles. But in Fig. 3.22 the gene interaction produces nine different phenotypes, while the situation depicted in Fig. 3.28b shows only five possible phenotypic classes. How can you explain this difference in the amount of phenotypic variation?

33. Three genes in fruit flies affect a particular trait, and one dominant allele of each gene is necessary to get a wild-type phenotype.
a. What phenotypic ratios would you predict among the progeny if you crossed triply heterozygous flies?
b. You cross a particular wild-type male in succession with three tester strains. In the cross with one tester strain (AA bb cc), only 1/4 of the progeny are wild type. In the crosses involving the other two tester strains (aa BB cc and aa bb CC), half of the progeny are wild type. What is the genotype of the wild-type male?
34. The garden flower Salpiglossis sinuata ("painted tongue") comes in many different colors. Several crosses are made between true-breeding parental strains to produce F1 plants, which are in turn self-fertilized to produce F2 progeny.

Parents              F1 phenotypes    F2 phenotypes
red × blue           all red          102 red, 33 blue
lavender × blue      all lavender     149 lavender, 51 blue
lavender × red       all bronze       84 bronze, 43 red, 41 lavender
red × yellow         all red          133 red, 58 yellow, 43 blue
yellow × blue        all lavender     183 lavender, 81 yellow, 59 blue

a. State a hypothesis explaining the inheritance of flower color in painted tongues.
b. Assign genotypes to the parents, F1 progeny, and F2 progeny for all five crosses.
c. In a cross between true-breeding yellow and true-breeding lavender plants, all of the F1 progeny are bronze. If you used these F1 plants to produce an F2 generation, what phenotypes in what ratios would you expect? Are there any genotypes that might produce a phenotype that you cannot predict from earlier experiments, and if so, how might this alter the phenotypic ratios among the F2 progeny?
35. In foxgloves, there are three different petal phenotypes: white with red spots (WR), dark red (DR), and light red (LR). There are actually two different kinds of true-breeding WR strains (WR-1 and WR-2) that can be distinguished by two-generation intercrosses with true-breeding DR and LR strains:

                                      F2
Cross    Parental       F1        WR     DR     LR
1        WR-1 × LR      all WR    480    119     39
2        WR-1 × DR      all WR     99     32      0
3        DR × LR        all DR      0    132     43
4        WR-2 × LR      all WR    193      0     64
5        WR-2 × DR      all WR    286     74     24

a. What can you conclude about the inheritance of the petal phenotypes in foxgloves?
b. Ascribe genotypes to the four true-breeding parental strains (WR-1, WR-2, DR, and LR).
c. A WR plant from the F2 generation of cross 1 is now crossed with an LR plant. Of 500 total progeny from this cross, there were 253 WR, 124 DR, and 123 LR plants. What are the genotypes of the parents in this WR × LR mating?
36. In a culture of fruit flies, matings between any two flies with hairy wings (wings abnormally containing additional small hairs along their edges) always produce both hairy-winged and normal-winged flies in a 2:1 ratio. You now take hairy-winged flies from this culture and cross them with four types of normal-winged flies; the results for each cross are shown in the following table. Assuming that there are only two possible alleles of the hairy-winged gene (one for hairy wings and one for normal wings), what can you say about the genotypes of the four types of normal-winged flies?

                                 Progeny obtained from cross with hairy-winged flies
Type of normal-winged flies      Fraction with normal wings    Fraction with hairy wings
1                                1/2                           1/2
2                                1                             0
3                                3/4                           1/4
4                                2/3                           1/3

37. Suppose that blue flower color in a plant species is controlled by two genes, A and B. The dominant alleles A and B encode proteins that function in the pathways shown below. The A and B proteins are both required to make blue pigment from a colorless precursor. A and B proteins also independently inhibit the production of blue pigment from a different colorless precursor; that is, the presence of either protein A or protein B is sufficient to prevent blue pigment production from precursor 2. The recessive mutant alleles a and b specify no protein. Two different pure-breeding mutant strains with white flowers were crossed and complementation was observed so that all the F1 were blue.

[Diagram: colorless precursor 1 → blue (requires both A and B); colorless precursor 2 → blue (inhibited by A and by B)]

a. What are the genotypes of each white mutant strain and the F1?
b. If the F1 are selfed, what would be the phenotypic ratio of the F2?

38. This problem examines possible biochemical explanations for variations of Mendel's 9:3:3:1 ratio. Except where indicated, compounds 1, 2, 3, and 4 have different colors, as do mixtures of these compounds. A and B are enzymes that catalyze the indicated steps of the pathway. Alleles A and B encode functional enzymes A and B, respectively; these are completely dominant to alleles a and b, which do not encode any of the corresponding enzyme. If functional enzyme is present, assume that the compound to the left of the arrow is converted completely to the compound to the right of the arrow. For each pathway, what phenotypic ratios would you expect among the progeny of a dihybrid cross of the form Aa Bb × Aa Bb?
a. Independent pathways
   Compound 1 --Enz A--> Compound 2
   Compound 3 --Enz B--> Compound 4
b. Redundant pathways
   Compound 1 --Enz A--> Compound 2
   Compound 1 --Enz B--> Compound 2
c. Sequential pathway
   Compound 1 --Enz A--> Compound 2 --Enz B--> Compound 3
d. Complementary gene action (enzymes A and B both needed to catalyze the reaction indicated)
   Compound 1 --Enz A + Enz B--> Compound 2
e. Branched pathways (assume a limitless supply of compound 1)
   Compound 1 --Enz A--> Compound 2
   Compound 1 --Enz B--> Compound 3
f. Now consider independent pathways as in (a), but the presence of compound 2 masks the colors due to all other compounds.
g. Next consider the sequential pathway shown in (c), but compounds 1 and 2 are the same color.
h. Finally, examine the pathway shown below. Here, compounds 1 and 2 have different colors. The protein encoded by A prevents the conversion of compound 1 to compound 2. The protein encoded by B prevents protein A from functioning.
   Protein B --| Protein A --| (Compound 1 --> Compound 2)
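The baseline 9:3:3:1 ratio that Problem 38 takes as its starting point can be obtained by enumerating every egg and sperm combination in an Aa Bb × Aa Bb cross. The sketch below is only an illustration under simple assumptions (two unlinked genes, equally frequent gametes, complete dominance); the function names gametes, complete_dominance, and f2_ratio are hypothetical and do not come from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """All equally likely gametes of a two-gene genotype such as ("Aa", "Bb")."""
    return [a + b for a, b in product(genotype[0], genotype[1])]


def complete_dominance(gene_a, gene_b):
    """Baseline phenotype rule: one dominant allele per gene is enough."""
    a = "A-" if "A" in gene_a else "aa"
    b = "B-" if "B" in gene_b else "bb"
    return f"{a} {b}"


def f2_ratio(phenotype_of=complete_dominance):
    """Phenotype fractions among the progeny of Aa Bb x Aa Bb."""
    parent = ("Aa", "Bb")
    counts = Counter()
    # Each of the 4 x 4 gamete pairings is equally likely.
    for egg, sperm in product(gametes(parent), repeat=2):
        gene_a = egg[0] + sperm[0]  # the two alleles of gene A
        gene_b = egg[1] + sperm[1]  # the two alleles of gene B
        counts[phenotype_of(gene_a, gene_b)] += 1
    total = sum(counts.values())
    return {ph: Fraction(n, total) for ph, n in counts.items()}


ratios = f2_ratio()
for phenotype, fraction in sorted(ratios.items()):
    print(phenotype, fraction)
```

Passing a different function in place of complete_dominance is one way to explore how a given pathway would regroup the sixteen gamete combinations into phenotypic classes.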
39. Considering your answers to Problem 38, does the existence of a particular variation of a 9:3:3:1 ratio among the F2 progeny allow you to infer the operation of a specific biochemical mechanism responsible for these phenotypes? Inversely, if you know a biochemical mechanism of gene interaction, can you predict the ratios of the phenotypes you would see among the F2 progeny?

40. As shown in the picture that follows, flowers of the plant Arabidopsis thaliana (mustard weed) normally contain four different types of organs: sepals (leaves), petals, anthers (male sex organs), and carpels (female sex organs). The mutant strain shown in the picture at the right has abnormal flower morphology: the flower is made up entirely of sepals! Three genes (called SEP1, SEP2, and SEP3) function redundantly in the pathway for generating petals, anthers, and carpels. For normal flower morphology, the plant requires only one dominant, normally functioning allele of any one of these genes: SEP1 (A) or SEP2 (B) or SEP3 (C). Recessive mutant alleles of these genes (a, b, or c) specify no protein.
a. What is the genotype of the mutant plant below?
b. In a trihybrid cross of the type AA bb cc × aa BB CC, where all of the F1 are Aa Bb Cc, what is the expected fraction of normal plants among the F2 progeny?
c. Suggest a model to explain how the Arabidopsis thaliana genome came to acquire three redundant genes.

[Figure: normal and mutant Arabidopsis flowers, with carpel, anther, petal, and sepal labeled]

41. Spherocytosis is an inherited blood disease in which the erythrocytes (red blood cells) are spherical instead of biconcave. This condition is inherited in a dominant fashion, with SPH (the mutant allele) dominant to SPH+. In people with spherocytosis, the spleen "reads" the spherical red blood cells as defective and removes them from the bloodstream, leading to anemia. The spleen in different people removes the spherical erythrocytes with different efficiencies. Some people with spherical erythrocytes suffer severe anemia and some mild anemia, yet others have spleens that function so poorly there are no symptoms of anemia at all. When 2400 people with the genotype SPH SPH+ were examined, it was found that 2250 had anemia of varying severity, but 150 had no symptoms.
a. Does this description of people with spherocytosis represent incomplete penetrance, variable expressivity, or both? Explain your answer. Can you derive any values from the numerical data to measure penetrance or expressivity?
b. Suggest a treatment for spherocytosis and describe how the incomplete penetrance and/or variable expressivity of the condition might affect this treatment.

42. Familial hypercholesterolemia (FH) is an inherited trait in humans that results in higher than normal serum cholesterol levels (measured in milligrams of cholesterol per deciliter of blood [mg/dl]). People with serum cholesterol levels that are roughly twice normal have a 25 times higher frequency of heart attacks than unaffected individuals. People with serum cholesterol levels three or more times higher than normal have severely blocked arteries and almost always die before they reach the age of 20. The following pedigrees show the occurrence of FH in four Japanese families:

[Pedigrees: Family 1, Family 2, Family 3, and Family 4, with a key to serum cholesterol levels]

a. What is the most likely mode of inheritance of FH based on these data? Are there any individuals in any of these pedigrees who do not fit your hypothesis? What special conditions might account for such individuals?
b. Why do individuals in the same phenotypic class (unfilled, yellow, or red symbols) show such variation in their levels of serum cholesterol?
43. You have come into contact with two unrelated patients who express what you think is a rare phenotype: a dark spot on the bottom of the foot. According to a medical source, this phenotype is seen in 1 in every 100,000 people in the population. The two patients give their family histories to you, and you generate the pedigrees that follow.
a. Given that this trait is rare, do you think the inheritance is dominant or recessive? Are there any special conditions that appear to apply to the inheritance?
b. Which nonexpressing members of these families must carry the mutant allele?
c. If this trait is instead quite common in the population, what alternative explanation would you propose for the inheritance?
d. Based on this new explanation in (c), which nonexpressing members of these families must have the genotype normally causing the trait?

[Pedigrees: the Smiths and the Jeffersons]

44. Polycystic kidney disease is a dominant trait that causes the growth of numerous cysts in the kidneys. The condition eventually leads to kidney failure. A child with polycystic kidney disease is born to a couple, neither of whom shows the disease. What possibilities might explain this outcome?

45. Using each of the seven coat color genes discussed in the text (listed in Table 3.3, p. 73), propose a possible genotype for each of the three Labrador retrievers in Fig. 3.14a on p. 59. Keep in mind that the Labrador retrievers are pure-breeding for uniformly colored coats without spots or eye masks. Explain any ambiguities in your genotype assignments.

Chapter
4
The Chromosome Theory of lnheritance
Each of these three human chromosomes carries hundreds of genes.
IN THE SPHERICAL, membrane-bounded nuclei of plant and animal cells prepared for viewing under the microscope, chromosomes appear as brightly colored, threadlike bodies. The nuclei of normal human cells carry 23 pairs of chromosomes for a total of 46. There are noticeable differences in size and shape among the 23 pairs, but within each pair, the two chromosomes appear to match exactly. (The only exceptions are the male's sex chromosomes, designated X and Y, which constitute an unmatched pair.)

chapter outline
4.1 Chromosomes: The Carriers of Genes
4.2 Sex Chromosomes and Sex Determination
4.3 Mitosis: Cell Division That Preserves Chromosome Number
4.4 Meiosis: Cell Divisions That Halve Chromosome Number
4.5 Gametogenesis
4.6 Validation of the Chromosome Theory
4.7 Sex-Linked and Sexually Dimorphic Traits in Humans

Down syndrome was the first human genetic disorder attributable not to a gene mutation but to an abnormal number of chromosomes. Children born with Down syndrome have 47 chromosomes in each cell nucleus because they carry three, instead of the normal pair, of a very small chromosome referred to as number 21. The aberrant genotype, called trisomy 21, gives rise to an abnormal phenotype, including a wide skull that is flatter than normal at the back, an unusually large tongue, learning disabilities caused by the abnormal development of the hippocampus and other parts of the brain, and a propensity to respiratory infections as well as heart disorders, rapid aging, and leukemia (Fig. 4.1).

How can one extra copy of a chromosome that is itself of normal size and shape cause such wide-ranging phenotypic effects? The answer has two parts. First and foremost, chromosomes are the cellular structures responsible for transmitting genetic information. In this chapter, we describe how geneticists concluded that chromosomes are the carriers of genes, an idea that became known as the chromosome theory of inheritance. The second part of the answer is that proper development depends not just on what type of genetic material is present but also on how much of it there is. Thus the mechanisms governing gene transmission during cell division must vigilantly maintain each cell's chromosome number.

Cell division proceeds through the precise chromosome-parceling mechanisms of mitosis (for somatic, or body cells) and meiosis (for gametes: eggs and sperm). When
Figure 4.1 Down syndrome: One extra chromosome 21 has widespread phenotypic consequences. Trisomy 21 usually causes changes in physical appearance as well as in the potential for learning. Many children with Down syndrome, such as the fifth grader at the center of the photograph, can participate fully in regular activities.
the machinery does not function properly, errors in chromosome distribution can have dire repercussions on the individual's health and survival. Down syndrome, for example, is the result of a failure of chromosome segregation during meiosis. The meiotic error gives rise to an egg or sperm carrying an extra chromosome 21 which, if incorporated in the zygote at fertilization, is passed on via mitosis to every cell of the developing embryo. Trisomy (three copies of a chromosome instead of two) can occur with other chromosomes as well, but in nearly all of these cases, the condition is prenatally lethal and results in a miscarriage.

Two themes emerge in our discussion of meiosis and mitosis. First, direct microscopic observations of chromosomes during gamete formation led early twentieth-century investigators to recognize that chromosome movements parallel the behavior of Mendel's genes, so chromosomes are likely to carry the genetic material. This chromosome theory of inheritance was proposed in 1902 and was confirmed in the following 15 years through elegant experiments performed mainly using the fruit fly Drosophila melanogaster. Second, the chromosome theory transformed the concept of a gene from an abstract particle to a physical reality: part of a chromosome that could be seen and manipulated.
4.1 Chromosomes: The Carriers of Genes

learning objectives
1. Differentiate among somatic cells, gametes, and zygotes with regard to the number and origin of their chromosomes.
2. Distinguish between homologous and nonhomologous chromosomes.
3. Distinguish between sister chromatids and nonsister chromatids.

One of the first questions asked at the birth of an infant (is it a boy or a girl?) acknowledges that male and female normally are mutually exclusive characteristics, like the yellow versus green of Mendel's peas. What's more, among humans and most other sexually reproducing species, a roughly 1:1 ratio exists between the two genders. Both males and females produce cells specialized for reproduction (sperm or eggs) that serve as a physical link to the next generation. In bridging the gap between generations, these gametes must each contribute half of the genetic material for making a normal, healthy son or daughter. Whatever part of the gamete carries this material, its structure and function must be able to account for the either-or aspect of sex determination as well as the generally observed 1:1 ratio of males to females. These two features of sex determination were among the earliest clues to the cellular basis of heredity.

Genes Reside in the Nucleus
The nature of the specific link between sex and reproduction remained a mystery until Anton van Leeuwenhoek, one of the earliest and most astute of microscopists, discovered in 1667 that semen contains spermatozoa (literally "sperm animals"). He imagined that these microscopic creatures might enter the egg and somehow achieve fertilization, but it was not possible to confirm this hypothesis for another 200 years. Then, during a 20-year period starting in 1854 (about the same time Gregor Mendel was beginning his pea experiments), microscopists studying fertilization in frogs and sea urchins observed the union of male and female gametes and recorded the details of the process in a series of drawings. These drawings, as well as later micrographs (photographs taken through a microscope), clearly show that egg and sperm nuclei are the only elements contributed equally by maternal and paternal gametes. This observation implies that something in the nucleus contains the hereditary material. In humans, the nuclei of the gametes are less than 2 millionths of a meter in diameter. It is indeed remarkable that the genetic link between generations is packaged within such an exceedingly small space.
Genes Reside in Chromosomes
Further investigations, some dependent on technical innovations in microscopy, suggested that yet smaller, discrete structures within the nucleus are the repository of genetic information. In the 1880s, for example, a newly discovered combination of organic and inorganic dyes revealed the existence of the long, brightly staining, threadlike bodies within the nucleus that we call chromosomes (literally "colored bodies"). It was now possible to follow the movement of chromosomes during different kinds of cell division.

In embryonic cells, the chromosomal threads split lengthwise in two just before cell division, and each of the two newly forming daughter cells receives one-half of every split thread. The kind of nuclear division followed by cell division that results in two daughter cells containing the same number and type of chromosomes as the original parent cell is called mitosis (from the Greek mitos meaning "thread" and -osis meaning "formation" or "increase").

In the cells that give rise to male and female gametes, the chromosomes composing each pair become segregated, so that the resulting gametes receive only one chromosome from each chromosome pair. The kind of nuclear division that generates egg or sperm cells containing half the number of chromosomes found in other cells within the same
organism is called meiosis (from the Greek word for "diminution").

Fertilization: The union of haploid gametes to produce diploid zygotes
In the first decade of the twentieth century, cytologists (scientists who use the microscope to study cell structure) showed that the chromosomes in a fertilized egg actually consist of two matching sets, one contributed by the maternal gamete, the other by the paternal gamete. The corresponding maternal and paternal chromosomes appear alike in size and shape, forming pairs (with one exception, the sex chromosomes, which we discuss in a later section).

Gametes and other cells that carry only a single set of chromosomes are called haploid (from the Greek word for "single"). Zygotes and other cells carrying two matching sets are diploid (from the Greek word for "double"). The number of chromosomes in a normal haploid cell is designated by the shorthand symbol n; the number of chromosomes in a normal diploid cell is then 2n. Figure 4.2 shows diploid cells as well as the haploid gametes that arise from them in Drosophila, where 2n = 8 and n = 4. In humans, 2n = 46; n = 23.

You can see how the halving of chromosome number during meiosis and gamete formation, followed by the union of two gametes' chromosomes at fertilization, normally allows a constant 2n number of chromosomes to be maintained from generation to generation in all individuals of a species. The chromosomes of every pair must segregate from each other during meiosis so that the haploid gametes will each have one complete set of chromosomes. After fertilization forms the zygote, the process of mitosis then ensures that all the cells of the developing individual have identical diploid chromosome sets.

Figure 4.2 Diploid versus haploid: 2n versus n. Most body cells are diploid: They carry a maternal and paternal copy of each chromosome. Meiosis generates haploid gametes with only one copy of each chromosome. In Drosophila, diploid cells have eight chromosomes (2n = 8), while gametes have four chromosomes (n = 4). Note that the chromosomes in this diagram are pictured before their replication. The X and Y chromosomes determine the sex of the individual.

Species variations in the number and shape of chromosomes
Scientists analyze the chromosomal makeup of a cell when the chromosomes are most visible: at a specific moment in the cell cycle of growth and division, just before the nucleus divides. At this point, known as metaphase (described in detail later), individual chromosomes have duplicated and condensed from thin threads into compact rodlike structures. Each chromosome now consists of two identical halves known as sister chromatids attached to each other at a specific location called the centromere (Fig. 4.3). In metacentric chromosomes, the centromere is more or less in the middle; in acrocentric chromosomes, the centromere is very close to one end. Chromosomes thus always have two "arms" separated by a centromere, but the relative sizes of the two arms can vary in different chromosomes.

Cells in metaphase can be fixed and stained with one of several dyes that highlight the chromosomes and accentuate the centromeres. The dyes also produce characteristic banding patterns made up of lighter and darker regions. Chromosomes that match in size, shape, and banding are called homologous chromosomes, or homologs. The two homologs of each pair contain the same set of genes, although for
Figure 4.3 Metaphase chromosomes can be classified by centromere position. Before cell division, each chromosome replicates into two sister chromatids connected at a centromere. In highly condensed metaphase chromosomes, the centromere can appear near the middle (a metacentric chromosome), very near an end (an acrocentric chromosome), or anywhere in between. In a diploid cell, one homologous chromosome in each pair is from the mother and the other from the father.
[Figure 4.3 labels: pair of homologous metacentric chromosomes; pair of homologous acrocentric chromosomes; centromere; sister chromatids; nonsister chromatids; homologous chromosomes; nonhomologous chromosomes]

Figure 4.4 Karyotype of a human male. Photos of metaphase human chromosomes are paired and arranged in order of decreasing size. In a normal human male karyotype, there are 22 pairs of autosomes, as well as an X and a Y (2n = 46). Homologous chromosomes share the same characteristic pattern of dark and light bands.
some of those genes, they may carry different alleles. The differences between alleles occur at the molecular level and don't show up in the microscope.

Figure 4.3 introduces a system of notation employed throughout this book, using color to indicate degrees of relatedness between chromosomes. Thus, sister chromatids, which are identical duplicates, appear in the same shade of the same color. Homologous chromosomes, which carry the same genes but may vary in the identity of particular alleles, are pictured in different shades (light or dark) of the same color. Nonhomologous chromosomes, which carry completely unrelated sets of genetic information, appear in different colors.

To study the chromosomes of a single organism, geneticists arrange micrographs of the stained chromosomes in homologous pairs of decreasing size to produce a karyotype. Karyotype assembly can now be speeded and automated by computerized image analysis. Figure 4.4 shows the karyotype of a human male, with 46 chromosomes arranged in 22 matching pairs of chromosomes and one nonmatching pair. The 44 chromosomes in matching pairs are known as autosomes. The two unmatched chromosomes in this male karyotype are called sex chromosomes, because they determine the sex of the individual. (We discuss sex chromosomes in more detail in subsequent sections.)

Modern methods of DNA analysis can reveal differences between the maternally and paternally derived chromosomes of a homologous pair, and can thus track the origin of the extra chromosome 21 that causes Down syndrome in individual patients. In 80% of cases, the third chromosome 21 comes from the egg; in 20%, from the sperm. The Genetics and Society box on the next page describes how physicians use karyotype analysis and a technique called amniocentesis to diagnose Down syndrome prenatally, roughly three months after a fetus is conceived.

Through thousands of karyotypes on normal individuals, cytologists have verified that the cells of each species carry a distinctive diploid number of chromosomes. For example, Mendel's peas contain 14 chromosomes in 7 pairs in each diploid cell, the fruit fly Drosophila melanogaster carries 8 chromosomes (4 pairs), macaroni wheat has 28 (14 pairs), giant sequoia trees 22 (11 pairs), goldfish 94 (47 pairs), dogs 78 (39 pairs), and people 46 (23 pairs). Differences in the size, shape, and number of chromosomes reflect differences in the assembled genetic material that determines what each species looks like and how it functions. As these figures show, the number of chromosomes does not always correlate with the size or complexity of the organism.

In the next section, you will see that the discovery that chromosomes carry information about an individual's gender led to the realization that chromosomes carry the genes that determine all traits.

essential concepts
• Chromosomes are cellular structures specialized for the storage and transmission of genetic material.
• Genes are located on chromosomes and travel with them during cell division and gamete formation.
• Somatic cells carry a precise number of homologous pairs of chromosomes, which is characteristic of the species.
• In diploid organisms, one homolog of a pair is of maternal origin, and the other paternal.
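
The relationship between 2n and n for the species counts quoted in this section can be checked with a few lines of Python. This is an illustrative sketch only: the function names `haploid_number` and `zygote_number` are hypothetical, and the only data used are the diploid numbers given in the text.

```python
# Diploid chromosome numbers (2n) as quoted in the text.
DIPLOID_NUMBER = {
    "garden pea": 14,
    "fruit fly": 8,
    "macaroni wheat": 28,
    "giant sequoia": 22,
    "goldfish": 94,
    "dog": 78,
    "human": 46,
}


def haploid_number(two_n: int) -> int:
    """Return n, the chromosome count of a normal gamete, given 2n."""
    if two_n % 2 != 0:
        raise ValueError("a normal diploid number must be even")
    return two_n // 2


def zygote_number(egg_count: int, sperm_count: int) -> int:
    """Fertilization adds the chromosomes of the two gametes together."""
    return egg_count + sperm_count


for species, two_n in DIPLOID_NUMBER.items():
    n = haploid_number(two_n)
    # Halving at meiosis followed by fertilization restores 2n.
    assert zygote_number(n, n) == two_n
    print(f"{species}: 2n = {two_n}, n = {n}")

# A gamete carrying one extra chromosome 21 yields a trisomic zygote.
human_n = haploid_number(DIPLOID_NUMBER["human"])
trisomy_21_count = zygote_number(human_n + 1, human_n)
print("trisomy 21 zygote:", trisomy_21_count, "chromosomes")
```

The last two lines restate the Down syndrome example from the chapter opening: one gamete with 24 chromosomes plus one with 23 gives the 47 chromosomes described there.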
Prenatal Genetic Diagnosis
With new technologies for observing chromosomes and the DNA in genes, modern geneticists can define an individual's genotype directly. Doctors can use this basic strategy to diagnose, before birth, whether or not a baby will be born with a genetic condition. The first prerequisite for prenatal diagnosis is to obtain fetal cells whose DNA and chromosomes can be analyzed for genotype.

The most frequently used method for acquiring these cells is amniocentesis (Fig. A). To carry out this procedure, a doctor inserts a needle through a pregnant woman's abdominal wall into the amniotic sac in which the fetus is growing; this procedure is performed about 16 weeks after the woman's last menstrual period. By using ultrasound imaging to guide the location of the needle, the doctor then withdraws some of the amniotic fluid in which the fetus is suspended into a syringe. This fluid contains living cells called amniocytes that were shed by the fetus. When placed in a culture medium, these fetal cells undergo several rounds of mitosis and increase in number.

Once enough fetal cells are available, clinicians look at the chromosomes and genes in those cells. In later chapters, we describe techniques that allow the direct examination of the DNA constituting particular disease genes. Amniocentesis also allows the diagnosis of Down syndrome through the analysis of chromosomes by karyotyping. Because the risk of Down syndrome increases rapidly with the age of the mother, more than half the pregnant women in North America who are over the age of 35 currently undergo amniocentesis. Although the goal of this karyotyping is usually to learn whether the fetus is trisomic for chromosome 21, many other abnormalities in chromosome number or shape may show up when the karyotype is examined.
The availability of amniocentesis and other techniques of prenatal diagnosis is intimately entwined with the personal and societal issue of abortion. The large majority of amniocentesis procedures are performed with the understanding that a fetus whose genotype indicates a genetic disorder, such as Down syndrome, will be aborted. Some prospective parents who are opposed to abortion still elect to undergo amniocentesis so that they can better prepare for an affected child, but this is rare.
4.2 Sex Chromosomes and Sex Determination

learning objectives
1. Predict the sex of humans with different complements of X and Y chromosomes.
2. Describe the basis of sex reversal in humans.
3. Compare the means of sex determination in different organisms.
Figure A Obtaining fetal cells by amniocentesis. A physician guides the insertion of the needle into the amniotic sac (aided by ultrasound imaging) and extracts amniotic fluid containing fetal cells into the syringe.
[Figure A labels: syringe; amniotic fluid; placenta; fetus; amniotic sac; uterus; cervix]
The ethical and political aspects of the abortion debate influence many of the practical questions underlying prenatal diagnosis. For example, parents must decide which genetic conditions would be sufficiently severe that they would be willing to abort the fetus. They must also assess the risk that amniocentesis might harm the fetus. The normal risk of miscarriage at 16 weeks of gestation is about 2-3%; amniocentesis increases that risk by about 0.5% (about 1 in 200 procedures). From the economic point of view, society must decide who should pay for prenatal diagnosis procedures. In current practice, the risks and costs of prenatal testing generally restrict amniocentesis to women over age 35 or to mothers whose fetuses are at high risk for a testable genetic condition because of family history.

The personal and societal equations determining the frequency of prenatal testing may, however, need to be overhauled in the not-too-distant future because of technological advances that will simplify the procedures and thereby minimize the costs and risks.
Walter S. Sutton, a young American graduate student at Columbia University in the first decade of the twentieth century, was one of the earliest cytologists to realize that particular chromosomes carry the information for determining sex. In one study, he obtained cells from the testes of the great lubber grasshopper (Brachystola magna; Fig. 4.5)
and followed them through the meiotic divisions that produce sperm. He observed that prior to meiosis, precursor cells within the testes of a great lubber grasshopper contain a total of 24 chromosomes. Of these, 22 are found in 11 matched pairs and are thus autosomes. The remaining two chromosomes are unmatched. He called the larger of these the X chromosome and the smaller the Y chromosome.
Figure 4.5 The great lubber grasshopper. In this mating pair, the smaller male is astride the female.
Figure 4.6 The X and Y chromosomes determine sex in humans. (a) This colorized micrograph shows the human X chromosome on the left and the human Y on the right. (b) Children can receive only an X chromosome from their mother, but they can inherit either an X or a Y from their father.
After meiosis, the sperm produced within these testes are of two equally prevalent types: one-half have a set of 11 autosomes plus an X chromosome, while the other half have a set of 11 autosomes plus a Y. By comparison, all of the eggs produced by females of the species carry an 11-plus-X set of chromosomes like the set found in the first class of sperm. When a sperm with an X chromosome fertilizes an egg, an XX female grasshopper results; when a Y-containing sperm fuses with an egg, an XY male develops. Sutton concluded that the X and Y chromosomes determine sex.

Several researchers studying other organisms soon verified that in many sexually reproducing species, two distinct
chromosomes (known as the sex chromosomes) provide the basis of sex determination. One sex carries two copies of the same chromosome (a matching pair), while the other sex has one of each type of sex chromosome (an unmatched pair). The cells of normal human females, for example, contain 23 pairs of chromosomes. The two chromosomes of each pair, including the sex-determining X chromosomes, appear to be identical in size and shape. In males, however, there is one unmatched pair of chromosomes: the larger of these is the X; the smaller, the Y (Fig. 4.4 and Fig. 4.6a). Apart from this difference in sex chromosomes, the two sexes are not distinguishable at any other pair of chromosomes. Thus, geneticists can designate women as XX and men as XY and represent sexual reproduction as a simple cross between XX and XY.

If sex is an inherited trait determined by a pair of sex chromosomes that separate to different cells during gamete formation, then an XX × XY cross could account for both the mutual exclusion of genders and the near 1:1 ratio of males to females, which are hallmark features of sex determination (Fig. 4.6b). And if chromosomes carry information defining the two contrasting sex phenotypes, we can easily infer that chromosomes also carry genetic information specifying other characteristics as well.
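
The XX × XY cross can be enumerated directly, in the manner of a Punnett square. The sketch below is a minimal illustration, assuming that each parent's two sex chromosomes segregate into gametes with equal frequency and that every egg and sperm combination is equally likely; the function name `cross` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(mother: str, father: str) -> dict:
    """Enumerate offspring of a cross between two sex-chromosome genotypes.

    Each genotype is a two-letter string such as "XX" or "XY". Every
    gamete receives exactly one of the parent's two sex chromosomes.
    Returns a mapping from offspring genotype to its expected fraction.
    """
    offspring = Counter(
        "".join(sorted(egg + sperm))  # write "XY", never "YX"
        for egg, sperm in product(mother, father)
    )
    total = sum(offspring.values())
    return {genotype: Fraction(count, total) for genotype, count in offspring.items()}


ratios = cross("XX", "XY")
for genotype, fraction in sorted(ratios.items()):
    print(genotype, fraction)
```

Under these assumptions, half of the offspring are XX and half are XY, and no other genotype appears, which matches both the either-or character of sex and the near 1:1 ratio of males to females.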
In Humans, the SRY Gene Determines Maleness
You have just seen that humans and other mammals have a pair of sex chromosomes that are identical in the XX female but different in the XY male. Several studies have shown that in humans, it is the presence or absence of the Y that actually makes the difference; that is, any person carrying a Y chromosome will look like a male. For example, rare humans with two X and one Y chromosome (XXY) are males displaying certain abnormalities collectively called Klinefelter syndrome. Klinefelter males are typically tall, thin, and sterile, and they sometimes show mental retardation. That these individuals are males shows that two X chromosomes are insufficient for female development in the presence of a Y. In contrast, humans carrying an X and no second sex chromosome (XO) are females with Turner syndrome.
Figure 4.7 Sex reversal. Sex-reversed XX males have a part of the Y including the SRY gene on one of their X chromosomes. Sex-reversed XY females lack SRY on their Y chromosome either because it has been replaced by part of the X chromosome, or because it has been inactivated by mutation.

Figure 4.8 Human sex chromosomes have both shared and unique genes. PAR1 and PAR2 (black) are homologous regions of the X and Y chromosomes that together contain about 30 genes. The MSY region is Y-specific and contains at least 78 genes needed for maleness itself (SRY) and for male fertility.
Turner females are usually sterile, lack secondary sexual characteristics such as pubic hair, are of short stature, and have folds of skin between their necks and shoulders (webbed necks). Even though these individuals have only one X chromosome, they develop as females because they have no Y chromosome.

In 1990, researchers discovered that it is not the entire Y chromosome, but rather a single Y-chromosome-specific gene called SRY (sex-determining region of Y) that is the primary determinant of maleness. The evidence implicating SRY came from so-called sex reversal: the existence of XX males and XY females (Fig. 4.7). In sex-reversed XX males, one of the two X chromosomes is often found to carry a portion of the Y chromosome. Although in different XX males, different portions of the Y chromosome are found on the X, one particular gene, SRY, is always present. Sex-reversed XY females, however, always have a Y chromosome lacking a functional SRY gene; the portion of the Y chromosome containing SRY is either replaced by a portion of the X chromosome, or the Y contains a nonfunctional mutant copy of SRY (Fig. 4.7). Later, experiments with mice confirmed that SRY indeed determines maleness. These experiments are described in the Fast Forward Box on p. 93.

SRY is one of about 110 protein-coding genes on the Y chromosome. The two ends of the Y chromosome are called the pseudoautosomal regions (PARs), because homologous regions are present at the ends of the X chromosome (Fig. 4.8). The two PARs (PAR1 and PAR2) together contain about 30 genes, copies of which are found on both the X and Y chromosomes. Most of the Y chromosome, however, is a male-specific region (MSY), which includes SRY and also genes required for spermatogenesis. The X chromosome contains about 1100 genes, most of which have nothing to do with sex; they encode proteins needed by both males and females.

Why does having an SRY gene mean that you will be male and not having SRY mean that you will be female? Approximately six weeks after fertilization, SRY protein activates testes development in XY (or sex-reversed XX) embryos. The embryonic testes secrete hormones that trigger the development of male sex organs and prevent the formation of female sex organs. In the absence of SRY, an ovary develops instead of a testis, and other female sex organs develop by default.
Species Vary Enormously in Sex-Determining Mechanisms
Other species show variations on this XX versus XY chromosomal strategy of sex determination. In fruit flies, for example, although normal females are XX and normal males XY (see Fig. 4.2), it is ultimately the number of X chromosomes (and not the presence or absence of the Y) that determines sex. The different responses of humans and Drosophila to the same unusual complements of sex chromosomes (Table 4.1) reveal that the mechanisms for sex determination differ in flies and humans. XXY flies are female because they have two X chromosomes, but XXY humans are male because they have a Y. Conversely, because they have one X chromosome, XO flies are male, while XO humans are female because they lack a Y.

The XX = female / XY = male strategy of sex determination is by no means universal. In some species of moths, for example, the females are XX, but the males are XO. In C. elegans (one species of nematode), males are similarly XO, but XX individuals are not females; they are instead self-fertilizing hermaphrodites that produce both eggs and sperm. In birds and butterflies, males have the matching sex chromosomes, while females have an unmatched set; in such species, geneticists represent the sex chromosomes as ZZ in the male and ZW in the female. The gender having
TABLE 4.1 Sex Determination in Fruit Flies and Humans

Complement of Sex Chromosomes    Drosophila       Humans
XXX                              Dies             Nearly normal female
XX                               Normal female    Normal female
XXY                              Normal female    Klinefelter male (sterile); tall, thin
XO                               Sterile male     Turner female (sterile); webbed neck
XY                               Normal male      Normal male
XYY                              Normal male      Normal or nearly normal male
OY                               Dies             Dies

Humans can tolerate extra X chromosomes (e.g., XXX) better than can Drosophila because in humans all but one X chromosome becomes a Barr body, as discussed later in this chapter. Complete absence of an X chromosome is lethal to both fruit flies and humans. Additional Y chromosomes have little effect in either species. Although the Y chromosome in Drosophila does not determine whether a fly looks like a male, it is necessary for male fertility; XO flies are thus sterile males.
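
The contrast summarized in Table 4.1 can be written as two small rules. This is a simplified sketch, assuming a normal diploid set of autosomes and covering only the sex-chromosome complements listed in the table; the function names are hypothetical, "O" stands for a missing sex chromosome, and fertility is not modeled.

```python
def human_sex(complement: str) -> str:
    """Humans: the presence of a Y (carrying SRY) makes a male."""
    chromosomes = complement.upper()
    if "X" not in chromosomes:
        return "inviable"  # complete absence of an X is lethal
    return "male" if "Y" in chromosomes else "female"


def drosophila_sex(complement: str) -> str:
    """Fruit flies: the number of X chromosomes decides; the Y does not."""
    x_count = complement.upper().count("X")
    if x_count == 0 or x_count >= 3:
        return "inviable"  # OY and XXX flies die (Table 4.1)
    return "female" if x_count == 2 else "male"


for complement in ["XXX", "XX", "XXY", "XO", "XY", "XYY", "OY"]:
    print(f"{complement:>3}  fly: {drosophila_sex(complement):<9} human: {human_sex(complement)}")
```

The two functions disagree for exactly the complements the text singles out, XXY and XO, and also for XXX, which flies do not survive.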
two different sex chromosomes is termed the heterogametic sex because it gives rise to two different types of gametes. These gametes would contain either X or Y in the case of male humans, and either Z or W in the case of female birds.

Yet other variations include the complicated sex-determination mechanisms of bees and wasps, in which females are diploid and males haploid, and the systems of certain fish, in which sex is determined by changes in the environment, such as fluctuations in temperature. Table 4.2 summarizes some of the astonishing variety in the ways that different species have solved the problem of assigning gender to individuals.

In spite of these many differences between species, early researchers concluded that chromosomes can carry the genetic information specifying sexual identity, and probably many other characteristics as well. Sutton and other early adherents of the chromosome theory realized that the perpetuation of life itself therefore depends on the proper distribution of chromosomes during cell division. In the next sections, you will see that the behavior of chromosomes during mitosis and meiosis is exactly that expected of cellular structures carrying genes.

TABLE 4.2 Mechanisms of Sex Determination

Species                    Females                              Males
Humans and Drosophila      XX                                   XY
Moths and C. elegans       XX (hermaphrodites in C. elegans)    XO
Birds and Butterflies      ZW                                   ZZ
Bees and Wasps             Diploid                              Haploid
Lizards and Alligators     Cool temperature                     Warm temperature
Tortoises and Turtles      Warm temperature                     Cool temperature
Anemone Fish               Older adults                         Young adults

In the species highlighted in purple, sex is determined by sex chromosomes. The species highlighted in blue have identical chromosomes in the two sexes, and sex is determined instead by environmental or other factors. Anemone fish (bottom row) undergo a sex change from male to female as they age.

essential concepts
• Many sexually reproducing organisms have chromosomes that are sex-specific and that determine gender.
• In humans, male sex determination is triggered by a Y-linked gene called SRY; female sex determination occurs in XX embryos by default.
• Mechanisms of sex determination vary remarkably; in some species sex is determined by environmental factors rather than by specific chromosomes.

4.3 Mitosis: Cell Division That Preserves Chromosome Number

learning objectives
1. Describe the key chromosome behaviors during mitosis.
2. Diagram the forces and structures that dictate chromosomal movement during mitosis.
The fertilized human egg is a single diploid cell that preserves its genetic identity unchanged through more than 100 generations of cells as it divides again and again to produce a full-term infant ready to be born. As the newborn infant develops into a toddler, a teenager, and an adult, yet more cell divisions fuel continued growth and maturation. Mitosis, the nuclear division that apportions chromosomes in equal fashion to two daughter cells, is the cellular mechanism that preserves genetic information through all these generations of cells. In this section, we take a close look at
how the nuclear division of mitosis fits into the overall scheme of cell growth and division.
4.3 Mitosis: Cell Division That Preserves Chromosome Number
Transgenic Mice Prove That SRY Is the Maleness Factor
Genes similar to human SRY have been identified on the Y chromosomes of nearly all mammalian species. In 1991, researchers used mouse transgenic technology to show definitively that the SRY gene is the crucial determinant of maleness. A transgenic mouse is one whose genome contains copies of a gene that came from another individual, or even from another species. Such genes are called transgenes. One focus of genetic engineering is technology for the manipulation and insertion of transgenes.
To determine if SRY is sufficient to determine maleness, researchers wanted to introduce copies of the mouse SRY gene into the genome of chromosomally female (XX) mice. If SRY is the critical determinant of maleness, then XX mice containing an SRY transgene would nevertheless be male.
First, the scientists isolated the DNA of the mouse SRY gene using cloning technology to be discussed in Chapter 10. Next, using a method called "pronuclear injection," transgenic mice were generated that contained the SRY gene on one of their autosomes. To perform pronuclear injection, many fertilized mouse eggs are collected from mated females. Next, the sperm or egg nucleus (called a pronucleus in the zygote) is injected with hundreds of copies of the SRY gene DNA (Fig. A). Nuclear enzymes integrate the DNA into random locations in the genome (Fig. A). After the injected zygotes matured into early embryos, they were implanted into surrogate mothers. When the mice were born, cells were taken from their tails and tested for the presence of the SRY transgene using molecular biology techniques.
Figure B shows at the right a transgenic mouse ("transformed" with SRY) obtained in this study. Although it is chromosomally XX, it is phenotypically male. This result demonstrates conclusively that the SRY gene alone is sufficient to determine maleness.
Figure A Using pronuclear injection to generate mice transgenic for the SRY gene. Steps shown: SRY gene DNA; injection into pronucleus of zygote (fertilized egg); random integration of SRY gene DNA into a chromosome; tail cells tested for presence of SRY transgene.
Figure B An XX mouse transformed with SRY is phenotypically male. Both the transformed XX mouse at the right and its normal XY littermate at the left have normal male genitalia. Arrows point to the penis.
If you were to peer through a microscope and follow the history of one cell through time, you would see that for much of your observation, the chromosomes resemble a mass of extremely fine tangled string, called chromatin, surrounded by the nuclear envelope. Each convoluted thread of chromatin is composed mainly of DNA (which carries the genetic information) and protein (which serves as a scaffold for packaging and managing that information, as described in Chapter 11). You would also be able to distinguish one or two darker areas of chromatin called nucleoli (singular, nucleolus, literally "small nucleus"); nucleoli play a key role in the manufacture of ribosomes, organelles that function in protein synthesis. During the period between cell divisions, the chromatin-laden nucleus houses a great deal of invisible activity necessary for the growth and survival of the cell. One particularly important part of this activity is the accurate duplication of all the chromosomal material.
With continued vigilance, you would observe a dramatic change in the nuclear landscape during one very short period in the cell's life history: The chromatin condenses into discrete threads, and then each chromosome compacts even further into the twin rods clamped together at the centromere that can be identified in karyotype analysis (review Fig. 4.3 on p. 88). Each rod in a duo is called a chromatid; as described earlier, it is an exact duplicate of the other sister chromatid to which it is connected. Continued observation would reveal the doubled chromosomes beginning to jostle around inside the cell, eventually lining up at the cell's midplane. At this point, the sister chromatids comprising each chromosome separate to opposite poles of the now elongating cell, where they become identical sets of chromosomes. Each of the two identical sets eventually ends up enclosed in a separate nucleus in a separate cell. The two cells, known as daughter cells, are thus genetically identical.
The repeating pattern of cell growth (an increase in size) followed by division (the splitting of one cell into two) is called the cell cycle (Fig. 4.9). Only a small part of the cell cycle is spent in division (or M phase); the period between divisions is called interphase.
Figure 4.9 The cell cycle: An alternation between interphase and mitosis. (a) Chromosomes replicate to form sister chromatids during synthesis (S phase); the sister chromatids segregate to daughter cells during mitosis (M phase). The gaps between the S and M phases, during which most cell growth takes place, are called the G1 and G2 phases. In multicellular organisms, some terminally differentiated cells stop dividing and arrest in a "G0" stage. (b) Interphase consists of the G1, S, and G2 phases together.
Panel titles: (a) The cell cycle; (b) Chromosomes replicate during S phase (S: DNA synthesis).
During Interphase, Cells Grow and Replicate Their Chromosomes
Interphase consists of three parts: gap 1 (G1), synthesis (S), and gap 2 (G2) (Fig. 4.9). G1 lasts from the birth of a new cell to the onset of chromosome replication; for the genetic material, it is a period when the chromosomes are neither duplicating nor dividing. During this time, the cell achieves most of its growth by using the information from its genes to make and assemble the materials it needs to function normally. G1 varies in length more than any other phase of the cell cycle. In rapidly dividing cells of the human embryo, for example, G1 is as short as a few hours. In contrast, mature brain cells become arrested in a resting form of G1 known as G0 and do not normally divide again during a person's lifetime.
Synthesis (S) is the time when the cell duplicates its genetic material by synthesizing DNA. During duplication, each chromosome doubles to produce identical sister chromatids that will become visible when the chromosomes condense at the beginning of mitosis. The two sister chromatids remain joined to each other at the centromere. (Note that this joined structure is considered a single chromosome as long as the connection between sister chromatids is maintained.) The replication of chromosomes during S phase is critical; the genetic material must be copied exactly so that both daughter cells receive identical sets of chromosomes.
Gap 2 (G2) is the interval between chromosome duplication and the beginning of mitosis. During this time, the cell may grow (usually less than during G1); it also synthesizes proteins that are essential to the subsequent steps of mitosis itself.
In addition, during interphase an array of fine microtubules crucial for many interphase processes becomes visible outside the nucleus. The microtubules radiate out into the cytoplasm from a single organizing center known as the centrosome, usually located near the nuclear envelope. In animal cells, the discernible core of each centrosome is a pair of small, darkly staining bodies called centrioles (Fig. 4.10a); the microtubule-organizing center of plants does not contain centrioles. During the S and G2 stages of interphase, the centrosomes replicate, producing two centrosomes that remain in extremely close proximity.
During Mitosis, Sister Chromatids Separate and Two Daughter Nuclei Form
Although the rigorously choreographed events of nuclear and cellular division occur as a dynamic and continuous process, scientists traditionally analyze the process in separate stages marked by visible cytological events. The artist's sketches in Fig. 4.10 illustrate these stages in the nematode Ascaris, whose diploid cells contain only four chromosomes (two pairs of homologous chromosomes).
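The counting convention used in this section (sister chromatids joined at a centromere count as one chromosome) can be illustrated with a short sketch. The function name `count_per_cell` and the stage labels are illustrative assumptions rather than terms defined in the text; the Ascaris diploid number of four comes from the paragraph above.

```python
# Hypothetical helper illustrating the counting convention in the text:
# two sister chromatids joined at a centromere count as ONE chromosome.

UNREPLICATED = {"G1", "daughter cell"}
REPLICATED = {"G2", "prophase", "prometaphase", "metaphase"}
SEPARATED = {"anaphase", "telophase"}  # whole cell, before cytokinesis


def count_per_cell(diploid_number: int, stage: str) -> tuple[int, int]:
    """Return (chromosomes, chromatids) present in one cell at a given stage."""
    if stage in UNREPLICATED:
        # Each chromosome consists of a single chromatid.
        return diploid_number, diploid_number
    if stage in REPLICATED:
        # After S phase each chromosome consists of two joined sister chromatids.
        return diploid_number, 2 * diploid_number
    if stage in SEPARATED:
        # Once sisters separate, each former chromatid is its own chromosome.
        return 2 * diploid_number, 2 * diploid_number
    raise ValueError(f"unknown stage: {stage!r}")


if __name__ == "__main__":
    ASCARIS_DIPLOID_NUMBER = 4  # two pairs of homologous chromosomes
    for stage in ("G1", "prophase", "anaphase", "daughter cell"):
        chromosomes, chromatids = count_per_cell(ASCARIS_DIPLOID_NUMBER, stage)
        print(f"{stage}: {chromosomes} chromosomes, {chromatids} chromatids")
```

Under this convention the chromosome count doubles only transiently, between sister separation and cytokinesis, so each daughter cell returns to the parental number.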
Figure 4.10 Mitosis maintains the chromosome number of the parent cell in the two daughter nuclei. In the photomicrographs of newt lung cells at the left, chromosomes are stained blue and microtubules appear either green or yellow.
(a) Prophase: (1) Chromosomes condense and become visible; (2) centrosomes move apart toward opposite poles and generate new microtubules; (3) nucleoli begin to disappear.
(b) Prometaphase: (1) Nuclear envelope breaks down; (2) microtubules from the centrosomes invade the nucleus; (3) sister chromatids attach to microtubules from opposite centrosomes.
(c) Metaphase: Chromosomes align on the metaphase plate with sister chromatids facing opposite poles.
(d) Anaphase: (1) Centromeres divide; (2) the now separated sister chromatids move to opposite poles.
(e) Telophase: (1) Nuclear membranes and nucleoli re-form; (2) spindle fibers disappear; (3) chromosomes uncoil and become a tangle of chromatin.
(f) Cytokinesis: The cytoplasm divides, splitting the elongated parent cell into two daughter cells with identical nuclei.
Structures labeled in the figure: centriole (in animal cells), microtubules, centrosome, centromere, chromosome, sister chromatids, nuclear envelope, astral microtubules, kinetochore, kinetochore microtubules, polar microtubules, metaphase plate, separating sister chromatids, re-forming nuclear envelope, nucleoli, chromatin.
Prophase: Chromosomes condense (Fig. 4.10a)
During all of interphase, the cell nucleus remains intact, and the chromosomes are indistinguishable aggregates of chromatin. At prophase (from the Greek pro- meaning "before"), the gradual emergence, or condensation, of individual chromosomes from the undifferentiated mass of chromatin marks the beginning of mitosis. Each condensing chromosome has already been duplicated during interphase and thus consists of sister chromatids attached at the centromere. At this stage in Ascaris cells, there are therefore four chromosomes with a total of eight chromatids.
The progressive appearance of an array of individual chromosomes is a truly impressive event. Interphase DNA molecules as long as 3-4 cm condense into discrete chromosomes whose length is measured in microns (millionths of a meter). This is equivalent to compacting a 200 m length of thin string (as long as two football fields) into a cylinder 8 mm long and 1 mm wide.
Another visible change in chromatin also takes place during prophase: The darkly staining nucleoli begin to break down and disappear. As a result, the manufacture of ribosomes ceases, providing one indication that general cellular metabolism shuts down so that the cell can focus its energy on chromosome movements and cellular division.
Several important processes that characterize prophase occur outside the nucleus in the cytoplasm. The centrosomes, which replicated during interphase, now move apart and become clearly distinguishable as two separate entities in the light microscope. At the same time, the interphase scaffolding of long, stable microtubules disappears and is replaced by a set of dynamic microtubules that rapidly grow from and shrink back toward their centrosomal organizing centers. The centrosomes continue to move apart, migrating around the nuclear envelope toward opposite ends of the nucleus, apparently propelled by forces exerted between interdigitated microtubules extending from both centrosomes.
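The string analogy in the prophase description can be checked with a few lines of arithmetic. The sketch below computes the fold compaction implied by the analogy (200 m into 8 mm) and applies the same ratio to a 3-4 cm DNA molecule. The function names are illustrative assumptions, and applying the analogy's ratio to DNA is only a consistency check, not a measured value.

```python
# Figures taken from the string analogy in the text.
STRING_LENGTH_MM = 200 * 1000  # 200 m expressed in millimeters
CYLINDER_LENGTH_MM = 8         # length of the compacted cylinder


def compaction_ratio(extended_length: float, compacted_length: float) -> float:
    """Fold reduction in length; both arguments must use the same unit."""
    return extended_length / compacted_length


def implied_condensed_length_um(dna_length_cm: float, ratio: float) -> float:
    """Length in micrometers that a DNA molecule would have if compacted by `ratio`."""
    dna_length_um = dna_length_cm * 10_000  # 1 cm = 10,000 micrometers
    return dna_length_um / ratio


analogy_ratio = compaction_ratio(STRING_LENGTH_MM, CYLINDER_LENGTH_MM)

if __name__ == "__main__":
    print(f"compaction in the string analogy: {analogy_ratio:,.0f}-fold")
    for dna_cm in (3, 4):
        length_um = implied_condensed_length_um(dna_cm, analogy_ratio)
        print(f"{dna_cm} cm of DNA at that ratio: about {length_um:.1f} micrometers")
```

The analogy corresponds to a 25,000-fold reduction in length, which would bring 3-4 cm of DNA down to roughly 1.2-1.6 micrometers, in line with the statement that condensed chromosomes are measured in microns.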
Prometaphase: The spindle forms (Fig. 4.10b)
Prometaphase ("before middle stage") begins with the breakdown of the nuclear envelope, which allows microtubules extending from the two centrosomes to invade the nucleus. Chromosomes attach to these microtubules through the kinetochore, a structure in the centromere region of each chromatid that is specialized for conveyance. Each kinetochore contains proteins that act as molecular motors, enabling the chromosome to slide along the microtubule. When the kinetochore of a chromatid originally contacts a microtubule at prometaphase, the kinetochore-based motor moves the entire chromosome toward the centrosome from which that microtubule radiates. Microtubules growing from the two centrosomes randomly capture chromosomes by the kinetochore of one of the two sister chromatids. As a result, it is sometimes possible to observe groups of chromosomes congregating in the vicinity of each centrosome. In this early part of prometaphase, for each chromosome, one chromatid's kinetochore is attached to a microtubule, but the sister chromatid's kinetochore remains unattached.
During prometaphase, three different types of microtubule fibers together form the mitotic spindle; all of these microtubules originate from the centrosomes, which function as the two "poles" of the spindle apparatus. Microtubules that extend between a centrosome and the kinetochore of a chromatid are called kinetochore microtubules, or centromeric fibers. Microtubules from each centrosome that are directed toward the middle of the cell are polar microtubules; polar microtubules originating in opposite centrosomes interdigitate near the cell's equator. Finally, there are short astral microtubules that extend out from the centrosome toward the cell's periphery.
Near the end of prometaphase, the kinetochore of each chromosome's previously unattached sister chromatid now associates with microtubules extending from the opposite centrosome. This event orients each chromosome such that one sister chromatid faces one pole of the cell, and the other, the opposite pole. Experimental manipulation has shown that if both kinetochores become attached to microtubules from the same pole, the configuration is unstable; one of the kinetochores will repeatedly detach from the spindle until it associates with microtubules from the other pole. The attachment of sister chromatids to opposite spindle poles is the only stable arrangement.
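The trial-and-error logic described above (random capture, with same-pole attachments being unstable) can be sketched as a toy simulation. This is a minimal illustration under stated assumptions, not a model of the real kinetics: the function name `attach_until_bipolar`, the pole labels, the equal capture probability for the two poles, and the `max_attempts` guard are all hypothetical.

```python
import random

POLES = ("pole A", "pole B")


def attach_until_bipolar(rng: random.Random, max_attempts: int = 1000):
    """Simulate the two sister kinetochores of one chromosome finding opposite poles.

    Returns (first_pole, second_pole, attempts_by_second_kinetochore).
    Assumption: each capture attempt hits either pole with equal probability.
    """
    # The first kinetochore is captured at random by a microtubule from one pole.
    first_pole = rng.choice(POLES)
    for attempt in range(1, max_attempts + 1):
        second_pole = rng.choice(POLES)
        if second_pole != first_pole:
            # Sisters attached to opposite poles: the only stable arrangement.
            return first_pole, second_pole, attempt
        # Both kinetochores on the same pole: unstable, so detach and try again.
    raise RuntimeError("no bipolar attachment reached within max_attempts")


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so repeated runs give the same trace
    for chromosome in range(1, 5):  # e.g., the four chromosomes of Ascaris
        first, second, attempts = attach_until_bipolar(rng)
        print(f"chromosome {chromosome}: {first} / {second} after {attempts} attempt(s)")
```

Whatever the number of failed attempts, every run ends with the two sisters attached to different poles, which mirrors the point that repeated detachment continues until the stable bipolar configuration is reached.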
Metaphase: Chromosomes align at the cell's equator (Fig. 4.10c)
During metaphase ("middle stage"), the connection of sister chromatids to opposite spindle poles sets in motion a series of jostling movements that cause the chromosomes to move toward an imaginary equator halfway between the two poles. The imaginary midline is called the metaphase plate. When the chromosomes are aligned along it, the forces pulling sister chromatids toward opposite poles are in a balanced equilibrium maintained by tension across the chromosomes. Tension results from the fact that the sister chromatids are pulled in opposite directions while they are still connected to each other by the centromere. Tension compensates for any chance movement away from the metaphase plate by restoring the chromosome to its position equidistant between the poles.
Anaphase: Sister chromatids move to opposite spindle poles (Fig. 4.10d)
The nearly simultaneous severing of the centromeric connection between the sister chromatids of all chromosomes indicates that anaphase (from the Greek ana- meaning "up," as in "up toward the poles") is underway. The separation of sister chromatids allows each chromatid to be pulled toward the spindle pole to which it is connected by its kinetochore microtubules; as the chromatid moves toward the pole, its kinetochore microtubules shorten. Because the arms of the chromatids lag behind the kinetochores, metacentric chromatids have a characteristic V shape during anaphase. The connection of sister chromatids to microtubules emanating from opposite spindle poles means that the genetic information migrating toward one pole is exactly the same as its counterpart moving toward the opposite pole.
Telophase: Identical sets of chromosomes are enclosed in two nuclei (Fig. 4.10e)
The final transformation of chromosomes and the nucleus during mitosis happens at telophase (from the Greek telo- meaning "end"). Telophase is like a rewind of prophase. The spindle fibers begin to disperse; a nuclear envelope forms around the group of chromatids at each pole; and one or more nucleoli reappear. The former chromatids now function as independent chromosomes, which decondense (uncoil) and dissolve into a tangled mass of chromatin. Mitosis, the division of one nucleus into two identical nuclei, is over.
Cytokinesis: The cytoplasm divides (Fig. 4.10f)
In the final stage of cell division, the daughter nuclei emerging at the end of telophase are packaged into two separate daughter cells. This final stage of division is called cytokinesis (literally "cell movement"). During cytokinesis, the elongated parent cell separates into two smaller independent daughter cells with identical nuclei. Cytokinesis usually begins during anaphase, but it is not completed until after telophase.
The mechanism by which cells accomplish cytokinesis differs in animals and plants. In animal cells, cytoplasmic division depends on a contractile ring that pinches the cell into two approximately equal halves, similar to the way the pulling of a string closes the opening of a bag of marbles (Fig. 4.11a). Intriguingly, some types of molecules that form the contractile ring also participate in the mechanism responsible for muscle contraction. In plants, whose cells are surrounded by a rigid cell wall, a membrane-enclosed disk, known as the cell plate, forms inside the cell near the equator and then grows rapidly outward, thereby dividing the cell in two (Fig. 4.11b).
During cytokinesis, a large number of important organelles and other cellular components, including ribosomes, mitochondria, membranous structures such as Golgi bodies, and (in plants) chloroplasts, must be parcelled out to the emerging daughter cells. The mechanism accomplishing this task does not appear to predetermine which organelle is destined for which daughter cell. Instead, because most cells contain many copies of these cytoplasmic structures, each new cell is bound to receive at least a few representatives of each component. This original complement of structures is enough to sustain the cell until synthetic activity can repopulate the cytoplasm with organelles.
Sometimes cytoplasmic division does not immediately follow nuclear division, and the result is a cell containing more than one nucleus. An animal cell with two or more nuclei is known as a syncytium. The early embryos of fruit flies are multinucleated syncytia (Fig. 4.12), as are the precursors of spermatozoa in humans and many other animals. A multinucleate plant tissue is called a coenocyte; coconut milk is a nutrient-rich food composed of coenocytes.
Figure 4.11 Cytokinesis: The cytoplasm divides, producing two daughter cells. (a) Cytokinesis in an animal cell. In this dividing frog zygote, the contractile ring at the cell's periphery has contracted to form a cleavage furrow that will eventually pinch the cell in two. (b) Cytokinesis in a plant cell. In this dividing onion root cell, a cell plate that began forming near the equator of the cell expands to the periphery, separating the two daughter cells.
Figure 4.12 If cytokinesis does not follow mitosis, one cell may contain many nuclei. In fertilized Drosophila eggs, 13 rounds of mitosis take place without cytokinesis. The result is a single-celled syncytial embryo that contains several thousand nuclei. The photograph shows part of an embryo in which the nuclei are all dividing; chromosomes are in red, and spindle fibers are in green. Nuclei at the upper left are in metaphase, while nuclei toward the bottom right are progressively later in anaphase. Membranes eventually grow around these nuclei, dividing the embryo into cells.
Regulatory Checkpoints Ensure Correct Chromosome Separation
The cell cycle is a complex sequence of precisely coordinated events. In higher organisms, a cell's "decision" to divide depends on both intrinsic factors, such as conditions within the cell that register a sufficient size for division, and signals from the environment, such as hormonal cues or contacts with neighboring cells that encourage or restrain division. Once a cell has initiated events leading to division, usually during the G1 period of interphase, everything else follows like clockwork.
A number of checkpoints (moments at which the cell evaluates the results of previous steps) allow the sequential coordination of cell-cycle events. Consequently, under normal circumstances, the chromosomes replicate before they condense, and the doubled chromosomes separate to opposite poles only after correct metaphase alignment of sister chromatids ensures equal distribution to the daughter nuclei (Fig. 4.13). In one illustration of the molecular basis of checkpoints, even a single kinetochore that has not attached to spindle fibers generates a molecular signal that prevents the sister chromatids of all chromosomes from separating at their centromeres. This signal makes the beginning of anaphase dependent on the prior proper alignment of all the chromosomes at metaphase. As a result of multiple cell-cycle checkpoints, each daughter cell reliably receives the right number of chromosomes.
Breakdown of the mitotic machinery can produce division mistakes that have crucial consequences for the cell. Improper chromosome segregation, for example, can cause serious malfunction or even the death of daughter cells. Gene mutations that disrupt mitotic structures, such as the spindle, kinetochores, or centrosomes, are one source of improper segregation. Other problems occur in cells where the normal restraints on cell division, such as checkpoints, have broken down. Such cells may divide uncontrollably, leading to a tumor. We present the details of cell-cycle regulation, checkpoint controls, and cancer formation in Chapter 19.
Figure 4.13 Checkpoints help regulate the cell cycle. Cellular checkpoints (red wedges) ensure that important events in the cell cycle occur in the proper sequence. At each checkpoint, the cell determines whether prior events have been completed before it can proceed to the next step of the cell cycle. (For simplicity, we show only two chromosomes per cell.) Checkpoint questions shown in the figure:
• Is cell of sufficient size? Have proper signals been received? THEN: Duplicate chromosomes and centrosomes.
• Have the chromosomes been completely duplicated? THEN: Enter mitosis.
• Have all chromosomes arrived and aligned at the metaphase plate? THEN: Initiate anaphase.
essential concepts
• Through mitosis, diploid cells produce identical diploid progeny cells.
• At metaphase, the sister chromatids are being pulled at their kinetochores toward opposite spindle poles; these poleward forces are balanced because the chromatids are connected at the centromere.
• At the beginning of anaphase, the centromere is severed so sister chromatids separate and move to opposite spindle poles.
• Cell cycle checkpoints help ensure correct duplication and separation of chromosomes.
4.4 Meiosis: Cell Divisions That Halve Chromosome Number
learning objectives
1. Describe the key chromosome behaviors during meiosis that lead to haploid gametes.
2. Compare chromosome behaviors during mitosis and meiosis.
3. Explain how the independent alignment of homologs, and also crossing-over during the first meiotic division, each contribute to the genetic diversity of gametes.
During the many rounds of cell division within an embryo, most cells either grow and divide via the mitotic cell cycle just described, or they stop growing and become arrested in G0. These mitotically dividing and G0-arrested cells are the so-called somatic cells whose descendants continue to make up the vast majority of each organism's tissues throughout the lifetime of the individual. Early in the embryonic development of animals, however, a group of cells is set aside for a different fate. These are the germ cells: cells destined for a specialized role in the production of gametes. Germ cells arise later in plants, during floral development instead of during embryogenesis. The germ cells become incorporated in the reproductive organs (ovaries and testes in animals; ovaries and anthers in flowering plants), where they ultimately undergo meiosis, the special two-part cell division that produces gametes (eggs and sperm or pollen) containing half the number of chromosomes other body cells have.
Figure 4.14 An overview of meiosis: The chromosomes replicate once, while the nuclei divide twice. In this figure, all four chromatids of each chromosome pair are shown in the same shade of the same color. Note that the chromosomes duplicate before meiosis but they do not duplicate between meiosis I and meiosis II.
The union of haploid gametes at fertilization yields diploid offspring that carry the combined genetic heritage of two parents. Sexual reproduction therefore requires the alternation of haploid and diploid generations of cells. If gametes were diploid rather than haploid, the number of chromosomes would double in each successive generation such that in humans, for example, the children would have 92 chromosomes per cell, the grandchildren 184, and so on. Meiosis prevents this lethal, exponential accumulation of chromosomes.
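The doubling argument (92 chromosomes in the children, 184 in the grandchildren) follows from simple arithmetic, sketched below. The function name `chromosomes_per_cell` and its parameters are illustrative assumptions; the starting value of 46 is the human diploid number implied by the figures in the text.

```python
HUMAN_DIPLOID_NUMBER = 46  # implied by the text: 92 is twice this value


def chromosomes_per_cell(generation: int, gametes_are_haploid: bool,
                         start: int = HUMAN_DIPLOID_NUMBER) -> int:
    """Chromosomes per body cell after `generation` rounds of sexual reproduction.

    Generation 0 is the parents, 1 the children, 2 the grandchildren.
    """
    if generation < 0:
        raise ValueError("generation must be non-negative")
    if gametes_are_haploid:
        # Each gamete carries half; fertilization restores the same number.
        return start
    # Hypothetical case: gametes carry the full complement, so the count
    # doubles at every fertilization.
    return start * 2 ** generation


if __name__ == "__main__":
    for generation, label in enumerate(("parents", "children", "grandchildren")):
        with_meiosis = chromosomes_per_cell(generation, gametes_are_haploid=True)
        without = chromosomes_per_cell(generation, gametes_are_haploid=False)
        print(f"{label}: {with_meiosis} with haploid gametes, "
              f"{without} if gametes were diploid")
```

With haploid gametes the count stays constant from one generation to the next, whereas the hypothetical diploid-gamete case grows as a power of two.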
In Meiosis, the Chromosomes Replicate Once but the Nucleus Divides Twice
Unlike mitosis, meiosis consists of two successive nuclear divisions, logically named division I of meiosis and division II of meiosis, or simply meiosis I and meiosis II. With each round, the cell passes through a prophase, metaphase, anaphase, and telophase. In meiosis I, the parent nucleus divides to form two daughter nuclei; in meiosis II, each of the two daughter nuclei divides, resulting in four nuclei (Fig. 4.14). These four nuclei, the final products of meiosis, become partitioned in four separate daughter cells because cytokinesis occurs after both rounds of division. The chromosomes duplicate at the start of meiosis I, but they do not duplicate in meiosis II, which explains why the gametes contain half the number of chromosomes found in other body cells. A close look at each round of meiotic division reveals the mechanisms by which each gamete comes to receive one full haploid set of chromosomes.
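The bookkeeping behind "replicate once, divide twice" can be laid out as a small table-building sketch. The function name `meiosis_counts` and the stage labels are illustrative assumptions; the human diploid number of 46 is used only as an example input.

```python
def meiosis_counts(diploid_number: int) -> list[tuple[str, int, int, int]]:
    """Tabulate meiosis as (stage, nuclei, chromosomes per nucleus, chromatids per chromosome)."""
    if diploid_number <= 0 or diploid_number % 2:
        raise ValueError("a diploid number must be a positive even integer")
    haploid_number = diploid_number // 2
    return [
        # One nucleus; each chromosome is a single chromatid.
        ("before replication", 1, diploid_number, 1),
        # The single round of replication: each chromosome now has two sisters.
        ("after replication", 1, diploid_number, 2),
        # Meiosis I separates homologs: half the chromosomes, sisters still joined.
        ("after meiosis I", 2, haploid_number, 2),
        # Meiosis II separates sisters with no replication in between.
        ("after meiosis II", 4, haploid_number, 1),
    ]


if __name__ == "__main__":
    for stage, nuclei, chromosomes, chromatids in meiosis_counts(46):
        print(f"{stage}: {nuclei} nuclei, {chromosomes} chromosomes per nucleus, "
              f"{chromatids} chromatid(s) per chromosome")
```

The chromosome number per nucleus is halved at meiosis I, and because no replication precedes meiosis II, the four final nuclei each hold one haploid set of single-chromatid chromosomes.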
During Meiosis I, Homologs Pair, Exchange Parts, and Then Segregate
The events of meiosis I are unique among nuclear divisions (Fig. 4.15, meiosis I). The process begins with the replication of chromosomes, after which each one consists of two sister chromatids. A key to understanding meiosis I is the observation that the centromeres joining these chromatids remain intact throughout the entire division, rather than splitting as in mitosis.
As the division proceeds, homologous chromosomes align across the cellular equator to form a coupling that ensures proper chromosome segregation to separate nuclei. Moreover, during the time homologous chromosomes face each other across the equator, the maternal and paternal chromosomes of each homologous pair may exchange parts, creating new combinations of alleles at different genes along the chromosomes. Afterward, the two homologous chromosomes, each still consisting of two sister chromatids connected at a single, unsplit centromere, are pulled to opposite poles of the spindle. As a result, it is homologous chromosomes (rather than sister chromatids as in mitosis) that segregate into different daughter cells at the conclusion of the first meiotic division.
With this overview in mind, let us take a closer look at the specific events of meiosis I, remembering that we analyze a dynamic, flowing sequence of cellular events by breaking it down somewhat arbitrarily into the easily pictured, traditional phases.
Prophase I: Homologs condense and pair, and crossing-over occurs
Among the critical events of prophase I are the condensation of chromatin, the pairing of homologous chromosomes, and the reciprocal exchange of genetic information between these paired homologs. Figure 4.15 shows a generalized view of prophase I; however, research suggests that the exact sequence of events may vary in different species. These complicated processes can take many days, months, or even years to complete. For example, in the female germ cells of several species, including humans, meiosis is suspended at prophase I for many years until ovulation (as will be discussed further in Section 4.5).
FEATURE FIGURE 4.15 Meiosis: One Diploid Cell Produces Four Haploid Cells
Figure 4.15 To aid visualization of the chromosomes, the figure is simplified in two ways: (1) The nuclear envelope is not shown during prophase of either meiotic division. (2) The chromosomes are shown as fully condensed at zygotene; in reality, full condensation is not achieved until diakinesis.
Meiosis I: A reductional division
Prophase I: Leptotene 1. Chromosomes thicken and become visible, but the chromatids remain invisible. 2. Centrosomes begin to move toward opposite poles.
Prophase I: Zygotene 1. Homologous chromosomes enter synapsis. 2. The synaptonemal complex forms.
Prophase I: Pachytene 1. Synapsis is complete. 2. Crossing-over, genetic exchange between nonsister chromatids of a homologous pair, occurs.
Prophase I: Diplotene 1. Synaptonemal complex dissolves. 2. A tetrad of four chromatids is visible. 3. Crossover points appear as chiasmata, holding nonsister chromatids together. 4. Meiotic arrest occurs at this time in many species.
Prophase I: Diakinesis 1. Chromatids thicken and shorten. 2. At the end of prophase I, the nuclear membrane (not shown earlier) breaks down, and the spindle begins to form.
Metaphase I 1. Tetrads line up along the metaphase plate. 2. Each chromosome of a homologous pair attaches to fibers from opposite poles. 3. Sister chromatids attach to fibers from the same pole.
Anaphase I 1. The centromere does not divide. 2. The chiasmata dissolve. 3. Homologous chromosomes move to opposite poles.
Telophase I 1. The nuclear envelope re-forms. 2. Resultant cells have half the number of chromosomes, each consisting of two sister chromatids.
Interkinesis 1. This is similar to interphase with one important exception: No chromosomal duplication takes place. 2. In some species, the chromosomes decondense; in others, they do not.
Meiosis II: An equational division
Prophase II 1. Chromosomes condense. 2. Centrioles move toward the poles. 3. The nuclear envelope breaks down at the end of prophase II (not shown).
Metaphase II 1. Chromosomes align at the metaphase plate. 2. Sister chromatids attach to spindle fibers from opposite poles.
Anaphase II 1. Centromeres divide, and sister chromatids move to opposite poles.
Telophase II 1. Chromosomes begin to uncoil. 2. Nuclear envelopes and nucleoli (not shown) re-form.
Cytokinesis 1. The cytoplasm divides, forming four new haploid cells.
Figure 4.16 Prophase I of meiosis at very high magnification. Sister chromatid
1
+
Synaptonemal complex
Sister chromatid 2 Synaptonemal complex
Homologous chromosomes
Sister chromatid 3 Recombination nodules
+
Sister chromatid 4
(a) Leptotene: Threadlike chromosomes begin to condense and thicken, becoming visible as discrete structures. Although the chromosomes have duplicated, the sister chromatids of each chromosome are not yet visible in the microscope.
(b) Zygotene: Chromosomes are clearly visible and begin pairing with homologous chromosomes along the synaptonemal complex to form a bivalent, or tetrad.
(c) Pachytene: Full synapsis of homologs. Recombination nodules appear along the synaptonemal complex.
(d) Diplotene: Bivalent appears to pull apart slightly but remains connected at crossover sites, called chiasmata.
Leptotene (from the Greek for "thin" and "delicate") is the first definable substage of prophase I, the time when the long, thin chromosomes begin to thicken (see Fig. 4.16a for a more detailed view). Each chromosome has already duplicated prior to prophase I (as in mitosis) and thus consists of two sister chromatids affixed at a centromere. At this point, however, these sister chromatids are so tightly bound together that they are not yet visible as separate entities.
Zygotene (from the Greek for "conjugation") begins as each chromosome seeks out its homologous partner and the matching chromosomes become zipped together in a process known as synapsis. The "zipper" itself is an elaborate protein structure called the synaptonemal complex that aligns the homologs with remarkable precision, juxtaposing the corresponding genetic regions of the chromosome pair (Fig. 4.16b).
Pachytene (from the Greek for "thick" or "fat") begins at the completion of synapsis when homologous chromosomes are united along their length. Each synapsed chromosome pair is known as a bivalent (because it encompasses two chromosomes), or a tetrad (because it contains four chromatids). On one side of the bivalent is a maternally derived chromosome, on the other side a paternally derived one. Because X and Y chromosomes are not identical, they do not synapse completely. However, the pseudoautosomal regions previously shown in Fig. 4.8 provide small stretches of similarity (or "homology") between the X and the Y chromosomes that allow them to pair with each other during meiosis I in males.
(e) Diakinesis: Further condensation of chromatids. Nonsister chromatids that have exchanged parts by crossing-over remain closely associated at chiasmata.
During pachytene, structures called recombination nodules begin to appear along the synaptonemal complex, and an exchange of parts between nonsister (that is, between maternal and paternal) chromatids occurs at these nodules (see Fig. 4.16c for details). Such an exchange is known as crossing-over; it results in the recombination of genetic material. As a result of crossing-over, chromatids may no longer be of purely maternal or paternal origin; however, no genetic information is gained or lost, so all chromatids retain their original size.
Diplotene (from the Greek for "twofold" or "double") is signaled by the gradual dissolution of the synaptonemal zipper complex and a slight separation of regions of the homologous chromosomes (see Fig. 4.16d). The aligned homologous chromosomes of each bivalent nonetheless remain very tightly merged at intervals along their length called chiasmata (singular, chiasma), which represent the sites where crossing-over occurred.
Diakinesis (from the Greek for "double movement") is accompanied by further condensation of the chromatids. Because of this chromatid thickening and shortening, it can now clearly be seen that each tetrad consists of four separate chromatids, or viewed in another way, that the two homologous chromosomes of a bivalent are each composed of two sister chromatids held together at a centromere (see Fig. 4.16e). Nonsister chromatids that have undergone crossing-over remain closely associated at chiasmata. The end of diakinesis is analogous to the prometaphase of mitosis: The nuclear envelope breaks down, and the microtubules of the spindle apparatus begin to form.
Metaphase I: Paired homologs attach to spindle fibers from opposite poles
During mitosis, each sister chromatid has a kinetochore that becomes attached to microtubules emanating from opposite spindle poles. During meiosis I, the situation is different. The kinetochores of sister chromatids fuse, so that each chromosome contains only a single functional kinetochore. During metaphase I (see Fig. 4.15, meiosis I), it is the kinetochores of homologous chromosomes that attach to microtubules from opposite spindle poles. As a result, in chromosomes aligned at the metaphase plate, the kinetochores of maternally and paternally derived chromosomes are subject to pulling forces from opposite spindle poles, balanced by the physical connections between homologs at chiasmata. Each bivalent's alignment and hookup is independent of that of every other bivalent, so the chromosomes facing each pole are a random mix of maternal and paternal origin.
Anaphase I: Homologs move to opposite spindle poles
At the onset of anaphase I, the chiasmata joining homologous chromosomes dissolve, which allows the maternal and paternal homologs to begin to move toward opposite spindle poles (see Fig. 4.15, meiosis I). Note that in the first meiotic division, the centromeres do not divide as they do in mitosis. Thus, from each homologous pair, one chromosome consisting of two sister chromatids joined at their centromere segregates to each spindle pole.
Recombination through crossing-over plays an important role in the proper segregation of homologous chromosomes during the first meiotic division. The chiasmata hold the homologs together and thus ensure that their kinetochores remain attached to opposite spindle poles throughout metaphase. When recombination does not occur within a bivalent, mistakes in hookup and conveyance may cause homologous chromosomes to move to the same pole, instead of segregating to opposite poles. In some organisms, however, proper segregation of nonrecombinant chromosomes nonetheless occurs through other pairing processes. Investigators do not yet completely understand the nature of these processes and are currently evaluating several models to explain them.
Telophase I: Nuclear envelopes re-form
The telophase of the first meiotic division, or telophase I, takes place when nuclear membranes begin to form around the chromosomes that have moved to the poles. Each of the incipient daughter nuclei contains one-half the number of chromosomes in the original parent nucleus, but each chromosome consists of two sister chromatids joined at the centromere (see Fig. 4.15, meiosis I). Because the number of chromosomes is reduced to one-half the normal diploid number, meiosis I is often called a reductional division.
In most species, cytokinesis follows telophase I, with daughter nuclei becoming enclosed in separate daughter cells. A short interphase then ensues. During this time, the chromosomes usually decondense, in which case they must recondense during the prophase of the subsequent second meiotic division. In some species, however, the chromosomes simply stay condensed. Most importantly, there is no S phase during the interphase between meiosis I and meiosis II; that is, the chromosomes do not replicate during meiotic interphase. The relatively brief interphase between meiosis I and meiosis II is known as interkinesis.
During Meiosis II, Sister Chromatids Separate to Produce Haploid Gametes
The second meiotic division (meiosis II) proceeds in a fashion very similar to that of mitosis, but because the number of chromosomes in each dividing nucleus has already been reduced by half, the resulting daughter cells are haploid. The same process occurs in each of the two daughter cells generated by meiosis I, producing four haploid cells at the end of this second meiotic round (see Fig. 4.15, meiosis II).
Prophase II: The chromosomes condense
If the chromosomes decondensed during the preceding interphase, they recondense during prophase II. At the end of prophase II, the nuclear envelope breaks down, and the spindle apparatus re-forms.
Metaphase II: Chromosomes align at the metaphase plate
The kinetochores of sister chromatids attach to microtubule fibers emanating from opposite poles of the spindle apparatus, just as in mitotic metaphase. There are nonetheless two significant features of metaphase II that distinguish it from mitosis. First, the number of chromosomes is one-half that in mitotic metaphase of the same species. Second, in most chromosomes, the two sister chromatids are no longer strictly identical because of the recombination through crossing-over that occurred during meiosis I. The sister chromatids still contain the same genes, but they may carry different combinations of alleles.
Anaphase II: Sister chromatids move to opposite spindle poles
Just as in mitosis, severing of the centromeric connection between sister chromatids allows them to move toward opposite spindle poles during anaphase II.
Telophase II: Nuclear membranes re-form, and cytokinesis follows
Membranes form around each of four daughter nuclei in telophase II, and cytokinesis places each nucleus in a separate cell. The result is four haploid gametes. Note that at the end of meiosis II, each daughter cell (that is, each gamete) has the same number of chromosomes as the parental cell present at the beginning of this division. For this reason, meiosis II is termed an equational division.
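The bookkeeping behind the terms "reductional" and "equational" can be sketched in a few lines of Python. This is only an illustrative tally, not part of the original text: the function name `meiosis_counts` is hypothetical, and the human diploid number of 46 is used as the example input.

```python
def meiosis_counts(diploid_number):
    """Tally nuclei, chromosomes per nucleus, and chromatids per chromosome.

    Hypothetical helper for illustration; it assumes one round of
    replication followed by two nuclear divisions, as described in the text.
    """
    haploid_number = diploid_number // 2
    return [
        # (stage, nuclei, chromosomes per nucleus, chromatids per chromosome)
        ("before replication", 1, diploid_number, 1),
        ("prophase I (after replication)", 1, diploid_number, 2),
        ("after meiosis I (reductional)", 2, haploid_number, 2),
        ("after meiosis II (equational)", 4, haploid_number, 1),
    ]


for stage, nuclei, chromosomes, chromatids in meiosis_counts(46):
    print(f"{stage}: {nuclei} nuclei, {chromosomes} chromosomes each, "
          f"{chromatids} chromatid(s) per chromosome")
```

Meiosis I halves the chromosome number while each chromosome keeps two chromatids; meiosis II leaves the chromosome number unchanged and separates the sister chromatids.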
Mistakes in Meiosis Produce Defective Gametes
Figure 4.17 How meiosis contributes to genetic diversity. (a) The variation resulting from the independent assortment of nonhomologous chromosomes increases with the number of chromosomes in the genome. (b) Crossing-over between homologous chromosomes ensures that each gamete is unique.
[Figure panels: (a) Independent assortment, showing alternative orientations of two chromosome pairs; (b) Recombination.]
Meiosis produces four haploid cells, one (egg) or all (sperm) of which can become gametes. None of these is identical to each other or to the original cell, because meiosis results in combinatorial change.
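The claim in Fig. 4.17a that variation grows with chromosome number can be made concrete with a short calculation. The sketch below assumes that each homologous pair orients independently and that crossing-over is ignored, so a genome with n pairs can produce 2^n chromosome combinations; the function name is hypothetical.

```python
def assortment_combinations(haploid_number):
    """Number of maternal/paternal chromosome combinations in gametes.

    Assumes independent orientation of every homologous pair at
    metaphase I and no crossing-over (an illustrative simplification).
    """
    return 2 ** haploid_number


for organism, n in [("two chromosome pairs", 2), ("Drosophila", 4), ("human", 23)]:
    print(f"{organism} (n = {n}): {assortment_combinations(n):,} combinations")
```

Crossing-over raises the number of distinct gametes well beyond this count, which is why the figure treats the two mechanisms separately.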
essential concepts
• In meiosis, chromosomes replicate once (before meiosis I), but the nucleus divides twice (meiosis I and II).
• During metaphase of the first meiotic division, homologous chromosomes connect to opposite spindle poles. The independent alignment of each pair of homologs ensures the independent assortment of genes carried on different chromosomes.
• Crossing-over during the first meiotic division maintains the connection between homologous chromosomes until anaphase I and contributes to the genetic diversity of gametes.
• Sister chromatids separate from each other during meiosis II so that gametes have only one copy of each chromosome.
• Fertilization (the union of egg and sperm) restores the diploid number of chromosomes (2n) to the zygote.
• Errors during meiosis may produce gametes with missing or extra chromosomes, which often is lethal to offspring.
4.5 Gametogenesis
learning objectives
1. Compare the processes of oogenesis and spermatogenesis in humans.
2. Distinguish between the sex chromosome complements of human female and male germ-line cells at different stages of gametogenesis.
In all sexually reproducing animals, the embryonic germ cells (collectively known as the germ line) undergo a series of mitotic divisions that yield a collection of specialized diploid cells, which subsequently divide by meiosis to produce haploid cells. As with other biological processes, many variations on this general pattern have been observed. In some species, the haploid cells resulting from meiosis are the gametes themselves, while in other species, those cells must undergo a specific plan of differentiation to fulfill that function. Moreover, in certain organisms, the four haploid products of a single meiosis do not all become gametes. Gamete formation, or gametogenesis, thus gives rise to haploid gametes marked not only by the events of meiosis per se but also by cellular events that precede and follow meiosis. Here we illustrate gametogenesis with a description of egg and sperm formation in humans. The details of gamete formation in several other organisms appear throughout the book in discussions of specific experimental studies.
Oogenesis in Humans Produces One Ovum from Each Primary Oocyte
The end product of egg formation in humans is a large, nutrient-rich ovum whose stored resources can sustain the early embryo. The process, known as oogenesis (Fig. 4.18), begins when diploid germ cells in the ovary, called oogonia (singular, oogonium), multiply rapidly by mitosis and produce a large number of primary oocytes, which then undergo meiosis.
For each primary oocyte, meiosis I results in the formation of two daughter cells that differ in size, so this division is asymmetric. The larger of these cells, the secondary oocyte, receives over 95% of the cytoplasm. The other small sister cell is known as the first polar body. During meiosis II, the secondary oocyte undergoes another asymmetrical division to produce a large haploid ovum and a small, haploid second polar body. The first polar body usually arrests its development. The two small polar bodies apparently serve no function and disintegrate, leaving one large haploid ovum as the functional gamete. Thus, only one of the three (or rarely, four) products of a single meiosis serves as a female gamete. A normal human ovum carries 22 autosomes and an X sex chromosome.
Oogenesis begins in the fetus. By six months after conception, the fetal ovaries are fully formed and contain about half a million primary oocytes arrested in the diplotene substage of prophase I. These cells, with their homologous chromosomes locked in synapsis, were thought for decades to be the only oocytes the female will produce. If so, a girl is born with all the oocytes she will ever possess. Remarkably, recent research has brought this long-held theory into question. Scientists have shown that germ-line precursor cells removed from adult ovaries can produce new eggs in a petri dish. However, it is not yet known whether these eggs are viable or whether these germ-line cells normally produce eggs in adults.
From the onset of puberty at about age 12, until menopause some 35-40 years later, most women release one primary oocyte each month (from alternate ovaries), amounting to roughly 480 oocytes released during the reproductive years. The remaining primary oocytes disintegrate during menopause. At ovulation, a released oocyte completes meiosis I and proceeds as far as the metaphase of meiosis II. If the oocyte is then fertilized, that is, penetrated by a sperm nucleus, it quickly completes meiosis II. The nuclei of the sperm and ovum then fuse to form the diploid nucleus of the zygote, and the zygote divides by mitosis to produce a functional embryo. In contrast, unfertilized oocytes exit the body during the menses stage of the menstrual cycle.
The long interval before completion of meiosis in oocytes released by women in their 30s, 40s, and 50s may contribute to the observed correlation between maternal age and meiotic segregational errors, including those that produce trisomies. Women in their mid-20s, for example, run a very small risk of trisomy 21; only 0.05% of children born to women of this age have Down syndrome. During the later childbearing years, however, the risk rapidly rises; at age 35, it is 0.9% of live births, and at age 45, it is 3%. You would not expect this age-related increase in risk if meiosis were completed before the mother's birth.
Figure 4.18 In humans, egg formation begins in the fetal ovaries and arrests during the prophase of meiosis I. Fetal ovaries contain about 500,000 primary oocytes arrested in the diplotene substage of meiosis I. If the egg released during a menstrual cycle is fertilized, meiosis is completed. Only one of the three cells produced by meiosis serves as the functional gamete, or ovum.
[Figure labels: Mitosis (occurs in fetal ovary); Arrest at diplotene of meiosis I, during which the oocyte grows and accumulates nutrients; Meiosis I: asymmetrical division (completed at ovulation); Meiosis II: asymmetrical division (occurs only after fertilization); arrested primary oocyte; first polar body; second polar body; secondary oocyte; mature ovum; ovary; ovarian ligament; 3. mature follicle with secondary oocyte; 4. ruptured follicle; 5. released secondary oocyte.]
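The figures quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the names are hypothetical, one ovulation per month is assumed throughout the reproductive years, and the Down syndrome percentages from the text are restated per 10,000 live births so that the comparison uses whole numbers.

```python
OVULATIONS_PER_YEAR = 12  # assumption: one oocyte released each month


def oocytes_released(reproductive_years):
    """Approximate number of oocytes released over the reproductive years."""
    return OVULATIONS_PER_YEAR * reproductive_years


# Percentages from the text (0.05%, 0.9%, 3%) expressed per 10,000 live births.
DOWN_SYNDROME_PER_10000 = {"mid-20s": 5, "35": 90, "45": 300}


def fold_increase(age, baseline="mid-20s"):
    """Risk at a given maternal age relative to the baseline age group."""
    return DOWN_SYNDROME_PER_10000[age] / DOWN_SYNDROME_PER_10000[baseline]


print(oocytes_released(35), oocytes_released(40))
print(fold_increase("35"), fold_increase("45"))
```

Forty reproductive years give the "roughly 480" oocytes mentioned in the text, and the quoted percentages correspond to an 18-fold rise in risk by age 35 and a 60-fold rise by age 45 relative to the mid-20s.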
Spermatogenesis in Humans Produces Four Sperm from Each Primary Spermatocyte
The production of sperm, or spermatogenesis (Fig. 4.19), begins in the male testes in germ cells known as spermatogonia. Mitotic divisions of the spermatogonia produce many diploid cells, the primary spermatocytes. Unlike primary oocytes, primary spermatocytes undergo a symmetrical meiosis I, producing two secondary spermatocytes, each of which undergoes a symmetrical meiosis II. At the conclusion of meiosis, each original primary spermatocyte thus yields four equivalent haploid spermatids. These spermatids then mature by developing a characteristic whiplike tail and by concentrating all their chromosomal material in a head, thereby becoming functional sperm. A human sperm, much smaller than the ovum it will fertilize, contains 22 autosomes and either an X or a Y sex chromosome.
The timing of sperm production differs radically from that of egg formation. The meiotic divisions allowing conversion of primary spermatocytes to spermatids begin only at puberty, but meiosis then continues throughout a man's life. The entire process of spermatogenesis takes about 48-60 days: 16-20 for meiosis I, 16-20 for meiosis II, and 16-20 for the maturation of spermatids into fully functional sperm. Within each testis after puberty, millions of sperm are always in production, and a single ejaculate can contain up to 300 million. Over a lifetime, a man can produce billions of sperm, almost equally divided between those bearing an X and those bearing a Y chromosome.
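The contrast between the two pathways, and the way the stage durations add up, can be summarized in a small sketch. The dictionary and function names below are hypothetical and simply restate numbers given in the text.

```python
# Functional gametes produced per primary cell, as stated in the text.
GAMETES_PER_PRIMARY_CELL = {"primary oocyte": 1, "primary spermatocyte": 4}

# (minimum, maximum) duration in days of each stage of spermatogenesis.
STAGE_DAYS = {
    "meiosis I": (16, 20),
    "meiosis II": (16, 20),
    "spermatid maturation": (16, 20),
}


def functional_gametes(cell_type, primary_cells):
    """Gametes expected from a given number of primary cells."""
    return GAMETES_PER_PRIMARY_CELL[cell_type] * primary_cells


def total_duration(stage_days):
    """Sum the minimum and maximum durations over all stages."""
    shortest = sum(low for low, _ in stage_days.values())
    longest = sum(high for _, high in stage_days.values())
    return shortest, longest


print(functional_gametes("primary oocyte", 100))
print(functional_gametes("primary spermatocyte", 100))
print(total_duration(STAGE_DAYS))
```

The three stage ranges sum to the 48-60 days quoted for the whole process.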
Figure 4.19 Human sperm form continuously in the testes after puberty. Spermatogonia are located near the exterior of seminiferous tubules in a human testis. Once they divide to produce the primary spermatocytes, the subsequent stages of spermatogenesis (meiotic divisions in the spermatocytes and maturation of spermatids into sperm) occur successively closer to the middle of the tubule. Mature sperm are released into the central lumen of the tubule for ejaculation.
[Figure labels: spermatogonia; primary spermatocyte; secondary spermatocyte; spermatids; sperm. Stages: mitosis (occurs in adult testis), meiosis I, meiosis II, differentiation.]
essential concepts
• Diploid germ cell precursors proliferate by mitosis and then undergo meiosis to produce haploid gametes.
• Human females are born with oocytes arrested in prophase of meiosis I. Meiosis resumes at ovulation but is not completed until fertilization.
• Spermatogenesis begins at puberty and continues through the lifetimes of human males.
• The two meiotic divisions of oogenesis are asymmetrical, so that a primary oocyte results in a single egg. The two meiotic divisions of spermatogenesis are symmetrical, so that a primary spermatocyte results in four sperm.
• All human oocytes contain a single X chromosome; sperm contain either an X or a Y.
4.6 Validation of the Chromosome Theory
learning objectives
1. Describe the key events of meiosis that explain Mendel's first and second laws.
2. Infer from the results of crosses whether or not a trait is sex-linked.
3. Predict phenotypes associated with nondisjunction of sex chromosomes.
We have presented thus far two circumstantial lines of evidence in support of the chromosome theory of inheritance. First, the phenotype of sexual identity is associated with the inheritance of particular chromosomes. Second, the events of mitosis, meiosis, and gametogenesis ensure a constant number of chromosomes in the somatic cells of all members of a species over time; one would expect the genetic material to exhibit this kind of stability even in organisms with very different modes of reproduction. Final acceptance of the chromosome theory depended on researchers going beyond the circumstantial evidence to a rigorous demonstration of two key points: (1) that the inheritance of genes corresponds with the inheritance of chromosomes in every detail, and (2) that the transmission of particular chromosomes coincides with the transmission of specific traits other than sex determination.
Mendel's Laws Correlate with Chromosome Behavior During Meiosis
Walter Sutton first outlined the chromosome theory of inheritance in 1902-1903, building on the theoretical ideas and experimental results of Theodor Boveri in Germany, E. B. Wilson in New York, and others. In a 1902 paper, Sutton speculated that "the association of paternal and maternal chromosomes in pairs and their subsequent separation during the reducing division [that is, meiosis I] . . . may constitute the physical basis of the Mendelian law of heredity." In 1903, he suggested that chromosomes carry Mendel's hereditary units for the following reasons:
1. Every cell contains two copies of each kind of chromosome, and there are two copies of each kind of gene.
2. The chromosome complement, like Mendel's genes, appears unchanged as it is transmitted from parents to offspring through generations.
3. During meiosis, homologous chromosomes pair and then separate to different gametes, just as the alternative alleles of each gene segregate to different gametes.
4. Maternal and paternal copies of each chromosome pair move to opposite spindle poles without regard to the assortment of any other homologous chromosome pair, just as the alternative alleles of unrelated genes assort independently.
5. At fertilization, an egg's set of chromosomes unites with a randomly encountered sperm's set of chromosomes, just as alleles obtained from one parent unite at random with those from the other parent.
6. In all cells derived from the fertilized egg, one-half of the chromosomes and one-half of the genes are of maternal origin, the other half of paternal origin.
The two parts of Table 4.4 show the intimate relationship between the chromosome theory of inheritance and Mendel's laws of segregation and independent assortment. If Mendel's genes for pea shape and pea color are assigned to different (that is, nonhomologous) chromosomes, the behavior of chromosomes can be seen to parallel the behavior of genes. Walter Sutton's observation of these parallels led him to propose that chromosomes and genes are physically connected in some manner. Meiosis ensures that each gamete will contain only a single chromatid of a bivalent and thus only a single allele of any gene on that chromatid (Table 4.4a). The independent behavior of two bivalents during meiosis means that the genes carried on different chromosomes will assort into gametes independently (Table 4.4b).
From a review of Fig. 4.17a (on p. 104), which follows two different chromosome pairs through the process of meiosis, you might wonder whether crossing-over abolishes the clear correspondence between Mendel's laws and the movement of chromosomes. The answer is no. Each chromatid of a homologous chromosome pair contains only one copy of a given gene, and only one chromatid from each pair of homologs is incorporated into each gamete. Because alternative alleles remain on different chromatids even after crossing-over has occurred, alternative alleles still segregate to different gametes as demanded by Mendel's first law. And because the orientation of nonhomologous chromosomes is completely random with respect to each other during both meiotic divisions, the genes on different chromosomes assort independently even if crossing-over occurs, as demanded by Mendel's second law.
In Fig. 4.17a, you can see that without recombination, each of the two random alignments of the nonhomologous chromosomes results in the production of only two of the four gamete types: AB and ab for one orientation, and Ab and aB for the other orientation. With recombination, each of the alignments of alleles in Fig. 4.17a may in fact generate all four gamete types. (Imagine a crossover switching the positions of A and a on nonsister chromatids in Fig. 4.17a, as happens in Fig. 4.17b.) Thus, both the random alignment of nonhomologous chromosomes and crossing-over contribute to the phenomenon of independent assortment.
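The correspondence between chromosome behavior and Mendel's ratios can be checked by enumeration. The following sketch is only an illustration under stated assumptions: each gene sits on a different chromosome pair, every gamete type is equally likely, and an uppercase letter marks the dominant allele. The function names are hypothetical.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """List the gamete types of a parent.

    genotype is a list of allele pairs, one pair per gene, with each gene
    assumed to lie on a different chromosome pair, e.g. [("R", "r"), ("Y", "y")].
    """
    return ["".join(alleles) for alleles in product(*genotype)]


def self_cross(genotype):
    """Count offspring phenotypes from crossing two parents of this genotype."""
    counts = Counter()
    for egg, pollen in product(gametes(genotype), repeat=2):
        # A trait shows the dominant phenotype if either allele is uppercase.
        phenotype = tuple(
            "dominant" if (a.isupper() or b.isupper()) else "recessive"
            for a, b in zip(egg, pollen)
        )
        counts[phenotype] += 1
    return counts


print(gametes([("R", "r"), ("Y", "y")]))
print(self_cross([("R", "r")]))
print(self_cross([("R", "r"), ("Y", "y")]))
```

With one gene the enumeration gives three dominant to one recessive offspring; with two genes on different chromosomes it gives the 9:3:3:1 distribution, matching the segregation and independent assortment described above.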
Table 4.4 How the Chromosome Theory of Inheritance Explains Mendel's Laws
(a) The Law of Segregation
[Diagram: in the F1, the homologous pair for seed texture carries round (R) on one homolog and wrinkled (r) on the other; the homologs separate at anaphase of meiosis I, and meiosis II yields the possible gametes R and r. F2 Punnett square: RR, Rr, Rr, rr.]
In an F1 hybrid plant, the allele for round-seeded peas (R) is found on one chromosome, and the allele for wrinkled peas (r) is on the homologous chromosome. The pairing between the two homologous chromosomes during prophase through metaphase of meiosis I makes sure that the homologs will separate to opposite spindle poles during anaphase I. At the end of meiosis II, two types of gametes have been produced: half have R, and half have r, but no gametes have both alleles. Thus, the separation of homologous chromosomes at meiosis I corresponds to the segregation of alleles. As the Punnett square shows, fertilization of 50% R and 50% r eggs with the same proportion of R and r pollen leads to Mendel's 3:1 ratio in the F2 generation.
(b) The Law of Independent Assortment
[Diagram: in the F1, one homologous pair carries the seed color alleles yellow (Y) and green (y), and a second pair carries the seed texture alleles round (R) and wrinkled (r); two alternative alignments are shown at anaphase of meiosis I, followed by meiosis II. Possible gametes: Y R (yellow round), Y r (yellow wrinkled), y R (green round), y r (green wrinkled). F2 Punnett square of the 16 combinations.]
One pair of homologous chromosomes carries the gene for seed texture (alleles R and r). A second pair of homologous chromosomes carries the gene for seed color (alleles Y and y). Each homologous pair aligns at random at the metaphase plate during meiosis I, independently of the other homologous pair. Thus, two equally likely configurations are possible for the migration of any two chromosome pairs toward the poles during anaphase I. As a result, a dihybrid individual will generate four equally likely types of gametes with regard to the two traits in question. The Punnett square affirms that independent assortment of traits carried by nonhomologous chromosomes produces Mendel's 9:3:3:1 ratio.
Specific Traits Are Transmitted with Specific Chromosomes
The fate of a theory depends on whether its predictions can be validated. Because genes determine traits, the prediction that chromosomes carry genes could be tested by breeding experiments that would show whether transmission of a specific chromosome coincides with transmission of a specific trait. Cytologists knew that one pair of chromosomes, the sex chromosomes, determines whether an individual is male or female. Would similar correlations exist for other traits?
A gene determining eye color on the Drosophila X chromosome
Thomas Hunt Morgan, an American experimental biologist with training in embryology, headed the research group
whose findings eventually established a firm experimental base for the chromosome theory. Morgan chose to work with the fruit fly Drosophila melanogaster because it is extremely prolific and has a very short generation time, taking only 12 days to develop from a fertilized egg into a mature adult capable of producing hundreds of offspring. Morgan fed his flies mashed bananas and housed them in empty milk bottles capped with wads of cotton. In 1910, a white-eyed male appeared among alatge group of flies with brick-red eyes. A mutation had apparently altered a gene determining eye color, changing it from the normal wild-type allele specifying red to a new allele
that produced white. When Morgan allowed the whiteeyed male to mate with its red-eyed sisters, all the flies of the F1 generation had red eyes; the red allele was clearly dominant to the white (Fig.4.20, cross A). Establishing a pattern of nomenclature for Drosophila geneticists, Morgan named the gene identified by the abnormal white eye color, the white gene, for the mutation
that revealed its existence. The normal wild-type allele of the vvhite gene, abbreviated w+ , is for brick-red eyes, while the counterpart mutant w allele results in white eye color.
Theory
111
The superscript * signifies the wild type. By writing the gene name and abbreviation in lowercase, Morgan symbolized that the mutant w allele is recessive to the wildtype w+ . (If a Drosophila mutation results in a dominant non-wild-type phenotype, the first letter of the gene name or of its abbreviation is capitalized; thus the mutation known as Bar eyes is dominant to the wild-typ e Bar+ allele. (See the Guidelines for Gene Nomenclature on p. A-1.) Morgan then crossed the red-eyed males of the F1 generation with their red-eyed sisters (Fig.4.20, cross B) and obtained an F2 generation with the predicted 3:1 ratio of red to white eyes. But there was something askew in the pattern: Among the red-eyed offspring, there were two females for every one male, and all the white-eyed offspring were males. This result was surprisingly different from the equal transmission to both sexes of the Mendelian traits discussed in Chapters 2 and 3. In these fruit flies, the ratio of eye color phenotypes was not the same in male and female progeny. By mating F2 red-eyed females with their white-eyed brothers (Fig. 4.20, cross C), Morgan obtained some females with white eyes, which then allowed him to mate
Figure 4.20 A Drosophilo eye color gene is located on the X chromosome.
X-linkage explains the inheritance of alleles of the whlte gene
in this series of crosses performed byThomas Hunt Morgan. The progeny of crosses A, B, and C outlined with green dotted boxes are those used as the parents in the next cross of the series.
Cross B
Cross A
x**
x*'
ffi
w
o+
t-
ffiI
sft d
X
/\
t-
xdx
x**x*
X
ffi
ffid
o+
r
xwY
/
x**
x*
i
;;;;l
ffifd
I
\\
t-
i
I
XW,Y
I
I
XWY
ffi i4l iry ffi d I I
I I
?
I I
J
xfr
x'
ffi9d
XWY
w
Ffu d
.J
'1
x*x*
\f
dftXffi gd
./
t-
XW'Y
3 red
Cross D
x**x*
ffi d
\\\\ \\
x'
o+
All progeny red-eyed
Cross C
x*t
X
ffiI ffi ffid
L __-_l
J
XdY
x**x*
x"Y
xfrY X
ffi d
X X'Y
).t
EftIU
Crisscross inheritance
white
I
112
Chapter
4
The Chromosome Theory of Inheritance
a white-eyed female with a red-eyed wild-type male (Fig. 4.20, cross D). The result was exclusively red-eyed daughters and white-eyed sons. The pattern seen in cross D is known as crisscross inheritance because the males inherit their eye color from their mothers, while the daughters inherit their eye color from their fathers. Note in Fig. 4.20 that the results of the reciprocal crosses red female × white male (cross A) and white female × red male (cross D) are not identical, again in contrast with Mendel's findings.
From the data, Morgan reasoned that the white gene for eye color is X-linked, that is, carried by the X chromosome. (Note that while symbols for genes and alleles are italicized, symbols for chromosomes are not.) The Y chromosome carries no allele of this gene for eye color. Males, therefore, have only one copy of the gene, which they inherit from their mother along with their only X chromosome; their Y chromosome must come from their father. Thus, males are hemizygous for this eye color gene, because their diploid cells have half the number of alleles carried by the female on her two X chromosomes.
If the single white gene on the X chromosome of a male is the wild-type w+ allele, he will have red eyes and a genotype that can be written X^{w+}Y. (Here we designate the chromosome [X or Y] together with the allele it carries, to emphasize that certain genes are X-linked.) In contrast to an X^{w+}Y male, a hemizygous X^{w}Y male would have a phenotype of white eyes. Females with two X chromosomes can be one of three genotypes: X^{w}X^{w} (white-eyed), X^{w}X^{w+} (red-eyed because w+ is dominant to w), or X^{w+}X^{w+} (red-eyed).
As shown in Fig. 4.20, Morgan's assumption that the gene for eye color is X-linked explains the results of his breeding experiments. Crisscross inheritance, for example, occurs because the only X chromosome in sons of a white-eyed mother (X^{w}X^{w}) must carry the w allele, so the sons will be white-eyed. In contrast, because daughters of a red-eyed (X^{w+}Y) father must receive a w+-bearing X chromosome from their father, they will have red eyes.
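The reasoning behind the reciprocal crosses can be checked by enumerating every egg and sperm combination. The following is a minimal sketch, not Morgan's own notation: the allele strings "w+" and "w", the placeholder "Y", and the function names are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Alleles are plain strings: "w+" (wild type, red) and "w" (white).
# "Y" stands for the Y chromosome, which carries no allele of this gene.

def phenotype(genotype):
    """Return (sex, eye color) for a pair of sex-chromosome entries."""
    sex = "male" if "Y" in genotype else "female"
    x_alleles = [a for a in genotype if a != "Y"]
    # w+ is dominant, so a single w+ allele is enough for red eyes.
    color = "red" if "w+" in x_alleles else "white"
    return sex, color

def cross(mother, father):
    """Tally progeny classes over every egg x sperm combination."""
    return Counter(phenotype((egg, sperm)) for egg, sperm in product(mother, father))

cross_a = cross(mother=("w+", "w+"), father=("w", "Y"))  # red female x white male
cross_d = cross(mother=("w", "w"), father=("w+", "Y"))   # white female x red male

print("Cross A:", dict(cross_a))
print("Cross D:", dict(cross_d))
```

Under these assumptions, cross A yields only red-eyed progeny of both sexes, while cross D yields red-eyed daughters and white-eyed sons, which is the crisscross pattern described above.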
Validation of the chromosome theory from the analysis of nondisjunction
Although Morgan's work strongly supported the hypothesis that the gene for eye color lies on the X chromosome, he himself continued to question the validity of the chromosome theory until Calvin Bridges, one of his top students, found another key piece of evidence. Bridges repeated the cross Morgan had performed between white-eyed females and red-eyed males, but this time he did the experiment on a larger scale. As expected, the progeny of this cross consisted mostly of red-eyed females and white-eyed males. However, about 1 in every 2000 males had red eyes, and about the same small fraction of females had white eyes.
Bridges hypothesized that these exceptions arose through rare events in which the X chromosomes fail to separate during meiosis in females. He called such failures in chromosome segregation nondisjunction. As Fig. 4.21a shows, nondisjunction would result in some eggs with two X chromosomes and others with none. Fertilization of these chromosomally abnormal eggs could produce four types of zygotes: XXY (with two X chromosomes from the egg and a Y from the sperm), XXX (with two Xs from the egg and one X from the sperm), XO (with the lone sex chromosome from the sperm and no sex chromosome from the egg), and OY (with the only sex chromosome again coming from the sperm). When Bridges examined the sex chromosomes of the rare white-eyed females produced in his large-scale cross, he found that they were indeed XXY individuals who must have received two X chromosomes and with them two w alleles from their white-eyed X^{w}X^{w} mothers. The exceptional red-eyed males emerging from the cross were XO; their eye color showed that they must have obtained their sole sex chromosome from their X^{w+}Y fathers. In this study, transmission of the white gene alleles followed the predicted behavior of X chromosomes during rare meiotic mistakes, indicating that the X chromosome carries the gene for eye color. These results also suggested that zygotes with the two other abnormal sex chromosome karyotypes expected from nondisjunction in females (XXX and OY) die during embryonic development and thus produce no progeny.
Because XXY white-eyed females have three sex chromosomes rather than the normal two, Bridges reasoned they would produce four kinds of eggs: XY and X, or XX and Y (Fig. 4.21b). You can visualize the formation of these four kinds of eggs by imagining that when the three chromosomes pair and disjoin during meiosis, two chromosomes must go to one pole and one chromosome to the other. With this kind of segregation, only two results are possible: Either one X and the Y go to one pole and the second X to the other (yielding XY and X gametes), or the two Xs go to one pole and the Y to the other (yielding XX and Y gametes). The first of these two scenarios occurs more often because it comes about when the two similar X chromosomes pair with each other, ensuring that they will go to opposite poles during the first meiotic division. The second, less likely possibility happens only if the two X chromosomes fail to pair with each other.
Bridges next predicted that fertilization of these four kinds of eggs from an XXY female by normal sperm would generate an array of sex chromosome karyotypes associated with specific eye color phenotypes in the progeny. Bridges verified all his predictions when he analyzed the eye color and sex chromosomes of a large number of offspring. For instance, he showed cytologically that all of the white-eyed females emerging from the cross in Fig. 4.21b had two X chromosomes and one Y chromosome, while one-half of the white-eyed males had a single X chromosome and two
Figure 4.21 Nondisjunction: Rare mistakes in meiosis help confirm the chromosome theory. (a) Rare events of nondisjunction in an XX female produce XX and O eggs. The results of normal disjunction in the female are not shown. XO males are sterile because the missing Y chromosome is needed for male fertility in Drosophila. (b) In an XXY female, the three sex chromosomes can pair and segregate in two ways, producing progeny with unusual sex chromosome complements.
[Diagram not reproduced. Panel titles: (a) Nondisjunction in an XX female; (b) Segregation in an XXY female, showing the more frequent and the less frequent segregation patterns.]
Y chromosomes. Bridges' painstaking observations provided compelling evidence that specific genes do in fact reside on specific chromosomes.
The Chromosome Theory Integrates Many Aspects of Gene Behavior
Mendel had assumed that genes are located in cells. The chromosome theory assigned the genes to a specific structure within cells and explained alternative alleles as physically matching parts of homologous chromosomes. In so doing, the theory provided an explanation of Mendel's laws. The mechanism of meiosis ensures that the matching parts of homologous chromosomes will segregate to different gametes (except in rare instances of nondisjunction), accounting for the segregation of alleles predicted by Mendel's first law. Because each homologous chromosome pair aligns independently of all others at meiosis I, genes carried on different chromosomes will assort independently, as predicted by Mendel's second law.
The chromosome theory is also able to explain the creation of new alleles through mutation, a spontaneous change in a particular gene (that is, in a particular part of a chromosome). If a mutation occurs in the germ line, it can be transmitted to subsequent generations.
Finally, through mitotic cell division in the embryo and after birth, each cell in a multicellular organism receives the same chromosomes, and thus the same maternal and paternal alleles of each gene, as the zygote received from the egg and sperm at fertilization. In this way, an individual's genome (the chromosomes and genes he or she carries) remains constant throughout life.
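Bridges' predictions for the nondisjunction experiments can be tabulated by combining each kind of egg with each kind of sperm. The sketch below is illustrative only: the chromosome labels ("Xw", "Xw+", "Y"), the function name, and the simplifying rule that any zygote without an X is inviable (as the text reports for OY) are assumptions, and the sterility of XO males is not modeled.

```python
from itertools import product

def zygote(egg, sperm):
    """Return (karyotype, outcome) for a Drosophila zygote.

    Gametes are tuples of chromosomes: "Xw+" or "Xw" for an X carrying that
    allele, "Y" for a Y; the empty tuple is a gamete with no sex chromosome.
    """
    chromosomes = egg + sperm
    xs = [c for c in chromosomes if c.startswith("X")]
    n_y = chromosomes.count("Y")
    if len(chromosomes) == 1:
        karyotype = "XO" if xs else "OY"
    else:
        karyotype = "X" * len(xs) + "Y" * n_y
    # Per the text, XXX and OY zygotes die; here every zygote lacking an X
    # is assumed to die. One X gives a male, two Xs give a female.
    if len(xs) == 0 or len(xs) >= 3:
        return karyotype, "dies"
    sex = "male" if len(xs) == 1 else "female"
    color = "red" if "Xw+" in xs else "white"
    return karyotype, f"{color}-eyed {sex}"

sperm_types = [("Xw+",), ("Y",)]                                 # red-eyed father
nondisjunction_eggs = [("Xw", "Xw"), ()]                         # Fig. 4.21a
xxy_female_eggs = [("Xw", "Y"), ("Xw",), ("Xw", "Xw"), ("Y",)]   # Fig. 4.21b

for label, eggs in [("Nondisjunction in an XX female", nondisjunction_eggs),
                    ("Segregation in an XXY female", xxy_female_eggs)]:
    print(label)
    for egg, sperm in product(eggs, sperm_types):
        print("  egg", egg or "O", "+ sperm", sperm, "->", zygote(egg, sperm))
```

In this enumeration the only white-eyed females are XXY, and the white-eyed males from the XXY mother are either XY or XYY, in line with the cytological observations described above.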
essential concepts
• Segregation of homologous chromosomes into daughter cells at meiosis I explains Mendel's first law.
• Independent alignment of homologs with respect to each other and crossing-over of nonsister chromatids during meiosis I explain Mendel's second law.
• In organisms with XX/XY sex determination, males are hemizygous for X-linked genes, while females have two copies.
4.7 Sex-Linked and Sexually Dimorphic Traits in Humans
learning objectives
1. Determine from pedigree analysis whether human traits are X-linked or autosomal.
2. Explain how human cells compensate for the X-linked gene dosage difference in XX and XY nuclei.
A person unable to tell red from green would find it nearly impossible to distinguish the rose, scarlet, and magenta in the flowers of a garden bouquet from the delicately variegated greens in their foliage, or to complete a complex electrical circuit by fastening red-clad metallic wires to red ones and green to green. Such a person has most likely inherited some form of red-green colorblindness, a recessive condition that runs in families and affects mostly males.
Figure 4.22 Red-green colorblindness is an X-linked recessive trait in humans. How the world looks to a person with either normal color vision (top) or a kind of red-green colorblindness known as deuteranopia (bottom).
Among Caucasians in North America and Europe, 8% of men but only 0.44% of women have this vision defect. Figure 4.22 suggests to readers with normal color vision what people with red-green colorblindness actually see. In 1911, E. B. Wilson, a contributor to the chromosome theory of inheritance, combined familiarity with studies of colorblindness and recent knowledge of sex determination by the X and Y chromosomes to make the first assignment of a human gene to a particular chromosome. The gene for red-green colorblindness, he said, lies on the X because the condition usually passes from a maternal grandfather through an unaffected carrier mother to roughly 50% of the grandsons.
Several years after Wilson made this gene assignment, pedigree analysis established that various forms of hemophilia, or "bleeder's disease" (in which the blood fails to clot properly), also result from mutations on the X chromosome that give rise to a relatively rare, recessive trait. In this context, rare means "infrequent in the population." The family histories under review, including one following the descendants of Queen Victoria of England (Fig. 4.23a), showed that relatively rare X-linked traits appear more often in males than in females and often skip generations. The clues that suggest X-linked recessive inheritance in a pedigree are summarized in Table 4.5.
Unlike colorblindness and hemophilia, some (although very few) of the known rare mutations on the X chromosome are dominant to the wild-type allele. With
Figure 4.23 X-linked traits may be recessive or dominant. (a) Pedigree showing inheritance of the recessive X-linked trait hemophilia in Queen Victoria's family. (b) Pedigree showing the inheritance of the dominant X-linked trait hypophosphatemia, commonly referred to as vitamin D-resistant rickets.
[Pedigree diagrams not reproduced. Panel titles: (a) X-linked recessive: Hemophilia; (b) X-linked dominant: Hypophosphatemia. Key: carrier; hemophiliac. Names appearing in the diagrams: Queen Victoria, Prince Albert, Victoria, Edward, Alice, Alfred, Louise, Arthur, Leopold, Beatrice, Alix, Nicholas, Alexis, Rupert, Helene.]
Table 4.5 Pedigree Patterns Suggesting Sex-Linked Inheritance

X-Linked Recessive Trait
1. The trait appears in more males than females because a female must receive two copies of the rare defective allele to display the phenotype, whereas a hemizygous male with only one copy will show it.
2. The mutation will never pass from father to son because sons receive only a Y chromosome from their father.
3. An affected male passes the X-linked mutation to all his daughters, who are thus carriers. One-half of the sons of these carrier females will inherit the defective allele and thus the trait.
4. The trait often skips a generation as the mutation passes from grandfather through a carrier daughter to grandson.
5. The trait can appear in successive generations when a sister of an affected male is a carrier. If she is, one-half of her sons will be affected.
6. With the rare affected (homozygous) female, all her sons will be affected and all her daughters will be carriers.

X-Linked Dominant Trait
1. More females than males show the aberrant trait.
2. The trait is seen in every generation because it is dominant.
3. All the daughters but none of the sons of an affected male will be affected. This criterion is the most useful for distinguishing an X-linked dominant trait from an autosomal dominant trait.
4. One-half the sons and one-half the daughters of an affected female will be affected.
5. For incompletely dominant X-linked traits, carrier females may show the trait in less extreme form than males with the defective allele.

Y-Linked Trait
1. The trait is seen only in males.
2. All male descendants of an affected man will exhibit the trait.
3. Not only do females not exhibit the trait, they also cannot transmit it.

Figure 4.24 Barr bodies are densely staining particles in XX cell nuclei. The arrow points to a Barr body in the nucleus of an XX cell treated with a DNA stain. The Barr body appears bright white in this negative image. Unlike the other chromosomes, the Barr body is attached to the nuclear envelope. XY cells have no Barr bodies.
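Two of the X-linked recessive patterns in Table 4.5 are strict enough to test mechanically: an affected daughter must have an affected father, and every son of an affected mother must be affected. The sketch below applies only those two checks. The pedigree encoding, the function name, and the family members are hypothetical, and the model ignores new mutations and nondisjunction.

```python
def x_linked_recessive_conflicts(pedigree):
    """List observations that a simple X-linked recessive model cannot explain.

    `pedigree` maps a name to a dict with keys "sex" ("M" or "F"),
    "affected" (bool), and optional "father" and "mother" names.
    """
    conflicts = []
    for name, person in pedigree.items():
        father = pedigree.get(person.get("father"))
        mother = pedigree.get(person.get("mother"))
        # An affected daughter needs a mutant X from her hemizygous father.
        if person["sex"] == "F" and person["affected"] and father and not father["affected"]:
            conflicts.append(f"{name}: affected daughter of an unaffected father")
        # An affected (homozygous) mother passes a mutant X to every son.
        if person["sex"] == "M" and not person["affected"] and mother and mother["affected"]:
            conflicts.append(f"{name}: unaffected son of an affected mother")
    return conflicts

# Hypothetical family: the trait skips a generation through a carrier daughter.
family = {
    "grandfather": {"sex": "M", "affected": True},
    "grandmother": {"sex": "F", "affected": False},
    "daughter": {"sex": "F", "affected": False,
                 "father": "grandfather", "mother": "grandmother"},
    "son_in_law": {"sex": "M", "affected": False},
    "grandson": {"sex": "M", "affected": True,
                 "father": "son_in_law", "mother": "daughter"},
}
print(x_linked_recessive_conflicts(family))  # no conflicts expected

# Hypothetical counterexample: an affected mother with an unaffected son.
counterexample = {
    "mother": {"sex": "F", "affected": True},
    "son": {"sex": "M", "affected": False, "mother": "mother"},
}
print(x_linked_recessive_conflicts(counterexample))
```

An empty list does not prove X-linked recessive inheritance; it only means that neither of the two strict rules is violated, so the remaining clues in the table still have to be weighed by eye.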
such dominant X-linked mutations, more females than males show the aberrant phenotype. This is because all the daughters of an affected male but none of the sons will have the condition, while one-half the sons and one-half the daughters of an affected female will receive the dominant allele and therefore show the phenotype (see Table 4.5).
Vitamin D-resistant rickets, or hypophosphatemia, is an example of an X-linked dominant trait. Figure 4.23b presents the pedigree of a family affected by this disease.
In XX Human Females, One X Chromosome Is Inactivated
The XX and XY system of sex determination presents human cells with a curious problem that requires a solution called dosage compensation. As mentioned earlier, the X chromosome contains about 1100 genes, and the proteins that they specify need to be present in the same amounts in male and female cells. To compensate for female cells having two copies of each X-linked gene and male cells having only one, XX cells inactivate one of their two X chromosomes. Almost all of the genes on the inactivated X chromosome are turned off, so no gene product can be made.
Two weeks after fertilization, when an XX human embryo is composed of only 500-1000 cells, each cell chooses one X chromosome at random to condense into a so-called Barr body and thereby inactivate it. Barr bodies, named after the cytologist Murray Barr who discovered them, appear as small, dark chromosomes in interphase cells treated with a DNA stain that allows chromosomes to be visible under a light microscope (Fig. 4.24).
Each embryonic cell "decides" independently which X chromosome will be inactivated: either the X inherited from the mother or the paternal X. Once the determination is made, it is clonally perpetuated so that all of the millions of cells descended by mitosis from a particular embryonic cell condense the same X chromosome to a Barr body (Fig. 4.25a). Human females are thus a patchwork of cells, some containing a maternally derived active X chromosome, and the others an active paternal X (Fig. 4.25b).
The phenomenon of X chromosome inactivation may have interesting effects on the traits controlled by X-linked genes. When females are heterozygous at an X-linked gene, parts of their bodies are in effect hemizygous for one allele, and parts are hemizygous for the other allele in terms of
Figure 4.25 X chromosome dosage compensation makes human females a patchwork for X-linked gene expression. (a) Early in embryogenesis, each XX cell inactivates one randomly chosen X chromosome by condensing it into a Barr body (black oval). The same X chromosome remains a Barr body in all descendants of each cell. X^M = maternal X chromosome; X^P = paternal X chromosome. (b) Human females have patches of cells in which either the maternal or paternal X chromosome is inactivated. The twins shown here are heterozygotes (Dd) for the recessive condition anhidrotic ectodermal dysplasia, which prevents sweat gland development. Patches of skin in blue lack sweat glands because the chromosome with the wild-type allele (D) is inactivated.
[Diagram not reproduced. Panel titles: (a) Perpetuation of X chromosome inactivation after cell divisions; (b) X chromosome inactivation results in patchwork females. Diagram labels: Early cell divisions; X-inactivation at 500-1000 cell stage; Barr body; Clonal patches.]
gene function. Moreover, which body parts are functionally hemizygous for one allele or the other is random; even identical twins, who have identical alleles of all of their genes, will have a different pattern of X chromosome inactivation. In Fig. 4.25b, females heterozygous for the X-linked recessive trait anhidrotic ectodermal dysplasia have patches of skin that lack sweat glands interspersed with patches of normal skin; the phenotype of a patch depends upon which X chromosome is inactivated. Each patch is a clone of skin cells derived from a single embryonic cell that made the decision to inactivate one of the X chromosomes. In a second example, women heterozygous for an X-linked recessive hemophilia allele are called "carriers" of the disease allele, even though they may have some symptoms of hemophilia. The severity of the condition depends on the particular random pattern of cells that inactivated the disease allele and cells that inactivated the normal allele. In Chapter 3, we discussed how chance events work through genes to affect phenotype; X inactivation is a perfect example of such an event.
Recall that the two tips of the X chromosome, the pseudoautosomal regions (PARs), contain genes also present at the tips of the Y chromosome (Fig. 4.8, p. 91). In order to equalize the dosage of these genes in XX and XY cells, the PAR genes on the Barr body X chromosome escape inactivation. This feature of dosage compensation may explain at least in part why XXY males (Klinefelter syndrome) and XO females (Turner syndrome) have abnormal morphological features. Although one of the two X chromosomes in XXY males becomes a Barr body, Klinefelter males have three doses (rather than the normal two) of the genes in the PAR regions. The single X chromosome in XO cells does not become a Barr body, yet these cells have only one dose of the PAR genes (rather than two in XX females).
X chromosome inactivation is common to mammals, and we will present the molecular details of this process in Chapter 11. It is nonetheless important to realize that other organisms compensate for sex chromosome differences in alternative ways. Fruit flies, for example, hyperactivate the single X chromosome in XY (male) cells, so that most X chromosome genes produce twice as much protein product as each X chromosome in a female. The nematode C. elegans, in contrast, ratchets down the level of gene activity on each of the X chromosomes in XX hermaphrodites relative to the single X in XO males.
Maleness and Male Fertility Are the Only Known Y-Linked Traits in Humans
Theoretically, phenotypes caused by mutations on the Y chromosome should also be identifiable by pedigree analysis. Such traits would pass from an affected father to all of his sons, and from them to all future male descendants. Females would neither exhibit nor transmit a Y-linked phenotype (see Table 4.5). However, besides the determination of maleness itself, as well as contributions to sperm formation and thus male fertility, no clear-cut Y-linked visible traits have turned up in humans. The paucity of known Y-linked traits reflects the fact that, as mentioned earlier, the small Y chromosome contains very few genes. Indeed, one would expect the Y chromosome to have only a limited effect on phenotype because normal XX females do perfectly well without it.
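The bookkeeping in the two preceding subsections (Barr bodies, PAR gene doses, and the role of the Y chromosome) can be summarized for each human sex-chromosome karyotype mentioned in the text. This is a sketch under stated assumptions: the function name and dictionary keys are hypothetical, and the rule that all X chromosomes but one form Barr bodies is a generalization of the XX, XY, XXY, and XO cases described above.

```python
def summarize(karyotype):
    """Summarize a human sex-chromosome karyotype such as "XX", "XY", "XXY", or "XO"."""
    n_x = karyotype.count("X")
    n_y = karyotype.count("Y")  # "O" marks a missing chromosome and counts as neither
    return {
        # A Y chromosome (carrying SRY) determines maleness in humans.
        "sex": "male" if n_y else "female",
        # Assumed rule: every X beyond the first is condensed into a Barr body.
        "barr_bodies": max(n_x - 1, 0),
        # PAR genes lie on every X and Y and escape inactivation.
        "par_doses": n_x + n_y,
    }

for karyotype in ("XX", "XY", "XXY", "XO"):
    print(karyotype, summarize(karyotype))
```

The output reproduces the comparison in the text: XX and XY cells both carry two doses of PAR genes, whereas XXY cells carry three and XO cells only one.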
Autosomal Genes Contribute to Sexual Dimorphism
Not all genes that produce sexual dimorphism (differences in the two sexes) reside on the X or Y chromosomes. Some autosomal genes govern traits that appear in one sex but not the other, or traits that are expressed differently in the two sexes.
Sex-limited traits affect a structure or process that is found in one sex but not the other. Mutations in genes for sex-limited traits can influence only the phenotype of the sex that expresses those structures or processes. A vivid example of a sex-limited trait occurs in Drosophila males homozygous for an autosomal recessive mutation known as stuck, which affects the ability of mutant males to retract their penis and release the claspers by which they hold on to female genitalia during copulation. The mutant males have difficulty separating from females after mating. In extreme cases, both individuals die, forever caught in their embrace. Because females lack penises and claspers, homozygous stuck mutant females can mate normally.
Sex-influenced traits show up in both sexes, but expression of such traits may differ between the two sexes because of hormonal differences. Pattern baldness, a condition in which hair is lost prematurely from the top of the head but not from the sides (Fig. 4.26), is a sex-influenced trait in humans. Although pattern baldness is a complex trait that can be affected by many genes, an autosomal gene appears to play an important role in certain families. Men in these families who are heterozygous for the balding allele lose their hair while still in their 20s, whereas heterozygous women do not show any significant hair loss. In contrast, homozygotes in both sexes become bald (though the onset of baldness in homozygous women is usually much later in life than in homozygous men). This sex-influenced trait is thus dominant in men, recessive in women.
Figure 4.26 Male pattern baldness, a sex-influenced trait. (a) John Adams (1735-1826), second president of the United States, at about age 60. (b) John Quincy Adams (1767-1848), son of John Adams and the sixth president of the United States, at about the same age. The father-to-son transmission suggests that male pattern baldness in the Adams family is likely determined by an allele of an autosomal gene.
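The statement that the balding allele is dominant in men and recessive in women amounts to a small lookup on genotype and sex. The sketch below encodes only the single-gene model described for certain families. The allele symbols "B" (balding) and "b" (non-balding) and the function name are hypothetical, and age of onset is not modeled.

```python
def pattern_baldness(genotype, sex):
    """Predict the phenotype for a two-letter genotype such as "BB", "Bb", or "bb"."""
    balding_alleles = genotype.count("B")
    if balding_alleles == 2:
        return "bald"  # homozygotes of both sexes become bald
    if balding_alleles == 1:
        # Heterozygotes: the allele acts as dominant in men, recessive in women.
        return "bald" if sex == "male" else "not bald"
    return "not bald"

for genotype in ("BB", "Bb", "bb"):
    for sex in ("male", "female"):
        print(genotype, sex, "->", pattern_baldness(genotype, sex))
```

Only the heterozygous row differs between the sexes, which is what distinguishes a sex-influenced autosomal trait from an X-linked one.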
Mutations in Sex Determination Pathway Genes Can Result in Intersexuality Disorders
We previously saw that the SRY gene on the Y chromosome is essential to maleness because it initiates testis development early in embryogenesis. But the functions of many genes are required for testis development, or for subsequent events that rely on hormones made in the testes for the development of sexual organs. Some of these genes are autosomal and some are X-linked; in either case, an XY individual with mutant alleles for any of these genes may have unusual intersexual phenotypes.
In one important example, XY people with nonfunctional mutant alleles of the X-linked AR gene encoding the androgen receptor have a disorder known as complete androgen insensitivity syndrome (CAIS). These XY individuals have testes that make the hormone testosterone, but in the absence of the androgen receptor to which it binds, the testosterone has no effect. Without the androgen receptor, these people cannot develop male genitalia (penis and scrotum) nor male internal duct systems (the vas deferens, seminal vesicles, and ejaculatory ducts); instead, their external genitalia assume the default female state (labia and clitoris). However, the testes make another hormone that prevents formation of female internal duct systems (including the fallopian tubes, uterus, and vagina). The result is that a person with CAIS is externally female, but sterile because they lack internal duct systems of either sex.
essential concepts
• Sex-linked (X-linked) traits show sex-specific inheritance patterns because sons always inherit their father's Y chromosome, while daughters always inherit their father's X chromosome.
• Random inactivation of either the maternal or paternal X chromosome in XX cells ensures that male and female mammalian cells express equivalent amounts of the proteins encoded by most X-linked genes.
• Mutations of genes, whether autosomal or X-linked, can have different effects in males and females.
T. H. Morgan and his students, collectively known as the Drosophila group, acknowledged that Mendelian genetics could exist independently of chromosomes. "Why then, we are often asked, do you drag in the chromosomes? Our answer is that because the chromosomes furnish exactly the kind of mechanism that Mendelian laws call for, and since there is an ever-increasing body of information that points clearly to the chromosomes as the bearers of the Mendelian factors, it would be folly to close one's eyes to so patent a relation. Moreover, as biologists, we are interested in heredity not primarily as a mathematical formulation, but rather as a problem concerning the cell, the egg, and the sperm."
The Drosophila group went on to find several X-linked mutations in addition to white eyes. One made the body yellow instead of brown, another shortened the wings, yet another made bent instead of straight body bristles. These findings raised several compelling questions. First, if the genes for all of these traits are physically linked together on the X chromosome, does this linkage affect their ability to assort independently, and if so, how? Second, does each gene have an exact chromosomal address, and if so, does this specific location in any way affect its transmission? In Chapter 5 we describe how the Drosophila group and others analyzed the transmission patterns of genes on the same chromosome in terms of known chromosome movements during meiosis, and then used the information obtained to localize genes at specific chromosomal positions.
I.
In humans, chromosome 16 sometimes has a heavily stained area in the long arm near the centromere. This feature can be seen through the microscope but has no effect on the phenotype of the person carrying it. When such a "blob" exists on a particular copy of chromosome 16, it is a constant feature of that chromosome and is inherited. A couple conceived a child, but the fetus had multiple abnormalities and was miscarried. When the chromosomes of the fetus were studied, it was discovered that it was trisomic for chromosome 16, and that two of the three chromosome 16s had large blobs. Both chromosome 16 homologs in the mother lacked blobs, but the father was heterozygous for blobs. Which parent experienced nondisjunction, and in which meiotic division did it occur?
Answer
This problem requires an understanding of nondisjunction during meiosis. When individual chromosomes contain some distinguishing feature that allows one homolog to be distinguished from another, it is possible to follow the path of the two homologs through meiosis. In this case, because the fetus had two chromosome 16s with the blob, we can conclude that the extra chromosome came from the father (the only parent with a blobbed chromosome). In which meiotic division did the nondisjunction occur? When nondisjunction occurs during meiosis I, homologs fail to segregate to opposite poles. If this occurred in the father, the chromosome with the blob and the normal chromosome 16 would segregate into the same cell (a secondary spermatocyte). After meiosis II, the gametes resulting from this cell would carry both types of chromosomes. If such sperm fertilized a normal egg, the zygote would have two copies of the normal chromosome 16 and one copy of the chromosome with a blob. On the other hand, if nondisjunction occurred during meiosis II in the father in a secondary spermatocyte containing the blobbed chromosome 16, sperm with two copies of the blob-marked chromosome would be produced. After fertilization with a normal egg, the result would be a zygote of the type seen in this spontaneous abortion. Therefore, the nondisjunction occurred in meiosis II in the father.
II.
(a) What sex ratio would you expect among the offspring of a cross between a normal male mouse and a female mouse heterozygous for a recessive X-linked lethal gene? (b) What would be the expected sex ratio among the offspring of a cross between a normal hen and a rooster heterozygous for a recessive Z-linked lethal allele?
Answer
This problem deals with sex-linked inheritance and sex determination.
a. Mice have a sex determination system of XX = female and XY = male. A normal male mouse (X^{R}Y) × a heterozygous female mouse (X^{R}X^{r}) would result in X^{R}X^{R}, X^{R}X^{r}, X^{R}Y, and X^{r}Y mice. The X^{r}Y mice would die, so there would be a 2:1 ratio of females to males.
b. The sex determination system in birds is ZZ = male and ZW = female. A normal hen (Z^{R}W) × a heterozygous rooster (Z^{R}Z^{r}) would result in Z^{R}Z^{R}, Z^{R}Z^{r}, Z^{R}W, and Z^{r}W chickens. Because the Z^{r}W offspring do not live, the ratio of females to males would be 1:2.
III.
A woman with normal color vision whose father was color-blind mates with a man with normal color vision.
a. What do you expect to see among their offspring?
b. What would you expect if it was the normal man's father who was color-blind?
Answer
This problem involves sex-linked inheritance.
a. The woman's father has a genotype of X^{cb}Y. Because the woman had to inherit an X from her father, she must have an X^{cb} chromosome, but because she has normal color vision, her other X chromosome must be X^{CB}. The man she mates with has normal color vision and therefore has an X^{CB}Y genotype. Their children could with equal probability be X^{CB}X^{CB} (normal female), X^{CB}X^{cb} (carrier female), X^{CB}Y (normal male), or X^{cb}Y (color-blind male).
b. If the man with normal color vision had a color-blind father, the X^{cb} chromosome would not have been passed on to him, because a male does not inherit an X chromosome from his father. The man has the genotype X^{CB}Y and cannot pass on the color-blind allele.
Problems
Vocabulary
1. Choose the best matching phrase in the right column for each of the terms in the left column.
a. meiosis              1. X and Y
b. gametes              2. chromosomes that do not differ between the sexes
c. karyotype            3. one of the two identical halves of a replicated chromosome
d. mitosis              4. microtubule organizing centers at the spindle poles
e. interphase           5. cells in the testes that undergo meiosis
f. syncytium            6. division of the cytoplasm
g. synapsis             7. haploid germ cells that unite at fertilization
h. sex chromosomes      8. an animal cell containing more than one nucleus
i. cytokinesis          9. pairing of homologous chromosomes
j. anaphase             10. one diploid cell gives rise to two diploid cells
k. chromatid            11. the array of chromosomes in a given cell
l. autosomes            12. the part of the cell cycle during which the chromosomes are not visible
m. centromere           13. one diploid cell gives rise to four haploid cells
n. centrosomes          14. cell produced by meiosis that does not become a gamete
o. polar body           15. the time during mitosis when sister chromatids separate
p. spermatocytes        16. connection between sister chromatids
Section 4.1
2. Humans have 46 chromosomes in each somatic cell.
a. How many chromosomes does a child receive from its father?
b. How many autosomes and how many sex chromosomes are present in each somatic cell?
c. How many chromosomes are present in a human ovum?
d. How many sex chromosomes are present in a human ovum?
Section 4.2
3. The figure that follows shows the metaphase chromosomes of a male of a particular species. These chromosomes are prepared as they would be for a karyotype, but they have not yet been ordered in pairs of decreasing size.
a. How many centromeres are shown?
b. How many chromosomes are shown?
c. How many chromatids are shown?
d. How many pairs of homologous chromosomes are shown?
e. How many chromosomes on the figure are metacentric? Acrocentric?
f. What is the likely mode of sex determination in this species? What would you predict to be different about the karyotype of a female in this species?
8. a. What
are the four major stages of the cell cycle?
b. Which stages are included in interphase? c. What events distinguish G1, S, and G2?
9. Answer the questions that follow for
each stage of the cell cycle (G,, S, G2, prophase, metaphase, anaphase, telophase). If necessary, use an arrow to indicate a change that occurs during a particular cell cycle stage (for example, I --> 2 or yes -+ no). a. How many chromatids comprise each chromosome during this stage?
X n
b. Is the nucleolus present? c. Is the mitotic spindle organized? d. Is the nuclear membrane present?
*
4. XX males who are sex-reversed because they have a mutant X chromosome like that shown in Fig. 4.7 on p. 91 often learn of their condition when they want to have children and discover that they are sterile. Can you explain why they are sterile?
5. Researchers discovered recently that the sole function of the SRY protein is to activate an autosomal gene called Sox9 in the presumptive gonad (before it has "decided" to become a testis or an ovary).
a. What would be the sex of an XY individual homozygous for nonfunctional mutant alleles of Sox9? Explain.
b. Given your answer to part (a), why is SRY, rather than Sox9, considered the male determining factor? (Hint: What do you think would happen if you did an experiment like the one in the Fast Forward Box on p. 93 except that you used a Sox9 transgene instead of SRY?)
Section 4.3
6. One oak tree cell with 14 chromosomes undergoes mitosis. How many daughter cells are formed, and what is the chromosome number in each cell?
7. Indicate which of the cells numbered i-v matches each of the following stages of mitosis:
a. anaphase
b. prophase
c. metaphase
d. G2
e. telophase/cytokinesis
10. Does any reason exist that would prevent mitosis from occurring in a cell whose genome is haploid?
Section 4.4
11. One oak tree cell with 14 chromosomes undergoes meiosis. How many cells will result from this process, and what is the chromosome number in each cell?
12. Which type(s) of cell division (mitosis, meiosis I, meiosis II) reduce(s) the chromosome number by half? Which type(s) of cell division can be classified as reductional? Which type(s) of cell division can be classified as equational?
13. Complete the following statements using as many of the following terms as are appropriate: mitosis, meiosis I (first meiotic division), meiosis II (second meiotic division), and none (not mitosis nor meiosis I nor meiosis II).
a. The spindle apparatus is present in cells undergoing _.
b. Chromosome replication occurs just prior to _.
c. The cells resulting from _ in a haploid cell have a ploidy of n.
d. The cells resulting from _ in a diploid cell have a ploidy of n.
e. Homologous chromosome pairing regularly occurs during _.
f. Nonhomologous chromosome pairing regularly occurs during _.
g. Physical recombination leading to the production of recombinant progeny classes occurs during _.
h. Centromere division occurs during _.
i. Nonsister chromatids are found in the same cell during _.
14. The five cells shown in figures a-e on the next page are all from the same individual. For each cell, indicate whether it is in mitosis, meiosis I, or meiosis II. What stage of cell division is represented in each case? What is n in this organism?
[Figure for Problem 14: cells a-e]
15. One of the first microscopic observations of chromosomes in cell division was published in 1905 by Nettie Stevens. Because it was hard to reproduce photographs at the time, she recorded these observations as camera lucida sketches. One such drawing, of a completely normal cell division in the mealworm Tenebrio molitor, is shown here. The techniques of the time were relatively unsophisticated by today's standards, and they did not allow her to resolve chromosomal structures that must have been present.
a. Describe in as much detail as possible the kind of cell division and the stage of division depicted in the drawing.
b. What chromosomal structure(s) cannot be resolved in the drawing?
c. How many chromosomes are present in normal Tenebrio molitor gametes?
16. A person is simultaneously heterozygous for two autosomal genetic traits. One is a recessive condition for albinism (alleles A and a); this albinism gene is found near the centromere on the long arm of an acrocentric autosome. The other trait is the dominantly inherited Huntington disease (alleles HD and HD+). The Huntington gene is located near the telomere of one of the arms of a metacentric autosome. Draw all copies of the two relevant chromosomes in this person as they would appear during metaphase of (a) mitosis, (b) meiosis I, and (c) meiosis II. In each figure, label the location on every chromatid of the alleles for these two genes, assuming that no recombination takes place.
17. Assuming (i) that the two chromosomes in a homologous pair carry different alleles of some genes, and (ii) that no crossing-over takes place, how many genetically different offspring could any one human couple potentially produce? Which of these two assumptions (i or ii) is more realistic?
18. In the moss Polytrichum commune, the haploid chromosome number is 7. A haploid male gamete fuses with a haploid female gamete to form a diploid cell that divides and develops into the multicellular sporophyte. Cells of the sporophyte then undergo meiosis to produce haploid cells called spores. What is the probability that an individual spore will contain a set of chromosomes all of which came from the male gamete? Assume no recombination.
19. Does any reason exist that would prevent meiosis from occurring in an organism whose genome is always haploid?
20. Sister chromatids are held together through metaphase of mitosis by complexes of cohesin proteins that form rubber band-like rings bundling the two sister chromatids. Cohesin rings are found both at centromeres and at many locations scattered along the length of the chromosomes. The rings are destroyed by protease enzymes at the beginning of anaphase, allowing the sister chromatids to separate.
a. Cohesin complexes between sister chromatids are also responsible for keeping homologous chromosomes together until anaphase of meiosis I. With this point in mind, which of the two diagrams that follow (i or ii) properly represents the arrangement of chromatids during prophase through metaphase of meiosis I? Explain.
b. What does your answer to part (a) allow you to infer about the nature of cohesin complexes at the centromere versus those along the chromosome arms? Suggest a molecular hypothesis to explain your inference.
21. The pseudoautosomal regions (PARs) of the X and Y chromosomes enable the sex chromosomes to pair and synapse during meiosis in males. Given the location of the SRY gene near PAR1, can you propose a mechanism for how the mutant X and Y chromosomes in Fig. 4.7 (in which part of the X is on the Y and part of the Y is on the X) may have arisen during meiosis?
Section 4.5
22. Somatic cells of chimpanzees contain 48 chromosomes. How many chromatids and chromosomes are present at: (a) anaphase of mitosis, (b) anaphase I of meiosis, (c) anaphase II of meiosis, (d) G1 prior to mitosis, (e) G2 prior to mitosis, (f) G1 prior to meiosis I, and (g) prophase of meiosis I? How many chromatids or chromosomes are present in: (h) an oogonial cell prior to S phase, (i) a spermatid, (j) a primary oocyte arrested prior to ovulation, (k) a secondary oocyte arrested prior to fertilization, (l) a second polar body, and (m) a chimpanzee sperm?
23. In humans:
a. How many sperm develop from 100 primary spermatocytes?
b. How many sperm develop from 100 secondary spermatocytes?
c. How many sperm develop from 100 spermatids?
d. How many ova develop from 100 primary oocytes?
e. How many ova develop from 100 secondary oocytes?
f. How many ova develop from 100 polar bodies?
24. Women sometimes develop benign tumors called ovarian teratomas or dermoid cysts in their ovaries. Such a tumor begins when a primary oocyte escapes from its prophase I arrest and finishes meiosis I within the ovary. (Normally meiosis I does not finish until the primary oocyte is expelled from the ovary upon ovulation.) The secondary oocyte then develops as if it were an embryo. Development is disorganized, however, and results in a tumor containing differentiated diploid tissues, including teeth, hair, bone, muscle, and nerve. If a dermoid cyst forms in a woman whose genotype is Aa, what are the possible genotypes of the cyst, assuming no recombination?
25. In a certain strain of turkeys, unfertilized eggs sometimes develop parthenogenetically to produce diploid offspring. (Females have ZW and males have ZZ sex chromosomes. Assume that WW cells are inviable.) What distribution of sexes would you expect to see among the parthenogenetic offspring according to each of the following models for how parthenogenesis occurs?
a. The eggs develop without ever going through meiosis.
b. The eggs go all the way through meiosis and then duplicate their chromosomes to become diploid.
c. The eggs go through meiosis I, and the chromatids separate to create diploidy.
d. The egg goes all the way through meiosis and then fuses at random with one of its three polar bodies (this assumes the first polar body goes through meiosis II).
Section 4.6
26. Imagine you have two pure-breeding lines of canaries, one with yellow feathers and the other with brown feathers. In crosses between these two strains, yellow female × brown male gives only brown sons and daughters, while brown female × yellow male gives only brown sons and yellow daughters. Propose a hypothesis to explain these results.
27. A system of sex determination known as haplodiploidy is found in honeybees. Females are diploid, and males (drones) are haploid. Male offspring result from the development of unfertilized eggs. Sperm are produced by mitosis in males and fertilize eggs in the females. Ivory eye is a recessive characteristic in honeybees; wild-type eyes are brown.
a. What progeny would result from an ivory-eyed queen and a brown-eyed drone? Give both genotype and phenotype for progeny produced from fertilized and nonfertilized eggs.
b. What would result from crossing a daughter from the mating in part (a) with a brown-eyed drone?
28. In Drosophila, the autosomal recessive brown eye color mutation displays interactions with both the X-linked recessive vermilion mutation and the autosomal recessive scarlet mutation. Flies homozygous for brown and simultaneously hemizygous or homozygous for vermilion have white eyes. Flies simultaneously homozygous for both the brown and scarlet mutations also have white eyes. Predict the F1 and F2 progeny of crossing the following true-breeding parents:
a. vermilion females × brown males
b. brown females × vermilion males
c. scarlet females × brown males
d. brown females × scarlet males
29. Barred feather pattern is a Z-linked dominant trait in chickens. What offspring would you expect from (a) the cross of a barred hen to a nonbarred rooster? (b) the cross of an F1 rooster from part (a) to one of his sisters?
30. When Calvin Bridges observed a large number of offspring from a cross of white-eyed female Drosophila to red-eyed males, he observed very rare white-eyed females and red-eyed males among the offspring. He was able to show that these exceptions resulted from nondisjunction, such that the white-eyed females had received two Xs from the egg and a Y from the sperm, while the red-eyed males had received no sex chromosome from the egg and an X from the sperm. What progeny would have arisen from these same kinds of nondisjunctional events if they had occurred in the male parent? What would their eye colors have been?
31. In a vial of Drosophila, a research student noticed several female flies (but no male flies) with "bag" wings each consisting of a large, liquid-filled blister instead of the usual smooth wing blade. When bag-winged females were crossed with wild-type males, 1/3 of the progeny were bag-winged females, 1/3 were normal-winged females, and 1/3 were normal-winged males. Explain these results.
32. In 1919, Calvin Bridges began studying an X-linked recessive mutation causing eosin-colored eyes in Drosophila. Within an otherwise true-breeding culture of eosin-eyed flies, he noticed rare variants that had much lighter cream-colored eyes. By intercrossing these variants, he was able to make a true-breeding cream-eyed stock. Bridges now crossed males from this cream-eyed stock with true-breeding wild-type females. All the F1 progeny had red (wild-type) eyes. When F1 flies were intercrossed, the F2 progeny were 104 females with red eyes, 52 males with red eyes, 44 males with eosin eyes, and 14 males with cream eyes. Assume this represents an 8:4:3:1 ratio.
a. Formulate a hypothesis to explain the F1 and F2 results, assigning phenotypes to all possible genotypes.
b. What do you predict in the F1 and F2 generations if the parental cross is between true-breeding eosin-eyed males and true-breeding cream-eyed females?
c. What do you predict in the F1 and F2 generations if the parental cross is between true-breeding eosin-eyed females and true-breeding cream-eyed males?
33. In Drosophila, a cross was made between a yellow-bodied male with vestigial (not fully developed) wings and a wild-type female (brown body). The F1 generation consisted of wild-type males and wild-type females. F1 males and females were crossed, and the F2 progeny consisted of 16 yellow-bodied males with vestigial wings, 48 yellow-bodied males with normal wings, 15 males with brown bodies and vestigial wings, 49 wild-type males, 31 brown-bodied females with vestigial wings, and 97 wild-type females. Explain the inheritance of the two genes in question based on these results.
34. As we learned in this chapter, the white mutation of Drosophila studied by Thomas Hunt Morgan is X-linked and recessive to wild type. When true-breeding white-eyed males carrying this mutation were crossed with true-breeding purple-eyed females, all the F1 progeny had wild-type (red) eyes. When the F1 progeny were intercrossed, the F2 progeny emerged in the ratio 3/8 wild-type females: 1/4 white-eyed males: 3/16 wild-type males: 1/8 purple-eyed females: 1/16 purple-eyed males.
a. Formulate a hypothesis to explain the inheritance of these eye colors.
b. Predict the F1 and F2 progeny if the parental cross was reversed (that is, if the parental cross was between true-breeding white-eyed females and true-breeding purple-eyed males).
Section 4.7
35. The following is a pedigree of a family in which a rare form of colorblindness is found (filled-in symbols). Indicate as much as you can about the genotypes of all the individuals in the pedigree.
36. Each of the four pedigrees that follow represents a human family within which a genetic disease is segregating. Affected individuals are indicated by filled-in symbols. One of the diseases is transmitted as an autosomal recessive condition, one as an X-linked recessive, one as an autosomal dominant, and one as an X-linked dominant. Assume all four traits are rare in the population.
a. Indicate which pedigree represents which mode of inheritance, and explain how you know.
b. For each pedigree, how would you advise the parents of the chance that their child (indicated by the hexagon shape) will have the condition?
[Pedigrees 1-4 for Problem 36]
37. The pedigree that follows indicates the occurrence of albinism in a group of Hopi Indians, among whom the trait is unusually frequent. Assume that the trait is fully penetrant (all individuals with a genotype that could give rise to albinism will display this condition).
a. Is albinism in this population caused by a recessive or a dominant allele?
b. Is the gene sex-linked or autosomal?
What are the genotypes of the following individuals?
c. individual I-1
d. individual I-8
e. individual I-9
f. individual II-6
g. individual II-8
h. individual III-4
[Pedigree for Problem 37: generations I-IV]
38. Duchenne muscular dystrophy (DMD) is caused by a relatively rare X-linked recessive allele. It results in progressive muscular wasting and usually leads to death before age 20. In this problem, an "affected" person is one with the severe form of DMD caused by hemizygosity or homozygosity for the disease allele.
a. What is the probability that the first son of a woman whose brother is affected will be affected?
b. What is the probability that the second son of a woman whose brother is affected will be affected, if her first son was affected?
c. What is the probability that a child of an unaffected man whose brother is affected will be affected?
d. An affected man mates with his unaffected first cousin; there is otherwise no history of DMD in this family. If the mothers of this man and his mate were sisters, what is the probability that the couple's first child will be an affected boy? An affected girl? An unaffected child?
e. If two of the parents of the couple in part (d) were brother and sister, what is the probability that the couple's first child will be an affected boy? An affected girl? An unaffected child?
39. The X-linked gene responsible for DMD encodes a protein called dystrophin that is required for muscle function. Dystrophin protein is not secreted; it remains in the cells that produce it. Given what you know about Barr body formation, do you think that females heterozygous for the recessive DMD disease allele could have the disease in some parts of their bodies and not others?
40. Males have hemophilia when they are hemizygous for a nonfunctional recessive mutant allele of the X-linked gene for clotting factor VIII. Factor VIII is normally secreted into the blood serum by cells in the bone marrow that produce it.
a. Do you think that females heterozygous for the hemophilia disease allele could have hemophilia in some parts of their bodies and not others?
b. If such a female "carrier" of hemophilia suffered a cut, would her blood coagulate (form clots) faster, slower, or in about the same time as that of an individual homozygous for a normal allele of the factor VIII gene? Would the rate of clotting vary significantly among heterozygous females?
41. The pedigree at the bottom of the page shows five generations of a family that exhibits congenital hypertrichosis, a rare condition in which affected individuals are born with unusually abundant amounts of hair on their faces and upper bodies. The two small black dots in the pedigree indicate miscarriages.
a. What can you conclude about the inheritance of hypertrichosis in this family, assuming complete penetrance of the trait?
b. On what basis can you exclude other modes of inheritance?
c. With how many fathers did III-2 and III-9 have children?
42. Consider the following pedigrees from human families containing a male with Klinefelter syndrome (a set of abnormalities seen in XXY individuals; indicated with shaded boxes). In each, A and B refer to codominant alleles of the X-linked G6PD gene. The phenotypes of each individual (A, B, or AB) are shown on the pedigree. Indicate if nondisjunction occurred in the mother or father of the son with Klinefelter syndrome for each of the three examples. Can you tell if the nondisjunction was in the first or second meiotic division?
[Pedigrees a, b, and c for Problem 42: G6PD phenotypes (A, B, or AB) of family members]
43. Several different antigens can be detected in blood tests. The following four traits were tested for each individual shown:
ABO type (IA and IB codominant, i recessive)
Rh type (Rh+ dominant to Rh-)
MN type (M and N codominant)
Xg(a) type (Xg(a+) dominant to Xg(a-))
All of these blood type genes are autosomal, except for Xg(a), which is X-linked.

Individual          ABO   Rh    MN   Xg(a)
Mother              AB    Rh-   MN   Xg(a+)
Daughter            A     Rh+   MN   Xg(a-)
Alleged father 1    AB    Rh+   M    Xg(a+)
Alleged father 2    A     Rh-   N    Xg(a-)
Alleged father 3    B     Rh+   N    Xg(a-)
Alleged father 4    O     Rh-   MN   Xg(a-)

a. Which, if any, of the alleged fathers could be the real father?
b. Would your answer to part (a) change if the daughter had Turner syndrome (the abnormal phenotype seen in XO individuals)? If so, how?
44. The ancestry of a white female tiger bred in a city zoo is depicted in the pedigree following part (e) of this problem. White tigers are indicated with unshaded symbols. (As you can see, there was considerable inbreeding in this lineage. For example, the white tiger Mohan was mated with his daughter.) In answering the following questions, assume that "white" is determined by allelic differences at a single gene and that the trait is fully penetrant. Explain your answers by citing the relevant information in the pedigree.
a. Could white coat color be caused by a Y-linked allele?
b. Could white coat color be caused by a dominant X-linked allele?
c. Could white coat color be caused by a dominant autosomal allele?
d. Could white coat color be caused by a recessive X-linked allele?
e. Could white coat color be caused by a recessive autosomal allele?
[Pedigree for Problem 44: white tiger lineage, with labeled individuals Mohan, Mohini, Sumita, Kamala, Tony, Kesari, and the 1982 female]
45. The pedigree at the bottom of the page shows the inheritance of various types of cancer in a particular family. Molecular analyses (described in subsequent chapters) indicate that with one exception, the cancers occurring in the patients in this pedigree are associated with a rare mutation in a gene called BRCA2.
a. Which individual is the exceptional cancer patient whose disease is not associated with a BRCA2 mutation?
b. Is the BRCA2 mutation dominant or recessive to the normal BRCA2 allele in terms of its cancer-causing effects?
c. Is the BRCA2 gene likely to reside on the X chromosome, the Y chromosome, or an autosome? How definitive is your assignment of the chromosome carrying BRCA2?
[Pedigree for Problem 45, generations I-V. Key: deceased; breast cancer; ovarian cancer and deceased; other cancer and deceased]
d. Is the penetrance of the cancer phenotype complete or incomplete?
e. Is the expressivity of the cancer phenotype unvarying or variable?
f. Are any of the cancer phenotypes associated with the BRCA2 mutation sex-limited or sex-influenced?
g. How can you explain the absence of individuals diagnosed with cancer in generations I and II?
46. In 1995, doctors reported a Chinese family in which retinitis pigmentosa (progressive degeneration of the retina leading to blindness) affected only males. All six sons of affected males were affected, but all of the five daughters of affected males (and all of the children of these daughters) were unaffected.
a. What is the likelihood that this form of retinitis pigmentosa is due to an autosomal mutation showing complete dominance?
b. What other possibilities could explain the inheritance of retinitis pigmentosa in this family? Which of these possibilities do you think is most likely?
47. In cats, the dominant O allele of the X-linked orange gene is required to produce orange fur; the recessive o allele of this gene yields black fur.
a. Tortoiseshell cats have coats with patches of orange fur alternating with patterns of black fur. Approximately 90% of all tortoiseshell cats are females. What type of crosses would be expected to produce female tortoiseshell cats?
b. Suggest a hypothesis to explain the origin of male tortoiseshell cats.
c. Calico cats (most of which are females) have patches of white, orange, and black fur. Suggest a hypothesis for the origin of calico cats.
[Photographs for Problem 47: a tortoiseshell cat and a calico cat]
48. In marsupials like the opossum or kangaroo, X inactivation selectively inactivates the paternal X chromosome.
a. Predict the possible coat color phenotypes of the progeny of both sexes if a female marsupial homozygous for a mutant allele of an X-linked coat color gene was mated with a male hemizygous for the alternative wild-type allele of this gene.
b. Predict the possible coat color phenotypes of progeny of both sexes if a male marsupial hemizygous for an allele of an X-linked coat color gene was mated with a female homozygous for the alternative wild-type allele of this gene.
c. Why are the terms "recessive" and "dominant" not useful in describing the alleles of X-linked coat color genes in marsupials?
d. Why would marsupials heterozygous for two alleles of an X-linked coat color gene not have patches of fur of two different colors as did the tortoiseshell cats described in Problem 47?
49. The pedigree diagram below shows a family in which many individuals are affected by a disease called Leri-Weill dyschondrosteosis (LWD). People with LWD are short in stature due to leg bone deformities; arm bones are also malformed in some individuals. The mutant gene responsible for LWD was identified in 1998 as SHOX, a gene located in a pseudoautosomal region (PAR1) of the X and Y chromosomes.
a. Is the SHOX allele that causes LWD dominant or recessive? Explain. (Note: Sex reversal is not involved.)
b. Even though SHOX is located on the X chromosome, the pedigree is atypical for an X-linked allele. What features of the pedigree are incompatible with X-linkage?
c. For each affected individual in the pedigree, determine whether the SHOX disease allele is on the X or the Y.
d. Explain the inheritance pattern of the SHOX disease allele and the SHOX+ normal allele in the pedigree.
e. Diagram the crossover event that generated the Y chromosome in individual III-5. Your diagram should indicate the positions of the SHOX (disease) and SHOX+ (normal) alleles on the X and Y chromosomes in the germ-line cells of individual II-3, and SRY+ on Y.
[Pedigree for Problem 49: generations I-IV]
Chapter 5
Linkage, Recombination, and the Mapping of Genes on Chromosomes

Maps illustrate the spatial relationships of objects, such as the locations of subway stations along subway lines. Genetic maps portray the positions of genes along chromosomes.

chapter outline
5.1 Gene Linkage and Recombination
5.2 Recombination: A Result of Crossing-Over During Meiosis
5.3 Mapping: Locating Genes Along a Chromosome
5.4 The Chi-Square Test and Linkage Analysis
5.5 Tetrad Analysis in Fungi
5.6 Mitotic Recombination and Genetic Mosaics

IN 1928, DOCTORS completed a four-generation pedigree tracing two known X-linked traits: red-green colorblindness and hemophilia A (the more serious X-linked form of "bleeder's disease"). The maternal grandfather of the family exhibited both traits, which means that his single X chromosome carried mutant alleles of the two corresponding genes. As expected, neither colorblindness nor hemophilia showed up in his sons and daughters, but two grandsons and one great-grandson inherited both of the X-linked conditions (Fig. 5.1a). The fact that none of the descendants manifested one of the traits without the other suggests that the mutant alleles did not assort independently during meiosis. Instead they traveled together in the gametes forming one generation and then into the gametes forming the next generation, producing grandsons and great-grandsons with an X chromosome specifying both colorblindness and hemophilia. Genes that travel together more often than not exhibit genetic linkage.
In contrast, another pedigree following colorblindness and the slightly different B form of hemophilia, which also arises from a mutation on the X chromosome, revealed a different inheritance pattern. A grandfather with hemophilia B and colorblindness had four grandsons, but only one of them exhibited both conditions. In this family, the genes for colorblindness and hemophilia appeared to assort independently, producing in the male progeny all four possible combinations of the two traits (normal vision and normal blood clotting, colorblindness and hemophilia, colorblindness and normal clotting, and normal vision and hemophilia) in approximately equal frequencies (Fig. 5.1b).
Thus, even though the mutant alleles of the two genes were on the same X chromosome in the grandfather, they had to separate to give rise to grandsons III-2 and III-3. This separation of genes on the same chromosome is the result of recombination, the occurrence in progeny of new gene combinations not seen in previous generations. (Note that recombinant progeny can result in either of two ways: from the recombination of genes on the same chromosome during gamete formation, discussed in this chapter, or from the independent assortment of genes on nonhomologous chromosomes, previously described in Chapter 4.)
Two themes emerge from following the transmission of genes linked on the same chromosome. The first is that the farther apart two genes are, the greater is the probability of separation through recombination. Extrapolating from this general rule, you can see that the gene for hemophilia A is likely very close to the gene for red-green colorblindness, because, as Fig. 5.1a shows, the two rarely separate. By comparison, the gene for hemophilia B must lie far away from the colorblindness gene, because, as Fig. 5.1b indicates, new combinations of alleles of the two genes occur quite often. A second theme is that geneticists can use data about how often genes separate during transmission to map the genes' relative locations on a chromosome. Such mapping is a key to sorting out and tracking down the components of complex genetic networks; it is also crucial to geneticists' ability to isolate and characterize genes at the molecular level.

Figure 5.1 Pedigrees indicate that colorblindness and two forms of hemophilia are X-linked traits. (a) Transmission of red-green colorblindness and hemophilia A. The traits travel together through the pedigree, indicating their genetic linkage. (b) Transmission of red-green colorblindness and hemophilia B. Even though both genes are X-linked, the mutant alleles are inherited together in only one of four grandsons in generation III. These two pedigrees indicate that the gene for colorblindness is close to the hemophilia A gene but far away from the hemophilia B gene.
[Pedigree key: male; female; hemophilia A; hemophilia B; colorblind]

5.1 Gene Linkage and Recombination
learning objectives
1. Define linkage with respect to gene loci and chromosomes.
2. Differentiate between parental and recombinant gametes.
3. Conclude from ratios of progeny in a dihybrid cross whether two genes are linked.
4. Explain how a testcross can provide evidence for or against linkage.
If people have roughly 25,000 genes but only 23 pairs of chromosomes, most human chromosomes must carry hundreds, if not thousands, of genes. This is certainly true of the human X chromosome, which contains about 1100 protein-coding genes, as just described in Chapter 4. Recognition that many genes reside on each chromosome raises an important question. If genes on different chromosomes assort independently because nonhomologous chromosomes align independently on the spindle during meiosis I, how do genes on the same chromosome assort?
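The estimate of genes per chromosome can be checked with a line of arithmetic. The Python sketch below simply divides the approximate gene count by the number of chromosome pairs; the variable names are illustrative, and the result is only an average, because chromosomes differ in size and gene content.

```python
# Rough average number of genes per human chromosome,
# using the approximate figures quoted in the text.
total_genes = 25_000      # approximate number of human genes
chromosome_pairs = 23     # pairs of chromosomes in a human cell

average_genes = total_genes / chromosome_pairs
print(f"about {average_genes:.0f} genes per chromosome on average")
```

The average is of the same order as the figure of about 1100 protein-coding genes quoted for the X chromosome.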
Some Genes on the Same Chromosome Do Not Assort Independently - Instead, They Are Linked
We begin our analysis with X-linked Drosophila genes because they were the first to be assigned to a specific chromosome. As we outline various crosses, remember that females carry two X chromosomes, and thus two alleles for each X-linked gene. Males, in contrast, have only a single X chromosome (from the female parent), and thus only a single allele for each of these genes. We look first at two X-linked genes that determine a fruit fly's eye color and body color. These two genes are said to be syntenic because they are located on the same chromosome. The white gene was introduced in Chapter 4; you will recall that the dominant wild-type allele w+ specifies red eyes, while the recessive mutant allele w confers white eyes. The alleles of the yellow body color gene are y+ (the dominant wild-type allele for brown bodies) and y (the recessive mutant allele for yellow bodies). To avoid confusion, note that lowercase y and y+ refer to alleles of the yellow gene, while capital Y refers to the Y chromosome (which does not carry genes for either eye or body color). You should also pay attention to the slash symbol (/), which is used to separate genes found on chromosomes of a pair (either the X and Y chromosomes as in this case, or a pair of X chromosomes or homologous autosomes). Thus w y / Y represents the genotype of a male with an X chromosome bearing w and y, as well as a Y chromosome; phenotypically this male has white eyes and a yellow body.
Detecting linkage by analyzing the gametes produced by a dihybrid
In a cross between a female with mutant white eyes and a wild-type brown body (w y+ / w y+) and a male with wild-type red eyes and a mutant yellow body (w+ y / Y), the F1 offspring are evenly divided between brown-bodied females with normal red eyes (w y+ / w+ y) and brown-bodied males with mutant white eyes (w y+ / Y) (Fig. 5.2). Note that the male progeny look like their mother because their phenotype directly reflects the genotype of the single X chromosome they received from her. The same is not true for the F1 females, who received w and y+ on the X from their mother and w+ y on the X from their father. These F1 females are
Figure 5.2 When genes are linked, parental combinations outnumber recombinant types. Doubly heterozygous w y+ / w+ y F1 females produce four types of male offspring. Sons that look like the father (w+ y / Y) or mother (w y+ / Y) of the F1 females are parental types. Other sons (w+ y+ / Y or w y / Y) are recombinant types. For these closely linked genes, many more parental types are produced than recombinant types.
[Figure 5.2 diagram. P: w y+ / w y+ female × w+ y / Y male. F1: w y+ / w+ y female × w y+ / Y male. F2 males: 9026 total, including recombinant classes of 76 and 53. Parental types = (4484 + 4413) / 9026 × 100 = 99%.]
thus dihybrids. With two alleles for each X-linked gene, one derived from each parent, the dominance relations of each pair of alleles determine the female phenotype. Now comes the significant cross for answering our question about the assortment of genes on the same chromosome. If these two Drosophila genes for eye and body color assort independently, as predicted by Mendel's second law, the dihybrid F1 females should make four kinds of gametes, with four different combinations of genes on the X chromosome: w y+, w+ y, w+ y+, and w y. These four types of gametes should occur with equal frequency, that is, in a ratio of 1:1:1:1. If it happens this way, approximately half of the gametes will be of the two parental types, carrying either the w y+ allele combination seen in the original female of the P generation or the w+ y allele combination seen in the original male of the P generation. The remaining half of the gametes will be of two recombinant types, in which reshuffling has produced either w+ y+ or w y allele combinations not seen in the P generation parents of the F1 females. We can see whether the 1:1:1:1 ratio of the four kinds of gametes actually materializes by counting the different types of male progeny in the F2 generation, as these sons
receive their only X-linked genes from their maternal gamete. The bottom part of Fig. 5.2 depicts the results of a breeding study that produced 9026 F2 males. The relative numbers of the four X-linked gene combinations passed on by the dihybrid F1 females' gametes reflect a significant departure from the 1:1:1:1 ratio expected of independent assortment. By far, the largest numbers of gametes carry the parental combinations w y+ and w+ y. Of the total 9026 male flies counted, 8897, or almost 99%, had these genotypes. In contrast, the new combinations w+ y+ and w y made up little more than 1% of the total. We can explain why the two genes fail to assort independently in one of two ways. The w y+ and w+ y combinations could be preferred because some intrinsic chemical affinity exists between these particular alleles. Alternatively, these combinations of alleles might show up most frequently because they are parental types. That is, the F1 female inherited w and y+ together from her P generation mother, and w+ and y together from her P generation father; the F1 female is then more likely to pass on these parental combinations of alleles, rather than the recombinant combinations, to her own progeny.
[Figure 5.2, continued. F2 male counts shown include 4413 (a parental class). Recombinant types = (76 + 53) / 9026 × 100 = 1%.]
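The arithmetic behind the 99% and 1% figures can be sketched in a few lines. The following is a minimal illustration, not part of the original study: the function names are hypothetical, and the chi-square statistic is included only as one conventional way to express how far the counts depart from the 1:1:1:1 expectation (the critical value for 3 degrees of freedom at p = 0.05 is 7.815).

```python
from typing import Sequence


def recombination_frequency(parental: Sequence[int], recombinant: Sequence[int]) -> float:
    """Return the percentage of progeny carrying recombinant allele combinations."""
    total = sum(parental) + sum(recombinant)
    if total == 0:
        raise ValueError("no progeny counted")
    return 100.0 * sum(recombinant) / total


def chi_square_equal_classes(counts: Sequence[int]) -> float:
    """Chi-square statistic against equal expected counts (1:1:1:1 for four classes)."""
    expected = sum(counts) / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


# F2 male counts reported in Fig. 5.2
parental_counts = [4484, 4413]
recombinant_counts = [76, 53]

rf = recombination_frequency(parental_counts, recombinant_counts)
chi2 = chi_square_equal_classes(parental_counts + recombinant_counts)

total_males = sum(parental_counts) + sum(recombinant_counts)
print(f"recombinant types: {rf:.2f}% of {total_males} F2 males")
print(f"chi-square against 1:1:1:1 = {chi2:.0f} (3 degrees of freedom)")
```

A statistic far above the critical value is what one would expect when parental classes outnumber recombinant classes as strongly as they do here.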
Linkage: A preponderance of parental classes of gametes A second set of crosses involving the same genes but with a different arrangement of alleles explains why the dihybrid F1 females do not produce a 1:1:1:1 ratio of the four possible types of gametes (see Cross Series B in Fig. 5.3). In this second set of crosses, the original parental generation consists of red-eyed, brown-bodied females (w+ y+ / w+ y+) and white-eyed, yellow-bodied males
Figure 5.3 Designations of "parental" and "recombinant" relate to past history. Figure 5.2 has been redrawn here as Cross Series A for easier comparison with Cross Series B, in which the dihybrid F1 females received different allelic combinations of the white and yellow genes. Note that the parental and recombinant classes in the two cross series are the opposite of each other. The percentages of recombinant and parental types are nonetheless similar in both experiments, showing that the frequency of recombination is independent of the arrangement of alleles.
[Figure 5.3 diagram. Cross Series A: P, w y+ / w y+ female × w+ y / Y male; F1 female, w y+ / w+ y; F2 males, parental types (w y+ and w+ y) ~99%, recombinant types (w+ y+ and w y) ~1%. Cross Series B: P, w+ y+ / w+ y+ female × w y / Y male; F1 female, w+ y+ / w y; F2 males, parental types (w+ y+ and w y) ~99%, recombinant types (w+ y and w y+) ~1%.]
(w y / Y), and the resultant F1 females are all w+ y+ / w y dihybrids. To find out what kinds and ratios of gametes these F1 females produce, we need to look at the telltale F2 males. This time, as Cross Series B in Fig. 5.3 shows, w+ y / Y and w y+ / Y are the recombinants that account for little more than 1% of the total, while w y / Y and w+ y+ / Y are the parental combinations, which again add up to almost 99%. You can see that there is no preferred association of w+ and y or of y+ and w in this cross. Instead, a comparison of the two experiments with these particular X chromosome genes demonstrates that the observed frequencies of the various types of progeny depend on how the arrangement of alleles in the F1 females originated. We have redrawn Fig. 5.2 as Cross Series A in Fig. 5.3 so that you can make this comparison more directly. Note that in both experiments, it is the parental classes (the combinations originally present in the P generation) that show up most frequently in the F2 generation. The reshuffled recombinant classes occur less frequently. It is important to appreciate that the designation of "parental" and "recombinant" gametes or progeny of a doubly heterozygous F1 female is operational, that is, determined by the particular set of alleles she receives from each of her parents. When genes assort independently, the numbers of parental and recombinant F2 progeny are equal, because a doubly heterozygous F1 individual produces an equal number of all four types of gametes. By comparison, two genes are considered linked when the number of F2 progeny with parental genotypes exceeds the number of F2 progeny with recombinant genotypes. Instead of assorting independently, the genes behave as if they are connected
to each other much of the time. The genes for eye and body color that reside on the X chromosome in Drosophila are an extreme illustration of the linkage concept. The two genes are so tightly coupled that the parental combinations of alleles, w y+ and w+ y (in Cross Series A of Fig. 5.3) or w+ y+ and w y (in Cross Series B), are reshuffled to form recombinants in only 1 out of every 100 gametes formed. In other words, the two parental allele combinations of these tightly linked genes are inherited together 99 times out of 100.
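The operational nature of the "parental" and "recombinant" labels can be made concrete with a short sketch. The function name and the tuple representation of a haplotype below are illustrative assumptions; the allele combinations are those of Cross Series A and B.

```python
def classify_gamete(gamete, maternal, paternal):
    """Label a gamete 'parental' if it matches either chromosome the F1 female
    inherited from her own parents; otherwise label it 'recombinant'."""
    return "parental" if gamete in (maternal, paternal) else "recombinant"


# Each haplotype is (white allele, yellow allele) on one X chromosome.
gametes = [("w", "y+"), ("w+", "y"), ("w+", "y+"), ("w", "y")]

# Cross Series A: the F1 female is w y+ / w+ y
series_a = {g: classify_gamete(g, ("w", "y+"), ("w+", "y")) for g in gametes}

# Cross Series B: the F1 female is w+ y+ / w y
series_b = {g: classify_gamete(g, ("w+", "y+"), ("w", "y")) for g in gametes}

for g in gametes:
    print(" ".join(g), "| Series A:", series_a[g], "| Series B:", series_b[g])
```

The same four gamete types appear in both series; only the history of the F1 female decides which two count as parental.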
Gene-pair-specific variation in the degree of linkage Linkage is not always this tight. In Drosophila, a mutation for miniature wings (m) is also found on the X chromosome. A cross of red-eyed females with normal wings (w+ m+ / w+ m+) and white-eyed males with miniature wings (w m / Y) yields an F1 generation containing all red-eyed, normal-winged flies. The genotype of the dihybrid F1 females is w+ m+ / w m. Of the F2 males, 67.2% are parental types (w+ m+ and w m), while the remaining 32.8% are recombinants (w m+ and w+ m). This preponderance of parental combinations among the F2 genotypes reveals that the two genes are linked: The parental combinations of alleles travel together more often than not. But compared to the 99% linkage between the w and y genes for eye color and body color, the linkage of w to m is not that tight. The parental combinations for color and wing size are reshuffled in roughly 33 (instead of 1) out of every 100 gametes.
Linkage of autosomal traits Linked autosomal genes are not inherited according to the 9:3:3:1 Mendelian ratio expected for two independently assorting, non-interacting genes, each with one completely dominant and one recessive allele. Mendel observed the 9:3:3:1 phenotypic ratio in the F2 of his dihybrid crosses because the four possible gamete types (A B, A b, a B, and a b) were produced at equal frequency by both parents. Equal numbers of each of the four gamete types (independent assortment) means that each one of the 16 boxes in the Punnett square for the F2 is an equally likely fertilization with a frequency of 1/16 (recall Fig. 2.15 on p. 25). Had Mendel's two genes been linked, the phenotypic ratio in the F2 would no longer have been 9:3:3:1 because the parental gametes would have been present at greater frequency than the recombinant gametes. Figure 5.4 shows the consequences of linkage if the F1 dihybrid individuals were both of genotype A B / a b: the 9/16 and 1/16 classes of F2 would have increased at the expense of the two 3/16 classes. Conversely, if the alleles of the parents are configured
Figure 5.4 The 9:3:3:1 ratio is altered when genes A and B are linked. For linked genes, the F2 genotypic classes produced most often by parental gametes increase in frequency at the expense of the other classes. In the A B / a b dihybrid cross shown here, the A- B- and aa bb classes in the F2 will occur at higher frequencies, and the two other classes (A- bb and aa B-) at lower frequencies than predicted by the 9:3:3:1 ratio. Note that the blue colors in this figure denote the frequencies at which particular genotypic classes will appear in the F2 generation.
[Figure 5.4 diagram. P: A B / A B × a b / a b. F1: A B / a b × A B / a b. F2 Punnett square in which parental gametes occur at frequencies > 1/4, giving A- B- > 9/16 and aa bb > 1/16.]
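The shift away from 9:3:3:1 can be worked through numerically. The sketch below is an illustration under stated assumptions: both F1 parents are A B / a b, recombination occurs at the same frequency r in both parents, and the function name and the example value r = 1/10 are hypothetical. Exact fractions are used so the result at r = 1/2 (independent assortment) can be compared directly with 9/16, 3/16, 3/16, and 1/16.

```python
from fractions import Fraction


def f2_phenotype_frequencies(r):
    """F2 phenotypic class frequencies for an A B / a b x A B / a b cross,
    assuming recombination frequency r (0 to 1/2) in both parents."""
    r = Fraction(r)
    gametes = {
        ("A", "B"): (1 - r) / 2,  # parental
        ("a", "b"): (1 - r) / 2,  # parental
        ("A", "b"): r / 2,        # recombinant
        ("a", "B"): r / 2,        # recombinant
    }
    classes = {
        "A- B-": Fraction(0),
        "A- bb": Fraction(0),
        "aa B-": Fraction(0),
        "aa bb": Fraction(0),
    }
    # Fill in the 16 boxes of the Punnett square, weighting each by gamete frequency.
    for (a1, b1), p1 in gametes.items():
        for (a2, b2), p2 in gametes.items():
            a_part = "A-" if "A" in (a1, a2) else "aa"
            b_part = "B-" if "B" in (b1, b2) else "bb"
            classes[f"{a_part} {b_part}"] += p1 * p2
    return classes


unlinked = f2_phenotype_frequencies(Fraction(1, 2))
linked = f2_phenotype_frequencies(Fraction(1, 10))  # hypothetical r for illustration

for name in unlinked:
    print(f"{name}: unlinked {unlinked[name]}, linked {linked[name]}")
```

With r below 1/2, the A- B- and aa bb classes rise above 9/16 and 1/16, and the two 3/16 classes shrink by the same amount.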
If PD >> NPD, the two genes are linked. If PD = NPD, the two genes assort independently (they are unlinked).
The map distance between two genes if they are linked:
$$\text{Map distance} = \frac{\text{NPD} + \tfrac{1}{2}\,\text{T}}{\text{Total tetrads}} \times 100$$
For Ordered Tetrads Only
The map distance between a gene and the centromere:
$$\text{Map distance} = \frac{\tfrac{1}{2}\,\text{MII}}{\text{Total tetrads}} \times 100$$
[Figure residue: (b) Corresponding genetic map, with labels 16.7, 7.6 m.u., and 10 m.u.]
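The two formulas above translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the tetrad counts in the usage lines are invented for the example rather than taken from any experiment in the text.

```python
def gene_gene_distance(pd, npd, t):
    """Map distance (m.u.) between two linked genes from tetrad counts:
    (NPD + 1/2 T) / total tetrads x 100."""
    total = pd + npd + t
    if total == 0:
        raise ValueError("no tetrads counted")
    return 100.0 * (npd + 0.5 * t) / total


def gene_centromere_distance(mii, total_tetrads):
    """Map distance (m.u.) between a gene and its centromere from ordered tetrads:
    (1/2 MII) / total tetrads x 100."""
    if total_tetrads == 0:
        raise ValueError("no tetrads counted")
    return 100.0 * (0.5 * mii) / total_tetrads


# Hypothetical counts: 70 PD, 2 NPD, and 28 T tetrads
print(gene_gene_distance(pd=70, npd=2, t=28), "m.u. between the two genes")

# Hypothetical counts: 20 second-division (MII) patterns among 100 ordered tetrads
print(gene_centromere_distance(mii=20, total_tetrads=100), "m.u. from gene to centromere")
```

The factor of 1/2 appears in both functions because only half of the spores in a T tetrad, and half of the spores in an MII tetrad, are recombinant.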
essential concepts
• A tetrad is the group of four haploid spores within an ascus that results from a single meiosis in fungi.
• In a parental ditype (PD), a tetrad has four parental spores; in a nonparental ditype (NPD), a tetrad contains four recombinant spores; in a tetratype (T), an ascus contains two different parental spores and two different recombinant spores.
• When a dihybrid sporulates, if PD tetrads are equal to NPD tetrads, the genes in question are unlinked; when PDs greatly outnumber NPDs, the genes are linked.
• Analysis of unordered tetrads reveals linked genes and the map distance between them; analysis of ordered tetrads further allows determination of the distance between a gene and its chromosome's centromere.

5.6 Mitotic Recombination and Genetic Mosaics
learning objectives
1. Explain how mitotic recombination leads to the mosaic condition termed twin spots.
2. Describe sectored colonies in yeast and their significance in evaluating mitotic recombination.

The recombination of genetic material is a critical feature of meiosis. It is thus not surprising that eukaryotic organisms express a variety of enzymes (to be described in Chapter 6) that specifically initiate meiotic recombination. Recombination can also occur during mitosis. Unlike what happens in meiosis, however, mitotic crossovers are initiated by mistakes in chromosome replication or by chance exposures to radiation that break DNA molecules, rather than by a well-defined cellular program. As a result, mitotic recombination is a rare event, occurring no more frequently than once in a million somatic cell divisions. Nonetheless, the growth of a colony of yeast cells or the development of a complex multicellular organism involves so many cell divisions that geneticists can routinely detect these rare mitotic events.

"Twin Spots" Indicate Mosaicism Caused by Mitotic Recombination In 1936, the Drosophila geneticist Curt Stern inferred the existence of mitotic recombination from observations of "twin spots" in fruit flies. Twin spots are adjacent islands of tissue that differ both from each other and from the tissue surrounding them. The distinctive patches arise from homozygous cells with a recessive phenotype growing amid a generally heterozygous cell population displaying the dominant phenotype.

Figure 5.27 Twin spots: A form of genetic mosaicism. In a y sn+ / y+ sn Drosophila female, most of the body is wild type, but aberrant patches showing either yellow color or singed bristles sometimes occur. In some cases, yellow and singed patches are adjacent to each other, a configuration known as twin spots.
[Figure 5.27 labels: Single yellow spot; Twin spot; Single singed spot.]

In Drosophila, the yellow (y) mutation changes body color from normal brown to yellow, while the singed bristles (sn) mutation causes body bristles to be short and curled rather than long and straight. Both of these genes are X-linked. In his experiments, Stern examined Drosophila females of genotype y sn+ / y+ sn. These double heterozygotes were generally wild type in appearance, but Stern noticed that some flies carried patches of yellow body color, others had small areas of singed bristles, and still others displayed twin spots: adjacent patches of yellow cells and cells with singed bristles (Fig. 5.27). He assumed that mistakes in the mitotic divisions accompanying fly development could have led to these mosaic animals containing tissues of different genotypes.

Individual yellow or singed patches could arise from chromosome loss or by mitotic nondisjunction. These errors in mitosis would yield XO cells containing only y (but not y+) or sn (but not sn+) alleles; such cells would show one of the recessive phenotypes.

The twin spots must have a different origin. Stern reasoned that they represented the reciprocal products of mitotic crossing-over between the sn gene and the centromere. The mechanism is as follows: During mitosis in a diploid cell, after chromosome duplication, homologous chromosomes occasionally (but rarely) pair up with each other. While the chromosomes are paired, nonsister chromatids can exchange parts by crossing-over. The pairing is transient, and the homologous chromosomes soon resume their independent positions on the mitotic metaphase plate. There, the two chromosomes can line up relative to each other in either of two ways (Fig. 5.28a). One of these orientations would yield two daughter cells that remain heterozygous for both genes and are thus indistinguishable from the surrounding
Figure 5.28 Mitotic crossing-over. (a) In a y sn+ / y+ sn Drosophila female, a mitotic crossover between the centromere and sn can produce two daughter cells, one homozygous for y and the other homozygous for sn, that can develop into adjacent aberrant patches (twin spots). This outcome depends on a particular distribution of chromatids at anaphase (top). If the chromatids are arranged in the equally likely opposite orientation, only phenotypically normal cells will result (bottom). (b) Crossovers between sn and y can generate single yellow patches. However, a single mitotic crossover in these females cannot produce a single singed spot if the sn gene is closer to the centromere than the y gene.
[Figure 5.28 diagram. Stages shown: transient pairing during mitosis, mitotic metaphase, daughter cells. (a) Crossing-over between sn and the centromere: yellow and singed daughter cells (twin spot), or two wild-type daughter cells (normal tissue). (b) Crossing-over between sn and y: a yellow daughter cell.]
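The chromatid bookkeeping behind Fig. 5.28 can be traced with a small enumeration. This is a sketch under the assumptions stated in the caption: the gene order is centromere, sn, y; a single crossover occurs between one chromatid of each homolog; and sister centromeres separate at anaphase, so each daughter cell receives one chromatid from each homolog. The dictionary representation and the function names are hypothetical.

```python
def mitotic_crossover(chromatid_a, chromatid_b, interval):
    """Exchange everything distal to the crossover point between two nonsister
    chromatids. 'cen-sn' swaps both sn and y; 'sn-y' swaps only y."""
    a, b = dict(chromatid_a), dict(chromatid_b)
    distal_genes = ("sn", "y") if interval == "cen-sn" else ("y",)
    for gene in distal_genes:
        a[gene], b[gene] = b[gene], a[gene]
    return a, b


def phenotype(c1, c2):
    """A recessive trait shows only when both chromatids carry the mutant allele."""
    traits = []
    if c1["y"] == c2["y"] == "y":
        traits.append("yellow")
    if c1["sn"] == c2["sn"] == "sn":
        traits.append("singed")
    return " ".join(traits) or "wild type"


def daughter_cells(interval):
    """Return daughter-cell phenotypes for the two possible metaphase orientations."""
    homolog_1 = {"sn": "sn+", "y": "y"}   # the y sn+ chromosome
    homolog_2 = {"sn": "sn", "y": "y+"}   # the y+ sn chromosome
    h1a, h1b = dict(homolog_1), dict(homolog_1)  # sister chromatids after replication
    h2a, h2b = dict(homolog_2), dict(homolog_2)
    h1b, h2a = mitotic_crossover(h1b, h2a, interval)
    orientation_1 = (phenotype(h1a, h2a), phenotype(h1b, h2b))
    orientation_2 = (phenotype(h1a, h2b), phenotype(h1b, h2a))
    return orientation_1, orientation_2


for interval in ("cen-sn", "sn-y"):
    print(interval, daughter_cells(interval))
```

Only the crossover between the centromere and sn yields a yellow cell next to a singed cell; the crossover between sn and y yields at most a single yellow cell, which is consistent with panel (b) of the figure.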
gene after the first mutation observed; they designate the wild-type allele as w+ and the various mutations as w1 (the original white-eyed mutation discovered by T. H. Morgan, often simply designated as w), wcherry, wcoral, wapricot, and wbuff. As an example, the eyes of a w1 / wapricot female are a dilute apricot color; because the phenotype of this heterozygote is not wild type, the two mutations are allelic. Figure 7.22b illustrates how researchers collate data from many complementation tests in a complementation table. Such a table helps visualize the relationships among a large group of mutants.
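Collating pairwise complementation results into groups is mechanical enough to sketch. In the example below, the function name is hypothetical, and the pairwise data are an illustrative encoding of what Fig. 7.22b reports (the white, cherry, coral, apricot, and buff mutations fail to complement one another; every other pair complements). The simple grouping shown assumes the data are internally consistent, as they are when each mutation affects a single gene.

```python
from itertools import combinations


def complementation_groups(mutations, fails_to_complement):
    """Place two mutations in the same group when they fail to complement.
    fails_to_complement is a set of frozensets, one per non-complementing pair."""
    groups = []
    for mutation in mutations:
        for group in groups:
            if any(frozenset((mutation, other)) in fails_to_complement for other in group):
                group.append(mutation)
                break
        else:
            # Complements every mutation seen so far: start a new group (a new gene).
            groups.append([mutation])
    return groups


mutations = ["white", "garnet", "ruby", "vermilion", "cherry",
             "coral", "apricot", "buff", "carnation"]

# Illustrative encoding of the "-" entries in the table: alleles of the white gene.
white_alleles = ["white", "cherry", "coral", "apricot", "buff"]
fails = {frozenset(pair) for pair in combinations(white_alleles, 2)}

groups = complementation_groups(mutations, fails)
for group in groups:
    print(group)
```

Each resulting group corresponds to one gene, so the number of groups is the number of genes identified by the set of mutants.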
In Drosophila, mutations in the w gene map very close together in the same region of the X chromosome, while mutations in other eye color genes lie elsewhere on the chromosome (Fig. 7.22c). This result suggests that genes are not disjointed entities with parts spread out from one end of a chromosome to another; each gene, in fact, occupies only a relatively small, discrete area of a chromosome. Studies defining genes at the molecular level have shown that most genes consist of 1000-20,000 contiguous base pairs (bp). In humans, among the shortest genes are the roughly 500 base pair-long genes that govern the production of histone proteins, while the longest gene so far identified is the Duchenne muscular dystrophy (DMD) gene, which has a length of more than 2 million nucleotide pairs. All known human genes fall somewhere between these extremes. To put these figures in perspective, an average human chromosome is approximately 130 million base pairs in length.
A Gene Is a Set of Nucleotide Pairs That Can Mutate Independently and Recombine with Each Other Although complementation testing makes it possible to distinguish mutations in different genes from mutations in the same gene, it does not clarify how the structure of a gene can accommodate different mutations and how these different mutations can alter phenotype in different ways. Does each mutation change the whole gene at a single stroke in a particular way, or does it change only a specific part of a gene, while other mutations alter other parts? In the late 1950s, the American geneticist Seymour Benzer used recombination analysis to show that two different mutations that did not complement each other and were therefore known to be in the same gene can in fact change different parts of that gene. He reasoned that if a gene is composed of separately mutable subunits, then it should be possible for recombination to occur within a gene, between these subunits. Therefore, crossovers between homologous chromosomes carrying different mutations known to be in the same gene could in theory generate a wild-type allele (Fig. 7.23).

Because mutations affecting a single gene are likely to lie very close together, it is necessary to examine a very large number of progeny to see even one crossover event between them. The resolution of the experimental system must thus be extremely high, allowing rapid detection of rare genetic events. For his experimental organism, Benzer chose bacteriophage T4, a virus that infects E. coli cells (Fig. 7.24a.1). Because each T4 phage that infects a bacterium generates 100-1000 progeny in less than an hour, Benzer could easily produce enough rare recombinants for his analysis (Fig. 7.24a.2). Moreover, by exploiting a peculiarity of certain T4 mutations, he devised conditions that allowed only recombinant phages, and not parental phages, to proliferate.

Figure 7.22 Complementation testing of Drosophila eye color mutations. (a) A heterozygote has one mutation (m1) on one chromosome and a different mutation (m2) on its homolog. If the mutations are in different genes, the heterozygote will be wild type; the mutations complement each other (left). If both mutations affect the same gene, the phenotype will be mutant; the mutations do not complement each other (right). Complementation testing makes sense only when both mutations are recessive to wild type. (b) This complementation table reveals five complementation groups (five different genes) for eye color. A "+" indicates mutant combinations with wild-type eye color; these mutations complement and are thus in different genes. Several mutations fail to complement (-) and are thus alleles of one gene, white. (c) Recombination mapping shows that mutations in different genes are often far apart, while different mutations in the same gene are very close together.
[Figure 7.22 diagram. (a) Complementation testing. Complementation: m1 and m2 are in different genes; m1/m2 has wild-type phenotype because one chromosome supplies gene G function, while the other supplies gene R function. No complementation: m1 and m2 are in the same gene; m1/m2 has mutant phenotype because the organism has no gene G function. (b) A complementation table: X-linked eye color mutations in Drosophila (white, garnet, ruby, vermilion, cherry, coral, apricot, buff, carnation), scored + or -. (c) Genetic map: X-linked eye color mutations in Drosophila, with positions 0, 1.5, 7.5, 33.0, 44.4, and 62.5 m.u.]
The experimental system: rII- mutations of bacteriophage T4 Even though bacteriophages are too small to be seen without the aid of an electron microscope, a simple technique makes it possible to detect their presence with the unaided eye (Fig. 7.24a.3). To do this, researchers mix a population of bacteriophage particles with a much larger number of bacteria in molten agar and then pour this mixture onto a petri plate that already contains a bottom layer of nutrient agar. Uninfected bacterial cells grow throughout the top layer, forming an opalescent lawn of living bacteria. However, if a single phage infects a single bacterial cell somewhere on this lawn, the cell produces and releases progeny viral particles that infect adjacent bacteria, which, in turn, produce and release yet more phage progeny. With each release of virus particles, the bacterial host cell dies. The agar in the top layer prevents the phage particles from diffusing very far. Thus, several cycles of phage infection, replication, and release produce a circular cleared area in the lawn, called a plaque, devoid of living bacterial cells. The process of mixing phages with bacteria to produce a lawn and plaques on a petri plate is called "plating" phages. Most plaques contain from 1 million to 10 million descendants of the single bacteriophage that originally infected a cell in that position on the petri plate.

Sequential dilution of phage-containing solutions makes it possible to measure the number of phages in a particular plaque and arrive at a countable number of viral particles (Fig. 7.24a.4). When Benzer first looked for genetic traits associated with bacteriophage T4, he found mutants that, when added to a lawn of E. coli B strain bacteria, produced larger plaques with sharper, more clearly rounded edges than those produced by the wild-type bacteriophages (Fig. 7.24b). Because these changes in plaque morphology result from the abnormally rapid lysis of the host bacteria, Benzer named the mutations r for "rapid lysis." Many r mutations map to a region of the T4 chromosome known as the rII region; these are called rII- mutations.

An additional property of rII- mutations makes them ideal for the genetic fine structure mapping (the mapping of mutations within a gene) undertaken by Benzer. Wild-type rII+ bacteriophages form plaques of normal shape and size on cells of both the E. coli B strain and a strain known as E. coli K(λ). The rII- mutants, however, have an altered host range; they cannot form plaques with E. coli K(λ) cells, although as we have seen, they produce large, unusually distinct plaques with E. coli B cells (Fig. 7.24b). The reason that rII- mutants are unable to infect cells of the K(λ) strain was not clear to Benzer, but this property allowed him to develop an extremely simple and effective test for rII+ gene function, as well as an ingenious way to detect rare intragenic (within the same gene) recombination events.

Figure 7.23 How recombination within a gene could generate a wild-type allele. Suppose a gene, indicated by the region between brackets, is composed of many sites that can mutate independently. Recombination between mutations m1 and m2 at different sites in the same gene produces a wild-type allele and a reciprocal allele containing both mutations.
[Figure 7.23 diagram. Columns: original chromosomes, recombination event, resultant chromosomes. Mutation 1 (m1) and mutation 2 (m2) lie at different sites within the gene; recombination between them yields a recombinant gene with two mutations and a recombinant wild-type gene.]

The rII region has two genes
Before he could check whether two mutations in the same gene could recombine, Benzer had to be sure he was really looking at two mutations in a single gene. To verify this, he performed customized complementation tests tailored to two significant characteristics of bacteriophage T4: They are monoploid (that is, each phage carries a single T4 chromosome, so the phages have one copy of each of their genes), and they can replicate only in a host bacterium. Because T4 phages are monoploid, Benzer needed to ensure that two different T4 chromosomes entered the same bacterial cell in order to test for complementation between the mutations. In his complementation tests, he simultaneously infected E. coli K(λ) cells with two types of T4 chromosomes (one carried one rII- mutation, the other carried a different rII- mutation) and then looked for cell lysis (Fig. 7.24c.1). To ensure that the two kinds of phages would infect almost every bacterial cell, he added many more phages of each type than there were bacteria. When tested by Benzer's method, if the two rII- mutations were in different genes, they would complement each other: Each of the mutant T4 chromosomes would supply one wild-type rII+ gene function, making up for the lack of that function in the other chromosome and resulting in lysis. On the other hand, if the two rII- mutations were in the same gene, they would fail to complement: No plaques would appear, because neither mutant chromosome would be able to supply the missing function. Tests of many different pairs of rII- mutations showed that they fall into two complementation groups: rIIA and rIIB. However, Benzer had to satisfy one final experimental requirement: For the complementation test to be meaningful, he had to make sure that pairs of rII- mutations that failed to complement were each recessive to wild type and also did not interact with each other to produce an rII- phenotype dominant to wild type. He checked these points by a control experiment in which he recombined pairs of rIIA- or rIIB- mutations onto the same chromosome (as described in the next section) and then simultaneously infected E. coli K(λ) with these double rII- mutants and with wild-type phages (Fig. 7.24c.2). If the mutations were recessive and did not interact with each other, the cells
FEATURE FIGURE 7.24 How Benzer Analyzed the rII Genes of Bacteriophage T4
[Panel a diagrams. (a.1) Bacteriophage T4 particle, with labels: viral chromosome, sheath, tail fibers. (a.2) Lytic cycle: 1. Phage injects its DNA into host cell. 2. Phage proteins synthesized; DNA replicated; host chromosome degraded. 3. Assembly of phages within host cell. 4. Lysis of host cell. (a.3) Plaques in a bacterial lawn. (a.4) Serial dilution: samples of 0.01 ml, 0.01 ml, 0.1 ml, and 0.1 ml are pipetted out in succession, starting from 1 ml of a concentrated solution of bacteriophages and passing through 1 ml tubes containing medium without phages; plating bacteria are added, and the plate shows 25 plaques.]
(a) Working with bacteriophage T4
1. Bacteriophage T4 (at a magnification of approximately 100,000×) and in an artist's rendering. The viral chromosome is contained within a protein head. Other proteinaceous parts of the phage particle include the tail fibers, which help the phage attach to host cells, and the sheath, a conduit for injecting the phage chromosome into the host cell.
2. The lytic cycle of bacteriophage T4. A single phage particle infects a host cell; the phage DNA replicates and directs the synthesis of viral protein components using the machinery of the host cell; the new DNA and protein components assemble into new bacteriophage particles. Eventual lysis of the host cell releases up to 1000 progeny bacteriophages into the environment.
3. Clear plaques of bacteriophages in a lawn of bacterial cells. A mixture of bacteriophages and a large number of bacteria in molten "top agar" is layered onto "bottom agar" previously poured into a petri plate. Uninfected bacterial cells grow, producing an opalescent lawn. A bacterium infected by a single bacteriophage will lyse and release progeny bacteriophages, which infect adjacent bacteria. Several cycles of infection result in a plaque, a circular cleared area containing millions of bacteriophages genetically identical to the one that originally infected the bacterial cell.
4. Counting bacteriophages by serial dilution. A small sample of a concentrated solution of bacteriophages is transferred to a test tube containing fresh medium, and a small sample of this dilution is transferred to another tube of fresh medium. Successive repeats of this process increase the degree of dilution. A sample of the final dilution, when mixed with bacteria and top agar and layered on the bottom agar of a petri plate, yields a countable number of plaques from which it is possible to extrapolate back and calculate the number of bacteriophage particles in the starting solution. The original 1 ml of solution in this illustration contained roughly 2.5 × 10^7 bacteriophages.
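The extrapolation from plaque count back to the starting solution is a product of dilution factors. The sketch below is an illustration under an assumption about the dilution series: two 100-fold dilutions, one 10-fold dilution, and plating of 0.1 ml from the final tube, which reproduces the figure's stated result of roughly 2.5 × 10^7 phages per ml from 25 plaques. The function name and parameter names are hypothetical.

```python
def phage_titer(plaques, plated_volume_ml, fold_dilutions):
    """Estimate phages per ml in the undiluted solution from one countable plate."""
    total_fold = 1
    for fold in fold_dilutions:
        total_fold *= fold
    return plaques * total_fold / plated_volume_ml


# Assumed series: 100-fold, 100-fold, then 10-fold; 0.1 ml plated; 25 plaques counted.
titer = phage_titer(plaques=25, plated_volume_ml=0.1, fold_dilutions=[100, 100, 10])
print(f"about {titer:.1e} phages per ml of the starting solution")
```

Treating each transfer as a round fold-dilution ignores the small volume added to each tube, which is why the text describes the result as approximate.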
(b) Phenotypic properties of rII- mutants of bacteriophage T4
1. rII- mutants, when plated on E. coli B cells, produce plaques that are larger and more distinct (with sharper edges) than plaques formed by rII+ wild-type phages.
2. rII- mutants are particularly useful for looking at rare recombination events because they have an altered host range. In contrast to rII+ wild-type phages, rII- mutants cannot form plaques in lawns of E. coli strain K(λ) host bacteria. This property enabled Benzer to select for rare rII+ phages that arose by recombination between two different rII- mutants: In a lysate with millions of rII- phages, even a single rII+ phage could be identified because only the rII+ phage would form a plaque on a lawn of E. coli K(λ).
[Panel b table, plaque phenotype by T4 genotype and E. coli strain. rII-: large, distinct plaques on B; no plaques on K(λ). rII+: small, fuzzy plaques on B; small, fuzzy plaques on K(λ).]
[Panel c diagrams. (c.1) Complementation test (trans configuration): mixed infection of E. coli K(λ) with rII- mutant 1 and rII- mutant 2. No complementation: no cell lysis, no phage progeny. Complementation (functional rIIA and functional rIIB are both supplied): cell lysis, phage progeny. (c.2) Control (cis configuration): mixed infection of E. coli K(λ) with an rII- double mutant (mutations 1 + 2) and rII+ phage. If mutations are recessive, cell lysis. If mutations are dominant, no cell lysis.]
(c) A customized complementation test between rII- mutants of bacteriophage T4
1. E. coli K(λ) cells are infected simultaneously with an excess of two different rII- mutants (m1 and m2). Inside the cell, the two mutations will be in trans; that is, they lie on different chromosomes. If the two mutations are in the same gene, they will affect the same function and cannot complement each other, so no progeny phages will be produced. If the two mutations are in different genes (rIIA and rIIB), they will complement each other, leading to progeny phage production and cell lysis.
2. An important control for this complementation test is the simultaneous infection of E. coli K(λ) bacteria with a wild-type T4 strain and a T4 strain containing m1 and m2 that failed to complement, and are thus in the same gene. Inside the infected cells, the two mutations will be in cis; that is, they lie on the same chromosome. Release of phage progeny shows that both mutations are recessive to wild type and that there is no interaction between the mutations that prevents the cells from producing progeny phages. Complementation tests are meaningful only if the two mutations tested are both recessive to wild type.
)
Recombination test
rllAl
rllA2
a
rlla1
fi
$
rttA2
E. coliB -
Recombination
(d.2) Control
I I
rllA'f 1114' t'
a-
&t
r//A7S E. coli B
n,o,
E. coliB
/\ I
rll+
progeny wild
:
rllAl
I
+ rllA2
1t
type double mutant
Forms plaques No plaques on E. coll on E. coliK()v)
K(i")
t
I
I
+
rllAt
rllAzl No plaques on E. coliK(lt) -
(d) Detecting recombination between two mutations in the same gene. 1. E. coli B cells are infected with a large excess of two different rIIA- mutants (rIIA1 and rIIA2). If no recombination between the two rIIA- mutations takes place, progeny phages will carry either of the original mutations and will be phenotypically rII-. If recombination between the two mutations occurs, one of the products will be an rII+ recombinant, while the reciprocal product will be a double-mutant chromosome containing both rIIA1 and rIIA2. When the phage progeny subsequently infect E. coli K(λ) bacteria, only rII+ recombinants will be able to form plaques. 2. As a control, E. coli B cells are infected with a large amount of only one kind of mutant (rIIA1 or rIIA2). The only rII+ phages that can result are revertants of either mutation. This control experiment shows that such revertants are extremely rare and can be ignored among the rII+ progeny made in the recombination experiment at the left. Even if the two rIIA- mutations are in adjacent base pairs, the number of rII+ recombinants obtained is more than 100 times higher than the number of rII+ revertants the cells infected by a single mutant could produce.
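The logic of the complementation test in part (c) can be sketched in a few lines of Python. This is only an illustration, not Benzer's procedure: the mutant names and the gene assignments in `GENE_OF` are hypothetical, and the sketch assumes that every mutation is recessive to wild type, which is what the cis control is meant to confirm.

```python
from itertools import combinations

# Hypothetical assignments of rII- mutations to genes (illustration only).
GENE_OF = {"m1": "rIIA", "m2": "rIIA", "m3": "rIIB", "m4": "rIIB"}

def complements(mut_a, mut_b, gene_of=GENE_OF):
    """Predict the trans-test outcome for a mixed infection of E. coli K(lambda).

    Two recessive mutations complement (cells lyse, progeny appear) only
    when they lie in different genes, so that each phage supplies the
    function the other lacks.
    """
    return gene_of[mut_a] != gene_of[mut_b]

def complementation_groups(mutants, test=complements):
    """Group mutants so that members of a group fail to complement each other."""
    groups = []
    for m in mutants:
        for group in groups:
            if not test(m, group[0]):   # no lysis in trans: same group
                group.append(m)
                break
        else:
            groups.append([m])          # complements every group seen so far
    return groups

for a, b in combinations(sorted(GENE_OF), 2):
    outcome = "lysis" if complements(a, b) else "no lysis"
    print(f"{a} x {b}: {outcome}")
print(complementation_groups(sorted(GENE_OF)))
```

In a real experiment the `test` function would be replaced by the observed plate result for each pair; the grouping step then recovers the complementation groups without knowing the gene assignments in advance.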
7.3 What Mutations Tell Us About Gene Structure
would lyse, in which case the complementation test would be interpretable. The significant distinction between the actual complementation test and the control experiment is in the placement of the two rII- mutations. In the complementation test, one rII- mutation is on one chromosome, while the other rII- mutation is on the other chromosome (Fig. 7.24c.1); two mutations arranged in this way are said to be in the trans configuration. In the control experiment (Fig. 7.24c.2), the two mutations are on the same chromosome, in the so-called cis configuration. The complete test, including the complementation test and the control experiment, is known as a cis-trans test. In the complete experiment, two mutations that do not produce lysis in trans but do so when in cis are in the same complementation group. Benzer called any complementation group identified by the cis-trans test a cistron, and some geneticists still use the term "cistron" as a synonym for "gene." With the knowledge that the rII locus consists of two genes (rIIA and rIIB), Benzer could look for two mutations in the same gene and then see if they ever recombine to produce wild-type progeny.

Recombination between different mutations in a single gene
When Benzer infected E. coli B strain bacteria with a mixture of phages carrying different mutations in the same gene (rIIA1 and rIIA2, for example), he did observe the appearance of rare rII+ progeny (Fig. 7.24d). He knew these wild-type progeny resulted from recombination and not from reverse mutations because the frequencies of the rII+ phage particles he observed, even if rare, were much higher than the frequencies of rII+ revertants seen among progeny produced by infecting B strain bacteria with either mutant alone (Fig. 7.24d).
These experiments were possible only because Benzer devised a selection for rare rII+ recombinants. In a selection, conditions are such that the only survivors are the rare individuals you seek to identify. Benzer's selection condition for identifying rare rII+ recombinant progeny was plating for plaques on E. coli K(λ). Benzer could assay a phage lysate containing tens of thousands of phage progeny on a single petri plate containing a lawn of E. coli K(λ). Because none of the rII- phage in the lysate could form plaques, even a single rII+ recombinant among them could be identified as a plaque.
On the basis of his observations with the rII genes, Benzer drew three conclusions about gene structure and function: (1) A gene consists of different parts that can each mutate; (2) recombination can occur between different mutable sites in the same gene; and (3) a gene performs its normal function only if all of its components are wild type. From what we now know about the molecular structure of DNA, this all makes perfect sense.

A Gene Is a Discrete Linear Set of Nucleotide Pairs
How are the multiple nucleotide pairs that make up a gene arranged: in a continuous row or dispersed in precise patterns around the genome? And do the various mutations that affect gene function alter many different nucleotides, or only a small subset within each gene?

Using deletions to map mutations approximately
To answer these questions about the arrangement of nucleotides in a gene, Benzer eventually obtained thousands of spontaneous and mutagen-induced rII- mutations that he needed to map with respect to each other. To map the location of a thousand mutants through comparisons of all possible two-point crosses, Benzer would have had to set up a million (10^3 × 10^3) matings. But by taking advantage of deletion mutations, he could obtain the same information with far fewer crosses.
Deletions are mutations that remove contiguous nucleotide pairs along a DNA molecule. In crosses between bacteriophages carrying a mutation and bacteriophages carrying deletions of the corresponding region, no wild-type recombinant progeny can arise, because neither chromosome carries the proper information at the location of the mutation. However, if the mutation lies outside the region deleted from the homologous chromosome, wild-type progeny can appear (Fig. 7.25a). This is true whether the mutation is a point mutation affecting one or a few nucleotides, or is itself a large deletion. Crosses between any uncharacterized mutation and a known deletion thus immediately reveal whether the mutation resides in the region deleted from the other phage chromosome, providing a rapid way to find the general location of a mutation.
Using a series of overlapping deletions, Benzer divided the rII region into a series of intervals. He could then assign any point mutation to an interval by observing whether it recombined to give rII+ progeny when crossed with the series of deletions (Fig. 7.25b). Benzer mapped 1612 spontaneous point mutations and several deletions in the rII locus of bacteriophage T4 through recombination analysis. He first used recombination to determine the relationship between the deletions. He then found the approximate location of individual point mutations by observing which deletions could recombine with each point mutant to yield wild-type progeny.

Determining RF between rII- mutations for precise mapping
Benzer next performed recombination tests to measure the genetic distance between pairs of point mutations he had found by deletion mapping to lie in the same small region of the chromosome. The distance between any two
232
Chapter
7
Anatomy and Function of a Gene: Dissection Through Mutation
Figure 7.25 Fine structure mapping of the bacteriophage T4 rII genes.
(a) A phage cross between a point mutation and a deletion removing the DNA at the position of the mutation cannot yield wild-type recombinants. The same is true if two different deletion mutations overlap each other. (b) Large deletions divide the rII locus into regions; finer deletions divide each region into subsections. Point mutations, such as 271 (in red at bottom), map to region 3 if they do not recombine with deletions PT1, PB242, or A105 but do recombine with deletion 638 (top). Point mutations can be mapped to subsections of region 3 using other deletions (middle). Recombination tests map point mutations in the same subregion (bottom). Point mutations 201 and 155 cannot recombine to yield wild-type recombinants because they affect the same nucleotide pair. (c) Benzer's fine structure map. Hotspots are locations with many independent mutations that cannot recombine with each other.
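The reasoning behind deletion mapping in part (b) can be written out as a short Python sketch. Each deletion is modeled simply as the set of map regions it removes; the nested extents given in `DELETIONS` are an assumption chosen to be consistent with the example of point mutation 271 in the caption, not Benzer's measured endpoints, and the function name is likewise illustrative.

```python
# Hypothetical extents: each deletion is modeled as the set of map regions it
# removes. The nesting is assumed for illustration only.
DELETIONS = {
    "PT1":   {1, 2, 3, 4},
    "PB242": {2, 3, 4},
    "A105":  {3, 4},
    "638":   {4},
}

def candidate_regions(recombines, deletions=DELETIONS):
    """Return the regions consistent with a set of deletion crosses.

    recombines maps deletion name -> True if the cross yielded rII+ progeny.
    A point mutation inside a deleted region cannot recombine with that
    deletion; a point mutation outside the deleted region can.
    """
    all_regions = set().union(*deletions.values())
    return sorted(
        region for region in all_regions
        if all((region not in deletions[name]) == gave_wild_type
               for name, gave_wild_type in recombines.items())
    )

# Crosses reported for point mutation 271 in the caption above.
crosses_271 = {"PT1": False, "PB242": False, "A105": False, "638": True}
print(candidate_regions(crosses_271))  # [3]
```

With four nested deletions, four crosses are enough to place a new mutation in one of four regions, which is why this approach needs far fewer crosses than testing every pair of point mutations against each other.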
[Figure 7.25, panel labels. (a) Using deletions for rapid mapping: a point mutation within the deletion limits, or two overlapping deletions, cannot produce wild-type progeny by recombination; a point mutation outside the deletion limits, or two nonoverlapping deletions, can produce wild-type progeny by recombination. (b) Portion of the rIIA deletion map at increasing resolutions: regions, subsections. (c) Fine structure of the rII region: each box represents an independent occurrence of a mutation at this site; a site with many boxes is a "hotspot."]
Figure 7.27 Experimental support for the "one gene, one enzyme" hypothesis. (a) Beadle and Tatum mated an X-ray-mutagenized strain of Neurospora with another strain, and they isolated haploid ascospores that grew on complete medium. Cultures that failed to grow on minimal medium were nutritional mutants. Nutritional mutants that could grow on minimal medium plus arginine were Arg- auxotrophs. (b) The ability of wild-type and mutant strains to grow on minimal medium supplemented with intermediates in the arginine pathway. (c) Each of the four ARG genes encodes an enzyme needed to convert one intermediate to the next in the pathway.
[Figure 7.27, panel labels. Minimal medium: no growth = nutritional mutant. 4. Conidia from cultures that fail to grow on minimal medium are tested on minimal medium supplemented with individual amino acids. Addition of arginine restores growth, reveals arginine auxotroph. Pathway labels: citrulline, argininosuccinate, arginine, carbamyl phosphate, aspartate.]
then, each gene controls the synthesis or activity of an enzyme, or as stated by Beadle and Tatum: one gene, one enzyme. Of course, the gene and the enzyme are not the same thing; rather, the sequence of nucleotides in a gene contains information that somehow encodes the structure of an enzyme molecule.
Although the analysis of the arginine pathway studied by Beadle and Tatum was straightforward, studies of biochemical pathways are not always so easy to interpret. Some biochemical pathways are not linear progressions of stepwise reactions. For example, a branching pathway occurs if different enzymes act on the same intermediate to convert it into two different end products. If the cell requires both of these end products for growth, a mutation in a gene encoding any of the enzymes required to synthesize the intermediate would make the cell dependent on supplementation with both end products. A second possibility is that a cell might employ either of two independent, parallel pathways to synthesize a needed end product. In such a case, a mutation in a gene encoding an enzyme in one of the pathways would be without effect. Only a cell with mutations affecting both pathways would display an aberrant phenotype. Even with nonlinear progressions such as these, careful genetic analysis can reveal the nature of the biochemical pathway on the basis of Beadle and Tatum's insight that genes encode proteins.
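For a simple linear pathway, the growth pattern of each auxotroph follows directly from the position of the blocked step. The Python sketch below illustrates that inference under stated assumptions: the order of intermediates is taken to be the usual route to arginine, and the gene labels `gene1` through `gene4` are placeholders rather than the names used for the ARG genes.

```python
# Linear pathway assumed for illustration; gene labels are placeholders.
PATHWAY = ["N-acetylornithine", "ornithine", "citrulline",
           "argininosuccinate", "arginine"]
# STEP_GENES[i] encodes the enzyme converting PATHWAY[i] into PATHWAY[i + 1].
STEP_GENES = ["gene1", "gene2", "gene3", "gene4"]

def grows(mutant_gene, supplement=None):
    """Predict growth on minimal medium plus an optional supplement.

    A mutant grows only if the supplement enters the pathway after the
    blocked step, so that every remaining conversion to arginine still works.
    A supplement of None means unsupplemented minimal medium.
    """
    if mutant_gene is None:          # wild type makes arginine unaided
        return True
    if supplement is None:           # auxotrophs fail on minimal medium alone
        return False
    blocked_step = STEP_GENES.index(mutant_gene)
    return PATHWAY.index(supplement) > blocked_step

# One row per mutant: minimal medium alone, then each supplement in order.
for gene in STEP_GENES:
    row = ["+" if grows(gene, s) else "-" for s in [None] + PATHWAY[1:]]
    print(gene, " ".join(row))
```

Read in the other direction, the same table lets a geneticist order the steps: the later the block, the fewer intermediates can rescue growth. Branching or parallel pathways, as described above, would not fit this simple index comparison.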
Genes Specify the Identity and Order of Amino Acids in Polypeptide Chains
Although the one gene, one enzyme hypothesis was a critical advance in understanding how genes influence phenotype, it is an oversimplification. Not all genes govern the construction of enzymes active in biochemical pathways. Enzymes are only one class of the molecules known as proteins, and cells contain many other kinds of proteins. Among the other types are proteins that provide shape and rigidity to a cell, proteins that transport molecules in and out of cells, proteins that help fold DNA into chromosomes, and proteins that act as hormonal messengers. Genes direct the synthesis of all proteins, enzymes and nonenzymes alike. Moreover, as we see next, genes actually determine the construction of polypeptides, and because some proteins are composed of more than one type of polypeptide, more than one gene determines the construction of such proteins.
Proteins: Linear polymers of amino acids linked by peptide bonds
To review the basics, proteins are polymers composed of building blocks known as amino acids. Cells use mainly 20 different amino acids to synthesize the proteins they need. All of these amino acids have certain basic features, encapsulated by the formula NH2-CHR-COOH (Fig. 7.28a). The -COOH component, also known as carboxylic acid, is, as the name implies, acidic; the -NH2 component, also known as an amino group, is basic. The R refers to side chains that distinguish each of the amino acids (Fig. 7.28b). An R group can be as simple as a hydrogen atom (in the amino acid glycine) or as complex as a benzene ring (in phenylalanine). Some side chains are relatively neutral and nonreactive, others are acidic, and still others are basic.
In addition to the 20 common amino acids, two rare ones can be incorporated into proteins in specific circumstances (Fig. 7.28c). A very few proteins (only 25 in humans) are known to contain selenocysteine. Pyrrolysine is present only in the proteins of certain prokaryotic organisms.
During protein synthesis, a cell's protein-building machinery links amino acids by constructing covalent peptide bonds that join the -COOH group of one amino acid to the -NH2 group of the next (Fig. 7.28d). A pair of amino acids connected in this fashion is a dipeptide; several amino acids linked together constitute an oligopeptide. The amino acid chains that make up proteins contain hundreds to thousands of amino acids joined by peptide bonds and are known as polypeptides. Proteins are thus linear polymers of amino acids. Like the chains of nucleotides in DNA, polypeptides have a chemical polarity. One end of a polypeptide is called the N terminus because it contains a free amino group that is not connected to any other amino acid. The other end of the polypeptide chain is the C terminus, because it contains a free carboxylic acid group.
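The bookkeeping implied by this description is simple enough to state as code. The following Python sketch is an illustration only (the function name and the returned field names are invented for this example): for a chain written from N terminus to C terminus, it reports the two ends and counts the peptide bonds, each of which releases one molecule of water when it forms.

```python
def describe_chain(residues):
    """Summarize a linear chain of amino acids written N terminus first."""
    if not residues:
        raise ValueError("a chain needs at least one amino acid")
    bonds = len(residues) - 1        # one peptide bond joins each adjacent pair
    return {
        "n_terminus": residues[0],   # residue with the free amino group
        "c_terminus": residues[-1],  # residue with the free carboxylic acid group
        "peptide_bonds": bonds,
        "waters_released": bonds,    # one H2O is lost per bond formed
    }

print(describe_chain(["Ala", "Gly", "Ser"]))
```

Because the chain has polarity, reversing the list describes a different molecule: the same residues, but with the opposite ends free.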
Mutations can alter amino acid sequences
Each protein is composed of a unique sequence of amino acids. The chemical properties that enable structural proteins to give a cell its shape, or enzymes to catalyze specific reactions, are a direct consequence of the identity, number, and linear order of amino acids in the protein.
If genes encode proteins, then at least some mutations could be changes in a gene that alter the proper sequence of amino acids in the protein encoded by that gene. In the mid-1950s, Vernon Ingram began to establish what kinds of changes particular mutations cause in the corresponding protein. Using techniques that had just been developed for determining the sequence of amino acids in a protein, he compared the amino acid sequence of the normal adult form of hemoglobin (HbA) with that of hemoglobin in the bloodstream of people homozygous for the mutation that causes sickle-cell anemia (HbS). Remarkably, he found only a single amino acid difference between the wild-type and mutant proteins (Fig. 7.29a, p. 238). Hemoglobin consists of two types of polypeptides: a so-called α (alpha) chain and a β (beta) chain. The sixth amino acid from the N terminus of the β chain was glutamic acid in normal individuals but valine in sickle-cell patients. Ingram thus established that a mutation substituting one amino acid for another had the power to change the structure and function of hemoglobin and thereby alter the phenotype from normal to sickle-cell anemia (Fig. 7.29b).
We now know that the glutamic acid-to-valine change affects the solubility of hemoglobin within the red blood cell. At low concentrations of oxygen, the less soluble sickle-cell form of hemoglobin aggregates into long chains that deform the red blood cell (Fig. 7.29a).
Because people suffering from a variety of inherited anemias also have defective hemoglobin molecules, Ingram and other geneticists were able to determine how a large number of different mutations affect the amino acid sequence of hemoglobin (Fig. 7.29c). Most of the altered hemoglobins have a change in only one amino acid. In various patients with anemia, the alteration is generally in different amino acids, but occasionally, two independent mutations result in different substitutions for the same amino acid. Geneticists use the term missense mutation to describe a genetic alteration that causes the substitution of one amino acid for another.
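The comparison Ingram made can be mimicked on a small scale by lining up two sequences and reporting the positions at which they differ. The Python sketch below is a simplified illustration: it uses only the first seven residues of the β chain, and residues 4 and 5 (Thr, Pro) are taken from the published β-globin sequence rather than from this text. The function name is invented for the example.

```python
# First seven residues of the beta chain, N terminus first. Residues 4 and 5
# (Thr, Pro) come from the published beta-globin sequence, not from this text.
HBA = ["Val", "His", "Leu", "Thr", "Pro", "Glu", "Glu"]
HBS = ["Val", "His", "Leu", "Thr", "Pro", "Val", "Glu"]

def substitutions(reference, variant):
    """List (position, reference residue, variant residue), counting from 1."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    return [(i, ref, var)
            for i, (ref, var) in enumerate(zip(reference, variant), start=1)
            if ref != var]

print(substitutions(HBA, HBS))  # [(6, 'Glu', 'Val')]
```

A missense mutation shows up as exactly one entry in the returned list; an unchanged protein gives an empty list.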
A Protein's Amino Acid Sequence Dictates Its Three-Dimensional Structure
Despite the uniform nature of protein construction (a string of amino acids joined by peptide bonds), each polypeptide folds into a unique three-dimensional shape. Biochemists often distinguish between four levels of protein structure: primary, secondary, tertiary, and quaternary. The first three of these apply to any one polypeptide chain, while the quaternary level describes associations between multiple polypeptides within a protein complex.
Primary, secondary, and tertiary protein structures
The linear sequence of amino acids within a polypeptide is its primary structure. Each unique primary structure places constraints on how a chain can arrange itself in three-dimensional space. Because the R groups distinguishing
7.4 What Mutations Tell Us About Gene Function
Figure 7.28 Proteins are chains of amino acids linked by peptide bonds. (a) Amino acids contain a basic amino group (-NH2), an acidic carboxylic acid group (-COOH), and a CHR moiety, where R stands for one of the 22 different side chains. (b) The 20 amino acids commonly found in proteins, arranged according to the properties of their R groups. (c) Selenocysteine and pyrrolysine are amino acids that are found only in a few proteins or in specific organisms. (d) One molecule of water is lost when a covalent amide linkage (a peptide bond) is formed between the -COOH of one amino acid and the -NH2 of the next amino acid. Polypeptides such as the tripeptide shown here have polarity; they extend from an N terminus (with a free amino group) to a C terminus (with a free carboxylic acid group).
[Figure 7.28, panel labels. (a) Generic amino acid structure: amino (-NH2) group, carboxyl (-COOH) group, R group, backbone. (b) Amino acids with nonpolar R groups: Glycine (Gly) (G), Alanine (Ala) (A), Valine (Val) (V), Leucine (Leu) (L), Isoleucine (Ile) (I), Proline (Pro) (P), Phenylalanine (Phe) (F), Tryptophan (Trp) (W), Methionine (Met) (M). Amino acids with uncharged polar R groups: Serine (Ser) (S), Threonine (Thr) (T), Asparagine (Asn) (N), Tyrosine (Tyr) (Y), Cysteine (Cys) (C), Glutamine (Gln) (Q). Amino acids with basic R groups: Lysine (Lys) (K), Histidine (His) (H), Arginine (Arg) (R). Amino acids with acidic R groups: Aspartic acid (Asp) (D), Glutamic acid (Glu) (E). (c) Rare amino acids: Selenocysteine (Sec) (U), Pyrrolysine (Pyl) (O). (d) Peptide bond formation: AA1, AA2, AA3 joined by peptide bonds, N terminus to C terminus.]
Figure 7.29 The molecular basis of sickle-cell and other anemias. (a) Substitution of glutamic acid by valine at the sixth amino acid from the N terminus affects the three-dimensional structure of the β chain of hemoglobin. Hemoglobins incorporating the mutant β chain form aggregates that cause red blood cells to sickle. (b) Red blood cell sickling has many phenotypic effects. (c) Other mutations in the β chain gene also cause anemias.
[Figure 7.29, panel labels. (a) From mutation to phenotype: normal individual and sickle-cell individual. 1. The polypeptide: the β chain of hemoglobin. 3. Red blood cell making thousands of hemoglobin molecules; sickle-shaped cells. (b) Sickle-cell anemia is pleiotropic: enlargement and damage to spleen; fatigue, heart damage, overactivity of bone marrow; damage to heart, kidney, muscle/joints, brain, lung, gastrointestinal tract. (c) β chain substitutions/variants.]
the 22 amino acids have dissimilar chemical properties, some amino acids form hydrogen bonds or electrostatic bonds when brought into proximity with other amino acids. Nonpolar amino acids, for example, may become associated with each other by interactions that "hide" them from water in localized hydrophobic regions. As another example, two cysteine amino acids can form covalent disulfide bridges (-S-S-) through the oxidation of their -SH groups. All of these interactions (Fig. 7.30a) help stabilize the polypeptide in a specific three-dimensional conformation.
The primary structure (Fig. 7.30b) determines three-dimensional shape by generating secondary structure: localized regions with a characteristic geometry (Fig. 7.30c). Primary structure is also responsible for other folds and twists that together with the secondary structure produce the ultimate three-dimensional tertiary structure of the entire polypeptide (Fig. 7.30d). Normal tertiary structure, the way a long chain of amino acids naturally folds in three-dimensional space under physiological conditions, is known as a polypeptide's native configuration. Various forces, including hydrogen bonds, electrostatic bonds,
[Figure 7.29, remaining panel labels. (a) 2. The protein: hemoglobin (made of two α and two β chains). Glutamic acid (normal), valine (sickle-cell); disk-shaped cells; long fibers. (b) Sickling of red blood cells; rapid destruction of sickle cells; anemia; clumping of cells, interference with circulation; local failures in blood supply; accumulation of red blood cells in spleen.]
Amino acid position   1    2    3    ...  6    7    ...  26   ...  63   ...  67   ...  121  ...  146
HbS                   Val  His  Leu       Val  Glu       Glu       His       Val       Glu       His
HbC                   Val  His  Leu       Lys  Glu       Glu       His       Val       Glu       His
HbG San Jose          Val  His  Leu       Glu  Gly       Glu       His       Val       Glu       His
HbE                   Val  His  Leu       Glu  Glu       Lys       His       Val       Glu       His
HbM Saskatoon         Val  His  Leu       Glu  Glu       Glu       Tyr       Val       Glu       His
Hb Zurich             Val  His  Leu       Glu  Glu       Glu       Arg       Val       Glu       His
HbM Milwaukee         Val  His  Leu       Glu  Glu       Glu       His       Glu       Glu       His
HbDβ Punjab           Val  His  Leu       Glu  Glu       Glu       His       Val       Gln       His
hydrophobic interactions, and disulfide bridges help stabilize the native configuration.
It is worth repeating that primary structure (the sequence of amino acids in a polypeptide) directly determines secondary and tertiary structures. The information required for the chain to fold into its native configuration is inherent in its linear sequence of amino acids. In one example of this principle, many proteins unfold, or become denatured, when exposed to urea and mercaptoethanol or to increasing heat or pH. These treatments disrupt the interactions that normally stabilize the secondary and tertiary structures. When conditions return to normal, many proteins spontaneously refold into their native configuration without help from other agents. No other information beyond the primary structure is needed to achieve the proper three-dimensional shape of such proteins.
Quaternary structure: Multimeric proteins
Certain proteins, such as the rhodopsin that promotes black-and-white vision, consist of a single polypeptide. Many others, however, such as the lens crystallin protein,
Figure 7.30 Levels of polypeptide structure. (a) Covalent and noncovalent interactions determine the structure of a polypeptide. (b) A polypeptide's primary (1°) structure is its amino acid sequence. (c) Localized regions form secondary (2°) structures such as α helices and β-pleated sheets. (d) The tertiary (3°) structure is the complete three-dimensional arrangement of a polypeptide. In this portrait of myoglobin, the iron-containing heme group, which carries oxygen, is red, while the polypeptide itself is green.
[Figure 7.30, panel labels. (a) Interactions determining polypeptide structure: covalent (peptide, disulfide); noncovalent (hydrogen, nonpolar, ionic). (b) 1° structure: N terminus to C terminus. (c) 2° structures: α helix, β-pleated sheets. (d) 3° structure.]
which provides rigidity and transparency to the lenses of our eyes, or the hemoglobin molecule described earlier, are composed of two or more polypeptide chains that associate in a specific way (Fig. 7.31a and b). The individual polypeptides in an aggregate are known as subunits, and the complex of subunits is often referred to as a multimer. The three-dimensional configuration of subunits in a multimer is a complex protein's quaternary structure.
The same forces that stabilize the native form of a polypeptide (that is, hydrogen bonds, electrostatic bonds, hydrophobic interactions, and disulfide bridges) also contribute to the maintenance of quaternary structure. As Fig. 7.31a shows, in some multimers, the two or more interacting subunits are identical polypeptides. These identical chains are encoded by one gene. In other multimers, by contrast, more than one kind of polypeptide makes up the protein (Fig. 7.31b). The different polypeptides in these multimers are encoded by different genes.
Alterations in just one kind of subunit, caused by a mutation in a single gene, can affect the function of a multimer. The adult hemoglobin molecule, for example, consists of two α and two β subunits, with each type of subunit determined by a different gene: one for the α chain and one for the β chain. A mutation in the Hbβ gene resulting in an amino acid switch at position 6 in the β chain causes sickle-cell anemia.
Similarly, if several multimeric proteins share a common subunit, a single mutation in the gene encoding that subunit may affect all the proteins simultaneously. An example is an X-linked mutation in mice and humans that incapacitates several different proteins all known as interleukin (IL) receptors. Because all of these receptors are essential to the normal function of immune system cells that fight infection and generate immunity, this one mutation causes the life-threatening condition known as X-linked severe combined immune deficiency (XSCID; Fig. 7.31c).
The polypeptides of complex proteins can assemble into extremely large structures capable of changing with the needs of the cell. For example, the microtubules that make up the spindle during mitosis are gigantic assemblages of mainly two polypeptides: α-tubulin and β-tubulin (Fig. 7.31d). The cell can organize these subunits into very long hollow tubes that grow or shrink as needed at different stages of the cell cycle.
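The consequence of sharing a subunit among several multimers can be made concrete with a small lookup. In the Python sketch below, the subunit labels other than the shared γ chain, α, and β are placeholders invented for illustration, and each kind of subunit is assumed to be the product of its own gene.

```python
# Subunit kinds that make up each protein. "gamma" is the chain shared by the
# interleukin receptors; the receptor-specific labels are placeholders.
PROTEINS = {
    "IL-2 receptor": {"gamma", "il2_specific"},
    "IL-4 receptor": {"gamma", "il4_specific"},
    "IL-7 receptor": {"gamma", "il7_specific"},
    "hemoglobin":    {"alpha", "beta"},
}

def affected_proteins(mutant_subunit, proteins=PROTEINS):
    """Return every protein that contains the mutant kind of subunit."""
    return sorted(name for name, subunits in proteins.items()
                  if mutant_subunit in subunits)

print(affected_proteins("gamma"))  # all three interleukin receptors
print(affected_proteins("beta"))   # hemoglobin only
```

A mutation in the gene for a receptor-specific polypeptide would disable one receptor, whereas a mutation in the gene for the shared chain disables all three at once, which is the situation in XSCID.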
Figure 7.31 Multimeric proteins. (a) β2 lens crystallin contains two copies of one kind of subunit; the two subunits are the product of a single gene. The peptide backbones of the two subunits are shown in different shades of purple. (b) Hemoglobin is composed of two different kinds of subunits, each encoded by a different gene. (c) Three distinct protein receptors for the immune-system molecules called interleukins (ILs; purple). All contain a common gamma (γ) chain (yellow), plus other receptor-specific polypeptides (green). A mutant γ chain blocks the function of all three receptors, leading to XSCID. (d) One α-tubulin (red) and one β-tubulin (blue) polypeptide associate to form a tubulin dimer. Many tubulin dimers form a single microtubule. The mitotic spindle is an assembly of many microtubules.
[Figure 7.31, panel labels. (a) A multimer with identical subunits: β2 lens crystallin, two identical subunits, β2 lens crystallin gene. (b) A multimer with nonidentical subunits: hemoglobin, two α subunits (Hbα gene), two β subunits (Hbβ gene). (c) One polypeptide in different proteins: IL-2 receptor, IL-4 receptor, IL-7 receptor; XSCID. (d) Microtubules, large assemblies of subunits: tubulin dimer; microtubule; assembly of microtubules at mitotic metaphase, with chromosomes aligned on the spindle apparatus; disassembly of microtubules at mitotic telophase, when the spindle apparatus breaks down.]
One gene, one polypeptide
Because more than one gene governs the production of some multimeric proteins and because not all proteins are enzymes, the "one gene, one enzyme" hypothesis is not broad enough to define gene function. A more accurate statement is "one gene, one polypeptide": Each gene governs the construction of a particular polypeptide. As you will see in Chapter 8, even this reformulation does not encompass the function of all genes, as some genes in all organisms do not determine the construction of proteins; instead, they encode RNAs that are not translated into polypeptides.
Knowledge about the connection between genes and polypeptides enabled geneticists to analyze how different mutations in a single gene can produce different phenotypes. If each amino acid has a specific effect on the three-dimensional structure of a protein, then changing amino acids at different positions in a polypeptide chain can alter protein function in different ways. For example, most enzymes have an active site that carries out the enzymatic task, while other parts of the protein support the shape and position of that site. Mutations that change the identity of amino acids at the active site may have more serious consequences than those affecting amino acids outside the active site. Some kinds of amino acid substitutions, such as replacement of an amino acid having a basic side chain with an amino acid having an acidic side chain, would be more likely to compromise protein function than would substitutions that retain the chemical characteristics of the original amino acid.
Some mutations do not affect the amino acid composition of a protein but still generate an abnormal phenotype. As discussed in Chapter 8, such mutations change the amount of normal polypeptide produced by disrupting the biochemical processes responsible for decoding a gene into a polypeptide.
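One rough way to ask whether a substitution retains the chemical characteristics of the original amino acid is to compare the side-chain classes of the two residues. The Python sketch below uses the four groupings of the 20 common amino acids shown in Fig. 7.28b; treating class membership as the sole criterion is a simplifying assumption, and the function name is invented for this example.

```python
# Side-chain classes for the 20 common amino acids, following the groupings
# in Fig. 7.28b. Class membership is only a crude proxy for chemical similarity.
_CLASSES = {
    "nonpolar": "Gly Ala Val Leu Ile Pro Phe Trp Met",
    "uncharged polar": "Ser Thr Asn Tyr Cys Gln",
    "basic": "Lys His Arg",
    "acidic": "Asp Glu",
}
SIDE_CHAIN_CLASS = {name: cls
                    for cls, names in _CLASSES.items()
                    for name in names.split()}

def is_conservative(original, replacement):
    """True if both residues fall in the same side-chain class."""
    return SIDE_CHAIN_CLASS[original] == SIDE_CHAIN_CLASS[replacement]

for ref, var in [("Asp", "Glu"), ("Glu", "Val"), ("Glu", "Lys")]:
    kind = "conservative" if is_conservative(ref, var) else "nonconservative"
    print(f"{ref} -> {var}: {kind}")
```

By this crude measure, the glutamic acid-to-valine change in HbS and the glutamic acid-to-lysine change in HbC are both nonconservative. The measure says nothing about where in the protein the change falls, which, as noted above, also matters.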
essential concepts
Most genes specify the linear sequence of amino acids in a polypeptide; this sequence determines the polypeptide's three-dimensional structure and thus its function.
A missense mutation changes the identity of a single amino acid in a polypeptide.
Multimeric proteins include two or more polypeptides (subunits). If these subunits are different, they must be encoded by different genes.
7.5 A Comprehensive Example: Mutations That Affect Vision
learning objectives
1. Describe the functions of the four photoreceptor proteins in human vision.
2. Outline how the genes encoding the photoreceptors evolved through duplication and divergence of an ancestral gene.
3. Explain how mutations in the photoreceptor genes result in different vision defects.
Researchers first described anomalies of color perception in humans close to 200 years ago. Since that time, they have discovered a large number of mutations that modify human vision. By examining the phenotype associated with each mutation and then looking directly at the DNA alterations inherited with the mutation, they have learned a great deal about the genes influencing human visual perception and the function of the proteins they encode.
Using human subjects for vision studies has several advantages. First, people can recognize and describe variations in the way they see, from trivial differences in what the color red looks like, to not seeing any difference between red and green, to not seeing any color at all. Second, the highly developed science of psychophysics provides sensitive, noninvasive tests for accurately defining and comparing phenotypes. Finally, because inherited variations in the visual system rarely affect an individual's life span or ability to reproduce, mutations generating many of the new alleles that change visual perception remain in a population over time.
Cells of the Retina Contain Light-Sensitive Proteins
People perceive light through neurons in the retina at the back of the eye (Fig. 7.32a). These neurons are of two types: rods and cones. The rods, which make up 95% of all light-receiving neurons, are stimulated by weak light over a range of wavelengths. At higher light intensities, the rods become saturated and no longer send meaningful information to the brain. This is when the cones take over,
processing wavelengths of bright light that enable us to see color. The cones come in three forms: one specializes in the reception of red light, a second in the reception of green, and a third in the reception of blue. For each photoreceptor cell, the act of reception consists of absorbing photons from light of a particular wavelength, transducing information about the number and energy of those photons to electrical signals, and transmitting the signals via the optic nerve to the brain. The brain integrates the information from the three types of cones and enables humans to discriminate more than 1 million colors.
Four related proteins with different light sensitivities
The protein that receives photons and triggers the processing of information in rod cells is rhodopsin. It consists of a
single polypeptide chain containing 348 amino acids that snakes back and forth across the cell membrane (Fig. 7.32b). One lysine within the chain associates with retinal, a carotenoid pigment molecule that actually absorbs photons. The amino acids in the vicinity of the retinal constitute rhodopsin's active site; by positioning the retinal in a particular way, they determine its response to light. Each rod cell contains approximately 100 million molecules of rhodopsin in its specialized membrane. As you learned at the beginning of this chapter, the gene governing the production of rhodopsin is on chromosome 3.
The protein that receives and initiates the processing of photons in the blue cones is a relative of rhodopsin, also consisting of a single polypeptide chain containing 348 amino acids and also encompassing one molecule of retinal. Slightly less than half of the 348 amino acids in the blue-receiving protein are the same as those found in rhodopsin; the rest are different and account for the specialized light-receiving ability of the protein (Fig. 7.32b). The gene for the blue protein is on chromosome 7.
Similarly related to rhodopsin are the red- and green-receiving proteins in the red and green cones. These are also single polypeptides associated with retinal and embedded in the cell membrane, although they are both slightly larger at 364 amino acids in length (Fig. 7.32b). Like the blue protein, the red and green proteins differ from rhodopsin in nearly half of their amino acids; they differ from each other in only four amino acids out of every hundred. Even these small differences, however, are sufficient to differentiate the spectral sensitivities of red and green cone cells. The genes for the red and green proteins both reside on the X chromosome in a tandem head-to-tail arrangement. Most individuals have one red gene and one to three green genes on their X chromosomes (Fig. 7.32c).
Figure 7.32 The cellular and molecular basis of vision. (a) Rod and cone cells in the retina carry membrane-bound photoreceptors. (b) The photoreceptor in rod cells is rhodopsin. The blue, green, and red receptor proteins in cone cells are related to rhodopsin. (c) One red photoreceptor gene and one to three green photoreceptor genes are clustered on the X chromosome. (d) The genes for rhodopsin and the three color receptors probably evolved from a primordial photoreceptor gene through three gene duplication events followed by divergence of the duplicated copies.
Evolution of the rhodopsin gene family
The similarities in structure and function among rhodopsin and the three rhodopsin-related photoreceptor proteins suggest that the genes encoding these polypeptides arose by duplication of an original photoreceptor gene and then divergence through the accumulation of mutations. Many of the mutations that promoted the ability to see color must have provided selective advantages to their bearers over the course of evolution. The red and green genes are the most similar, differing by less than five nucleotides out of every hundred. This fact suggests they diverged from each other only in the relatively recent evolutionary past. The less pronounced amino acid similarity of the red or green proteins with the blue protein, and the even lower relatedness between rhodopsin and any color photoreceptor, reflect earlier duplication and divergence events (Fig. 7.32d).
[Figure 7.32 panel labels: (b) Photoreceptor proteins: rhodopsin and the color-receiving proteins. (c) Red/green pigment genes on X chromosomes from normal individuals. (d) Evolution of visual pigment genes: primordial gene; red, green, and blue genes; rhodopsin gene.]
How Mutations in the Rhodopsin Gene Family Affect the Way We See
Mutations in the genes encoding rhodopsin and the three color photoreceptor proteins can alter vision through many different mechanisms. These mutations range from point mutations that change the identity of a single amino acid in a single protein to larger aberrations that can increase or decrease the number of photoreceptor genes.
Mutations in the rhodopsin gene
At least 29 different single nucleotide substitutions in the rhodopsin gene cause an autosomal dominant vision disorder known as retinitis pigmentosa that begins with an early loss of rod function, followed by a slow progressive degeneration of the peripheral retina. Figure 7.33a shows the location of the amino acids affected by these mutations. These amino acid changes result in abnormal rhodopsin proteins that either do not fold properly or, once folded, are unstable. Although normal rhodopsin is an essential structural element of rod cell membranes, these nonfunctional mutant proteins are retained in the body of the cell, where they remain unavailable for insertion into the membrane. Rod cells that cannot incorporate enough rhodopsin into their membranes eventually die. Depending on how many rod cells die, partial or complete blindness ensues.
Figure 7.33 How mutations modulate light and color perception. (a) Amino acid substitutions (black dots) that disrupt rhodopsin's three-dimensional structure result in retinitis pigmentosa. Other substitutions diminishing rhodopsin's sensitivity to light cause night blindness. (b) Substitutions in the blue pigment can produce tritanopia (blue colorblindness). (c) Red colorblindness can result from particular mutations that destabilize the red photoreceptor. (d) Unequal crossing-over between the red and green genes can change gene number and create genes encoding hybrid photoreceptor proteins.
[Figure 7.33 panel labels: (a) Rhodopsin: retinitis pigmentosa; night blindness (Ala292-Gly, Gly90-Asp). (b) Blue photoreceptor: tritanopia (Pro264-). (c) Red photoreceptor: red colorblindness (Cys203-Arg). (d) Unequal crossing-over.]
Figure 7.34 How the world looks to a person with tritanopia. Compare with Fig. 4.22 on p. 114.
Other mutations in the rhodopsin gene cause the far less serious condition of night blindness (Fig. 7.33a). These mutations change the protein's amino acid sequence so that the threshold of stimulation required to trigger the vision cascade increases. With the changes, very dim light is no longer enough to initiate vision.
Mutations in the cone cell pigment genes
Vision problems caused by mutations in the cone cell pigment genes are less severe than those caused by similar defects in the rod cell rhodopsin gene. Most likely, this difference occurs because the rods make up 95% of a person's light-receiving neurons, while the cones make up only about 5%. Some mutations in the blue gene on chromosome 7 cause tritanopia, a defect in the ability to discriminate between colors that differ only in the amount of blue light they contain (Figs. 7.33b and 7.34). Mutations in the red gene on the X chromosome can modify or abolish red protein function and, as a result, the red cone cells' sensitivity to light. For example, a change at position 203 in the red-receiving protein from cysteine to arginine disrupts one of the disulfide bonds required to support the protein's tertiary structure (see Fig. 7.33c). Without that bond, the protein cannot stably maintain its native configuration, and a person with the mutation has red colorblindness.
Unequal crossing-over between the red and green genes
People with normal color vision have a single red gene; some of these normal individuals also have a single adjacent green gene, while others have two or even three green genes. The red and green genes are 96% identical in DNA sequence; the different green genes, 99.9% identical. Their proximity and high degree of homology make these genes unusually prone to an error in meiotic recombination called unequal crossing-over. When homologous chromosomes associate during meiosis, two closely related DNA sequences that are adjacent to each other, like the red and green photoreceptor genes, can pair with each other incorrectly. If recombination takes place between the mispaired sequences, photoreceptor genes may be deleted, added, or changed. A variety of unequal recombination events produce DNA containing no red gene, no green gene, various combinations of green genes, or hybrid red-green genes (see Fig. 7.33d). These different DNA combinations account for the large majority of the known aberrations in red-green color perception, with the remaining abnormalities stemming from point mutations, as described earlier. Because the accurate perception of red and green depends on the differing ratios of red and green light processed, people with no red or no green gene perceive red and green as the same color (see Fig. 4.22 on p. 114).
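The consequences of unequal crossing-over can be illustrated with a short Python sketch. It is a simplified model, not a description of the molecular mechanism: each X chromosome is treated as an ordered list of gene names, the crossover is assumed to fall inside the two mispaired genes, and the function name `unequal_crossover` and the "hybrid" labels are hypothetical names chosen for this illustration.

```python
def unequal_crossover(chrom_a, chrom_b, i, j):
    """Recombine within gene i of chrom_a mispaired with gene j of chrom_b.

    Each chromosome is a list of gene names in left-to-right order.
    Returns the two recombinant gene arrays.
    """
    def join(left_part, right_part):
        # A crossover between two copies of the same gene restores that gene;
        # a crossover between different genes creates a hybrid gene.
        if left_part == right_part:
            return left_part
        return f"{left_part}-{right_part} hybrid"

    # Product 1: left side of chrom_a, junction gene, right side of chrom_b.
    product_1 = chrom_a[:i] + [join(chrom_a[i], chrom_b[j])] + chrom_b[j + 1:]
    # Product 2: the reciprocal recombinant.
    product_2 = chrom_b[:j] + [join(chrom_b[j], chrom_a[i])] + chrom_a[i + 1:]
    return product_1, product_2


# Green gene on one homolog mispairs with the red gene on the other.
hybrid_products = unequal_crossover(["red", "green"], ["red", "green"], 1, 0)
print(hybrid_products)

# First green gene on one homolog mispairs with the second green gene on the other.
copy_number_products = unequal_crossover(
    ["red", "green", "green"], ["red", "green", "green"], 1, 2
)
print(copy_number_products)
```

In this model the first call yields one array carrying a hybrid gene between the red and green genes and a reciprocal array consisting of a single hybrid gene, while the second call changes only the number of green genes. Crossovers in the DNA between genes, which can remove a gene without creating a hybrid, are not modeled.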
essential concepts
The vision pigments in humans consist of the protein rhodopsin in rods plus the blue-, red-, and green-sensitive photoreceptors in cones.
The four genes of the rhodopsin family evolved from an ancestral photoreceptor gene by successive rounds of gene duplication and divergence.
Mutations in the rhodopsin gene may disrupt rod function, leading to blindness. Mutations in cone cell photoreceptor genes are responsible for various forms of colorblindness.
WHAT'S NEXT
Careful studies of mutations showed that genes are linear arrays of mutable elements that direct the assembly of amino acids in a polypeptide. The mutable elements are the nucleotide building blocks of DNA. Biologists call the parallel between the sequence of nucleotides in a gene and the order of amino acids in a polypeptide colinearity. In Chapter 8, we explain how colinearity arises from base pairing, a genetic code, specific enzymes, and macromolecular assemblies like ribosomes that guide the flow of information from DNA through RNA to protein.
I. Imagine that 10 independently isolated recessive lethal mutations (l1, l2, l3, etc.) map to chromosome 7 in mice. You perform complementation testing by mating all pairwise combinations of heterozygotes bearing these lethal mutations, and you score the absence of complementation by examining pregnant females for dead fetuses. A + in the chart means that the two lethals complemented, and dead embryos were not found. A - indicates that dead embryos were found, at the rate of about one in four conceptions. (The crosses between heterozygous mice would be expected to yield the homozygous recessive showing the lethal phenotype in 1/4 of the embryos.) The lethal mutations in the parental heterozygotes for each cross are listed across the top and down the left side of the chart (that is, l1 indicates a heterozygote in which one chromosome bears the l1 mutation and the homologous chromosome is wild type).
How many genes do the 10 lethal mutations represent?
What are the complementation groups?
Answer
This problem involves the application of the complementation concept to a set of data. There are two ways to analyze these results. You can focus on the mutations that do complement each other, conclude that they are in different genes, and begin to create a list of mutations in separate genes. Alternatively, you can focus on mutations that do not complement each other and therefore are alleles of the same genes. The latter approach is more efficient when several mutations are involved. For example, l1 does not complement l6 and l7. These three alleles are in one complementation group. l2 does not complement l10; they are in a second complementation group. l3 does not complement l4, l5, l8, or l9, so they form a third complementation group. There are three complementation groups. (Note also that for each mutant, the cross between individuals carrying the same alleles resulted in no complementation, because homozygotes for the recessive lethal mutation were generated.) The three complementation groups consist of (1) l1, l6, l7; (2) l2, l10; and (3) l3, l4, l5, l8, l9.
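The second approach in this answer, grouping mutations that fail to complement, can be sketched in Python as follows. This is only an illustration under stated assumptions: the mutations are represented by the integers 1 through 10, the list of non-complementing pairs is transcribed from the answer above rather than from the chart, and `complementation_groups` is a hypothetical helper name.

```python
def complementation_groups(mutations, non_complementing_pairs):
    """Group mutations so that any two that fail to complement share a group."""
    parent = {m: m for m in mutations}

    def find(m):
        # Follow parent links to the group's representative, shortening the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    # Failure to complement places two mutations in the same gene.
    for a, b in non_complementing_pairs:
        parent[find(a)] = find(b)

    groups = {}
    for m in mutations:
        groups.setdefault(find(m), []).append(m)
    return sorted(groups.values())


# Non-complementing pairs taken from the worked answer (l1 with l6 and l7, and so on).
pairs = [(1, 6), (1, 7), (2, 10), (3, 4), (3, 5), (3, 8), (3, 9)]
groups = complementation_groups(range(1, 11), pairs)
print(groups)  # [[1, 6, 7], [2, 10], [3, 4, 5, 8, 9]]
```

The number of lists returned is the number of genes represented, three in this example.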
II. W, X, and Y are the intermediates (in that order) in a biochemical pathway whose product is Z. Z- mutants are found in five different complementation groups. Z1 mutants will grow on Y or Z, but not W or X. Z2 mutants will grow on X, Y, or Z. Z3 mutants will only grow on Z. Z4 mutants will grow on Y or Z. Finally, Z5 mutants will grow on W, X, Y, or Z.
a. Order the five complementation groups in terms of the steps they block.
b. What does this genetic information reveal about the nature of the enzyme that carries out the conversion of X to Y?
Answer
This problem requires that you understand complementation and the connection between genes and enzymes in a biochemical pathway.
a. A biochemical pathway represents an ordered set of reactions that must occur to produce a product. This problem gives the order of intermediates in a pathway for producing product Z. The lack of any enzyme along the way will cause the phenotype of Z-, but the block can occur at different places along the pathway. If the mutant grows when given an intermediate compound, the enzymatic (and hence gene) defect must be before production of that intermediate compound. The Z1 mutants that grow on Y or Z (but not on W or X) must have a defect in the enzyme that produces Y. Z2 mutants have a defect prior to X; Z3 mutants have a defect prior to Z; Z4 mutants have a defect prior to Y; Z5 mutants have a defect prior to W. The five complementation groups can be placed in order of activity within the biochemical pathway as follows:
   Z5        Z2      Z1, Z4      Z3
-----> W -----> X -----> Y -----> Z
b. Mutants Z1 and Z4 affect the same step, but because they are in different complementation groups, we know they are in different genes. Mutations Z1 and Z4 are probably in genes that encode subunits of a multisubunit enzyme that carries out the conversion of X to Y. Alternatively, there could be a currently unknown additional intermediate step between X and Y.
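The reasoning in part (a) can be written as a small Python sketch: the earliest compound in the pathway that rescues a mutant is the product of the blocked step. The names `PATHWAY`, `blocked_step`, and `order_mutants` are hypothetical, and the growth data are transcribed from the problem statement.

```python
PATHWAY = ["W", "X", "Y", "Z"]  # order of compounds given in the problem


def blocked_step(grows_on):
    """Return the compound whose production is blocked in a mutant.

    A mutant grows on every compound made after its block, so the block lies
    just before the earliest compound that supports growth.
    """
    return min(grows_on, key=PATHWAY.index)


def order_mutants(growth):
    """Sort mutants by the position of the step they block."""
    assigned = [(mutant, blocked_step(supplements)) for mutant, supplements in growth.items()]
    return sorted(assigned, key=lambda pair: (PATHWAY.index(pair[1]), pair[0]))


growth = {
    "Z1": {"Y", "Z"},
    "Z2": {"X", "Y", "Z"},
    "Z3": {"Z"},
    "Z4": {"Y", "Z"},
    "Z5": {"W", "X", "Y", "Z"},
}
ordered = order_mutants(growth)
print(ordered)
# [('Z5', 'W'), ('Z2', 'X'), ('Z1', 'Y'), ('Z4', 'Y'), ('Z3', 'Z')]
```

Two groups assigned to the same step, here Z1 and Z4, are the case discussed in part (b).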
Vocabulary
1. The following is a list of mutational changes. For each of the specific mutations described, indicate which of the terms in the right-hand column applies, either as a description of the mutation or as a possible cause. More than one term from the right column can apply to each statement in the left column.
1. an A-T base pair in the wild-type gene is changed to a G-C pair
2. an A-T base pair is changed to a T-A pair
3. the sequence AAGCTTATCG is changed to AAGCTATCG
4. the sequence AAGCTTATCG is changed to AAGCTTTATCG
5. the sequence AACGTTATCG is changed to AATGTTATCG
6. the sequence AACGTCACACACACATCG is changed to AACGTCACATCG
a. transition
b. base substitution
c. transversion
d. deletion
e. insertion
f. deamination
g. X-ray irradiation
h. intercalator
Section 7.1
2. What explanations can account for the following pedigree of a very rare trait? Be as specific as possible. How might you be able to distinguish between these explanations?
[Figure: pedigree]
3. The DNA sequence of a gene from three independently isolated mutants is given here. Using this information, what is the sequence of the wild-type gene in this region?
mutant 1  ACCGTAATCGACTGGTAAACTTTGCGCG
mutant 2  ACCGTAGTCGACCGGTAAACTTTGCGCG
mutant 3  ACCGTAGTCGACTGGTTAACTTTGCGCG
4. Among mammals, measurements of the rate of generation of autosomal recessive mutations have been made almost exclusively in mice, while many measurements of the rate of generation of dominant mutations have been made both in mice and in humans. What do you think is the reason for this difference?
5. Over a period of several years, a large hospital kept track of the number of births of babies displaying the trait achondroplasia. Achondroplasia is a very rare autosomal dominant condition resulting in dwarfism with abnormal body proportions. After 120,000 births, it was noted that 27 babies had been born with achondroplasia. One physician was interested in determining how many of these dwarf babies resulted from new mutations and whether the apparent mutation rate in this geographical area was higher than normal. He looked up the families of the 27 dwarf births and discovered that four of the dwarf babies had a dwarf parent. What is the apparent mutation rate of the achondroplasia gene in this population? Is it unusually high or low?
6. Suppose you wanted to study genes controlling the structure of bacterial cell surfaces. You decide to start by isolating bacterial mutants that are resistant to infection by a bacteriophage that binds to the cell surface. The selection procedure is simple: Spread cells from a culture of sensitive bacteria on a petri plate, expose them to a high concentration of phages, and pick the bacterial colonies that grow. To set up the selection you could (1) spread cells from a single liquid culture of sensitive bacteria on many different plates and pick every resistant colony; or (2) start many different cultures, each grown from a single colony of sensitive bacteria, spread one plate from each culture, and then pick a single mutant from each plate. Which method would ensure that you are isolating many independent mutations?
7. In a genetics lab, Kim and Maria infected a sample from an E. coli culture with a particular virulent bacteriophage. They noticed that most of the cells were lysed, but a few survived. The survival rate in their sample was about 1 × 10^-4. Kim was sure the bacteriophage induced the resistance in the cells, while Maria thought that resistant mutants probably already existed in the sample of cells they used. Earlier, for a different experiment, they had spread a dilute suspension of E. coli onto solid medium in a large petri dish, and, after seeing that about 10^5 colonies were growing up, they had replica-plated that plate onto three other plates. Kim and Maria decided to use these plates to test their theories. They pipette a suspension of the bacteriophage onto each of the three replica plates. What should they see if Kim is right? What should they see if Maria is right?
8. The results of the fluctuation test (Fig. 7.5 on p. 211) were interpreted to mean that different numbers of mutant bacteria preexisted in each of the 11 culture tubes because the mutations arose spontaneously at different times during the growth of each culture. However, another possibility is that the differences in the number of colonies on the plates are simply due to differences in the ability of the petri plates to support the growth of colonies. For example, perhaps the selective agent or the nutrients in the media were not evenly distributed in the molten agar poured into the petri dishes. What experiment could you do to determine whether or not differences in the petri plates were a factor in the experiment?
9. The pedigree below shows the inheritance of a completely penetrant, dominant trait called amelogenesis imperfecta that affects the structure and integrity of the teeth. DNA analysis of blood obtained from affected individuals III-1 and III-2 shows the presence of the same mutation in one of the two copies of an autosomal gene called ENAM that is not seen in DNA from the blood of any of the parents in generation II. Explain this result, citing Fig. 4.19 on p. 108 and Fig. 7.5 on p. 211. Do you think this type of inheritance pattern is rare or common?
[Figure: pedigree]
10. Autism is a neurological disorder thought to be caused by mutant alleles of one or more genes in an individual. Scientists had been wondering why the number of children diagnosed as autistic has increased dramatically in the last decade, from 1 in 500 in 2002, to 1 in 88 in 2012. Researchers now think that they might have found at least part of the answer: Men are fathering children at later and later ages. A paper published in the journal Nature in 2012 showed a correlation between paternal age and the incidence of autism; the age of the mother was not a factor. How does this observation provide a possible explanation for the apparent increase in the rate of autism?
Section 7.2
11. Remember that balancer chromosomes prevent the recovery of recombinant chromosomes between the balancer and its homolog. Why was the balancer X chromosome crucial to the design of Muller's experiment (Fig. 7.13, p. 218)? (Hint: The best way to answer this question is to consider what the experimental results would have been without the balancer.)
12. In the experiment shown in Fig. 7.13 on p. 218, H. J. Muller first performed a control in which the P generation males were not exposed to X-rays. He found that 99.7% of the individual F1 Bar-eyed females produced some male progeny with Bar eyes and some with wild-type (non-Bar) eyes, but 0.3% of these females produced only wild-type (non-Bar) male progeny.
a. If the average spontaneous mutation rate for Drosophila genes is 3.5 × 10^-6 mutations/gene/gamete, how many genes on the X chromosome can be mutated to produce a recessive lethal allele?
b. As of the year 2013, analysis of the Drosophila genome had revealed a total of 2283 genes on the X chromosome. Assuming the X chromosome is typical of the genome, what is the fraction of genes in the fly genome that is essential to survival?
c. Muller now exposed male flies to a specific high dosage of X-rays and found that 12% of F1 Bar-eyed females produced male progeny that were all wild type. What does this new information say?
13. Figure 7.14 on pp. 219-220 shows examples of base substitutions induced by the mutagens 5-bromouracil, hydroxylamine, ethylmethane sulfonate, and nitrous acid. Which of these mutagens cause transitions, and which cause transversions?
14. So-called two-way mutagens can induce both a particular mutation and (when added subsequently to cells whose chromosomes carry this mutation) a reversion of the mutation that restores the original DNA sequence. In contrast, one-way mutagens can induce mutations but not exact reversions of these mutations. Based on Fig. 7.14 (pp. 219-220), which of the following mutagens can be classified as one-way and which as two-way?
a. 5-bromouracil
b. hydroxylamine
c. ethylmethane sulfonate
d. nitrous acid
e. proflavin
15. In 1967, J. B. Jenkins treated wild-type male Drosophila with the mutagen ethylmethane sulfonate (EMS) and mated them with females homozygous for a recessive mutation called dumpy that causes shortened wings. He found some F1 progeny with two wild-type wings, some with two short wings, and some with one short wing and one wild-type wing. When he mated single F1 flies with two short wings to dumpy homozygotes, he surprisingly found that only about 1/3 of these matings produced any short-winged progeny.
a. Explain these results in light of the mechanism of action of EMS shown in Fig. 7.14 on pp. 219-220.
b. Should the short-winged progeny of the second cross have one or two short wings? Why?
16. Aflatoxin B1 is a highly mutagenic and carcinogenic compound produced by certain fungi that infect crops such as peanuts. Aflatoxin is a large, bulky molecule that chemically bonds to the base guanine (G) to form the aflatoxin-guanine "adduct" that is pictured below. (In the figure, the aflatoxin is orange, and the guanine base is purple.) This adduct distorts the DNA double helix and blocks replication.
a. What type(s) of DNA repair system is (are) most likely to be involved in repairing the damage caused by exposure of DNA to aflatoxin B1?
b. Recent evidence suggests that the adduct of guanine and aflatoxin B1 can attack the bond that connects it to deoxyribose; this liberates the adducted base, forming an apurinic site. How does this new information change your answer to part (a)?
[Figure: aflatoxin-guanine adduct]
17. When a particular mutagen identified by the Ames test is injected into mice, it causes the appearance of many tumors, showing that this substance is carcinogenic. When cells from these tumors are injected into other mice not exposed to the mutagen, almost all of the new mice develop tumors. However, when mice carrying mutagen-induced tumors are mated to unexposed mice, virtually all of the progeny are tumor free. Why can the tumor be transferred horizontally (by injecting cells) but not vertically (from one generation to the next)?
18. When the his- Salmonella strain used in the Ames test is exposed to substance X, no his+ revertants are seen. If, however, rat liver supernatant is added to the cells along with substance X, revertants do occur. Is substance X a potential carcinogen for human cells? Explain.
19. The Ames test uses the reversion rate (his- to his+) to test compounds for mutagenicity.
a. Is it possible that a known mutagen, like proflavin, would be unable to revert a particular his- mutant used in the Ames test? How do you think that the Ames test is designed to deal with this issue?
b. Can you think of a way to use forward mutation (his+ to his-) to test a compound for mutagenicity? (Hint: Consider using the replica plating technique in Fig. 7.6 on p. 212.)
c. Given that the rate of forward mutation is so much higher than the rate of reversion, why does the Ames test use the reversion rate to test for mutagenicity?
20. In human DNA, 70% of cytosine residues that are followed by guanine (so-called CpG dinucleotides, where p indicates the phosphate in the phosphodiester bond between these two nucleotides) are methylated to form 5-methylcytosine. As shown in the following figure, if 5-methylcytosine should undergo spontaneous deamination, it becomes thymine.
[Figure: deamination converts 5-methylcytosine to thymine.]
a. Methylated CpG dinucleotides are hotspots for point mutations in human DNA. Explain why.
b. Making the simplifying assumptions that human DNA has an equal number of C-G and A-T base pairs, and that the human DNA sequence is random, how frequently in the human genome would you expect to find the base sequence CpG?
c. It turns out that, even after taking into account the actual GC content of human DNA (~42%), the frequency of CpG in human DNA is much lower than predicted by the calculation in part (b). Explain why this might be the case.
Section 7.3
21. Albinism in animals is caused by recessive mutations in an autosomal gene required for synthesis of melanin, a chemical precursor for skin and eye pigments. Albino animals are often confused with so-called leucistic animals that are white due to recessive mutations in a gene required in a different pathway, for example a pathway for development of the cells that produce pigments. Suppose you have two white hummingbirds, a male and a female, and they mated. Assuming that all relevant mutations are rare, autosomal, and recessive to wild-type alleles, what would you expect their progeny to look like under the following conditions:
a. They are both albinos.
b. They are both leucistic.
c. One is albino and the other is leucistic.
22. Imagine that you caught a female albino mouse in your kitchen and decided to keep it for a pet. A few months later, while vacationing in Guam, you caught a male albino mouse and decided to take it home for some interesting genetic experiments. You wonder whether the two mice are both albino due to mutations in the same gene. What could you do to find out the answer to this question? Assume that both mutations are recessive.
23. Plant breeders studying genes influencing leaf shape in the plant Arabidopsis thaliana identified six independent recessive mutations that resulted in plants that had unusual leaves with serrated rather than smooth edges. The investigators started to perform complementation tests with these mutants, but some of the tests could not be completed because of an accident in the greenhouse. The results of the complementation tests that could be finished are shown in the table that follows.
[Table: pairwise complementation test results for mutations 1 through 6, with rows and columns labeled 1-6 and entries of + or -.]
a. Exactly what experiment was done to fill in individual boxes in the table with a + or a -? What does + represent? What does - represent? Why are some boxes in the table filled in green?
b. Assuming no complications, what do you expect for the results of the complementation tests that were not performed? That is, complete the table by placing a + or a - in each of the blank boxes.
c. How many genes are represented among this collection of mutants? Which mutations are in which genes?
24. In humans, albinism is normally inherited in an autosomal recessive fashion. Figure 3.24c on p. 68 shows a pedigree in which two albino parents have several children, none of whom is an albino.
a. Interpret this pedigree in terms of a complementation test.
b. It is very rare to find examples of human pedigrees such as Fig. 3.24c that in effect represent a complementation test. The reason is that most genetic conditions in humans are rare, so it is highly unlikely that unrelated people with the same condition would mate. In the absence of complementation testing, what kinds of experiments could be done to determine whether a particular human disease phenotype can be caused by mutations at more than one gene?
c. Complementation testing requires that the two mutations to be tested are both recessive to wild type. Suppose that two dominant mutations cause similar phenotypes. How could you establish whether these mutations affected the same gene or different genes?
25. a. Seymour Benzer's fine structure analysis of the rII region of bacteriophage T4 depended in large part on deletion analysis as shown in Fig. 7.25 on p. 232. But to perform such deletion analysis, Benzer had to know which rII- bacteriophage strains were deletions and which were point mutations. How do you think he was able to distinguish rII- deletions from point mutations?
b. Figure 7.25c on p. 232 shows Benzer's fine structure map of point mutations in the rII region. A key feature of this map is the existence of "hotspots," which Benzer interpreted as nucleotide pairs that were particularly susceptible to mutation. How could Benzer say that all of the independent mutations in a hotspot were due to mutations of the same nucleotide pair?
26. a. You have a test tube containing 5 ml of a solution of bacteriophages, and you would like to estimate the number of bacteriophages in the tube. Assuming the tube actually contains a total of 15 billion bacteriophages, design a serial dilution experiment that would allow you to estimate this number. Ideally, the final plaque-containing plates you count should contain more than 10 and fewer than 1000 plaques.
b. When you count bacteriophages by the serial dilution method as in part (a), you are assuming a plating efficiency of 100%; that is, the number of plaques on the petri plate represents exactly the number of bacteriophages you mixed with the plating bacteria. Is there any way to test the possibility that only a certain percentage of bacteriophage particles are able to form plaques (so that the plating efficiency would be less than 100%)? Conversely, why is it fair to assume that any plaques are initiated by one rather than multiple bacteriophage particles?
27. You found five T4 rII- mutants that will not grow on E. coli K(λ). You mixed together all possible combinations of two mutants (as indicated in the following chart), added the mixtures to E. coli K(λ), and scored for the ability of the mixtures to grow and make plaques (indicated as a + in the chart).
   1 2 3 4 5
1  - + + - +
2  - - +
3  - +
4  - +
5
a. How many genes were identified by this analysis?
b. Which mutants belong to the same complementation groups?
28. The rosy (ry) gene of Drosophila encodes an enzyme called xanthine dehydrogenase. Flies homozygous for ry mutations exhibit a rosy eye color. Heterozygous females were made that had ry41 Sb on one homolog and Ly ry564 on the other homolog, where ry41 and ry564 are two independently isolated alleles of ry. Ly [Lyra (narrow) wings] and Sb [Stubble (short) bristles] are dominant markers to the left and right of ry, respectively. These females are now mated to males homozygous for ry41. Out of 100,000 progeny, 8 have wild-type eyes, Lyra wings, and Stubble bristles, while the remainder have rosy eyes.
a. What is the order of these two ry mutations relative to the flanking genes Ly and Sb?
b. What is the genetic distance separating ry41 and ry564?
29. Nine rII- mutants of bacteriophage T4 were used in pairwise infections of E. coli K(λ) hosts. Six of the mutations in these phages are point mutations; the other three are deletions. The ability of the doubly infected cells to produce progeny phages in large numbers is scored in the following chart.
The same nine mutants were then used in pairwise infections of E. coli B hosts. The production of progeny phages that can subsequently lyse E. coli K(λ) hosts is now scored. In the table, 0 means the progeny do not produce any plaques on E. coli K(λ) cells; - means that only a very few progeny phages produce plaques; and + means that many progeny produce plaques (more than 10 times as many as in the - cases).
123456789 I 1
J
4 5
6 7 8
-++++--++ ++++-++ 0-+0++ +-+++ -+*++ 00-+ 0++ -+
9
a. Which of the mutants are the three deletions? What criteria did you use to reach your conclusion?
b. If you know that mutation 9 is in the rIIB gene, draw the best genetic map possible to explain the data, including the positions of all point mutations and the extent of the three deletions.
c. There should be one uncertainty remaining in your answer to part (b). How could you resolve this uncertainty?
30. In a haploid yeast strain, eight recessive mutations were found that resulted in a requirement for the amino acid lysine. All the mutations were found to revert at a frequency of about 1 × 10⁻⁶, except mutations 5 and 6, which did not revert. Matings were made between a and α cells carrying these mutations. The ability of the resultant diploid strains to grow on minimal medium in the absence of lysine is shown in the following chart (+ means growth and - means no growth).
12345678
1-++++-+ 2+-++++++ 3++ 4++ 5++ 6-+ +
a. How many complementation groups were revealed by these data? Which point mutations are found within which complementation groups?
The same diploid strains are now induced to undergo sporulation. The vast majority of resultant spores are auxotrophic; that is, they cannot form colonies when plated on minimal medium (without lysine). However, particular diploids can produce rare spores that do form colonies when plated on minimal medium (prototrophic spores). The following table shows whether (+) or not (-) any prototrophic spores are formed upon sporulation of the various diploid cells.
12345678
1-++++-++ 2+-++++++ 3++-++++ 4+++ 5++- -++ -++
b. When prototrophic spores occur during sporulation of the diploids just discussed, what ratio of auxotrophic to prototrophic spores would you generally expect to see in any tetrad containing such a prototrophic spore? Explain the ratio you expect.
c. Using the data from all parts of this question, draw the best map of the eight lysine auxotrophic mutations under study. Show the extent of any deletions involved, and indicate the boundaries of the various complementation groups.
Section 7.4
31. The pathway for arginine biosynthesis in Neurospora crassa involves several enzymes that produce a series of intermediates. The enzymes ARG-E, ARG-F, ARG-G, and ARG-H catalyze the successive steps:
N-acetylornithine → ornithine → citrulline → argininosuccinate → arginine
a. If you did a cross between ARG-E- and ARG-H- Neurospora strains, what would be the distribution of Arg+ and Arg- spores within parental ditype and nonparental ditype asci? Give the spore types in the order in which they would appear in the ascus.
b. For each of the spores in your answer to part (a), what nutrients could you supply in the media to get spore growth?
32. In corn snakes, the wild-type color is brown. One autosomal recessive mutation causes the snake to be orange, and another causes the snake to be black. An orange snake was crossed to a black one, and the F1 offspring were all brown. Assume that all relevant genes are unlinked.
a. Indicate what phenotypes and ratios you would expect in the F2 generation of this cross if there is one pigment pathway, with orange and black being different intermediates on the way to brown.
b. Indicate what phenotypes and ratios you would expect in the F2 generation if orange pigment is a product of one pathway, black pigment is the product of another pathway, and brown is the effect of mixing the two pigments in the skin of the snake.
33. In a certain species of flowering plants with a diploid genome, four enzymes are involved in the generation of flower color. The genes encoding these four enzymes are on different chromosomes. The biochemical pathway involved is as follows; the figure shows that either of two different enzymes is sufficient to convert a blue pigment into a purple pigment.
white → green → blue → purple (two alternative arrows lead from blue to purple)
A true-breeding green-flowered plant is mated with a true-breeding blue-flowered plant. All of the plants in the resultant F1 generation have purple flowers. F1 plants are allowed to self-fertilize, yielding an F2 generation. Show genotypes for P, F1, and F2 plants, and indicate which genes specify which biochemical steps. Determine the fraction of F2 plants with the following phenotypes: white flowers, green flowers, blue flowers, and purple flowers. Assume the green-flowered parent is mutant in only a single step of the pathway.
as shown in the following table (+ means growth; - no growth). These mutants are also tested for their ability to grow on the intermediates A-E. What is the order of these intermediates in the glutamine and proline pathways, and at which point in the pathways is each mutant blocked?
34. The intermediates A, B, C, D, E, and F all occur in the same biochemical pathway. G is the product of the pathway, and mutants 1 through 7 are all G-, meaning that they cannot produce substance G. The following table shows which intermediates will promote growth in each of the mutants. Arrange the intermediates in order of their occurrence in the pathway, and indicate the step in the pathway at which each mutant strain is blocked. A "+" in the table indicates that the strain will grow if given that substance; an "O" means lack of growth.
ABCD
F
G
+ + + + + +
+
+
+
+
+
o o o
o
o
o
o
o o
+ +
+
+
o
o
+ +
+ +
+ +
o o
6
+ +
o o o
7
o
o
o
1
2 3
4 5
+
+
o
+
1
+
+
3
+
4
+
5
+
37. The following complementing E. coli mutants were tested for growth on four known precursors of thymine, A-D. Precursor/product
A
Mutant
B
+
9
35. In each of the following cross schemes, two truebreeding plant strains are crossed to make F1 plants, all of which have purple flowers. The F1 plants are then self-fertilized to produce F2 progeny as
purpie all purple all purple all purple all
l8
15
+
D
Thymine
+ + + +
+ + +
+
a. Show a simple linear biosynthetic pathway of the four precursors and the end product, thymine. Indicate which step is blocked by each of the five mutations.
9 purple: 4 white: 3 blue
b. What precursor would accumulate in the following
white
9 purple: 3 red: 3 biue:
+
c
27
F2
9 purple: 7
+ +
l4
shown here. Fr
+
+
+
10
Cross Parents I blue X white 2 white x white 3 red X blue 4 purple X purple
+ + + + +
+
+
6
o
+
+
+
2
7
E
Gln * Pro
MutantABCDEGlnPro
Supplements
Mutant
251
double mutants: 9 and 10? 10 and 14?
I white
purple: 1 white
a. For each cross, explain the inheritance of flower color.
b. For each cross, show a possible biochemical pathway that could explain the data.
c. Which of these crosses is compatible with an underlying biochemical pathway involving only a single step that is catalyzed by an enzyme with two dissimilar subunits, both of which are required for enzyme activity?
d. For each of the four crosses, what would you expect in the F1 and F2 generations if all relevant genes were tightly linked?
38. In 1952, an article in the British Medical Journal reported interesting differences in the behavior of blood plasma obtained from several individuals who suffered from X-linked recessive hemophilia. When mixed together, the cell-free blood plasma from certain combinations of individuals could form clots in the test tube. For example, the following table shows whether (+) or not (-) clots could form in various combinations of plasma from four individuals with hemophilia:
landl I and2 I and3
36. The pathways for the biosynthesis of the amino acids
I
and4
glutamine (Gln) and proline (Pro) involve one or more common intermediates. Auxotrophic yeast mutants numbered I-7 are isolated that require either glutamine or proline or both amino acids for their growth,
2
and2
2 and3 2 and4
+ +
+ +
3and3 3 and4
4
and{
What do these data tell you about the inheritance of hemophilia in these individuals? Do these data allow you to exclude any models for the biochemical pathway governing blood clotting?
39. Mutations in an autosomal gene in humans cause a form of hemophilia called von Willebrand disease (vWD). This gene specifies a blood plasma protein cleverly called von Willebrand factor (vWF). vWF stabilizes factor VIII, a blood plasma protein specified by the wild-type hemophilia A gene. Factor VIII is needed to form blood clots. Thus, factor VIII is rapidly destroyed in the absence of vWF. Which of the following might successfully be employed in the treatment of bleeding episodes in hemophiliac patients? Would the treatments work immediately or only after some delay needed for protein synthesis? Would the treatments have only a short-term or a prolonged effect? Assume that all mutations are null (that is, the mutations result in the
Mutant in gene for protein A
B
c
D
E
F
+
+
+
+ + +
+
+ + +
B
C
D
+
E
+
+
+
F
Complete the following figure, which shows the construction of the hypothetical protein complex, by writing the letter of the proper protein in each circle. The two proteins marked with arrowheads can assemble into the complex independently of each other, but both are needed for the addition of subsequent proteins to the complex.
Outside
complete absence of the protein encoded by the gene) and that the plasma is cell-free.
a. transfusion of plasma from normal blood into vWD patient
Protein production and location
A
Embryo surface
a
b. transfusion of plasma from a vWD patient into a different vWD patient
c. transfusion of plasma from a hemophilia A patient into a vWD patient
d. transfusion of plasma from normal blood into a hemophilia A patient
e. transfusion of plasma from a vWD patient into a hemophilia A patient
f. transfusion of plasma from a hemophilia A patient into a different hemophilia A patient
g. injection of purified vWF into a vWD patient
h. injection of purified vWF into a hemophilia A patient
i. injection of purified factor VIII into a vWD patient
j. injection of purified factor VIII into a hemophilia A patient
41. Adult hemoglobin is a multimeric protein with four polypeptides, two of which are α-globin and two of which are β-globin.
a. How many genes are needed to define the structure of the hemoglobin protein?
b. If a person is heterozygous for wild-type alleles and alleles that would yield amino acid substitution variants for both α-globin and β-globin, how many different kinds of hemoglobin protein would be found in the person's red blood cells and in what proportion? Assume all alleles are expressed at the same level.
40. Antibodies were made that recognize six proteins that are part of a complex inside the Caenorhabditis elegans one-cell embryo. The mother produces proteins that are believed to assemble stepwise into a structure in the egg, beginning at the embryo's inner surface. The antibodies were used to detect the protein location in embryos produced by mutant mothers (who are homozygous recessive for the gene[s] encoding each protein). The C. elegans mothers are self-fertilizing hermaphrodites, so no wild-type copy of a gene will be introduced during fertilization. In the following table, * means the protein was present and at the embryo surface, - means that the protein was not
42. Each complementation group (ARG-E, ARG-F, ARG-G, and ARG-H) in Fig. 7.27b (p. 235) is able to grow on a unique subset of supplements. Why were these four subsets the only ones observed? For example, why were there no complementation groups that behaved like the four hypothetical ones shown in the table below? (Symbols as in Fig. 7.27b: + means growth, - means no growth.)
Supplements
Hypothetical Argininomutant strain Nothing Ornithine Citrulline succinate Wildtype: Arg+ ARG-1
present, and + means that the protein was present but not at the embryo surface. Assume all mutations pre-
ARG-K
vent production of the corresponding protein.
ARG-1,'
ARG.J
+
+ +
+
+ +
+ + +
Arginine
+ + + + +
Section 7.5
43. In addition to the predominant adult hemoglobin, HbA, which contains two α-globin chains and two β-globin chains (α₂β₂), there is a minor hemoglobin, HbA2, composed of two α and two δ chains (α₂δ₂). The β- and δ-globin genes are arranged in tandem and are highly homologous. Draw the chromosomes that would result from an event of unequal crossing-over between the β and δ genes.
44. Most mammals, including "New World" primates such as marmosets (a kind of monkey), are dichromats: They have only two kinds of rhodopsin-related color receptors. "Old World" primates such as humans and gorillas are trichromats with three kinds of color receptors. Primates diverged from other mammals roughly 65 million years ago (Myr), while Old World and New World primates diverged from each other roughly 35 Myr.
a. Using this information, define on Fig. 7.32d (see p. 242) the time span of any events that can be dated.
b. Some New World monkeys have an autosomal color receptor gene and a single X-linked color receptor gene. The X-linked gene has three alleles, each of which encodes a photoreceptor that responds to light of a different wavelength (all three wavelengths are different from that recognized by the autosomal color receptor). How is color vision inherited in these monkeys?
c. About 95% of all light-receiving neurons in humans and other mammals are rod cells containing rhodopsin, a pigment that responds to low-level light of many wavelengths. The remaining 5% of light-receiving neurons are cone cells with pigments that respond to light of specific wavelengths of high intensity. What does this suggest about the lifestyle of the earliest mammals?
45. Humans are normally trichromats; we have three different types of retinal cones, each containing either a red, green, or blue rhodopsin-like photoreceptor protein. This is because most humans have genes for red and green photoreceptors on the X chromosome, and a blue photoreceptor gene on an autosome. Our brain integrates the information from each type of cone, making it possible for us to see about one million colors. Some scientists think that rare people may be tetrachromats, that is, they have four different kinds of cones. Such people, if they exist, could potentially detect 100 million colors! For parts (a) and (b), assume that each X chromosome has one red and one green photoreceptor protein gene.
a. Explain why scientists expect that many more females than males would be tetrachromats.
b. In X-linked red/green "colorblindness," mutation of either the red or green photoreceptor gene results in a rhodopsin-like protein with altered spectral sensitivity. The mutant photoreceptor is sensitive to wavelengths in between the normal red and green photoreceptors. Why do scientists think that a woman with a son who is red/green color-blind is more likely to be a tetrachromat than a woman whose sons all have normal vision?
c. Suggest a scenario based on Fig. 7.33d (p. 243) that could explain how extremely rare males might be tetrachromats.
Chapter 8
Gene Expression: The Flow of Information from DNA to RNA to Protein

The ability of an aminoacyl-tRNA synthetase (red) to recognize a particular tRNA (blue) and couple it to its corresponding amino acid (not shown) is central to the molecular machinery that converts the language of nucleic acids into the language of proteins.

chapter outline
8.1 The Genetic Code
8.2 Transcription: From DNA to RNA
8.3 Translation: From mRNA to Protein
8.4 Differences in Gene Expression Between Prokaryotes and Eukaryotes
8.5 The Effects of Mutations on Gene Expression and Function

A DEDICATED EFFORT to determine the complete nucleotide sequence of the genome in a variety of organisms has been underway since 1990. This massive endeavor has been more successful than many scientists thought possible. By the time of this writing in 2013, the DNA sequence in the genomes of more than 3700 different species had already been deposited in databases, and sequencing projects for more than 16,000 additional species were in progress. With this sequence information in hand, geneticists can consult the genetic code (the cipher equating nucleotide sequence with amino acid sequence) to decide what parts of a genome are likely to be genes. As a result, modern geneticists can discover the number and amino acid sequences of all the polypeptides that determine phenotype. Knowledge of DNA sequence thus opens up powerful new possibilities for understanding an organism's growth and development at the molecular level.
In this chapter, we describe the cellular mechanisms that carry out gene expression, the means by which genetic information can be interpreted as phenotype. As intricate as some of the details may appear, the general scheme of gene expression is elegant and straightforward: Within each cell, genetic information flows from DNA to RNA to protein. This statement was set forward as the "Central Dogma" of molecular biology by Francis Crick in 1957. As Crick explained, "Once 'information' has passed into protein, it cannot get out again."
The Central Dogma maintains that genetic information flows in two distinct stages (Fig. 8.1). The conversion of the information in DNA to its equivalent in RNA is known as transcription. The product of transcription is a transcript: a molecule of messenger RNA (mRNA) in prokaryotes, a molecule of RNA that undergoes processing to become an mRNA in eukaryotes.
In the second stage of gene expression, the cellular machinery decodes the sequence of nucleotides in mRNA into a sequence of amino acids (a polypeptide) by the process known as translation. It takes place on molecular workbenches called ribosomes, which are composed of proteins and ribosomal RNAs (rRNAs). Translation depends on the "dictionary" known as the genetic code, which defines each amino acid in terms of specific sequences of three nucleotides. Translation also requires transfer RNAs (tRNAs), small RNA adapter molecules that place specific amino acids at the correct position in a growing polypeptide chain.
The Central Dogma does not explain the behavior of all genes. As Crick himself realized, many genes are transcribed into RNAs that are never translated into proteins. You will see in this chapter that many nontranslated RNAs are critical to various steps of gene expression. The genes encoding rRNAs and tRNAs belong to this group.
Four general themes emerge from our discussion of gene expression. First, the pairing of complementary bases is key to the transfer of information from DNA to RNA, and from RNA to protein. Second, the polarities (directionality) of DNA, RNA, and polypeptides help guide the mechanisms of gene expression. Third, like DNA replication and recombination, gene expression requires an input of energy and the participation of specific proteins, RNAs, and macromolecular assemblies, such as ribosomes. Finally, mutations that change genetic information or obstruct the flow of its expression can have dramatic effects on phenotype.
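The first two themes can be made concrete with a minimal sketch, assuming a DNA template strand written 3'-to-5' and the usual pairing rules (A with U, T with A, G with C, C with G). The function name `transcribe` and the example sequence are illustrative assumptions, not taken from the text.

```python
# Pairing rules applied when an RNA is copied from a DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA, written 5'-to-3', that is complementary to a DNA
    template strand written 3'-to-5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


# Hypothetical template strand, written 3'-to-5'.
template = "TACCTTCAA"
print(transcribe(template))  # AUGGAAGUU
```

Because the template is read 3'-to-5' while the transcript grows 5'-to-3', the two strands are antiparallel, and each RNA base is dictated by the DNA base it pairs with.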
8.1 The Genetic Code

learning objectives
1. Explain the reasoning establishing that a sequence of three nucleotides (a triplet codon) is the basic unit of the code relating DNA to protein.
2. Summarize the evidence showing that the sequence of nucleotides in a gene is colinear with the sequence of amino acids in a protein.
3. Define reading frame and discuss its significance to the genetic code.
4. Describe experiments that determined which codons are associated with each amino acid and which are stop codons.
5. Explain how mutations were used to verify the genetic code.
6. Discuss evidence that the genetic code is almost universal, and cite some exceptions.

A code is a system of symbols that equates information in one language with information in another. A useful analogy for the genetic code is the Morse code, which uses dots and dashes to transmit messages over radio or telegraph wires. Various groupings of the dot-dash symbols represent the 26 letters of the English alphabet. Because there are many more letters than the two symbols (dot or dash), groups of one, two, three, or four dots or dashes in various combinations represent individual letters. For example, the symbol for C is dash dot dash dot (- . - .), the symbol for O is dash dash dash (- - -), D is dash dot dot (- . .), and E is a single dot (.). Because anywhere from one to four symbols specify each letter, the Morse code requires a symbol for "pause" (in practice, a short interval of time) to signify where one letter ends and the next begins.

Figure 8.1 Gene expression: The flow of genetic information from DNA via RNA to protein. In transcription, the enzyme RNA polymerase copies DNA to produce an RNA transcript. In translation, the cellular machinery uses instructions in mRNA to synthesize a polypeptide, following the rules of the genetic code.
[Diagram: DNA → (transcription) → RNA transcript, which serves directly as mRNA in prokaryotes and is processed to become mRNA in eukaryotes → (translation) → polypeptide.]

Triplet Codons of Nucleotides Represent Individual Amino Acids
The language of nucleic acids is written in four nucleotides (A, G, C, and T in the DNA dialect; A, G, C, and U in the RNA dialect), while the language of proteins is written in amino acids. The first hurdle to be overcome in deciphering how sequences of nucleotides can determine the order of amino acids in a polypeptide is to determine how many amino acid "letters" exist. Over lunch one day at a local pub, Watson and Crick produced the now accepted list of the 20 common amino acids that are genetically encoded by DNA. They created the list by analyzing the known amino acid sequences of a variety of naturally occurring polypeptides. Amino acids that are present in only a small number of proteins or in only certain tissues or organisms did not qualify as standard building blocks; Crick and Watson correctly assumed that most such
amino acids arise when proteins undergo modification after their synthesis. By contrast, amino acids that are present in most proteins made the list. The question then became: How can four nucleotides encode 20 amino acids?

Figure 8.2 The genetic code: 61 codons represent the 20 amino acids, while 3 codons signify stop. To read the code, find the first letter in the left column, the second letter along the top, and the third letter in the right column; this reading corresponds to the 5'-to-3' direction along the mRNA.

First                     Second letter                       Third
letter   U            C            A            G            letter
U        UUU Phe      UCU Ser      UAU Tyr      UGU Cys      U
         UUC Phe      UCC Ser      UAC Tyr      UGC Cys      C
         UUA Leu      UCA Ser      UAA Stop     UGA Stop     A
         UUG Leu      UCG Ser      UAG Stop     UGG Trp      G
C        CUU Leu      CCU Pro      CAU His      CGU Arg      U
         CUC Leu      CCC Pro      CAC His      CGC Arg      C
         CUA Leu      CCA Pro      CAA Gln      CGA Arg      A
         CUG Leu      CCG Pro      CAG Gln      CGG Arg      G
A        AUU Ile      ACU Thr      AAU Asn      AGU Ser      U
         AUC Ile      ACC Thr      AAC Asn      AGC Ser      C
         AUA Ile      ACA Thr      AAA Lys      AGA Arg      A
         AUG Met      ACG Thr      AAG Lys      AGG Arg      G
G        GUU Val      GCU Ala      GAU Asp      GGU Gly      U
         GUC Val      GCC Ala      GAC Asp      GGC Gly      C
         GUA Val      GCA Ala      GAA Glu      GGA Gly      A
         GUG Val      GCG Ala      GAG Glu      GGG Gly      G

Like the Morse code, the four nucleotides encode 20 amino acids through specific groupings of A, G, C, and T (in DNA) or A, G, C, and U (in RNA). Researchers initially arrived at the number of letters per grouping by deductive reasoning and later confirmed their guess by experiment. They reasoned that if only one nucleotide represented an amino acid, there would be information for only four amino acids: A would encode one amino acid; G, a second amino acid; and so on. If two nucleotides represented each amino acid, there would be 4² = 16 possible combinations of couplets. Of course, if the code consisted of groups containing one or two nucleotides, it would have 4 + 16 = 20 groups and could account for all the amino acids, but there would be nothing left over to signify the pause required to denote where one group ends and the next begins. Groups of three nucleotides in a row would provide 4³ = 64 different triplet combinations, more than enough to code for all the amino acids. If the code consisted of doublets and triplets, a signal denoting a pause would once again be necessary. But a triplets-only code would require no symbol for "pause" if the mechanism for counting to three and distinguishing among successive triplets was very reliable.
Although this kind of reasoning generates a hypothesis, it does not prove it. As it turned out, however, the experiments described later in this chapter did indeed demonstrate that groups of three nucleotides represent all 20 amino acids. Each nucleotide triplet is called a codon. Each codon, designated by the bases defining its three nucleotides, specifies one amino acid. For example, GAA is a codon for glutamic acid (Glu), and GUU is a codon for valine (Val). Because the code comes into play only during the translation part of gene expression, that is, during the decoding of messenger RNA to polypeptide, geneticists usually present the code in the RNA dialect of A, G, C, and U, as depicted in Fig. 8.2. When speaking of genes, they can substitute T for U to show the same code in the DNA dialect.
If you knew the sequence of nucleotides in a gene or its transcript as well as the sequence of amino acids in the corresponding polypeptide, you could then deduce the genetic code without understanding how the underlying cellular machinery actually works. Although techniques for determining both nucleotide and amino acid sequence are available today, this was not true when researchers were trying to crack the genetic code in the 1950s and 1960s. At that time, they could establish a polypeptide's amino acid sequence, but not the nucleotide sequence of DNA or RNA. Because of their inability to read nucleotide sequence, they used an assortment of genetic and biochemical techniques to fathom the code. They began by examining how different mutations in a single gene affected the amino acid sequence of the gene's polypeptide product. In this way, they were able to use the abnormal (specific mutations) to understand the normal (the general relationship between genes and polypeptides).
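The counting argument for triplet codons, and the way a triplet code is read, can be sketched in a few lines. This is a minimal illustration that assumes the standard genetic code of Fig. 8.2 and uses one-letter amino acid symbols (E for Glu, V for Val) with `*` for stop; the names `count_groups`, `CODON_TABLE`, and `translate` are hypothetical helpers, not anything defined in the text.

```python
from itertools import product

BASES = "UCAG"


def count_groups(n: int) -> int:
    """Number of distinct ordered groups of n nucleotides (4**n)."""
    return len(list(product(BASES, repeat=n)))


# One-letter amino acid symbols for all 64 codons, ordered U, C, A, G at the
# first, second, and third positions; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Read an mRNA (5'-to-3') three bases at a time from its first base,
    stopping at the first stop codon or when fewer than three bases remain."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print([count_groups(n) for n in (1, 2, 3)])  # [4, 16, 64]
print(translate("GAAGUU"))  # EV, that is, Glu then Val
```

Singlets and doublets together give only 4 + 16 = 20 groups, leaving nothing for punctuation, whereas triplets alone give 64, of which 61 specify amino acids and 3 signify stop.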
A Gene's Nucleotide Sequence ls Colinear vuith the Amino Acid Sequence of the Encoded Polypeptide As you know DNA is a linear molecule with base pairs fol-
lowing one another down the intertwined chains. Proteins, by contrast, have complicated three-dimensional structures. Even so, if unfolded and stretched out from N terminus to C terminus, proteins have a one-dimensional, linear structure-a specific sequence of amino acids. If the information in a gene and its corresponding protein are colinear, the consecutive order of bases in the DNA from the beginning to the end of the gene would stipulate the consecutive order of amino acids from one end to the other of the outstretched protein. In the 1960s, Charles Yanofslg was the first to compare maps of mutations within a gene to the particular amino acid substitutions that resulted. He began by generating a large
8.1 The Genetic
number of trp- auxotrophic mutants in E. coli that carried mutations inthe trpA gene for a subunit of the enzyme tryp-
Code
257
acid (Glu) at the same position. In another example, muta-
tion 78 changed the glycine at position 234 lo cysteine
tophan synthetase. He next made a fine structure recombinational map of these mutations analogous to Benzer's fine structure map for the rII region of bacteriophage T4, which was discussed in Chapter 7. Yanofsky then purified and determined the amino acid sequence of the mutant tryptophan slmthetase subunits. As Fig. 8.3a illustrates, his data showed that the order of mutations mapped within the DNA of the gene by recombination was indeed colinear with the positions of the amino acid substitutions occurring in the resulting mutant proteins. By carefully examining the results of his analysis, Yanofsky deduced key features ofthe relationship between nucleotides and amino acids, in addition to his confirmation of colinearity.
(Cys), while mutation 58 produced aspartic acid (Asp) at the same position. These are all missense mutations that change a codon for one amino acid into a codon that specifies a different amino acid. In both cases, Yanofslcyalso found that recombination could occur between the two mutations that changed the identity of the same amino acid; such
Evidence that a codon is composed of more than one nucleotide
Evidence that each nucleotide is part of only one codon As Fig. 8.3a illustrates, each of the point mutations in the tryptophan synthetase gene characterized by Yanofsky alters the identity of only a single amino acid. This is also true of the point mutations examined in many other genes, such as the human genes for rhodopsin and hemoglobin
Yanofsky observed that point mutations altering different nucleotide pairs may affect the same amino acid. In one example shown in Fig. 8.3a, mutation 23 changed the glycine (Gly) at position 2Il of the wild-type polypeptide chain to arginine (Arg), while mutation 46 yielded glutamic
recombination would produce a wild-type tryptophan synthetase gene (Fig. 8.3b). Because the smallest unit of recombination is the base pair, two mutations capable of recombination-in this case, in the same codon because
they affect the same amino acid-must be in different (although nearby) nucleotides. Thus, a codon must contain more than one nucleotide.
Figure 8.3 Mutations in a gene are colinear with the sequence of amino acids in the encoded polypeptide. (a) The relationship between the genetic map of E. coli's trpA gene and the positions of amino acid substitutions in mutant tryptophan synthetase proteins. (b) Codons must include two or more base pairs. When two mutant strains with different amino acids at the same position were crossed, recombination could produce a wild-type allele.
[Figure 8.3 panels: (a) Colinearity of genes and proteins; (b) Recombination within a codon.]
Chapter 8 Gene Expression: The Flow of Information from DNA to RNA to Protein
(see Chapter 7). Because point mutations that change only a single nucleotide pair affect only a single amino acid in a polypeptide, each nucleotide in a gene must influence the identity of only a single amino acid. In contrast, if a nucleotide were part of more than one codon, a mutation in that nucleotide would affect more than one amino acid.
Nonoverlapping Triplet Codons Are Set in a Reading Frame
Although the most efficient code to specify 20 amino acids requires three nucleotides per codon, more complicated scenarios are possible. But in 1955, Francis Crick and Sydney Brenner obtained convincing evidence for the triplet nature of the genetic code in studies of mutations in the bacteriophage T4 rIIB gene originally characterized by Seymour Benzer (Chapter 7). They induced the mutations with proflavin, an intercalating mutagen that can insert itself between the paired bases stacked in the center of the DNA molecule (recall Fig. 7.14c, pp. 219-220). Their original assumption was that proflavin would act like other mutagens, causing single-base substitutions. If this were true, it would be possible to generate revertants through treatment with other mutagens that might restore the wild-type DNA sequence. Surprisingly, genes with proflavin-induced mutations did not revert to wild-type upon treatment with other mutagens known to cause nucleotide substitutions. Only further exposure to proflavin caused proflavin-induced mutations to revert to wild-type. Crick and Brenner had to explain this observation before they could proceed with their phage experiments. With keen insight, they correctly guessed that proflavin does not cause base substitutions; instead, it causes insertions or deletions of a single base pair. This hypothesis explained why base-substituting mutagens could not cause reversion of proflavin-induced mutations.
Evidence for a triplet code
Crick and Brenner began their experiments with a particular proflavin-induced rIIB- mutation they called FC0. They next treated this mutant strain with more proflavin to isolate an rIIB+ revertant (Fig. 8.4a). By recombining this revertant with wild-type bacteriophage T4, Crick and Brenner were able to show that the revertant's chromosome actually contained two different rIIB- mutations (Fig. 8.4b). One was the original FC0 mutation; the other was the newly induced FC7. Either mutation by itself yields a mutant phenotype, but their simultaneous occurrence in the same gene yielded an rIIB+ phenotype. Crick and Brenner reasoned that if the first mutation was the addition of a single base pair, represented by the symbol (+), then the counteracting mutation must be the deletion of a base pair, represented as (-). The restoration of gene function by one mutation canceling another in the same gene is known as intragenic suppression.
Crick and Brenner supposed not only that each codon is a trio of nucleotides, but that each gene has a single starting point. This starting point establishes a reading frame: the sequential partitioning of nucleotides into groups of three to generate the correct order of amino acids in the resulting polypeptide chain (Fig. 8.4a). Changes that alter the grouping of nucleotides into codons are called frameshift mutations; they shift the reading frame for all codons beyond the point of insertion or deletion, almost always abolishing the function of the polypeptide product. If codons are read in order from a fixed starting point, a deletion (-) can counterbalance an insertion (+) to restore the reading frame (Fig. 8.4a). (Note that the gene would regain its wild-type activity only if the portion of the polypeptide encoded between the two mutations of opposite sign is not required for protein function, because in the double mutant, this region would have an improper amino acid sequence.)
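The reading-frame argument above can be illustrated with a short sketch. The mRNA sequence and the mutation positions below are hypothetical, chosen only for illustration (they are not taken from the rIIB gene), and the `translate`, `insert_base`, and `delete_base` helpers are names introduced here. The sketch reads triplets from a fixed starting point and shows a (+) insertion shifting every downstream codon, and a later (-) deletion restoring the original frame.

```python
from itertools import product

# Standard genetic code, built from the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna):
    """Read nonoverlapping triplets from position 0 until a stop codon ('*')."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)

def insert_base(seq, pos, base):
    """Return seq with one base added before index pos: a (+) mutation."""
    return seq[:pos] + base + seq[pos:]

def delete_base(seq, pos):
    """Return seq with the base at index pos removed: a (-) mutation."""
    return seq[:pos] + seq[pos + 1:]

# Hypothetical sequence chosen for illustration; it is not the rIIB gene.
wild_type = "AUGGCUGAAUUCAAAGGUCCUCAUUGGUAA"
plus = insert_base(wild_type, 6, "C")   # single (+) mutation: frame shifted
plus_minus = delete_base(plus, 15)      # downstream (-) mutation: frame restored

for name, seq in [("wild type", wild_type), ("+", plus), ("+ -", plus_minus)]:
    print(f"{name:10s} {translate(seq)}")
```

In this toy example only the three amino acids encoded between the two mutations differ from the wild-type polypeptide in the double mutant, which mirrors the parenthetical note above: suppression restores function only if that altered stretch is dispensable.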
Crick and Brenner realized that they could use + and - mutations in rIIB to test the hypothesis that codons were indeed nucleotide triplets. If codons are composed of three nucleotides, then combining two different rIIB mutations of the same sign (+ + or - -) in the same gene should never lead to intragenic suppression (an rIIB+ phenotype). Combinations of three + or three - mutations, however, should sometimes result in an rIIB+ revertant. These predictions were exactly verified by the results (Fig. 8.4c).
Evidence that most amino acids are specified by more than one codon
As Fig. 8.4c illustrates, intragenic suppression occurs only if, in the region between two frameshift mutations of opposite sign, a gene still dictates the appearance of amino acids, even if these amino acids are not the same as those appearing in the normal protein. If the frameshifted part of the gene instead encodes instructions to stop protein synthesis by introducing a triplet that does not correspond to any amino acid, then production of a functional polypeptide will not be possible. The reason is that polypeptide synthesis would stop before the compensating mutation could reestablish the correct reading frame. The fact that intragenic suppression occurs as often as it does suggests that the code includes more than one codon for some amino acids. Recall that there are 20 common amino acids but 4^3 = 64 different combinations of three nucleotides. If each amino acid corresponded to only a single codon, there would be 64 - 20 = 44 possible triplets not encoding an amino acid. These noncoding triplets would act as "stop" signals and prevent further polypeptide synthesis. In this scenario, more than half of all frameshift mutations (44/64) would cause protein synthesis to stop at the first codon after the mutation, and the chances of
8.1 The Genetic Code
Figure 8.4 Studies of frameshift mutations in the bacteriophage T4 rIIB gene showed that codons consist of three nucleotides. (a) Treatment with proflavin produces an rIIB- frameshift mutation at one site (FC0) by insertion of a single nucleotide; the reading frame of all codons downstream of the insertion is shifted (yellow). A second proflavin exposure results in a second mutation (FC7), deletion of a single nucleotide within the same gene, which suppresses FC0 by restoring the proper reading frame (green). (b) When the revertant is crossed with a wild-type strain, crossing-over separates the two rIIB frameshift mutations (FC0 and FC7) onto separate DNA molecules. The reversion to an rIIB+ phenotype was thus the result of intragenic suppression. (c) When recombined onto a single DNA molecule, two addition (++) or two deletion (--) mutations do not supply rIIB+ function, but the reading frame is restored by three mutations of the same sign (+++ or ---).
[Figure 8.4 panels: (a) Intragenic suppression of FC0 by FC7; (b) rIIB+ revertant × wild type yields rIIB- recombinants; (c) Codons are triplets.]
extending the protein would diminish exponentially with each additional amino acid. As a result, intragenic suppression would rarely occur. However, we have seen that many frameshift mutations of one sign can be offset by mutations of the other sign. The distances between these mutations, estimated by recombination frequencies, are in some cases large enough to code for more than 50 amino acids, which would be possible only if most of the 64 possible triplet codons specified amino acids. Thus, the data of Crick and Brenner provide strong support for the idea that the genetic code is degenerate: Two or more nucleotide triplets specify most of the 20 amino acids (see the genetic code in Fig. 8.2 on p. 256).
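The size of the effect in this argument can be estimated with a rough calculation. The sketch below assumes, purely for illustration, that every triplet in a shifted reading frame is equally likely to be any of the 64 possibilities; real sequences are not random, so the numbers indicate only orders of magnitude. The function name `p_read_through` is introduced here.

```python
def p_read_through(n_codons, n_sense, n_total=64):
    """Chance that n_codons consecutive random triplets all specify amino acids."""
    return (n_sense / n_total) ** n_codons

# Hypothetical code with one codon per amino acid: 20 sense, 44 stop triplets.
one_codon_each = p_read_through(50, n_sense=20)

# Degenerate code as actually observed: 61 sense, 3 stop triplets.
degenerate = p_read_through(50, n_sense=61)

print(f"one codon per amino acid: {one_codon_each:.2e}")
print(f"degenerate code:          {degenerate:.2e}")
```

Under these assumptions, reading through 50 shifted codons would be vanishingly unlikely with a one-codon-per-amino-acid code, but happens at an appreciable frequency when 61 of the 64 triplets specify amino acids.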
Cracking the Code: Which Codons Represent Which Amino Acids?
Although the genetic experiments just described allowed remarkably prescient insights about the nature of the genetic code, they did not establish a correspondence between specific codons and specific amino acids. The discovery of messenger RNA and the development of techniques for synthesizing simple messenger RNA molecules had to occur first, so that researchers could manufacture simple proteins in the test tube.
The discovery of messenger RNAs
In the 1950s, researchers exposed eukaryotic cells to amino acids tagged with radioactivity and observed that protein synthesis incorporating the radioactive amino acids into polypeptides takes place in the cytoplasm, even though the genes for those polypeptides are sequestered in the cell nucleus. From this discovery, they deduced the existence of an intermediate molecule, made in the nucleus and capable of transporting DNA sequence information to the cytoplasm, where it can direct protein synthesis. RNA was a prime candidate for this intermediary information-carrying molecule. Because of RNA's potential for base pairing with a strand of DNA, one could imagine the cellular machinery copying a strand of DNA into a complementary strand of RNA in a manner analogous to the DNA-to-DNA copying of DNA replication. Subsequent studies in eukaryotes using radioactive uracil, a base found only in RNA, showed that although the molecules are synthesized in the nucleus, at least some of them migrate to the cytoplasm. Among those RNA molecules that migrate to the cytoplasm are the messenger RNAs, or mRNAs, depicted in Fig. 8.1 on p. 255. They arise in the nucleus from the transcription of DNA sequence information and then move (after processing) to the cytoplasm, where they determine the proper order of amino acids during protein synthesis.
Figure 8.5 How geneticists used synthetic mRNAs to limit the coding possibilities. (a) Poly-U mRNA generates a polyphenylalanine polypeptide. (b) Polydi-, polytri-, and polytetra-nucleotides encode simple polypeptides. Some synthetic mRNAs, such as poly-GUAA, contain stop codons in all three reading frames and thus specify the construction only of short peptides.
[Figure 8.5 panel (a): Poly-U mRNA encodes polyphenylalanine.]
Using synthetic mRNAs and in vitro translation
Knowledge of mRNA served as the framework for two experimental breakthroughs that led to the deciphering of the genetic code. In the first, biochemists obtained cellular extracts that, with the addition of mRNA, synthesized polypeptides in a test tube. They called these extracts "in vitro translational systems." The second breakthrough was the development of techniques enabling the synthesis of artificial mRNAs containing only a few codons of known composition. When added to in vitro translational systems, these simple, synthetic mRNAs directed the formation of very simple polypeptides. In 1961, Marshall Nirenberg and Heinrich Matthaei added a synthetic poly-U (5' . . . UUUUUUUUUUUU . . . 3') mRNA to a cell-free translational system derived from E. coli. With the poly-U mRNA, phenylalanine (Phe) was the only amino acid incorporated into the resulting polypeptide (Fig. 8.5a). Because UUU is the only possible triplet in poly-U, UUU must be a codon for phenylalanine. In a similar fashion, Nirenberg and Matthaei showed that CCC encodes proline (Pro), AAA is a codon for lysine (Lys), and GGG encodes glycine (Gly) (Fig. 8.5b).
[Figure 8.5 panel (a) labels: synthetic mRNA; in vitro translational system plus radioactive amino acids; analyze radioactive polypeptides synthesized.]
(b) Analyzing the coding possibilities.
Synthetic mRNA                      Polypeptides synthesized
                                    Polypeptides with one amino acid
poly-U     UUUU...                  Phe-Phe-Phe...
poly-C     CCCC...                  Pro-Pro-Pro...
poly-A     AAAA...                  Lys-Lys-Lys...
poly-G     GGGG...                  Gly-Gly-Gly...
Repeating dinucleotides             Polypeptides with alternating amino acids
poly-UC    UCUCUC...                Ser-Leu-Ser-Leu...
poly-AG    AGAGAG...                Arg-Glu-Arg-Glu...
poly-UG    UGUGUG...                Cys-Val-Cys-Val...
poly-AC    ACACAC...                Thr-His-Thr-His...
Repeating trinucleotides            Three polypeptides each with one amino acid
poly-UUC   UUCUUCUUC...             Phe-Phe... and Ser-Ser... and Leu-Leu...
poly-AAG   AAGAAGAAG...             Lys-Lys... and Arg-Arg... and Glu-Glu...
poly-UUG   UUGUUGUUG...             Leu-Leu... and Cys-Cys... and Val-Val...
poly-UAC   UACUACUAC...             Tyr-Tyr... and Thr-Thr... and Leu-Leu...
Repeating tetranucleotides          Polypeptides with repeating units of four amino acids
poly-UAUC  UAUCUAUC...              Tyr-Leu-Ser-Ile-Tyr-Leu-Ser-Ile...
poly-UUAC  UUACUUAC...              Leu-Leu-Thr-Tyr-Leu-Leu-Thr-Tyr...
poly-GUAA  GUAAGUAA...              none
poly-GAUA  GAUAGAUA...              none
The chemist Har Gobind Khorana later made mRNAs with repeating dinucleotides, such as poly-UC (5' . . . UCUCUCUC . . . 3'), repeating trinucleotides, such as poly-UUC, and repeating tetranucleotides, such as poly-UAUC, and used them to direct the synthesis of slightly more complex polypeptides. As Fig. 8.5b shows, his results limited the coding possibilities, but some ambiguities remained. For example, poly-UC encodes the polypeptide N . . . Ser-Leu-Ser-Leu-Ser-Leu . . . C in which serine and leucine alternate with each other. Although the mRNA contains only two different codons (5' UCU 3' and 5' CUC 3'), it is not obvious which corresponds to serine and which to leucine.
Nirenberg and Philip Leder resolved these ambiguities in 1965 with experiments in which they added short, synthetic mRNAs only three nucleotides in length to an in vitro translational system containing tRNAs attached to amino acids, where only one of the 20 amino acids was radioactive. They then poured through a filter a mixture of a synthetic mRNA and the translational system containing a tRNA-attached, radioactively labeled amino acid (Fig. 8.6). tRNAs carrying an amino acid normally go right through a filter. If, however, a tRNA carrying an amino acid binds to a ribosome, it will stick in the filter, because this larger complex of ribosome, amino-acid-carrying tRNA, and small mRNA cannot pass through. Nirenberg and Leder used this approach to see which small mRNA caused the entrapment of which radioactively labeled amino acid. For example, they knew from Khorana's earlier work that CUC encoded either serine or leucine. When they added the synthetic triplet CUC to an in vitro system where the radioactive amino acid was serine, this tRNA-attached amino acid passed through the filter, and the filter thus emitted no radiation. But when they added the same triplet to a system where the radioactive amino acid was leucine, the filter lit up with radioactivity, indicating that the radioactively tagged leucine attached to a tRNA had bound to the ribosome-mRNA complex and gotten stuck in the filter. CUC thus encodes leucine, not serine. Nirenberg and Leder used this technique to determine
Figure 8.6 Cracking the genetic code with mini-mRNAs. Nirenberg and Leder added trinucleotides of known sequence, in combination with a mixture of amino acid-charged tRNAs where only one amino acid was radioactive, to an in vitro extract containing ribosomes. If the trinucleotide specified this amino acid, the radioactive charged tRNA formed a complex with the ribosomes that could be trapped on a filter. The experiments shown here indicate that the codon CUC specifies leucine, not serine.
[Figure 8.6 labels: labeled Ser tRNA + synthetic trinucleotide; labeled Leu tRNA + synthetic trinucleotide; add ribosomes.]
most of the codon-amino acid correspondences shown in the genetic code table (see Fig. 8.2 on p. 256).
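The logic of the repeating-polymer experiments can be sketched in a few lines. The helper below, `frame_products`, is a name introduced here for illustration. It assumes that translation of a synthetic polymer can begin in any of the three reading frames and that the standard codon table applies, then reports the distinct products read from each frame.

```python
from itertools import product

# Standard genetic code, built from the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def frame_products(repeat_unit, n_codons=8):
    """Peptides (one-letter codes) read from each of the three reading frames."""
    # Build a polymer long enough to read n_codons triplets in every frame.
    mrna = repeat_unit * (3 * (n_codons + 1))
    products = set()
    for frame in range(3):
        peptide = []
        for i in range(frame, frame + 3 * n_codons, 3):
            aa = CODON_TABLE[mrna[i:i + 3]]
            if aa == "*":          # stop codon: the chain ends here
                break
            peptide.append(aa)
        products.add("".join(peptide))
    return products

for unit in ["U", "UC", "UUC", "GUAA"]:
    print(f"poly-{unit}: {sorted(frame_products(unit))}")
```

With these assumptions, poly-UC gives only alternating Ser and Leu, poly-UUC gives three different homopolymers, and poly-GUAA gives nothing longer than a few residues because a stop codon appears in every frame. The sketch also shows why poly-UC alone cannot say whether UCU or CUC is the serine codon: both frames yield the same alternating product.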
Polarities: 5' to 3' in mRNA corresponds to N to C in the polypeptide
In studies using synthetic mRNAs, when investigators added the six-nucleotide-long 5' AAAUUU 3' to an in vitro translational system, the product N Lys-Phe C emerged, but no N Phe-Lys C appeared. Because AAA is the codon for lysine and UUU is the codon for phenylalanine, this result means that the codon closest to the 5' end of the mRNA encoded the amino acid closest to the N terminus of the corresponding polypeptide. Similarly, the codon nearest the 3' end of the mRNA encoded the amino acid nearest the C terminus of the resulting polypeptide.
To understand how the polarities of the macromolecules participating in gene expression relate to each other, remember that although the gene is a segment of a DNA double helix, only one of the two strands serves as a template for the mRNA. This strand is known as the template strand. The other strand is the RNA-like strand, because it has the same polarity and sequence (written in the DNA dialect) as the RNA. Note that some scientists use the terms sense strand or coding strand as synonyms for the RNA-like strand; in these alternative nomenclatures, the template strand would be the antisense strand or the noncoding strand. Figure 8.7 diagrams the respective polarities of a gene's DNA, the mRNA transcript of that DNA, and the resulting polypeptide.
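The polarity relationships just described can be traced with the AAAUUU example from the text. In the sketch below, the function names and the convention of writing the template strand 3' to 5' (so that it lines up base by base with the 5'-to-3' mRNA) are choices made here for illustration.

```python
from itertools import product

# Standard genetic code, built from the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Base-pairing rules for copying a DNA template into RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Template strand written 3'->5' gives the mRNA written 5'->3'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)

def rna_like_strand(mrna_5_to_3):
    """The RNA-like DNA strand matches the mRNA, with T in place of U."""
    return mrna_5_to_3.replace("U", "T")

template = "TTTAAA"                 # template strand, written 3' to 5'
mrna = transcribe(template)         # mRNA, written 5' to 3'
peptide = [CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna), 3)]

print("RNA-like strand 5'", rna_like_strand(mrna), "3'")
print("template strand 3'", template, "5'")
print("mRNA            5'", mrna, "3'")
print("polypeptide     N ", "-".join(peptide), " C")
```

The first codon at the 5' end (AAA) gives the N-terminal lysine (K), and the codon at the 3' end (UUU) gives the C-terminal phenylalanine (F).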
Nonsense codons and polypeptide chain termination
Although most of the simple, repetitive RNAs synthesized by Khorana were very long and thus generated very long polypeptides, a few did not. These RNAs had signals that stopped construction of a polypeptide chain. As it turned out, three different triplets (UAA, UAG, and UGA) do not correspond to any of the amino acids. When these
Figure 8.7 Correlation of polarities in DNA, mRNA, and polypeptide. The template strand of DNA is complementary to both the RNA-like DNA strand and the mRNA. The 5'-to-3' direction in an mRNA corresponds to the N terminus-to-C terminus direction in the polypeptide.
[Figure labels. Figure 8.6: pour through filter; no radioactivity trapped in filter; radioactivity trapped in filter. Figure 8.7: DNA (RNA-like strand and template strand); mRNA; polypeptide (N to C).]
codons appear in frame, translation stops. As an example of how investigators established this fact, consider the case of poly-GUAA (review Fig. 8.5b). This mRNA will not generate a long polypeptide because in all possible reading frames, it contains the stop codon UAA. Sydney Brenner helped establish the identities of the stop codons in an alternative way, through ingenious experiments involving point mutations in a T4 phage gene (m) encoding a protein component of the phage head capsule. As shown in Fig. 8.8a, Brenner determined that certain mutant alleles (m1-m6) encoded truncated polypeptides that were shorter than the wild-type M protein. Brenner found that the last amino acid at the C terminus
Figure 8.8 Sydney Brenner's experiment showing that UAG is a stop signal. (a) The T4 phage m+ gene encodes a polypeptide M whose amino acids are shown with blue circles. Mutant alleles m1-m6 direct synthesis of truncated M proteins (black circles). In the wild-type M protein, the amino acid that follows the last amino acid in each truncated protein is encoded by a triplet that differs from UAG by a single nucleotide. (b) The genetic map positions of the m1-m6 mutations are colinear with the sizes of the corresponding truncated M proteins.
in each of the truncated proteins would have been followed in the normal, full-length protein by an amino acid specified by a codon that differed from the triplet UAG by a single nucleotide. These data suggested that each m mutant had a point mutation that changed a codon for an amino acid into the stop codon UAG. Such a mutation is called a nonsense mutation, because it changes a codon that signifies an amino acid (a sense codon) into one that does not (a nonsense codon). Brenner later established that a fine structure map of mutations m1-m6 corresponds in a linear manner to the size of the truncated polypeptide chains (Fig. 8.8b). It makes sense that the M protein encoded by m6, for example, is shorter than that encoded by m5 because the m6 nonsense mutation is closer to the beginning of the reading frame than m5.
Brenner also isolated analogous sets of nonsense mutations that defined UAA and UGA as stop codons. For historical reasons, researchers often refer to UAG as the amber codon, UAA as the ochre codon, and UGA as the opal codon. The historical basis of this nomenclature is the last name of one of the early investigators, Bernstein, which means "amber" in German; ochre and opal derive from their similarity with amber as semiprecious materials.
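Brenner's inference rests on a simple property of the code: only a limited set of sense codons lies one nucleotide substitution away from UAG. The sketch below enumerates that set using the standard codon table; the helper name `single_base_changes` is introduced here for illustration.

```python
from itertools import product

# Standard genetic code, built from the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def single_base_changes(codon):
    """All nine codons that differ from `codon` at exactly one position."""
    return [codon[:i] + b + codon[i + 1:]
            for i in range(3) for b in BASES if b != codon[i]]

# Sense codons that a single point mutation could convert into UAG.
precursors = sorted(c for c in single_base_changes("UAG")
                    if CODON_TABLE[c] != "*")

for codon in precursors:
    print(codon, CODON_TABLE[codon])
```

Of the nine single-substitution neighbors of UAG, eight are sense codons (the ninth is the stop codon UAA), so a nonsense mutation to UAG can arise only at positions that originally encoded one of a small number of amino acids.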
[Figure 8.8 panels: (a) Nonsense mutations, comparing M polypeptide length for the m+ gene and the mutant alleles m1-m6; (b) Fine structure map, with the mutations in the order m6, m5, m4, m3, m2, m1.]
The Genetic Code: A Summary
The genetic code is a complete, unabridged dictionary equating the four-letter language of the nucleic acids with the 20-letter language of the proteins. The following list summarizes the code's main features:
1. Triplet codons: As written in Fig. 8.2 on p. 256, the code shows the 5'-to-3' sequence of the three nucleotides in each mRNA codon; that is, the first nucleotide depicted is at the 5' end of the codon.
2. The codons are nonoverlapping. In the mRNA sequence 5' GAAGUUGAA 3', for example, the first three nucleotides (GAA) form one codon; nucleotides 4 through 6 (GUU) form the second; and so on. Each nucleotide is part of only one codon.
3. The code includes three stop, or nonsense, codons: UAG, UAA, and UGA. These codons do not usually encode an amino acid and thus terminate translation.
4. The code is degenerate, meaning that more than one codon may specify the same amino acid. The code is nevertheless unambiguous, because each codon specifies only one amino acid.
5. The cellular machinery scans mRNA from a fixed starting point that establishes a reading frame. As we see later, the nucleotide triplet AUG, which specifies the amino acid methionine wherever it appears in the reading frame, also serves as the initiation codon, marking where in an mRNA the code for a particular polypeptide begins.
6. Corresponding polarities of codons and amino acids: Moving in the 5'-to-3' direction along an mRNA, each successive codon is sequentially interpreted into an amino acid, starting at the N terminus and moving toward the C terminus of the resulting polypeptide.
7. Mutations may modify the message encoded in a sequence of nucleotides in three ways. Frameshift mutations are nucleotide insertions or deletions that alter the genetic instructions for polypeptide construction by changing the reading frame. Missense mutations change a codon for one amino acid to a codon for a different amino acid. Nonsense mutations change a codon for an amino acid to a stop codon.
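Several features in the summary above can be checked directly against the codon table. The sketch below makes two simplifying assumptions that are labeled here rather than taken from the text: the example mRNA is hypothetical (it embeds the GAAGUUGAA sequence from item 2 between a start and a stop codon), and translation is taken to begin at the first AUG found when scanning from the 5' end.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_from_start(mrna):
    """Translate 5'->3' from the first AUG (a simplification) to a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):   # nonoverlapping triplets
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":                          # UAA, UAG, or UGA
            break
        peptide.append(aa)
    return "".join(peptide)                    # written N terminus to C terminus

# Degeneracy: how many codons specify each of the 20 amino acids.
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

# Hypothetical mRNA: GG + AUG + GAAGUUGAA + UAG + CC
example = "GGAUGGAAGUUGAAUAGCC"
print(translate_from_start(example))
print(dict(codons_per_amino_acid))
```

Because the table is a dictionary keyed by codon, each codon maps to exactly one amino acid (the code is unambiguous), while the counts show that most amino acids are reached from several codons (the code is degenerate).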
The Effects of Mutations on Polypeptides Helped Verify the Code
The experiments that cracked the genetic code by assigning codons to amino acids were all in vitro studies using cell-free extracts and synthetic mRNAs. A logical question thus arose: Do living cells construct polypeptides according to the same rules? Early evidence that they do came from studies analyzing how mutations actually affect the amino acid composition of the polypeptides encoded by a gene.
Most mutagens change a single nucleotide in a codon. As a result, most missense mutations that change the identity of a single amino acid should be single-nucleotide substitutions, and analyses of these substitutions should conform to the code. Yanofsky, for example, found two trp- auxotrophic mutations in the E. coli tryptophan synthetase gene that produced two different amino acids (arginine, or Arg, and glutamic acid, or Glu) at the same position, amino acid 211, in the polypeptide chain (Fig. 8.9a). According to the code, both of these mutations could have resulted from single-base changes in the GGA codon that normally inserts glycine (Gly) at position 211.
Even more informative were the trp+ revertants of these mutations subsequently isolated by Yanofsky. As Fig. 8.9a illustrates, single-base substitutions could also explain the amino acid changes in these revertants. Note that some of these substitutions restore Gly to position 211 of the polypeptide, while others place amino acids such as Ile, Thr, Ser, Ala, or Val at this site in the tryptophan synthetase molecule. The substitution of these other amino acids for Gly at position 211 in the polypeptide chain is compatible with (that is, largely conserves) the enzyme's function.
Yanofsky obtained better evidence yet that cells use the genetic code in vivo by analyzing proflavin-induced frameshift mutations of the tryptophan synthetase gene (Fig. 8.9b). He first treated populations of E. coli with proflavin to produce trp- mutants. Subsequent treatment of these mutants with more proflavin generated some trp+ revertants among the progeny. The most likely explanation for the revertants was that their tryptophan synthetase gene carried both a single-base-pair deletion and a single-base-pair insertion (- +). Upon determining the amino acid sequences of the tryptophan synthetase enzymes made by the revertant strains, Yanofsky found that he could use the genetic code to predict the precise amino acid alterations that had occurred by assuming the revertants had a specific single-base-pair insertion and a specific single-base-pair deletion (Fig. 8.9b). Yanofsky's results helped confirm not only amino acid codon assignments but other parameters of the code as well. His interpretations make sense only if codons do not overlap and are read from a fixed starting point, with no pauses or commas separating the adjacent triplets.
Figure 8.9 Experimental verification of the genetic code. (a) Single-base substitutions can explain the amino acid substitutions caused by trp- mutations and trp+ reversions. (b) The genetic code predicts the amino acid alterations (yellow) that would arise from single-base-pair deletions and suppressing insertions.
(a) Altered amino acids in trp- mutations and trp+ revertants (position 211 in polypeptide)
Amino acid in wild-type polypeptide (codon): Gly (GGA)
Mutations, amino acid in mutant polypeptide (codon): Arg (AGA); Glu (GAA)
Reversions from Arg (AGA): Ile (AUA), Thr (ACA), Ser (AGC or AGU), Gly (GGA)
Reversions from Glu (GAA): Ala (GCA), Gly (GGA), Val (GUA)
(b) Amino acid alterations that accompany intragenic suppression
[Figure 8.9b panels: wild-type mRNA and polypeptide; double mutant mRNA and polypeptide.]
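Yanofsky's position-211 results can be checked against the code mechanically. The sketch below asks, for each starting codon, which codons for an observed amino acid lie exactly one base substitution away. The `explaining_codons` helper is a name introduced here, and the observed substitutions are those listed for Fig. 8.9a.

```python
from itertools import product

# Standard genetic code, built from the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def explaining_codons(start_codon, amino_acid):
    """Codons for `amino_acid` reachable from `start_codon` by one substitution."""
    hits = []
    for i in range(3):
        for b in BASES:
            if b == start_codon[i]:
                continue
            mutant = start_codon[:i] + b + start_codon[i + 1:]
            if CODON_TABLE[mutant] == amino_acid:
                hits.append(mutant)
    return sorted(hits)

# Observed changes at position 211 (one-letter amino acid codes).
observed = {
    "GGA": ["R", "E"],            # wild-type Gly -> Arg or Glu
    "AGA": ["I", "T", "S", "G"],  # Arg mutant -> revertants
    "GAA": ["A", "G", "V"],       # Glu mutant -> revertants
}

for codon, amino_acids in observed.items():
    for aa in amino_acids:
        print(f"{codon} ({CODON_TABLE[codon]}) -> {aa}: {explaining_codons(codon, aa)}")
```

Every observed change has at least one single-substitution explanation, which is the consistency with the code that the text describes. Note that GGA can reach Arg through either AGA or CGA; the figure assigns AGA to the Arg mutant.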
The Genetic Code Is Almost, but Not Quite, Universal
We now know that virtually all cells alive today use the same basic genetic code. One early indication of this uniformity was that a translational system derived from one organism could use the mRNA from another organism to convert genetic information to the encoded protein. Rabbit hemoglobin mRNA, for example, when injected into frog eggs or added to cell-free extracts from wheat germ, directs the synthesis of rabbit hemoglobin proteins. More recently, comparisons of DNA and protein sequences have revealed a perfect correspondence according to the genetic code between codons and amino acids in almost all organisms examined.
Conservation of the genetic code
The universality of the code is an indication that it evolved very early in the history of life. Once it emerged, it remained constant over billions of years, in part because evolving organisms would have little tolerance for change. A single change in the genetic code could disrupt the production of hundreds or thousands of proteins in a cell (from the DNA polymerase that is essential for replication to the RNA polymerase that is required for gene expression to the tubulin proteins that compose the mitotic spindle), and such a change would therefore be lethal.
Exceptional genetic codes
Researchers were thus quite amazed to observe a few exceptions to the universality of the code. In some species of the single-celled eukaryotic protozoans known as ciliates, the codons UAA and UAG, which are nonsense codons in most organisms, specify the amino acid glutamine; in other ciliates, UGA, the third stop codon in most organisms, specifies cysteine. Ciliates use the remaining nonsense codons as stop codons.
Other systematic changes in the genetic code exist in mitochondria, the semiautonomous, self-reproducing organelles within eukaryotic cells that are the sites of ATP formation. Each mitochondrion has its own chromosomes and its own apparatus for gene expression (which we describe in detail in Chapter 14). In the mitochondria of yeast, for example, CUA specifies threonine instead of leucine. Yet another exception to the code is seen in certain prokaryotes which sometimes use the triplet UAG to specify insertion of the rare amino acid pyrrolysine (see Fig. 7.28c on p. 237 and Problem 20 at the end of this chapter).
The experimental evidence presented so far helped define a nearly universal genetic code. But although cracking the code made it possible to understand the broad outlines of information flow between gene and protein, it did not explain exactly how the cellular machinery accomplishes gene expression. This is our focus as we present in the next sections the details of transcription and translation.
essential concepts
• The nearly universal genetic code consists of 64 codons, each one composed of three nucleotides. Sixty-one codons specify amino acids, while three (UAA, UAG, UGA) are stop codons. The code is degenerate in that more than one codon can specify an amino acid.
• The codon AUG specifies methionine; it also serves as the initiation codon establishing the reading frame that groups nucleotides into nonoverlapping codon triplets.
• Missense mutations change a codon so that it specifies a different amino acid; frameshift mutations alter the reading frame for all codons following the mutation; and nonsense mutations change a codon for an amino acid into a stop codon.
8.2 Transcription: From DNA to RNA
learning objectives
1. Describe the three stages of transcription: initiation, elongation, and termination.
2. Compare transcription initiation in prokaryotes and eukaryotes.
3. List three ways by which eukaryotes process mRNA after transcription.
Transcription is the process by which the polymerization of ribonucleotides guided by complementary base pairing produces an RNA transcript of a gene. The template for the RNA transcript is one strand of that portion of the DNA double helix that composes the gene.
RNA Polymerase Synthesizes a Single-Stranded RNA Copy of a Gene
Figure 8.10 depicts the basic components of transcription and illustrates key events in the process as it occurs in the bacterium E. coli. This figure divides transcription into
Transcription in Bacterial Cells
[Feature figure panels: (a) The initiation of transcription; (b) Elongation; (c) Termination. Labels: RNA polymerase core enzyme, σ factor, promoter, termination region, nascent mRNA, RNA-like strand, template strand, σ factor released, direction of transcription, transcription bubble, DNA rewinds, RNA polymerase movement, hairpin loop termination signal, RNA polymerase released at terminator.]
FEATURE FIGURE 8.10
(a) The Initiation of Transcription
1. RNA polymerase binds to double-stranded DNA at the beginning of the gene to be copied. RNA polymerase recognizes and binds to promoters, specialized DNA sequences near the beginning of a gene where transcription will start. Although specific promoters vary substantially, all promoters in E. coli contain two characteristic short sequences of 6-10 nucleotide pairs that help bind RNA polymerase (Fig. 8.11a). In bacteria, the complete RNA polymerase (the holoenzyme) consists of a core enzyme plus a σ (sigma) subunit involved only in initiation. The σ subunit reduces RNA polymerase's general affinity for DNA but simultaneously increases RNA polymerase's affinity for the promoter. As a result, the RNA polymerase holoenzyme can home in on a promoter and bind tightly to it, forming a so-called closed promoter complex.
2. After binding to the promoter, RNA polymerase unwinds part of the double helix, exposing unpaired bases on the template strand. The complex formed between the RNA polymerase holoenzyme and an unwound promoter is called an open promoter complex. The enzyme identifies the template strand and chooses the two nucleotides with which to initiate copying. Guided by base pairing with these two nucleotides, RNA polymerase aligns the first two ribonucleotides of the new RNA, which will be at the 5' end of the final RNA product. The DNA transcribed as the 5' end of the mRNA is often called the 5' end of the gene. RNA polymerase then catalyzes the formation of a phosphodiester bond between the first two ribonucleotides. Soon thereafter, the RNA polymerase releases the σ subunit. This release marks the end of initiation.
(b) Elongation: Constructing an RNA Copy of the Gene
1. When the σ subunit separates from the RNA polymerase, the enzyme loses its enhanced affinity for the promoter sequence and regains its strong generalized affinity for any DNA. These changes enable the core enzyme to leave the promoter yet remain bound to the gene. The core enzyme now moves along the chromosome, unwinding the double helix to expose the next single-stranded region of the template. The enzyme extends the RNA by linking a ribonucleotide, positioned by its complementarity with the template strand, to the 3' end of the growing chain. As the enzyme extends the mRNA in the 5'-to-3' direction, it moves in the antiparallel 3'-to-5' direction along the DNA template strand. RNA polymerase synthesizes RNA at an average speed of about 50 nucleotides per second. The region of DNA unwound by RNA polymerase is called the transcription bubble. Within the bubble, the nascent RNA chain remains base paired with the DNA template, forming a DNA-RNA hybrid. However, in those parts of the gene behind the bubble that have already been transcribed, the DNA double helix re-forms, displacing the RNA, which hangs out of the transcription complex as a single strand with a free 5' end.
2. Once an RNA polymerase has moved off the promoter, other RNA polymerase molecules can move in to initiate transcription. If the promoter is very strong, that is, if it can rapidly attract RNA polymerase, the gene can undergo transcription by many RNA polymerases simultaneously. Here we show an electron micrograph and an artist's interpretation of simultaneous transcription by several RNA polymerases. As you can see, the promoter for this gene lies very close to where the shortest RNA is emerging from the DNA. Geneticists often use the direction traveled by RNA polymerase as a reference when discussing various features within a gene. If, for example, you started at the 5' end of a gene at point A and moved along the gene in the same direction as RNA polymerase to point B, you would be traveling in the downstream direction. If, by contrast, you started at point B and moved in the opposite direction to point A, you would be traveling in the upstream direction.
(c) Termination: The End of Transcription
RNA sequences that signal the end of transcription are known as terminators.
There are two types of terminators: intrinsic terminators, which cause the RNA polymerase core enzyme to terminate transcription on its own, and extrinsic terminators, which require proteins other than RNA polymerase (particularly a polypeptide known as rho) to bring about termination. All terminators, whether intrinsic or extrinsic, are specific sequences in the mRNA that are transcribed from specific DNA regions. Terminators often form hairpin loops (also called stem loops) in which nucleotides within the mRNA pair with nearby complementary nucleotides in the same molecule. Upon termination, RNA polymerase and a completed RNA chain are both released from the DNA.
successive phases of initiation, elongation, and termination. The following four points are of particular importance:
1. The enzyme RNA polymerase catalyzes transcription.
2. DNA sequences near the beginning of genes, called promoters, signal RNA polymerase where to begin transcription. Most bacterial gene promoters have almost identical nucleotide sequences in each of two short regions (Fig. 8.11a). These are the sites at which RNA polymerase makes particularly strong contact with the promoters.
3. RNA polymerase adds nucleotides to the growing RNA polymer in the 5'-to-3' direction. The chemical mechanism of this nucleotide-adding reaction is
in bacteria can be initiated by alternative sigma (σ) factors. In eukaryotes, promoters are more complicated than those in bacteria, and three different kinds of RNA polymerase exist that can transcribe different classes of genes. One of these is eukaryotic RNA polymerase II (pol II), which transcribes genes that encode proteins. Figure 8.11b illustrates the general structure of the DNA regions of eukaryotic genes that allow pol II to initiate transcription. A key difference with prokaryotes is that sequences called enhancers that can be thousands of base pairs away from the promoter are often also required for the efficient transcription of eukaryotic genes. Chapters 15 and 16 will describe how prokaryotic and
only the sequence of the RNA-like strand is shown; numbering starts at the first transcribed nucleotide (+1). (a) All promoters in E. coli share two different short stretches of nucleotides (yellow) that are essential for recognition of the promoter by RNA polymerase. The most common nucleotides at each position in each stretch constitute the consensus sequences shown. (b) Eukaryotic genes transcribed by RNA pol II have a promoter, and also one or more distant DNA elements called enhancers (orange) that bind to protein factors aiding transcription.
[Figure 8.11 panels: (a) Transcription initiation region in bacterial genes, showing the promoter consensus sequences TTGACA and TATAAT upstream of the primary transcript (mRNA). (b) Transcription initiation region in eukaryotic genes transcribed by pol II, showing an enhancer 100s-1000s of bp from the promoter, the +1 site, and the primary transcript.]
similar to the formation of phosphodiester bonds between nucleotides during DNA replication (review Fig. 6.21 on p. 190), with one exception: Transcription uses ribonucleotide triphosphates (ATP, CTP, GTP, and UTP) instead of deoxyribonucleotide triphosphates. Hydrolysis of the high-energy bonds in each ribonucleotide triphosphate provides the energy needed for elongation.
4. Sequences in the RNA products, known as terminators, tell RNA polymerase where to stop transcription.
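The role of the two promoter consensus sequences in point 2 can be illustrated with a short Python sketch. It scans the RNA-like strand for a near match to TTGACA followed, after a spacer, by a near match to TATAAT. The function names, the 15-19 bp spacer range, the one-mismatch tolerance, and the example sequence are all assumptions made for illustration; none of them comes from the text, and real promoter recognition is considerably more involved.

```python
# Consensus hexamers shown in Fig. 8.11a (RNA-like strand, 5'-to-3').
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(dna, max_mismatch=1, spacers=range(15, 20)):
    """Return (start_of_first_box, start_of_second_box) index pairs.

    A hit needs a near match to TTGACA, then a spacer whose length is in
    `spacers` (an assumed range), then a near match to TATAAT.
    """
    hits = []
    n = len(MINUS_35)
    for i in range(len(dna) - n + 1):
        if mismatches(dna[i:i + n], MINUS_35) > max_mismatch:
            continue
        for gap in spacers:
            j = i + n + gap
            box = dna[j:j + n]
            if len(box) == n and mismatches(box, MINUS_10) <= max_mismatch:
                hits.append((i, j))
    return hits


# Hypothetical sequence: both consensus boxes separated by a 17-bp spacer.
example = "GGC" + "TTGACA" + "A" * 17 + "TATAAT" + "GCATGCA"
print(find_promoters(example))
```

Allowing a mismatch reflects the statement that specific promoters vary substantially even though the most common nucleotide at each position defines the consensus.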
eukaryotic cells can exploit these and other variations to control when, where, and at what level a given gene is expressed. Finally, the Genetics and Society Box "HIV and Reverse Transcription" on p. 268 describes how the AIDS virus uses an exceptional form of transcription, known as reverse transcription, to construct a double strand of DNA from an RNA template.
The result of transcription is a single strand of RNA known as a primary transcript (see Figs. 8.10 and 8.11). In prokaryotic organisms, the RNA produced by transcription is the actual messenger RNA that guides protein synthesis. In eukaryotic organisms, by contrast, most primary transcripts undergo RNA processing in the nucleus before they migrate to the cytoplasm to direct protein synthesis. This processing has played a fundamental role in the evolution of complex organisms.
In Eukaryotes, RNA Processing After Transcription Produces a Mature mRNA
Some RNA processing in eukaryotes modifies only the 5' or
As you examine Fig. 8.10, bear in mind that a gene consists of two antiparallel strands of DNA, as mentioned earlier. One, the RNA-like strand, has the same polarity and sequence (except for T instead of U) as the emerging RNA transcript. The second, the template strand, has the opposite polarity and a complementary sequence that enables it to serve as the template for making the RNA transcript. When geneticists refer to the sequence of a gene, they usually mean the sequence of the RNA-like strand.
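The relationship between the two DNA strands and the transcript can be checked with a minimal Python sketch. It builds the template strand from a hypothetical RNA-like strand, "transcribes" the template by base pairing, and confirms that the RNA matches the RNA-like strand with U in place of T. The function names and the nine-nucleotide example sequence are illustrative assumptions.

```python
# Base-pairing rules: DNA-to-DNA for building the template strand,
# DNA-to-RNA for transcription (A in the template pairs with U in the RNA).
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def template_strand(rna_like_5_to_3):
    """Return the template strand, written 3'-to-5' so that each base
    sits under its partner in the RNA-like strand."""
    return "".join(DNA_COMPLEMENT[base] for base in rna_like_5_to_3)


def transcribe(template_3_to_5):
    """Read the template 3'-to-5' and build the RNA 5'-to-3'."""
    return "".join(RNA_COMPLEMENT[base] for base in template_3_to_5)


rna_like = "ATGGCTTAA"                 # hypothetical RNA-like strand, 5'-to-3'
template = template_strand(rna_like)   # written 3'-to-5'
rna = transcribe(template)             # written 5'-to-3'

print(template)
print(rna)
# The transcript equals the RNA-like strand except for U instead of T.
print(rna == rna_like.replace("T", "U"))
```

Writing the template 3'-to-5' keeps the antiparallel arrangement explicit: the polymerase moves 3'-to-5' along the template while the RNA grows 5'-to-3'.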
Figure 8.11 Control regions of bacterial and eukaryotic genes.
Transcription Initiation Varies Between Eukaryotes and Prokaryotes
Although the transcription of all genes in all organisms roughly follows the general scheme shown in Fig. 8.10, important variations can be found in the details. For example, the transcription of different genes
3' ends of the primary transcript, leaving the information content of the rest of the mRNA untouched. Other processing deletes blocks of information from the middle of the primary transcript, so the content of the mature mRNA is related, but not identical, to the complete set of DNA nucleotide pairs in the original gene.
Adding a 5' methylated cap and a 3' poly-A tail
The nucleotide at the 5' end of a eukaryotic mRNA is a G in reverse orientation from the rest of the molecule; it is connected through a triphosphate linkage to the first nucleotide in the primary transcript. This "backward G" is not transcribed from the DNA. Instead, a special capping enzyme adds it to the primary transcript after polymerization of the transcript's first few nucleotides. Enzymes known as methyl transferases then add methyl (-CH3) groups to the backward G and to one or more of the
GENETICS AND SOCIETY
HIV and Reverse Transcription
The AIDS-causing human immunodeficiency virus (HIV) is the most intensively analyzed virus in history. From laboratory and clinical studies spanning more than three decades, researchers have learned that each viral particle is a rough-edged sphere consisting of an outer envelope enclosing a protein matrix, which, in turn, surrounds a cut-off cone-shaped core (Fig. A). Within the core lies an enzyme-studded genome: two identical single strands of RNA associated with many molecules of an unusual DNA polymerase known as reverse transcriptase.
During infection, the AIDS virus binds to and injects its cone-shaped core into cells of the human immune system (Fig. B). The virus next uses reverse transcriptase to copy its RNA genome into double-stranded DNA molecules in the cytoplasm of the host cell. The double helixes then travel to the nucleus where another enzyme, called integrase, inserts them into a host chromosome.
Once integrated into a host-cell chromosome, the viral genome can do one of two things. It can commandeer the host cell's protein synthesis machinery to make hundreds of new viral particles that bud off from the parent cell, taking with them part of the cell membrane and sometimes resulting in the host cell's death. Alternatively, it can lie latent inside the host chromosome, which then copies and transmits the viral genome to two new cells with each cell division. The events of this life cycle make HIV a retrovirus: an RNA virus that after infecting a host cell copies its own single strands of RNA into double helixes of DNA, which a viral enzyme (integrase) then integrates into a host chromosome.
[Figure A: Structure of the AIDS virus. Labels: HIV viral particle, bilipid outer layer, protein matrix, core, viral RNA.]
[Figure B: Life cycle of the AIDS virus. Legible step labels: 1. Virus particles attach to host cell membrane. 6. Core forms; new virus particles bud from host cell.]
Reverse transcription, the foundation of the retroviral life cycle, is inconsistent with the one-way, DNA-to-RNA-to-protein flow of genetic information. Because it was so unexpected, the phenomenon of reverse transcription encountered great resistance in the scientific community when first reported by Howard Temin of the University of Wisconsin and David Baltimore, then of MIT. Now,
succeeding nucleotides in the RNA, forming a so-called methylated cap (Fig. 8.12). Like the 5' methylated cap, the 3' end of most eukaryotic mRNAs is not encoded directly by the gene. In a large majority of eukaryotic mRNAs, the 3' end consists of 100-200 As, referred to as a poly-A tail (Fig. 8.13). Addition of the tail is a two-step process. First, a ribonuclease cleaves the primary transcript to form a new 3' end; cleavage depends on the sequence AAUAAA, which is found in poly-A-containing mRNAs 11-30 nucleotides upstream of the position where the tail is added. Next, the enzyme poly-A polymerase adds As onto the 3' end exposed by cleavage.
Unexpectedly, both the methylated cap and the poly-A tail are critical for the efficient translation of the mRNA into protein, even though neither helps specify an amino acid. Recent data indicate that particular eukaryotic translation initiation factors bind to the 5' cap, while poly-A binding protein associates with the tail at the 3' end of the mRNA. The interaction of these proteins in many cases shapes the mRNA molecule into a circle. This circularization both enhances the initial steps of translation and stabilizes the mRNA in the cytoplasm by increasing the length of time it can serve as a messenger.
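The two-step addition of the poly-A tail described above (cleavage downstream of AAUAAA, then addition of As) can be sketched in a few lines of Python. This is a toy model under stated assumptions: the cleavage offset is measured from the end of the AAUAAA signal, a single fixed offset and tail length are chosen from the 11-30 and 100-200 ranges given in the text, and the function name and example transcript are hypothetical.

```python
POLY_A_SIGNAL = "AAUAAA"


def add_poly_a_tail(primary_transcript, offset=20, tail_length=150):
    """Cleave `offset` nucleotides past the AAUAAA signal, then add As.

    Returns None if the transcript has no AAUAAA signal. Measuring the
    offset from the end of the signal is an assumption of this sketch.
    """
    if not 11 <= offset <= 30:
        raise ValueError("offset should lie in the 11-30 nucleotide range")
    if not 100 <= tail_length <= 200:
        raise ValueError("tail length should lie in the 100-200 range")

    signal_start = primary_transcript.find(POLY_A_SIGNAL)
    if signal_start == -1:
        return None

    # Step 1: a ribonuclease creates a new 3' end downstream of the signal.
    cut = signal_start + len(POLY_A_SIGNAL) + offset
    cleaved = primary_transcript[:cut]

    # Step 2: poly-A polymerase adds As that are not encoded by the gene.
    return cleaved + "A" * tail_length


# Hypothetical primary transcript: 6 nt, the signal, then 40 nt of 3' sequence.
transcript = "GGCAUG" + "AAUAAA" + "C" * 40
mature = add_poly_a_tail(transcript)
print(len(transcript), len(mature))
```

Because the tail is appended after cleavage, none of its As need to appear anywhere in the gene, which is the sense in which the 3' end is "not encoded directly by the gene."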
Removing introns from the primary transcript by RNA splicing
Another kind of RNA processing became apparent in the late 1970s, after researchers had developed techniques that enabled them to analyze nucleotide sequences in both DNA
however, it is an established fact. Reverse transcriptase is a remarkable DNA polymerase that can construct a DNA polymer from either an RNA or a DNA template.
In addition to its comprehensive copying abilities, reverse transcriptase has another feature not seen in most DNA polymerases: inaccuracy. As we saw in Chapter 7, normal DNA polymerases replicate DNA with an error rate of one mistake in every million nucleotides copied. Reverse transcriptase, however, introduces one mutation in every 5000 incorporated nucleotides.
HIV uses this capacity for mutation to gain a tactical advantage over the immune response of its host organism. Cells of the immune system seek to overcome an HIV invasion by multiplying in response to the proliferating viral particles. The numbers are staggering. Each day of infection in every patient, from 100 million to a billion HIV particles are released from infected immune-system cells. As long as the immune system is strong enough to withstand the assault, it responds by producing as many as 2 billion new cells daily. Many of these new immune system cells produce antibodies targeted against proteins on the surface of the virus. But just when an immune response wipes out those viral particles carrying the targeted protein, virions incorporating new forms of the protein resistant to the current immune response make their appearance. After many years of this complex chase, capture, and destruction by the immune system, the changeable virus outruns the host's immune response and gains the upper hand. Thus, the intrinsic infidelity of HIV's reverse transcriptase, by enhancing the virus's ability to compete in the evolutionary marketplace, increases its threat to human life and health.
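The two error rates quoted above can be compared with a short calculation. The sketch below uses only the figures from the text (one error per million nucleotides for normal DNA polymerases, one per 5000 for reverse transcriptase) plus one outside assumption: an HIV genome length of roughly 9,700 nucleotides, which is not stated in the text and is included only to make the scale concrete.

```python
# Figures from the text: nucleotides copied per error, on average.
DNA_POLYMERASE_NT_PER_ERROR = 1_000_000
REVERSE_TRANSCRIPTASE_NT_PER_ERROR = 5_000

# Assumption (not from the text): approximate length of the HIV RNA genome.
ASSUMED_HIV_GENOME_NT = 9_700

# How many times more error-prone reverse transcriptase is.
fold_difference = DNA_POLYMERASE_NT_PER_ERROR / REVERSE_TRANSCRIPTASE_NT_PER_ERROR

# Expected number of errors each time one genome is copied.
errors_per_copy_rt = ASSUMED_HIV_GENOME_NT / REVERSE_TRANSCRIPTASE_NT_PER_ERROR
errors_per_copy_dna_pol = ASSUMED_HIV_GENOME_NT / DNA_POLYMERASE_NT_PER_ERROR

print(f"Reverse transcriptase is {fold_difference:.0f}x more error-prone")
print(f"Expected errors per genome copy (reverse transcriptase): {errors_per_copy_rt:.2f}")
print(f"Expected errors per genome copy (normal DNA polymerase): {errors_per_copy_dna_pol:.4f}")
```

Under the assumed genome length, nearly every reverse-transcribed genome would be expected to carry a new mutation, which is consistent with the rapid appearance of variant virions described above.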
This inherent mutability has undermined two potential therapeutic approaches toward the control of AIDS: drugs and vaccines. Some of the antiviral drugs approved in the United States for treatment of HIV infection, AZT (zidovudine), ddC (dideoxycytidine), and ddI (dideoxyinosine), block viral replication by interfering with the action of reverse transcriptase. Each drug is similar to one of the four nucleotides, and when reverse transcriptase incorporates
and RNA. Using these techniques, which we describe in Chapter 9, they began to compare eukaryotic genes with the mRNAs derived from them. Their expectation was that just as in prokaryotes, the DNA nucleotide sequence of a gene's RNA-like strand would be identical to the RNA nucleotide sequence of the messenger RNA (with the exception of U replacing T in the RNA). Surprisingly, they found that the DNA nucleotide sequences of many eukaryotic genes are much longer than their corresponding mRNAs, suggesting that RNA transcripts, in addition to receiving a methylated cap and a poly-A tail, undergo extensive internal processing.
An extreme example of the length difference between primary transcript and mRNA is seen in the human gene for dystrophin. Abnormalities in the dystrophin gene underlie
one of the drug molecules rather than a genuine nucleotide into a growing DNA polymer, the enzyme cannot extend the chain any further. However, the drugs are toxic at high doses and thus can be administered only at low doses that do not destroy all viral particles. Because of this limitation and the virus's high rate of mutation, mutant reverse transcriptases soon appear that work even in the presence of the drugs.
Similarly, researchers are having trouble developing effective vaccines. Even if a vaccine could generate a massive immune response against one, two, or even several HIV proteins, such a vaccine might be effective for only a short while, until enough mutations build up to make the virus resistant.
For these reasons, the AIDS virus will most likely not succumb entirely to drugs or vaccines that target proteins active at various stages of its life cycle. However, combinations of these therapeutic tools have nonetheless proven remarkably effective at prolonging an AIDS patient's life. In 2013, AIDS patients who receive combination therapy have on average two-thirds of a normal life span. Newer drugs added to the cocktail include protease inhibitors that prevent the activity of enzymes needed to produce viral coat proteins, drugs that prevent viral entry into human cells, and inhibitors of the viral protein which integrates viral DNA into human chromosomes.
A self-preserving capacity for mutation, perpetuated by reverse transcriptase, is surely one of the main reasons for HIV's success. Ironically, it may also provide a basis for its subjugation. Researchers are studying what happens when the virus increases its mutational load. If reverse transcriptase's error rate determines the size and integrity of the viral population in a host organism, greatly accelerated mutagenesis might push the virus beyond the error threshold that allows it to function. In other words, too much mutation might destroy the virus's infectivity, virulence, or capacity to reproduce. If geneticists could figure out how to make this happen, they might be able to give the human immune system the advantage it needs to overcome the virus.
the genetic disorder of Duchenne muscular dystrophy (DMD). The dystrophin gene is 2.5 million nucleotides, or 2500 kilobases (kb), long, whereas the corresponding mRNA is roughly 14,000 nucleotides, or 14 kb, in length. Obviously the gene contains DNA sequences that are not present in the mature mRNA. Those regions of the gene that do end up in the mature mRNA are scattered throughout the 2500 kb of DNA.
Exons and Introns
Sequences found in both a gene's DNA and the mature messenger RNA are called exons (for "expressed regions"). The sequences found in the DNA of the gene but not in the mature mRNA are known as introns (for "intervening regions"). Introns interrupt, or separate, the exon sequences that actually end up in the mature
Figure 8.12 Structure of the methylated cap at the 5' end of eukaryotic mRNAs. Capping enzyme connects a backward G to the first nucleotide of the primary transcript through a triphosphate linkage. Methyl transferase enzymes then add methyl groups to this G and to one or two of the nucleotides first transcribed from the DNA template.
[Figure 8.12 labels: methyl group, guanine, triphosphate bridge, methylated cap (not transcribed), RNA polymerase.]
Figure 8.13 How RNA processing adds a tail to the 3' end of eukaryotic mRNAs. A ribonuclease recognizes AAUAAA in a particular context of the primary transcript and cleaves the transcript 11-30 nucleotides downstream to create a new 3' end. The enzyme poly-A polymerase then adds 100-200 As onto this new 3' end.
[Figure 8.13 labels: AAUAAA, cleavage by ribonuclease.]
mRNA. The gene for collagen (an abundant protein in connective tissue) shown in Fig. 8.14 has two introns. By contrast, the dystrophin gene has more than 80 introns; the mean intron length is 35 kb, but one intron is an amazing 400 kb long. Other genes in humans generally have many fewer introns, while a few have none, and the introns range from 50 bp to over 100 kb. Exons, in contrast, vary in size from about 50 bp to a few kilobases; in the DMD gene, the mean exon length is 200 bp. The greater size variation seen in introns compared to exons reflects the fact that introns do not encode polypeptides and do not appear in mature mRNAs. As a result, fewer restrictions exist on the sizes and base sequences of introns.
Mature mRNAs must contain all of the codons that are translated into amino acids, including the initiation and termination codons. In addition, mature mRNAs have sequences at their 5' and 3' ends that are not translated, but that nevertheless play important roles in regulating the efficiency of translation. These sequences, called the 5' and 3' untranslated regions (5' and 3' UTRs), are located just after the methylated cap and just before the poly-A tail, respectively (Fig. 8.14a). Excepting the cap and tail themselves, all of the sequences in a mature mRNA, including all of the codons and both UTRs, must be transcribed from the gene's exons. Introns can interrupt a gene at any location, even between the nucleotides making up a single codon. In such a case, the three nucleotides of the codon are present in two different (but successive) exons.
How do cells make a mature mRNA from a gene whose coding sequences are interrupted by introns? The answer is that cells first make a primary transcript containing all of a gene's introns and exons, and then they remove the introns from the primary transcript by RNA splicing, the process that deletes introns and joins together successive exons to form a mature mRNA consisting only of exons (Fig. 8.14a). Because the first and last exons of the primary transcript become the 5' and 3' ends of the mRNA, while all intervening introns are spliced out, a gene must have one more exon than it does introns. To construct the mature mRNA, splicing must be remarkably precise. For example, if an intron lies within a codon, splicing must remove the intron and reconstitute the codon without disrupting the reading frame of the mRNA.
The Mechanism of RNA Splicing
Figure 8.15 illustrates how RNA splicing works. Three types of short sequences within the primary transcript (splice donors, splice acceptors, and branch sites) help ensure the specificity of splicing. These sites make it possible to sever the connections between an intron and the exons that precede and follow it, and then to join the formerly distant exons. The mechanism of splicing involves two sequential cuts in the primary transcript. The first cut is at the splice-donor site, at the 5' end of the intron. After this first cut, the new 5' end of the intron attaches, via a novel 2'-5' phosphodiester bond, to an A at the branch site located within the intron, forming a so-called lariat structure. The second cut is at the splice-acceptor site, at the 3' end of the intron; this cut removes the intron. The discarded intron is degraded, and the precise splicing of adjacent exons completes the process of intron removal.
SnRNPs and the Spliceosome
Splicing normally requires a complicated intranuclear machine called the spliceosome, which ensures that all of the splicing reactions take
Figure 8.14 Structure and expression of a typical eukaryotic gene. (a) Schematic view of landmarks in a collagen gene and its products. Exons are shown in red, introns in green, and nontranscribed parts of the gene in blue. Mature mRNAs are processed from the primary transcript; introns are spliced out, a 5' methyl-G cap is added, and a poly-A tail is added to the 3' end. The 5' untranslated region (5' UTR) lies between the 5' end and the start codon (AUG), and the 3' untranslated region (3' UTR) lies between the stop codon and the poly-A tail at the 3' end of the mRNA (orange bars). Note that the start codon is not always in the first exon, and neither is the stop codon always in the final exon. (b) The same gene at the nucleotide level. Colors are the same as in part (a), except that the mature mRNA is shown in purple for emphasis. The AAUAAA poly-A addition signal in the mRNA is underlined. Introns can occur anywhere in the transcribed part of a gene, including within a codon or either of the UTRs.
[Figure 8.14 panels: (a) A collagen gene: structure and expression. (b) Sequence of a collagen gene, mRNA, and polypeptide.]
place in concert (Fig. 8.16). The spliceosome consists of four subunits known as small nuclear ribonucleoproteins, or snRNPs (pronounced "snurps"). Each snRNP contains one or two small nuclear RNAs (snRNAs) 100-300 nucleotides long, associated with proteins in a discrete particle. Certain snRNAs can base pair with the splice donor and splice acceptor sequences in the primary transcript, so these snRNAs are particularly important in bringing together the two exons that flank an intron.
Given the complexities of spliceosome structure, it is remarkable that a few primary transcripts can splice themselves without the aid of a spliceosome or any additional factor. These rare primary transcripts function as ribozymes: RNA molecules that can act as enzymes and catalyze a specific biochemical reaction.
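The outcome of splicing, as opposed to its biochemical mechanism, can be sketched in Python. The toy function below is given intron coordinates directly (in the cell, the splice donor, branch site, and splice acceptor determine them), checks only that each intron begins with GU and ends with AG, and joins the remaining exons. The function name, the coordinates, and the short sequences are hypothetical; the example places the intron inside a codon to show that precise removal restores the reading frame.

```python
def splice(primary_transcript, introns):
    """Remove introns and join exons.

    `introns` is a sorted list of non-overlapping (start, end) index pairs,
    with `end` exclusive. Returns (list_of_exons, mature_mrna).
    """
    exons = []
    cursor = 0
    for start, end in introns:
        intron = primary_transcript[start:end]
        # Minimal sanity check on the intron boundaries (GU ... AG).
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} lacks GU...AG boundaries")
        exons.append(primary_transcript[cursor:start])
        cursor = end
    exons.append(primary_transcript[cursor:])
    return exons, "".join(exons)


# Hypothetical transcript: the intron interrupts the codon GCU (GC | U).
exon1 = "AUGGC"
intron = "GUAAGUCACUGACUUUUCCCAG"
exon2 = "UUAA"
primary = exon1 + intron + exon2

exons, mature = splice(primary, [(len(exon1), len(exon1) + len(intron))])
codons = [mature[i:i + 3] for i in range(0, len(mature), 3)]

print(mature)
print(codons)
# A gene has one more exon than it has introns.
print(len(exons) == 1 + 1)
```

If the cut positions were off by even one nucleotide, every codon downstream of the junction would shift, which is why the text stresses that splicing must be remarkably precise.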
It might seem strange that eukaryotic genes incorporate DNA sequences that are spliced out of the mRNA before translation and thus do not encode amino acids. No one knows exactly why introns exist. One hypothesis proposes that they make it possible to assemble genes from various exon building blocks, which encode modules of protein function. This type of assembly would allow the shuffling of exons to make new genes, a process that appears to have played a key role in the evolution of complex organisms. The exon-as-module proposal is attractive because it is easy to understand the selective advantage of the potential for exon shuffling. Nevertheless, it remains a hypothesis without proof. There is no hard evidence for or against the hypothesis, and introns may have become established through means that scientists have yet to imagine.
Figure 8.15 How RNA processing splices out introns and joins adjacent exons. Exons are shown in red and introns in green. (a) Three short sequences within the primary transcript determine the specificity of splicing. (1) The splice-donor site occurs where the 3' end of an exon abuts the 5' end of an intron. In most splice-donor sites, a GU dinucleotide (arrows) that begins the intron is flanked on either side by a few purines (Pu; that is, A or G). (2) The splice-acceptor site is at the 3' end of the intron where it joins with the next exon. The final nucleotides of the intron are always AG (arrows) preceded by 12-14 pyrimidines (Py; that is, C or U). (3) The branch site, which is located within the intron about 30 nucleotides upstream of the splice acceptor, must include an A (arrow) and is usually rich in pyrimidines. (b) Two sequential cuts, the first at the splice-donor site and the second at the splice-acceptor site, remove the intron, allowing precise splicing of adjacent exons.
[Figure 8.15 panels: (a) Short sequences dictate where splicing occurs: splice donor, branch site (about 30 nucleotides upstream of the splice acceptor), and splice acceptor within the primary transcript. (b) Two sequential cuts remove the intron: a cut at the 5' site forms the lariat, a cut at the 3' site releases it, and the exons are joined in the mature mRNA.]
Figure 8.16 Splicing is catalyzed by the spliceosome. (Top) The spliceosome is assembled from four snRNP subunits, each of which contains one or two snRNAs and several proteins. (Bottom) Views of three spliceosomes in the electron microscope.
[Figure 8.16 diagram: Spliceosome components. Five snRNAs (small nuclear RNAs) plus ~50 proteins form four snRNPs (small nuclear ribonucleic particles), which assemble into a spliceosome. The excised intron is degraded.]
Alternative splicing: Different mRNAs from the same primary transcript
Sometimes RNA splicing joins together the splice donor and splice acceptor at the opposite ends of an intron, resulting in removal of the intron and fusion of two successive (and now adjacent) exons. Often, however, RNA splicing during development is regulated so that at certain times or in certain tissues, some splicing signals may be ignored. As an example, splicing may occur between the splice donor site of one intron and the splice acceptor site of a different intron downstream. Such alternative splicing produces different mRNA molecules that may encode related proteins with different (though partially overlapping) amino acid sequences and functions. In effect then, alternative splicing can tailor the nucleotide sequence of a primary transcript to produce more than one kind of polypeptide. Alternative splicing largely explains how the 25,000 genes in the human genome can encode the hundreds of thousands of different proteins estimated to exist in human cells.
In mammals, alternative splicing of the gene encoding the antibody heavy chain determines whether the antibody proteins become embedded in the membrane of the B lymphocyte that makes them or are instead secreted into the blood. The gene for antibody heavy chains has eight exons and seven introns; exon number 6 has a splice-donor site within it. To make the membrane-bound antibody, all exons except for the right-hand part of number 6 are joined to create
Figure 8.17 Different mRNAs can be produced from the same primary transcript. Alternative splicing of the primary transcript for the antibody heavy chain produces mRNAs that encode different kinds of antibody proteins.
[Figure labels: antibody heavy-chain gene; exon; intron; outside of gene; intron in membrane-bound/exon in secreted; poly-A addition sites; splice specific for membrane-bound; transcription; primary transcript; splicing for secreted antibody; splicing for membrane-bound antibody; exons that encode membrane attachment domain; mRNA; secreted antibody; membrane-bound antibody.]

8.3 Translation: From mRNA to Protein

learning objectives
1. Relate tRNA's structure to its function.
2. Describe the key steps of translation, indicating how each depends on the ribosome.
3. List three categories of posttranslational processing and provide examples of each.

Translation is the process by which the sequence of nucleotides in a messenger RNA directs the assembly of the correct sequence of amino acids in the corresponding polypeptide. Translation takes place on ribosomes that coordinate the movements of transfer RNAs carrying specific amino acids with the genetic instructions of an mRNA. As we examine the cell's translation machinery, we first describe the structure and function of tRNAs and ribosomes; and we then explain how these components interact during translation.
an mRNA encoding a hydrophobic (water-hating, lipid-loving) C terminus (Fig. 8.17). For the secreted antibody, only the first six exons (including the right part of 6) are spliced together to make an mRNA encoding a heavy chain with a hydrophilic (water-loving) C terminus. These two kinds of mRNAs formed by alternative splicing thus encode slightly different proteins that are directed to different parts of the body.
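The two splicing patterns for the antibody heavy-chain transcript can be written out as a small sketch. The exon labels follow the text (exon 6 is split at its internal splice-donor site into a left part, 6a, and a right part, 6b); the names `EXONS` and `splice` are hypothetical, and the labels stand in for real sequences.

```python
# Exons of the antibody heavy-chain primary transcript, in genomic order.
# "6a" and "6b" are the left and right parts of exon 6, which contains
# an internal splice-donor site. These are labels, not sequences.
EXONS = ["1", "2", "3", "4", "5", "6a", "6b", "7", "8"]


def splice(kept):
    """Join the kept exons in their genomic order to form an mRNA."""
    return "-".join(exon for exon in EXONS if exon in kept)


# Secreted antibody: only the first six exons, including the right part of 6.
secreted = splice({"1", "2", "3", "4", "5", "6a", "6b"})

# Membrane-bound antibody: all exons except the right-hand part of 6.
membrane_bound = splice(set(EXONS) - {"6b"})

print(secreted)        # 1-2-3-4-5-6a-6b
print(membrane_bound)  # 1-2-3-4-5-6a-7-8
```

The two mRNAs share exons 1 through 6a and differ only at their 3' ends, which is why the encoded proteins differ only at the C terminus.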
Transfer RNAs Mediate the Translation of mRNA Codons to Amino Acids
No obvious chemical similarity or affinity exists between the nucleotide triplets of mRNA codons and the amino acids they specify. Rather, transfer RNAs (tRNAs) serve as adapter molecules that mediate the transfer of information from nucleic acid to protein.
The structure of tRNA
Transfer RNAs are short, single-stranded RNA molecules 74-95 nucleotides in length. Several of the nucleotides in tRNAs contain modified bases produced by chemical alterations of the principal A, G, C, and U nucleotides. Each tRNA carries one particular amino acid, and all cells must have at least one tRNA for each of the common 20 amino acids specified by the genetic code. The name of a tRNA reflects the amino acid it carries. For example, tRNAGly carries the amino acid glycine. As Fig. 8.18 shows, it is possible to consider the structure of a tRNA molecule on three levels.

1. The nucleotide sequence of a tRNA constitutes the primary structure.
2. Short complementary regions within a tRNA's single strand can form base pairs with each other to create a characteristic cloverleaf shape; this is the tRNA's secondary structure.
3. Folding in three-dimensional space creates a tertiary structure that looks like a compact letter L.

essential concepts
• Transcription is the process by which RNA polymerase synthesizes a single-stranded primary transcript from a DNA template. During transcription initiation, promoters in DNA signal RNA polymerase where to begin copying. In the elongation phase, RNA polymerase adds nucleotides to the growing RNA strand in the 5'-to-3' direction. At termination, a terminator sequence in the RNA transcript tells RNA polymerase to cease transcription.
• Transcription initiation requires a DNA sequence called the promoter; in eukaryotes, initiation requires an additional DNA sequence called an enhancer.
• In prokaryotes, the primary transcript is the messenger RNA (mRNA). In eukaryotes, RNA processing after transcription produces a mature mRNA; the RNA transcript is modified by the addition of a 5' cap and a poly-A tail, along with the excision of introns when exons are joined by splicing.
• Exons can be spliced together in alternative ways to produce different mRNA sequences and therefore different polypeptides from the same primary transcript.
Chapter 8 Gene Expression: The Flow of Information from DNA to RNA to Protein
Figure 8.18 tRNAs mediate the transfer of information from nucleic acid to protein. The primary structures of tRNA molecules fold to form characteristic secondary and tertiary structures. Note that the anticodon and the amino acid attachment site are at opposite ends of the L-shaped structure. Several bases of the tRNA are variants of A, G, C, and U that have been chemically modified by enzymes; these unusual bases are indicated as I, ψ, UH2, mI, m2G, and mG.
[Figure labels: yeast tRNAAla; primary structure; secondary structure; tertiary structure; amino acid attachment site; anticodon; mRNA; codon for Ala.]

Figure 8.19 Aminoacyl-tRNA synthetases catalyze the attachment of tRNAs to their corresponding amino acids. The aminoacyl-tRNA synthetase has recognition sites for an amino acid, the corresponding tRNAs, and ATP. The synthetase first activates the amino acid, forming an AMP-amino acid. The enzyme then transfers the amino acid's carboxyl group from AMP to the hydroxyl (-OH) group of the ribose at the 3' end of the tRNA, producing a charged tRNA.
[Figure labels: amino acid (Gly); tRNAGly; amino acid attachment site; charged tRNAGly.]
At one end of the L, the tRNA carries an anticodon: three nucleotides complementary to an mRNA codon specifying the amino acid carried by the tRNA (Fig. 8.18). The anticodon never forms base pairs with other regions of the tRNA; it is always available for base pairing with its complementary mRNA codon. As with other complementary base sequences, during pairing at the ribosome, the strands of anticodon and codon run antiparallel to each other. If, for example, the anticodon is 3' CCU 5', the complementary mRNA codon is 5' GGA 3', specifying the amino acid glycine.

At the other end of the L, where the 5' and 3' ends of the tRNA strand are found (Fig. 8.18), enzymes known as aminoacyl-tRNA synthetases connect the tRNA to the amino acid that corresponds to the anticodon. These enzymes are extraordinarily specific, recognizing unique features of a particular tRNA including the anticodon, while also recognizing the corresponding amino acid (see the opening figure of this chapter on p. 254). Aminoacyl-tRNA synthetases are, in fact, the only molecules that read the languages of both nucleic acid and protein. They are thus the actual molecular translators. Normally, one aminoacyl-tRNA synthetase exists for each of the 20 common amino acids. Each synthetase functions with only one amino acid, but the enzyme may recognize several different tRNAs specific for that amino acid. Figure 8.19 shows the two-step process that establishes the covalent bond between an amino acid and the 3' end of its corresponding tRNA. A tRNA covalently coupled to its amino acid is called a charged tRNA. The bond between the amino acid and tRNA contains substantial energy that is later used to drive peptide bond formation.

The critical role of base pairing between codon and anticodon
While attachment of the appropriate amino acid charges a tRNA, the amino acid itself does not play a significant role in determining where it becomes incorporated in a growing polypeptide chain. Instead, the specific interaction between a tRNA's anticodon and an mRNA's codon makes that decision. A simple experiment illustrates this point (Fig. 8.20). Researchers can subject a charged tRNA to chemical treatments that, without altering the structure of the tRNA, change the amino acid it carries. One treatment replaces the cysteine carried by tRNACys with alanine. When investigators then add the tRNACys charged with alanine to a cell-free translational system, the system incorporates alanine into the growing polypeptide wherever the mRNA contains a cysteine codon complementary to the anticodon of the tRNACys.

Figure 8.20 Base pairing between an mRNA codon and a tRNA anticodon determines which amino acid is added to a growing polypeptide. A tRNA with an anticodon for cysteine, but carrying the amino acid alanine, adds alanine whenever the mRNA codon for cysteine appears.
[Figure labels: cysteine; alanine; tRNA cysteine anticodon; mRNA codon for cysteine; treatment with nickel hydride changes amino acid; treatment with nickel hydride leaves anticodon unchanged.]

Wobble: One tRNA, more than one codon
Although at least one kind of tRNA exists for each of the 20 common amino acids, cells do not necessarily carry tRNAs with anticodons complementary to all of the 61 possible codon triplets in the genetic code. E. coli, for example, makes 79 different tRNAs containing 42 different anticodons. Although several of the 79 tRNAs in this collection obviously have the same anticodon, 61 - 42 = 19 of 61 potential anticodons are not represented. Thus 19 mRNA codons will not find a complementary anticodon in the E. coli collection of tRNAs. How can an organism construct proper polypeptides if some of the codons in its mRNAs cannot locate tRNAs with complementary anticodons? The answer is that some tRNAs can recognize more than one codon for the amino acid with which they are charged. That is, the anticodons of these tRNAs can interact with more than one codon for the same amino acid, in keeping with the degenerate nature of the genetic code.

Francis Crick spelled out a few of the rules that govern the promiscuous base pairing between codons and anticodons. Crick reasoned first that the 3' nucleotide in many codons adds nothing to the specificity of the codon. For example, 5' GGU 3', 5' GGC 3', 5' GGA 3', and 5' GGG 3' all encode glycine (review Fig. 8.2 on p. 256). It does not matter whether the 3' nucleotide in the codon is U, C, A, or G as long as the first two letters are GG. The same is true for other amino acids encoded by four different codons, such as valine, where the first two bases must be GU, but the third base can be U, C, A, or G. For amino acids specified by two different codons, the first two bases of the codon are, once again, always the same, while the third base must be either one of the two purines (A or G) or one of the two pyrimidines (U or C). Thus, 5' CAA 3' and 5' CAG 3' are both codons for glutamine; 5' CAU 3' and 5' CAC 3' are both codons for histidine. If Pu stands for either purine and Py stands for either pyrimidine, then CAPu represents the codons for glutamine, while CAPy represents the codons for histidine.

In fact, the 5' nucleotide of a tRNA's anticodon can often pair with more than one kind of nucleotide in the 3' position of an mRNA's codon. (Recall that after base pairing, the bases in the anticodon run antiparallel to the bases in the codon.) A single tRNA charged with a particular amino acid can thus recognize several or even all of the codons for that amino acid. This flexibility in base pairing between the 3' nucleotide in the codon and the 5' nucleotide in the anticodon is known as wobble (Fig. 8.21a). The combination of normal base pairing at the first two positions of a codon with wobble at the third position clarifies why multiple codons for a single amino acid usually start with the same two letters.

Figure 8.21 Wobble: Some tRNAs recognize more than one codon for the amino acid they carry. (a) The G at the 5' end of the anticodon shown here can pair with either U or C at the 3' end of the codon. (b) The table shows the pairing possibilities for nucleotides at the 5' end of an anticodon (the wobble position). xo5U only rarely pairs with C. (c) Chemical structures of the modified bases in anticodons. The modification of C to k2C occurs only in certain bacteria.

(b) Wobble Rules
5' end of anticodon     can pair with 3' end of codon
G                       U or C
C                       G
I                       U, C, or A
xm5U                    G
xm5s2U                  A or G
xo5U                    A, G, U, or (C)
k2C                     A

[Figure labels: (a) Phe tRNA; tRNA anticodon; wobble position; mRNA codon. (c) Modified bases in anticodon wobble position: unmodified U (uridine) and its 2-thiouridine, 5-methylene-uridine (xm5U, xm5s2U), and 5-oxy-uridine (xo5U) derivatives; unmodified A (adenosine) and inosine (I); unmodified C (cytidine) and lysidine (k2C; bacteria).]

An important aspect of wobble is the chemical modification of certain bases at the 5' end of the anticodon (the wobble position) (Fig. 8.21b and c). An A in the wobble position of a tRNA is always modified to inosine (I), and a U in the wobble position is always modified in one of three possible ways. By contrast, G in the anticodon wobble position is always unmodified, while modification of C occurs only in the tRNAs of some bacterial species. Wobble bases are modified by specific enzymes that act on the tRNA after it has been synthesized by transcription.

The wobble rules in Fig. 8.21b delimit the anticodon sequences and the wobble base modifications that are consistent with the genetic code. For example, methionine (Met) is specified by a single codon (5' AUG 3'). As a result, Met-specific tRNAs must either have a C at the 5' end of their anticodons (5' CAU 3'), or a U that is modified to xm5U, because these are the only nucleotides at that position that can base pair only with the G at the 3' end of the Met codon. By contrast, a single isoleucine-specific tRNA with the modified nucleotide inosine (I) at the 5' position of the anticodon can recognize all three codons (5' AUU 3', 5' AUC 3', and 5' AUA 3') for isoleucine.
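The wobble rules of Fig. 8.21b lend themselves to a short worked example. The sketch below is an illustration only: the names `WOBBLE`, `COMPLEMENT`, and `codons_recognized` are hypothetical, anticodons are passed as lists so that modified bases such as xm5U can be a single element, and the rare pairing of xo5U with C is left out.

```python
# Wobble rules from Fig. 8.21b: base at the 5' end of the anticodon ->
# bases it can pair with at the 3' end of the codon.
# The rare xo5U-C pairing is deliberately omitted.
WOBBLE = {
    "G": "UC",
    "C": "G",
    "I": "UCA",
    "xm5U": "G",
    "xm5s2U": "AG",
    "xo5U": "AGU",
    "k2C": "A",
}

# Standard base pairs for the other two anticodon positions.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_recognized(anticodon):
    """Return the set of codons (5' to 3') that an anticodon can pair with.

    anticodon: three bases listed 5' to 3', for example ["I", "A", "U"].
    """
    wobble_base, middle, last = anticodon
    # Antiparallel pairing: the anticodon's 3' base pairs with the
    # codon's 5' base, and the anticodon's 5' base with the codon's 3' base.
    first_two = COMPLEMENT[last] + COMPLEMENT[middle]
    return {first_two + third for third in WOBBLE[wobble_base]}


print(sorted(codons_recognized(["I", "A", "U"])))  # ['AUA', 'AUC', 'AUU']
print(sorted(codons_recognized(["C", "A", "U"])))  # ['AUG']
```

The first call reproduces the isoleucine example from the text, and the second the methionine example: an inosine at the wobble position covers three codons, whereas a C covers only one.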
Figure 8.22 Selenocysteine is encoded by a UGA triplet. The serine on tRNASec with the anticodon 5' xm5UCA 3' is modified to selenocysteine (Sec). The Sec-charged tRNA recognizes the triplet UGA only in rare mRNAs with a downstream SECIS element. Due to the unusual features of this system, xm5U at the 5' end of this anticodon (indicated as U^) is an exceptional case and pairs with A instead of G as predicted by the wobble rules in Fig. 8.21b.
[Figure labels: serine (Ser); modification; Sec-charged tRNASec; SECIS; mRNA.]

A special tRNA for selenocysteine
Most mRNAs direct the synthesis of proteins containing only the 20 common amino acids. Exceptional mRNAs in bacteria and eukaryotes direct the synthesis of selenoproteins, which contain the amino acid selenocysteine (Sec), sometimes referred to as amino acid 21. Selenoproteins are rare; in humans, only 25 are known to exist.

As shown in Fig. 8.22, a dedicated selenocysteine tRNA (tRNASec) with the anticodon sequence 5' xm5UCA 3' is recognized by serine tRNA synthetase and charged with serine. Modification enzymes subsequently convert the Ser to Sec. The Sec-charged tRNASec interacts with 5' UGA 3' triplets found only in mRNAs that contain a special structure called the Sec insertion sequence (SECIS) element. The SECIS element is a region of the mRNA that forms a particular stem-loop (hairpin) structure through intramolecular complementary base pairing (Fig. 8.22). This stem loop prevents termination of polypeptide synthesis at the UGA triplet, which would otherwise act as a stop codon. The anticodon of the Sec-charged tRNASec binds to the UGA triplet in the mRNA, allowing the incorporation of Sec into the polypeptide product.
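The dual meaning of UGA can be illustrated with a toy translation routine. This is a sketch under strong assumptions: the SECIS element is reduced to a boolean flag (`has_secis`) rather than a detected stem-loop, the codon table holds only the four codons used in the invented example mRNA, and the names `CODON_TABLE` and `translate` are hypothetical.

```python
# A deliberately tiny codon table: only the codons used in the example.
CODON_TABLE = {"AUG": "Met", "GGA": "Gly", "UGA": "Stop", "UAA": "Stop"}


def translate(mrna, has_secis=False):
    """Translate codon by codon from the first base until a stop codon.

    If the mRNA carries a SECIS element (has_secis=True), UGA is read
    as selenocysteine (Sec) instead of as a stop codon.
    """
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA" and has_secis:
            peptide.append("Sec")  # UGA paired by the Sec-charged tRNA
            continue
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "Stop":
            break
        peptide.append(amino_acid)
    return peptide


mrna = "AUGUGAGGAUAA"  # invented message: AUG UGA GGA UAA
print(translate(mrna))                  # ['Met']
print(translate(mrna, has_secis=True))  # ['Met', 'Sec', 'Gly']
```

Without the SECIS flag the polypeptide ends at UGA; with it, synthesis continues through UGA to the next stop codon.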
Ribosomes Are the Sites of Polypeptide Synthesis
Ribosomes facilitate polypeptide synthesis in various ways. First, they recognize mRNA features that signal the start of translation. Second, they help ensure accurate interpretation of the genetic code by stabilizing the interactions between tRNAs and mRNAs; without a ribosome, codon-anticodon recognition, mediated by only three base pairs, would be extremely weak. Third, ribosomes supply the enzymatic activity that links the amino acids in a growing polypeptide chain. Fourth, by moving 5' to 3' along an mRNA molecule, they expose the mRNA codons in sequence, ensuring the linear addition of amino acids. Finally, ribosomes help end polypeptide synthesis by dissociating both from the mRNA directing polypeptide construction and from the polypeptide product itself.

The structure of ribosomes
In E. coli, ribosomes consist of three different ribosomal RNAs (rRNAs) and 52 different ribosomal proteins (Fig. 8.23a). These components associate to form two different ribosomal subunits called the 30S subunit and the 50S subunit (with S designating a coefficient of sedimentation related to the size and shape of the subunit; the 30S subunit is smaller than the 50S subunit). Before translation begins, the two subunits exist as separate entities in the cytoplasm. Soon after the start of translation, they come together to reconstitute a complete ribosome. Eukaryotic ribosomes have more components than their prokaryotic counterparts, but they still consist of two dissociable subunits.

Functional domains of ribosomes
The small 30S subunit is the part of the ribosome that initially binds to mRNA. The larger 50S subunit contributes an enzyme
Figure 8.23 The ribosome: Site of polypeptide synthesis. (a) A ribosome has two subunits, each composed of rRNA and various proteins. (b) The small subunit initially binds to mRNA. The large subunit contributes the enzyme peptidyl transferase, which catalyzes the formation of peptide bonds. The two subunits together form the A, P, and E tRNA binding sites.
[Figure labels, panel (a): A ribosome has two subunits composed of RNA and protein. Column headings: complete ribosomes; subunits; nucleotides; proteins. Recoverable entries: 23S rRNA, 3000 nucleotides; 5S rRNA, 120 nucleotides; 30S subunit, 16S rRNA, 1700 nucleotides, 21 proteins; eukaryotic 80S ribosome; 60S subunit, 5.8S rRNA, 160 nucleotides, and 5S rRNA, 120 nucleotides, about 45 proteins.]

Figure 8.24 The large subunit of a bacterial ribosome. Various ribosomal proteins are lavender, 23S rRNA is in gold and white, and 5S rRNA is magenta and white. The tRNA in the A site is green, the tRNA in the P site is orange; no tRNA is shown in the E site. The superimposed box shows the location where new peptide bonds are formed.
[Figure labels: Receptor: without steroid hormone (SH), the receptor can't bind to the enhancer. Binding of SH induces an allosteric change in the receptor. The receptor can now bind to the enhancer.]
16.2 Control of Transcription Initiation Through Enhancers

target cells like those in facial hair follicles. Modulation of androgen receptor transcription factors by steroids controls gene activity in the target cells, leading to the development of male secondary sexual characteristics like facial hair in pubescent boys. This example clearly shows that each cell of a multicellular eukaryote must constantly modify its program of gene activity in response to ever-changing signals from elsewhere in the body.

Transcription factor modifications
Transcription factor proteins can be modified after they are synthesized by the covalent addition of any of several different chemical groups, as was previously described in Fig. 8.26b (p. 280). One of the most important of these modifications is phosphorylation, the addition of a phosphate group to a protein by action of an enzyme called a kinase. Phosphorylations can either activate or deactivate a transcription factor in any of a number of ways: by influencing movement of the factor into the nucleus, the factor's DNA-binding properties, its ability to multimerize, or its ability to interact with other proteins, including coactivators or corepressors. Cells often rely on phosphorylations to control events that must occur rapidly, such as responses to changes in the environment or transitions between states in the cell cycle.
In Chapter 19, you will see how the phosphorylation of a particular transcription factor called p53 plays an important role in protecting organisms from cancer.

Transcription factor cascades
As Fig. 16.13 illustrated, not all transcription factors are made in all cells at all times. Clearly, if a factor is not present in a given cell, it will be unable to influence the initiation of the transcription of any target genes. In other words, the availability of various transcription factors is critical to a cell's determination of which genes will be transcribed, and if so, at what levels.
A transcription factor is, like any protein, a gene product. The expression of a gene encoding a transcription factor is thus subject to control by other transcription factors, implying that cascades of transcription factor expression must occur. One set of factors turns on or represses another set of factors, which in turn controls the expression of yet other transcription factors. You will see in Chapter 18 that such transcription factor cascades are critical to the biochemical mechanisms that control the development of multicellular eukaryotes.

Insulators Organize DNA to Control Enhancer/Promoter Interactions
As mentioned above, an enhancer may be located upstream or downstream of the promoter that it regulates, in either orientation with respect to the promoter. These facts pose a conceptual problem: How does an enhancer "know" which of the two genes it inevitably sits between is the right one? And as enhancers may work at great distances from the promoters that they regulate, what prevents any enhancer on a chromosome from influencing any promoter for any gene anywhere on that chromosome? The answer is that DNA elements called insulators organize chromatin so that enhancers have access only to particular promoters.

How insulators are identified
Insulators are characterized as DNA elements located between a promoter and an enhancer that block the enhancer from activating transcription from that promoter. Suspect insulator DNA sequences are inserted between an enhancer and a promoter of a reporter gene; if reporter gene expression is blocked, then the DNA sequence is deemed an insulator (Fig. 16.15).

How insulators work
Insulators bind a protein called CTCF (CCCTC-binding factor) that facilitates the formation of DNA loops. A promoter and an enhancer will be in separate loops and cannot interact with each other if an insulator lies between them (Fig. 16.16). How CTCF forms loops in DNA is not well understood. Recent research has revealed that the functions of some insulators are more complex than simply blocking enhancers. Some developmental regulator genes have several enhancers separated by insulators. DNA loop formation at these genes is dynamic, and the insulators may deliver specific enhancers to particular promoters in response to signals that change as the organism develops.

Figure 16.15 Identifying insulators. An enhancer placed between the promoters of two reporter genes will activate transcription of both unless an insulator sequence is located between the enhancer and one promoter to block transcription from that promoter. RFP is red fluorescent protein; the RFP gene was cloned from a red fluorescent coral. With the construct at the top, cells in a transgenic organism would fluoresce both red and green (that is, in yellow); with the construct at the bottom, the same cells would fluoresce only in green.
[Figure labels: RFP and GFP transcribed; only GFP transcribed; promoter; insulator.]
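The logic of the reporter assay in Fig. 16.15 can be captured in a few lines. This is a rough sketch, not a model of chromatin: the construct is treated as a simple linear list, an insulator is assumed to block an enhancer from every promoter on its far side, and the function name `transcribed_reporters` and the element labels are hypothetical.

```python
def transcribed_reporters(construct):
    """Return the reporters an enhancer can activate in a linear construct.

    construct: elements in order along the DNA, for example
    ["RFP promoter", "enhancer", "GFP promoter"]. A promoter is active
    only if no insulator lies between it and the enhancer.
    """
    enhancer_at = construct.index("enhancer")
    active = []
    for i, element in enumerate(construct):
        if not element.endswith("promoter"):
            continue
        lo, hi = sorted((i, enhancer_at))
        if "insulator" not in construct[lo:hi]:
            active.append(element.split()[0])  # keep the reporter name
    return active


top = ["RFP promoter", "enhancer", "GFP promoter"]
bottom = ["RFP promoter", "insulator", "enhancer", "GFP promoter"]
print(transcribed_reporters(top))     # ['RFP', 'GFP']
print(transcribed_reporters(bottom))  # ['GFP']
```

A candidate sequence placed in the insulator position is scored as an insulator when the output changes from both reporters to GFP alone.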
Chapter 16 Gene Regulation in Eukaryotes
Figure 16.16 How insulators work. Insulators organize genomic DNA into loops, while enhancers activate transcription (A = activator) only from promoters within the same loop. The insulator-binding protein CTCF (yellow) facilitates loop formation by interacting with other proteins (gray).
[Figure labels: promoter 1 OFF; promoter 2 ON; enhancer; insulator.]

Figure 16.17 ChIP-Seq. An antibody (Y-shaped molecules) against a specific transcription factor (blue oval) is used to purify the protein bound to its target gene DNA sites (by attaching the antibody to microscopic beads, not shown). Sequencing of the DNA fragments within the purified protein-DNA complexes identifies the genes the transcription factor regulates.
[Figure steps: Isolate chromatin from cell nuclei. Crosslink DNA and proteins with formaldehyde. Fragment DNA by sonication. Add antibody to specific transcription factor. Coimmunoprecipitate transcription factor and DNA. Purify DNA and sequence.]

New Methods Provide Global Views of cis- and trans-Acting Transcriptional Regulators
The recent emergence of bioinformatics, a field of science in which biology, computer science, and information technology merge to form a single discipline, promises to facilitate the understanding of complex transcriptional programs. As an example, computer programs virtually translate coding sequences in cDNA clones and putative open reading frames in genomes into the amino acid sequences of proteins. The computers then search for signatures within these amino acid sequences, such as helix-loop-helix or zinc finger motifs, that would indicate the proteins are transcription factors. In this way, researchers can find trans-acting transcription factors encoded by an organism's genome. The conceptually translated amino acid sequences might further suggest the presence of target amino acid sequence motifs for specific posttranslational modifications, such as phosphorylation, that could be important for the regulation of these transcription factors.

To search for possible cis-acting transcriptional regulatory sites such as enhancers, computers compare genomic sequences of closely related species. As previously shown in Fig. 9.12 on p. 318, nucleotide sequences tend to be poorly conserved outside of coding regions, so any such conserved sequences that are found are strong candidates for roles in important processes such as gene regulation.

Chromatin immunoprecipitation-sequencing (ChIP-Seq) is a powerful new technology for finding all the target genes of a particular transcription factor within the entire genome of a particular type of cell (Fig. 16.17). Scientists first isolate chromatin from nuclei of the cells being studied, chemically crosslink the DNA and protein components of the chromatin, and then fragment the DNA in the chromatin. Researchers next add microscopic beads coated with an antibody that binds specifically to the transcription factor of interest. The only protein-DNA complexes that will stick to the beads are those containing the transcription factor crosslinked to the enhancers with which it interacts. These complexes can be washed free of other, non-specific chromatin pieces. Purification, through antibody binding, of specific proteins bound to other proteins or nucleic acids is called coimmunoprecipitation. The scientists sequence the DNA fragments in these purified complexes so as to identify the genes targeted by the particular transcription factor in the type of cell being analyzed.

essential concepts
• Enhancers are DNA sequences that may be distant from a gene's promoter and act in particular cell types to increase or decrease the amount of transcription relative to a basal level.
• Transcription factors are trans-acting proteins that include basal factors that bind the promoter, and activators and repressors that bind enhancers. Once bound to DNA, transcription factors can recruit other proteins to the gene.
• Enhancers can have binding sites for many different activators and repressors; this property of enhancers enables them to impart temporal and cell type specificity to gene transcription.
• Insulators are DNA elements that organize chromatin into loops; an enhancer and a promoter can interact only if they are in the same loop.
• Bioinformatics enables genome-wide searches for new transcription factors and their binding sites.
• ChIP-Seq uses specific antibodies to identify genes regulated by transcription factors of interest.
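The final step of ChIP-Seq, going from sequenced fragments to candidate target genes, can be sketched as a coordinate comparison. Everything here is hypothetical: the fragment and promoter coordinates are invented, the function name `target_genes` is illustrative, and assigning a fragment to any gene whose promoter lies within 1,000 bp is an assumption made for simplicity. Because enhancers can act at great distances from the promoters they regulate, real analyses cannot rely on proximity alone.

```python
# Hypothetical data on a single chromosome, in base-pair coordinates:
# sequenced ChIP fragments (start, end) and gene promoter positions.
fragments = [(1_050, 1_250), (8_400, 8_650), (20_000, 20_200)]
promoters = {"geneA": 1_000, "geneB": 9_000, "geneC": 50_000}


def target_genes(fragments, promoters, max_distance=1_000):
    """Return genes whose promoter lies within max_distance bp of a fragment."""
    targets = set()
    for start, end in fragments:
        for gene, position in promoters.items():
            if start - max_distance <= position <= end + max_distance:
                targets.add(gene)
    return sorted(targets)


print(target_genes(fragments, promoters))  # ['geneA', 'geneB']
```

The third fragment has no promoter nearby; in practice such a site might be a distant enhancer whose target gene must be identified by other means.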
16.3 Epigenetics: Control of Transcription Initiation Through DNA Methylation

learning objectives
1. Define an epigenetic phenomenon.
2. Describe gene regulation by CpG islands.
3. Explain the relationship between DNA methylation and genomic imprinting.
4. Discuss how genomic imprinting can be inferred from inheritance patterns in human pedigrees.

In the preceding section, we discussed how the binding of transcription factors to enhancers modulates the spatial and temporal expression of many genes that are expressed only in particular tissues at specific times during development. A second method by which cells can regulate transcription initiation is through the control of DNA methylation: a biochemical modification of DNA itself in which a methyl (-CH3) group is added to the fifth carbon of the cytosine base in a 5' CpG 3' dinucleotide pair on one strand of the double helix (Fig. 16.18a). (The "p" in CpG stands for phosphate.) Enzymes called DNA methyl transferases (DNMTs) catalyze the methylation of cytosines in CpG dinucleotides.

DNA methylation is particularly important to the control of expression of housekeeping genes in vertebrates, though it also plays a role in regulating some cell-type-specific genes. In the human genome, about 70% of the C residues in CpG dinucleotides are methylated. You will see that because DNA methylation affects gene transcription, and methylation patterns are copied during DNA replication, DNA methylation can alter gene expression heritably without changing the base sequence of DNA, and thus constitutes a so-called epigenetic phenomenon. Methylation is key to an epigenetic phenomenon seen in mammals (including humans) that is called genomic imprinting.

Invertebrate animals and unicellular eukaryotes have little or no DNA methylation, while the worm C. elegans and yeasts have none. The information in this section may thus not be relevant to all eukaryotic organisms, but it is very important to human genetics.

Figure 16.18 DNA methylation and CpG islands. (a) Chemical structure of a CpG dinucleotide where the C is methylated (red). (b and c) Transcription of some genes is controlled by CpG islands (sequences rich in CpG residues), which contain binding sites for activators. (b) Bound activators prevent methylation of the C residues. (c) If activators are no longer present, the CpG island becomes methylated; repressors called methyl-CpG binding proteins (MeCPs) bind methylated sites and close the chromatin.
[Figure labels: (a) Methylation. (b) C unmethylated; open chromatin; transcription; pol II; CpG island. (c) C methylated; closed chromatin; no transcription; CpG island.]
DNA Methylation at CpG lslands Silences Gene Expression CpG islands are DNA sequences that may be a few hundred or a few thousand bp long, and within which the frequency of CpG dinucleotides is much higher than that of the rest of the genome. However, unlike the CpG dinucleotides in the rest of the mammalian genome, the C residues in CpG islands are usually unmethylated. When the CpG islands in the vicinity of a genet promoter are unmethylated, the chromatin is 'bpen' and the gene is transcriptionally active. Methylation of the CpG islands 'tloses" the chromatin and represses transcription (Fig. 16.f 8b and c). The reason that CpG islands are usually unmethylated is that the proteins that activate transcription by binding to CpG islands prevent DNMTs from methylating these islands (Fig. 16.18b). These transcriptional activators will be
558 Chapter 16 Gene Regulation in Eukaryotes
found in many cell types if the target gene is a housekeeping gene expressed in most cells. If the activators are not present, the CpG island becomes methylated. The gene cannot be transcribed because repressors called methyl-CpG-binding proteins (MeCPs) bind to methylated CpG islands and close the chromatin structure (Fig. 16.18c).
Repression of genes by DNA methylation is often long-term because the methylation pattern is maintained through numerous cell divisions; long-term repression through DNA methylation is called silencing. DNA methylation patterns are copied during DNA replication by a special DNMT present at the replication fork that recognizes hemi-methylated DNA (DNA methylated on only one strand, in this case the parental strand) and methylates the newly synthesized DNA strand (Fig. 16.19).
The potential importance of gene regulation through DNA methylation in humans was revealed by the discovery of the mutant gene responsible for Rett syndrome, a developmental disorder of the brain that results in seizures and mental and physical impairment. The disease shows X-linked dominant inheritance, and is caused by loss-of-function mutations in an X-linked gene called MeCP2 that encodes a methyl-CpG-binding protein.
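The copying of a methylation pattern onto a newly synthesized strand can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the function names, the example sequence, and the use of 0-based positions are all hypothetical choices made for illustration, and the "DNMT" is reduced to a single rule (methylate the new strand's C wherever the parental strand carries a methylated CpG).

```python
def reverse_complement(seq):
    """Return the complementary DNA strand, written 5'->3'."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def cpg_sites(seq):
    """0-based positions of the C in every 5'-CG-3' dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def maintenance_methylation(parental, methylated):
    """Toy model of a maintenance DNMT acting on hemi-methylated DNA.

    parental   -- parental (template) strand, written 5'->3'
    methylated -- set of positions of methylated Cs on the parental strand
    Returns the new strand (5'->3') and the set of positions on the new
    strand that become methylated.
    """
    n = len(parental)
    new_strand = reverse_complement(parental)
    new_marks = set()
    for i in cpg_sites(parental):
        if i in methylated:
            # The parental G at i+1 pairs with the C of the same CpG on the
            # new strand; in the new strand's own 5'->3' coordinates that C
            # sits at position n - 2 - i.
            new_marks.add(n - 2 - i)
    return new_strand, new_marks


# Hypothetical 8-bp example with CpG sites at positions 1 and 5;
# only the first site is methylated on the parental strand.
parental = "ACGTTCGA"
new_strand, new_marks = maintenance_methylation(parental, {1})
print(new_strand, sorted(new_marks))
```

Because a CpG on one strand always faces a CpG on the other, the methylated site is reproduced on the new strand while the unmethylated site stays unmethylated, which is the property that lets the pattern persist through cell divisions.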
Figure 16.19 Cytosine methylation is perpetuated during DNA replication. A dedicated DNA methyltransferase (DNMT) functions at the DNA replication fork; the pattern of cytosine methylation (blue circle) on the template strand is replicated on the newly synthesized strand of DNA (red).

Sex-Specific DNA Methylation Is Responsible for Genomic Imprinting
A major tenet of Mendelian genetics is that the parental origin of an allele, whether it comes from the mother or the father, does not affect its function in the F1 generation. For the vast majority of genes in plants and animals, this principle still holds true. Surprisingly, however, geneticists have found that some genes in mammals are exceptional and do not obey this general rule. The unusual phenomenon in which the expression of an allele depends on the parent that transmits it is known as genomic imprinting.
In genomic imprinting, the copy of a gene an individual inherits from one parent is transcriptionally inactive, while the copy inherited from the other parent is active. The term imprinting signifies that whatever silences the maternal or paternal copy of an imprinted gene is not a change in the nucleotide sequence of DNA. Instead, as you will see later in this section, the "whatever" is sex-specific methylation of certain DNA sequences called imprinting control regions (ICRs).
Only about 100 of the approximately 25,000 genes in the human genome exhibit imprinting. This number of imprinted genes was estimated by RNA-Seq experiments (review pp. 534-535 in Chapter 15) that could distinguish transcripts of a gene from the two homologs in a heterozygous individual. About half of these 100 are paternally imprinted genes, meaning that the allele inherited from the father is not expressed, while the allele from the mother is transcribed. For maternally imprinted genes, the allele inherited from the mother is not transcribed, and all of the mRNA for this gene is made from the paternal allele. It may be easier for you to track this nomenclature by equating the term imprinted with silenced.
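The nomenclature (imprinted means silenced) can be restated as a short Python sketch. This is an illustrative model only: the function names and the allele labels "wt" (functional) and "del" (deleted) are hypothetical, and the model assumes that silencing of the imprinted copy is complete.

```python
def expressed_alleles(imprint, maternal_allele, paternal_allele):
    """Return the alleles that are transcribed at one locus.

    imprint -- "paternal" (the copy from the father is silenced),
               "maternal" (the copy from the mother is silenced),
               or None for an ordinary, non-imprinted gene.
    """
    if imprint == "paternal":
        return [maternal_allele]
    if imprint == "maternal":
        return [paternal_allele]
    if imprint is None:
        return [maternal_allele, paternal_allele]
    raise ValueError(f"unknown imprint: {imprint!r}")


def makes_product(imprint, maternal_allele, paternal_allele):
    """True if at least one transcribed allele is functional ("wt")."""
    return "wt" in expressed_alleles(imprint, maternal_allele, paternal_allele)


# Heterozygotes for a deletion ("del") of a paternally imprinted gene.
# Deletion inherited from the father: the maternal wt copy is expressed.
print(makes_product("paternal", maternal_allele="wt", paternal_allele="del"))
# Deletion inherited from the mother: the only intact copy is silenced.
print(makes_product("paternal", maternal_allele="del", paternal_allele="wt"))
```

The two calls differ only in which parent transmitted the deletion, yet only the first individual makes gene product; for a non-imprinted gene (`imprint=None`) both heterozygotes would.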
Imprinting and human disease
The existence of genomic imprinting was first inferred well before the RNA-Seq technique was developed, when clinical geneticists in the 1980s observed pedigrees in which the sex of the parent carrying the mutant allele determined whether the child would manifest the disease. These kinds of pedigree patterns were particularly clear in certain rare cases where the condition was caused by a deletion that removed an imprinted gene, because the inheritance of the deletion as well as the disease could be followed in karyotypes.
As seen in Fig. 16.20a, a deletion of a paternally imprinted gene could pass without effect from a father to any child, because the child's wild-type maternal allele would be expressed. However, if a woman was heterozygous for the same deletion, 50% of her children would receive the deletion from her. All of these heterozygous children would have the mutant phenotype, because the one intact copy of the gene they inherit from their father is inactive; no gene product can be made, causing the aberrant phenotype. Conversely, deletion of a maternally imprinted gene could pass unnoticed from mother to daughter for many generations because the paternally derived gene copy is always active. If, however, the deletion passed from a man to his children, both the sons and daughters would each have a 50% chance of receiving a deleted paternal allele, and those children would express the mutant phenotype because the intact copy inherited from their mothers is inactive (Fig. 16.20b).

Figure 16.20 Genomic imprinting and human disease. (a) Paternal imprinting: a typical pedigree for a disease associated with deletion of a paternally imprinted autosomal gene. Fathers can pass the deletion to their sons or daughters who are unaffected (dots in pedigree symbols indicate unaffected carriers of the deletion); mothers can pass the deletion and the disease (yellow shading) to their children. (b) Maternal imprinting: a typical pedigree for a disease associated with deletion of a maternally imprinted gene. Here, it is the mothers who can pass the deletion to their sons and daughters without effect (dots); fathers can pass the deletion and the disease to their sons and daughters who will be affected (purple). Both pedigrees also apply for inheritance of a recessive loss-of-function mutation of the imprinted gene instead of a deletion.

Evidence for imprinting as a contributing factor now exists for a variety of human developmental disorders, including the related pair of conditions known as Prader-Willi syndrome and Angelman syndrome. Children with Prader-Willi syndrome have small hands and feet, underdeveloped gonads and genitalia, a short stature, and mental retardation; they are also compulsive overeaters and obese. Children affected by Angelman syndrome have red cheeks, a large jaw, and a large mouth with a prominent tongue; they also show severe mental and motor retardation. Both syndromes are often associated with small deletions in the q11-13 region of chromosome 15. When the deletion is inherited from the father, the child develops Prader-Willi syndrome; when the same deletion comes from the mother, the child has Angelman syndrome.
The explanation for this phenomenon is that at least two genes in the region of these deletions are differently imprinted. One gene is maternally imprinted; children receiving a deleted chromosome from their father and a wild-type (nondeleted) chromosome with an imprinted copy of this gene from their mother exhibit Prader-Willi syndrome because the imprinted, wild-type gene is inactivated. In the case of Angelman syndrome, a different gene in the same region is paternally imprinted; children receiving a deleted chromosome from their mother and a normal, imprinted gene from their father develop this syndrome.

Imprinting as an epigenetic phenomenon
Genes may be modified in a manner that does not change the base pair sequence of the DNA, but nevertheless affects gene transcription in a heritable manner. As mentioned earlier, modifications to genes that alter gene expression without changing the base pair sequence and that are inherited directly through cell divisions are called epigenetic changes.
The type of epigenetic change responsible for genomic imprinting is sex-specific DNA methylation of CpG dinucleotides found in specific ICRs (imprinting control regions) that are located near the 100-odd imprinted genes. Imprints are maintained when somatic cells divide by mitosis because the pattern of methylation can be transmitted during DNA replication. The presence of a methyl group on one strand of a newly synthesized double helix signals DNMT methylase enzymes to add a methyl group to the other strand (review Fig. 16.19). Sex-specific methylation of imprinted loci thus generally remains in the somatic cells throughout the life of the individual.
Note, however, that the pedigrees shown in Fig. 16.20 require that the patterns of DNA methylation must be reset during meiosis before being passed on to the next generation. If this were not true, the imprinting would not be sex-specific. Figure 16.21 shows that the methylations are erased (removed) in the germ-line cells, and sex-specific methylation marks are then generated during each passage of the gene through the germ line into the next generation.
Some genes are methylated in the maternal germ line; others receive methylation marks in the paternal germ line. For each gene subject to this effect, imprinting occurs in either the maternal or paternal line, never in both. The molecular differences in the male and female germ line that result in different patterns of methylation are unknown.

Figure 16.21 Genomic imprints are reset during meiosis. Maternally imprinted genes are shown in red and paternally imprinted genes in black. In germ-line cells, somatic cell imprints are erased, and new sex-specific imprints are established. Diagram labels: Homolog 1 and Homolog 2; zygote (male, female); somatic cell; germ cell; old imprints erased; new imprints made; sperm; eggs.

How imprinting works
DNA methylation at ICRs controls the transcription of nearby genes. In contrast with methylation at CpG islands, which always represses transcription, methylation at ICRs can turn imprinted genes either on or off. Biochemical studies have uncovered two ways in which ICR methylation can influence gene expression.

Insulator Mechanism
Here, the ICR contains an insulator whose function is controlled by DNA methylation. An example of this mechanism for ICR function is seen in the maternally imprinted mouse gene Igf2 (for insulin-like growth factor 2). Imprinting at the Igf2 locus works through methylation of an insulator that lies between the Igf2 promoter and its enhancer (Fig. 16.22a). The nonmethylated insulator on the maternal chromosome is functional: it binds CTCF, which, as we saw earlier, is a protein whose association with insulators forms loops in chromatin. As a result, the enhancer on the maternal chromosome cannot activate transcription of the Igf2 gene because it is not in the same loop as the gene's promoter.
On the paternal chromosome, by contrast, the insulator is methylated, which prevents it from binding CTCF. Without a functional insulator, the enhancer activates transcription from the Igf2 promoter because these two elements are now in the same loop. Note in this case that even though it is the paternal chromosome that is methylated, it is the maternal allele that is not transcribed (that is, Igf2 is maternally imprinted).

Noncoding RNA (ncRNA) Mechanism
In the vicinity of some imprinted genes, the ICR encodes an ncRNA whose transcription is controlled by a CpG island. The paternally imprinted insulin growth factor receptor 2 gene (Igfr2), which encodes the receptor for Igf2, provides an example of this imprinting mechanism (Fig. 16.22b). On the paternal chromosome, an ncRNA called Air is transcribed from a promoter within an intron of Igfr2 but in the opposite direction from Igfr2. The Air ncRNA is thus an antisense transcript that suppresses the expression of Igfr2. On the maternal chromosome, a CpG island that controls Air transcription is methylated, silencing transcription of Air and thus permitting expression of Igfr2. It is not clear how the antisense Air suppresses Igfr2; perhaps the act of Air transcription itself somehow interferes with transcription of Igfr2, or interaction of Air and Igfr2 transcripts could lead to the latter's destruction by "RNA interference" (described later).

Figure 16.22 Genomic imprinting mechanisms. (a) Insulator mechanism of imprinting. Maternal imprinting of Igf2 is controlled by methylation of an insulator located between the Igf2 enhancer and promoter. On the maternal homolog, the insulator is unmethylated and therefore functional (binds CTCF); on the paternal homolog, the insulator is methylated and does not function. (b) ncRNA mechanism of imprinting. Paternal imprinting of Igfr2 depends on methylation of a CpG island that controls transcription of the Air ncRNA; when Air is transcribed, Igfr2 is not expressed. The CpG island on the maternal homolog is methylated, silencing Air transcription and allowing Igfr2 expression; the paternal Igfr2 allele is silenced because the CpG island is unmethylated and Air is transcribed. Diagram labels: (a) maternal Igf2 allele not transcribed, paternal Igf2 allele transcribed, insulator, CTCF; (b) maternal Igfr2 allele transcribed, paternal Igfr2 allele not transcribed, Air, Air CpG island; key: unmethylated C, methylated C.

Why imprinting?
The answer to this question is not known, but several hypotheses have been proposed. One of the most interesting of these ideas is based on the facts that imprinting occurs only in placental mammals and that most imprinted genes, like Igf2 and Igfr2, control prenatal growth. This so-called parental conflict hypothesis imagines that because a fetus growing in the womb uses tremendous maternal resources, it is in the mother's interest for her baby to be small so as to balance her own needs with those of the child; conversely, the father's only interest is for his babies to be large and therefore more robust. According to the parental conflict theory, imprinting may be nature's way of playing out this struggle in the womb. For example, Igf2 encodes a ligand that promotes growth, and it is maternally imprinted, while Igfr2 encodes a receptor for the ligand that represses growth and is paternally imprinted. Although the parental conflict hypothesis is compelling on its surface, many biologists think that it is overly simplistic, and they have very different ideas about the origins of genomic imprinting.
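The two ICR mechanisms described in this section for Igf2 and Igfr2 can be written as Boolean logic, which makes it easier to see why methylation turns one gene on and the other off. The Python sketch below is a deliberate simplification: the function names and the dictionary are hypothetical, each regulatory step is treated as all-or-none, and the methylation states are simply those stated in the text for the maternal and paternal homologs.

```python
def igf2_transcribed(insulator_methylated):
    """Insulator mechanism: methylation inactivates the insulator."""
    ctcf_bound = not insulator_methylated   # methylation prevents CTCF binding
    # A CTCF-bound insulator places the enhancer and the promoter in
    # different chromatin loops, so the enhancer cannot act on the promoter.
    enhancer_and_promoter_in_same_loop = not ctcf_bound
    return enhancer_and_promoter_in_same_loop


def igfr2_transcribed(air_cpg_island_methylated):
    """ncRNA mechanism: methylation silences the antisense Air ncRNA."""
    air_transcribed = not air_cpg_island_methylated
    return not air_transcribed              # Air suppresses Igfr2 expression


# ICR methylation state on each homolog, as described in the text.
ICR_METHYLATED = {
    "Igf2": {"maternal": False, "paternal": True},
    "Igfr2": {"maternal": True, "paternal": False},
}

for homolog in ("maternal", "paternal"):
    print(
        homolog,
        "Igf2 transcribed:", igf2_transcribed(ICR_METHYLATED["Igf2"][homolog]),
        "Igfr2 transcribed:", igfr2_transcribed(ICR_METHYLATED["Igfr2"][homolog]),
    )
```

In both functions methylation of the ICR leads to transcription of the gene, in contrast with promoter CpG islands; the homolog that ends up silent is therefore the unmethylated one, which is the maternal homolog for Igf2 and the paternal homolog for Igfr2.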
essential concepts
• Certain repressors bind methylated CpG islands, blocking transcription activators; this repression is maintained through generations of cells because CpG methylation patterns are copied during DNA replication.
• The expression patterns of about 100 human genes depend on whether they were inherited from the male or female parent. Paternally imprinted genes are silenced when inherited from the father, while maternally imprinted genes are silenced when inherited from the mother.
• Epigenetic phenomena, such as imprinting, are caused by changes in DNA that alter gene expression without changing base-pair sequence and that are heritable during cell division.
• Genomic imprinting results from sex-specific DNA methylation of cis-acting elements (ICRs) that control the expression of particular genes. During meiosis, the old imprints are erased and new sex-specific methylation patterns are established.

16.4 Regulation After Transcription
learning objectives
1. Explain how the primary transcript of a single eukaryotic gene can produce different proteins.
2. Contrast the functions of the three main categories of small regulatory RNAs (miRNAs, siRNAs, and piRNAs).

Gene regulation can take place at any point in the process of gene expression. Thus far we have discussed mechanisms that influence rates of transcription initiation, but many other systems exist that regulate posttranscriptional events, including the splicing, stability, and localization of mRNAs; the translation of these mRNAs into proteins; and the stability, localization, and modifications of the protein products of these mRNAs. It is impossible to introduce all of these mechanisms in a single chapter, so we focus here on two of the most important: first, how RNA-binding proteins can affect alternative splicing of a primary transcript into different mRNAs, and second, how small RNAs can influence the stability and translation of specific mRNAs.

Sequence-Specific RNA Binding Proteins Can Regulate RNA Splicing
The genomes of eukaryotic cells have many fewer genes than the number of different proteins expressed in those cells. One of the ways cells can generate more than one type of protein from a single gene is through alternative splicing: that is, the splicing of primary transcripts into distinct mRNAs that produce different proteins (review Fig. 8.17 on p. 273). The spliceosomes that assemble at the splice junction sites of primary transcripts to enable their splicing can contain more than 100 proteins. The spliceosome proteins carry out different functions, including RNA cleavage and ligation to join exons together. Some of the spliceosome components are crucial for determining which exons become spliced together; these spliceosome proteins recognize specific RNA sequences in the primary transcript to either facilitate or prevent the use of particular splice junction sequences.
We mentioned at the beginning of this chapter that sex-specific courting behaviors of male Drosophila are under the control of the fruitless (fru) gene. In their brains and elsewhere in the nervous system, male flies produce a male-specific form of the fru gene product, Fru-M, a zinc-finger transcription factor. The synthesis of the Fru-M protein only in males requires a decision at the level of splicing that depends on the absence of a female-specific RNA-binding protein called Transformer (Tra). (We will discuss in the comprehensive example later why Tra is generated only in females.) The fru primary transcript is made in both sexes. In females, Tra (together with a protein present in both sexes called Tra2) binds specific sequences in the fru primary RNA; Tra and Tra2 interact with other spliceosome components and block the use of a particular splice acceptor site, resulting in a fru mRNA that produces a female-specific Fru-F protein (Fig. 16.23). In males, whose cells carry no Tra protein, alternative splicing of the fru transcript generates a related Fru-M protein with 101 additional amino acids at its N terminus (Fig. 16.23).

Figure 16.23 Sex-specific splicing of the primary fru transcript. Splicing of fru RNA in the absence of Tra protein (in males) produces an mRNA that is translated into Fru-M protein. In females, Tra protein (with Tra2) blocks the use of one exon, causing the fru transcript to be spliced so as to encode Fru-F. Diagram labels: in males, 1 X chromosome → no Sxl product → no Tra product → male-specific splicing → Fru-M protein → male sexual behavior; in females, 2 X chromosomes → Sxl product → Tra product → female-specific splicing → Fru-F protein → female sexual behavior; the fru primary transcript is produced in both sexes.

Although Fru-F does not appear to have a function, Fru-M elicits a program of gene expression that controls the male mating dance. Male flies with fru mutations that block production of Fru-M court males and females indiscriminately. Conversely, female flies with fru mutations that cause them to express Fru-M acquire male sexual behaviors; they display the male mating dance and specifically court females. Researchers are now trying to identify the transcriptional targets of Fru-M that ultimately dictate these behaviors.

Small RNAs Regulate mRNA Stability and Translation
In the first five years of the twenty-first century, new types of gene regulators were discovered in the form of small, specialized RNAs that prevent the expression of specific genes through complementary base pairing (Table 16.1). Three classes of small regulatory RNAs have now been identified: micro-RNAs (miRNAs), small interfering RNAs (siRNAs), and Piwi-interacting RNAs (piRNAs). Each small RNA class is generated through a distinct pathway, leading to the production of single-stranded RNAs of slightly different lengths but always within the range of 21-30 nucleotides.
To exert their functions, each small RNA class requires distinct members of the Argonaute/Piwi protein family with which they form ribonucleoprotein complexes. The small RNA in each complex serves to guide the complex to particular nucleic acid targets that have perfect or partial complementarity with the small RNA. All three classes of small RNAs regulate gene activity at the posttranscriptional level through the modulation of RNA stability and/or translation; siRNAs and piRNAs also act at the transcriptional level by affecting chromatin structure.

Table 16.1 Small RNAs
Type | Targets | Effects
miRNAs (micro-RNAs) | mRNAs | Block mRNA translation; destabilize mRNAs
siRNAs (small interfering RNAs) | mRNAs; nascent transcripts of chromosomal regions destined to become heterochromatin | Block translation/destabilize mRNAs; recruit histone-modifying enzymes to DNA, resulting in heterochromatin formation
piRNAs (Piwi-interacting RNAs) | Transposable element transcripts; transposable element promoters | Degradation of transposable element mRNA; facilitate histone modifications that inhibit transposable element transcription

miRNAs
In animals, one of the most abundant classes of small RNAs is composed of the micro-RNAs (miRNAs). As will be seen shortly, miRNAs are usually negative regulators of target mRNAs, resulting in the destruction of these mRNAs or preventing them from being translated.
The human genome has close to 1000 genes encoding miRNAs. These genes are transcribed by RNA polymerase II into long primary transcripts called pri-miRNAs that contain one or more miRNA sequences in the form of mostly double-stranded stem loops (Fig. 16.24). The pri-miRNAs need to be processed to form the active miRNAs, which are short and single-stranded. Figure 16.25 diagrams this multistep process, which is aided by two ribonuclease enzymes called Drosha and Dicer. During the process, the miRNA sequences are transported out of the nucleus (where they were transcribed) into the cytoplasm, where they will act. Furthermore, the miRNAs become incorporated into ribonucleoprotein complexes called miRNA-induced silencing complexes (miRISCs); each miRISC contains a particular member of the Argonaute protein family.

Figure 16.24 micro-RNA-containing genes. Primary miRNA transcripts (pri-miRNAs) can contain one or several miRNAs. Some of these primary transcripts do not encode proteins (a), but in other cases miRNAs can be processed from the introns of protein-coding transcripts (b). Diagram labels: 5' cap; protein-coding region.

Figure 16.25 miRNA processing. Immediately after transcription, pri-miRNAs are recognized by the nuclear enzyme Drosha, which crops out pre-miRNA stem-loop structures from the larger RNA. The pre-miRNAs undergo active transport from the nucleus into the cytoplasm, where they are recognized by the enzyme Dicer. Dicer reduces the pre-miRNA into a short-lived miRNA*:miRNA duplex, which is released and picked up by a RISC. The RISC becomes a functional and highly specific miRISC by eliminating the miRNA* strand that is partially complementary to the miRNA that will serve as the guide in the miRISC. Diagram labels: nucleus (miRNA gene, transcription, pri-miRNA, cropping by Drosha, pre-miRNA); cytoplasm (dicing by Dicer, miRNA*:miRNA duplex, RISC, miRNA* degradation, miRNA (guide), functional miRISC).

The ribonucleoprotein complexes (miRISCs) containing miRNAs mediate diverse functions depending on the particular Argonaute protein they possess, and on the extent of sequence complementarity between the miRNA in the complex (called the guide) and the target sequences in mRNA 3' UTRs. A miRISC whose guide miRNA has perfect complementarity with the target RNA causes mRNA cleavage (Fig. 16.26a). With less complementarity, the mechanism is usually inhibition of translation (Fig. 16.26b), although exactly how miRISCs regulate translational activity is not yet understood.
Because exact complementarity between guide and target RNAs is not required, each type of miRNA ultimately can control several different mRNAs (about 10 on average). As a result, scientists estimate that about half of all human genes are controlled by miRNAs. Moreover, each miRNA gene is transcribed according to its own temporal and spatial pattern during the development of a multicellular organism, so any single miRNA can influence gene expression in different ways in different tissues.

siRNAs
The small-interfering RNA (siRNA) pathway has many similarities with that just described for miRNAs. A key difference is the source of the small RNA. Instead of resulting from the processing of a long, single-stranded transcript, as was the case with miRNAs, siRNAs result from the processing (also by Dicer) of double-stranded RNAs (dsRNAs). These dsRNAs are produced originally by transcription of both strands either of an endogenous DNA sequence in the genome or an exogenous source such as a virus. Processing of the dsRNAs produces single-stranded RNAs that form ribonucleoprotein complexes with Argonaute proteins.
Figure 16.26 How miRISCs interfere with gene expression. The miRISC can down-regulate target genes in two different ways. (a) mRNA cleavage: if the miRNA and its target mRNA contain perfectly complementary sequences, miRISC cleaves the mRNA. The two cleavage products are no longer protected from RNase and are rapidly degraded. (b) Translational repression: if the miRNA and its target mRNA have only partial complementarity, translation of the mRNA is inhibited by an unknown mechanism. Diagram labels: (a) miRISC, 5' cap, perfect complementarity, mRNA degraded; (b) miRISC, 5' cap, incomplete complementarity, mRNA not translated.
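The relationship between guide-target complementarity and the outcome for the mRNA (perfect pairing leads to cleavage, partial pairing to translational repression) can be sketched in Python. Several simplifications are assumptions made only for illustration: the function names, the example guide sequence, counting only Watson-Crick pairs (G:U wobble pairs are ignored), and especially the `min_partial` cutoff, which is a hypothetical parameter and not a value given in the text.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def reverse_complement_rna(seq):
    """Return the perfectly complementary RNA, written 5'->3'."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def paired_fraction(guide, target_site):
    """Fraction of guide bases that pair with the target site.

    Both RNAs are written 5'->3'. Pairing is antiparallel, so guide[i]
    is compared with target_site[-1 - i].
    """
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must have equal length")
    matches = sum(
        (g, t) in WATSON_CRICK for g, t in zip(guide, reversed(target_site))
    )
    return matches / len(guide)


def mirisc_outcome(guide, target_site, min_partial=0.5):
    """Classify the effect of a miRISC on one target site.

    min_partial is a hypothetical cutoff chosen for this sketch only.
    """
    fraction = paired_fraction(guide, target_site)
    if fraction == 1.0:
        return "mRNA cleavage"
    if fraction >= min_partial:
        return "translational repression"
    return "no regulation"


guide = "UAGCUUAUCAGACUGAUGUUGA"               # arbitrary 22-nt guide
perfect_site = reverse_complement_rna(guide)    # fully complementary site
# Introduce mismatches in the middle of the site (hypothetical example).
partial_site = perfect_site[:8] + "AAAA" + perfect_site[12:]

print(mirisc_outcome(guide, perfect_site))
print(mirisc_outcome(guide, partial_site))
```

Because a partially matching site is still accepted, one guide can recognize many different mRNAs, which is consistent with the statement that each miRNA controls several targets.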
Using the single-stranded RNA as a guide, these complexes can interfere with the expression of a gene containing the complementary sequence by mechanisms previously shown for miRNA-containing complexes in Fig. 16.26. The siRNA pathway may also protect the cell from invading viruses by destroying viral mRNAs.
Researchers have exploited the siRNA pathway to selectively shut off the expression of specific genes in order to evaluate their function. The idea is to introduce dsRNA corresponding to a particular gene into the cell or organism to shut off or knock down the expression of the endogenous gene in the genome; this technique is called RNA interference. The processing pathway for siRNA will convert the double-stranded RNA into a single-stranded siRNA; within the context of an Argonaute-containing complex, the siRNA will hybridize with the complementary mRNA transcript for the gene and mediate destruction of that mRNA. In this way, scientists can turn down the expression of any gene of interest and investigate any possible phenotypic consequences of this loss of function. In Chapter 18, we will explore the use of RNA interference to "genetically dissect" many biological processes.
Another important role of siRNAs is in the formation of heterochromatin. Recall from Chapter 11 that heterochromatin formation involves modifications of histone tails that facilitate binding of a protein called HP1 (review Fig. 11.14, p. 392). A chromosomal region destined to become heterochromatin is first transcribed bidirectionally, and the resulting long double-stranded RNA is processed by Dicer into small double-stranded RNAs. A complex similar to RISC called the RNA-induced transcriptional silencing (RITS) complex incorporates one strand of these duplexes, and uses this siRNA as a guide to bind its complement in a nascent transcript being transcribed from the DNA destined to become heterochromatin. The RITS complex brings histone-modifying enzymes to the DNA; the result is HP1 binding, closed chromatin, and the inactivation of transcription.

piRNAs
You will recall from previous chapters that the genomes of eukaryotic organisms contain many transposable elements (TEs) that propagate themselves by mobilization and transposition. The organisms harboring these TEs must limit TE movement to prevent their genomes from being destroyed by rapid mutation and rearrangement. One important mechanism by which organisms can minimize TE mobilization is through the action of small Piwi-interacting RNAs (piRNAs). These piRNAs block both the transcription of TEs in the genome and the translation of the TE mRNAs that still get transcribed. Without the synthesis of enzymes like transposase (for DNA transposons) or reverse transcriptase (for retrotransposons), the TEs cannot move.
piRNAs are generated by cleavage of long RNAs transcribed from piRNA gene clusters located throughout the genome, each of which encodes between 10 and 1000 piRNAs. After processing, piRNAs are loaded onto complexes containing Piwi proteins (one subfamily of Argonaute proteins), and the piRNA guides the Piwi complexes to TE DNA or TE transcripts. Piwi complexes at TE DNA facilitate histone modifications that interfere with TE transcription, while Piwi complexes bound to TE transcripts degrade the TE RNAs. Many of the details of the piRNA pathway remain to be worked out.

essential concepts
• In eukaryotes, alternative splicing can produce different proteins from a single transcript. Sequence-specific RNA-binding proteins can inhibit or promote the use of particular splice-junction sequences.
• Three classes of small RNAs regulate mRNA stability, translation, or transcription through complementary base pairing: miRNAs, siRNAs, and piRNAs. These small RNAs act as guides to bring protein complexes to particular target mRNAs (leading to mRNA destruction or preventing translation) or to DNA sequences near promoters (blocking transcription or promoting heterochromatization).

16.5 A Comprehensive Example: Sex Determination in Drosophila
learning objectives
1. Explain how the Sxl promoter "counts" the number of X chromosomes in Drosophila.
2. Describe the cascade of events initiated by the Sxl protein that results in female morphology and behavior.
3. Discuss the role of transcriptional regulation in Drosophila sex determination.
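As a compact restatement of the chain summarized in Fig. 16.23 (number of X chromosomes, Sxl product, Tra product, fru splicing), the following Python sketch may help. It is a toy model under stated assumptions: the function name and the returned dictionary are hypothetical, each step is treated as a simple on/off switch, and only the two cases described in the text (one X or two X chromosomes) are meaningful inputs.

```python
def fru_pathway(x_chromosomes):
    """Follow the chain X count -> Sxl -> Tra -> fru splicing.

    Assumes, as a simplification, that two X chromosomes are both
    necessary and sufficient for Sxl product, and that Tra product is
    made only when Sxl product is present.
    """
    if x_chromosomes not in (1, 2):
        raise ValueError("this sketch covers only 1 or 2 X chromosomes")
    sxl_product = x_chromosomes == 2
    tra_product = sxl_product
    # Tra (with Tra2) blocks one splice acceptor site in the fru primary
    # transcript; without Tra, splicing yields the longer Fru-M protein.
    fru_protein = "Fru-F" if tra_product else "Fru-M"
    return {"Sxl": sxl_product, "Tra": tra_product, "Fru": fru_protein}


print(fru_pathway(1))   # XY flies
print(fru_pathway(2))   # XX flies
```

Writing the chain this way makes the logic of the mutant phenotypes easier to follow: anything that removes Tra from an XX fly, or forces Fru-M expression, changes the last step without changing the number of X chromosomes.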
16.5 A Comprehensive Example: Sex Determination in Drosophila
Male and female Drosophila exhibit many sex-specific differences in morphology, biochemistry, behavior, and function of the germ line (Fig. 16.27). Through decades of work, researchers concluded that in Drosophila, it is the number of X chromosomes, not the presence of the Y, that determines sex, and that sex determination first occurs through transcriptional regulation of the Sxl gene. Transcription of Sxl in XX (and not XY) animals initiates a cascade of events that influences sex through three independent pathways: One determines whether the flies look and act like males or females; another determines whether germ cells develop as eggs or sperm; and a third produces dosage compensation by doubling the rate of transcription of X-linked genes in males.
Figure 16.27 Sex-specific traits in Drosophila. Objects or traits shown in blue are specific to males. Objects or traits shown in red are specific to females. Objects or traits shown in green are found in different forms in the two sexes.
[Figure labels: Brain: regions determining courtship behaviors; more Kenyon fibers in female mushroom body. Antenna: sensillae. Foreleg: chemosensory axons; sex comb in male. Thoracic ganglion: courtship behaviors. Fat body: yolk proteins in female. Abdomen: pigmentation; male-specific muscle. Gonads and reproductive tract, in male: testes/spermatogenesis; accessory gland peptides; ejaculatory duct proteins. Gonads and reproductive tract, in female: ovaries/oogenesis; yolk, chorion, and vitelline membrane proteins.]
To simplify this discussion of sex determination in Drosophila, we focus on the first-mentioned pathway: the determination of somatic sexual characteristics. An understanding of this pathway emerged from analyses of mutations that affect particular sexual characteristics in one sex or the other. For example, as we saw at the beginning of the chapter, XY flies carrying mutations in the fruitless gene (fru) exhibit aberrant male courtship behavior, whereas XX flies with the same fru mutations appear to behave as normal females. Table 16.2 shows that mutations in other genes also affect the two sexes differently. Clarification of how these mutations influence somatic sex determination came from a combination of genetic experiments (studying, for example, whether one mutation is epistatic to another) and molecular biology experiments (in which investigators cloned and analyzed mutant and normal genes). Through such experiments, Drosophila geneticists dissected various stages of sex determination to delineate the following complex regulatory network.
The Number of X Chromosomes Determines Sex in Drosophila
In Chapter 4, you saw that in both humans and flies, XY animals are male and XX are female. However, the underlying molecular mechanism of sex determination is different in humans and flies. In humans the key to maleness is the presence of the SRY gene on the Y chromosome; femaleness is the default state in the absence of SRY. In flies, maleness is the default state brought about by the presence of only one X chromosome instead of two as in females; the reason is that two X chromosomes are required to activate transcription of the Sxl gene in early embryogenesis.
Figure 16.28 Sxl expression only in XX Drosophila. (a) In the early female (but not the male) Drosophila embryo, transcriptional activators encoded by X-linked genes are present in concentrations sufficient to initiate transcription from the Pe promoter of Sxl to produce an mRNA that encodes the Sxl protein. E1 and 4 denote the first two exons. (b) Later in development, transcriptional activators produced equally in XX and XY animals activate Sxl transcription from the Pm promoter. L1, 2, 3, and 4 denote the first few exons. When the Sxl protein is present (in females), it binds the Sxl primary transcript to make a spliced mRNA that can be translated into more Sxl protein. The result is a feedback loop that maintains Sxl protein in females but not in males. (The early and late Sxl proteins differ slightly in their N-terminal amino acids.)
[Panel (a) Early embryo: the enhancer of the Sxl Pe promoter "counts" X chromosomes. Transcription factors encoded by X-linked genes (Scute, Runt, SisA, Upd) activate expression of Sxl. Female: transcription from Pe, RNA, splicing, mRNA, Sxl protein. Male: no transcription, no Sxl protein.]
[Panel (b) Sxl protein regulates the splicing of its mRNA. Female: Sxl protein from early development binds to later RNA at exon 3; productive splice; full-length open reading frame (indicated by purple boxes); more Sxl protein. Male: unproductive splice; stop codon; truncated open reading frame; no Sxl protein.]
Counting of X chromosomes by the Sxl promoter
In early embryogenesis (before sex determination and dosage compensation have taken place), XX cells transcribe Sxl from the "establishment promoter" (Pe). Transcription from Pe depends on four transcriptional activator proteins: Scute, Runt, SisA, and Upd (Fig. 16.28a). Because the genes for these activators are on the X chromosome, XX embryos have twice as much of these four activators as XY embryos. Only in cells with two X chromosomes is the concentration of activators sufficient for Sxl transcription to occur.
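The counting mechanism just described amounts to a dose threshold: activator concentration scales with the number of X chromosomes, and the establishment promoter fires only above some cutoff. The following is a minimal sketch of that idea, not a quantitative model. The names `ACTIVATOR_PER_X`, `THRESHOLD`, `activator_dose`, and `sxl_transcribed_from_establishment_promoter` are hypothetical, and the numbers are arbitrary values chosen only so that one X falls below the cutoff and two X chromosomes fall above it.

```python
# Toy threshold model of X-chromosome "counting" at the Sxl establishment promoter.
# ASSUMPTION: both constants are arbitrary illustrative values, not measurements.
ACTIVATOR_PER_X = 1.0   # relative amount of Scute/Runt/SisA/Upd contributed per X
THRESHOLD = 1.5         # relative activator level needed for transcription


def activator_dose(x_count: int) -> float:
    """Relative activator concentration, proportional to the number of X chromosomes."""
    return ACTIVATOR_PER_X * x_count


def sxl_transcribed_from_establishment_promoter(x_count: int) -> bool:
    """True when the activator dose reaches the (assumed) threshold."""
    return activator_dose(x_count) >= THRESHOLD


if __name__ == "__main__":
    for karyotype, x_count in [("XX", 2), ("XY", 1)]:
        dose = activator_dose(x_count)
        on = sxl_transcribed_from_establishment_promoter(x_count)
        print(f"{karyotype}: dose={dose}, early Sxl transcription={on}")
```

Any threshold between the one-X and two-X doses gives the same qualitative outcome, which is the point of the sketch: a twofold difference in activator dose is converted into an all-or-none transcriptional decision.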
The action of the Sxl protein in females
Sxl is an RNA-binding protein that controls the alternative splicing of specific RNA targets, including its own RNA (Fig. 16.28b). As embryogenesis progresses, the transcription factors that activate Sxl transcription from Pe disappear, and Sxl is transcribed instead from the "maintenance promoter" (Pm). In males, splicing of the primary Sxl transcript produced from the maintenance promoter generates an RNA that includes an exon (exon 3) containing a stop codon in its reading frame. As a result, this RNA in males is not productive; it does not generate any Sxl protein. In females, however, the Sxl protein previously produced by transcription from the establishment promoter Pe influences the splicing of the primary transcript initiated at the maintenance promoter Pm. When the earlier-made Sxl protein binds to the later-transcribed RNA, this binding alters splicing so that exon 3 is no longer part of the final mRNA. Without exon 3, the mRNA can be translated to make more Sxl protein. Thus, a small amount of Sxl protein synthesized very early in development establishes a positive feedback loop that ensures more synthesis of Sxl protein later in development.
The effects of Sxl mutations
Recessive Sxl mutations that produce nonfunctional gene products have no effect in XY males, but they are lethal in XX females (see Table 16.2). The reason is that males, which do not normally express the Sxl gene, do not miss its functional product, but females, which depend on the Sxl protein for sex determination, do. The absence of the Sxl protein in females allows the aberrant expression of certain dosage-compensation genes that normally increase (specifically in males) transcription of the genes on the X chromosome. In females with mutations in Sxl, each X-linked gene is transcribed at twice the rate it is transcribed in normal females. Because females have two X chromosomes rather than the one X in males, hypertranscription of the genes from the two X chromosomes occurs and it proves lethal.
TABLE 16.2 Drosophila Mutations That Affect the Two Sexes
Mutation | Phenotype of XY | Phenotype of XX
Sex lethal (Sxl) | Male | Dead
transformer (tra) | Male | Male (sterile)
doublesex (dsx) | Intersex | Intersex
intersex (ix) | Male | Intersex
fruitless (fru) | Male with aberrant courtship behavior | Female
All mutant alleles are loss-of-function and recessive to wild type.
Sxl Protein Triggers a Cascade of Splicing
The Sxl protein influences the splicing of RNAs transcribed not only from its own gene, but also from other genes. Among these is the transformer (tra) gene. In the presence of the Sxl protein (as in normal females), the tra primary transcript undergoes productive splicing that produces an mRNA translatable to a functional protein. In the absence of Sxl protein (as in normal males), the splicing of the tra transcript results in a nonfunctional protein (Fig. 16.29a).
The cascade continues. You saw previously in Fig. 16.23 (p. 561) that Tra protein (and the non-sex-specific Tra2) control the splicing of fru primary RNA so that the transcription factor Fru-M is produced only in males; Fru-M controls male sexual behaviors. The Tra and Tra2 proteins also influence the splicing of the doublesex (dsx) gene's primary transcript. This splicing pathway results in the production of a female-specific Dsx protein called Dsx-F. In males, where there is no Tra protein, the splicing of the dsx primary transcript produces the related, but different, Dsx-M protein (Fig. 16.29b). The N-terminal parts of the Dsx-F and Dsx-M proteins are the same, but the C-terminal parts of the proteins are different.
Figure 16.29 A cascade of alternate splicing. (a) Sxl protein alters the splicing of tra RNA; female transcripts produce functional Tra protein, while male transcripts cannot. (b) Tra protein, in turn, alters the splicing pattern of dsx RNA; a different Dsx product results in males (Dsx-M) and in females (Dsx-F).
[Panel (a) tra splicing. When Sxl protein is present (female): Sxl protein blocks splice site; full-length open reading frame; functional Tra protein. In absence of Sxl protein (male): stop codon; truncated open reading frame; no functional Tra protein.]
[Panel (b) dsx splicing. When Tra is present (female): Dsx-F. When Tra is absent (male): Dsx-M.]
Dsx-F and Dsx-M Proteins Control Development of Somatic Sexual Characteristics
Although both Dsx-F and Dsx-M function as transcription factors, they have opposite effects. In conjunction with the protein encoded by the intersex (ix) gene, Dsx-F primarily represses the transcription of genes whose expression would generate the somatic sexual characteristics of males. However, it also activates the transcription of genes that promote somatic femaleness. Dsx-M, which works independently of the Intersex protein, does the opposite; it is primarily a transcriptional activator of maleness genes, and it also represses femaleness genes.
Interestingly, the two Dsx proteins can bind to the same enhancer elements, but their binding produces opposite outcomes (Fig. 16.30). For example, both bind to an enhancer upstream of the promoter for the yp1 gene, which encodes a yolk protein; females make this protein in their fat body organs and then transfer it to developing eggs. The binding of Dsx-F stimulates transcription of the yp1 gene in females; the binding of Dsx-M to the same enhancer region helps inactivate transcription of yp1 in males.
Figure 16.30 Male- and female-specific forms of Dsx protein. At the yp1 gene enhancer, Dsx-F acts as a transcriptional activator, whereas Dsx-M acts as a transcriptional repressor.
[Diagram: Male (1 X chromosome): no Sxl product; no Tra product; Tra2; Dsx-M protein; no YP1 product. Female (2 X chromosomes): Sxl product; Tra and Tra2; Dsx-F protein; transcription activated at the yp1 enhancer; YP1 product (yolk protein).]
Mutations in dsx affect both sexes because in both males and females, the production of Dsx proteins represses certain genes specific to development of the opposite sex. Null mutations in dsx that make it impossible to produce either functional Dsx-F or Dsx-M result in intersexual individuals that cannot repress either certain male-specific or certain female-specific genes (Table 16.2).
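The regulatory logic of this section can be written out as a small Boolean model, which makes it easier to check the predicted phenotypes against Table 16.2. This is only a sketch under simplifying assumptions: the class `Genotype` and the functions `cascade`, `somatic_sex`, and `yp1_transcribed` are hypothetical names; Tra2 is treated as always present; the ix and fru genes are left out; and the lethality of XX animals lacking Sxl (which the text attributes to misregulated dosage compensation) is hard-coded rather than modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genotype:
    x_count: int       # number of X chromosomes (2 = XX, 1 = XY)
    sxl: bool = True   # True means a functional allele is present
    tra: bool = True
    dsx: bool = True


def cascade(g: Genotype):
    """Return (Sxl protein present, Tra protein present, Dsx isoform or None)."""
    sxl_protein = g.x_count >= 2 and g.sxl   # only two X chromosomes switch Sxl on
    tra_protein = sxl_protein and g.tra      # productive tra splicing needs Sxl protein
    if not g.dsx:
        dsx_form = None                      # null dsx: neither isoform is made
    elif tra_protein:
        dsx_form = "Dsx-F"
    else:
        dsx_form = "Dsx-M"
    return sxl_protein, tra_protein, dsx_form


def somatic_sex(g: Genotype) -> str:
    """Predicted somatic phenotype under the simplifying assumptions above."""
    if g.x_count >= 2 and not g.sxl:
        return "dead"                        # hard-coded: dosage compensation misfires
    dsx_form = cascade(g)[2]
    return {"Dsx-F": "female", "Dsx-M": "male", None: "intersex"}[dsx_form]


def yp1_transcribed(g: Genotype) -> bool:
    """Dsx-F activates the yp1 enhancer; Dsx-M represses it."""
    return cascade(g)[2] == "Dsx-F"


if __name__ == "__main__":
    for label, g in [
        ("wild-type XX", Genotype(2)),
        ("wild-type XY", Genotype(1)),
        ("XX, Sxl mutant", Genotype(2, sxl=False)),
        ("XX, tra mutant", Genotype(2, tra=False)),
        ("XY, dsx mutant", Genotype(1, dsx=False)),
    ]:
        print(label, "->", somatic_sex(g))
```

The model reproduces the Sxl, tra, and dsx rows of Table 16.2 but says nothing about fertility or behavior, so the sterility of XX tra mutants and the courtship defects of fru mutants fall outside it.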
essential concepts
• The Sxl gene is the master regulator of sex determination in Drosophila. Transcription of Sxl depends on activator proteins encoded by genes on the X chromosome; only in cells with two X chromosomes is the concentration of activators high enough for Sxl transcription.
• In females, the Sxl protein initiates a splicing factor cascade that culminates in the synthesis of a dsx mRNA that encodes Dsx-F protein. In males without Sxl protein, alternative splicing results in a dsx mRNA that makes Dsx-M.
• The Dsx-F and Dsx-M proteins are transcription factors that have opposite effects on the expression of genes whose products influence female- and male-specific morphologies and behaviors.
What's Next
At this point in your journey through this book, you have seen how genes control phenotype, how they are transmitted from one generation to the next, how they can mutate, and how the structure of a gene relates to its function. You have also learned about technologies that allow scientists to analyze individual genes and whole genomes at the molecular level. In the previous two chapters, we discussed how genes and their products are regulated, allowing a single-celled organism to respond to its environment and a multicellular organism to form different organs.
In the next section of this book, "Using Genetics," we will explore how scientists exploit this knowledge to further our understanding of the workings of cells and whole organisms. In Chapter 17, you will see that genes can be inserted into or removed from the genomes of model organisms at will: any gene, indeed any base pair, in the genome can be changed at the whim of a molecular geneticist. Genome manipulation is the basis for gene therapies that hold promise for curing some human diseases. In Chapter 18, we will explore how scientists use the genetics of model organisms to dissect biological pathways. In particular, the analysis of mutants that develop aberrantly from a fertilized egg to a multicellular organism has helped uncover many details of this remarkable process. Finally, in Chapter 19, you will see how new technologies for studying genomes are revolutionizing our understanding and treatment of cancer, the most important of all genetic diseases.
I. The retinoic acid receptor (RAR) is a transcription factor that is similar to steroid hormone receptors. The substance (ligand) that binds to this receptor is retinoic acid. One of the genes whose transcription is activated by retinoic acid binding to the receptor is myoD. The diagram at the end of this problem shows a schematic of the RAR protein produced by a gene into which one of two different 12-base double-stranded oligonucleotides had been inserted in the ORF. The insertion site (a-m) associated with each mutant protein is indicated with the appropriate letter on the polypeptide map below. For constructs encoding a-e, oligonucleotide 1 (5' TTAATTAATTAA 3' read off either strand) was inserted into the RAR gene. For constructs encoding f-m, oligonucleotide 2 (5' CCGGCCGGCCGG 3') was inserted into the gene. Each mutant protein was tested for its ability to bind retinoic acid, bind to DNA, and activate transcription of the myoD gene. Results are tabulated as follows:
[Table: mutants a-m, each scored for retinoic acid binding, DNA binding, and transcriptional activation.]
[Polypeptide map: RAR protein from NH2 to COOH with insertion sites a-m marked.]
a. What is the effect of inserting oligonucleotide 1 anywhere in the protein?
b. What are the possible effects of inserting oligonucleotide 2 anywhere in the protein?
c. Indicate the three protein domains of RAR on a copy of the preceding drawing.
Answer
This question involves the concept of domains within proteins and the use of the genetic code to understand the effects of oligonucleotide insertions.
a. Oligonucleotide 1 contains a stop codon in any of its three reading frames. This means it will cause termination of translation of the protein in either orientation wherever it is inserted.
b. Oligonucleotide 2 does not contain any stop codons, and so it will just add amino acids to the protein. Because there are 12 bases in the oligonucleotide, it will not change the reading frame of the protein. Insertion of the oligonucleotide can disrupt the function of a site in which it inserts, although this will not necessarily be the case.
c. Looking at the data overall, notice that all mutants that are defective in DNA binding are also defective in transcriptional activation, as would be expected for a transcription factor that binds to DNA. The mutants that will be informative about the transcriptional activation domain are those that do not have a DNA-binding defect. Inserts a, b, and c using oligonucleotide 1, which truncates the protein at the site of insertion, are defective in all three activities. DNA binding and transcription activation are both seen in mutant d, so these two activities must lie closer to the N terminus than d. Truncation at d is negative for retinoic acid binding, but the truncation at e does bind to retinoic acid. The retinoic acid-binding activity must lie before e. Using the oligonucleotide 2 set of insertions, transcriptional activation was disrupted by insertions at sites g and h, indicating that this region is part of the transcriptional domain; i and j insertions disrupted the DNA binding, and k and l insertions disrupted the retinoic acid binding. The minimal endpoints of the domains of RAR as determined from these data are summarized in the following schematic.
[Schematic: RAR protein from NH2 to COOH with insertion sites a-m and three labeled domains: Transcriptional activation, DNA binding, Retinoic acid binding.]
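The claims in answers (a) and (b) about the two oligonucleotides can be checked mechanically. The sketch below scans each 12-base sequence for stop codons in all three reading frames and confirms that each sequence reads the same on the opposite strand. The helper names `reverse_complement` and `stop_frames` are hypothetical and written for this illustration; only the two sequences come from the problem.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

OLIGO_1 = "TTAATTAATTAA"
OLIGO_2 = "CCGGCCGGCCGG"


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def stop_frames(seq: str) -> set:
    """Reading-frame offsets (0, 1, 2) in which seq contains a complete stop codon."""
    frames = set()
    for offset in range(3):
        codons = {seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)}
        if codons & STOP_CODONS:
            frames.add(offset)
    return frames


if __name__ == "__main__":
    for name, oligo in [("oligonucleotide 1", OLIGO_1), ("oligonucleotide 2", OLIGO_2)]:
        print(name)
        print("  same sequence on either strand:", reverse_complement(oligo) == oligo)
        print("  frames containing a stop codon:", sorted(stop_frames(oligo)))
        print("  length is a multiple of 3:", len(oligo) % 3 == 0)
```

Because oligonucleotide 1 carries TAA in every frame and is identical on both strands, the truncation does not depend on where in a codon, or in which orientation, it lands. Oligonucleotide 2 has no stop codon in any frame, and its length of 12 keeps the downstream reading frame intact.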
II. Assume that the disease illustrated with the pedigree below is due to the phenotypic expression of a rare recessive allele of an autosomal gene that is paternally imprinted. What would you predict is the genotype of individuals (a) I-1, (b) II-1, and (c) III-2?
Answer
This question requires you to understand how imprinting influences phenotype, and how imprints are reset in the germ line. Alleles of paternally imprinted genes are expressed only if they are inherited from the mother. Neither parent in generation I expresses the disease, but two out of their three children do. At first glance, either parent could have the mutant allele of the gene. Because neither expresses the disease, each parent in generation I could have received an inactivated mutant allele from their father and a normal, transcriptionally active allele from their mother. However, further consideration shows that it cannot be the male parent (I-2) who is the source of the mutant allele. He can provide only transcriptionally inactive alleles to generation II. The two generation II children showing the disease phenotype must have received the mutant allele from their mother, because hers is the only allele that they express. In the answers below A is the normal dominant allele, and a is the recessive disease allele.
a. The genotype of I-1 is Aa. The a allele is inactive and was inherited from her father.
b. The genotype of II-1 is AA. He must have inherited the normal allele from his mother, as this is the only allele he expresses. We assume that his father is AA because disease alleles are rare, and no data in the pedigree forces us to conclude that II-1 or his father is other than AA.
c. The genotype of III-2 is AA. She must have inherited A from her mother (who is Aa) because this is the only allele III-2 expresses and she is unaffected. Because disease alleles are rare, the most likely assumption is that III-2's father is AA.
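The reasoning in this answer rests on one rule: for a paternally imprinted gene, only the maternally inherited copy is expressed, whatever the paternal copy is. A minimal sketch of that rule follows. The function names `expressed_allele` and `affected` are hypothetical, and the model ignores penetrance and new mutations.

```python
def expressed_allele(maternal: str, paternal: str) -> str:
    """Paternally imprinted gene: the paternal copy is silent, so only the maternal copy counts."""
    return maternal


def affected(maternal: str, paternal: str, disease_allele: str = "a") -> bool:
    """An individual shows the disease only if the expressed (maternal) copy is the disease allele."""
    return expressed_allele(maternal, paternal) == disease_allele


if __name__ == "__main__":
    # I-1 in the solved problem: A from her mother, a (silenced) from her father.
    print("I-1 affected:", affected(maternal="A", paternal="a"))
    # A child who receives a from I-1 and A from the father I-2.
    print("child with maternal a affected:", affected(maternal="a", paternal="A"))
    # A child who receives A from I-1.
    print("child with maternal A affected:", affected(maternal="A", paternal="A"))
```

The first case shows why I-1 can carry the disease allele without expressing it, and the second shows why her children can be affected even though they are heterozygous for a recessive allele.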
PROBLEMS
Vocabulary
1. For each of the terms in the left column, choose the best matching phrase in the right column.
a. basal factors | 1. organizes enhancer/promoter interactions
b. repressors | 2. pattern of expression depends on which parent transmitted the allele
c. CpG | 3. activates gene transcription temporal- and tissue-specifically
d. imprinting | 4. site of DNA methylation
e. miRNA | 5. identifies DNA binding sites of transcription factors
f. coactivators | 6. bind to enhancers
g. epigenetic effect | 7. bind to promoters
h. insulator | 8. bind to activators
i. enhancer | 9. prevents or reduces gene expression posttranscriptionally
j. ChIP-Seq | 10. change in gene expression caused by DNA methylation
Section 16.1
2. Does each of the following types of gene regulation occur in eukaryotes only? in prokaryotes only? in both prokaryotes and eukaryotes?
a. differential splicing
b. positive regulation
c. chromatin compaction
d. attenuation of transcription through translation of the RNA leader
e. negative regulation
f. translational regulation by small RNAs
3. List five events other than transcription initiation that can affect the type or amount of active protein produced in a cell.
Section 16.2
4. Which eukaryotic RNA polymerase (RNA pol I, pol II, or pol III) transcribes which genes?
a. tRNAs
b. mRNAs
c. rRNAs
d. miRNAs
5. You have synthesized an "enhancerless" GFP reporter gene in which the jellyfish GFP gene is placed downstream of a basal promoter that functions in mice. You will now fuse this enhancerless reporter to the three types of sequences listed below (x-z).
a. Which of the three types of sequences would you use for which of the three listed purposes (i-iii)? In each case, explain how the particular fusion would address the particular use.
Types of sequences fused to the reporter:
x. random mouse genome sequences
y. known mouse kidney-specific enhancer
z. fragments of genomic DNA surrounding the transcribed part of a mouse gene
Uses:
i. to identify a gene's enhancer(s)
ii. to express GFP tissue-specifically
iii. to identify genes expressed in neurons
b. Which of the sequences (x-z) would you fuse to a particular mouse gene of interest in order to express the protein product of the gene "ectopically", that is, in a tissue in which the gene is not usually expressed? Why might you want to do this experiment in the first place?
6. You isolated a gene expressed in differentiated neurons in mice. You then fused various fragments of the gene (shown as dark lines in the following figure) to a GFP reporter lacking either enhancers or a promoter. The resulting clones were introduced into neurons in tissue culture, and the level of GFP expression was monitored by looking for green fluorescence. From the results that follow, which region contains the promoter and which contains a neuronal enhancer?
[Figure: regulatory region of nerve gene; fragments fused to GFP, with the GFP expression level recorded for each fragment.]
7. Yeast genes have cis-acting elements upstream of their promoters, similar to enhancers, called upstream activating sequences or UASs. Several target genes involved in galactose utilization are regulated by one type of UAS called UASG, which has four binding sites for an activator called GAL4. Two target genes regulated by UASG are GAL7 and GAL10. The GAL80 protein is an indirect repressor of GAL7 and GAL10 transcription; at UASG, GAL80 binds to GAL4 protein and blocks GAL4's activation domain. In the presence of galactose, GAL80 no longer binds GAL4. In which gene(s) (GAL4 and/or GAL80) should you be able to isolate mutations that allow the constitutive expression of the target genes GAL7 and GAL10 even in the absence of galactose? In each case, what characteristics of the protein would the mutation disrupt?
8. A single UASG regulates the expression of three adjacent genes: GAL7 and GAL10 as described in Problem 7, and also GAL1.
a. Would you expect these genes to be transcribed into individual transcripts, or to be cotranscribed as one mRNA? Explain.
b. How could you determine experimentally whether each gene is transcribed separately or instead that the three are cotranscribed into a single mRNA?
c. GAL1 and GAL10 are not only adjacent to each other, but also are transcribed divergently with a single UASG between them. Describe experiments using GFP and RFP transgenes that would allow you to determine if this particular UASG is required for the transcription of either or both genes, and if so, which of the four GAL4 binding sites in this UASG element is (are) important.
9. MyoD is a transcriptional activator that turns on the expression of several muscle-specific genes in human cells. The Id gene product inhibits MyoD action.
a. One possibility is that the Id protein is a repressor. Explain how Id would function if it were a repressor.
b. Another possibility is that Id inhibits muscle-specific gene transcription indirectly, by preventing MyoD function. Explain how Id could function as an indirect repressor.
c. Suppose you know the amino acid sequence of the Id protein. How might this information support hypothesis (a) above? How might the amino acid sequence support hypothesis (b)?
10. a. Assume that two transcription factors are required for expression of the blue pigmentation genes in pansies. (Without the pigment, the flowers are white.) What phenotypic ratios would you expect from crossing strains heterozygous for wild-type and recessive amorphic alleles for each of the genes encoding these transcription factors?
b. Now assume that either transcription factor is sufficient to get blue color. What phenotypic ratios would you expect from crossing the same two heterozygous strains?
11. a. You want to create a genetic construct that will express GFP in Drosophila. In addition to the GFP coding sequence, what DNA element(s) must you include in order to express this protein in flies if the construct were integrated into the Drosophila genome? Where should such DNA element(s) be located? How would you ensure that GFP is expressed only in certain tissues of the fly, such as the wing?
b. In making your construct, you insert the GFP coding region plus all of the DNA elements required by the answer to part (a) excepting the enhancer in between inverted repeats found at the end of a particular transposable element. Because all the DNA sequences located between these inverted repeats can move from place to place in the Drosophila genome, you can generate many different fly strains, each with the construct integrated at a different genomic location. You now examine animals from each strain for GFP fluorescence. Animals from different strains show different patterns: some glow green in the eyes, others in legs, some show no green fluorescence, etc. Explain these results and describe a potential use of your construct.
12. In Problem 11, you identified a genomic region that is likely to behave as an enhancer. What experiments could you perform to verify that these DNA sequences indeed share all the characteristics of an enhancer and to determine the precise boundaries of the enhancer in the genomic DNA?
13. A graduate student came up with the following idea for identifying insulators in the Drosophila genome: Perform an experiment like the one described in Problem 11, but instead of using an "enhancerless" construct, use one that contains an enhancer, and screen for lines that do not express GFP.
a. What is wrong with this experimental design?
b. Can you think of a different experiment that you could perform to identify insulators?
14. Myc is a transcription factor that regulates cell proliferation; mutations in the myc gene contribute to many cases of the cancer Burkitt lymphoma. Initial experiments on Myc were puzzling. The Myc protein contains both a leucine zipper dimerization domain and a specialized type of helix-loop-helix DNA-binding domain called a bHLH motif, but purified Myc can neither homodimerize nor bind to DNA efficiently.
Discovery of the Max and Mad proteins helped resolve this dilemma. Like Myc, Max and Mad each contain a bHLH motif and a leucine zipper, but neither Max nor Mad homodimerize readily nor bind DNA with high affinity. However, Myc-Max and Mad-Max heterodimers do readily form and bind DNA; in fact, they bind the same sites on the enhancers of the same target genes. Myc contains an activation domain, while Mad contains a repression domain, and Max contains neither. The max gene is expressed in all cells at all times. In contrast, mad is expressed in resting cells (in the G0 phase of the cell cycle), while myc is not transcribed in resting cells but starts to be expressed when cells are about to divide (at the transition from G0 to S phase). Mad and Myc proteins are unstable relative to Max protein; when expression of mad or myc ceases, Mad or Myc proteins soon disappear.
a. Do you think target genes with enhancers containing binding sites for Myc-Max encode proteins that would arrest the cell cycle or that would drive the cell cycle forward? What about genes with enhancers containing binding sites for Mad-Max? Explain your answers.
b. Draw diagrams similar to those seen in Fig. 16.13 on p. 554 to show the control region for a target gene, the proteins binding to the enhancer, and whether or not transcription is taking place in (i) resting cells, and (ii) cells that are about to divide.
c. Provide a concise summary of how these three proteins can regulate cell proliferation.
d. Would cancer-causing mutations in myc be loss-of-function or gain-of-function mutations?
15. Genes in both prokaryotes and eukaryotes are regulated by activators and repressors.
a. Compare and contrast the mechanism of function of a prokaryotic repressor (for example, Lac repressor) with a typical eukaryotic repressor protein (a direct repressor).
b. Compare and contrast the mechanism of function of a prokaryotic activator (for example, CAP) with a typical eukaryotic activator protein.
16. The modular nature of eukaryotic activator proteins gave scientists an idea for a way to find proteins that interact with any particular protein of interest. The idea is to use the protein-protein interaction to bring together a DNA-binding region with an activation region, creating an artificial activator that consists of two polypeptides held together noncovalently by the interaction. The method is called the yeast two-hybrid system, and it has three components. First, the yeast contains a construct in which UASG (an enhancer-like sequence that binds the activator Gal4 as described in Problem 7) drives the expression of an E. coli lacZ reporter (encoding the enzyme β-galactosidase) from a yeast promoter. Second, the yeast also expresses a fusion protein in which the DNA-binding domain of Gal4 is fused to the protein of interest; this fusion protein is called the "bait." The third component is a cDNA library made in plasmids, where each cDNA is fused in frame to the activation domain of Gal4, and these can be expressed in yeast cells as "prey" fusion proteins.
How could you use a yeast strain containing the first two components, along with the plasmid cDNA expression library described, to identify prey proteins that bind to the bait protein? How is this procedure relevant to the goal of finding proteins that might interact with each other in the cell?
Section 16.3
17. Prader-Willi syndrome is caused by a mutation in an autosomal maternally imprinted gene. Label the following statements as true or false, assuming that the trait is 100% penetrant.
a. Half of the sons of affected males will show the syndrome.
b. Half of the daughters of affected males will show the syndrome.
c. Half of the sons of affected females will show the syndrome.
d. Half of the daughters of affected females will show the syndrome.
572
16
Chapter
Gene Regulation
in Eukaryotes
18. The human IGF-2R gene is autosomal and mater-
nally imprinted. Copies of the gene receiyed from the mother are not expressed, but copies received from the father are expressed. You have found two alleles of this gene that encode two different forms of the IGF-2R protein distinguishable by gel electrophoresis. One allele encodes a 60K (Kilodalton) blood protein; the other allele encodes a 50K blood protein. In an analysis ofblood proteins from a couple named Bill and Joan, you find only the 60K protein in foan's blood and only the 50K protein in Bill's blood. You then look at their children: fill is producing only the 50K protein, while Bill |r. is producing only the 60K protein. a. With these data alone, what can you say about the IcF-2Rgenotype of Bill Sr, and foan? b. Bill /r. and a woman named Sara have two children, Pat and Tim. Pat produces only the 60K protein and
shows the percentage of reads for the given mRNA that correspond to the AKR allele of that gene.)
AKR
9
X PWD
PWDg X AKRd Gene A Gene B
Gene C
10O% 59o/o Oo/o 50%
1o0o/o
Percentage of AKR allele
a. Which of the genes (A, B, or C) is maternally imprinted? Which is paternally imprinted? Which is not imprinted?
b. Why was
it
important to perform reciprocal of the genes
crosses to determine whether any were imprinted?
Tim produces only the 50K protein. With the accu-
c. Using the same type of diagram that indicates the
mulated data, what can you now say about the genotypes of Joan and Bill Sr.?
percentage of AKR alleles, diagram the expected results for these same three genes if a female F1 mouse from the cross on the left (that is, a daughter of a cross between an AKR female and a PWD male) was then crossed to a PWD male. Describe the two possible outcomes for each gene.
19. Follow the expression of a paternally imprinted gene through three generations. Indicate whether the copy of the gene from the male in generation I is expressed in the germ cells and somatic cells of the individuals listed.
23 lil 2
a. generation b. generation c. generation d. generation e. generation
I male (I-2) germ cells
II daughter (II-2): somatic cells II daughter (II-2): germ cells II son (II-3): somatic cells II son (II-3): germ cells f. generation III grandson (III-1): somatic cells g. generation IiI grandson (III-1): germ cells 20. Reciprocal crosses were performed using two inbred strains of mice, AKR and PWD, that have different alleles of many polymorphic loci. In each of the two crosses, placental tissue was isolated whose origin was strictly from the fetus (this can be separated by dissection from placental tissue originating from the mother). RNA was prepared from the fetal placental tissue and then subjected to "deep sequencing" (that is, RNA-Seq) Because of the polymorphisms, investigators could compare the number of "reads" of mRNAs for specific genes that were transcribed from maternal or paternal alleles, as shown in the following figure. (The x-axis
21. Interestingly, imprinting can be tissue-specific. For example, a gene that is maternally imprinted in fetal placental tissue is not imprinted at all in the fetal heart. Guided by the diagram in Fig. 16.22a on p. 560, suggest a mechanism that could explain the tissue specificity of imprinting. (Hint: Remember that a gene may have multiple enhancers that allow expression in different tissues.)
22. Antibodies are currently available that will bind specifically to DNA fragments containing 5-methylcytosine but not to DNA lacking this modified nucleotide. How could you use these antibodies in conjunction with the ChIP-Seq technique outlined in Fig. 16.17 on p. 556 to look for imprinted genes in the human genome?
Section 16.4
23. a. How can a single eukaryotic gene give rise to several different types of mRNA molecules?
b. Excluding the possible rare polycistronic message, how can a single mRNA molecule in a eukaryotic cell produce proteins with different activities?
24. The hunchback gene, a gene necessary for proper patterning of the Drosophila embryo, is translationally regulated. The position of the coding region within the transcript is known. How could you determine if the sequences within the 5' UTR or 3' UTR, or both, are necessary for proper regulation of translation?
25. You know that the mRNA and protein produced by a particular gene are present in brain, liver, and fat cells, but you detect an enzymatic activity associated with this protein only in fat cells. Provide an explanation for this phenomenon.
26. You are studying a transgenic mouse strain that expresses a GFP reporter gene under the control of cis-acting regulatory elements that normally control a gene needed for the early development of mice. Previous evidence from transcriptome sequencing (RNA-seq) indicates that mRNA for the gene of interest can be identified between days 8.5 and 10.5 of gestation. In your strain, GFP fluorescence can be seen from about day 8.75 until at least day 12.
a. Explain the discrepancy between mRNA and protein expression.
b. Would you expect GFP protein expression to indicate more accurately the normal onset of activity for this gene or the normal cessation of this gene's activity? Explain.
27. By searching a human genomic database, you have found a gene that encodes a protein with weak homology to Argonaute, a factor present in complexes that bind to certain miRNAs and mediate the ability of these miRNAs to regulate the stability or translation of target mRNAs (review Figs. 16.25 and 16.26 on p. 563).
a. How would you determine which specific miRNAs might be associated with the new protein you discovered? (Think about how you might use a variation of the ChIP-Seq technique described in Fig. 16.17 on p. 556 to explore this question.)
b. If a mouse could be obtained that was homozygous for a null mutation of the mouse gene almost identical to the human gene you found, how could you use this mutant mouse to ask what mRNAs might be targeted by the miRNA-RISC complexes containing your Argonaute-like protein?
28. Scientists have exploited the siRNA pathway to perform a technique called RNA interference, a means to knock out the expression of a specific gene without having to make mutations in it. The idea is to introduce dsRNA corresponding to the target gene into an organism; the dsRNA is then processed into an siRNA that leads to the degradation of the target gene's mRNA. One clever method for delivery of the dsRNA to some organisms (the nematode C. elegans, for example) is to feed them bacteria transformed with a recombinant plasmid that expresses dsRNA.
a. Draw a gene construct that, when expressed from a plasmid in bacteria, could be used to knock out the expression of gene X of C. elegans.
b. How can you test if gene X expression is obliterated in worms that have eaten the bacteria transformed with a plasmid containing your construct?
c. Do you think that only gene X expression will be affected in these worms? Explain.
Section 16.5
29. Researchers know that Fru-M controls male sexual behavior in Drosophila because inappropriate Fru-M expression in females causes them to behave like males: Such females display male behaviors that are oriented toward other females.
a. Describe a fru mutation that could cause expression of Fru-M in females.
b. Describe a gene construct that scientists could generate and insert into Drosophila females that would have the same effect as the mutant you described in (a).
30. The Drosophila gene Sex lethal (Sxl) is deserving of its name. Certain alleles have no effect on XY animals, but cause XX animals to die early in development. Other alleles have no effect on XX animals, but cause XY animals to die early in development. Thus, some Sxl alleles are lethal to females, while others are lethal to males.
a. Would you expect a null mutation in Sxl to cause lethality in males or in females? What about a constitutively active Sxl mutation?
b. Why do Sxl alleles of either type cause lethality in a specific sex?
The gene transformer (tra) gets its name from "sexual transformation," as some tra alleles can change XX animals into morphological males, while other tra alleles can change XY animals into morphological females.
c. Which of these sex transformations would be caused by null alleles of tra and which would be caused by constitutively active alleles of tra?
d. In contrast with Sxl, null tra mutations do not cause lethality either in XX or in XY animals. However, the Sxl protein regulates the production of the Tra protein. Why then do all tra mutant animals survive?
e. Predict the consequences of null mutations in tra-2 on XX and XY animals. (Recall that tra-2 encodes a protein, expressed in both sexes, that is required for Tra function.)
f. XY males carrying loss-of-function mutations in the fruitless (fru) gene display aberrant courtship behavior. Would you predict that either XX or XY animals with wild-type alleles of fru but loss-of-function mutations of tra would also court abnormally?
31. The Sxl protein of Drosophila is an RNA-binding factor that affects splicing of primary transcripts for several genes, including its own gene and the tra gene, as shown in Figs. 16.28 (p. 565) and 16.29 (p. 566). Sxl can also bind to specific mRNAs and influence their translation through a different mechanism. The most important of these target mRNAs is the product of the gene male-specific lethal-2. The MSL-2 protein is a transcription factor that binds to the X chromosome in XY males to double the level of X-linked gene transcription, thus equalizing X-linked gene expression in XY males and XX females.
a. In contrast to most mRNAs, the open reading frame of msl-2 begins not with the first, but instead with the fourth AUG reading from the cap at the 5' end. (The first three AUGs are not in frame with the fourth one.) Propose a model that explains how the binding of Sxl protein to the msl-2 mRNA would prevent msl-2 expression.
b. In which sex, XY males or XX females, would the Sxl protein bind to the msl-2 mRNA?
c. Do your answers to parts (a) and (b) of this question explain why some Sxl alleles are lethal to females and other Sxl alleles are lethal to males, as discussed in Problem 30? Explain.
d. Predict the effect of loss-of-function mutations in msl-2 on male and female fertility and viability.
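To make the phrase "not in frame with the fourth one" in Problem 31a concrete, the short Python sketch below lists every AUG in an mRNA and reports whether it shares a reading frame with a chosen start codon. The sequence `toy_mrna` is invented for illustration and is not the msl-2 sequence; the function names are hypothetical.

```python
import re

def aug_positions(mrna):
    """Return the 0-based index of every AUG, including overlapping ones."""
    return [match.start() for match in re.finditer(r"(?=AUG)", mrna)]

def frames_relative_to(mrna, orf_start):
    """For each AUG at or upstream of orf_start, report whether it is in frame.

    Two codons are in the same frame when their positions differ by a
    multiple of 3.
    """
    return [(pos, (orf_start - pos) % 3 == 0)
            for pos in aug_positions(mrna)
            if pos <= orf_start]

# Invented toy transcript with four AUGs; the fourth is treated as the ORF start.
toy_mrna = "GAAUGCAUGCCAUGGCAUGGCUUAA"
fourth_aug = aug_positions(toy_mrna)[3]
frame_report = frames_relative_to(toy_mrna, fourth_aug)
```

In this toy transcript the first three AUGs each sit at a distance from the fourth that is not a multiple of 3, so only the fourth is reported as in frame.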
Chapter 17
Manipulating the Genomes of Eukaryotes

GloFish, transgenic zebrafish (Danio rerio) that express different-colored variants of GFP and RFP, were the first genetically modified pets.
chapter outline
17.1 Creating Transgenic Organisms
17.2 Uses of Transgenic Organisms
17.3 Targeted Mutagenesis
17.4 Human Gene Therapy

Until recently, children born with poor vision due to a genetic disease called Leber congenital amaurosis (LCA) were destined to become completely blind by early adulthood. Now, for many of these children, the success of gene therapy trials provides hope not only for a halt to the retinal degeneration characteristic of the disease, but even for restoration of normal sight. One form of LCA is caused by loss of function of a gene called RPE65. This gene encodes a protein found in the retinal pigment epithelium (a cell layer just beneath the retina) that is crucial for the function of photoreceptors. The RPE65 enzyme functions in the visual cycle, the process by which the retina detects light. LCA patients lose sensitivity to light, which eventually results in a reduction in the amount of brain cortex devoted to visual processing (Fig. 17.1).
Gene therapy is the manipulation of genes (adding DNA to the genome or altering the DNA of a gene) in order to cure a disease. The experimental gene therapy strategy for this form of LCA was simple: Scientists delivered normal copies of the RPE65 gene to the retinal pigment epithelium cells of patients simply by injecting DNA packaged in viral particles through the eye into these cells. Since the first results of RPE65 gene therapy clinical trials were reported in 2008, more than 30 patients have undergone the procedure, and almost all of them have had their vision restored at least in part; several are no longer considered legally blind.

In this chapter, you will learn about two general strategies for altering genomes: creation of transgenic organisms and targeted mutagenesis. Development of these exciting technologies has relied on knowledge of the natural processes by which DNA can move within a genome, can be transferred between individuals and between species, and can be protected from alteration or degradation. The overarching theme of this chapter is that by using recombinant DNA technology, scientists can harness these natural processes to develop creative and powerful methods to alter genomes, not only in order to treat disease, but also to improve the production of medicines and food crops and to enhance modern biological research.
Figure 17.1 Activation of the brain cortex in response to light. The cortex of the brain of a normal dog (top) and a dog with Leber congenital amaurosis (LCA) due to mutations in the RPE65 gene (bottom) are mapped in shades of gray. The right and left images for each dog are views from different angles. Yellow and orange signals indicate the amplitude of cortical activation in response to a controlled amount of light shined on the eyes of these animals. Much less of the cortical region of the LCA dog participates in visual processing.
[Figure labels: Normal; RPE65; Medial; Signal Change]
17.1 Creating Transgenic Organisms

learning objectives
1. Summarize how scientists create transgenic mice with pronuclear injection.
2. Describe how P elements are used to produce transgenic Drosophila.
3. Explain how researchers use the Ti plasmid from Agrobacterium to insert genes into plants.

A transgenic organism is one whose genome contains a gene from another individual of the same species or a different one; such a gene is called a transgene. In this section we will discuss some of the ways researchers can make transgenic organisms, and then in the next section we will examine just a few of the possible uses of transgenic technology, which are limited only by the scientist's imagination.

Scientists Exploit Natural Gene Transfer Mechanisms to Create Transgenic Organisms
Transgenes can be made in vitro using the types of recombinant DNA techniques described in Chapter 9. But to make transgenic eukaryotes, researchers need to introduce the transgene DNA into one or more cells. This goal can be accomplished in various ways depending on the organism. Some unicellular eukaryotes like the yeast S. cerevisiae can be subjected to treatments that disrupt the cell wall, and DNA can then enter the cells in a fashion very similar to the artificial transformation of E. coli (see Fig. 9.5b on p. 307).

For many other organisms, the most efficient means of transferring DNA into cells involves injection of a DNA solution directly into a cell or embryo (Fig. 17.2). In yet other cases, the transgenic DNA can be incorporated into a virus particle or even a bacterium that can then infect a cell (described later).

Once introduced into a cell, the transgene has to be replicated and maintained as the cell divides. In most cases, these goals are accomplished by integrating the transgene into a random location in the genome of the host cell; but in some species, the transgenes can be maintained outside of the host chromosomes, either as an extrachromosomal array (in C. elegans) or (in yeast) as a plasmid. Finally, in order for the transgene to be propagated between generations of a multicellular organism, it is critical that cells containing the transgene have the ability to develop eventually into gametes. In animals, this requirement means that the transgene must be incorporated into a germ-line cell; while in plants, almost any cell can carry the transgene because entire plants can be regenerated from isolated cells. We describe here methods to create transgenic mice, flies, and plants that illustrate many of these points. These techniques are in large part based on our knowledge of natural gene transfer mechanisms.

Figure 17.2 Injecting transgenes into cells. (a) An investigator injecting DNA into one of the two pronuclei present in a mouse embryo soon after fertilization. (b) DNA is being injected into the posterior end of an early Drosophila embryo that is a single syncytial cell with many nuclei.
[Panel titles: (a) Injecting a transgene into a recently fertilized mouse egg; (b) Injecting a transgene into a Drosophila embryo]
DNA Injection into Pronuclei Generates Transgenic Mice
A fertilized mouse egg (zygote) contains two haploid pronuclei, one maternal and one paternal. The pronuclei come close together, their nuclear membranes break down, and the maternal- and paternal-derived chromosomes intermingle so that their sister chromatids separate on the same spindle for the first mitotic division. At the conclusion of this mitosis, each cell of the two-cell embryo has a single diploid nucleus.

To create a transgenic mouse carrying a foreign DNA sequence integrated into one of its chromosomes by pronuclear injection, as shown in Fig. 17.3, a researcher mates a male and female mouse and harvests the just-fertilized eggs from the female's reproductive tract. The investigator then injects linear copies of the foreign DNA into either one of the pronuclei of the fertilized egg (see also Fig. 17.2a). The injected, fertilized egg is then implanted into the oviduct of a pseudo-pregnant female, where it can continue its development as an embryo.

Roughly 25% to 50% of the time, the injected DNA will integrate into a random chromosomal location. Integration can occur prior to the first mitosis, in which case the transgene will appear in every cell of the adult body. Alternatively, integration may occur somewhat later, after the embryo has completed one or two cell divisions; in such cases, the mouse will be a mosaic of cells, some with the transgene and some without it. As long as the transgene is present in germ-line cells, the transgene will be transmitted to the next generation. A mouse formed from a gamete containing a transgene can then be mated with other mice to establish stable lines of transgenic animals. The exact mechanism for random transgene integration is unknown, but clearly depends on DNA repair enzymes that seek out and repair broken ends of DNA, probably similar to those involved in nonhomologous end-joining (NHEJ; see Fig. 7.17 on p. 222). Usually, many tandem copies of the transgene become integrated together into the same random site in the genome (Fig. 17.3).

Figure 17.3 Making transgenic mice by pronuclear injection.
[Figure steps: Several zygotes are recovered from a sacrificed female. Zygotes are transferred to a depression slide containing culture medium. As the zygote is held in place, DNA is injected into a pronucleus. Several tandem copies of injected DNA integrate into a random location on a chromosome. Several injected embryos are placed into the oviduct of a receptive female. Mice that were injected as embryos are born. Their tail cells are tested for the presence of injected DNA.]
[Figure labels: culture medium; oil; holding pipette; pronuclei; DNA to be injected; injection pipette; concatamer of injected DNA copies]
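The link between the timing of integration and mosaicism can be put into a small worked example. The Python sketch below is an idealization, not a description of real embryos: it assumes that integration happens once, in a single cell, that all cell divisions are synchronous, and that every cell contributes equally to the adult. The function name `transgenic_cell_fraction` is hypothetical.

```python
from fractions import Fraction

def transgenic_cell_fraction(divisions_before_integration):
    """Expected fraction of cells carrying the transgene.

    divisions_before_integration: number of cell divisions the embryo has
    completed before the single integration event (0 means integration
    occurred before the first mitosis).
    """
    if divisions_before_integration < 0:
        raise ValueError("number of divisions cannot be negative")
    # After n synchronous divisions there are 2**n cells, and under the
    # stated assumptions only one of them receives the transgene.
    return Fraction(1, 2 ** divisions_before_integration)
```

Under these assumptions, integration before the first mitosis gives a fully transgenic animal (fraction 1), while integration after one or two divisions gives a mosaic in which about 1/2 or 1/4 of the cells carry the transgene. Whether the transgene reaches the next generation still depends on whether germ-line cells are among the transgenic cells.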
Recombinant P Elements Can Transform Drosophila
P elements are a class of DNA transposon in Drosophila (recall Fig. 12.23b, p. 430). Autonomous P elements contain a gene for transposase protein, and the transposon ends are inverted repeats. Transposase binds the inverted repeats, cuts the transposon out of the genome, and "pastes" it into a new location. Drosophila geneticists use P elements as vectors (vehicles) for transfer of genes into germ-line cells, a process called P element transformation. P element vectors are plasmids that contain the P element ends but not the transposase gene (Fig. 17.4). Using recombinant DNA techniques, scientists replace the transposase gene with the transgene and a marker gene; the marker gene enables detection of transgenic flies. A widely used marker gene is the white gene (w+), which confers normal red eye color to flies with mutations in their endogenous white genes (w-).

Figure 17.4 shows a common procedure for generating transgenic flies with P element vectors. Investigators inject two plasmids into w- embryos at an early stage of development when at most several hundred nuclei are present (see also Fig. 17.2b). One plasmid was made by cloning the transgene into the vector; this plasmid now contains the transgene and the w+ marker gene, both located within the P element inverted repeats. The other plasmid, called the "helper" plasmid, contains the transposase gene but no P element inverted repeats (Fig. 17.4).

When these two plasmids are injected into embryos, transposase protein produced by the helper plasmid can mobilize the recombinant P element, cutting it out of the other plasmid and pasting the element into a random site in a Drosophila host chromosome (Fig. 17.4). After the injected embryos mature into adults, researchers cross each adult to w- flies. If a recombinant P element integrates into a chromosome of a germ-line precursor cell, some gametes produced by the injected animal will carry a chromosome with the recombinant P element. Investigators can recognize transgenic progeny (flies containing the recombinant P element) because they will have red (w+) eyes. These red-eyed flies can be used in cross schemes to establish stable lines of transgenic flies.

A recombinant P element containing a transgene will not subsequently mobilize and move around the genome in flies of this stable line because laboratory strains of Drosophila do not contain P elements, so no transposase will be present.

Figure 17.4 Constructing transgenic Drosophila by P element transformation. A transgene (Gene) is first ligated into a vector containing the white+ gene (w+) located between P element inverted repeats. Researchers inject this plasmid, along with a helper plasmid containing the P element transposase gene, into w- host embryos where transposition occurs in some germ-line cells. When adults with these germ cells are mated with w- flies, some progeny will have red eyes and an integrated transgene.
[Figure labels: transformation plasmid (P element end, Gene, w+, P element end); helper plasmid (transposase gene); posterior end; grows into; gametes with transformed DNA in genome; Drosophila genomic DNA]
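The logic of the P element screen can be sketched as a toy simulation in Python. Everything here is an illustrative assumption rather than a laboratory protocol: the function names are hypothetical, `transformed_gamete_fraction` is an invented parameter for the share of an injected adult's gametes that carry the recombinant P element, and one copy of the w+ marker is assumed to be enough for red eyes.

```python
import random

def can_mobilize(has_inverted_repeats, transposase_present):
    """A P element moves only if it has its inverted-repeat ends AND
    transposase protein is available in the cell."""
    return has_inverted_repeats and transposase_present

def score_progeny(transformed_gamete_fraction, n_progeny, seed=0):
    """Simulate eye colors among progeny of an injected adult crossed to w- flies."""
    rng = random.Random(seed)
    eyes = []
    for _ in range(n_progeny):
        carries_element = rng.random() < transformed_gamete_fraction
        # The w- mate supplies no functional white gene, so eye color depends
        # only on whether this gamete carried the w+ marker.
        eyes.append("red" if carries_element else "white")
    return eyes
```

In this sketch `can_mobilize(True, False)` is false, which mirrors the stable-line argument: the integrated element keeps its inverted repeats, but no transposase is present once the helper plasmid is gone. Calling `score_progeny(0.0, 20)` gives only white-eyed progeny, the expected result when no germ-line precursor cell took up the element.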
Agrobacterium Ti Plasmid Vectors Accomplish Plant Transgenesis
A vector derived from the tumor-inducing (Ti) plasmid of the bacterium Agrobacterium tumefaciens is the basis for an efficient method for introducing transgenes into plants: Agrobacterium-mediated T-DNA transfer (Fig. 17.5). A. tumefaciens bacteria infect plant cells; during infection the bacteria can transfer DNA into the host cell in a process reminiscent of conjugation in bacteria because it involves formation of a pilus connecting the A. tumefaciens donor and the plant cell recipient. The transferred DNA is called the T-DNA, which is a portion of the Ti plasmid DNA present in A. tumefaciens. The T-DNA integrates into the plant cell genome. Because the T-DNA contains a gene that causes cell overgrowth, the descendants of the T-DNA-containing cell form a tumor called a crown gall.

T-DNA transfer depends on 25 base pair sequences at each end of the T-DNA called the left and right borders (LB and RB), and on several proteins encoded by the vir genes normally present on the Ti plasmid. You can see that the integration of T-DNA into the host chromosome is in many ways analogous to the mobilization of a DNA transposon, with the vir gene proteins being trans-acting enzymes that work on the LB and RB sequences at the border. The method for using T-DNA to transfer genes into plant genomes thus has some underlying similarities with the P element procedure just outlined for Drosophila.

Researchers transform A. tumefaciens with two different plasmids (Fig. 17.5). One is a helper plasmid that contains the vir genes but no border sequences. The other plasmid is the T-DNA vector engineered to contain the gene to be transferred and a marker gene (often a gene that confers resistance to an herbicide), both located between LB and RB sequences. To start the infection, investigators spray the transformed A. tumefaciens onto whole plants or plant cells. They next grow individual infected plant cells in culture to generate embryonic plants, and select embryos transformed with the recombinant T-DNA by adding herbicide to the growth medium (Fig. 17.5).

These examples of methods used for constructing transgenic organisms show how scientists can take advantage of natural processes to alter genomes. Researchers in essence have "hijacked" the process by which A. tumefaciens makes crown galls to introduce foreign DNA into plants. Naturally occurring enzymatic processes, whether those used for DNA repair or for mobilizing transposons or T-DNA, are thus the basis for integrating foreign DNA into host chromosomes.

Figure 17.5 Transgenic plants produced using a T-DNA plasmid vector. Researchers infect plants with Agrobacterium tumefaciens bacteria containing two plasmid constructs. A T-DNA plasmid contains a transgene (Gene) and marker gene that confers resistance to a herbicide, both within the T-DNA ends LB and RB. A helper plasmid contains the vir genes, required for T-DNA transfer to a plant cell. Upon infection, the recombinant T-DNA integrates into the host plant genome. Investigators select for single cells with a transgene insertion by growing cells in the presence of herbicide. They then grow the selected cell into a whole transgenic plant.
[Figure labels: T-DNA plasmid (LB, Gene, herbicide-resistance marker, RB); helper plasmid (vir genes); transform Agrobacterium with plasmids and spray transformed bacteria on plants; Agrobacterium tumefaciens; plant cell; Vir proteins; recombinant T-DNA transferred to plant cell; genomic DNA; recombinant T-DNA integrates into plant genome; grow embryos from single cells, add herbicide to select for transformants; transformed plant]

essential concepts
• Transgenic mice are produced by injecting foreign DNA into a pronucleus of a fertilized egg.
• Transformation of Drosophila relies on the construction of transgenes inserted into P element transposon vectors.
• Researchers make transgenic plants by infecting plant cells with Agrobacterium containing a Ti (tumor inducing) plasmid engineered to contain the transgene.
• These methods of creating transgenic organisms result in the integration of transgenes at random locations in the host genome.

17.2 Uses of Transgenic Organisms

learning objectives
1. Describe how transgenes can clarify which gene causes a mutant phenotype.
2. Summarize the use of transgene reporter constructs in gene expression studies.
3. Discuss examples of how transgenic organisms serve to produce proteins needed for human health.
4. List examples of GM organisms and discuss the pros and cons of their production.
5. Explain the use of transgenic animals to model gain-of-function genetic diseases in humans.

Our ability to generate transgenic organisms has had a major impact on biological research and is also increasingly important for several aspects of daily life. Studies with transgenic model organisms enable researchers to better understand the functions of particular genes and their regulation, and to model certain human diseases in animals. In addition, scientists have engineered transgenic plants and animals to produce drugs and (more controversially) better agricultural products.
Transgenes Assign Genes to Phenotypes
In many genetic investigations, the available information may not allow scientists to pinpoint the gene responsible for a particular phenotype. The construction of transgenic organisms often allows investigators to resolve ambiguities. As an example, suppose a geneticist interested in how the Drosophila eye develops isolates a mutant fly strain homozygous for a recessive mutation (m-) that results in malformed eyes (Fig. 17.6a). Molecular analysis reveals that the mutation is a small deletion that removes the 5' portions of two different genes (Fig. 17.6b). Is it the loss of gene A or the loss of gene B that accounts for the eye defects?

You could answer this question by creating recombinant constructs containing either gene A or gene B wild-type genomic DNA (Fig. 17.6c). You would then test the ability of each transgene to restore the normal eye phenotype. For example, if homozygous m-/m- flies carrying a wild-type gene A transgene have malformed eyes, but m-/m- flies carrying a wild-type gene B transgene have normal eyes, you would conclude that the loss of gene B is the cause of the mutant phenotype; in other words, m = gene B.

You saw previously in Chapter 4 an important historical example of this same logic. You will recall that an XX mouse containing an SRY transgene developed as a male (Fast Forward Box on p. 93). This experiment exemplifies how transgenic technology can be used to understand the function of a particular gene: Here, expression of the SRY gene in an unusual context (in an organism with two X chromosomes and no Y) showed that SRY controls maleness in mammals like mice and humans.

Figure 17.6 Using Drosophila transgenes to link a mutant phenotype to a gene. (a) Scanning electron micrographs show that loss of m gene activity results in malformed fly eyes. (b) The m- mutation is a deletion of two adjacent genes: gene A and gene B. (c) Flies containing the gene A transgene (left) that are also m-/m- still have malformed eyes, while m-/m- flies containing the gene B transgene (right) have wild-type eyes. Therefore, the malformed eyes are due to the loss of gene B, not to the loss of gene A.
[Figure labels: (a) Homozygotes for recessive mutation have defective eyes; P element]
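The rescue logic described for the hypothetical m mutation can be written down as a tiny Python sketch. The function name, the phenotype strings, and the dictionary of results are all illustrative assumptions; they restate the thought experiment above and are not experimental data.

```python
def genes_that_rescue(rescue_results, wild_type_phenotype="normal eyes"):
    """Return the candidate genes whose wild-type transgene restores the
    wild-type phenotype in homozygous mutant animals.

    rescue_results maps each candidate gene to the phenotype observed in
    mutant homozygotes that carry a transgene for that gene.
    """
    return [gene for gene, phenotype in rescue_results.items()
            if phenotype == wild_type_phenotype]

# Outcome of the thought experiment in the text (hypothetical observations).
observed = {
    "gene A": "malformed eyes",  # gene A transgene does not rescue
    "gene B": "normal eyes",     # gene B transgene rescues
}
causal_candidates = genes_that_rescue(observed)
```

Here `causal_candidates` contains only gene B, matching the conclusion that m = gene B. If neither transgene rescued, the list would be empty, and a reasonable next step would be to test a transgene carrying both genes, since the deletion removes parts of each.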
Transgenes Are Key Tools for Analyzing Gene Expression
In Chapter 16, we described how scientists use reporter constructs containing foreign genes whose protein products are easy to detect (such as the jellyfish gene for GFP, Green Fluorescent Protein, or E. coli's lacZ gene for the enzyme β-galactosidase) to study many aspects of the regulation of gene expression in eukaryotic species. Such reporter constructs help researchers identify enhancers that dictate the transcription of a gene in specific tissues at particular times in development (review Fig. 16.3 on p. 550). Reporter constructs are also valuable in finding genes that encode transcription factors that interact with the enhancers (review Fig. 16.12 on p. 554). Here, we remind you that the function of these reporter constructs can be monitored only when they are introduced into eukaryotic organisms as transgenes.
[Figure 17.6 panels: (a) Wild type and m-/m- eyes. (b) Deletion in genomic DNA removes parts of two genes: the deletion (m-) mutates gene A and gene B. (c) Transgenes tested for rescue: the gene A transgene does not rescue the m-/m- mutant phenotype (m ≠ gene A); the gene B transgene rescues the m-/m- mutant phenotype (m = gene B).]

Transgenic Cells and Organisms Serve as Protein Factories
In Chapter 15, you saw that some human proteins used as drugs can be produced in bacteria transformed with fusion gene constructs in which the coding sequences for the human protein were placed under the control of bacterial promoters and Shine-Dalgarno sequences that ensure high levels of gene expression (review Fig. 15.27, p. 534). Pharmaceutical companies produce human growth hormone and insulin in this way. However, not all human proteins can be produced in a functional form by expressing them in bacteria. Bacteria are unable to perform many important posttranslational operations that can be critical for protein function, including proper folding or cleavage of polypeptides, or modifications such as glycosylation and phosphorylation.

To circumvent such problems, drug companies can sometimes use transgenic mammalian or plant cells that grow suspended in liquid culture. Several pharmaceutical proteins are produced this way, such as Factor VIII protein, the blood-clotting factor that is deficient in some people with hemophilia, and erythropoietin (EPO), a hormone that stimulates red blood cell production that has been misused as a performance-enhancing drug by some infamous athletes. However, cell cultures produce only low yields of recombinant proteins, and growing the cells is expensive.

Figure 17.7 A gene construct that produces a human anticoagulant protein in the milk of transgenic goats.
[Figure labels: goat mammary gland enhancer; goat promoter; human antithrombin III cDNA; antithrombin III protein purified from goat milk]

Pharming
Transgenic farm animals and plants can provide a cost-effective and high-yield alternative to cell culture for producing human proteins. The use of transgenic animals and plants to produce protein drugs is sometimes called pharming, a combination of the words "farming" and "pharmaceutical." Pharming technology is still in its infancy; so far (in 2013), only one "pharmed" drug is available to patients, but many more are in development.

The method used most commonly for the production of human protein drugs in transgenic animals is protein expression in the mammary glands, because proteins secreted into the milk can be purified in high yield. By pronuclear injection (as in Fig. 17.3), transgenes encoding human proteins have been transferred to goats, pigs, sheep, and rabbits. In 2009, the United States Food and Drug Administration (FDA) approved the first human protein drug produced in the milk of a transgenic animal: the blood factor antithrombin III. The goats that produce this drug were transformed with a fusion gene in which the regulatory sequences of a goat gene normally expressed in the mammary gland were fused with the coding region of the human gene encoding antithrombin III, a blood plasma factor that inhibits coagulation (Fig. 17.7). People with only one functional copy of the gene for antithrombin III tend to develop blood clots, particularly after surgery or childbirth; the drug is currently approved for patients with this genetic condition.

Individual transgenic animals produced by pronuclear injection will have variable numbers of transgene copies, and the transgene array will be present at different random genomic locations. These variations result in large differences in the human protein yield among individual injected animals. One way to enhance the value of a rare, high-producing animal is by reproductive cloning: using somatic cell nuclei of transgenic adults to generate other animals with the identical genomes. Not surprisingly, the same pharmaceutical companies that are developing the technology to produce drugs in transgenic animals are funding the development of animal cloning technology. The Tools of Genetics Box on p. 582 describes the process of cloning animals by somatic cell nuclear transfer.
Vaccine production in transgenic plants Like transgenic animals, plants carrying transgenes can be used for the production of human protein drugs. Transgenic plants seem to have particular advantages for making vaccines, antigens of a disease-causing agent that stimulate an immune response to that particular foreign substance. Vaccine proteins produced by transgenic crop plants such as potatoes, rice, soybeans, corn, or tomatoes could be stored in the leaves or seeds of the plant and the plants could simply be eaten to protect individuals from the pathogen. Edible vaccines could be especially advantageous for less-developed countries: No refrigeration is required for seed transport, plants could be grown on site, and no needles, syringes, or medical professionals would be necessary.
Despite the theoretical promise of producing vaccines in transgenic plants, trials to date have had only partial success, and many problems need to be overcome before any of these vaccines can be marketed. One major difficulty is controlling the dose of the antigen: Individual plants can vary in the amount of antigen they produce, and too little antigen will result in an ineffective vaccine. In addition, vaccines that are eaten require higher antigen doses than those that are injected. Even if the scientific problems can be overcome, drug companies will encounter many regulatory hurdles before making these plant-produced vaccines available to humans. Because the regulations are less strict, considerable recent attention has been placed instead on feeding transgenic vaccine-making plants to domestic animals, so as to protect them from various diseases caused by pathogenic organisms.
582 Chapter 17 Manipulating the Genomes of Eukaryotes
Cloning by Somatic Cell Nuclear Transfer In Chapter 11 (p. 381), you were introduced to CC, the world's first cloned cat. "Cloning" in this sense refers to reproductive cloning, in which the genome of a single somatic cell from one individual now becomes the genome of every somatic cell in a different individual. Researchers create reproductive clones through a protocol known as somatic cell nuclear transfer. Scientists take the diploid nucleus of a somatic cell from one individual and insert it into an egg cell whose own nucleus has been removed (Fig. A). After several days of growth, the researchers implant the manipulated embryo into the uterus of a surrogate mother. After the embryo develops to term, a cloned animal is born. The cloned cat at the bottom of Fig. A could be thought of as having three different "mothers": the somatic nuclear donor, the oocyte donor, and the surrogate who provided the womb. It is also possible to clone male animals if the somatic cell nucleus comes from a male. Even though all of the nuclear chromosomes in all of the cells of the clone are derived only from the somatic nuclear donor, the cloned animal and this donor are not perfectly identical in all respects. (1) The mitochondrial genomes of the clone come from the oocyte donor, not the nuclear donor. (2) For female clones, the pattern of X chromosome inactivation in the clone and in the mother will not be the same because the decision of which X to inactivate is made randomly in individual cells early in the animal's development. (3) The cloning procedure alters the process of gene imprinting, so the expression of imprinted genes is not identical in the somatic nuclear donor and the clone. (4) The uterine environment in the surrogate is not exactly the same as in the womb of the donor's mother.
The work leading to the cloning of CC was funded by a biotechnology company called Genetic Savings & Clone, whose mission was to provide commercial cloning services for pet owners who might want to replicate their animals after their deaths. The company was not ultimately successful. Few people could afford the high costs of the cloning procedure, and furthermore, some ill-informed clients were disappointed to find that the clone they received was not in fact exactly the pet they knew. Good reasons nonetheless exist for the cloning of certain animals. Research on cloned animals enables scientists to better understand basic processes such as gene imprinting. Drug companies are investing in reproductive cloning technology with an eye toward being able to generate large numbers of high-producing transgenic animals. In fact, well before the cloning of CC, the first animal ever to be cloned from an adult cell was a sheep called Dolly, in 1996. Dolly was cloned by scientists in Scotland, in part with funding from a pharmaceutical company. Before Dolly died in 2003, she gave birth to five progeny who live on. Finally, several endangered species have been cloned for the purpose of their preservation.
Figure A Cloning a cat by somatic cell nuclear transfer
[Figure labels: Oocyte donor supplies unfertilized eggs. Somatic nuclear donor: cells from the animal to be cloned are maintained in culture. Egg cell: nucleus is removed, giving an enucleated egg. Somatic cell: nucleus fuses with the egg after an electric current is applied. The hybrid embryo grows for seven days. Embryo is implanted into surrogate mother. Cloned animal.]
17.2 Uses of Transgenic Organisms
GM Organisms Are Used Widely in Modern Agriculture
As of 2013, about 100 different transgenic plant species with improved traits have been created; these are often referred to as GM (genetically modified) crops. The improvements conferred by the transgenes include enhanced nutritional value; increased shelf life; increased yield or plant size; and resistance to stress, herbicides, or infestations by plant viruses or insects. We discuss here two of the most commercially important transgenic crops that are currently in wide use.
More than 90% of the soybeans grown in the United States are transgenic plants resistant to glyphosate, the active ingredient in the herbicide called Roundup®. Glyphosate interferes with an enzyme called EPSPS that plants need for the biosynthesis of several amino acids. So-called Roundup Ready® soybeans carry a transgene encoding a bacterial version of EPSPS that is resistant to glyphosate. Farmers spray fields of herbicide-resistant soybeans with Roundup to kill weeds with no harm to the soybeans, thus saving much labor and time. The glyphosate itself is then rapidly degraded by natural processes in the environment.
Another highly successful GM plant is corn that produces a natural, organic insecticide called Bt protein. Bt protein protects the plant from being eaten by corn-borer moth caterpillars. This protein is made naturally by the bacterium Bacillus thuringiensis to protect itself from being eaten by the caterpillars; Bt protein is lethal to insect larvae that ingest it, but not to other animals, including humans. Because the engineered corn manufactures its own natural insecticide, farmers can avoid using costly chemical pesticides that damage farmworkers and the environment. GM plants expressing Bt protein were grown commercially for the first time in 1996; at least one-third of all corn currently grown in the United States contains Bt transgenes. More than 10 billion acres of land around the world is used to grow Bt-expressing crops, not only corn but also canola, cotton, papaya, potato, rice, soybean, squash, sugar beet, tomato, wheat, and eggplant.
Although no GM animals have yet been approved for human consumption as of this writing in 2013, transgenic Atlantic salmon may soon be available at supermarkets. Atlantic salmon normally take three years to grow to their full size of about 9 pounds; their growth hormone gene is shut off during the coldest months when food is scarce, and so they grow only about eight months of the year. The GM salmon, which contain a growth hormone transgene that is expressed year-round, achieve their full weight in half that time.
GM crops are grown in the United States and in more than 25 other countries, where these organisms are regarded as important tools to help limit environmental problems caused by large-scale agriculture, and to meet the food requirements of an increasing world population. However, other countries restrict or even completely bar the importation of GM organisms. Few scientists now believe that the ingestion of food made from GM organisms poses direct dangers to humans. However, some objections to GM organisms must be considered. The widespread use of GM crops may disrupt the lives of farmers and farm communities, and it places considerable power in a small number of transnational agribusinesses. Some potential environmental consequences may also exist, such as the unwanted transmission of traits from GM organisms to other species in the wild. These issues are likely to remain contentious in the coming years.
Transgenic Animals Model Human Gain-of-Function Genetic Diseases Animal models of human genetic diseases have for decades been an important tool for scientists trying to understand disease biochemistry so as to design and test new drugs and other treatments. The idea of an animal model for a monogenic human disease is simple: to generate an animal with a corresponding mutation and a similar disease phenotype.
You should note that because transgenes are added to otherwise wild-type genomes, transgenic animals made by the techniques just described can serve as models only for dominant, gain-of-function mutations. (We discuss animal models for diseases caused by loss-of-function mutations in a subsequent section of this chapter.) For many reasons, mice are the animals used most often to model human genetic diseases. Mice are mammals, and similar versions of most human genes are present in their genome. In addition, mice are small and relatively economical laboratory animals. But for the study of human neurological disorders, unfortunately, mice cannot replicate the complex effects of some gene mutations on brain functions and behavior. Instead, scientists have recently begun to model human diseases in transgenic laboratory monkeys (rhesus macaques).
The first transgenic primate model for a human neurological disorder was for Huntington disease. You will recall that Huntington disease is caused by a dominant allele of the HD gene with an expanded number of CAG triplet repeats within the coding region. (Review the Fast Forward Box in Chapter 7, p. 216.) The mutant allele encodes a form of the protein product (huntingtin) that has more than the normal number of glutamine residues in its so-called polyQ region. Rhesus monkeys that model Huntington disease carry a transgene containing a mutant copy of the HD gene with an expanded CAG repeat region. These monkeys show disease symptoms similar to those of people with Huntington disease, helping scientists to understand the disorder and to develop more effective therapies.
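The relationship between an expanded CAG repeat in a coding region and a longer polyQ tract can be illustrated with a short Python sketch. It assumes a coding sequence read in frame from its first base; the function name `longest_cag_run` and the toy sequences are hypothetical illustrations, not real HD gene sequences.

```python
def longest_cag_run(coding_dna):
    """Return the length of the longest run of consecutive in-frame CAG codons.

    Each CAG codon encodes one glutamine (Q), so the run length equals the
    number of glutamines contributed by that repeat to the polyQ region.
    """
    seq = coding_dna.upper()
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    best = run = 0
    for codon in codons:
        run = run + 1 if codon == "CAG" else 0
        best = max(best, run)
    return best


# Hypothetical toy alleles: a short repeat and an expanded repeat.
short_allele = "ATG" + "CAG" * 5 + "CCGTAA"
expanded_allele = "ATG" + "CAG" * 12 + "CCGTAA"

print(longest_cag_run(short_allele), longest_cag_run(expanded_allele))
```

Counting codons in frame, rather than searching for the string "CAG" anywhere, keeps the count tied to the glutamines that would actually be translated.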
Experiments with primates raise substantial ethical concerns for many people, so the future of primate models for human genetic diseases is unclear. As of this writing in 2013, the United States National Institutes of Health is in the process of phasing out most, though not all, invasive research on primate species.
essential concepts
• A wild-type transgene can be inserted into an embryo homozygous for a recessive mutant allele. If the normal phenotype is restored, then the transgene identifies the gene that was mutated.
• The creation of reporter constructs allows easy detection of when and in which tissues a gene is turned on or turned off in eukaryotes.
• Transgenic organisms produce medically important human proteins including insulin, blood clotting factors, and erythropoietin; transgenic crop plants can potentially make ingestible vaccines.
• GM soybeans are resistant to the weed killer glyphosate. Many crops, such as corn, soybean, canola, and cotton, have been genetically modified to express the Bt protein that discourages insect predation.
• Adding a transgene that carries a disease-causing, gain-of-function allele to a nonhuman animal model allows researchers to observe disease progression and to test possible therapeutic interventions.

17.3 Targeted Mutagenesis
learning objectives
1. Describe how ES cells are used to generate knockout mice.
2. Explain why an investigator might want to create a conditional knockout mouse.
3. Discuss how scientists employ a bacteriophage site-specific recombination system to generate knockin mice.
4. Describe TALENs and how they are used to modify genomes.

In the previous section, you saw that genes can be transferred easily into random locations in the genomes of many animals and plants. Here we will explore more advanced technology that enables scientists to change specific genes in virtually any way desired; that is, targeted mutagenesis. A researcher needs only to know the DNA sequence of a gene in order to alter it; now that the genome sequences of all model organisms normally used in the laboratory have been determined, any gene in these species can be mutated at will. We focus here mostly on methods to alter specific genes in mice, which are the animal of choice for many studies relevant to human biology. However, at the end of this section we describe an exciting new technique just coming into widespread use that is applicable to many different species.
Knockout Mice Have Loss-of-Function Mutations in Specific Genes Homologous recombination provides a way for DNA sequences to zero in on specific regions of a genome. In fact, in Chapter 13 you have already seen that gene transfer by means of homologous recombination can make mutations in specific bacterial genes, a process called gene targeting (recall Fig. 13.31, p. 481). In gene targeting, scientists mutagenize a specific gene in vitro, and then introduce the mutant DNA into bacterial cells. Homologous recombination then replaces the normal copy of the gene in the bacterial genome with the mutant copy. Although homologous recombination events are rare, investigators can grow large numbers of bacteria easily and then identify rare cells containing targeted mutations by selecting for a drug resistance marker present within the transferred DNA. Gene targeting in single-celled eukaryotes such as the yeast S. cerevisiae by the same method is also quite routine.
Mouse geneticists use mouse embryonic stem cells (ES cells) to surmount two main obstacles for gene targeting in multicellular organisms. First, for a chromosome containing a targeted gene to be transmitted to progeny, gene targeting has to occur in germ-line cells. Second, given the low efficiency of homologous recombination, investigators need to screen through a large number of germ-line cells to obtain one with the desired mutation. Mouse ES cells grow in a culture dish, so as is done with bacteria or yeast, investigators can select rare cells containing a targeted mutation. A crucial aspect of this procedure is that the ES cells with targeted chromosomes can be moved from a cell culture dish to a developing embryo, where they can contribute to all different cell types, including germ-line cells.
Gene targeting in ES cells to generate knockout mice Mouse ES cells are undifferentiated cells derived from the inner cell mass of early-stage embryos called blastocysts (Fig. 17.8). These ES cells are not yet committed to
Figure 17.8 Constructing knockout mice. (a) Using recombinant DNA techniques, a gene specifying resistance to the drug neomycin (neo^r) is inserted into an exon of the gene of interest. (b) Purebred agouti mice are mated to produce an early embryo (a blastocyst). Embryonic stem (ES) cells from the inner cell mass of the blastocyst are cultured to increase their number.
(examine Fig. 18.18 on p. 613). These embryos were so aberrant that they were unable to grow into adults; thus, the mutations causing these defects would be classified as recessive lethals.
After screening several thousand such stocks for each of the Drosophila chromosomes, Nüsslein-Volhard and Wieschaus identified three classes of zygotic segmentation genes: gap genes (nine different genes); pair-rule genes (eight genes); and segment polarity genes (about 17 genes). These three classes of zygotic genes fit into a hierarchy of gene expression.
Gap genes The gap genes are the first zygotic segmentation genes to be transcribed. Embryos homozygous for mutations in the gap genes show a gap in the segmentation pattern caused by an absence of particular segments that correspond to the position at which each gene is expressed (Fig. 18.22a and b). How do the maternal transcription factor gradients ensure that the various gap genes are expressed in their broad zones at the proper position in the embryo? Part of the answer is that the binding sites in the enhancers of the gap genes have different affinities for the maternal transcription factors. For example, some gap genes are activated by Bcd protein (the anterior morphogen). Gap genes such as hb with low-affinity Bcd protein-binding sites are activated only in the most anterior regions, where the concentration of Bcd is at its highest; by contrast, genes with high-affinity sites have an activation range extending farther toward the posterior pole. Another part of the answer is that the gap genes themselves encode transcription factors that can influence the expression of other gap genes. The Krüppel (Kr) gap gene, for example, appears to be turned off by high amounts of Hb protein at the anterior end of its band of expression, activated within its expression band by Bcd protein in conjunction with lower levels of Hb protein, and turned off at the posterior end of its expression zone by the products of the knirps (kni) gap gene (Fig. 18.22c). (Note that the hb gene is usually classified as a gap gene, despite the maternal supply of some hb RNA, because the protein translated from the transcripts of zygotic nuclei actually plays the more important role.)
Chapter 18 The Genetic Analysis of Development
Figure 18.23 Pair-rule genes. (a) Zones of expression of the protein encoded by the pair-rule genes fushi tarazu (ftz) and even-skipped (eve) at the cellular blastoderm stage. Each gene is expressed in seven stripes. Eve stripe 2 is the second green stripe from the left. (b) The formation of Eve stripe 2 requires activation of eve transcription by the Bcd and Hb proteins, and the absence of repression by the Gt and Kr proteins. (c) The 700 bp enhancer upstream of the eve gene that directs the Eve second stripe contains multiple binding sites for the four proteins shown in part (b).
[Figure panels: (a) Distribution of pair-rule gene products, Even-skipped (Eve) and Fushi tarazu (Ftz), from anterior to posterior, with Eve stripe 2 marked; (b) Proteins regulating eve transcription, from anterior to posterior; (c) Upstream regulatory region of eve, from -1500 to -800 base pairs, with binding sites for Kr, Gt, Bcd, and Hb.]
Pair-rule genes After the gap genes have divided the body axis into rough, generalized regions, activation of the pair-rule genes generates more sharply defined sections. These genes encode transcription factors that are expressed in seven stripes in preblastoderm and blastoderm embryos (Fig. 18.23a). The stripes have a two-segment periodicity; that is, there is one stripe for every two segments. Mutations in pair-rule genes cause the deletion of similar pattern elements from every alternate segment. For example, larvae mutant for fushi tarazu ("segment deficient" in Japanese) lack parts of abdominal segments A1, A3, A5, and A7; mutations in even-skipped cause loss of even-numbered abdominal segments. Two classes of pair-rule genes exist: primary and secondary. The striped expression pattern of the three primary pair-rule genes depends on the transcription factors encoded by the maternal genes and the zygotic gap genes. Specific elements within the upstream regulatory region of each pair-rule gene drive the expression of that pair-rule gene within a particular stripe. For example, as Fig. 18.23b and c show, the DNA region responsible for driving the expression of even-skipped (eve) in the second stripe contains multiple binding sites for the Bcd protein and the proteins encoded by the gap genes Krüppel, giant (gt), and hb. The transcription of eve in this stripe of the embryo is activated by Bcd and Hb, while it is repressed by Gt and Kr. Only in the stripe 2 region are Gt and Kr levels low enough, and Bcd and Hb levels high enough, to allow activation of the enhancer driving eve expression. In contrast with the primary pair-rule genes, transcription of the five pair-rule genes of the secondary class is controlled by the transcription factors encoded by other pair-rule genes.
Segment polarity genes
Many segment polarity genes are expressed in stripes that are repeated with a single segment periodicity; that is, there is one stripe per segment (Fig. 18.24a). Mutations in segment polarity genes cause deletion of part of each segment, often accompanied by mirror-image duplication of the remaining parts. The segment polarity genes thus function to determine certain patterns that are repeated in each segment. The regulatory system that directs the expression of segment polarity genes in a single stripe per segment is quite complex. In general, the transcription factors encoded by pair-rule genes initiate the pattern by directly regulating certain segment polarity genes. Interactions between various cell polarity genes then maintain this periodicity later in development. Significantly, activation of segment polarity genes occurs after cellularization of the embryo is complete, so the diffusion of transcription factors within the syncytium ceases to play a role. Instead, intrasegmental patterning is determined mostly by the diffusion of secreted proteins between cells. Two of the segment polarity genes, hedgehog (hh) and wingless (wg), encode secreted proteins. These proteins, together with the transcription factor encoded by the engrailed (en) segment polarity gene, are responsible for many aspects of segmental patterning (Fig. 18.24b). A key component of this control is that a one-cell-wide stripe of
18.5 A Comprehensive Example: Body Plan Development in Drosophila
Figure 18.24 Segment polarity genes. (a) Wild-type embryos express the segment polarity gene engrailed in 14 stripes. (b) The border between a segment's posterior and anterior halves, or compartments, is governed by the engrailed (en), wingless (wg), and hedgehog (hh) segment polarity genes. Cells in posterior compartments express en. The En protein activates the transcription of the hh gene, which encodes a secreted protein ligand. Binding of this Hh protein to the Patched receptor in the adjacent anterior cell initiates a signal transduction pathway (through the Smo and Ci proteins) leading to the transcription of the wg gene. Wg is also a secreted protein that binds to a different receptor in the posterior cell, which is encoded by frizzled. Binding of Wg to the receptor initiates a different signal transduction pathway (including the Dsh, Zw3, and Arm proteins) that stimulates the transcription of the en and hh genes. The result is a reciprocal loop stabilizing the alternate fates of adjacent cells at the border.
[Figure panels: (a) Distribution of Engrailed protein; (b) Segment polarity genes establish compartment borders, with labels for Wingless protein, Frizzled receptors, Patched, Hedgehog, and Smo.]
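The combinatorial control of Eve stripe 2 described in the pair-rule discussion (activation by Bcd and Hb, repression by Gt and Kr) can be sketched as simple threshold logic in Python. Everything numerical here is an assumption made for illustration: the gradient shapes, the domain boundaries, the thresholds, and the function names are hypothetical and are not measured values.

```python
import math

# Position x runs from 0.0 (anterior pole) to 1.0 (posterior pole).
# All profiles below are hypothetical toy shapes, not measured data.

def bcd(x):
    """Toy Bcd gradient: highest at the anterior pole, decaying posteriorly."""
    return math.exp(-x / 0.2)

def hb(x):
    """Toy Hb domain: present in the anterior half."""
    return 1.0 if x < 0.5 else 0.0

def gt(x):
    """Toy Gt domain: an anterior repressor region."""
    return 1.0 if x < 0.30 else 0.0

def kr(x):
    """Toy Kr domain: a central repressor band."""
    return 1.0 if 0.42 <= x <= 0.60 else 0.0

def eve_stripe2_on(x, act_threshold=0.1, rep_threshold=0.5):
    """Enhancer is active only if both activators are high and both repressors are low."""
    activated = bcd(x) > act_threshold and hb(x) > 0.5
    repressed = gt(x) > rep_threshold or kr(x) > rep_threshold
    return activated and not repressed

positions = [i / 100 for i in range(101)]
stripe = [x for x in positions if eve_stripe2_on(x)]
print(f"toy stripe spans x = {stripe[0]:.2f} to {stripe[-1]:.2f}")
```

In this toy model the stripe appears only in the narrow window between the two repressor domains, which mirrors the statement that Gt and Kr must both be low while Bcd and Hb are both high.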
Thus, in one generation of selection, the allele frequency has changed from $q$ to $q'$, as given by Eq. (20.5). We can estimate this change as $\Delta q = q' - q$. Substituting Eq. (20.5) for $q'$ yields (after some algebra):

$$\Delta q = \frac{pq\,[\,q(W_{dd} - W_{Dd}) - p(W_{DD} - W_{Dd})\,]}{\bar{W}} \qquad (20.6)$$

Equation (20.6) shows that selection can cause the frequency of an allele to change from one generation to the next, and this change depends both on the frequencies of the two alleles and on the relative fitnesses of the three genotypes. Note that if the fitnesses of all genotypes are the same, as in populations at Hardy-Weinberg equilibrium, […] $> 0$. For a genotype with deleterious effects, the fitness $W_{dd}$ can vary from just less than 1 (minimal selection against dd) to 0 (dd is lethal, so no dd individuals survive to adulthood). If selection against dd homozygotes exists, $\Delta q$ is always negative, and the frequency of the d allele decreases with each generation. A key feature of Eq. (20.6) when $W_{dd}$ is less than 1 is its prediction that the rate at which $q$ decreases over time diminishes as $q$ becomes smaller. This prediction emerges because $\Delta q$ varies with $q^2$, and because $q$ is always less than 1, $q^2 < q$.
To understand this effect, consider the special case of a lethal recessive disease for which $W_{dd} = 0$. The dotted line in Fig. 20.12 shows the decrease in allele frequency predicted by Eq. (20.6), starting from an initial allele frequency of 0.5. The decrease in allele frequency is rapid at first and then slows. After 10 generations, the predicted frequency of the recessive disease allele is still nearly 10%, even though the homozygous recessive genotype is lethal. The solid line in Fig. 20.12 plots actual data for the decrease in frequency of an autosomal recessive lethal allele in a large experimental population of Drosophila melanogaster; the predicted and observed changes in allele frequency match quite closely.
Figure 20.12 Decrease in the frequency of a recessive lethal allele over time. The dotted line represents the mathematical prediction. The blue line represents the actual data obtained with an autosomal recessive lethal allele.
[Plot: allele frequency versus generation (0 to 10), with curves labeled "Predicted changes" and "Observed".]
Chapter 20 Variation and Selection in Populations
Why does selection become less effective as the frequency of a recessive lethal allele moves closer to zero? The answer is that when $q$ is small, individuals homozygous for the disease allele (at a frequency of $q^2$) are very rare because most copies of the d allele occur in Dd heterozygotes (at a frequency of $2pq$) who do not experience negative selection. In mathematical terms, the ratio of $q^2$ to $q$ decreases exponentially for all values of $q$ less than 1. Over successive generations, then, the allele frequency $q$ should continue to decline, albeit more and more slowly over time as $q$ moves closer and closer to a value of zero.
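The slowing decline of a recessive lethal allele can be reproduced by iterating Eq. (20.6) numerically. The following Python sketch is a minimal illustration under stated assumptions: genotype fitnesses are written `w_DD`, `w_Dd`, and `w_dd`, the mean fitness is taken as $\bar{W} = p^2 W_{DD} + 2pq W_{Dd} + q^2 W_{dd}$, and the function names are hypothetical.

```python
def delta_q(q, w_DD=1.0, w_Dd=1.0, w_dd=0.0):
    """Change in the frequency q of allele d after one generation of selection."""
    p = 1.0 - q
    w_bar = p * p * w_DD + 2 * p * q * w_Dd + q * q * w_dd  # mean fitness
    return p * q * (q * (w_dd - w_Dd) - p * (w_DD - w_Dd)) / w_bar


def allele_trajectory(q0, generations, **fitness):
    """Return [q_0, q_1, ..., q_n] by repeatedly applying delta_q."""
    qs = [q0]
    for _ in range(generations):
        qs.append(qs[-1] + delta_q(qs[-1], **fitness))
    return qs


# Recessive lethal (w_dd = 0), starting from q = 0.5 as in the text.
qs = allele_trajectory(0.5, 10)
for generation, q in enumerate(qs):
    p = 1.0 - q
    print(f"gen {generation:2d}: q = {q:.4f}  dd = {q * q:.4f}  Dd = {2 * p * q:.4f}")
```

For the lethal case with $W_{DD} = W_{Dd} = 1$, Eq. (20.6) simplifies to $\Delta q = -q^2/(1+q)$, so $q_n = q_0/(1 + n q_0)$; starting from 0.5 this gives about 0.083 after 10 generations. The printed columns also show that heterozygotes increasingly outnumber dd homozygotes as $q$ falls, which is why selection acts on an ever smaller share of the d alleles.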
Figure 20.13 Natural selection together with genetic drift. Each colored line represents an independent Monte Carlo simulation of a population of N = 500 individuals in which a new mutant allele that confers a slight dominant fitness advantage appears at time 0. In three of the simulations, the mutant allele goes extinct in fewer than 100 generations because of genetic drift when the allele frequency is very low; these simulations go to fixation so quickly they are hard to distinguish in the lower left of the graph. The few simulations that escape from the loss of the advantageous mutation move inevitably to fixation for the new mutation.
[Plot: allele frequency (0 to 1.00) versus generations from start.]
Natural selection in finite populations Modifying the Hardy-Weinberg equation with relative fitnesses overcomes one limitation of the original equation: the assumption that all possible genotypes are equal in fitness. But the analytical solution of this modified equation to determine $\Delta q$ still suffers from a dependence on the assumption of an infinite population. However, we can use the modified Hardy-Weinberg equation to develop Monte Carlo simulations that explore the impact of natural selection on finite populations. As an example, let's consider a population of 500 individuals in which 499 are homozygous initially for the b allele, and one is heterozygous with a B mutation on one chromosome that provides a slight dominant advantage in survival described by the following relative fitness values: $W_{BB} = 1.0$, $W_{Bb} = 1.0$, and $W_{bb} = 0.98$. These conditions can be modeled with a Monte Carlo approach that randomly eliminates 2% of bb individuals created in each generation, and replaces them with offspring from a new mating of the parental generation. Figure 20.13 shows the results of six simulations of this population model. Notice first that in three of these, the new B allele never takes off, going extinct within 65 generations. But in the populations where the B allele increases in frequency to about 0.10, it inevitably moves toward fixation. This example illustrates two important points concerning the impact that a new mutant allele with a small, yet realistic, fitness advantage can have on a population. First, even though the novel allele provides a selective advantage, it will often go extinct due to chance events of reproduction in the initial generations. But second, if the advantageous allele reaches a threshold frequency level that ensures its survival, its frequency will always increase all the way to fixation eventually, even if the small fitness advantage is imperceptible at the individual level.
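A Monte Carlo model of this kind can be sketched in a few lines of Python. This is one possible reading of the procedure described above, not the simulation used to draw Fig. 20.13: it assumes random mating with non-overlapping generations, a constant population size, and that each bb zygote is discarded with probability $1 - W_{bb}$ and replaced by a zygote from a fresh mating. The function name and parameters are hypothetical.

```python
import random


def simulate_b_allele(n_individuals=500, w_bb=0.98, max_generations=5000, rng=None):
    """Return the frequency of allele B in each generation until loss or fixation.

    The population starts with a single Bb heterozygote among bb homozygotes.
    """
    rng = rng or random.Random()
    n_alleles = 2 * n_individuals
    b_copies = 1
    freqs = [b_copies / n_alleles]
    for _ in range(max_generations):
        if b_copies == 0 or b_copies == n_alleles:
            break  # B has been lost or has reached fixation
        p = b_copies / n_alleles
        next_copies = 0
        survivors = 0
        while survivors < n_individuals:
            # Number of B alleles (0, 1, or 2) in a zygote from random mating.
            zygote_b = int(rng.random() < p) + int(rng.random() < p)
            if zygote_b == 0 and rng.random() >= w_bb:
                continue  # bb zygote eliminated; replaced by a new mating
            next_copies += zygote_b
            survivors += 1
        b_copies = next_copies
        freqs.append(b_copies / n_alleles)
    return freqs


if __name__ == "__main__":
    for seed in range(6):
        freqs = simulate_b_allele(rng=random.Random(seed))
        if freqs[-1] == 1.0:
            outcome = "fixed"
        elif freqs[-1] == 0.0:
            outcome = "lost"
        else:
            outcome = "still segregating"
        print(f"run {seed}: B {outcome} after {len(freqs) - 1} generations")
```

Passing a seeded `random.Random` makes each run reproducible; because the outcome of any single run depends on chance, many runs are needed to estimate how often the B allele escapes early loss.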
The impact of natural selection on humans When people migrated out of the East African region in which H. sapiens originated, beginning about 80,000 to 60,000 years ago (as will be discussed later in this chapter), founder populations encountered environmental conditions in Europe and Asia that were distinct from those in Africa. As a result, the relative fitnesses of alternative alleles at a number of genes became reversed. Among the most obvious changes were differences in allele frequencies at genes that determine skin pigmentation. The ultraviolet rays of the sun provide people with benefits as well as harm. One benefit is the catalysis of vitamin D production; the harm is in the induction of mutations in our skin that can lead to skin cancer. Closer to the equator, the sun's rays are most intense; alleles that cause a darkening of the skin are advantageous in tropical regions because they protect against skin cancer while allowing enough ultraviolet light through for vitamin D production. At higher latitudes, where the sun's rays are less intense, skin cancer is less of a problem, and alleles that lighten the skin allow enough UV penetration for sufficient vitamin D production. As described in Chapter 3, skin pigmentation is a complex quantitative trait determined by alleles at many genes, but about a half dozen of these genes are most influential. One fascinating question concerning our history as a species is whether European and Asian populations derived lighter skin pigmentation from a common ancestral population, or whether the trait evolved separately on the two continents. A mixed answer has been obtained by surveying allele frequencies at multiple pigmentation loci in populations indigenous to different geographical locations around the Old World. The KITLG gene is among the small number with a prominent role in skin pigmentation. As you can see in Fig. 20.14a, Europeans and Asians share a common SNP variant of KITLG responsible for a reduction in pigmentation,
but Europeans and Asians independently accumulated variants with roles in skin pigmentation at two other loci (MC1R and SLC24A5; Fig. 20.14b and c). Thus, although the same selective pressure of reduced sunlight existed in both populations, the selection acted on different mutations that occurred at different times in human history. Another example of recent strong natural selection changing allele frequencies in different human populations is lactase persistence, which we introduced at the beginning of this chapter (review Fig. 20.1 on p. 663). Here, selection was brought about not by exposure to different environments, but rather through the development by humans of agriculture and the domestication of cattle that provided milk as a source of nutrition. The chance occurrence of mutations in regions upstream of the gene encoding the enzyme for milk sugar (lactose) digestion eliminated the turning off of gene expression past weaning that takes place in all other mammals. People who could digest milk as adolescents and adults apparently survived better and/or had more offspring when food was scarce, leading to a fitness advantage in certain parts of the world for the lactase persistence mutations.
20.2 What Causes Allele Frequencies to Change in Real Populations?
Figure 20.14 Geographical distribution of allele frequencies at skin pigmentation loci. (a) KITLG locus; SNP: rs1881227; ancestral allele: G. (b) SLC24A5 locus; SNP: rs1834640; ancestral allele: G. (c) MC1R locus; SNP: rs885479; ancestral allele: C.
[Maps: Human Genome Diversity Project world maps of allele frequencies, with latitude and longitude grid lines.]
Sickle-cell anemia, which includes episodes of severe pain, serious anemia, and a probability of early death, is a recessive condition resulting from two copies of the sickle-cell allele at the β-globin gene (Hbβ). It is thus surprising that the disease allele has not disappeared from several African populations, where it seems to have existed for a very long time. One clue to the maintenance of the sickle-cell allele Hbβ^S in human populations lies in the observation that its allele frequency is highest in regions of Africa in which malaria is endemic (Fig. 20.15). A second clue is that heterozygotes for the normal and sickle-cell alleles (Hbβ^A Hbβ^S) are resistant to malaria. This resistance is due, in part, to the fact that red blood cells containing a sickle-cell allele break open after being infected by the malaria parasite, destroying the parasite
,,1.
?
Balancing Selective Forces Can Maintain Alleles in a Population
150'
suggesting that they derived it from a common ancestor who lived after humans migrated from Africa into the Arabian Peninsula and prior to the separation of populations heading northwest and northeast. In contrast,
as
well
as
the red
blood cell itself. By contrast, in red blood cells with two normal hemoglobin alleles, the malaria parasite thrives. Thus, individuals of gen otype H\BA HbBs have a heterozygote advantage in malaria-infested regions over either type of homozygote: The carriers are less susceptible to malaria than are HbBA HbPs homozygotes, and less susceptible to anemia than are HbBs HbPs homozygotes. Heterozygote advantage is one of several processes leading to balancing selection that actively maintains genetic polymorphisms.
Figure 20.15 High frequency of the sickle-cell allele Hbβ^S in regions of Africa where malaria is prevalent. (a) Geographical distribution of Hbβ^S. (b) Geographical distribution of the malaria-causing parasite Plasmodium falciparum. (c) Hbβ^A Hbβ^S heterozygotes have decreased susceptibility to malaria, and thus have a selective advantage in areas with malaria relative to both Hbβ^A Hbβ^A homozygotes, who are fully susceptible to malaria, and Hbβ^S Hbβ^S homozygotes, who suffer from sickle-cell anemia and often die without reproducing.

(c) Hbβ genotype fitnesses
Genotype:          Hbβ^A Hbβ^A    Hbβ^A Hbβ^S    Hbβ^S Hbβ^S
Relative fitness:  0.8            1.0            0
Equilibrium frequency of Hbβ^S = 0.17 predicted (and observed) in areas with malaria

To understand heterozygote advantage mathematically, assume that Hbβ^A Hbβ^S heterozygotes have the maximum relative fitness of 1, while the relative fitness for Hbβ^A Hbβ^A homozygotes is $W_{AA}$, and the relative fitness for Hbβ^S Hbβ^S homozygotes is $W_{aa}$. (To simplify the following equations, we are temporarily renaming the Hbβ^A allele as A, whose frequency is $p$, and renaming the Hbβ^S allele as a, whose frequency is $q$.) Selection will maintain both alleles in the population only if $\Delta q = 0$ for some value of $q$ between 0 and 1. The $q$ value at which $\Delta q = 0$ is known as the allele's equilibrium frequency. This value occurs when the term inside the brackets of Eq. (20.6) is 0, that is, when

\[ \left[\, q\,(W_{aa} - W_{Aa}) + p\,(W_{Aa} - W_{AA}) \,\right] = 0 \tag{20.7} \]

Substituting $1 - q$ for $p$ and solving Eq. (20.7) for $q$ reveals that the equilibrium frequency of Hbβ^S, represented by $q_e$, is reached when

\[ q_e = \frac{W_{Aa} - W_{AA}}{(W_{Aa} - W_{AA}) + (W_{Aa} - W_{aa})} \tag{20.8} \]

Thus, to find the equilibrium frequency, that is, the value of $q$ at which $\Delta q = 0$ such that both alleles persist in the population, you need to know only the relative fitnesses for the two homozygotes, because $W_{Aa}$ was set to 1.0.

On the other hand, if you know the equilibrium frequency and the relative fitness of one of the homozygotes, you can use Eq. (20.8) to estimate the relative fitness of the other homozygote. For example, we can assume that the African populations in which sickle-cell anemia is prevalent are roughly at equilibrium with regard to natural selection acting on the β-globin gene. Several field studies have revealed that the average frequency of the Hbβ^S allele in tropical populations is 0.17, so we will take this number as the equilibrium frequency $q_e$. Since the heterozygote Hbβ^A Hbβ^S has the highest fitness in areas with malaria, we will assign $W_{Aa} = 1$. Further, if you assume that Hbβ^S Hbβ^S homozygotes never reproduce, as was essentially true before medical advances enabled the survival of children expressing the sickle-cell trait, then $W_{aa} = 0$. With $W_{aa} = 0$ and $W_{Aa} = 1$, Eq. (20.8) can be rearranged to provide the following estimate of the relative fitness of the wild-type genotype $W_{AA}$ given $q_e$:

\[ W_{AA} = \frac{1 - 2q_e}{1 - q_e} = \frac{1 - 2(0.17)}{1 - 0.17} \approx 0.8 \]

To understand the relationship between $q$, the change in $q$, and the equilibrium frequency $q_e$, you can use Eqs. (20.6) and (20.8) to formulate a new equation for $\Delta q$:

\[ \Delta q = \frac{-pq\,\left[(1 - W_{AA}) + (1 - W_{aa})\right]\,(q - q_e)}{\bar{W}} \tag{20.9} \]

From Eq. (20.9), you can see that when $q$ is greater than $q_e$, $\Delta q$ is negative. Under these circumstances, $q$ (that is, the frequency of the Hbβ^S allele) will decrease toward the equilibrium. By contrast, when $q$ is less than $q_e$, $\Delta q$ is positive and the frequency of Hbβ^S will increase toward the equilibrium. Thus, the allele frequency is stabilized at equilibrium, because a change away from equilibrium is always followed by a change toward it.
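The stabilizing behavior described above can be checked numerically. The following is a minimal sketch, assuming that Eq. (20.6) has the standard one-locus selection form, Δq = pq[q(W_aa − W_Aa) + p(W_Aa − W_AA)] / W̄ (that equation is not reproduced in this passage). The function names are hypothetical and chosen only for illustration; the fitness values are the ones used in the sickle-cell example.

```python
def mean_fitness(q, w_AA, w_Aa, w_aa):
    """Population mean fitness when allele a has frequency q and A has 1 - q."""
    p = 1.0 - q
    return p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa


def delta_q(q, w_AA, w_Aa, w_aa):
    """One-generation change in q (assumed standard form of Eq. 20.6)."""
    p = 1.0 - q
    bracket = q * (w_aa - w_Aa) + p * (w_Aa - w_AA)
    return p * q * bracket / mean_fitness(q, w_AA, w_Aa, w_aa)


def equilibrium_q(w_AA, w_Aa, w_aa):
    """Value of q at which the bracketed term is zero (heterozygote advantage)."""
    return (w_Aa - w_AA) / ((w_Aa - w_AA) + (w_Aa - w_aa))


def w_AA_from_equilibrium(q_e):
    """Fitness of AA homozygotes implied by q_e when W_Aa = 1 and W_aa = 0."""
    return (1.0 - 2.0 * q_e) / (1.0 - q_e)


def iterate(q0, generations, w_AA, w_Aa, w_aa):
    """Apply delta_q repeatedly, starting from frequency q0."""
    q = q0
    for _ in range(generations):
        q += delta_q(q, w_AA, w_Aa, w_aa)
    return q


if __name__ == "__main__":
    w_AA = w_AA_from_equilibrium(0.17)
    print("W_AA implied by q_e = 0.17:", round(w_AA, 3))
    print("q_e recovered from fitnesses:", round(equilibrium_q(w_AA, 1.0, 0.0), 3))
    # Starting below or above equilibrium, q moves toward q_e.
    print("from q = 0.01:", round(iterate(0.01, 300, w_AA, 1.0, 0.0), 3))
    print("from q = 0.60:", round(iterate(0.60, 300, w_AA, 1.0, 0.0), 3))
```

Starting the iteration on either side of the equilibrium shows the sign change in Δq that the text describes: the change is positive below the equilibrium frequency and negative above it.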
A Comprehensive Example: Human Behavior Can Affect Evolution of Insect Pests

In Chapter 13, we discussed how populations of bacterial pathogens have evolved resistance to drugs humans developed to protect us from infections. Like infectious bacteria, many insects that threaten human health and agriculture spawn large populations because of their short generation times and rapid rates of reproduction. Via selection for resistance-conferring mutations, these large, rapidly reproducing populations of diploid insects evolve resistance to the chemical pesticides used to control them.

The large-scale, commercial use of DDT (dichlorodiphenyltrichloroethane) and other synthetic organic insecticides, begun in the 1940s, was at first highly successful at reducing crop destruction by agricultural pests, such as the boll weevil, and insect vectors of disease, such as the mosquitoes that transmit malaria and yellow fever. Within a few years, however, resistance to these insecticides was detectable in the targeted insect populations. Since the 1950s, resistance to every known insecticide has evolved within 10 years of its commercial introduction. Because different populations within a species can become resistant independently of other populations, insecticide resistance likely developed many separate times in many insect species since the introduction of insecticides.

Genetic studies show that insecticide resistance can result from mutations in several different genes. DDT, for example, is a nerve toxin in insects because it binds to a sodium channel protein and therefore disrupts the protein's function in nerve transmission. Some insects develop DDT resistance through recessive mutations in the channel-encoding gene that produce a channel protein that binds DDT poorly. Houseflies and certain mosquito species develop resistance to DDT from dominant mutations in other genes that encode enzymes that detoxify DDT, rendering it harmless to the insect. In some cases, these dominant alleles are mutations in gene regulatory regions that lead to the overexpression of a detoxifying enzyme. Both recessive and dominant mutations causing DDT resistance are of concern for the control of insect pests, but we focus here on dominant mutations, which as we have seen can spread very rapidly through populations.

Consider, for example, the dominant mutation R (for insecticide resistance), which occurs initially at low frequency in a population. Soon after the mutation appears, most of the R alleles are in Rr heterozygotes (in which r is the wild-type susceptibility allele). With the application of insecticide, strong selection favoring Rr heterozygotes will rapidly increase the frequency of the resistance allele in the population.

A field study of the use of DDT in Bangkok, Thailand to control Aedes aegypti mosquitoes, the carriers of yellow fever, illustrates the rapid evolution of resistance. Spraying of the insecticide began in 1964 and was very effective in controlling the mosquitoes. Within a year, however, DDT-resistant mutant alleles (R) emerged and rapidly increased in frequency. By mid-1967, the frequency of resistant RR homozygotes was nearly 100% (Fig. 20.16). Because DDT became ineffective in reducing mosquito populations in Bangkok due to the near fixation of the DDT-resistant allele, the insecticide spraying program was stopped.

Figure 20.16 How genotype frequencies among populations of A. aegypti mosquitoes changed in response to insecticide application. Results observed after the insecticide DDT was used in a suburb of Bangkok, Thailand, beginning in 1964 and ending in 1968. [Graph: genotype percentages (0 to 100), with curves labeled RR and Rr, plotted by year from 1964 to 1970.]

The response of the mosquito population to the cessation of spraying was intriguing: The frequency of the R allele decreased rapidly, and by 1969, RR genotypes had virtually disappeared (Fig. 20.16). The precipitous decline of the R allele suggests that in the absence of DDT, the RR genotype produces a lower fitness than the rr genotype. In other words, the homozygous resistance genotype imposes a fitness cost on individuals such that in the absence of insecticide, resistance is subject to negative selection that decreases the frequency of R in the population. This dependence of the fitness of individual genotypes on their environment is quite similar to the heterozygote advantage seen for humans carrying the mutation causing sickle-cell anemia in parts of the world in which malaria is endemic.
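The rise and fall of the R allele can be illustrated with a small deterministic model. This is only a sketch under stated assumptions: the relative fitness values below are hypothetical numbers picked to mimic a dominant resistance allele that is favored during spraying and costly afterward; they are not measurements from the Bangkok study, and the names `SPRAY`, `NO_SPRAY`, `next_freq`, and `trajectory` are invented for this example.

```python
# Hypothetical relative fitnesses (w_RR, w_Rr, w_rr); not taken from field data.
SPRAY = (1.0, 1.0, 0.3)      # R is dominant: RR and Rr both survive DDT
NO_SPRAY = (0.4, 0.7, 1.0)   # resistance carries a fitness cost without DDT


def next_freq(p, w_RR, w_Rr, w_rr):
    """Frequency of R in the next generation, given its current frequency p."""
    q = 1.0 - p
    w_bar = p * p * w_RR + 2 * p * q * w_Rr + q * q * w_rr
    return (p * p * w_RR + p * q * w_Rr) / w_bar


def trajectory(p0, phases):
    """Frequencies of R over successive phases.

    phases is a list of (number_of_generations, fitness_tuple) pairs.
    """
    freqs = [p0]
    for generations, fitnesses in phases:
        for _ in range(generations):
            freqs.append(next_freq(freqs[-1], *fitnesses))
    return freqs


if __name__ == "__main__":
    freqs = trajectory(0.01, [(30, SPRAY), (30, NO_SPRAY)])
    for generation in (0, 10, 20, 30, 40, 50, 60):
        print(generation, round(freqs[generation], 3))
```

With these assumed values, the frequency of R climbs steadily while the insecticide is applied and declines once spraying stops, which is the qualitative pattern reported for Fig. 20.16. The number of generations per year and the true fitness values would have to come from field data.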
essential concepts

• Although the Hardy-Weinberg equation almost always provides accurate estimates of allele and genotype frequencies over the course of a few generations, it fails in the long run because real populations do not conform to the Hardy-Weinberg assumptions.
• In small populations, genetic drift due to random sampling of finite gamete pools can alter the frequency of an allele rapidly, until it eventually becomes either lost or fixed.
• Because genotypes may not display equal fitness, natural selection may increase or decrease allele frequencies in populations over time.
• Relative to an alternative genotype, the same genotype can be more fit in one environment, but less fit in a different environment. For example, heterozygotes for the sickle-cell mutation are resistant to malaria, explaining the high frequency of this allele in tropical populations of humans.
• The fitness benefits to insects of insecticide resistance often come with fitness costs; resistance allele frequencies thus change rapidly when humans apply or stop applying insecticides.

20.3 Ancestry and the Evolution of Modern Humans

learning objectives

1. Distinguish between an individual's biological and genetic ancestries.
2. Summarize the evidence supporting the origin of modern humans in Africa.
3. Explain how DNA sequencing has clarified the relationships of ancient and modern human lineages.

At the beginning of this chapter, we saw that human populations in different regions of the globe vary greatly in the frequency of alleles dictating lactose tolerance or intolerance. These current-day variations in allele frequency reflect many processes that occurred during the course of human history. The earliest humans almost certainly were all lactose intolerant, as are other primates. At least twice in human history, and in different geographical locations, mutations occurred that allowed expression of the lactase enzyme to continue through adulthood. The frequency of the mutant alleles increased rapidly in populations that raised dairy animals because of the selective advantage afforded by obtaining sustenance from milk products. Genetic drift in small populations of dairy herders may also have contributed to the spread of these alleles. The mutant alleles were then introduced into other populations by migration of individuals carrying the alleles.

This example illustrates that the existence of specific DNA variants, and the frequencies at which these variants are represented in different current-day human populations, serve as molecular fossils that can provide scientists with insights into the events that shaped human history. We explore here how population genetics provides important tools for anthropology, the study of humankind.

Figure 20.17 Biological and genetic ancestries. (a) The great grandparents of Dion and Ana came from four different regions of the world. (b) Tracing the Y chromosomal, autosomal, and mitochondrial (mt) contributions from biological ancestors to Dion and Ana. (c) Y chromosome DNA variation tracks the paternal lineage, while the mtDNA traces the maternal lineage. The autosomes (chromosomes 1-22) undergo recombination, so the ancestry of individual segments must be traced separately. [Panels: (a) Great grandparents from four different regions; (b) Biological ancestors of Dion and Ana, 3 generations back (great grandparents, grandparents, parents; Y and mt lineages marked); (c) Genetic ancestry of Dion and Ana (Y chromosome, chromosomes 1-22, mitochondrial genome).]

Shared Alleles Denote Common Genetic Ancestry

Individual people have two kinds of ancestors: biological ancestors and genetic ancestors. Biological ancestry is simply a description of who begat whom: You have two biological parents, four grandparents, eight great grandparents, and so forth (Fig. 20.17a and b). Any individual alive today could potentially have 2^k biological ancestors k generations ago, assuming that the ancestors in any one generation were unrelated. Thus, 20 generations ago (about 400 years) you could have had over 1 million biological ancestors, and 30 generations ago (about 600 years) more than 1 billion. This latter number is much higher than the number of humans thought to have been on the earth at that point in history (during the Middle Ages), the reason being that some ancestors in previous generations must have been related. But nonetheless, you still had a very large number of biological ancestors not so very far back in the past. In fact, you are almost certainly related to someone famous, and also to someone infamous.
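The arithmetic behind these numbers is simple doubling. As a quick worked check, the sketch below computes the potential number of biological ancestors, assuming (as the text does) that no ancestors in a generation are related, and using the generation length of 20 years implied by the text's figures of 400 and 600 years. The function name and constant are hypothetical.

```python
YEARS_PER_GENERATION = 20  # implied by the text: 20 generations is about 400 years


def potential_ancestors(k):
    """Upper bound on biological ancestors k generations back (no shared ancestors)."""
    return 2 ** k


if __name__ == "__main__":
    for k in (1, 2, 3, 20, 30):
        years = k * YEARS_PER_GENERATION
        print(f"{k} generations (~{years} years): {potential_ancestors(k):,}")
```

Because real pedigrees contain related ancestors, the true count is smaller than this upper bound, which is the point the text makes about the 30-generation figure.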
The most recent common ancestor

Genetic ancestry refers to the actual inheritance of segments of the genome from biological ancestors (Fig. 20.17c). Comparison of Fig. 20.17b and c shows that for diploid regions of the human genome, we have many more biological ancestors than genetic ancestors. The reason is that we each have two parents, each of whom has two alleles at a diploid autosomal locus, but we inherit only one allele from each parent. Each generation, four biological ancestors exist for a given person, but only two genetic ancestors exist for any given locus in that person's diploid genome.
If two current-day siblings receive copies of the same allele of a region of a genome from one of the chromosomes in one of their parents, then that parental allele is the single most recent common ancestor (MRCA) for that region of those two people (Fig. 20.18a). Because of recombination, different regions of the genomes of two relatives may have different MRCAs (Fig. 20.17c). In fact, across their genomes, two relatives have many different MRCAs in the past, reflecting the fact that they have inherited bits and pieces of their genome from many (but not all) of their many biological ancestors. In Fig. 20.18a, for example, the individual carrying the most recent common ancestor for an allele of a particular autosomal region in the five people alive today highlighted at the bottom of the figure existed seven generations ago.

For any specific region of the genome, the MRCA for all humans is the most recent allele from which all current-day people have obtained DNA sequences in an unbroken line of descent. The MRCA for that particular locus was carried by an individual who co-existed with other people who also contributed some of their genes to only some (or none) of current-day humans.

In contrast to the recombining diploid portions of our genome, mitochondrial DNA (mtDNA) is passed directly from mothers to their offspring with no contribution from the fathers (Fig. 20.17b and c). Similarly, excluding the small pseudoautosomal regions (PARs) shared by the X and Y chromosomes, the DNA on Y chromosomes is passed directly from fathers to sons: Daughters do not inherit a Y chromosome (Fig. 20.17b and c). Each of us thus has just one genetic ancestor each generation for mtDNA or Y-chromosomal DNA; they cannot have come from the same person since mtDNA is maternally derived and Y DNA is paternally derived. Because of a lack of recombination, the entire mtDNA and almost the entire Y chromosome each have only single MRCAs for all humans.

Figure 20.18 Tracing genetic ancestry. (a) While an individual alive today has 2^k biological ancestors k generations ago, for a single genomic locus, the genetic lineages of five people alive today coalesce to just one ancestral allele, the Most Recent Common Ancestor (MRCA) for that region of the genome. (b) Analysis of shared mutations allows scientists to trace lineages back to the MRCA. [Panels: (a) Most Recent Common Ancestor (MRCA); (b) Mutations track genetic ancestry to the MRCA. In both panels time runs from Past to Today. In (b), the MRCA sequence AAAAA leads to the present-day sequences AATAA, AATAA, AAAGA, AAAGA, and TGAAC.]

Lines of genetic ancestry revealed by mutations

Looking at the history of human populations through the lens of genetic ancestry is a powerful method for interpreting DNA sequence variation in present-day people. The different allele lineages coalesce to the MRCA as we go backwards in time (Fig. 20.18a). The MRCA thus provides a starting point for analysis, in the form of an ancestral allele whose descendant sequences are found in all people now on the earth. Even though an unbroken line of descent connects the MRCA with sequences found in all modern-day people, this does not mean that all MRCA-related sequences are identical. The reason is that mutations can occur by chance along the lineages that connect each modern-day person to the MRCA, leaving trails in our genes of our genetic ancestors in the generations that separate us from the MRCA (Fig. 20.18b). Researchers determine these lines of descent by analyzing mutations shared by present-day individuals.

Because mutations accumulate over time, the longer the ancestry branch length between alleles in two individuals alive today, the more different are their DNA sequences. You can see the implications of this fact in Fig. 20.18b, where, as we have seen, alleles in five individuals alive at present trace back to a single MRCA just seven generations ago. The DNA sequence of that MRCA was AAAAA. As that sequence was passed down over generations, germ-line mutations could occur in it, creating novel descendant alleles. Note in Fig. 20.18b how individuals closely linked in the family history (such as the siblings at the bottom left) are identical in sequence (AATAA), but they differ at four of the five nucleotide positions from the individual at the bottom right (TGAAC), who comes from a different family lineage that is nonetheless still derived from the MRCA at the top.

In Chapter 10, we saw how this picture of descent with modification allows population geneticists to interpret DNA sequence data obtained from many people in many different human populations (review Fig. 10.6 on p. 345). If a SNP allele is found both in some present-day humans and in some present-day members of other primate species such as chimpanzees, the allele is ancestral and must have been inherited directly from a common ancestor of humans and chimps without mutational modification. By contrast, for a derived SNP allele found in some human populations but not in other human groups nor in chimps, the mutational event must have occurred at some generation after the MRCA in a specific human sublineage. Human populations who share more alleles must have separated from each other more recently than populations whose DNA sequences are more divergent.

As was suggested previously in Fig. 20.17, tracing variations in mtDNA and Y-chromosome sequences is particularly valuable because these sequences provide clear estimates of matrilineal and patrilineal genetic ancestries, respectively. Sequence variation assayed from your autosomes provides a more complex picture of your ancestry due to the presence of recombination each generation. Our autosomes also provide ancestry information, but the data require more complex analysis because of diploid inheritance and recombination.

Modern Humans Originated in Africa

The rapid growth of inexpensive DNA sequencing technologies that allow data to be collected from easily obtained samples such as saliva has brought into view dramatically the genetic ancestries and relationships of contemporary humans. The first insights came from the study of the matrilineally inherited mtDNA molecule. These investigations established that people representing populations in sub-Saharan Africa, southeast Asia, Europe, aboriginal Australians, and aboriginal New Guineans all shared a most recent common ancestral mtDNA from a female who lived no more than 200,000 years ago. The estimate of time came from a calibration of the rate of accumulation of mutations in mtDNA from samples with independent archeological or geological estimates of times since divergence. This rate has been found to be remarkably constant for many lineages and genomes, and it thus constitutes a molecular clock.

Scientists further concluded that a female carrying the MRCA for human mtDNA lived in Africa. The evidence for this statement is that African populations show much greater DNA sequence diversity than do populations in other parts of the world. That is, the branch point at which the lineages of the most different current-day Africans diverged occurred longer ago in history than the comparable branch point for any human populations found on other continents. Studies of the patrilineally inherited Y chromosome reached a very similar conclusion: African populations display much greater diversity in Y chromosome sequences than is found within or even between populations from other parts of the world. More recent surveys of autosomal genomes found that individuals carrying the MRCAs for biparentally inherited autosomal regions also lived in Africa.

These genetic investigations have revealed a remarkably consistent picture of human origins and the spread of humanity around the globe (Fig. 20.19). Modern humans all originated from a sub-Saharan African population that was established roughly 200,000 years ago. This population subsequently dispersed throughout Africa. Then, no later than 60,000 years ago, a subgroup of Africans left the continent and dispersed along a southern Asian route, followed by a more recent dispersal from Africa that settled in the Middle East. From these initial groups in Asia and the Middle East, people then spread out further to colonize the globe in several waves of migration.

The reason that populations outside of Africa have less genetic diversity than do African populations is that non-Africans all share a more recent common ancestor for all genomic regions than do Africans. Another way to think of this phenomenon is as a kind of founder effect, in which the subgroup that left Africa had only a subset of the genetic diversity that was found in Africa 80,000 to 60,000 years ago (Fig. 20.20).

DNA sequencing surveys reveal many striking and interesting stories about human ancestries. For example, one Y chromosome lineage that originated in Mongolia approximately 1000 years ago is now found in 8% of men throughout an extensive part of Asia, from the Caspian Sea to the Pacific Ocean; this lineage constitutes roughly 0.5% of the world's total male population today. Such rapid dispersal through such a large region could not have occurred by chance. Rather, it appears that the dissemination of this lineage accompanied the establishment of a massive land empire led by Genghis Khan and his male relatives. They slaughtered the males they encountered and fathered many children, whose descendants are now spread widely across southern Asia.
Figure 20.19 How modern humans populated the earth. A subgroup of humans left their ancestral home in Africa and moved across the Red Sea into southern Asia and then into Australia; subsequently other members of the subgroup migrated from Africa into the Middle East. People then spread to other parts of the globe via the major routes of migration shown. This map incorporates not only genetic data but also anthropological, cultural, language, and paleontological information. [Map regions: Africa, Middle East, Mediterranean, Europe, Central Asia, India, East Asia, Southeast Asia, Oceania, Australia, Americas; migration routes are labeled with dates from 60,000 to 10,000 years ago.]

Figure 20.20 Why the genetic diversity of humans in Africa is greater than that elsewhere in the world. Human migration out of Africa no later than 60,000 years ago was initiated by a small founder population that contained only a fraction of the genetic diversity present on the African continent at that time. Colors represent different alleles of a given genomic region. [Panels: early Homo sapiens sapiens in Africa, about 150,000 years ago; Homo sapiens sapiens colonizing southwest Asia, 60,000 years ago; Homo sapiens sapiens about 10,000 years ago.]
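The founder effect summarized in this caption can be sketched with a simple sampling experiment. The example below is illustrative only: the allele labels, their counts in the source population, and the founder group size are all hypothetical, and the helper names are invented here. It draws a small random set of gene copies from a diverse source population and compares the number of distinct alleles before and after.

```python
import random


def build_source_population():
    """Hypothetical pool of 1000 gene copies carrying 12 alleles of one region."""
    counts = [400, 250, 150, 80, 40, 30, 20, 10, 8, 6, 4, 2]
    copies = []
    for index, count in enumerate(counts):
        copies.extend([f"allele_{index}"] * count)
    return copies


def sample_founders(gene_copies, n_founder_copies, rng):
    """Draw the gene copies carried by a small founder group, without replacement."""
    return rng.sample(gene_copies, n_founder_copies)


def allele_count(gene_copies):
    """Number of distinct alleles present in a collection of gene copies."""
    return len(set(gene_copies))


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    source = build_source_population()
    founders = sample_founders(source, 20, rng)
    print("alleles in source population:", allele_count(source))
    print("alleles in founder group:", allele_count(founders))
```

The founder group can only carry alleles that were present in the source, and rare alleles are easily missed in a small sample, so descendant populations start with a subset of the original diversity.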
Modern Humans in Europe and Asia Interbred with Other Hominins

One long-running issue in anthropology has been the relationship between anatomically modern humans and fossil remains that are clearly humanlike hominins but also quite different in key elements of morphology. Of particular interest are the Neanderthals, whose fossils, dating to as recently as 30,000 years ago, have been found in caves across southern Europe and into central Asia. These specimens showed a very rugged physique with large-browed and heavily boned skulls (Fig. 20.21a). Were Neanderthals a completely separate lineage of the genus Homo that died out? Or, given that they coexisted, possibly for 10,000 or so years, with other populations more clearly related to our own, is it conceivable that Neanderthals bred with individuals from these other populations, so that some Neanderthal genes are found in current-day humans?

Figure 20.21 Neanderthals: Archaic humans. (a) Comparisons of full skeletons of a Neanderthal (left) and a modern human (right). (b) Artist's reconstruction of a Neanderthal face.

Neanderthal and Denisovan genomic DNA sequences

Recently, scientists have been able to address questions concerning the relationship between Homo sapiens and Neanderthals by comparing genome sequences of current-day humans with those obtained from particularly well-preserved Neanderthal skeletons such as that shown in Fig. 20.21a. For example, researchers remarkably have been able to sequence the full nuclear genome from a fragment of a Neanderthal femur (a leg bone) found in a cave in Croatia and dated to approximately 38,000 years ago. Several lines of evidence, including the finding that the sequence divergence between Neanderthal DNA and that of any present-day human is several-fold higher than that between any two present-day humans, resulted in the estimate that the hominin lineages leading to Neanderthals and Homo sapiens diverged between 500,000 and 800,000 years ago.

The DNA sequence obtained from this Neanderthal femur and from the bones of several additional Neanderthal individuals allows scientists to describe certain phenotypic characteristics of Neanderthals. In the reconstruction shown in Fig. 20.21b, the hair of this Neanderthal man is reddish and the skin color light. These traits are not just a figment of the artist's imagination, but are instead specific predictions from the sequences of the melanocortin 1 receptor gene (MC1R). Several Neanderthal specimens have a variant of this gene that is not found in modern humans, but that is nonetheless similar to an allele in some modern humans that encodes an MC1R protein that functions inefficiently. People with this allele of the MC1R gene have a higher-than-average probability of having red hair and fair skin.

The success of studying ancient samples of Neanderthal bone fragments obtained from caves in Europe motivated anthropologists to search for additional skeletal samples from Asia. The analysis of DNA isolated from a small bone fragment from the tip of a finger of a girl who lived and died in a cave in Siberia just north of Mongolia revealed an unexpected result. The pattern of DNA variation indicated that the individual represented a previously unknown type of hominin whose lineage separated from the Neanderthal lineage perhaps 600,000 years ago and from the modern human lineage roughly 800,000 years ago. Individuals in this lineage are called Denisovans; the fossil record indicates that the Denisovans died out roughly 30,000 years ago, during a period in which modern humans had already spread out through much of Asia.
Hominin interbreeding and human history

Given that modern humans coexisted with Neanderthals and Denisovans possibly for thousands of years, did our ancestors interbreed with these other hominins at least to some degree? Did the Neanderthals and Denisovans mate with each other before their lineages died out? Comparisons of DNA sequences from many modern humans, various nonhominin primates such as chimpanzees, multiple Neanderthals, and now a Denisovan allowed investigators to conclude that all of these types of interbreeding between hominin groups in fact occurred.

A simple statistical approach allows population geneticists to estimate ancestral interbreeding between genetically distinct lineages. They compare, for example, the sequence variants found at individual positions in the genomes of two different humans (H1 and H2) to those of a Neanderthal (N) and a chimpanzee (C). The scientists then focus their attention on sites where the Neanderthal does not match the chimpanzee (that is, derived variants that could not have been inherited from a common ancestor of all three lineages [H, N, and C]; this idea was previously illustrated in Fig. 10.6 on p. 345). If H1 and H2 differ significantly in the percentage of matches to the Neanderthal sequences at such sites, then some present-day humans but not others inherited variants from Neanderthals.
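The comparison just described can be sketched in a few lines. This is a simplified illustration of the counting step only, not the full statistical test used in published analyses: the four aligned sequences are invented toy data, the function name is hypothetical, and judging whether a difference is significant would require far more sites and a proper significance test.

```python
def neanderthal_match_fraction(human, neanderthal, chimp):
    """Fraction of Neanderthal-derived sites at which a human matches the Neanderthal.

    A site counts as Neanderthal-derived when the Neanderthal base differs
    from the chimpanzee base. All three sequences must be aligned and of
    equal length.
    """
    if not (len(human) == len(neanderthal) == len(chimp)):
        raise ValueError("sequences must be aligned to the same length")
    derived_sites = [
        i for i, (n, c) in enumerate(zip(neanderthal, chimp)) if n != c
    ]
    if not derived_sites:
        raise ValueError("no sites where the Neanderthal differs from the chimpanzee")
    matches = sum(1 for i in derived_sites if human[i] == neanderthal[i])
    return matches / len(derived_sites)


if __name__ == "__main__":
    # Toy aligned sequences (hypothetical, for illustration only).
    chimp = "AAAAAAAAAA"
    neanderthal = "ATAAGAACAA"   # differs from the chimpanzee at three sites
    human_1 = "ATAAAAAAAA"       # shares one Neanderthal-derived variant
    human_2 = "AAAAAAAAAA"       # shares none
    print(neanderthal_match_fraction(human_1, neanderthal, chimp))
    print(neanderthal_match_fraction(human_2, neanderthal, chimp))
```

A consistently higher fraction in one human than in another, measured over many sites, is the kind of signal the text describes as evidence that some present-day humans inherited variants from Neanderthals.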
Strikingly, roughly 2% of the variants found in Neanderthals but not chimpanzees are found in the genomes of contemporary humans living outside of Africa, while people indigenous to Africa show few to none of the Neanderthal-specific variants. Application of this test to Denisovan sequences shows, in contrast, that Denisovan ancestry in contemporary human populations is restricted largely to southeastern Asia and the south Pacific (Oceania). These insights into past interbreeding as well as inferences of the relatedness of DNA sequences of contemporary and ancient human lineages (using the logic of "descent with modification" illustrated in Fig. 20.18b) suggest a hypothetical timetable for the evolution of modern humans illustrated in Fig. 20.22. Gene exchange between Neanderthals and the lineage of modern humans occurred in the Middle East and in Europe 85,000-30,000 years ago, soon after Homo sapiens emerged from Africa. From there, Neanderthal DNA variants accompanied human migrants as they spread into the rest of the world outside of Africa. Interbreeding also occurred during roughly the same timeframe in Asia between the ancestors of modern humans and the Denisovans already indigenous there. DNA variants obtained from Denisovans then spread with migrant groups of modern humans to Southeast Asia and Oceania, where these variants are still found in current-day populations.

Figure 20.22 Hypothetical evolutionary relationships of modern humans, Neanderthals, and Denisovans inferred from their genomic DNA sequences. Overlaps between branches of different colors indicate gene transfer through interbreeding of different hominins. [Figure labels: Africa, Europe, Asia, Oceania.]

essential concepts
• Going backwards in time, genetic lineages coalesce to most recent common ancestors. Examination of DNA sequences using the logic of "descent with modifications" allows scientists to make inferences about the history of lineages. The fact that the diversity of DNA sequences among Africans is greater than that found among non-Africans suggests that modern humans first evolved in Africa, and that a subset of Africans later emigrated to populate the rest of the earth.
• Certain, but not all, populations of modern humans share alleles with Neanderthals and Denisovans, indicating some degree of interbreeding between these lineages after the emergence of Homo sapiens from Africa.

Solved Problems

I. A population called the "founder generation," consisting of 2000 AA individuals, 2000 Aa individuals, and 6000 aa individuals is established on a remote island. Mating within this population occurs at random, the three genotypes are selectively neutral, and mutations occur at a negligible rate.
a. What are the frequencies of alleles A and a in the founder generation?
b. Is the founder generation at Hardy-Weinberg equilibrium?
c. What is the frequency of the A allele in the second generation (that is, the generation subsequent to the founder generation)?
d. What are the frequencies for the AA, Aa, and aa genotypes in the second generation?
e. Is the second generation at Hardy-Weinberg equilibrium?
f. What are the frequencies for the AA, Aa, and aa genotypes in the third generation?
Answer
This question requires calculation of allele and genotype frequencies and an understanding of the Hardy-Weinberg equilibrium principle.
a. To calculate allele frequencies, count the total alleles represented in individuals with each genotype and divide by the total number of alleles.

Genotype   Number of individuals   Number of A alleles   Number of a alleles
AA         2000                    4000                  0
Aa         2000                    2000                  2000
aa         6000                    0                     12,000
Total      10,000                  6000                  14,000

The frequency of the A allele (p) = 6000/20,000 = 0.3. The frequency of the a allele (q) = 14,000/20,000 = 0.7.
b. If a population is at Hardy-Weinberg equilibrium, the genotype frequencies are p², 2pq, and q². We calculated in part a that p = 0.3 and q = 0.7 in this population. Therefore,

p² = (0.3)² = 0.09
2pq = 2(0.3)(0.7) = 0.42
q² = (0.7)² = 0.49

For a population of 10,000 individuals, the number of individuals with each genotype, if the population were at equilibrium and the allele frequencies were p = 0.3 and q = 0.7, would be AA, 900; Aa, 4200; and aa, 4900. The founder population described therefore is not at equilibrium.
c. Given the conditions of random mating, selectively neutral alleles, and no new mutations, allele frequencies do not change from one generation to the next; p = 0.3 and q = 0.7.
d. The genotype frequencies for the second generation would be those calculated for part b because in one generation the population will go to equilibrium: AA = p² = 0.09; Aa = 2pq = 0.42; and aa = q² = 0.49.
e. Yes, in one generation a population not at equilibrium will go to equilibrium if mating is random and there is no selection or significant mutation.
f. The genotype frequencies will be the same in the third generation as in the second generation.

II. Two alleles have been found at the X-linked phosphoglucomutase gene (Pgm) in a Drosophila persimilis population in California. The frequency of the PgmA allele is 0.25, while the frequency of the PgmB allele is 0.75. Assuming the population is at Hardy-Weinberg equilibrium, what are the expected genotype frequencies in males and females?

Answer
This problem requires application of the concept of allele and genotype frequencies to X-linked genes. For X-linked genes, males (XY) have only one copy of the X chromosome, so the genotype frequency is equal to the allele frequency. Therefore, p = 0.25 and q = 0.75. The frequency of male flies with genotype X^PgmA Y is 0.25; the frequency of males with genotype X^PgmB Y is 0.75. Three genotypes exist for females, X^PgmA X^PgmA, X^PgmA X^PgmB, and X^PgmB X^PgmB, corresponding to p², 2pq, and q². The frequencies of female flies with these three genotypes are (0.25)², 2(0.25)(0.75), and (0.75)²; or 0.0625, 0.375, and 0.5625, respectively.

III. Three different genes (red, blue, and green) each have two alleles; one allele for each gene has deleterious recessive effects. The relative fitness of the recessive homozygotes for one of these genes is 0.9, for another it is 0.8, and for the third it is 0.7. The following graph depicts changes in the frequencies of these alleles in populations of infinite size over time; in each case, the frequency of the allele in question (q) is 0.7 at the beginning of the experiment.

[Graph: frequency of the recessive allele (q, starting at 0.7) versus generation (0 to 250) for the red, blue, and green genes.]

a. For each gene (identified by the fitness of the recessive homozygotes), calculate Δq between the parental generation and the first generation of progeny (q = frequency of the recessive deleterious allele). Assume that the relative fitnesses of heterozygotes and of homozygotes for the dominant allele are 1.0. Then calculate q′ (the frequency of the recessive allele in the first generation of progeny). You do not yet need to know which gene corresponds to which color in the graph.
b. Which of the three genes (blue, red, or green) is the one for which the relative fitness of the recessive homozygote is 0.9? 0.8? 0.7?
c. Briefly explain why Δq will become a smaller negative number in each successive generation for all of these three genes.
d. Briefly explain why q would never go to 0 in any of these populations of infinite size.
e. Alleles do disappear from real populations, but not in the populations examined in the graph. How can this be?

Answer
a. For one of the genes (which we will call gene A for the time being), W_AA = 1.0; W_Aa = 1.0; W_aa = 0.9. For gene B, W_BB = 1.0; W_Bb = 1.0; W_bb = 0.8. For the remaining gene C, W_CC = 1.0; W_Cc = 1.0; W_cc = 0.7. (That is, the deleterious effects involved are all completely recessive.) To make the desired calculations, you would plug these fitness values into Eq. (20.6). For gene A, Δq = −0.0154 and q′ = 0.684. For gene B, Δq = −0.0326 and q′ = 0.667. For gene C, Δq = −0.0517 and q′ = 0.648.
b. For the green gene, the change in q is steeper than for the other genes, meaning that Δq must be the highest negative number in any given generation. Thus, the recessive homozygote fitness for the green gene must be 0.7 (this is gene C), that for the red gene (B) is 0.8, and that for the blue gene (A) must be 0.9.
c. The rate of change decreases for each gene in every successive generation because each generation has a successively smaller proportion of homozygotes for the recessive allele who would be subject to selection.
d. The populations of infinite size would always have heterozygous individuals who would retain the deleterious allele but would not be selected against.
e. Real populations are not infinite in size. Thus, genetic drift would eventually cause the deleterious allele to be lost from the population.

Problems

Vocabulary
1. Choose the best matching phrase in the right column for each of the terms in the left column.

a. fitness                    1. the genotype with the highest fitness is the heterozygote
b. gene pool                  2. chance fluctuations in allele frequency
c. fitness cost               3. mutations accumulate at a relatively constant rate
d. allele frequencies         4. p = 1.0
e. heterozygote advantage     5. ability to survive and reproduce
f. equilibrium frequency      6. p and q
g. genetic drift              7. event that drastically lowers N
h. molecular clock            8. the advantage of a particular genotype in one situation is a disadvantage in another situation
i. population bottleneck      9. frequency of an allele at which Δq = 0
j. hominins                   10. collection of alleles carried by all members of a population
k. fixation                   11. Homo sapiens, Neanderthals, and Denisovans

Section 20.1
2. When an allele is dominant, why does it not always increase in frequency to produce the phenotype proportion of 3:1 (3/4 dominant : 1/4 recessive individuals) in a population?
3. A population with an allele frequency (p) of 0.5 and a genotype frequency (p²) of 0.25 is at equilibrium. How can you explain the fact that a population with an allele frequency (p) of 0.1 and a genotype frequency (p²) of 0.01 is also at equilibrium?
4. In a certain population of frogs, 120 are green, 60 are brownish green, and 20 are brown. The allele for brown is denoted GB, while that for green is GG, and these two alleles show incomplete dominance relative to each other.
a. What are the genotype frequencies in the population?
b. What are the allele frequencies of GB and GG in this population?
c. What are the expected frequencies of the genotypes if the population is at Hardy-Weinberg equilibrium?
5. Which of the following populations are at Hardy-Weinberg equilibrium?

Population   AA     Aa     aa
a            0.25   0.50   0.25
b            0.10   0.74   0.16
c            0.64   0.27   0.09
d            0.46   0.50   0.04
e            0.81   0.18   0.01
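The Hardy-Weinberg bookkeeping that recurs throughout these problems can be sketched in Python. This is a minimal illustration under stated assumptions: the function names `allele_freqs` and `hw_expected_counts` are hypothetical, and the genotype counts are those of the founder generation in Solved Problem I.

```python
def allele_freqs(n_AA, n_Aa, n_aa):
    """Return (p, q), the frequencies of alleles A and a, from genotype counts."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    p = (2 * n_AA + n_Aa) / total_alleles
    return p, 1 - p


def hw_expected_counts(n_AA, n_Aa, n_aa):
    """Expected AA, Aa, and aa counts if the population were at equilibrium."""
    n = n_AA + n_Aa + n_aa
    p, q = allele_freqs(n_AA, n_Aa, n_aa)
    return p * p * n, 2 * p * q * n, q * q * n


# Founder generation of Solved Problem I: 2000 AA, 2000 Aa, 6000 aa.
p, q = allele_freqs(2000, 2000, 6000)
expected = hw_expected_counts(2000, 2000, 6000)
print(round(p, 3), round(q, 3))       # 0.3 0.7
print([round(x) for x in expected])   # [900, 4200, 4900]
```

Setting the expected counts (900, 4200, 4900) beside the observed counts (2000, 2000, 6000) shows that the founder generation is not at equilibrium; a formal comparison would typically use a chi-square test.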
6. A dominant mutation in Drosophila called Delta causes changes in wing morphology in Delta/+ heterozygotes. Homozygosity for this mutation (Delta/Delta) is lethal. In a population of 150 flies, it was determined that 60 had normal wings and 90 had abnormal wings.
a. What are the allele frequencies in this population?
b. Using the allele frequencies calculated in part (a), how many total zygotes must be produced by this population in order for you to count 160 viable adults in the next generation?
c. Given that there is random mating, no migration, and no mutation, and ignoring the effects of genetic drift, what are the expected numbers of the different genotypes in the next generation if 160 viable offspring of the population in part (a) are counted?
d. Is this next generation at Hardy-Weinberg equilibrium? Why or why not?
7. A large, random mating population is started with the following proportion of individuals for the indicated blood types:

0.5 MM

This blood type gene is autosomal, and the M and N alleles are codominant.
a. Is this population at Hardy-Weinberg equilibrium?
b. What will be the allele and genotype frequencies after one generation under the conditions assumed for the Hardy-Weinberg equilibrium?
c. What will be the allele and genotype frequencies after two generations under the conditions assumed for the Hardy-Weinberg equilibrium?

8. A gene called Q has two alleles, QF and QG, that encode alternative forms of a red blood cell protein that allows blood group typing. A different, independently segregating gene called R has two alleles, RC and RD, permitting a different kind of blood group typing. A random, representative population of football fans was examined, and on the basis of their blood typing, the following distribution of genotypes was inferred (all genotypes were equally distributed between males and females):

QFQF RCRC   202
QFQG RCRC   101
QGQG RCRC   101
QFQF RCRD   372
QFQG RCRD   186
QGQG RCRD   186
QFQF RDRD   166
QFQG RDRD   83
QGQG RDRD   83

This sample contains 1480 fans.
a. Is the population at Hardy-Weinberg equilibrium with respect to either or both of the Q and R genes?
b. After one generation of random mating within this group, what fraction of the next generation of football fans will be QFQG (independent of their R genotype)?
c. After one generation of random mating, what fraction of the next generation of football fans will be RDRD (independent of their Q genotype)?
d. What is the chance that the first child of a QFQG RCRD female and a QFQF RCRD male will be a QFQG RDRD male?

9. Alkaptonuria is a recessive autosomal genetic disorder associated with darkening of the urine. In the United States, approximately one out of every 250,000 people have alkaptonuria.
a. Assuming Hardy-Weinberg equilibrium, estimate the frequency of the allele responsible for this trait.
b. What proportion of people in the U.S. population are carriers for this trait? In this population, what is the ratio of carriers to individuals affected by alkaptonuria?
c. If a woman without alkaptonuria had a child with this trait with one husband then remarried, what is the chance that a child produced by her second marriage would have alkaptonuria?
d. Alkaptonuria is a relatively benign condition, so there is little selective advantage to individuals with any genotype; as a result, your assumption of Hardy-Weinberg equilibrium in part (a) is reasonable. Could you also use the assumption of Hardy-Weinberg equilibrium to estimate the allele frequencies and carrier frequencies of more severe recessive autosomal conditions such as cystic fibrosis? Explain.
10. Two hypothetical lizard populations found on opposite sides of a mountain in the Arizonan desert have two alleles (AF, AS) of a single gene A with the following three genotype frequencies:

               AFAF   AFAS   ASAS
Population 1   38     44     18
Population 2   0      80     20

a. What is the allele frequency of AF in the two populations?
b. Do either of the two populations appear to be at Hardy-Weinberg equilibrium?
c. A huge flood opened a canyon in the mountain range separating populations 1 and 2. They were then able to migrate such that the two populations, which were of equal size, mixed completely and mated at random. What are the frequencies of the three genotypes (AFAF, AFAS, and ASAS) in the next generation of the single new population of lizards?

11. It is the year 1998, and the men and women sailors (in equal numbers) on the American ship the Medischol Bounty have mutinied in the South Pacific and settled on the island of Bali Hai, where they have come into contact with the local Polynesian population. Of the 400 sailors that come ashore on the island, 324 have MM blood type, 4 have the NN blood type, and 72 have the MN blood type. Already on the island are 600 Polynesians between the ages of 19 and 23. In the Polynesian population, the allele frequency of the M allele is 0.06, and the allele frequency of the N allele is 0.94. No other people come to the island over the next 10 years.
a. What is the allele frequency of the N allele in the sailor population that mutinied?
b. It is the year 2008, and 1000 children have been born on the island of Bali Hai. If the mixed population of 1000 young people on the island in 1998 mated randomly and the different blood group phenotypes had no effect on viability, how many of the 1000 children would you expect to have MN blood type?
c. In fact, 50 children have MM blood type, 850 have MN blood type, and 100 have NN blood type. What is the observed frequency of the N allele among the children?
12. a. Alleles of genes on the X chromosome can also be at equilibrium, but the equilibrium frequencies under the Hardy-Weinberg assumptions must be calculated separately for the two sexes. For a gene with two alleles A and a at frequencies of p and q, respectively, write expressions that describe the equilibrium frequencies for all the genotypes in men and women.
b. Approximately 1 in 10,000 males in the United States is afflicted with hemophilia, an X-linked recessive condition. If you assume that the population is at Hardy-Weinberg equilibrium, what proportion of American females would be hemophiliacs? About how many female hemophiliacs would you expect to find among the 100 million women living in the United States?

13. In 1927, the ophthalmologist George Waaler tested 9049 schoolboys in Oslo, Norway for red-green colorblindness and found 8324 of them to be normal and 725 to be colorblind. He also tested 9072 schoolgirls and found 9032 that had normal color vision while 40 were colorblind.
a. Assuming that the same sex-linked recessive allele c causes all forms of red-green colorblindness, calculate the allele frequencies of c and C (the allele for normal vision) from the data for the schoolboys. (Note: Refer to your answer to Problem 12a above.)
b. Does Waaler's sample demonstrate Hardy-Weinberg equilibrium for this gene? Explain your answer by describing observations that are either consistent or inconsistent with this hypothesis.
On closer analysis of these schoolchildren, Waaler found that there was actually more than one c allele causing colorblindness in his sample: one kind for the "prot" type (cp) and one for the "deuter" type (cd). (Protanopia and deuteranopia are slightly different forms of red-green colorblindness.) Importantly, some of the "normal" females in Waaler's studies were probably of genotype cp/cd. Through further analysis of the 40 colorblind females, he found that 3 were prot (cp/cp), and 37 were deuter (cd/cd).
c. Based on this new information, what is the frequency of the cp, cd, and C alleles in the population examined by Waaler? Calculate these values as if the frequencies obey the Hardy-Weinberg equilibrium. (Note: Refer to your answer to Problem 12a.)
d. Calculate the frequencies of all genotypes expected among men and women if the population is at equilibrium.
e. Do these results make it more likely or less likely that the population in Oslo is indeed at equilibrium for red-green colorblindness? Explain your reasoning.
14. The equation p² + 2pq + q² = 1 representing the Hardy-Weinberg proportions examines genes with only two alleles in a population.
a. Derive a similar equation describing the equilibrium proportions of genotypes for a gene with three alleles. [Hint: Remember that the Hardy-Weinberg equation can be written as the binomial expansion (p + q)².]
b. A single gene with three alleles (IA, IB, and i) is responsible for the ABO blood groups. Individuals with blood type A can be either IAIA or IAi; those with blood type B can be either IBIB or IBi; people with AB blood are IAIB, and type O individuals are ii. Among Armenians, the frequency of IA is 0.360, the frequency of IB is 0.104, and the frequency of i is 0.536. Calculate the frequencies of individuals in this population with the four possible blood types, assuming Hardy-Weinberg equilibrium.
In Problems 15-17, you will see that because mating between individuals within populations at Hardy-Weinberg equilibrium is random, it is possible to predict mating frequencies: that is, the proportion of all matings in the population between individuals of particular genotypes or phenotypes.

15. A gene has two alleles A (frequency = p) and a (frequency = q). If a population is at Hardy-Weinberg equilibrium, develop mathematical expressions in terms of p and q that predict the following mating frequencies:
a. Between two AA homozygotes
b. Between two aa homozygotes
c. Between two Aa heterozygotes
d. Between an AA homozygote and an aa homozygote
e. Between an AA homozygote and an Aa heterozygote
f. Between an aa homozygote and an Aa heterozygote
Considering your answers to parts (a)-(f):
g. Do the six possibilities listed account for all possible matings? How would you know whether this is true mathematically? Demonstrate this latter point by setting p equal to an arbitrary number between 0 and 1 such as 0.2.
h. Can you develop a simple, general rule for calculating the mating frequencies between individuals of the same genotype versus the mating frequencies between individuals of different genotypes?
i. If the population is equally divided between males and females, what proportion of all matings will be between an AA male and an AA female? Between an AA male and an Aa female? Between an AA male and an aa female?

16. Some people can taste the bitter compound phenylthiocarbamide while others cannot. This trait is governed by a single autosomal gene; the allele for tasting is completely dominant with respect to the allele for nontasting. Among 1707 Hawaiians tested for the ability to taste, 1326 tasters were found. Assuming that the population is at Hardy-Weinberg equilibrium for this gene and that mating is purely random:
a. What are the allele frequencies for the tasting allele [T = (p)] and for the nontasting allele [t = (q)]?
b. What are the genotype frequencies in the population?
c. Of all the matings in the population, what proportion will be between two nontasters?
d. Of all the matings in the population, what proportion will be between a taster and a nontaster?
e. Of all the matings in the population, what proportion will be between a taster male and a nontaster female?
f. What proportion of all of the progeny produced by all matings between a taster male and a nontaster female will be nontasters?
g. Of all the matings in the population, what proportion will be between two tasters?

17. Androgenetic alopecia (pattern baldness) is a complex trait in humans governed by several genes, but suppose a human population exists in which a single autosomal mutation determines pattern baldness. This mutation behaves as a dominant in males and as a recessive in females. This population is in Hardy-Weinberg equilibrium, and 51% of the men are bald.
a. What is the allele frequency of the mutant allele for baldness among males?
b. What is the allele frequency of the baldness allele among females?
c. What percentage of the women in this population will exhibit pattern baldness?
d. Assuming random mating, what proportion of all matings should be between a bald man and a nonbald woman?
e. What percentage of the bald men in the population are heterozygotes?
f. If a nonbald couple produces a bald son, what is the probability that their next son will be bald?
g. A woman with androgenetic alopecia has a daughter, but nothing is known about the father. What is the probability that the daughter will be bald?

Section 20.2
18. Why is the elimination of a fully recessive deleterious allele by natural selection difficult in a large population and less so in a small population?

19. Tristan da Cunha is a group of small islands in the middle of the Atlantic Ocean. In 1814, a group of 15 British colonists founded a settlement on these islands. In 1885, 15 of the 19 males on the island were lost in a shipwreck. In the late 1960s, four cases of retinitis pigmentosa, which progressively leads to blindness, were found among the 240 descendants of these settlers remaining on the island. The frequency of retinitis pigmentosa in Britain is about 1 in 6000. Explain the high incidence of this disease on Tristan da Cunha relative to that seen in Britain.

20. Small population size causes genetic drift because of chance sampling of different frequencies of alleles from one generation to the next. We can predict how much genetic drift occurs for a given population size using binomial sampling statistics. With a population of size N, we can estimate that 95% of the time the allele frequency (p) in the next generation will be within the confidence interval of

p ± 1.96 √[p(1 − p)/2N]

where p(1 − p)/2N is an estimate of the statistical variance in allele frequencies from one generation to the next with random sampling of 2N alleles each generation.
a. What is the confidence interval for p = 0.5 when N = 100,000?
b. What is the confidence interval for p = 0.5 when N = 10?
c. How are the results in parts (a) and (b) related to the consequences of a population bottleneck?

21. Three basic predictions underlie genetic drift in populations: (1) As long as the population size is finite, some level of genetic drift will occur; thus, without new mutations, all variation will either drift to fixation or loss. (2) Drift happens faster in small populations than in
large populations. (3) The probability that an allele is fixed (goes to a frequency of 1.0) is equal to its initial frequency (p) in the population, while its probability of loss from the population due to drift is equal to 1 − p. Given these three predictions:
a. What is the allele frequency of a new autosomal mutation immediately after it occurs in a diploid population of size N = 100,000?
b. What is the allele frequency of a new autosomal mutation immediately after it occurs in a diploid population of size N = 10?
c. In which population does the new mutation have a higher probability of going to fixation by chance with genetic drift?

22. A mouse mutation with incomplete dominance (t = tailless) causes short tails in heterozygotes (t+/t). The same mutation acts as a recessive lethal that causes homozygotes (t/t) to die in utero. In a population consisting of 150 mice, 60 are t+/t+ and 90 are heterozygotes.
a. What are the allele frequencies in this population?
b. Given that there is random mating among mice, no migration, and no mutation, and ignoring the effects of random genetic drift, what are the expected numbers of the different genotypes in this next generation if 200 offspring are born?
c. Two populations (called Pop 1 and Pop 2) of mice come into contact and interbreed randomly. These populations initially are composed of the following numbers of wild-type (t+/t+) homozygotes and tailless (t+/t) heterozygotes.

         Wild type   Tailless
Pop 1    16          48
Pop 2    48          36

What are the frequencies of the two genotypes in the next generation?

23. In Drosophila, the vestigial wings recessive allele, vg, causes the wings to be very small. A geneticist crossed some true-breeding wild-type males to some vestigial virgin females. The male and female F1 flies were wild type. He then allowed the F1 flies to mate and found that 1/4 of the male and female F2 flies had vestigial wings. He dumped the vestigial F2 flies into a morgue and allowed the wild-type F2 flies to mate and produce an F3 generation.
a. Give the genotype and allele frequencies among the wild-type F2 flies.
b. What will be the frequencies of wild-type and vestigial flies in the F3?
c. Assuming the geneticist repeated the selection against the vestigial F3 flies (that is, he dumped them in a morgue and allowed the wild-type F3 flies to mate at random), what will be the frequency of the wild-type allele and mutant alleles in the F4 generation?
d. Now the geneticist lets all of the F4 flies mate at random (that is, both wild-type and vestigial flies mate). What will be the frequencies of wild-type and vestigial F5 flies?

24. In a population of infinite size, three loci A, B, and C have two alleles each. Alleles A1, B1, and C1 are found in 1% of the population at a particular point in time, and each has beneficial effects on the organisms' fitness as compared to the other allele of that locus (A2, B2, and C2, respectively). The relative fitnesses of the three possible genotypes at each of these loci are:

W_A1A1 = 1.00   W_A1A2 = 0.99   W_A2A2 = 0.99
W_B1B1 = 1.00   W_B1B2 = 1.00   W_B2B2 = 0.90
W_C1C1 = 1.00   W_C1C2 = 1.00   W_C2C2 = 0.99

The frequencies of alleles A1, B1, and C1 over thousands of generations are shown in the following graph:

[Graph: allele frequency (axis from 0 to 1.2) versus generation (0 to 12,000) for the blue, red, and green lines.]

a. Which line (blue, red, or green) corresponds to A1? B1? C1?
b. Why does the allele represented by the red line go to fixation more quickly than that represented by the green line?
c. Why does the allele represented by the blue line go to fixation more slowly than the alleles represented by either the red or green lines?
d. Suppose the population only had 1000 individuals. Discuss how this change in population size might affect the shapes of the three lines.
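The one-generation selection calculation that underlies Solved Problem III and Problem 24 can be sketched in Python. The sketch uses the standard recursion for one locus with two alleles under random mating in an infinite population; that this matches the exact form of Eq. (20.6) is an assumption, and the function name `next_q` is hypothetical.

```python
def next_q(q, w_AA, w_Aa, w_aa):
    """Frequency of allele a after one generation of selection.

    Assumes random mating, an infinite population, and no mutation.
    """
    p = 1.0 - q
    # Mean fitness: Hardy-Weinberg genotype frequencies weighted by fitness.
    mean_w = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    # Share of the surviving gene pool carried as allele a.
    return (p * q * w_Aa + q * q * w_aa) / mean_w


# Solved Problem III: q starts at 0.7; only the recessive homozygote is less fit.
q0 = 0.7
for w_aa in (0.9, 0.8, 0.7):
    q1 = next_q(q0, 1.0, 1.0, w_aa)
    print(w_aa, round(q1 - q0, 4), round(q1, 3))
```

The printed changes in q and new frequencies agree with the values quoted in the answer to Solved Problem III to within rounding of the last digit. Calling `next_q` repeatedly on its own output traces curves like those in the graphs, with each step smaller than the one before.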
692
Chapter
20
Variation and Selection in Populations
25. You have identified an autosomal gene that contributes to tail size in male guppies, with a dominant allele B for large tails and a recessive allele b for small tails. Female guppies of all genotypes have similar tail sizes. You know that female guppies usually mate with males with the largest tails, but the effects of population density and the ratio of the sexes on this
has an advantage in a region where malaria is prevalent. Will the equilibrium frequency (q,)be the same for an African and a North American country? What factors affect q"?
28. Explain why evolutionary biologists monitor selectively neutral polymorphisms
as
molecular clocks.
preference have not been studied. You therefore place
an equal number of males in three tanks. In tank 1, the number of females is twice the number of males. In tank two, the numbers of males and females are equal. In tank 3, there are half as many females as males. After mating, you find the following proportions of small-tailed males among the progeny: tank 1, 16%; tank 2,25o/o; I"ank3,30o/o. a. In your original population, 25o/o of the males have small tails. Assuming that the allele frequencies in males and females are the same, calculate the frequencies of B and b in your original population. b. Calculate Aq for each tank. c. If Ws: 1.0, what is WB6 for each tank? d. If WBB: 1.0, is 14166 less than, equal to, or greater than 1.0 for each tank?
26. In Europe, the frequency of the CF- allele causing the recessive autosomal disease cystic fibrosis is about 0.04.
Cystic fibrosis causes death before reproduction in virtually all cases. a. Determine values of relative fitness (W) for the unaffected, carrier, and affected genotypes. Assume
that no selective advantage is associated with het-
b.
c.
erozygosity for the disease allele. Given your answer to part (a), determine the average (mean) fitness at birth of the population as a whole with respect to the cystic fibrosis trait (W) and the expected change in allele frequency over one generation (A4) when measured at the birth of the next generation. Suppose the European population is at equilibrium
for the frequency of the CF- allele because some heterozygote advantage exists. Recalculate the rela-
tive fitness values for the three genotypes under this assumption. d. The CFTR protein encoded by the CF gene is a chloride ion channel. People suffering from cholera have diarrhea that pumps water and chloride ions out of the small intestine. Use these facts to explain why a heterozygote advantage might in fact exist for the CF gene.
Section 20.3 29. What is the most straightforward evidence at the molecular level in support of the idea that modern humans first appeared in Africa?
30.
In March 2013, the American lournal of Human Genetics published a report that an African-American man who submitted his genome for commercial genealogical analysis had a Y chromosome whose sequence was very different from that of other Y chromosomes that had previously been characterized. The investigators then found that certain males among the Mbo (an ethnic group in Cameroon) shared many of the polymorphisms first found in this African-American man. How do you think these findings would have altered estimates of when a man carrying the MRCA for the human Y chromosome would have lived on the earth?
31. If you go back 40 generations into your biological ancestry: a. How many ancestors are you predicted to have?
b. How could you reconcile that prediction with the fact that the world's population of humans is now roughly 7 billion people?
32. In Fig. 20.17 on p. 680, to what part of the world does Dion's mitochondrial DNA recently trace? Do his Y chromosome and autosomal chromosomes 1-22 also trace back recently to this same region of the world? Why or why not?
27. An allele of the G6PD gene acts in a recessive manner to cause sensitivity to fava beans, resulting in a hemolytic reaction (lysis of red blood cells) after ingestion of the beans. The same allele also confers dominant resistance to malaria. The heterozygote
33. Predict the DNA sequences at the four nodes (branching points) on the cladogram in Fig. 20.18b (p. 681).
34. A cladogram (not drawn to scale) for the taxonomic family Hominidae is presented on the next page. The numbers 1-10 represent evolutionary lineages or events. The letters A-F represent entries from the following list:
Homo neanderthalensis
Pan troglodytes (chimpanzees)
Homo sapiens (African Bantu)
Homo sapiens (European Danish)
Homo sapiens (Native American Hopi)
Homo sapiens (Asian Uighurs)
a. Match the entries in the preceding list with an appropriate letter from the cladogram. Two of the groups in the list are equivalent on this diagram; either possibility is correct.
b. One evolutionary divergence is indicated with a small arrow. Describe this divergence and estimate how many years ago it occurred based on Fig. 20.19 on p. 683.
c. Six SNPs (α, β, γ, δ, ε, and ζ) are sequenced in several individuals in each of the six groups; the allele frequencies are given in the table that follows. At the bottom of each column in the table, write a number from 1 to 10 (corresponding to a red number on the figure) that indicates where along the cladogram a mutation occurred that changed the allele in the common ancestor of all humans and chimps to a derived allele. One blank can be filled by either of two numbers; you only need to show one. Also indicate (in the last row at the bottom) the identity of the allele found in the common ancestor, or write "can't tell."
[Cladogram: family Hominidae, with lineages or events numbered 1-10 and groups lettered A-F.]
[Table: allele frequencies of SNPs α through ζ in P. troglodytes, H. neanderthalensis, African Bantu, European Danes, Native American (Hopi), and Asian Uighurs, followed by blank rows labeled "Number (1-10)" and "Ancestral allele".]
35. As noted in Fig. 20.22 on p. 685, humans now living in Oceania (e.g., Melanesia, Micronesia, Polynesia, and Australia) represent an early offshoot in the spread of humans around the world from an origin in Africa. The width of the branches in Fig. 20.22 indicates the predicted size of the population in each of the major regions of the world (Africa, Europe, Asia, and Oceania).
a. Given these population sizes and histories, would you expect to see more, or less, genetic variation in population samples of humans from Oceania compared to Africa? Explain.
b. Explain why variants found in Denisovan DNA are found only in modern humans living in Southeast Asia and Oceania but nowhere else on earth.
Chapter 21: Genetics of Complex Traits
[Photo caption] Artificial selection by dog breeders has led to dramatic differences in size among dog breeds. Population geneticists have found that only six genes are responsible for more than half the variation in sizes among dog breeds.

chapter outline
21.1 Heritability: Genetic Versus Environmental Influences on Complex Traits
21.2 Mapping Quantitative Trait Loci (QTLs)

TODAY (IN 2013), the cost of sequencing a whole human genome is under $5000; within a few years, the cost will undoubtedly be under $1000. At such prices, how worthwhile would it be to you to obtain this information for your own genome or for that of a fetus conceived by you and your partner? The answer will be different for different people, but for everyone, a major component in weighing the costs and benefits is the degree to which genomic sequence data can be interpreted as predictions about specific phenotypes. We have already seen that whole-genome sequences will reveal with near certainty whether an individual is a carrier or will be afflicted by many Mendelian conditions such as sickle-cell anemia or cystic fibrosis, where the trait is governed by alleles of a single gene and the penetrance is essentially complete. However, the thousands of dollars you spend on your (or your child's) whole-genome sequence will provide, at least in the near future, almost no clue about many other traits such as intelligence or personality. The reason is that these are complex traits influenced by many factors, including multiple genes, interactions between alleles of different genes, variations in the environment, and interactions between genes and the environment. The height of adult humans is one such complex trait. Tall parents tend to have
tall children, suggesting a genetic contribution to height. Scientists have recently established that hundreds of genes influence human height, and many of these genes have not yet been identified. Excepting special cases such as the mutation causing achondroplasia (dwarfism), the contribution of any one particular polymorphism to height is so small as to have virtually no predictive power. Another reason why genotypic information cannot easily anticipate adult height is that a key environmental factor, nutrition, has a strong influence on this phenotype. Figure 21.1 shows that in many different populations in Europe, average height increased dramatically during the twentieth century. This period of time is so short that it is improbable that allele frequencies underwent significant changes; instead, the changes almost certainly represent improvements in diet.

Figure 21.1 Adult male height by birth cohort for various European populations. These graphs illustrate that height in humans is influenced both by genes (because the average heights of different populations vary) and the environment (because improvements in nutrition in the last 150 years have resulted in taller people on average).
[Graph: height (axis ticks from 160 to 185) plotted against years (1850 to 2000) for Italy, France, Belgium, Sweden, and G. Britain.]

In this chapter, we begin our discussion of complex traits by considering how scientists can distinguish between the contributions of genes and those of the environment to a given trait. We then focus on two different methods to identify the specific genes, usually referred to as quantitative trait loci (QTLs), that contribute to these complex traits. Parsing out the many factors responsible for such phenotypes will be an increasingly important goal of genetics in the future, for both theoretical and practical reasons. Research into complex traits will help us understand the causes of, and help develop treatments for, complicated degenerative conditions such as arthritis and coronary disease. Improvement of agriculturally valuable plants and animals will require us to track the factors that control complex traits including yield, drought and heat resistance, and nutritional value. A central theme of this chapter is that studies of complex traits require scientists to compare individuals within a large population.
The conclusions from this type of analysis apply only to the specific population living in the specific environment currently under investigation, so they cannot be generalized to other populations or conditions. Animal height provides a clear example of this theme. We have just seen that genotype is a poor predictor of height in humans, in part because hundreds of genes, many of which have not yet been identified, contribute to this phenotype. However, just a handful of genes (six or fewer) determine a large fraction of the variation in height of domesticated mammalian species, such as dogs and horses. As you will see, the reason for the species differences in the factors governing this complex trait is that past genetic history and future genetic destiny are inextricably intertwined.
21.1 Heritability: Genetic Versus Environmental Influences on Complex Traits

learning objectives
1. Summarize the meaning of the variance of a phenotype.
2. Describe experiments that would allow you to distinguish genetic and environmental influences on the variance of a phenotype.
3. Explain why the term heritability applies to populations and not to individuals.
4. Describe how scientists obtain information about heritability in human populations by studying monozygotic and dizygotic twins.
5. Diagram how plant breeders use truncation selection to improve agricultural crops.

Many complex traits are quantitative traits for which the phenotype can be measured over a range of numbers called phenotypic values or trait values. Many such traits show a roughly bell-shaped "normal" distribution of phenotypic values in populations. Human height provides a good example (Fig. 21.2). A few individuals are either very tall or very short, but most people have heights that are clustered around the average for the group. The apparently continuous, bell-shaped distribution of height is shaped by the contributions of both genes and the environment. As we saw previously in Chapter 3, the more genes involved in the expression of a trait, the more possibilities exist for phenotypic variation in any given environment and the more the potential phenotypes resemble a normal distribution (Fig. 3.28, p. 72). Rather than being identical, the phenotypes exhibited by individuals of any one genotype are distributed in narrower bell-shaped curves centered around the average phenotype for that genotype. The reason is that individuals with the same genotype are subjected to slightly different microenvironments (Fig. 21.3). For the population as a whole, the effects of the environment superimposed on variation in just a few genes can therefore easily approximate a bell-shaped curve of phenotypic values. How do scientists evaluate distributions of phenotypes to estimate the heritability of a complex trait, that is, the contributions of genes (as opposed to the environment) to the phenotypic differences observed in a particular population? As you will see, the answers to this question are not only of theoretical interest.

Figure 21.2 Distribution of human height for members of a genetics class at the University of Notre Dame. Note the roughly bell-shaped distribution, with most individuals of intermediate height.

Figure 21.3 How genetic and environmental influences on a quantitative trait can interact to produce normal distributions. As was demonstrated previously in Fig. 3.28 on p. 72, genetic variation at two loci, each with two incompletely dominant alleles with equal effects on the phenotype, can produce five phenotypic classes with the phenotypic (trait) values shown at the left. Each of these phenotypic values becomes "blurred" by environmental influences (that is, differences in the environment experienced by various individuals of each class; center) to yield a continuous bell-shaped distribution of the phenotype among all the individuals in the population (right).
[Three panels, each with a horizontal axis labeled "Phenotypic value" running from 0 to 4.]

Figure 21.4 Phenotypic variance. (a) The familiar dandelion (Taraxacum sp.) found in many lawns and fields. (b) Calculating the mean and variance of stem length among individual plants in a population. Variance is measured as the average squared difference between each individual value and the mean. To plot the magnitude of that variance on the bell-shaped distribution of individual stem lengths, the square root of the variance is shown so as to be in the same units as stem length. Statistical theory tells us that 95% of the observations under the bell-shaped curve will be within the interval of the mean minus 1.96 times the square root of the variance, and the mean plus 1.96 times the square root of the variance.
[Figure 21.4(b): bell-shaped curve plotted over stem length (x), with the mean marked.]
Finding the mean: Let $x_i$ = the stem length of plant $i$ in a population of $N$ plants. The mean of stem length, $\bar{x}$, for the population is defined as

$$\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N}$$

Finding the variance: The variance $V_P$ of stem length for the population is defined as

$$V_P = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N}$$

Variance Is a Statistical Measure of the Amount of Variation in a Population
To estimate heritability, scientists first need to obtain a numerical description for the curve, usually bell-shaped, of the trait's distribution in the population under study. Researchers track the amount of variation by comparing the phenotype for each individual in the population to the average phenotype for the population as a whole. Statistically,
the result of this analysis is termed the total phenotype variance (V_P), and it is calculated as the average squared difference between each individual value and the mean. As an example, let's consider the phenotype of stem length in a population of dandelions, a common weedy plant in North America (Fig. 21.4a). The stem lengths in the population exhibit a bell-shaped normal distribution. As Fig. 21.4b shows, you find the mean stem length by summing the values of all stem lengths and dividing by the number of stems. You then find the variance in stem length (V_P for this trait) by expressing the stem lengths as plus or minus deviations from the mean, squaring those deviations, summing the squares, and dividing the sum by the number of stems measured. The variance provides a mathematical description of this distribution; the narrower the curve relative to the peak, the lower the value of the variance. Once the total variance of the phenotype is determined, scientists can then begin to ask what fraction of this variance is due to differences in the genes carried by individual organisms and what part is due to differences in the microenvironments to which these individuals are subjected.
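The mean and variance calculation described for dandelion stem lengths can be sketched in a few lines of Python. This is only an illustration: the stem lengths below are invented, and the function names `mean` and `phenotypic_variance` are hypothetical helpers, not anything defined in the chapter. The variance divides by N, following the definition in Fig. 21.4b.

```python
import math

def mean(values):
    """Arithmetic mean: sum of the values divided by how many there are."""
    return sum(values) / len(values)

def phenotypic_variance(values):
    """Average squared deviation from the mean (divides by N, as in Fig. 21.4b)."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)

# Hypothetical stem lengths in cm, invented for illustration only.
stem_lengths_cm = [10.0, 12.0, 14.0, 16.0, 18.0]

x_bar = mean(stem_lengths_cm)
v_p = phenotypic_variance(stem_lengths_cm)

# For a bell-shaped (normal) distribution, about 95% of observations fall
# within 1.96 square roots of the variance on either side of the mean.
spread = 1.96 * math.sqrt(v_p)

print(f"mean = {x_bar} cm, variance = {v_p} cm^2")
print(f"approximate 95% interval: {x_bar - spread:.2f} to {x_bar + spread:.2f} cm")
```

Python's standard library offers `statistics.pvariance`, which uses the same divide-by-N definition. Five values are far too few to form a bell-shaped curve, so the 95% interval here only shows the arithmetic.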
Figure 21.5 Environmental and genetic components of phenotypic variance. (a) The phenotypic variance of genetically identical plants grown in a variable environment like a hillside field is all due to the environmental variance V_E. (b) The phenotypic variance of genetically diverse plants grown in a uniform environment such as a controlled greenhouse is all due to the genetic variance V_G. (c) For natural populations grown in diverse natural environments, the total phenotypic variance V_P is the sum of the genetic and environmental variance components (V_G + V_E).
[Panel (a): Environmental variance (V_E). Genetically identical seeds grown in a variable environment; distribution plotted over stem length with the mean marked.]
[Panel (b): Genetic variance (V_G). Genetically different seeds grown in a constant environment. Panel (c): Phenotypic variance (V_P) = V_G + V_E. Genetically different seeds grown in a variable environment. Each panel plots a distribution over stem length with the mean marked.]

Genetic Variance Can Be Separated from Environmental Variance
To distinguish environmental from genetic effects on phenotypic variation, you need to quantify one variable, say the environment, while controlling for the other one (that is, while holding the genetic contribution steady). This particular experiment is easy to accomplish in the case of dandelions, because most dandelion seeds arise from mitotic, rather than meiotic, divisions such that all the seeds from a single plant are genetically identical. You could begin by planting genetically identical seeds on a grassy hillside and allowing them to grow undisturbed until they flower. You then measure the length of the stem of each flowering plant and determine the mean and variance of the distribution of values for this trait in this dandelion population (Fig. 21.5a). Because all members of this population are genetically identical (if we ignore rare mutations), any observed variation in stem length among individuals should be a consequence of environmental variations, such as different amounts of water and sunlight at different locations on the hillside. When represented as a variance from the mean, these observed environmentally determined differences in stem length are called the environmental variance (V_E). To examine the impact of genetic differences on stem length, you take seeds from many different dandelion plants produced in many different locations, and you plant them in a controlled greenhouse (Fig. 21.5b). Because you are raising genetically diverse plants in a relatively uniform environment, the observed variation in stem length is mostly the result of genetic differences promoting genetic variance (V_G).

Now, to illustrate the total impact on phenotype of variation in both genes and environment, you take the seeds of many different plants from many different locations (and thus with different genetic variants) and grow them on a common hillside (Fig. 21.5c). For the population of dandelions that grow up from these genetically diverse seeds, the total phenotype variance (V_P) in stem length will be the sum of the genetic variance (V_G) and the environmental variance (V_E):

$$V_P = V_G + V_E$$
Note that the bell-shaped curve for dandelion stem length in Fig. 21.5c is broader than either that for genetically identical individuals on a hillside (Fig. 21.5a) or for genetically variable individuals grown in a controlled greenhouse (Fig. 21.5b). For natural populations of dandelions, both genetic variation among individuals and variation in the environmental conditions experienced by each plant contribute to the total phenotypic variation.
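The three dandelion experiments can be mimicked with a small, fully deterministic Python sketch. It rests on assumptions that are not part of the text: genetic and environmental effects simply add, they do not interact, and every genotype is grown in every microenvironment so that the two effects are uncorrelated. All numbers and names (`BASE_CM`, `genetic_effects`, `environment_effects`) are hypothetical.

```python
import math
from itertools import product

def variance(values):
    """Average squared deviation from the mean (divides by N)."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / len(values)

BASE_CM = 14.0                          # hypothetical baseline stem length
genetic_effects = [-2.0, 0.0, 2.0]      # hypothetical effect of each genotype (cm)
environment_effects = [-1.0, 0.0, 1.0]  # hypothetical effect of each hillside spot (cm)

# (a) Clones of one genotype grown across the variable hillside: only V_E.
clones_on_hillside = [BASE_CM + genetic_effects[0] + e for e in environment_effects]

# (b) Diverse genotypes grown in one uniform greenhouse environment: only V_G.
diverse_in_greenhouse = [BASE_CM + g + environment_effects[1] for g in genetic_effects]

# (c) Every genotype grown in every hillside spot: total V_P.
diverse_on_hillside = [
    BASE_CM + g + e for g, e in product(genetic_effects, environment_effects)
]

v_e = variance(clones_on_hillside)
v_g = variance(diverse_in_greenhouse)
v_p = variance(diverse_on_hillside)

print(f"V_E = {v_e:.3f}, V_G = {v_g:.3f}, V_P = {v_p:.3f}")
print("V_P equals V_G + V_E:", math.isclose(v_p, v_g + v_e))
```

Under these assumptions the sum holds exactly. In real populations, genotype-by-environment interaction, or genotypes that are not spread evenly across environments, would add further terms.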
Heritability Is the Proportion of Phenotypic Variance Due to Genetic Variance
Geneticists define the heritability of a phenotypic trait as the proportion of total phenotypic variance (V_P) ascribable to genetic variation alone (V_G). Formally, heritability defined in this way is called the broad sense heritability, which is abbreviated as H² by convention because the variance is the average squared deviation from the mean for a characteristic:

$$\text{Broad sense } H^2 = \frac{V_G}{V_P}$$

Genetic variance itself can be subdivided into three components. The first of these is V_A, or variance due to additive genetic effects (for example, if allele a contributes one unit to the phenotype, A two units, b three units, and B four units, then an Aa Bb heterozygote would display 1 + 2 + 3 + 4 = 10 units of phenotype). The two other contributions to genetic variance are the variance due to dominance effects (V_D) and the variance due to interactions between genetic loci (V_I). Dominance adds an additional component to variance because, for example, a heterozygote for a dominant allele has the same phenotype as a homozygote for that allele, yet these individuals do not have the same genotype. Similarly, interactions between alleles at different loci (for example, epistasis) can cause an allele of one gene to have different phenotypic values depending on the alleles present at the second gene. The total genetic variance is the sum of its three components:

$$V_G = V_A + V_D + V_I, \quad \text{thus} \quad V_P = V_A + V_D + V_I + V_E$$

Using these identities, we can define broad sense heritability more precisely as:

$$\text{Broad sense } H^2 = \frac{V_A + V_D + V_I}{V_A + V_D + V_I + V_E}$$

You will see in the next section that heritability of particular traits can be estimated in studies of genetic relatives. The influences of allelic dominance at individual loci and interaction among alleles at different loci mean that broad sense heritability is typically measured only when comparing identical twins to each other. Identical twins share the same alleles at all loci, so all three components of genetic variance are the same in both twins. Thus the only estimates of heritability that include all three components of genetic variance are those obtained from studies of identical twins.

In contrast, comparisons of phenotypic similarities between parents and offspring cannot measure broad sense heritability because only one of the alleles at any individual locus is shared between any one offspring and any one parent and because the combinations of alleles among different loci also differ. The allele of each locus that is shared between a parent and an offspring instead must be considered as a genetic factor that acts in a simple, additive fashion; in other words, comparisons of parents and offspring represent only the additive genetic variance (V_A) component of overall genetic variance (V_G). Because the additive component to the genetic effects most precisely predicts the range of phenotypes expected among the progeny of crosses, plant and animal breeders often calculate a heritability estimator called the narrow sense heritability (h²), the proportion of total variance due specifically to variance of the additive genetic component:

$$\text{Narrow-sense } h^2 = \frac{V_A}{V_A + V_D + V_I + V_E} = \frac{V_A}{V_P} \tag{21.1}$$

As you will see, the narrow-sense heritability is also important to breeders for a second reason: it dictates how strongly a particular phenotypic trait will respond to selection on a phenotype. Although for the sake of accuracy we will differentiate between H² and h² in the discussion that follows, the distinction is relatively minor for most of the purposes of this chapter. At the extremes, a heritability of 0 (whether in the broad or narrow sense) means there is no heritable variation influencing the trait in the population, and that all of the observed phenotypic variation is due to environmental effects; while a heritability of 1 means that all of the observed phenotypic variation in the population is due to genetic variation and none is due to environmental effects. You should note carefully that heritability in either sense is a property of a population, not an individual. Thus, a statement that the heritability of a trait such as susceptibility to alcohol is 0.4 does not imply that 40% of a particular person's susceptibility is due to genes and 60% to the environment. Instead, this value indicates that 40% of the phenotypic variation in this trait observed in the population can be attributed to genetic differences between the individuals in this particular population. Because the amounts of genetic, environmental, and phenotypic variation may differ among traits, among populations, and among different environments, the heritability of a trait is always defined for a specific population and a specific set of environmental conditions. The heritability for any trait could thus differ between different populations with different genetic variants or different environments. The analysis of human height again provides a good example. When measured in a prosperous population with modern standards of food production, human height shows a very high heritability, greater than 0.9; that is, more than 90% of the variation in height seen in this population is due to allele differences. In contrast, in a poor country where not everyone gets adequate nutrition, heritability would be much lower. The explanation comes from the fact that a person's genome determines their maximum height potential. If their nutrition is sufficient, they will reach this potential; further intake of food will make no difference. However, in underdeveloped countries, great differences exist between individuals in the amount of nourishment they can consume. This environmental difference will express itself as an increase in the environmental component of height variance.
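As a rough illustration of these definitions, the Python sketch below computes H² and h² from a set of variance components and shows how the same genetic variance gives a lower heritability when environmental variance is larger. The class name `VarianceComponents` and every number are hypothetical; they are not measurements from any population.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class VarianceComponents:
    additive: float       # V_A
    dominance: float      # V_D
    interaction: float    # V_I
    environmental: float  # V_E

    @property
    def genetic(self) -> float:
        """V_G = V_A + V_D + V_I"""
        return self.additive + self.dominance + self.interaction

    @property
    def phenotypic(self) -> float:
        """V_P = V_G + V_E"""
        return self.genetic + self.environmental

    def broad_sense(self) -> float:
        """H^2 = V_G / V_P"""
        return self.genetic / self.phenotypic

    def narrow_sense(self) -> float:
        """h^2 = V_A / V_P (Equation 21.1)"""
        return self.additive / self.phenotypic

# Hypothetical population with a fairly uniform environment.
uniform_env = VarianceComponents(additive=4.0, dominance=1.0,
                                 interaction=1.0, environmental=4.0)

# Same hypothetical genetic variance, but a much more variable environment.
variable_env = VarianceComponents(additive=4.0, dominance=1.0,
                                  interaction=1.0, environmental=14.0)

for label, pop in [("uniform", uniform_env), ("variable", variable_env)]:
    print(f"{label}: H^2 = {pop.broad_sense():.2f}, h^2 = {pop.narrow_sense():.2f}")
```

The drop in both values between the two hypothetical populations mirrors the height example: the genes are unchanged, but a larger V_E lowers the proportion of variance attributable to them.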
Heritability Studies Examine Phenotypic Variation in Genetic Relatives
Relatives (such as parents and offspring) share genes, so we expect them to be similar phenotypically to the degree that their genes are related and the phenotype is heritable. If genetic similarity contributes to phenotypic similarity for a trait, it is logical to expect that a pair of close genetic relatives will be more similar phenotypically than a pair of individuals chosen at random from the population at large. Thus, by comparing the phenotypic variation among a well-defined set of genetic relatives with the phenotypic variation of the entire population over some range of environments, it is possible to estimate the heritability of a trait.

To conduct these kinds of comparisons, we first need to define quantitatively the genetic relatedness of two individuals as the average fraction of common alleles at all genetic loci that the individuals share because they inherited them from a common ancestor. For an autosomal gene, the genetic relatedness of a parent and a child would be 0.5, because for any given locus, one of the two alleles in the child's genome came from that parent. The genetic relatedness of two siblings is also 0.5, because if you assume that one sibling received allele A1 from an A1 A2 heterozygous parent, the probability that the second sibling received the same allele is 0.5. Extending this same kind of analysis, we can see that an aunt and a niece have 0.25 genetic relatedness, while that between first cousins is 0.125.

Although heritability could theoretically be determined using even distant relatives, it makes sense to study the closest relatives possible: individuals for which the genetic relatedness is 0.5 or higher. The reason is simply that the contributions of genes to a trait will be most obvious in cases where the most alleles are shared. In the remainder of this section, we discuss two approaches to estimating heritability in close relatives. The first of these compares phenotypes in parents and progeny, focusing on a famous field study involving birds first studied by Darwin; the second compares phenotypes in human twins.

Estimating heritability by comparing parents and progeny
Figure 21.6 illustrates the implication of heritability with respect to the mean phenotypic values of the parents, their offspring, and the population as a whole. Heritability can be estimated as the correlation of phenotype between progeny and parents. If the mean phenotypic value of the progeny is similar to that of the population as a whole, regardless of the particular phenotypes or genotypes of the parents, then the heritability is 0 because none of the phenotypic variance is due to additive genetic effects of the alleles inherited from the parents. If the progeny deviate from the population average by 25% as much as their parents did, then the narrow-sense heritability is 0.25. If the progeny deviate from the population mean as much as did their parents, then the heritability is 1.0 because the alleles inherited from the parents closely predict the phenotypes of the offspring.

Figure 21.6 Determining heritability by parent/progeny comparisons. The phenotypic distribution of the parent generation population is shown in tan with the mean of the two specific parents chosen to breed indicated near the right of the distribution. The purple line below indicates the range of the offspring phenotypes. If h² = 1, then the mean of the offspring will equal the mean of the selected parents because the alleles inherited from the parents dictate completely the phenotypes of the progeny. If h² = 0.25, then the mean of the offspring lies one-quarter of the way from the mean of the total parental generation toward the mean of the selected parents. If h² = 0, then the mean of the offspring will be the same as the mean of the total parental population; in such cases, the genotypes of the parents do not influence the phenotypes of the progeny at all.
[Diagram: parental-generation distribution along a size (height) axis from short to tall, marking the mean of the parents and the mean of the offspring for progeny with h² = 0, h² = 0.25, and h² = 1.]

The finches observed by Darwin in the Galápagos Islands (often referred to as "Darwin's finches") provide an example of a population for which geneticists have measured the heritability of a trait under natural conditions in the field by comparing the phenotypes of parents and offspring. Scientists studied the medium ground finch, Geospiza fortis, on the island of Daphne Major by banding
many of the individual birds in the population (Fig. 21.7a). The researchers then measured the depth of the bill for the mother, father, and offspring in each nest on the island and calculated whether the bill depth of the offspring was statistically correlated with the average bill depth of the mother and father (the midparent value; Fig. 21.7b). The results show a clear positive correlation between parents and offspring; parents with deeper bills had offspring with deeper bills, while parents with smaller bill depth had offspring with smaller bill depth. For reasons that will become clear momentarily, when a positive correlation exists, the slope of the line relating offspring to midparent value is an estimate of the narrow-sense heritability for bill depth. In the figure, the heritability of bill depth, as represented by the slope of the line correlating midparent bill depth to offspring bill depth (that is, the line of correlation), is 0.82. This means that roughly 82% of the variation in bill depth in this population of Darwin's finches is attributable to additive genetic variation among individuals in the population; the other 18% results from variation in the environment and nonadditive genetic effects.

Figure 21.7 Measuring the heritability of bill depth in populations of Darwin's finches. (a) Geospiza fortis is a seed-eating bird endemic to the Galápagos Islands. (b) The correlation between beak size of individual offspring and their midparent value (the average of the parents' beak depth) in 1976 and in 1978 is shown. The mean beak depth increased by natural selection associated with a drought in 1977 that caused a shift in available seeds from small, soft seeds to larger, harder seeds. The slope of the line is, however, the same both years, indicating a constant high heritability that is relatively independent of the environmental change. (c) Plots if heritability were close to 1.0. (d) Plots if heritability were 0.0. In panels (c) and (d), the plots to the right show the phenotypic averages for four of the mating pairs and their progeny represented in the same format as Fig. 21.6; each arrow in the plot at the right corresponds to one dot in the plot at the left.
[Panel (b): scatter plot titled "Correlation between parents and offspring" for G. fortis, with horizontal axis "Midparent bill depth (mm)" running from 8.0 to 11.0 and points for 1976 and 1978. Panels (c) and (d): plots against midparent bill depth (mm) for h² = 1.0 and h² = 0, each paired with the distribution of bill size (small to large) in parents and offspring.]
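The idea that the slope of the offspring-on-midparent line estimates narrow-sense heritability can be sketched with an ordinary least-squares fit. The bill depths below are invented for illustration and are not the finch measurements; `regression_slope` and `expected_offspring_mean` are hypothetical helper names. The second function restates the relationship shown in Fig. 21.6, where offspring deviate from the population mean by h² times the deviation of their parents.

```python
import math

def regression_slope(xs, ys):
    """Least-squares slope of y on x: covariance(x, y) / variance(x)."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    var = sum((x - x_mean) ** 2 for x in xs)
    return cov / var

def expected_offspring_mean(population_mean, midparent_value, h2):
    """Offspring deviate from the population mean by h2 times the parental deviation."""
    return population_mean + h2 * (midparent_value - population_mean)

# Hypothetical bill depths in mm, one (midparent, offspring) pair per nest.
midparent_mm = [8.0, 9.0, 10.0, 11.0]
offspring_mm = [8.3, 9.0, 9.9, 10.8]

h2_estimate = regression_slope(midparent_mm, offspring_mm)
print(f"estimated narrow-sense heritability: {h2_estimate:.2f}")

# With h2 = 0.25, parents 1.0 mm above a 9.5 mm population mean are expected
# to produce offspring averaging 0.25 mm above that mean.
print(expected_offspring_mean(9.5, 10.5, 0.25))
```

As the text goes on to explain, reading this slope as h² assumes that relatives do not share environments more than unrelated birds do. The sketch does nothing to check that assumption.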
In Fig. 21.7c-d, we examine the extreme cases to illustrate why the slope of the line of correlation provides an estimate of the narrow-sense heritability h². Suppose first that the environment had no influence at all on the trait (Fig. 21.7c). In such a case, the slope of the line representing the heritability of bill depth (that is, the line correlating bill depth in parents with bill depth in offspring) would be 1.0 (Fig. 21.7c, left). At the right of this figure, we show the results for four mating pairs displayed in the same fashion previously used in Fig. 21.6. You can see that both of these representations are essentially equivalent. Now consider a population in which the bill depth for parents and their offspring is, on average, no more or less similar than the bill depths for any pair of individuals chosen from the population at random (Fig. 21.7d). In such a population, no correlation exists between the bill depth trait in parents and in offspring, and a plot of midparent and offspring bill depths produces a "cloud" of points with no correlation between midparent and offspring bill depth (Fig. 21.7d, left); the slope of the line of correlation, and thus the estimated heritability, is 0. The right panel of Fig. 21.7d shows an equivalent, alternative representation of this same case. You can see that analyzing data that tracks the phenotypic values of parents, their progeny, and the population as a whole by either of the two methods used in Fig. 21.7 allows researchers to estimate heritability values.

From these examples, you might conclude that phenotypic similarity among genetically related individuals provides evidence for the heritability of a trait. However, conversion of the phenotypic similarity among genetic relatives to a measure of heritability depends on a crucial assumption: that the distribution of genetic relatives is random with respect to environmental conditions experienced by the population. In the finch example, we assumed that parents and their offspring do not experience environments that are any more similar than the environments of unrelated individuals. In nature, however, there may be reasons why genetic relatives violate this assumption by inhabiting similar environments. With finches, for example, all offspring produced by a mother and father during a breeding season normally hatch and grow in a single nest where they receive food from their parents. Because bill depth affects a finch's capacity to forage for food, the amount of feeding in a nest may correlate partially with parental bill depth for reasons quite distinct from genetic similarities.

One way to reduce the confounding issue of environmental similarity is to remove eggs from the nest of one pair of parents and randomly place them in nests built by other parents in the population; this random relocation of eggs is called cross-fostering. In heritability studies of animals that receive parental care, cross-fostering helps randomize environmental conditions. Controlling for both environmental conditions and breeding crosses is a fundamental part of the experimental design of heritability studies carried out on wild and domesticated organisms.
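The midparent-offspring regression described above for bill depth can be sketched in a few lines of Python. This is a minimal illustration only: the function name `narrow_sense_h2` and the bill-depth values are hypothetical (they are not the finch data of Fig. 21.7), and the estimate is meaningful only under the assumption stated in the text that relatives do not share unusually similar environments.

```python
from statistics import mean

def narrow_sense_h2(mothers, fathers, offspring):
    """Estimate h^2 as the least-squares slope of offspring value on midparent value."""
    # Midparent value: the average of the two parents' trait values for each family.
    midparents = [(m + f) / 2 for m, f in zip(mothers, fathers)]
    x_bar, y_bar = mean(midparents), mean(offspring)
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(midparents, offspring))
    var_x = sum((x - x_bar) ** 2 for x in midparents)
    if var_x == 0:
        raise ValueError("midparent values must vary to fit a slope")
    return cov_xy / var_x

# Hypothetical bill depths in mm for four families (illustrative values only).
mothers = [8.0, 9.0, 10.0, 11.0]
fathers = [8.4, 9.2, 10.4, 11.2]
offspring = [8.5, 9.2, 10.1, 10.9]

h2_estimate = narrow_sense_h2(mothers, fathers, offspring)
print(f"slope of offspring on midparent (estimated h^2): {h2_estimate:.2f}")
```

A slope near 1 corresponds to the case of Fig. 21.7c, and a slope near 0 to the uncorrelated "cloud" of Fig. 21.7d.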
Using twins to estimate complex trait heritability in humans

Mating does not occur at random with respect to phenotypes in human populations, and researchers cannot apply techniques for controlling environmental conditions and setting up defined breeding crosses to studies of such populations. Moreover, in most human societies, family members share similar family and cultural environments. Thus, phenotypic similarity between genetic relatives may result either from genetic similarities or from similar environments or, most often, both. How can you distinguish the effects of genetic similarity from the effects of a shared environment?

One way is to study monozygotic twins (identical twins) given up for adoption shortly after birth and raised in different families. In such a pair of identical twins, any phenotypic similarity should be the result of genetic similarity. At first glance, then, the study of adopted identical twins eliminates the confounding effects of a similar family environment. Further scrutiny, however, shows that this is often not true. Many pairs of twins are adopted by different genetic relatives; the adoptions often occur in the same geographic region (usually in the same state and even the same city); and families wishing to adopt must satisfy many criteria, including job and financial stability and a certain family size. As a result, the two families adopting a pair of twins are likely to be more similar than a pair of families chosen at random, and this similarity can reduce the phenotypic differences between the twins. A valid scientific study of separated twins must take these factors into consideration.

A related approach is to compare the phenotypic differences between different sets of genetic relatives, particularly different types of twins (Fig. 21.8a). For example, monozygotic (MZ) twins, which are the result of a split in the zygote after fertilization, are genetically identical because they come from a single sperm and a single egg; they share all alleles at all loci. By contrast, dizygotic twins (DZ twins), which are the result of different sperm from a single father fertilizing two different maternal eggs, are like any pair of siblings born at separate times in that they share on average 50% of their alleles at all loci. Comparing the phenotypic differences between a pair of MZ twins (whose genetic relatedness is 1.0) with the phenotypic differences between a pair of DZ twins (whose genetic relatedness is 0.5) can help distinguish between the effects of genes and family environment. If twins are raised in the same family, they share a relatively common environment. Since MZ twins share twice as many alleles as DZ twins, the broad-sense heritability (H² = V_G/V_P) is calculated as twice the difference in the statistical correlation of the phenotype between pairs of MZ and pairs of DZ twins. Because twin studies typically cannot separate the effects of dominance or gene interactions from additive gene effects on phenotype, they allow determinations only of the broad-sense heritability (H²). In comparison with the more useful narrow-sense heritability (h²) measured in parent-offspring studies that consider only the additive components of the genetic variance, the values of H² provided by twin studies will be higher but are useful approximations nonetheless.
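The twin calculation just described, twice the difference between the MZ and DZ phenotypic correlations, can be sketched as follows. The helper names (`pearson_r`, `broad_sense_h2`) and all trait values are hypothetical and chosen only to make the arithmetic easy to follow; real studies use many more pairs and must address the shared-environment caveats raised above.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation between paired measurements."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("trait values must vary within each twin position")
    return sxy / sqrt(sxx * syy)

def broad_sense_h2(mz_pairs, dz_pairs):
    """Twin-based estimate: H^2 = 2 * (r_MZ - r_DZ)."""
    r_mz = pearson_r(*zip(*mz_pairs))
    r_dz = pearson_r(*zip(*dz_pairs))
    return 2 * (r_mz - r_dz)

# Hypothetical trait scores (twin 1, twin 2); illustrative values only.
mz_pairs = [(10, 11), (14, 13), (18, 18), (22, 21), (25, 26)]
dz_pairs = [(10, 14), (14, 12), (18, 21), (22, 17), (25, 24)]

print(f"H^2 estimate from twins: {broad_sense_h2(mz_pairs, dz_pairs):.2f}")
```

Because the estimate is a difference of two sample correlations, it can fall outside the range 0 to 1 in small samples; that is a property of the estimator, not of heritability itself.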
Figure 21.8 Insights into human heritability from the comparison of monozygotic (MZ) and dizygotic (DZ) twins. (a) MZ and DZ twins have different genetic origins. MZ twins come from a single ovulated egg fertilized by one sperm (100% of alleles shared; 100% genotypic identity); DZ twins come from two ovulated eggs fertilized by different sperm (50% of alleles shared; 25% genotypic identity). (b) The probability that two children will share a trait depends on their relatedness and the frequency of the trait in the population; the panels show the probability that a second child will express a dominant trait that is expressed by a first child. (1) For hypothetical traits with a heritability of 0.0, the likelihood the two children share the trait depends on the trait's prevalence in the population, not on the degree of relationship of the children. (2) For hypothetical traits with a heritability of 1.0, MZ pairs would be 100% concordant (have identical phenotypes), whereas DZ pairs would show a concordance halfway between 100% and the concordance found between genetically unrelated children.
[Figure 21.8(b), panel 1: Trait with 0.0 heritability. Panels for very common, common, and infrequent traits; vertical axis marked at 50% and 100%; groups MZ, DZ, and UR.]
Consider a trait in which the differences in phenotype among individuals in the population arise entirely from differences in the environment experienced by each individual, that is, a trait for which the heritability is 0.0 (Fig. 21.8b.1). For this trait, you would expect the likelihood that MZ twins share the same phenotype to be the same as that for DZ twins. The likelihood of trait sharing would also be the same for genetically unrelated siblings who had been adopted into the same family. If one child expresses a trait for which the heritability is 0.0, then the only factor that influences the chance a sibling will show the same phenotype is the probability that the range of environments investigated can produce the phenotype; the degree of genetic relationship between the children has no effect.

Now consider a trait for which differences in phenotypes among individuals in a population arise entirely from genetic differences, that is, a trait for which the heritability is 1.0 (Fig. 21.8b.2). Since MZ twins are genetically identical, they are expected to show 100% concordance in expression, meaning that if one expresses the trait, the other does as well. The concordance of trait expression between unrelated individuals varies based on the commonality of the trait; the more common the trait in the population, the greater the chance that two unrelated people will have that phenotype (Fig. 21.8b.2). Regardless of the commonality of the trait, DZ twins, because they share half of their alleles, will always display greater concordance than genetically unrelated individuals, but less than the 100% concordance of MZ twins. In the highly simplified case of a dominant trait caused by an allele at a single autosomal gene, DZ twins would show a level of concordance that is halfway between the unrelated value and 100%. In reality, nearly all traits are affected by multiple genes that may have dominant, recessive, codominant, and interacting effects, and the heritabilities of all traits lie between 0.0 and 1.0.
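As a small worked illustration of the "halfway" rule stated above for the simplified single-gene, fully heritable case, the sketch below computes the expected DZ concordance from the concordance of unrelated pairs. The function name and the three unrelated-pair values are hypothetical; they are not read from Fig. 21.8.

```python
def dz_concordance(unrelated_concordance):
    """Expected DZ concordance under the text's simplification (heritability 1.0,
    one dominant allele at a single autosomal gene): halfway between UR and 100%."""
    if not 0.0 <= unrelated_concordance <= 1.0:
        raise ValueError("concordance must lie between 0 and 1")
    mz_concordance = 1.0  # MZ twins are genetically identical, so they always match
    return (unrelated_concordance + mz_concordance) / 2

# Hypothetical unrelated-pair concordances for traits of differing frequency.
for label, ur in [("very common", 0.8), ("common", 0.5), ("infrequent", 0.1)]:
    print(f"{label}: UR={ur:.2f}  DZ={dz_concordance(ur):.2f}  MZ=1.00")
```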
[Figure 21.8(b), panel 2: Trait with 1.0 heritability. Panels for very common, common, and infrequent traits; vertical axis marked at 50% and 100%. MZ = monozygotic twins, DZ = dizygotic twins, UR = unrelated due to adoption.]

A Trait's Heritability Determines Its Potential for Evolution

We saw in Chapter 20 how the selection of preexisting mutations generates evolutionary change. Because the heritability of a complex trait is a measure of the genetic component of its variation, heritability quantifies the potential for selection and thus the potential for evolution from one generation to the next. A trait with high heritability has a large potential for evolution via selection, whether this selection is natural or artificial. This idea is important for understanding how phenotypes evolve over time.

The role of heritability in evolution also has considerable practical significance for breeding programs to improve agriculturally important plant and animal species. If the heritability of a trait is low, such a program would have little chance of success: It would make more sense either to alter the environment in which the crop or herd was raised, or to search out other representatives of the same species in remote areas of the globe to increase the genetic component of variation.

Figure 21.9 illustrates how a plant breeder can exploit situations of high heritability to improve a crop by artificial selection. In this example, the breeder wants to select for larger edible beans, and he will employ a simple but powerful strategy called truncation selection. The essence of this method is that he would plant beans with phenotype values above a certain cutoff (in this case, beans of a chosen minimal size) to produce the next generation.

In Fig. 21.9, S represents the selection differential, measured as the difference between the average value of this trait for the selected parents and the average value of the trait in the entire parental population (that is, both breeding and nonbreeding individuals). Among the offspring of the selected parents, the average value of the trait will be higher than the average value in the entire parental generation. In the figure, R represents this difference. Used in this way, R signifies the response to selection, that is, the amount of evolution, or change in mean trait value, resulting from the selection applied by the breeder.

Figure 21.9 The strategy of truncation selection. Narrow-sense heritability h² predicts the response to selection on a quantitative trait. The total parental generation has a bell-shaped distribution for the phenotypic trait, but only individuals above a certain size are allowed to breed (dark shading). The difference in the mean phenotypic value between the selected group and the total population is S, the selection differential. If the heritability h² is high, the offspring of those selected parents will have a mean phenotype that is also larger than the mean of the original population; the difference between these values is the response to selection R.
[Figure 21.9 panel labels: Consequences of "Truncation Selection"; parental generation (selected, culled, population mean, mean of selected group); offspring (mean of progeny).]

The significance of heritability to the breeder comes from the fact that the response to selection (R), and thus the effectiveness of the breeding program, is directly related to the selection differential (S) through what is called the realized heritability of the trait. If the distribution of phenotypic values fits a normal bell-shaped curve, then the realized heritability is equal to the narrow-sense heritability (h²). Thus,

$$\text{realized } h^2 = \frac{R}{S}$$

Rearranging to solve for R:

$$R = h^2 S \tag{21.2}$$

In other words, the strength of selection (S) and the heritability of a trait (h²) directly determine the trait's amount or rate of evolution in each generation (as indicated by the response to selection R). The greater the heritability, the greater the likelihood that the breeding program will improve the crop.

essential concepts

• The narrower the distribution around the mean, the lower the variance.
• The total phenotypic variance of a complex trait in a population has both genetic and environmental components.
• In a population of genetically identical individuals, all phenotypic variance is due to differences in the environments experienced by different individuals.
• Heritability describes the proportion of total phenotypic variation due to genetic variation. The heritability of a trait is a property of a specific population in a given environment.
• Heritability is measured by examining the degree to which traits are more similar in close relatives than in less related individuals.
• The heritability of a trait determines its potential for evolution.
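The truncation-selection relationship from this section, R = h²S, can be worked through numerically. The sketch below is illustrative only: the function name `truncation_selection`, the bean sizes, the cutoff, and the heritability value are all hypothetical, and the prediction assumes, as the text does, that realized heritability equals h².

```python
from statistics import mean

def truncation_selection(phenotypes, cutoff, h2):
    """Return (S, R, predicted offspring mean) for breeding only individuals >= cutoff."""
    selected = [p for p in phenotypes if p >= cutoff]
    if not selected:
        raise ValueError("no individual meets the cutoff")
    population_mean = mean(phenotypes)
    s = mean(selected) - population_mean   # selection differential S
    r = h2 * s                             # response to selection, R = h^2 * S
    return s, r, population_mean + r

# Hypothetical bean lengths (mm) in the parental generation.
beans = [9.0, 10.0, 10.5, 11.0, 11.5, 12.0, 12.5, 13.0, 14.0]
S, R, next_mean = truncation_selection(beans, cutoff=12.0, h2=0.6)
print(f"S = {S:.2f} mm, R = {R:.2f} mm, predicted offspring mean = {next_mean:.2f} mm")
```

Setting `h2` close to 0 in this sketch gives almost no response regardless of how stringent the cutoff is, which is the reason a breeding program for a low-heritability trait has little chance of success.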
21.2 Mapping Quantitative Trait Loci (QTLs)

learning objectives

1. Explain how researchers identify QTLs by establishing backcross lines and then fine-mapping the QTLs with introgressions.
2. Discuss the importance of linkage disequilibrium (LD) to the association mapping of QTLs in humans.

Two main approaches exist for mapping QTLs, the genes that contribute to complex traits. The first approach, which we call here direct QTL mapping, requires researchers to conduct crosses between individuals that differ in the phenotype of interest, so this technique can be used only for species that can be bred through controlled crosses. The second approach, termed association mapping, takes advantage of past events that occurred in previous generations of populations, so this method can be applied even to species such as humans in which controlled breeding experiments cannot be performed. As will be seen, the basic idea underlying both approaches is the same, and it relies on genetic recombination to produce individuals with different genetic compositions.
In direct QTL mapping, recombination occurs during several generations of a series of crosses controlled by the researcher; in association mapping, the recombination already happened during the history of a randomly breeding population. In both cases, investigators ultimately test for statistical correlations between markers in different regions of the genome and the particular phenotypic character of interest. These correlations can often pinpoint genetic changes responsible for variations in the phenotype.

Figure 21.10 Size variation between domesticated tomatoes and their wild relatives. Domestic tomato fruit, Solanum lycopersicum (left), and fruit of three wild tomato species: S. pimpinellifolium, S. habrochaites, and S. pennellii.
Researchers Map QTLs by Analyzing Recombinants Obtained Through Breeding Programs

To conduct direct QTL mapping in experimental organisms, investigators cross individuals showing two extremes of the phenotype of interest (such as very large and very small) and examine the joint segregation of the phenotype and genetic markers distributed throughout the genome of the organism. Markers showing a strong correlation between their presence/absence and the phenotypic trait value (the size in this example) are likely to be genetically linked to one or more genes that influence the trait. Researchers track the presence of particular regions of the genotypes that were in the original two parents by examining DNA sequence variants (SNPs, InDels, or SSRs) chosen because they are variable in the mapping population and are distributed throughout the genome. As the costs of gene chip and DNA sequencing methodologies continue to drop, researchers can screen more and more markers, increasing the resolution of the resulting QTL maps.
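The marker-by-marker test for a correlation between genotype and trait value can be sketched as a comparison of group means. The example below is a simplified stand-in, not the method used in any particular study: it uses Welch's t statistic rather than a Lod score, and the marker names (`M1`, `M2`), genotype labels, and weights are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for the difference in mean trait value between two groups."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    if se == 0:
        raise ValueError("no variation within either genotype class")
    return (mean(group_a) - mean(group_b)) / se

def scan_markers(genotypes, trait):
    """For each marker, split individuals by genotype class and compare mean trait values."""
    scores = {}
    for marker, calls in genotypes.items():
        groups = {}
        for call, value in zip(calls, trait):
            groups.setdefault(call, []).append(value)
        # Only markers with exactly two genotype classes of at least two individuals are tested.
        if len(groups) != 2 or any(len(g) < 2 for g in groups.values()):
            continue
        (_, first), (_, second) = sorted(groups.items())
        scores[marker] = welch_t(first, second)
    return scores

# Hypothetical fruit weights (g) and genotype calls for eight plants.
weights = [50, 52, 48, 51, 70, 72, 69, 71]
genotypes = {
    "M1": ["het", "het", "het", "het", "hom", "hom", "hom", "hom"],  # tracks weight
    "M2": ["het", "hom", "het", "hom", "het", "hom", "het", "hom"],  # does not
}
scores = scan_markers(genotypes, weights)
for marker, t in scores.items():
    print(f"{marker}: t = {t:.2f}")
```

A large absolute statistic for a marker suggests linkage to a gene influencing the trait; in a genome-wide scan the significance threshold must account for the many markers tested.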
Identification of QTLs by rough mapping

Some of the most important applications of direct QTL mapping involve agricultural plants and animals. If researchers can find markers that correlate with QTLs for a commercially valuable trait, then they can develop useful strains that maximize the expression of that trait by finding recombinants that contain particular combinations of alleles of several QTLs. In such applications, the researchers do not necessarily have to identify the particular genes that contribute to the phenotype; instead, they just need to find polymorphisms linked closely enough to the QTLs so that DNA analysis will identify the strains most likely to have the most desirable phenotypes.
To illustrate the general method for identifying QTLs, we examine a landmark study of tomato fruit size
conducted in the late 1990s. More than two thousand years of domesticated breeding has resulted in the tomatoes used in today's cuisines. Some domesticated tomatoes have fruits that are hundreds of times larger than those of their wild ancestors, which were originally from Mexico (Fig. 21.10). The size increase leading to today's large domesticated tomato occurred through the accumulation of mutations in many different genes over thousands of generations of selection.
To identify the relevant genes, researchers started with two closely related species, easily interbred, that exhibited phenotypes at the extreme ends of the size range: Solanum lycopersicum (large) and Solanum pennellii (small) (Fig. 21.10). Each of the two starting strains was inbred for several generations so that the line became homozygous for essentially every allele of every one of its genes. Globally homozygous strains such as these, in which all individuals have the identical alleles at all of the genes, are referred to as isogenic lines. Like Mendel's strains that were pure breeding (homozygous) for the alleles affecting the trait in question, isogenic "small" or "large" strains are homozygous for each of the QTLs affecting tomato size. As you will see, isogenic starting strains are critical to the success of the experiment as they simplify the later analysis. The researchers crossed the large and small isogenic strains to produce the F1, which were medium-sized tomatoes (Fig. 21.11a). The investigators next backcrossed F1 plants to the large isogenic parent strain (S. lycopersicum) to produce the BC1 (backcross 1) generation. Because several different genes control tomato size and the small and large parents have different alleles of many of them, individual plants of the BC1 generation display a wide range of size phenotypes or trait values (Fig. 21.11a). The researchers then weighed and genotyped hundreds of individual BC1 tomatoes to answer the following question: Are there any genes for which alleles from S. pennellii (p alleles) or S. lycopersicum (l alleles) are present at a nonrandom frequency in tomatoes of a particular size class? In other words, do particular p or l alleles correlate with tomato size?

Because the starting strains of S. pennellii and S. lycopersicum were isogenic, we can represent them as being p/p or l/l for every gene or marker locus at which the two strains were different. The F1 are thus genetically identical and heterozygous (p/l) at every one of these loci, while each BC1 tomato is either homozygous l/l or heterozygous p/l. You can see at this point the advantage of beginning with isogenic strains: The BC1 progeny will inherit different single known alleles of molecular markers specific to S. pennellii or S. lycopersicum that can be tracked easily.

The researchers then calculated the mean weights of the tomatoes homozygous and heterozygous for each marker. In the cases of most of these marker loci, the mean weights of the homozygotes and the heterozygotes were the same. But for markers linked to the relevant QTLs, the mean weights were different (Fig. 21.11b). (To determine if a calculated difference was significant, the scientists used a version of the Lod score mapping statistic described in Chapter 10 [p. 361].) Using this method, the investigators discovered 28 QTLs that influence tomato size.

As an example of the direct QTL mapping technique, suppose the starting strain of S. pennellii was homozygous for allele A1 of a particular marker, the isogenic S. lycopersicum strain was homozygous for allele A2, and the heterozygous p/l (A1/A2) BC1 tomatoes are significantly smaller than the l/l homozygous (A2/A2) BC1 tomatoes. The marker is thus linked to a QTL for tomato size. Plant breeders can then predict that tomatoes homozygous for the A2 allele will likely be larger than those carrying the A1 allele of this marker. Such predictions do not require this marker to be the polymorphism responsible for the trait, only that the marker differences be linked to the responsible polymorphism.

Figure 21.11 Strategy for genetic mapping of quantitative trait loci (QTLs) influencing complex traits. (a) Isogenic S. pennellii and S. lycopersicum parents (created by multiple generations of inbreeding) were crossed to make an F1 generation, which was then backcrossed to isogenic S. lycopersicum, creating the BC1 generation. (b) BC1 individuals were weighed and genotyped. If a significant difference in phenotypic value (here, the weight) exists between heterozygotes with the S. pennellii and S. lycopersicum alleles of a marker and homozygotes for the S. lycopersicum allele, the marker is linked to a QTL influencing the phenotype.
[Figure 21.11 panel labels: (a) Cross scheme: P, S. pennellii (p/p) × S. lycopersicum (l/l); F1 (p/l) × S. lycopersicum (l/l). (b) Finding QTLs: marker linked to QTL; marker and QTL unlinked. Horizontal axis: weight; genotype classes p/l and l/l, with class means marked.]

Fine-mapping of QTLs with nearly isogenic (congenic) lines

As you have just seen, rough mapping does not identify the causal gene for the QTL, but instead establishes a chromosomal segment in which the gene could lie, and whose boundaries are defined by the nearest linked molecular markers. In most such studies, this region is between 1 and 10 cM long, and could include over 100 genes. Although successful breeding programs may not require scientists to find the causal gene, good reasons often exist for extending the research to accomplish this goal. In the study just described, the QTL with the largest effect on size was called fw2.2 (fruit weight 2.2); the S. lycopersicum alleles of fw2.2 may increase fruit weight by up to 30%. In order to begin to understand what molecular factors govern tomato size, investigators wanted to identify the fw2.2 causal gene through a process called fine-mapping.

The first step in this fine-mapping procedure is to conduct a long series of backcrosses and intercrosses, starting with the BC1 progeny like those discussed in Fig. 21.11, to generate strains called NILs (nearly isogenic lines) or congenic lines. In the example shown in Fig. 21.12a, the congenic lines were essentially S. lycopersicum, except each line had a different small region of the S. pennellii genome, called an introgression, in the region of the QTL. The second step is to measure the trait value of each NIL and also to genotype it for molecular markers to determine the precise boundaries of each S. pennellii introgression. The idea is that the pennellii genomic region that all of the small tomato (pennellii phenotype) congenic lines share and all the
large tomato (lycopersicum phenotype) congenic lines lack must contain the causal fw2.2 gene. If this region is sufficiently small (which will be the case when a number of independent introgressions are characterized), researchers can then identify the causal gene by analyzing the DNA sequence of that region in the S. pennellii and S. lycopersicum genomes to look for polymorphisms that distinguish the two starting isogenic strains (Fig. 21.12b).

The final step of the fine-mapping process is to validate the gene assignment, which in this case was done by phenotypic rescue. Researchers cloned the suspect fw2.2 gene from the pennellii genome and introduced it as a transgene into lycopersicum. The result, shown in Fig. 21.12c, was that the transgene made the lycopersicum tomatoes substantially smaller. This phenotypic rescue test was possible because the pennellii fw2.2 allele is dominant, while the recessive lycopersicum allele causes a reduced function (that is, the lycopersicum allele is hypomorphic). New technologies introduced in the last few years allow alternative means to validate suspect QTL genes; for example, techniques such as the use of TALENs (described in Fig. 17.13 on p. 589) allow investigators to change one allele into another directly to verify if this change alters the phenotype under consideration.

The fw2.2 gene was found to encode a protein that is a negative regulator of cell division. In fact, the product of this gene is related to a tumor-suppressor protein that when mutated in humans causes the uncontrolled cell growth of cancer (see Chapter 19). It makes sense that the larger size of S. lycopersicum tomatoes as compared with S. pennellii would be due in part to lower fw2.2 gene activity in S. lycopersicum, as this allele has reduced function of a negative regulator of cell growth.

Researchers worked for ten years throughout the 1990s to identify the causal gene in the fw2.2 QTL. Now, with high-density marker maps, microarray tools for genotyping, and low-cost, high-throughput DNA sequencing, the timeline is significantly shorter. Hundreds of QTL genes controlling a wide variety of traits have been identified in yeast, maize, Arabidopsis, rice, Drosophila, mice, pigs, and cattle.

Figure 21.12 Fine-mapping of QTLs through generation and analysis of nearly isogenic (congenic) lines. (a) Cross scheme for making nearly isogenic lines (NILs) whose genomes are almost entirely from S. lycopersicum (unshaded) but that are homozygous for small regions of S. pennellii DNA (introgressions; black). (b) Researchers map QTLs by comparing the phenotypes of NILs that have overlapping introgressions. (c) Using transgenes to verify the assignment of a QTL to a particular gene.
[Figure 21.12 panel labels: (a) P, S. pennellii × S. lycopersicum; F1; NILs. (b) fw2.2 gene; NILs scored as small or large. (c) Inbred line with no transgene versus with transgene.]

Association Mapping Can Identify QTLs in Populations
The standard QTL mapping methods just described require controlled matings of phenotypically different individuals; but for many organisms including humans, such experiments are neither practical nor ethical. In addition, the number of recombination events that occur in the experimental crosses performed limits the resolution of standard QTL mapping. Fortunately, nature already provides an alternative way to map QTLs: Geneticists can use a method called association mapping to take advantage of past recombination events that occurred in the ancestors of present-day individuals. As Fig. 21.13 shows, association mapping is really just an extension of linkage mapping, in which recombination occurs not over just a single generation, but instead accumulates over many generations.

The idea behind association mapping is quite simple. Suppose a new mutation that affects a phenotype took place many generations ago; this mutation (red in Fig. 21.13) would have occurred on a particular chromosome in a particular individual, like the blue chromosome shown in Fig. 21.13. In subsequent generations, this chromosome will have recombined with other copies of this chromosome (green, yellow, and purple) that had different variants of polymorphic markers scattered over the chromosome's length. Through many rounds of recombination, all chromosomes in the present-day population are patchworks of the chromosomal types that were present in the population's ancestors.

In association mapping, scientists test present-day individuals for genetic variants that are statistically correlated with differences in phenotype. For example, if the phenotype is a condition such as coronary artery disease, then the goal is to find a marker whose frequency in a population of patients is significantly greater than that in a population of nondiseased controls. If the researchers find variants strongly correlated with the condition, these markers must be closely linked to QTLs influencing whether people will develop coronary artery disease.

Figure 21.13 Association mapping is a simple extension of linkage mapping with a longer time frame. In both methods, geneticists examine recombinants to find correlations between specific genomic regions and phenotypic values. In association mapping, the recombinants already exist in a population because recombination has occurred in many generations of ancestors. Researchers identify a QTL by its linkage to marker alleles whose frequencies in the affected group of current-day individuals are greater than their corresponding frequencies in the control group. Such marker alleles would have been present on the blue chromosome when the disease-causing mutation (red) occurred nearby on this same chromosome.
[Figure 21.13 panel labels: (a) Linkage mapping: disease-causing variant. (b) Association mapping: chromosomes in original population; 20 generations; chromosomes in present-day population (patient group, control group); nonrandom associations of a disease mutation (red) and blue chromosome variants.]
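The patient-versus-control comparison at the heart of association mapping can be illustrated with a simple test on allele counts. The sketch below uses a Pearson chi-square statistic on a 2 × 2 table; this is one common choice and an assumption here, not a method named in the text, and the function name and all counts are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table
    [[a, b], [c, d]], e.g. rows = patients/controls, columns = marker allele/other allele."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

# Hypothetical allele counts at one marker (two alleles counted per person).
patients = {"marker_allele": 120, "other_allele": 80}
controls = {"marker_allele": 90, "other_allele": 110}

statistic = chi_square_2x2(
    patients["marker_allele"], patients["other_allele"],
    controls["marker_allele"], controls["other_allele"],
)
patient_freq = patients["marker_allele"] / sum(patients.values())
control_freq = controls["marker_allele"] / sum(controls.values())
print(f"allele frequency: patients {patient_freq:.2f}, controls {control_freq:.2f}")
print(f"chi-square (1 df): {statistic:.2f}")
```

For a single test with one degree of freedom, a statistic above about 3.84 corresponds to p < 0.05; studies that test many markers across the genome need far more stringent thresholds.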
Linkage disequilibrium: Statistical correlations between variations at two loci To understand how geneticists perform association mapping, we need first to examine how variants at different sites across the genome tend to be organized with respect to
each other in natural populations. The basic question is whether alternative variants at one site are randomly associated with variants at other sites, or instead whether they are statistically correlated. For example, in a hypothetical population, nucleotide position 300500 on chromosome
(QTLs)
707
I has two variants (A and T) in equal frequency, while nucleotide position 300600 on the same chromosome has two variants (G and C), also in equal frequency. With free recombination between sites, we would expect to have four different two-site haploid types (haplotypes) among gametes produced by parents in this population: A-G, A-C, T-G, and T-C in frequencies of 25o/o each. In such a case, the presence ofan A at site 1 does not provide any information about the variant at site 2, which is equally likely to be a G or a C. In such a case, no correlations exist between the identities of the alleles at the two sites, and we would then say that variation at these two sites is in linkage equilibrium. An extreme departure from random association would be where only two haplotypes are present, for example A-G and T-C, each in a frequency of 0.5. In the latter case, a perfect positive correlation exists between variants at the two sites: When we find an A at site 1 then we can predict with certainty that we will see a G at the second site. Variants T and C are also perfectly positively correlated with each other. When the variants of two loci are correlated, then we say the variation is in linkage disequilibrium (LD). LD is measured by
a statistic, D, which can be adjusted so that it ranges from 0 (for no correlation, as in the case of linkage equilibrium) to 1 (indicating a perfect correlation between variation at two sites). As we compare sites further and further apart along a chromosome, with greater genetic
map distances between them, LD gradually decays to 0 because more possibilities exist for genetic recombination to disrupt allele associations. Geneticists often illustrate this concept by a plot of pairwise LD values that has the SNPs listed across the top of a "triangular diagram" of diamonds; the shade of color (from purple to green) of each diamond indicates the strength of LD between the SNPs compared. Figure 21.14 shows through an analogy how such plots are made and interpreted; in the figure, the colors indicate the distances between six cities along the west coast of the United States rather than LD values. Rates of recombination in humans average roughly 1 × 10⁻⁸ crossover events per generation between adjacent base pairs. However, recombination tends to be clustered into "hotspots" of genetic exchange. This discontinuity of recombination, as well as the randomness of when and
where recombination occurs over evolutionary time, leads to discontinuities in LD across chromosomes for human populations. These discontinuities can be tracked by the presence of LD blocks on triangular diagrams (Fig. 21.15).
GWAS: Using LD to map genes in populations
The presence of LD in genomes makes it possible for scientists to assay variation at random polymorphisms at different
Chapter 21 Genetics of Complex Traits
Figure 21.14 Analogy for plots of linkage disequilibrium: matrix of physical distances between cities on the west coast of the United States. Progressively lighter shades of brown indicate increasing distance. Turning the matrix in (a) on its side so that the cities are listed linearly across the top (as in b) results in a "triangle diagram" analogous to showing a matrix of linkage disequilibrium (LD) among SNPs listed by order of their location in the genome.
Figure 21.15 Clusters ("blocks") of LD found among SNPs across the human genome. A segment of the genome is expanded in the second row as a block of sequence with the locations of nine SNPs indicated. At the bottom is a "triangle diagram" with block color indicating the strength of the statistical correlation (LD) for pairwise comparisons of the indicated SNPs (progressively weaker correlations are in dark brown, light brown, blue, and white, respectively).
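The LD statistic described in the text can be made concrete with a small calculation. The sketch below computes D as the difference between the observed A-G haplotype frequency and the frequency expected under random association, and then rescales it as r², one common adjustment that runs from 0 to 1. The function name `ld_stats` and the choice of r² are assumptions made for illustration; the text says only that D "can be adjusted."

```python
def ld_stats(hap_freqs):
    """Return (D, r2) for two biallelic sites.

    hap_freqs maps two-site haplotypes ('AG', 'AC', 'TG', 'TC') to their
    frequencies among gametes; missing haplotypes count as frequency 0.
    """
    p_ag = hap_freqs.get("AG", 0.0)
    p_a = p_ag + hap_freqs.get("AC", 0.0)   # frequency of A at site 1
    p_g = p_ag + hap_freqs.get("TG", 0.0)   # frequency of G at site 2
    d = p_ag - p_a * p_g                    # departure from random association
    denom = p_a * (1 - p_a) * p_g * (1 - p_g)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2


# The two hypothetical populations described in the text.
equilibrium = {"AG": 0.25, "AC": 0.25, "TG": 0.25, "TC": 0.25}
complete = {"AG": 0.5, "TC": 0.5}

print(ld_stats(equilibrium))
print(ld_stats(complete))
```

With four equally frequent haplotypes the adjusted statistic is 0 (linkage equilibrium); with only A-G and T-C present it reaches 1.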
[Figure 21.14 panels: (a) Distances between cities; (b) Triangular diagram of intercity distances. Cities shown: San Francisco, San Jose, Fresno, Bakersfield, Los Angeles, San Diego.]
[Figure 21.15 labels: Chromosome; SNP; Intron 7; Block 2.]
locations across chromosomes and then test for statistical correlations between the variants and the phenotype of interest. When carried out on a genome-wide scale, such a survey is called a Genome-Wide Association Study (GWAS). Ideally, you would want to assay every DNA variant in the genome to discover the most likely causal variant(s) for the difference in phenotype. However, genotyping to this resolution would require whole-genome sequencing, and as of this writing in 2013, the cost of high-quality full-genome sequencing is still too high for studies involving the tens of thousands of individuals needed. Thus, researchers performing GWAS to date have used DNA microarrays (gene chips; see Chapter 10) that assay millions of common SNPs tagging different regions of the genome. To carry out a GWAS analysis, scientists assess the frequencies of variants distributed across genomes in large samples of cases and controls, and then use the chi-square test to locate variants whose frequencies are positively correlated with the phenotype of interest (Fig. 21.16). In other words, to identify a QTL, the chi-square test for a tightly linked variant would need to show a significant difference between the frequencies for that variant in the
Figure 21.16 Comparing cases and controls in GWAS surveys of human diseases. The presence of a QTL is indicated if the frequency of a particular SNP allele is higher among patients with the disease than among controls. The chi-square test establishes the significance of this difference for each SNP. In this example, SNP1 is linked to a QTL that contributes to risk for the disease, while SNP2 is not.
[Figure 21.16 data]
Example genotypes shown: cases (with disease) GC CC GG GC CC GG CC GC GG GG; controls (without disease) GC CC GC GC GG CC GC GC GG GC
SNP1, cases: count of G: 2104 of 4000; frequency of G: 52.6%
SNP1, controls: count of G: 2676 of 6000; frequency of G: 44.6%
SNP1 P-value: 5.0 × 10⁻¹⁵
SNP2, cases: count of G: 1648 of 4000; frequency of G: 41.2%
SNP2, controls: count of G: 2532 of 6000; frequency of G: 42.2%
SNP2 P-value: 0.33
SNP... Repeat for all SNPs
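The case-control comparison in Figure 21.16 can be sketched as a chi-square test on a 2 × 2 table of allele counts. This is a minimal illustration, not the procedure of any particular GWAS package. It assumes the standard shortcut formula for a 2 × 2 table and the identity that, for 1 degree of freedom, the upper-tail chi-square probability equals erfc(√(χ²/2)). The function name is hypothetical.

```python
import math


def allele_chi_square(case_g, case_total, ctrl_g, ctrl_total):
    """Chi-square test (1 df) comparing G-allele counts in cases vs controls."""
    a, b = case_g, case_total - case_g      # cases: G, non-G
    c, d = ctrl_g, ctrl_total - ctrl_g      # controls: G, non-G
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail for 1 df
    return chi2, p_value


# Allele counts taken from Figure 21.16.
snp1 = allele_chi_square(2104, 4000, 2676, 6000)
snp2 = allele_chi_square(1648, 4000, 2532, 6000)

print("SNP1: chi2 = %.2f, p = %.2e" % snp1)
print("SNP2: chi2 = %.2f, p = %.2f" % snp2)
```

Run on the figure's counts, SNP1 gives a p value on the order of 10⁻¹⁵ and SNP2 a p value of roughly 0.3, in line with the figure's conclusion that only SNP1 is associated with the disease.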
21.2 Mapping Quantitative Trait Loci
patient and control populations; the null hypothesis is that no difference exists. Performing tests on millions of SNPs leads to a problem of false positives, because with this many loci, a large number of SNPs may appear to differ in their frequencies between affected cases and controls even if these differences reflect only chance sampling errors. Recall that the typical cutoff for a single chi-square test to be considered statistically significant is a p value less than 0.05, meaning that we would observe a deviation as large or larger from the expected equal proportions between cases and controls by chance 5% of the time. Thus, if we repeat the chi-square test for all 1 million SNPs in a genome-wide study, we will expect that 1,000,000 × 0.05 = 50,000 of them appear to be associated with that phenotype only by chance. To deal with this false positive problem, researchers do not consider individual tests in GWAS analysis to be statistically significant unless the probability of observed deviation from the expectations of the null hypothesis (that is, the p value in the chi-square test) is less than 0.05/(1,000,000 SNPs) = 5 × 10⁻⁸. This condition necessitates very large sample sizes of cases and controls. In 2007, one of the first large-scale GWAS explorations of human diseases examined nearly 1 million SNPs in approximately 2000 patients and 3000 control individuals for each of seven complex diseases, including coronary artery disease. Because the most significant associations have a
very small probability of being observed by chance (such as 1 × 10⁻¹⁴), the researchers plotted the -log₁₀ p value [for example, -log₁₀ (1 × 10⁻¹⁴) = 14] for the chi-square tests of statistical significance for the differences between the frequency of each individual SNP among the cases (for example, people with heart disease) versus the controls (no heart disease) (Fig. 21.17). This way of representing the data is often called a Manhattan plot as it looks somewhat like the skyline of Manhattan in New York City; the rare peaks of -log₁₀ (5 × 10⁻⁸) = 7.3 or greater indicate the presence of QTLs for this disease in this study of 1 million SNPs.
If we zoom in on the peak of red dots on chromosome 9, we find a cluster of significantly associated SNPs (Fig. 21.18a). A follow-up study of German patients with myocardial infarctions (heart attacks) also showed strong associations between SNPs in this region and the occurrence of heart attacks before the age of 60 (Fig. 21.18a). Figure 21.19a illustrates the chi-square calculations performed with data for one of these SNPs, called rs1333049, whose association with heart disease is very strong. It is important to note that rs1333049 is only one of several SNPs with strong disease associations that are clustered together in two adjacent blocks of LD (Fig. 21.18b). Because of the linkage disequilibrium between alleles of these polymorphisms, it is not possible to say which if any of them is the causative mutation for coronary artery disease; in fact, the region may contain more than one causative mutation. Very little recombination has occurred in this region of human genomes, so we cannot discriminate where in this part of chromosome 9 is (are) the actual mutation(s) that increase(s) the risk of disease. Furthermore, because no obvious genes have been found to be located within these blocks of LD (Fig. 21.18a), the nature of the genomic region has provided researchers with no clear-cut clues about the specific location of the disease-causing variant(s).
Figure 21.17 Manhattan plot for a GWAS of coronary artery disease in humans. The peak of red dots on chromosome 9 represents a cluster of SNPs with high -log p values indicating very high statistical evidence for an association with heart disease.
[Figure 21.17 axis: Chromosome (1-22, X)]
Figure 21.18 Associations with coronary artery disease of SNPs in a 300 kb region of human chromosome 9. (a) A blown-up scale of the Manhattan plot from Fig. 21.17 shows that multiple SNPs between nucleotides 22,010,000 and 22,120,000 display strong statistical associations with the disease in GWAS surveys of two different populations (red dots and blue triangles). A cutoff for statistical significance in this study of 500,000 SNPs is -log₁₀ [p value] = 7, indicated by the dotted line. The two annotated genes (CDKN2A and CDKN2B) in the region are located at positions that do not show significant associations with the disease. (b) Triangle plot of LD among the SNPs shown in (a). Several blocks of strong LD (dark brown = maximum LD, light brown = significant LD, blue = weakly significant LD, and white = no significant LD) are apparent. Multiple SNPs in LD blocks 1 and 2 such as rs1333049 have strong associations with coronary heart disease.
[Figure 21.18 labels: (a) Chromosome 9 location (bp); genes CDKN2A and CDKN2B; SNP rs1333049. (b) LD block 1; LD block 2.]
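The multiple-testing correction and the -log₁₀ transformation used for Manhattan plots, as described in the text, come down to a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical and the numbers are the ones quoted in the text.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance cutoff that keeps the overall error rate near alpha."""
    return alpha / n_tests


def neg_log10(p_value):
    """Height of a SNP on a Manhattan plot."""
    return -math.log10(p_value)


n_snps = 1_000_000
expected_false_positives = 0.05 * n_snps         # uncorrected tests at p < 0.05
threshold = bonferroni_threshold(0.05, n_snps)   # 5e-8

print(expected_false_positives)
print(threshold)
print(round(neg_log10(threshold), 1))            # plot height of the cutoff
print(round(neg_log10(1e-14), 1))                # a strongly associated SNP
```

The same functions applied to the 500,000-SNP study of Figure 21.18 give a cutoff of 1 × 10⁻⁷, that is, a plot height of 7.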
However, even if we cannot identify the critical mutation(s), the lack of recombination in the region also means that SNP alleles with strong disease associations can help physicians identify individuals who might have an elevated likelihood of coronary artery disease. We can quantify the average increase in risk associated with a particular SNP allele by calculating the allelic odds ratio (Fig. 21.19b). In the case of SNP rs1333049, one of the SNPs with the strongest disease association in this region, the presence of a C in the genome confers a 1.38 times greater risk of having coronary artery disease than the presence of a G. We can also calculate a genotypic odds ratio by comparing the risk allele homozygote or the heterozygote genotype frequencies between cases and controls to those for the nonrisk allele homozygote (Fig. 21.19c). It is important to appreciate that even a person with the genotype of greatest risk is not guaranteed to have a heart attack before the age of 60; the chances of such an event are simply higher than for people of other genotypes.
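The odds ratios discussed above follow directly from the allele and genotype counts reported for rs1333049 in Figure 21.19. The sketch below recomputes them; the helper name `odds_ratio` is hypothetical, and the counts are copied from the figure.

```python
def odds_ratio(case_risk, case_ref, ctrl_risk, ctrl_ref):
    """Odds of the risk class among cases divided by the odds among controls."""
    return (case_risk / case_ref) / (ctrl_risk / ctrl_ref)


# Allele counts (C is the risk allele, G the reference allele).
allelic_or = odds_ratio(2132, 1716, 2783, 3089)

# Genotype counts, each compared with the GG homozygote.
homozygote_or = odds_ratio(586, 378, 676, 829)      # CC relative to GG
heterozygote_or = odds_ratio(960, 378, 1431, 829)   # CG relative to GG

print(round(allelic_or, 2))
print(round(homozygote_or, 2))
print(round(heterozygote_or, 2))
```

The allelic and heterozygote ratios round to the 1.38 and 1.47 given in the figure, and the homozygote ratio comes out at about 1.9.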
Figure 21.19 Calculations of association p values and allelic odds ratios. (a) The data presented are those for the two alleles (C and G) for SNP rs1333049 from Fig. 21.18. The degrees of freedom (df) = 1 because the data represent two classes. The C allele of this SNP is significantly associated with coronary artery disease. (b) The allelic odds ratio indicates the magnitude of risk conferred by the tested SNP. The ratio of 1.38 in this study means that individuals having the C allele of this SNP have a 1.38 times greater risk of having the disease than people having the G allele. (c) The genotypic odds ratio indicates the magnitude of risk conferred by particular genotypes at the tested SNP.
(a) Determining the p value for the association of a SNP with the disease
Allele: C, N (%); G, N (%)
Cases: C, 2132 (55%); G, 1716 (44%)
Controls: C, 2783 (47%); G, 3089 (53%)
χ² (1 df): 59.7; p value: 1.1 × 10⁻¹⁴
(b) Calculating the allelic odds ratio
Allelic odds ratio = (Odds of C allele given case status)/(Odds of C allele given control status) = (Freq of C in case/Freq of G in case)/(Freq of C in control/Freq of G in control) = (Case C/Case G)/(Control C/Control G) = (2132/1716)/(2783/3089) = 1.38
(c) Calculating the genotypic odds ratio
Genotype: CC, N (%); CG, N (%); GG, N (%)
Cases: CC, 586 (30.5); CG, 960 (49.9); GG, 378 (19.6)
Controls: CC, 676 (23.0); CG, 1431 (48.7); GG, 829 (28.2)
χ² (2 df): 59.6; p value: 1 × 10⁻¹³
Homozygote (CC) odds ratio (relative to GG) = (586/378)/(676/829) = 1.91
Heterozygote (CG) odds ratio (relative to GG) = (960/378)/(1431/829) = 1.47
QTLs Reflect Population Histories
A QTL is by definition a genetic variant that contributes in a detectable way to a particular complex trait in a particular population. This definition poses two major challenges to QTL detection. First, the number of QTLs depends on the amount of variation in the population, and the amount of variation in turn reflects the history of the population. If, for example, the population were an inbred population of genetically identical plants, you would never be able to find a QTL influencing a trait such as viability in stress conditions like drought, because no genetic variation exists in the population. A large number of genes must exist that encode proteins which could potentially influence a plant's survival during a drought, but none of these would constitute a QTL in this population because all of the plants have the same alleles of all of these genes. In other words, the set of QTLs that could in theory be found by experiment depends on the existence of mutations that occurred previously in the population. A second challenge to searches for QTLs is that our ability to find any QTL depends both on the frequency of that particular QTL allele in the population and on the strength of its contribution to the phenotype. If the allele is rare, statistical significance indicating the presence of the QTL might be seen only in very large-scale studies that include hundreds of thousands of individuals. And matters become further clouded when many different QTLs for a trait exist in the population, each of which only contributes a small amount to the phenotype.
Striking contrasts that have been observed between GWAS studies in humans and in domesticated animals such as dogs and horses illustrate how QTLs reflect population histories.
Effects of the recent human population explosion
Human height provides an instructive example of the challenges facing geneticists studying many complex traits and diseases in humans. The heritability (h²) of human height is over 80% in most populations, yet the first small GWAS surveys found no major "height" genes in humans (excepting obviously rare and unusual states such as dwarfism). A much more massive, recent GWAS investigation of more than 180,000 people found 180 QTLs that influence height. These genes are, as expected, enriched for growth-related processes that influence adult height. However, each of these QTLs makes only a very small contribution to the phenotype, and all of these 180 QTLs put together explain only about 10% of the total variation in adult human height. In other words, the population of humans on the earth must have genomes containing many
Figure 21.2O Genomic distribution across the 23 human chromosomes of published statistically significant GWAS associations for 17 medically important traits.
Published Genome-Wide Associations Through 06/2013
Published GWA at p
red and colorless2 → blue; (Cross 4) colorless1 → purple and colorless2 → purple. c. Cross 2'
5.
gene, a true revertant
[su(FC0)] (FC7 and others) would all have the same sign, opposite to that of FC0, and could affect different base pairs. Likewise, independent su(FC7) mutants would all have the same sign, opposite to that of FC7 and the same as FC0, and could affect different base pairs, etc.
genes involved
d. F2 only. (Cross 1) 2 purple: I blue: 1 white; (Cross 2) 1 purple: white; (Cross 3) 2 purple: 1 red: 1 blue; (Cross 4) all purple.
A true revertant; b. When recombined with a wild-type rIlB+ would never produce rIIB mutants, while an intragenic suppressor mutant would. c. FC7 mutants (and not FCO mutants) could be recombined with the original FCO mutants to generate an rIIB+ phage. d. Independent suppressors of FCO a.
A: 5' AAC → 5' UAC, B: Gln3 (5' CAA) → Leu (5' CUA); b. A: 5' CUA → 5' CCA, B: Thr5 (5' ACU) → unchanged (5' ACC); c. B: Gln8 (5' CAA) → Leu (5' CUA), A: Lys11 (5' AAG) → Stop (5' UAG) and protein A truncated after 10 amino acids; d. It doesn't happen because random point mutation would affect two proteins instead
7.
a.
I
37.
a. 18 14 9 10 2l X + D -+ B -+ A -+ C +
39.
a. Successful, immediate, prolonged; b. unsuccessful; c. successful,
thymidine
delayed, prolonged; d. successful, immediate, prolonged; e. unsuccessful; f. unsuccessful; g. successful, delayed, prolonged; h. unsuccessful; i. successful, immediate, short term; j. successful,
of one, meaning that evolutionary pressure exists to avoid overlapping ORFs. Also, it is difficult for ORFs to evolve under constraint to encode functional proteins on both DNA strands.
High [Mg2+]: polypeptides composed of Cys and Val, low [Mg2+]: no polypeptides; b. high or low [Mg2+]: polypeptides composed of His, Ala, Cys, and Met; c. high or low [Mg2+]: polypeptides composed of Val, Cys, Met, and Tyr.
9.
a.
immediate, prolonged.
41.
a. Two; b. 1/16 α1α1 β1β1 : 1/8 α1α2 β1β1 : 1/16 α2α2 β1β1 : 1/8 α1α1 β1β2 : 1/4 α1α2 β1β2 : 1/8 α2α2 β1β2 : 1/16 α1α1 β2β2 : 1/8 α1α2 β2β2 : 1/16 α2α2 β2β2.
11. a. 8: UUU, UUG, UGU, GUU, UGG, GUG, GGU, GGG; b. 6: Phe, Leu, Val, Cys, Trp, Gly; c. The same amino acid is specified by several different codons: GUU or GUG = Val; GGU or GGG = Gly; d. f(UUU) = 27/64, f(UUG) = f(UGU) = f(GUU) = 9/64, f(UGG) = f(GUG) = f(GGU) = 3/64, f(GGG) = 1/64; e. Phe (UUU) = 27/64, Leu (UUG) = 9/64, Val (GUU or GUG) = 12/64, Cys (UGU) = 9/64, Trp (UGG) = 3/64, Gly (GGU or GGG) = 4/64; f. Phe = UUU; Val, Leu, and Cys are made of codons that are G + 2U, and Trp and Gly are from codons that are 2G + U.
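The codon and amino acid frequencies in this answer can be checked by enumerating every triplet of a random U/G copolymer. The sketch below assumes a 3:1 ratio of U to G, which is inferred from the value f(UUU) = 27/64 rather than stated in the answer itself. The function name is hypothetical; the codon assignments are those of the standard genetic code.

```python
from fractions import Fraction
from itertools import product


def codon_frequencies(base_freqs):
    """Frequency of each triplet when bases are incorporated at random."""
    freqs = {}
    for codon in product(base_freqs, repeat=3):
        p = Fraction(1)
        for base in codon:
            p *= base_freqs[base]
        freqs["".join(codon)] = p
    return freqs


# Standard genetic code, restricted to codons made only of U and G.
CODON_TO_AA = {
    "UUU": "Phe", "UUG": "Leu", "UGU": "Cys", "GUU": "Val",
    "UGG": "Trp", "GUG": "Val", "GGU": "Gly", "GGG": "Gly",
}

# Assumed base composition: 3/4 U and 1/4 G.
codons = codon_frequencies({"U": Fraction(3, 4), "G": Fraction(1, 4)})

amino_acids = {}
for codon, p in codons.items():
    aa = CODON_TO_AA[codon]
    amino_acids[aa] = amino_acids.get(aa, 0) + p

for aa, p in sorted(amino_acids.items()):
    print(aa, p)
```

Fractions are printed in lowest terms, so 12/64 for Val appears as 3/16 and 4/64 for Gly as 1/16.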
E
13. The 85 amino acids could have come from an unspliced intron due to mutation in a splice site sequence, or from a mutation caused by insertion ofDNA sequence. An intron sequence would be present in genomic DNA, but an inserted sequence would not.
F
15.
E
rI II
E/p p
B/b
sensitivity on one X chromosome, and likely has normal red and green rhodopsin-like genes on her other X chromosome, and a normal autosomal (blue) photoreceptor gene; she may express four different rhodopsin-like proteins. c. Rare tetrachromat males can have three different X-linked rhodopsin-like genes on a single X chromosome created via unequal crossing-over
is very close to the mutation site.
21. Required to add the appropriate ribonucleotide to a growing RNA chain.
23.
GeneF: bottom strand; gene G: top strand.
25.
Base
27
Chapter 8 f.2; g.9;h.
14;
i.3; j.
13;
k.
1; l. 7;
m. 15;
n. 11; o. 4;P.16.
3.
a. GUGU GUGUGUoTUGUGUGUG;b. GUUG GUUG GU UG GU UG GU; C. GUG UGU GUG UGU; d' GUG UGU GUG UGU; C. GUGU GUGU GUGU OT UGUG UGUG UGUG.
Anticodons are:5' IAC,5'IAG,5' IGC,5' IGU,5' IGA, 5' IGG, 5' ICC, 5' ICG; b. All anticodons of the form 5' UNN can exist except for 5' UAU, 5' UCA, 5' UUA; c. The Wobble U can be modified as xmsU, or xmts'U in any anticodon. The xo'U modification is restricted to the following anticodons: 5' UGG, 5' UAC, 5' UGC, 5' UCC, 5' UCG, 5' UGA, 5' UAG, 5'UGU; d. 31 + I for tRNAs"'.
a.
19. The in-frame stop codon caused by the original frameshift mutation
between red and green genes.
a. 5; b. 10; c. 8; d. 12; e. 6:
a. The trpA gene has hotspots and "coldspots" for crossing-over. b. The central region of the gene has a lower frequency of crossing-over than does elsewhere.
17.
45. a. Females have 2 copies of each of the X-linked photoreceptor genes, while males have only 1 copy (usually one red and one green gene). If one of the gene copies in a female mutates to produce a rhodopsin-like protein with an altered spectral sensitivity, she could be a tetrachromat. b. A woman with a red/green colorblind son has a rhodopsin-like gene that has an altered spectral
1.
:
.
pairing between the codon in the mRNA and the anticodon in the tRNA is responsible for aligning the tRNA that carries the appropriate amino acid to be added to the polypeptide chain. a. Translation. b. Tyrosine (Tyr) is the next amino acid to be added to the C terminus of the growing polypeptide, which will be nine amino acids long when completed. c. The carboxy-terminus of the growing polypeptide chain is tryptophan. d. The first amino acid at the N terminus would be fMet in a prokaryotic cell and Met in a eukaryotic cell. The mRNA would have a cap at its 5' end and a poly-A tail at its 3' end in a eukaryotic cell but not in a prokaryotic cell. If the mRNA were sufficiently long, it might encode several proteins in a prokaryote but not in a eukaryote.
Brief Answer
29.
a. 1431 base pairs; b. 5' ACCCUGGACUAGUGGAAAGUUAACU - Pro Trp Thr Ser Gly Lys Leu Thr Tyr - C.
31. Mitochondria do not
use the same genetic code; mutate the 5' CUA codons in the mitochondrial gene to 5' ACN.
35.
g.
b. The DNA fragment can insert into the polylinker in either of two orientations. c. one; d. one band 4290 bp long; e. two; f. two * 19 bp
a. 4906 bp; b. 4246 bases; c. 1804 bases; d. 2nd exon; e. 4th exon; f. none; g. none; h. 4th intron; i. 4th exon; j. none; k. 768 bp (assuming that stop codon is included); l. 254 amino acids; m. 3rd intron; n. 2nd intron; o. 5th exon/3' UTR; p. alternative splicing in which the 2nd exon is connected either to the 4th or 5th exon; q. The 4th exon is out of frame with respect to the 2nd exon, and the 5th exon does not encode any B2 lens crystallin protein. The improperly processed mRNA would be expected to encode ~20 amino acids of junk because ~1/20 codons in the genetic code are stop codons, and so ~1/20 triplets in random DNA sequence would be stop codons.
37
and 477lbp.
13. The white colonies contain recombinant plasmids
- an insert in the EcoRI site disrupts the lacZ gene; the blue colonies contain vector ligated with no insert - lacZ is expressed.
15. You would need (a) only; DNA polymerase synthesizes single strands of DNA using the DNA molecules to be sequenced as the template.
17.
genotype has the same mutant phenotype as the mutant / mutant genotype, the implication is that the mutant allele has the same level of activity as an allele with known zero activity (the deletion). A limitation of this assumption is that some mutant phenotypes depend on a minimum (non-zero) threshold level of gene activity. b. mRNA or protein levels can be measured.
a. All but c and l; b. all but c and l; c. all but c and l through haploinsufficiency; abdefghi could be dominant negative; all but c and l could be neomorphic; f could be neomorphic due to ectopic expression and the others due to expression of an abnormal protein.
43. a.5'CCG; b. Trp at amino acid 5 is compatible with the function of the enzyme. c. A nonsense mutation near the start of the ORF; d. nonsense suppressor tRNA.
45. Its anticodon recognizes
contig II: 5, AGCAAATTACAGCAATATGAAGAGATCMACAGT
47. In
19. Primer 1: 5' CGGATCCCCTAAGATGAATT 3'; primer 2: 5, GCCGMGCGCGCGGAATT 3'
21.
a. Chimpanzees6-7
23.
a. Sequence 1
millionyears ago (mya); mice75 mya; dogs 92 mya; chickens 310 mya; frogs 360 mya; b. missense mutations; c. missense mutations; d. missense mutations.
: genomic DNA; sequence 2 Arg-Asn-G1n-Leu-C.
25. Bacterial
:
cDNA; b. N-Va1-Glu-
genes do not have introns. As bacterial mRNAs do not have
poly-A tails, bacterial mRNAs cannot be purified using oligo-dT.
of3.
the second bacterial species, only a single tRNArv' and a single
3,
b. Both DNA strands are sequenced. c. Six random 27 nt sequences from a genome that is 3 billion bp would have a negligible chance of overlapping. d. 23 contigs if the genome is female, 24 if it is male.
27.
4 bases instead
II:
contig I: 5' TCTTTTAAAAATCTCATTTCCTTTAGGGCATTTT 3'
41. Aminoacyl-tRNA synthetases recognize the anticodon of a tRNA, and also other features of the tRNA. In this case, those other features are sufficient, even though the anticodon is unusual.
a. There are two contigs: contig I: sequences 2, 4, and 6; contig
sequences 1,3 and 5.
a. If a mutant / deletion
39.
5, CGGA f CCCC'IAAGATGAATTC'| ']'GGTTCTTCAGCGAATT CCGCGCGCATCGGC 3, 3, GCCTAGGGGAT'TC'i ACTTAAGAACCAAGAAGTCGCTTAA GGCGCGCG]AGCCG 5'
UAC 3'; c. N
33. Order: c e i fa kh dbj
Section
Genes that encode proteins can be identified by searching for ORFs. Genes that encode RNAs are more difficult to identify.
29. Human
genes have exons and bacterial genes do not, and the human genome has much more space between genes (repeated elements) than bacterial genes do.
tRNAGln gene exist; mutation of either to a nonsense suppressor
would be lethal.
31. U6
Chapter 9 1. 3.
a. 10;b. 12;c.9;d.7;e.
6;f.2;9.
8; h. 11; i. 5; j.
4;k. 3;1.
3'
3'
bacteriophage DNA, EcoRI; B
EcoRI; C
:
:
human genomic DNA, bacteriophage DNA, HpaII; D human genomic
:
a. longer; b. different; The enzyme would cut a sites
from B-cell genomic libraries (VDJ recombination) and cDNA libraries (RNA splicing).
distinct subset of
in the genome ofeach cell, resulting in overlapping
gene products
with its 25,000 genes. For example, one gene can generate many different proteins through alternate splicing, and posttranslational processing and modifi cation.
GCTAA5'
DNA, HpaII.
7.
Sequence antibody gene clones
5'AATTCGATT3'
EcoRI sites would be expected to appear once in 4096 bp on average in random DNA sequence, and this short sequence has two EcoRI sites.
:
35.
37. The human genome encodes more than 25,000 different
5'AATTCGCTGAAGAACCAAG
3,TCTACTTAA5' 3' GCGACTTCTTGGTTCTTAA5' 3'
5. A
a. The gene is likely to be a transcription factor. b. The two genes likely arose by duplication and divergence.
1.
Three DNA molecules result: 5,AGATG
33.
39.
a. Protein phosphatase 1 regulatory subunit 14A; b. perfect match; c. The 29 nt sequence is conserved 100%. If the entire human coding sequence (cDNA = 444 nt) is entered into BLAST and compared with mouse cDNA sequences, the coding sequences are 85% identical. The mouse and human proteins are 147 aa and 85% identical.
41. Aberrant crossing-over due to misalignment of α1 and α2 during meiosis.
fragments.
9.
11.
)
You would need (c) and (d); the restriction endonuclease cleaves the vector and the insert, generating compatible ends that are joined through formation of phosphodiester bonds by DNA ligase.
:
:
plasmid sequences; red insert: 5, CGGA:ICCCC'IAAGA'J.'GAATTCGCTGAAGAACCAAGA AT'I CCGCGCGCATCGGC 3' 3, GCC'TAGGGGATT'CTACTTAAGCGACTTCTTGGTTCTTAA GGCGCGCG]AC]CCG 5' a. Green
Chapter 10
1. a. 5, b. 3, c. 8, d. 6, e. 2, f. 7, g. 1, h. 4, i. 10, j. 9.
3. Non-coding.
5. The regions in which the two men share many
SNPs must have been inherited from a common ancestor who lived only a relatively few generations ago; in other words, Watson and Venter are related through a recent common ancestor. The regions of unshared SNPs
located on the genome map. The microarray would come with information about the name of each SNP, and you could look it up in dbSNP. d. 70 MB between SNP1 and SNP8 /190 MB on
must have been obtained from ancestors who were not common to the two men. Shared and unshared regions are interspersed because recombination in past generations since the common ancestor reshuffled the shared and unshared regions.
7.
9.
a. SSR loci are polymorphic in terms of the number of times the repeat unit is present. b. slipped mispairing (stuttering) during DNA replication; c. Different; CNV copy number varies because of crossing-over between non-sister chromatids that are locally misaligned in the repeat region. d. SSR loci occur every 30,000 bp in the genome. Given that the genome is 3 billion bp, the number of SSR loci is fixed at 100,000. The mutations at SSR loci change the number of repeats at the locus, but do not increase the number of loci. For SNPs, the mutations generate new loci, and the number of possible loci is the number of bp in the genome.
chromosome4
25.
11.
27. a. To make sure that the patient's PKU is caused by mutations in that person's phenylalanine hydroxylase gene, and to understand precisely how the mutations affect this gene; b. (660 g/mole bp) ÷ (6 × 10²³ bp/mole bp) = 1.1 × 10⁻²¹ g/bp; (3 × 10⁹ bp/haploid genome) × (1.1 × 10⁻²¹ g/bp) = 3.3 × 10⁻¹² g/haploid genome; (1 × 10⁻⁹ g DNA in sample) ÷ (3.3 × 10⁻¹² g/haploid genome) = 300 haploid genomes in sample = 300 template molecules in 1 ng of genomic DNA. c. 300 × 2²⁵ = 10,066,329,600 (about 10 billion) PCR product molecules; d. (10 × 10¹⁰ molecules) × (10³ bp/molecule) × (1.1 × 10⁻²¹ g/bp) = 1 × 10⁻⁷ g = 100 ng of PCR products.
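The unit conversions in parts b and c of this answer can be retraced step by step. The sketch below uses the same rounded constants as the answer (660 g per mole of base pairs, 6 × 10²³ for Avogadro's number, a 3 × 10⁹ bp haploid genome) and assumes 25 PCR cycles with perfect doubling, as implied by the factor 2²⁵. The variable names are illustrative.

```python
AVOGADRO = 6e23              # rounded, as in the answer
GRAMS_PER_MOLE_BP = 660      # average mass of one mole of base pairs
HAPLOID_GENOME_BP = 3e9

# Mass of a single base pair and of one haploid genome.
grams_per_bp = GRAMS_PER_MOLE_BP / AVOGADRO
grams_per_genome = HAPLOID_GENOME_BP * grams_per_bp

# Template molecules in 1 ng (1e-9 g) of genomic DNA.
templates = 1e-9 / grams_per_genome

# Product molecules after 25 cycles, starting from the rounded 300 templates.
products = 300 * 2 ** 25

print(grams_per_bp)
print(grams_per_genome)
print(round(templates))
print(products)
```

The unrounded template count is about 303, which the answer rounds to 300 before applying the amplification factor.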
a. BLAST your PCR primer sequence against the human genome sequence. b. The chance that any particular 16 bp sequence will be found in the genome is 1/4¹⁶ = 1/4,294,967,296. There are only 3 billion bp in the haploid genome, so any 16 bp primer sequence is probably in the genome only once. The chance that a particular 15 bp sequence will be found in the genome is 1/4¹⁵ = 1/1,073,741,824. The expectation is that a 15 bp sequence would be found 3 times in the genome, so the lower limit is 16 bp.
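The primer-length argument above can be expressed as an expected-count calculation. The sketch below treats the genome as a random sequence with equal base frequencies, which is the simplification the answer itself relies on. The function names are hypothetical.

```python
def expected_occurrences(primer_len, genome_bp=3_000_000_000):
    """Expected number of exact matches to one primer in a random genome."""
    return genome_bp / 4 ** primer_len


def shortest_unique_length(genome_bp=3_000_000_000):
    """Smallest primer length expected to occur less than once by chance."""
    length = 1
    while expected_occurrences(length, genome_bp) >= 1:
        length += 1
    return length


print(4 ** 15, 4 ** 16)
print(round(expected_occurrences(15), 2))
print(round(expected_occurrences(16), 2))
print(shortest_unique_length())
```

Real genomes contain repeats and uneven base composition, so this is a lower bound on primer length rather than a guarantee of uniqueness, which is why part a recommends checking the primer with BLAST.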
29.
:
U2, or 313 and Il2, or 212 and 713:' b. a2l3heterozygote (TC/GT); you could not distinguish between
13. a. Diploid genotypes
35.
a.
a. Semen contains haploid sperm, which collectively contain the diploid genome of the individual. b. purple; c. green; d. The rapist's mouth swab DNA should match every band in the semen sample. This is not the case for any of individuals 1-4. e. The rapist could be related to individual 2; f. 45 and 42.
21. ASO for HbBA: 5'-CTGACTCCTGAGGAGAAGTCG or 3'-GACTGAGGACTCCTCTTCAGC; ASO for HbBS: 5'-CTGACTCCTGTGGAGAAGTCG or 3'-GACTGAGGACACCTCTTCAGC.
23. a. Father: 1-AC, 2-GG, 3-AG, 4-GT, 5-CC, 6-CG, 7-AG, 8-GT; Mother: 1-AA, 2-CG, 3-GG, 4-AT, 5-CC, 6-CC, 7-AG, 8-TT; b. locus 4; c. The NIH RefSeq database has all of the SNPs in dbSNP
If the disease allele is a nonsense, missense, or a splice site mutation that led to a stable misspliced mRNA, cDNA sequences could be useful. Mutations that result in lower levels of normal mRNA or the absence of mRNA would be useful only if many cDNAs are sequenced. b. Western blotting would be useful if a mutation changes the levels or size of the normal protein. Mutations resulting
the bottom graph.
19.
a. Poly-A is added so that the genomic fragments will hybridize with the oligo-dT on the microarray. The As are added to the 3' ends so that the oligo-dT primes DNA synthesis using the genomic fragments as templates. b. so that the fluorescent tag on the next base can be detected; c. so that DNA polymerase will add only one fluorescent base at a time; after the color of the fluor is recorded, the block is removed so that the next fluorescent base complementary to the template may be added.
Top graph: -15 CAG repeats; bottom graph: -20 CAG repeats; b. (1) Repeat number varies in different sperm, so mutations occur in germ-line cells; (2) The larger the original number of repeats in the
SSR markers are highly polymorphic. Therefore, you would need many more SNP loci to build up the odds that two people with the same alleles are unlikely to exist. b. Pedigreed dogs are highly inbred, so there will be fewer alleles of polymorphic markers to distinguish individual animals.
a. The SNP could be closely linked to this disease gene. b. (1) Determine whether or not the SNP locus is rare in humans (is it in dbSNP?). (2) Determine if the SNP is within a gene and if so it is likely to be nonanonymous. (3) Check if other families are heterozygous for nonanonymous SNPs in the same gene.
33.
a.
17. a. Each SNP marker has only two alleles;
a. Locus 1: not informative because neither parent is a double heterozygote; Locus 2: not informative because neither parent is a double heterozygote; Locus 3: potentially informative, but only if the mother is a heterozygote for the disease gene and only if the child is an AA or CC homozygote (if the child was AC you could not tell which marker allele came from Mom and which from Dad); Locus 4: potentially informative if the mother is a carrier for the disease mutation. b. For Locus 3, you could solve the phase problem only if you already know the disease gene and Locus 3 are tightly linked; for Locus 4, the best assumption is that the carrier mother has a chromosome with both the disease-causing mutation and the C allele of the SNP because this is a consanguine-
need to be identified before the fetus could be tested.
713 and
allele inherited, the more likely it is that the repeat number in the sperm will vary and the greater the variation; (3) The repeat numbers tend to increase, not decrease. c. Expansion of pre-mutation alleles can occur even during spermatogenesis in a person without the disease. d. If the repeat expansion occurs mostly during spermatogenesis, then most blood cells of the person in the top graph should have alleles with 15 or 62 repeats, 20 or 48 repeats for the person in
:
31. a. Zero; b. 0.75%; c. The parents are likely to carry different alleles, and in each parent, the mutation in the very large CFTR gene would
TC/GT and TT/GC.
i5.
-37o/o.
1/11 = 9%; b. The maximum Lod score = log [(0.89)¹⁰ (0.11)/(0.50)¹¹] = log 70 = 1.8. It is 70× more likely that the SNP and the disease allele are linked with RF = 9% than that they are unlinked; as Lod score