Population Genetics Joe Felsenstein GENOME 453, Autumn 2011 Population Genetics p.1/74
Godfrey Harold Hardy (1877-1947) and Wilhelm Weinberg (1862-1937)
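A Hardy-Weinberg calculation
Genotype counts:       5 AA      2 Aa      3 aa
Genotype frequencies:  0.50      0.20      0.30
Gamete (gene) frequencies:
  A: 0.50 + (1/2) 0.20 = 0.6
  a: (1/2) 0.20 + 0.30 = 0.4
Random union of gametes:
            0.6 A      0.4 a
  0.6 A     0.36 AA    0.24 Aa
  0.4 a     0.24 Aa    0.16 aa
Result: 0.36 AA, 0.48 Aa, 0.16 aa
These offspring again produce gametes 0.6 A and 0.4 a (each Aa contributes 1/2 A and 1/2 a).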
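The calculation on this slide can be replayed in a few lines. This is a minimal sketch, not part of the original lecture; the function names `allele_freqs` and `hardy_weinberg` are made up for illustration.

```python
def allele_freqs(n_AA, n_Aa, n_aa):
    """Return (p, q), the frequencies of A and a, from genotype counts."""
    total = n_AA + n_Aa + n_aa
    # AA individuals make only A gametes; Aa individuals make A half the time.
    p = (n_AA + 0.5 * n_Aa) / total
    return p, 1.0 - p


def hardy_weinberg(p):
    """Return offspring genotype frequencies (AA, Aa, aa) under random mating."""
    q = 1.0 - p
    return p * p, 2 * p * q, q * q


p, q = allele_freqs(5, 2, 3)      # the 5 AA, 2 Aa, 3 aa example
offspring = hardy_weinberg(p)
print("gene frequencies:", p, q)
print("offspring genotype frequencies:", offspring)
```

Feeding the offspring frequencies back through the same gamete calculation gives 0.6 and 0.4 again, which is the point of the slide.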
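Hardy-Weinberg mathematics
Genotypes:             AA    Aa    aa
Genotype frequencies:  P     Q     R
Gene frequencies: $p = P + \tfrac{1}{2} Q$ for A, and $q = \tfrac{1}{2} Q + R$ for a
Random union of gametes:
          p A          q a
  p A     $p^2$ AA     $pq$ Aa
  q a     $pq$ Aa      $q^2$ aa
Result: $p^2$ AA, $2pq$ Aa, $q^2$ aa
These offspring again produce gametes p A and q a (each Aa contributes 1/2 A and 1/2 a).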
Calculating the gene frequency (two ways)
Suppose that we have 200 individuals: 83 AA, 62 Aa, 55 aa
Method 1. Calculate what fraction of gametes bear A:
Genotype   Number   Genotype frequency   Gametes produced
AA         83       0.415                all A
Aa         62       0.31                 1/2 A, 1/2 a
aa         55       0.275                all a
Gamete frequencies: 0.57 A, 0.43 a
Calculating the gene frequency (two ways)
Suppose that we have 200 individuals: 83 AA, 62 Aa, 55 aa
Method 2. Calculate what fraction of genes in the parents are A:
Genotype   Number   A's    a's
AA         83       166    0
Aa         62       62     62
aa         55       0      110
Total      200      228    172
228 / 400 = 0.57 A
172 / 400 = 0.43 a
(228 + 172 = 400)
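Both methods can be checked against each other directly. The sketch below is illustrative only; the function names are invented here, and the counts are the 83 / 62 / 55 example from the slide.

```python
counts = {"AA": 83, "Aa": 62, "aa": 55}


def freq_A_from_gametes(counts):
    """Method 1: weight each genotype frequency by the fraction of its gametes that carry A."""
    n = sum(counts.values())
    return (counts["AA"] / n) * 1.0 + (counts["Aa"] / n) * 0.5


def freq_A_from_gene_copies(counts):
    """Method 2: count A copies among all gene copies (two per individual)."""
    a_copies = 2 * counts["AA"] + counts["Aa"]
    all_copies = 2 * sum(counts.values())
    return a_copies / all_copies


m1 = freq_A_from_gametes(counts)
m2 = freq_A_from_gene_copies(counts)
print(m1, m2)
```

The two functions are algebraically the same quantity, so they agree for any set of counts, not just this one.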
A numerical example of natural selection
Genotypes:                                        AA        Aa        aa
Relative fitnesses (assume these are viabilities): 1         1         0.7
Initial gene frequency of A = 0.2
Initial genotype frequencies among newborns (from Hardy-Weinberg):
                                                  0.04      0.32      0.64
                                                  x 1       x 1       x 0.7
Survivors (these are relative viabilities):       0.04   +  0.32   +  0.448  =  Total: 0.808
Genotype frequencies among the survivors (divide by the total):
                                                  0.0495    0.396     0.554
Gene frequency among the survivors:
  A: 0.0495 + 0.5 x 0.396 = 0.2475
  a: 0.554 + 0.5 x 0.396 = 0.7525
Genotype frequencies among the next generation of newborns:
                                                  0.0613    0.3725    0.5663
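The same steps (newborns, survivors, renormalize, count A) can be written out as code. This is a sketch under the slide's assumptions of viability selection and random mating; `one_generation` and the dictionary layout are illustrative choices, not anything from the lecture.

```python
def one_generation(p, fitness):
    """Follow one generation of viability selection, returning the intermediate steps."""
    q = 1.0 - p
    newborns = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}      # Hardy-Weinberg
    survivors = {g: newborns[g] * fitness[g] for g in newborns}  # relative numbers
    total = sum(survivors.values())                              # less than 1 in general
    after = {g: survivors[g] / total for g in survivors}         # divide by the total
    p_new = after["AA"] + 0.5 * after["Aa"]                      # gene frequency of A
    return total, after, p_new


fitness = {"AA": 1.0, "Aa": 1.0, "aa": 0.7}
total, after, p_new = one_generation(0.2, fitness)
print("total surviving:", round(total, 3))
print("survivor genotype frequencies:", {g: round(f, 4) for g, f in after.items()})
print("new gene frequency of A:", round(p_new, 4))
```

Calling `one_generation` again with `p_new` carries the example forward another generation.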
The algebra of natural selection
Genotype:             AA             Aa              aa
Frequency:            $p^2$          $2pq$           $q^2$
Relative fitnesses:   $w_{AA}$       $w_{Aa}$        $w_{aa}$
After selection:      $p^2 w_{AA}$   $2pq\,w_{Aa}$   $q^2 w_{aa}$
Note that these don't add up to 1.
New gene frequency is then (adding up A bearers and dividing by everybody):
$$p' = \frac{p^2 w_{AA} + \tfrac{1}{2}\,2pq\,w_{Aa}}{p^2 w_{AA} + 2pq\,w_{Aa} + q^2 w_{aa}} = p\,\frac{p\,w_{AA} + q\,w_{Aa}}{p^2 w_{AA} + 2pq\,w_{Aa} + q^2 w_{aa}} = p\,\frac{\bar{w}_A}{\bar{w}}$$
where $\bar{w}_A = p\,w_{AA} + q\,w_{Aa}$ is the mean fitness of A and $\bar{w} = p^2 w_{AA} + 2pq\,w_{Aa} + q^2 w_{aa}$ is the mean fitness of everybody.
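The compact form, new frequency = p times (mean fitness of A) / (mean fitness of everybody), is easy to iterate. The sketch below assumes discrete generations and random mating as on the slide; `next_p` and `trajectory` are hypothetical names, and the fitness values in the usage line are just the 1, 1, 0.7 example reused.

```python
def next_p(p, w_AA, w_Aa, w_aa):
    """One generation of selection: p' = p * w_A / w_bar."""
    q = 1.0 - p
    w_A = p * w_AA + q * w_Aa                                  # mean fitness of A
    w_bar = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa     # mean fitness of everybody
    return p * w_A / w_bar


def trajectory(p0, w_AA, w_Aa, w_aa, generations):
    """Gene frequency of A in each generation, starting from p0."""
    ps = [p0]
    for _ in range(generations):
        ps.append(next_p(ps[-1], w_AA, w_Aa, w_aa))
    return ps


ps = trajectory(0.2, 1.0, 1.0, 0.7, 50)
print([round(p, 4) for p in ps[:5]])
```

Because the recursion depends only on fitness ratios, multiplying all three fitnesses by the same constant leaves the trajectory unchanged.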
Is weak selection effective?
Suppose (relative) fitnesses are:
  AA            Aa       aa
  $(1+s)^2$     $1+s$    1
So in this example each change of a to A multiplies the fitness by (1+s), so that it increases it by a fraction s.
The time for gene frequency change, in generations, turns out to be:
           Change of gene frequencies
  s        0.01 to 0.1   0.1 to 0.5   0.5 to 0.9   0.9 to 0.99
  1        3.46          3.17         3.17         3.46
  0.1      25.16         23.05        23.05        25.16
  0.01     240.99        220.82       220.82       240.99
  0.001    2399.09       2198.02      2198.02      2399.09
[Figure: gene frequency of A (0 to 1) plotted against generations]
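With fitnesses (1+s)^2, 1+s, 1, the ratio p/(1-p) is multiplied by exactly (1+s) each generation, so the waiting times have a closed form. The sketch below uses that fact; the function name is made up. It agrees with the tabulated times to within rounding in nearly every cell; for s = 0.001 and the 0.1 to 0.5 (or 0.5 to 0.9) step it gives about 2198.3, so that entry may be worth rechecking against the original slide.

```python
import math


def generations_needed(p_start, p_end, s):
    """Generations for A to go from p_start to p_end when p/(1-p) grows by (1+s) each generation."""
    odds_start = p_start / (1.0 - p_start)
    odds_end = p_end / (1.0 - p_end)
    return math.log(odds_end / odds_start) / math.log(1.0 + s)


steps = [(0.01, 0.1), (0.1, 0.5), (0.5, 0.9), (0.9, 0.99)]
for s in (1, 0.1, 0.01, 0.001):
    print(s, [round(generations_needed(a, b, s), 2) for a, b in steps])
```

For small s the denominator is close to s itself, which is why each tenfold drop in s lengthens every time roughly tenfold.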
An experimental selection curve
Rare alleles occur mostly in heterozygotes
This shows a population in Hardy-Weinberg equilibrium at gene frequencies of 0.9 A : 0.1 a
Genotype frequencies: 0.81 AA : 0.18 Aa : 0.01 aa
Note that per 100 individuals there are 20 copies of a (18 in the Aa individuals and 2 in the single aa), and 18 of them, or 18 / 20 = 0.9 of them, are in Aa genotypes
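The 18 / 20 figure generalizes: under Hardy-Weinberg proportions the fraction of a copies sitting in heterozygotes equals the frequency of the other allele. A small sketch (the function name is invented for illustration) shows how quickly this approaches 1 as a gets rarer.

```python
def fraction_of_a_in_heterozygotes(q):
    """Fraction of all copies of a that are carried by Aa individuals at Hardy-Weinberg."""
    p = 1.0 - q
    copies_in_Aa = 2 * p * q * 1    # each Aa individual carries one copy of a
    copies_in_aa = q * q * 2        # each aa individual carries two copies
    return copies_in_Aa / (copies_in_Aa + copies_in_aa)


for q in (0.1, 0.01, 0.001):
    print(q, fraction_of_a_in_heterozygotes(q))
```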
Overdominance and polymorphism
Genotypes:   AA       Aa    aa
Fitnesses:   1 - s    1     1 - t
When A is rare, most A's are in Aa, and most a's are in aa.
The average fitness of A-bearing genotypes is then nearly 1.
The average fitness of a-bearing genotypes is then nearly 1 - t.
So A will increase in frequency when rare.
When a is rare, most a's are in Aa, and most A's are in AA.
The average fitness of a-bearing genotypes is then nearly 1.
The average fitness of A-bearing genotypes is then nearly 1 - s.
So a will increase in frequency when rare.
[Figure: axis of gene frequency of A, from 0 to 1]
Underdominance and unstable equilibrium
Genotypes:   AA       Aa    aa
Fitnesses:   1 + s    1     1 + t
When A is rare, most A's are in Aa, and most a's are in aa.
The average fitness of A-bearing genotypes is then nearly 1.
The average fitness of a-bearing genotypes is then nearly 1 + t.
So A will decrease in frequency when rare.
When a is rare, most a's are in Aa, and most A's are in AA.
The average fitness of a-bearing genotypes is then nearly 1.
The average fitness of A-bearing genotypes is then nearly 1 + s.
So a will decrease in frequency when rare.
[Figure: axis of gene frequency of A, from 0 to 1]
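The invasion arguments for overdominance and underdominance can be checked by iterating the selection recursion from different starting points. This is only a sketch: the values s = 0.1 and t = 0.2 are hypothetical, and the interior equilibrium p = t / (s + t) used in the comments is the standard result for both fitness schemes rather than something stated on the slides.

```python
def next_p(p, w_AA, w_Aa, w_aa):
    """One generation of selection on the gene frequency of A."""
    q = 1.0 - p
    w_bar = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return p * (p * w_AA + q * w_Aa) / w_bar


def run(p0, fitnesses, generations=2000):
    """Iterate the recursion and return the final gene frequency."""
    p = p0
    for _ in range(generations):
        p = next_p(p, *fitnesses)
    return p


s, t = 0.1, 0.2                      # hypothetical selection coefficients
over = (1 - s, 1.0, 1 - t)           # heterozygote fittest
under = (1 + s, 1.0, 1 + t)          # heterozygote least fit

# Overdominance: both rare alleles increase, so both runs end near t/(s+t) = 2/3.
over_from_low = run(0.01, over)
over_from_high = run(0.99, over)

# Underdominance: 2/3 is an unstable equilibrium; runs move away from it.
under_from_below = run(0.60, under)
under_from_above = run(0.70, under)

print(over_from_low, over_from_high)
print(under_from_below, under_from_above)
```

Starting the underdominant case exactly at the equilibrium would stay there in exact arithmetic, which is what "unstable equilibrium" means: any small push is amplified.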
Fitness surfaces (adaptive landscapes)
[Figure: two plots of mean fitness w against gene frequency p, from 0 to 1, with arrows showing the gene frequency changes. Overdominance: stable equilibrium. Underdominance: unstable equilibrium.]
Is all for the best in this best of all possible worlds?
A case to consider: two interacting adaptations
[Figure; labels: AA, Aa, aa, BB, Bb, bb]
Fitnesses of the two-locus genotypes:
        BB     Bb     bb
  AA    0.8    0.9    1.0
  Aa    0.9    1.0    0.9
  aa    1.0    0.9    0.8
A fitness surface (in a haploid case)
Haplotype fitnesses: AB 1.1, Ab 0.9, aB 0.9, ab 1
[Figure: contours of mean fitness, labelled from 0.92 to 1.08, plotted against gene frequency of A (horizontal axis, 0.00 to 1.00) and gene frequency of B (vertical axis, 0.00 to 1.00)]
Genetic drift
[Figure: gene frequency (0 to 1) plotted against time in generations (0 to 11)]
Distribution of gene frequencies with drift
[Figure: distributions of gene frequency (0 to 1) among populations at successive times]
Note that although the individual populations wander, their average hardly moves (not at all when we have infinitely many populations).
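A simple way to see both points (single populations wander, the average across populations barely moves) is to simulate drift. The sketch below assumes a Wright-Fisher style model in which each generation's gene copies are drawn at random from the previous generation; the population size of 20 copies, the 11 generations and the 2000 replicates are arbitrary choices for illustration, not values taken from the slides.

```python
import random


def drift(n_copies, p0, generations, rng):
    """Gene frequency trajectory in one population of n_copies gene copies."""
    count = round(p0 * n_copies)
    traj = [count / n_copies]
    for _ in range(generations):
        p = count / n_copies
        # Each copy in the next generation is A with probability p.
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        traj.append(count / n_copies)
    return traj


rng = random.Random(1)                       # fixed seed so the run is repeatable
one_population = drift(20, 0.5, 11, rng)
finals = [drift(20, 0.5, 11, rng)[-1] for _ in range(2000)]
mean_final = sum(finals) / len(finals)

print("one population:", one_population)
print("mean over 2000 populations after 11 generations:", round(mean_final, 3))
```

Individual trajectories often reach 0 or 1 and then stay there, while the mean over many replicate populations stays close to the starting frequency.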
Genetic drift in some small populations
[Figure from Hartl and Clark, Principles of Population Genetics]
They're real (lab) populations of Drosophila
[Figure from Barton et al., Evolution]
Peter Buri. 1956. Gene frequency in small populations of mutant Drosophila. Evolution 10 (4): 367-402.
Some copies happen to have more descendants
[Figure: lines of descent of gene copies through time]
Averaging of gene frequencies when populations admix
Population 1: gene frequencies 0.2 and 0.8
Population 2: gene frequencies 0.7 and 0.3
A mixture taking 0.60 from Pop. 1 and 0.40 from Pop. 2 makes 0.60 x 0.8 + 0.40 x 0.3, which is 0.60
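Admixture is a weighted average of the source gene frequencies. The sketch below restates the slide's arithmetic; `admixed_frequency` is a name made up for illustration, and it accepts any number of source populations.

```python
def admixed_frequency(freqs, proportions):
    """Gene frequency in a mixture, given each source's frequency and its share of the mixture."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("mixture proportions must sum to 1")
    return sum(f * m for f, m in zip(freqs, proportions))


mixed = admixed_frequency([0.8, 0.3], [0.60, 0.40])
print(round(mixed, 2))
```

The result always lies between the smallest and largest source frequency, which is why migration pulls populations toward each other.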
A cline (name by Julian Huxley)
[Figure: gene frequency (0 to 1) plotted against geographic position, with curves labelled "no migration", "some" and "more"]
A famous common-garden experiment
Clausen, Keck and Hiesey's (1949) common-garden experiment in Achillea lanulosa
Heavy metal
House sparrows
Mutation Rates
Coat color mutants in mice. From Schlager G. and M. M. Dickie. 1967. Spontaneous mutations and mutation rates in the house mouse. Genetics 57: 319-330
Locus        Gametes tested   No. of mutations   Rate
Nonagouti    67,395           3                  $44.5 \times 10^{-6}$
Brown        919,619          3                  $3.3 \times 10^{-6}$
Albino       150,391          5                  $33.2 \times 10^{-6}$
Dilute       839,447          10                 $11.9 \times 10^{-6}$
Leaden       243,444          4                  $16.4 \times 10^{-6}$
Total        2,220,376        25                 $11.2 \times 10^{-6}$
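Each rate in the table is simply mutations observed divided by gametes tested. The sketch below recomputes the five per-locus rates from the counts as listed; the dictionary and function names are illustrative. Computed this way, the nonagouti row works out to about 44.5 per million gametes.

```python
# (gametes tested, mutations observed) for each locus, as listed in the table
loci = {
    "Nonagouti": (67_395, 3),
    "Brown": (919_619, 3),
    "Albino": (150_391, 5),
    "Dilute": (839_447, 10),
    "Leaden": (243_444, 4),
}


def rate_per_million(gametes, mutations):
    """Mutation rate expressed in units of 10^-6 per gamete."""
    return mutations / gametes * 1e6


rates = {name: round(rate_per_million(g, m), 1) for name, (g, m) in loci.items()}
for name, r in rates.items():
    print(f"{name:10s} {r:5.1f} x 10^-6")
```

With only 3 to 10 mutations per locus, each estimate carries a large sampling error, so differences between loci should be read cautiously.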
Mutation rates in humans
Forward vs. back mutations
Why mutants inactivating a functional gene will be more frequent than back mutations
[Figure: the gene]
12 places can mutate to nonfunctionality
only one place can mutate back to function
function can sometimes be restored by a "second site" mutation, too
A sequence space
For sequences of length 1000, there are 3 x 1000 = 3000 "neighbors" one step away in sequence space.
But there are $4^{1000}$ sequences, which is about $10^{602}$ in all!
No two of them are more than 1000 steps apart.
Hard to draw such a space.
How do we ever evolve? Wouldn't it be impossible to find one of the tiny fraction of possible sequences that would be even marginally functional?
The answer seems to be that the sequences are clustered.
An example of such clustering is the English language, as illustrated by a popular word game:
  WORD
  WORE
  GORE
  GONE
  GENE
But the word BCGH cannot be made into an English word.
There are also only a tiny fraction of all 456,976 four-letter words that are English words.
But they are clustered, so that it is possible to "evolve" from one to another through intermediates.
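The counts on this slide, and the one-step rule of the word game, can be verified directly. This is a small sketch; `one_step_apart` is a made-up helper, and no dictionary lookup is done, so it checks only that each move changes one letter, not that each word is English.

```python
length = 1000
neighbors = 3 * length                 # each site can change to 3 other bases
total_sequences = 4 ** length          # exact integer; Python handles big ints
digits = len(str(total_sequences))     # a number with d digits is about 10^(d-1)
four_letter_strings = 26 ** 4


def one_step_apart(a, b):
    """True if two equal-length strings differ at exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


ladder = ["WORD", "WORE", "GORE", "GONE", "GENE"]
ladder_ok = all(one_step_apart(a, b) for a, b in zip(ladder, ladder[1:]))

print(neighbors, digits - 1, four_letter_strings, ladder_ok)
```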
Mutation as an evolutionary force
If we have two alleles A and a, and mutation rate from A to a is $10^{-6}$ and mutation rate back is the same,
[Figure: gene frequency (axis marked 0, 0.5 and 1) plotted against time, from 0 to 500,000 generations]
Mutation is critical in introducing new alleles but is very slow in changing their frequencies
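How slow is "very slow"? With equal forward and back mutation rates u, the deviation of the gene frequency from 0.5 shrinks by a factor (1 - 2u) each generation. The sketch below uses that closed form and checks it against the one-generation recursion; the starting frequency of 1 is an assumption made for illustration, and the function names are invented.

```python
def next_freq(p, u):
    """One generation of symmetric mutation: A copies that stay A, plus a copies that mutate to A."""
    return p * (1.0 - u) + (1.0 - p) * u


def freq_after(p0, u, generations):
    """Closed form: the distance from 0.5 shrinks by (1 - 2u) per generation."""
    return 0.5 + (p0 - 0.5) * (1.0 - 2.0 * u) ** generations


u = 1e-6
for t in (0, 1_000, 100_000, 500_000):
    print(t, round(freq_after(1.0, u, t), 4))
```

Even after half a million generations the frequency has covered only about two thirds of the distance from 1 to the equilibrium at 0.5.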
Estimation of a human mutation rate
By an equilibrium calculation.
Huntington's disease. Dominant. Does not express itself until after age 40.
1/100,000 of people of European ancestry have the gene. Reduction in fitness maybe 2%.
If allele frequency is q, then $2q(1-q)$ of everyone are heterozygotes. So $q \approx 0.000005$.
0.02 of these die. So a fraction 0.02 of all copies of the allele in the population are eliminated by natural selection each generation.
So the fraction of all copies that are mutant copies eliminated each generation is $0.000005 \times 0.02 = 10^{-7}$.
If we are at equilibrium between mutation and selection, this is also the fraction of copies that have a new mutation. So the mutation rate is in that case $10^{-7}$.
Similar calculations can be done with recessive alleles, but we must remember that in their case each death (or reduction in fitness) kills two copies of the mutant.
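The equilibrium estimate is two lines of arithmetic. The sketch below follows the slide's reasoning and also solves 2q(1 - q) = 1/100,000 exactly, to show that dropping the (1 - q) factor costs nothing at this frequency. Variable names are illustrative; the 2% fitness reduction is the slide's own rough figure.

```python
import math

carrier_fraction = 1 / 100_000      # people carrying the dominant allele
fitness_reduction = 0.02            # fraction of mutant copies lost per generation

# Nearly all carriers are heterozygotes, so 2q(1 - q) is about 2q.
q_approx = carrier_fraction / 2

# Exact root of 2q(1 - q) = carrier_fraction, taking the small solution.
q_exact = (1 - math.sqrt(1 - 2 * carrier_fraction)) / 2

# At mutation-selection balance, copies lost to selection equal new mutant copies.
mutation_rate = q_approx * fitness_reduction

print(q_approx, q_exact, mutation_rate)
```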