Coalescence time distributions for hypothesis testing - Kapil Rajaraman (rajaramn@uiuc.edu), 498BIN, HW #2

This essay is an overview of Maryellen Ruvolo's work on studying modern human origins using multiple datasets. My interest in the subject was aroused by Roger Lewin's book Patterns in Evolution, which referred me to the Feb 1996 issue of Molecular Phylogenetics and Evolution, a special issue celebrating Morris Goodman's 70th birthday. (The issue is free from biological technicalities for the most part, and is a good place to see what the issues in primate-hominoid evolution are.)

Evolutionary hypotheses

There seems to be no doubt that the earliest hominids arose in Africa. The first hominid to leave Africa was H. erectus, and it is universally accepted that this happened around 1.8 my (million years) ago. However, it was still unanswered (at the time of the article) whether this was the last time all living humans were connected by a common ancestor. One theory, known as the multiregional hypothesis, says that it was, implying that human population differences are very old. It also holds that the transition from H. erectus to H. sapiens was made separately in different parts of the world, and that cohesiveness was maintained by gene flow. A second theory, the candelabra model, does not hypothesize gene flow. Another theory, known as the Out of Africa or Noah's Ark model, states that some time after H. erectus spread through the world, another speciation occurred, and H. sapiens emerged in Africa. These humans then replaced the existing H. erectus as they moved outwards. This theory posits younger population differences. Molecular evidence from mtDNA has put an approximate date of 250,000 years ago on this second exodus.

Molecular data studies

As we follow a history of genetic lineages back in time, these lineages join up into a common ancestral type called the coalescent; the age of this ancestor is called the coalescence time.
However, to reconstruct phylogenies from genetic data, there must be a way of relating the gene tree to the population tree. Daughter populations take a sample of alleles from the original population on division, so the coalescence time of each population's alleles is usually greater than or equal to the time of population divergence. In some cases, however, allelic loss can lead to observed coalescence times that are younger than the divergence, as seen in Fig. 1. Moreover, the correct alleles have to be used to make observations. For example, the HLA (human leukocyte antigen) tree gives a coalescence time of 35 my. This is because the alleles under study have been preserved from the times of H. habilis and the australopithecines. The problem, however, is that it cannot be said whether any allele under study is an anciently polymorphic (i.e. preserved) one or not, which makes the results questionable.
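The relation between gene tree and population tree described above can be illustrated with a minimal simulation. This is only a sketch under assumptions that are not in the essay: a standard neutral coalescent, a constant diploid ancestral population of size `n_anc`, one allele sampled from each daughter population, and no allelic loss of the kind shown in Fig. 1. The function name and all parameter values are hypothetical.

```python
import random


def cross_population_coalescence_times(t_div, n_anc, n_loci, seed=0):
    """Simulate coalescence times (in generations) for pairs of alleles,
    one allele drawn from each of two daughter populations.

    Assumption: the two lineages cannot coalesce before they reach the
    ancestral population at t_div; from there the extra waiting time is
    exponential with mean 2 * n_anc generations (diploid population).
    """
    rng = random.Random(seed)
    mean_wait = 2.0 * n_anc
    return [t_div + rng.expovariate(1.0 / mean_wait) for _ in range(n_loci)]


# Hypothetical values: divergence 10,000 generations ago, ancestral size 10,000.
times = cross_population_coalescence_times(t_div=10_000, n_anc=10_000, n_loci=5_000)

print("every locus coalesces at or before the divergence:", min(times) >= 10_000)
print("mean coalescence time (generations):", round(sum(times) / len(times)))
```

In this simplified setting every gene coalescence is at least as old as the population divergence, and on average it is older by about 2 * `n_anc` generations, so a single locus tends to overestimate the population split. The sketch does not model the exception of Fig. 1, where allelic loss produces a coalescence younger than the divergence.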

Coalescence time distributions

The steps in this experiment are (1) to collect datasets from many genes; (2) to estimate coalescence times for each dataset; (3) to plot the frequency distribution of times; and (4) to compare the observed distribution with model predictions. These predictions are shown in Fig. 2. The difference in shapes between the multiregional and candelabra models is due to the hypothesized absence of gene flow in the candelabra model. This absence of gene flow reduces the effective population size, which in turn leads to increased allelic loss and so reduces the observed coalescence time. Even without allelic loss, gene flow maintains older coalescence times, as shown in Fig. 3. The distributions shown in the figure are very general, and a more rigorous prediction would need to incorporate the frequency distribution of mutation rates of all the genes in the human genome and the proportion of genes that are anciently polymorphic. But even at this qualitative level, the experimental predictions should be able to distinguish the three models.

Prevailing tests use branch length methods, which use the amount of genetic distance accumulated along the branches of a gene tree to infer divergence dates (after calibrating the molecular clock). The advantage of the coalescence time method is that error bars associated with the evolutionary process can be estimated. Also, as the size of the datasets increases, the coalescence times approach the dates set by the relative branch lengths.

Applications to existing data

In the article, Ruvolo describes the experiments that have been performed to collect molecular data. Genetic evidence from protein polymorphisms, mitochondrial DNA, Y-chromosomal sequences, and other tests gives recent dates of approximately 200,000 years for human genetic ancestors. Some tests give older dates of 1.3 my, 3 my, and 500,000 years.
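The tallying part of the procedure (steps 3 and 4) can be sketched with the dates quoted above. This is an illustration only, not Ruvolo's dataset or analysis: each named source is entered once at the approximate 0.2 my figure, the labels for the three older dates are placeholders because the essay does not name their loci, and the bin edges and helper names are hypothetical.

```python
# Dates in millions of years (my), taken from the summary in this essay.
# Assumption: one entry per named source at roughly 0.2 my; the "older
# estimate" labels are placeholders, not real locus names.
coalescence_times_my = {
    "protein polymorphisms": 0.2,
    "mitochondrial DNA": 0.2,
    "Y-chromosomal sequences": 0.2,
    "older estimate A": 0.5,
    "older estimate B": 1.3,
    "older estimate C": 3.0,
}

ERECTUS_EXODUS_MY = 1.8  # H. erectus leaves Africa, as stated in the essay


def bin_counts(times, edges):
    """Count how many times fall in each half-open bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for t in times:
        for i in range(len(edges) - 1):
            if edges[i] <= t < edges[i + 1]:
                counts[i] += 1
                break
    return counts


def fraction_younger_than(times, threshold):
    """Fraction of coalescence times strictly younger than the threshold."""
    times = list(times)
    return sum(t < threshold for t in times) / len(times)


edges = [0.0, 0.5, 1.0, ERECTUS_EXODUS_MY, 5.0]  # hypothetical bin edges (my)
counts = bin_counts(coalescence_times_my.values(), edges)
young = fraction_younger_than(coalescence_times_my.values(), ERECTUS_EXODUS_MY)

# Text histogram of the frequency distribution (step 3).
for lo, hi, c in zip(edges, edges[1:], counts):
    print(f"{lo:>4.1f}-{hi:<4.1f} my | {'#' * c}")
# Crude comparison against the 1.8 my threshold (a stand-in for step 4).
print(f"fraction younger than {ERECTUS_EXODUS_MY} my: {young:.2f}")
```

A distribution dominated by dates well below 1.8 my, with a few old outliers that could be anciently polymorphic loci, is the qualitative pattern the essay associates with the Out of Africa model. A real comparison would use the predicted curves of Fig. 2 and not a single threshold.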
With this evidence, the Out of Africa model seems to be the likeliest for modern human origins. (Further refinements to the experiments have been suggested to make sure that the loci studied are unlinked. With these refinements, the Out of Africa model is expected to become even likelier.)

Discussion

The Out of Africa model has been upheld by almost all molecular tests, but some researchers, mainly Wolpoff, claim that the multiregional model is correct. This group feels that the molecular clock used to calibrate mutation accumulation time is faulty. When it was pointed out that the calibration would have to be off by a factor of 5 or so, the proponents of the multiregional model claimed that this does not matter, because the natural loss of genetic material prevents accurate reconstruction of phylogenetic histories. This view, however, does not seem to be supported by the larger part of the anthropological community. Population wave data studies have shown that at the time of population divergences and expansions, there were too few humans in existence to be compatible with the multiregional theory!

Current research

A number of researchers have since been using coalescence time distributions for phylogenetic studies, and the technique has been considerably refined to account for variable population sizes, colonization, etc. I did not find any work specifically relating to the Out of Africa versus multiregional debate, but a recent article has claimed that Homo separated from the chimpanzees 10 my ago, a factor of 2 older than the more widely accepted date of 5 my ago. This goes to show that the issue of primate phylogeny is still not conclusively resolved, and we may expect to see more than a few debates in the near future.

References

1. Roger Lewin, Patterns in Evolution: The New Molecular View, 1996.
2. Maryellen Ruvolo, A New Approach to Studying Modern Human Origins: Hypothesis Testing with Coalescence Time Distributions (and references therein), Molecular Phylogenetics and Evolution, 5 (Feb 1996), pp. 202-219.
3. Molecular Phylogenetics and Evolution, Feb 1996 issue.

Fig. 1 Scenario showing how a molecular coalescent can occur after a population divergence. (Left) Population history shows two alleles (a, b) in an ancestrally polymorphic population. Population divergence occurs at time T1, and only one population, P1, receives both alleles during population formation. Subsequently, allele b is lost from P1. A new allele a' arises in P2 after divergence. (Right) Allelic history. If no alleles had been lost, the coalescence time of all alleles (a, a', b) would have been T0, which is greater than T1. Instead, the observed alleles (a, a') have coalescence time T2, which is less than T1, the time of population divergence.

Fig. 2 Qualitatively predicted coalescence time frequency distributions

Fig. 3 Gene flow acts to maintain older coalescence times. In the absence of gene flow (left), the alleles in Population P2 have a coalescent at t2. With gene flow (right), the coalescent is older, at t1.
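The effect in Fig. 3 can be reproduced qualitatively with a small simulation. This is a sketch under assumptions that go beyond the essay: a symmetric two-deme island model, two alleles sampled from the same deme, a constant diploid deme size `n_deme`, and a per-generation migration probability `m` for each lineage. The function names and parameter values are hypothetical. Under these assumptions, standard coalescent theory gives an expected coalescence time of 2N generations for an isolated deme and 4N generations when the two demes exchange migrants.

```python
import random


def within_deme_coalescence_time(n_deme, m, rng):
    """One pairwise coalescence time (in generations) for two alleles
    sampled from the same deme of a symmetric two-deme model.

    n_deme: diploid size of each deme.
    m: per-generation migration probability of a lineage, looking backward
       in time; m = 0 means no gene flow.
    """
    coal_rate = 1.0 / (2.0 * n_deme)  # only while both lineages share a deme
    move_rate = 2.0 * m               # either of the two lineages may migrate
    t = 0.0
    same_deme = True
    while True:
        if same_deme:
            total = coal_rate + move_rate
            t += rng.expovariate(total)
            if rng.random() < coal_rate / total:
                return t              # the two lineages coalesce
            same_deme = False         # one lineage moved to the other deme
        else:
            # No coalescence is possible until a migration reunites them.
            t += rng.expovariate(move_rate)
            same_deme = True


def mean_time(n_deme, m, reps=20_000, seed=1):
    """Average coalescence time over many independent replicates."""
    rng = random.Random(seed)
    total = sum(within_deme_coalescence_time(n_deme, m, rng) for _ in range(reps))
    return total / reps


isolated = mean_time(n_deme=1_000, m=0.0)     # left panel of Fig. 3
connected = mean_time(n_deme=1_000, m=0.001)  # right panel of Fig. 3
print(f"no gene flow:   {isolated:.0f} generations (theory: 2N = 2000)")
print(f"with gene flow: {connected:.0f} generations (theory: 4N = 4000)")
```

With gene flow, a lineage can spend time in the other deme, where it cannot coalesce with its partner, so the average coalescence is pushed further back in time. The isolated deme behaves like a smaller population, which matches the essay's point that the absence of gene flow lowers the effective population size.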