Ancestral Recombination Graphs

Ancestral relationships among a sample of recombining sequences usually cannot be described accurately by a single genealogy. Linked sites have similar, but not necessarily identical, genealogies. Recombination leads to variation in the time to the most recent common ancestor (TMRCA) between sites, which in turn can lead to variation in genetic polymorphism. The ancestral recombination graph (ARG) is a generalization of the coalescent that describes the sequence of genealogies along a sample of recombining sequences.

Source: Nordborg (2000)

Jay Taylor (ASU) Ancestral Recombination Graphs 16 Feb / 20

Variation in Total Tree Length in a Sample of 10 Chromosomes

[Figure omitted. Source: Hudson (2000)]

Applications of the ARG

The ARG has several uses:

- Recombining sequences are potentially much more informative about demography, admixture and selection than a single completely linked locus.
- Fine-scale recombination rate estimation is possible if we can exploit high-density SNP data.
- Statistical inference for GWAS can be improved if we can accurately account for the complex correlations that exist between multiple linked loci.

Meiotic recombination generates mosaic chromosomes

[Figure omitted. Source: Strachan & Read (1996)]

The Two-locus Ancestral Recombination Graph

For simplicity, consider a sample of $n$ sequences containing just two loci. We make the following assumptions.

- The population evolves according to the diploid Wright-Fisher model, with population size $N$. Each individual samples two chromosomes uniformly at random from the previous generation.
- When a chromosome is sampled from a parent, it is either inherited intact (with probability $1 - r$) or undergoes a recombination between the two loci (with probability $r$).
- When a recombination event occurs, the two loci are inherited from the two different homologous chromosomes of that parent.
- No recombination occurs within the loci.
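
The inheritance rule above can be sketched forward in time. This is a minimal illustration under the stated assumptions, not code from the lecture; the function names `sample_gamete` and `next_generation` and the tuple representation of chromosomes are hypothetical choices made for the example.

```python
import random

def sample_gamete(parent, r, rng):
    """Return one two-locus chromosome transmitted by a diploid parent.

    parent: pair of chromosomes, each a (locus_1, locus_2) tuple.
    r: per-generation probability of recombination between the two loci.
    """
    first = rng.randrange(2)      # which homolog supplies locus 1
    if rng.random() < r:
        second = 1 - first        # recombinant: locus 2 comes from the other homolog
    else:
        second = first            # chromosome is inherited intact
    return (parent[first][0], parent[second][1])

def next_generation(population, r, rng):
    """One Wright-Fisher generation of a diploid population.

    Each offspring draws two parents uniformly at random (with replacement)
    and receives one gamete from each.
    """
    size = len(population)
    return [
        (sample_gamete(population[rng.randrange(size)], r, rng),
         sample_gamete(population[rng.randrange(size)], r, rng))
        for _ in range(size)
    ]
```

With `r = 0` every transmitted chromosome is a copy of one parental chromosome; with `r = 1` every transmitted chromosome is a recombinant.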

Suppose that $N$ is large and that $r$ is of order $O(1/N)$. Then, looking backwards in time, the ancestral relationships between the sampled sequences are determined by two processes in each generation:

- With probability $\binom{n}{2}\frac{1}{2N}$, a randomly chosen pair of sequences coalesces.
- With probability $nr$, a randomly chosen sequence is produced by a recombination event.

More complex scenarios involving multiple coalescences or recombinations in a single generation have negligible probabilities under the above assumptions.

Hudson (2000)

When a sequence is produced by a recombination event, the two loci have different ancestors. In this case, the branch experiencing the recombination splits in two, with each emerging branch corresponding to one of the two ancestral sequences.
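
The step from these per-generation probabilities to continuous-time rates can be made explicit. The short derivation below assumes that time is rescaled in units of $2N$ generations and writes $\rho = 4Nr$ (taking $N_e = N$ for the Wright-Fisher model); it is a sketch of the standard scaling argument rather than part of the original slides.

```latex
\begin{align*}
\Pr(\text{coalescence per generation}) &\approx \binom{n}{2}\frac{1}{2N}
  &&\Longrightarrow&
  \text{rate per } 2N \text{ generations} &= 2N \cdot \binom{n}{2}\frac{1}{2N} = \binom{n}{2}, \\
\Pr(\text{recombination per generation}) &\approx n r
  &&\Longrightarrow&
  \text{rate per } 2N \text{ generations} &= 2N \cdot n r = \frac{n \rho}{2}.
\end{align*}
```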

If $N$ is large and time is measured in units of $2N$ generations, the ancestry of the sampled sequences can be modeled by a continuous-time Markov chain known as the two-locus ancestral recombination graph (Griffiths 1981). With $n$ denoting the current number of branches:

- At rate $\binom{n}{2}$, a randomly chosen pair of sequences coalesces and the number of branches in the ARG decreases by one.
- At rate $n\rho/2$, a randomly chosen sequence is produced by recombination. The corresponding branch splits into two branches, each containing material ancestral to one of the two loci. This increases the number of branches by one.

The process terminates when both loci have reached their MRCA.

Because the branching rate is linear in $n$ while the coalescent rate is quadratic in $n$, the ARG is certain to reach an ultimate ancestor (UA) at some finite time. If the two loci reach their MRCAs at different times, then it may be possible to terminate the process before reaching the UA.

The parameter $\rho = 4N_e r$ is known as the population recombination rate.
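
The competition between linear branching and quadratic coalescence can be illustrated by simulating only the number of branches. The sketch below is a simplification: it does not track which locus each branch is ancestral to, so it runs all the way to the ultimate ancestor and cannot detect the earlier point at which each locus reaches its MRCA. The function name `simulate_branch_counts` is hypothetical.

```python
import random

def simulate_branch_counts(n, rho, rng):
    """Simulate the number of ARG branches back to the ultimate ancestor.

    Time is in units of 2N generations. With k branches, coalescence occurs
    at rate k(k-1)/2 and recombination at rate k*rho/2.
    Returns (time_to_ua, n_coalescences, n_recombinations).
    """
    k, t = n, 0.0
    n_coal = n_rec = 0
    while k > 1:
        coal_rate = k * (k - 1) / 2.0
        rec_rate = k * rho / 2.0
        total = coal_rate + rec_rate
        t += rng.expovariate(total)           # waiting time to the next event
        if rng.random() < coal_rate / total:  # coalescence: one branch fewer
            k -= 1
            n_coal += 1
        else:                                 # recombination: one branch more
            k += 1
            n_rec += 1
    return t, n_coal, n_rec
```

Whatever the value of `rho`, the number of coalescences always exceeds the number of recombinations by exactly `n - 1`, because the process starts with `n` branches and stops at one.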

The two-locus ARG is readily extended to multiple loci. Suppose that the sampled sequences contain $L$ loci, let $r_i$ be the recombination rate per generation between loci $i$ and $i+1$, and let $r = r_1 + \cdots + r_{L-1}$ be the total recombination rate, with $\rho = 4N_e r$ as before.

- At rate $\binom{n}{2}$, a randomly chosen pair of sequences coalesces and the number of branches in the ARG decreases by one.
- At rate $n\rho/2$, a randomly chosen sequence is generated by recombination. The recombination breakpoint falls between loci $i$ and $i+1$ with probability $P(B = i) = r_i / r$. The affected branch splits into two branches, one containing material ancestral to loci $1, \dots, i$ and the other containing material ancestral to loci $i+1, \dots, L$. This increases the number of branches by one.

The process terminates when all loci have reached their MRCA.
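
Choosing the breakpoint and dividing the ancestral material can be sketched as follows. The function names and the representation of a branch as a list of locus indices are assumptions made for illustration only.

```python
import random

def sample_breakpoint(r, rng):
    """Sample the interval B in which a recombination breakpoint falls.

    r: list of per-generation recombination rates, one entry per adjacent
       pair of loci (r[0] is the rate between loci 1 and 2, and so on).
    Returns i (1-based) with probability r_i / sum(r).
    """
    intervals = list(range(1, len(r) + 1))
    return rng.choices(intervals, weights=r, k=1)[0]

def split_ancestral_material(loci, breakpoint):
    """Split the ancestral loci carried by a branch at the breakpoint.

    Loci numbered <= breakpoint go to one parental branch and the
    remaining loci go to the other.
    """
    left = [locus for locus in loci if locus <= breakpoint]
    right = [locus for locus in loci if locus > breakpoint]
    return left, right
```

For example, `split_ancestral_material([1, 2, 3, 4], 2)` separates loci 1 and 2 from loci 3 and 4.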

Example: An ARG with Mutation

[Figure omitted. Source: Arenas et al. (2010)]

Although the ARG provides an accurate description of the ancestral relationships among a sample of recombining sequences, using it to analyze sequence data is computationally challenging for three reasons.

- Curse of dimensionality: The number of possible ARGs for a sample of $n$ sequences containing $L$ sites is $((2n-3)!!)^L$, since each site can have any of the $(2n-3)!!$ rooted binary tree topologies.
- Weakly informative data: In general, the ARG is only weakly determined by the sequence data.
- Long-range dependence: The genealogies at flanking sites remain correlated even if we condition on the genealogy at an intermediate site.

Because of these difficulties, a major research focus has been the development of computationally efficient approximations to the ARG.
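
The size of this count is easy to underestimate, so a small calculation may help. The helper names below are hypothetical; the code simply evaluates the formula quoted above.

```python
def double_factorial_odd(m):
    """Return m!! = m * (m - 2) * ... * 3 * 1 for odd m >= 1 (and 1 for m <= 0)."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result

def count_tree_sequences(n, num_sites):
    """Number of ways to assign one of the (2n-3)!! rooted binary tree
    topologies on n labeled sequences to each of num_sites sites."""
    return double_factorial_odd(2 * n - 3) ** num_sites
```

For ten sequences there are already 34,459,425 topologies at a single site, and with just two sites the count exceeds $10^{15}$.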

Composite Likelihoods and the Two-locus ARG

Hudson (2000) proposed a way of approximating the likelihood of recombinant sequence data based on the two-locus ARG. In this approach, a likelihood function $L_{ij}(\rho)$ is computed for each pair of segregating sites $i$ and $j$ using Monte Carlo-based methods. The composite likelihood of the complete data $D$ is then calculated by multiplying all of the pairwise likelihoods:

$L_{\mathrm{comp}}(\rho \mid D) = \prod_{i<j} L_{ij}(\rho)$

The population recombination rate $\rho$ can be estimated by finding the value $\hat{\rho}$ that maximizes $L_{\mathrm{comp}}$.

The composite MLE is known to be consistent, but the composite likelihood function itself is too sharply peaked around $\hat{\rho}$: because the pairwise terms are treated as if they were independent, confidence intervals calculated from the curvature of $L_{\mathrm{comp}}$ will be too narrow.
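
The structure of the composite likelihood calculation can be sketched in a few lines. The pairwise function `pair_log_lik` is a hypothetical placeholder supplied by the caller: in practice it would come from Monte Carlo computation or a precomputed lookup table, neither of which is shown here. The grid search is likewise an illustrative choice of optimizer.

```python
from itertools import combinations

def composite_log_likelihood(rho, sites, pair_log_lik):
    """Sum log L_ij(rho) over all unordered pairs of segregating sites.

    pair_log_lik(i, j, rho) is a user-supplied (hypothetical) function that
    returns the two-locus log-likelihood for sites i and j.
    """
    return sum(pair_log_lik(i, j, rho) for i, j in combinations(sites, 2))

def composite_mle(rho_grid, sites, pair_log_lik):
    """Return the grid value of rho that maximizes the composite likelihood."""
    return max(
        rho_grid,
        key=lambda rho: composite_log_likelihood(rho, sites, pair_log_lik),
    )
```

Working with the sum of log-likelihoods rather than the product of likelihoods avoids numerical underflow when there are many pairs of sites.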

A practical implementation of the composite likelihood approach was first provided by McVean et al. (2002) and later updated by Auton & McVean (2007) to handle recombination rate variation (LDhat2).

[Figure omitted. Panels: Chr 19 (A) and 22 (B); HLA Region. Source: McVean et al. (2004)]

The Sequentially Markov Coalescent (SMC)

An alternative simplification of the ARG, proposed by McVean & Cardin (2005), removes the long-range dependence of genealogies at different sites. In this approach, known as the sequentially Markov coalescent (SMC), the ARG is approximated by a process that iteratively determines the genealogy at each position along a chromosome.

The SMC starts at one end of the chromosome and samples a coalescent tree $T_1$ using the ordinary coalescent. It then generates a sequence of breakpoints $b_1, \dots, b_m$ and coalescent trees $T_1, \dots, T_m$ such that $T_i$ is the genealogy of the $n$ sequences in the nonrecombinant segment $(b_{i-1}, b_i)$.

The procedure for generating the breakpoints and coalescent trees is such that the sequence $T_1, \dots, T_m$ is a Markov process, i.e., conditional on $T_i$, the trees $T_1, \dots, T_{i-1}$ are independent of the trees $T_{i+1}, \dots, T_m$.

The sequence of breakpoints and coalescent trees is generated using the following procedure:

- Given $b_i$ and $T_i$, the distance to the next breakpoint $b_{i+1}$ is exponentially distributed with rate $\rho\,|T_i|/2$, where $|T_i|$ is the total branch length of $T_i$.
- Given $T_i$, the next tree $T_{i+1}$ is obtained by sampling a location uniformly at random along $T_i$ and detaching the lineage (and its subtree) at that point. This generates a floating lineage, which then coalesces with the remaining parts of $T_i$.

[Figure omitted. Source: McVean & Cardin (2005)]
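
The two sampling steps in this procedure can be sketched as follows. The representation of a tree as a dictionary of branch lengths and both function names are assumptions made for the example; re-attaching the floating lineage is not shown.

```python
import random

def next_breakpoint(b_i, total_length, rho, rng):
    """Sample b_{i+1}: the distance to the next breakpoint is exponential
    with rate rho * total_length / 2."""
    return b_i + rng.expovariate(rho * total_length / 2.0)

def sample_recombination_point(branch_lengths, rng):
    """Pick a point uniformly at random along the tree.

    branch_lengths: dict mapping a branch label to its length.
    Returns (branch, offset), where offset is measured from the lower
    (more recent) end of the chosen branch.
    """
    labels = list(branch_lengths)
    weights = [branch_lengths[label] for label in labels]
    # A uniform point on the tree lands on each branch with probability
    # proportional to that branch's length.
    branch = rng.choices(labels, weights=weights, k=1)[0]
    return branch, rng.uniform(0.0, branch_lengths[branch])
```

Longer trees therefore produce shorter nonrecombinant segments, and long branches are the most likely to be cut.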

The Pairwise Sequentially Markov Coalescent (PSMC)

To the extent that different loci have different genealogies, even a single diploid genome can be used to make inferences about demographic history and selection.

With just two copies of each locus, the genealogy at each site is fully specified by the pairwise coalescent time at that site. In this case, the ARG along a chromosome can be represented by a sequence of breakpoints $b_1, \dots, b_m$ and pairwise coalescent times $t_1, \dots, t_m$.

Under the SMC, the sequence of pairwise coalescent times becomes a discrete-time Markov chain. These sequences $(b_i, t_i)$ can be used to make inferences about demography and selection.

Li & Durbin (2011) introduced a method called the PSMC, which uses a hidden Markov model to infer $(b_i, t_i)$ from the sequence data.
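
For two sequences the SMC is simple enough to simulate directly. The sketch below assumes a constant population size, time measured in units of $2N$ generations (so a pair of lineages coalesces at rate 1), and a positive recombination rate `rho` per unit of sequence length. The function name and the output format are hypothetical.

```python
import random

def simulate_pairwise_smc(length, rho, rng):
    """Simulate breakpoints and pairwise coalescence times along a sequence.

    length: sequence length, in the units in which rho is a rate.
    rho: population recombination rate per unit length (must be > 0).
    Returns a list of (segment_start, segment_end, tmrca) tuples.
    """
    position = 0.0
    t = rng.expovariate(1.0)            # TMRCA of the two lineages at the left end
    segments = []
    while True:
        # The total branch length is 2t, so breakpoints arrive at rate rho * 2t / 2.
        step = rng.expovariate(rho * t)
        if position + step >= length:
            segments.append((position, length, t))
            return segments
        segments.append((position, position + step, t))
        position += step
        u = rng.uniform(0.0, t)         # height of the recombination point
        t = u + rng.expovariate(1.0)    # floating lineage re-coalesces above u
```

Each tuple in the returned list plays the role of one $(b_i, t_i)$ pair, with the segment boundaries giving consecutive breakpoints.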

Hidden Markov Models (HMM)

Hidden Markov models can be used to analyze data generated by processes in which the state of the system is hidden from the observer.

The state of the system evolves according to a discrete-time Markov chain: $X_1, X_2, X_3, \dots$

Although $X_t$ is hidden, at each time $t$ the observer can measure some variable $Y_t$ that depends only on $X_t$.

The objective is to use the observations $(Y_t)$ to learn about $(X_t)$, which can be done using dynamic programming algorithms.

[Diagram omitted: a hidden chain $X_0 \to X_1 \to \dots \to X_5$ with transition probabilities $p$, in which each hidden state $X_t$ emits an observation $Y_t$ with emission probability $e$.]
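
One such dynamic programming algorithm is the forward algorithm, which computes the probability of the observations by summing over all hidden paths one step at a time. The sketch below is a generic textbook version for a finite state space; the argument names are illustrative and the code is not taken from any particular package.

```python
def forward_likelihood(observations, initial, transition, emission):
    """Probability of the observations under an HMM (forward algorithm).

    initial[k]: probability that the first hidden state is k.
    transition[j][k]: probability of moving from state j to state k.
    emission[k][y]: probability of observing symbol y in state k.
    """
    states = range(len(initial))
    # alpha[k] = P(observations so far, current hidden state = k)
    alpha = [initial[k] * emission[k][observations[0]] for k in states]
    for y in observations[1:]:
        alpha = [
            emission[k][y] * sum(alpha[j] * transition[j][k] for j in states)
            for k in states
        ]
    return sum(alpha)
```

The cost grows linearly with the number of observations and quadratically with the number of hidden states. For long sequences the `alpha` values would need to be rescaled or kept in log space to avoid underflow, which this sketch omits.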

Under the PSMC, the ancestral recombination graph and the sequence data can be represented by an HMM.

- Both processes $(X_i)$ and $(Y_i)$ are indexed by position along a chromosome.
- The hidden variable $X_i$ is the TMRCA at that position.
- The observed variable $Y_i$ is the pair of nucleotides in the sampled genome at that position.

For practical reasons, the state space of the hidden variables is usually required to be finite. To this end, Li & Durbin (2011) replace the continuous interval $[0, \infty)$ by a discrete set $\{s_0, s_1, \dots, s_m\}$.
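
Discretizing the TMRCA amounts to choosing a set of boundaries and mapping each continuous time to the interval that contains it. In the sketch below, the log-spaced boundaries and the cap `t_max` are illustrative assumptions, not the specific scheme used by Li & Durbin (2011); the function names are hypothetical.

```python
import bisect
import math

def log_spaced_boundaries(num_intervals, t_max):
    """Boundaries s_0 = 0 < s_1 < ... < s_m = t_max, denser near zero.

    The spacing rule is an illustrative assumption, not the PSMC's own.
    """
    m = num_intervals
    return [math.expm1(i / m * math.log1p(t_max)) for i in range(m + 1)]

def interval_index(t, boundaries):
    """Index k such that boundaries[k] <= t < boundaries[k + 1].

    Times at or beyond the last boundary map to the final interval.
    """
    k = bisect.bisect_right(boundaries, t) - 1
    return min(k, len(boundaries) - 2)
```

Placing more boundaries near zero gives finer resolution for recent coalescence times, where a fixed absolute difference in the TMRCA corresponds to a larger relative difference.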

Applications of the PSMC

[Figure omitted. Source: Li & Durbin (2011)]

Extensions to multiple sequences

In principle, these ideas can be extended to samples containing more than two sequences.

Hobolth et al. (2007) developed a coalescent HMM to handle data sampled from two or three species.

Because the number of genealogies grows super-exponentially with the number of sampled sequences, this approach quickly becomes intractable.

Rasmussen et al. (2014) proposed a novel approach based on the SMC-HMM, which uses an MCMC algorithm ("threading") to generate a sample of ARGs from the posterior distribution given the sequence data and model parameters.

Threading works by stochastically building up the ARG one sequence at a time. This approach is implemented in the software package ARGweaver.

Application: Distinguishing Background Selection from Selective Sweeps

[Figure omitted. Source: Rasmussen et al. (2014)]


More information

Coalescence time distributions for hypothesis testing -Kapil Rajaraman 498BIN, HW# 2

Coalescence time distributions for hypothesis testing -Kapil Rajaraman 498BIN, HW# 2 Coalescence time distributions for hypothesis testing -Kapil Rajaraman (rajaramn@uiuc.edu) 498BIN, HW# 2 This essay will be an overview of Maryellen Ruvolo s work on studying modern human origins using

More information

Meek DNA Project Group B Ancestral Signature

Meek DNA Project Group B Ancestral Signature Meek DNA Project Group B Ancestral Signature The purpose of this paper is to explore the method and logic used by the author in establishing the Y-DNA ancestral signature for The Meek DNA Project Group

More information

Tópicos Depto. Ciencias Biológicas, UniAndes Profesor Andrew J. Crawford Semestre II

Tópicos Depto. Ciencias Biológicas, UniAndes Profesor Andrew J. Crawford Semestre II Tópicos Depto. Ciencias Biológicas, UniAndes Profesor Andrew J. Crawford Semestre 29 -II Lab Coalescent simulation using SIMCOAL 17 septiembre 29 Coalescent theory provides a powerful model

More information

Walter Steets Houston Genealogical Forum DNA Interest Group February 24, 2018

Walter Steets Houston Genealogical Forum DNA Interest Group February 24, 2018 Using Ancestry DNA and Third-Party Tools to Research Your Shared DNA Segments Part 2 Walter Steets Houston Genealogical Forum DNA Interest Group February 24, 2018 1 Today s agenda Brief review of previous

More information

Genetics: Early Online, published on June 29, 2016 as /genetics A Genealogical Look at Shared Ancestry on the X Chromosome

Genetics: Early Online, published on June 29, 2016 as /genetics A Genealogical Look at Shared Ancestry on the X Chromosome Genetics: Early Online, published on June 29, 2016 as 10.1534/genetics.116.190041 GENETICS INVESTIGATION A Genealogical Look at Shared Ancestry on the X Chromosome Vince Buffalo,,1, Stephen M. Mount and

More information

Mitochondrial Eve and Y-chromosome Adam: Who do your genes come from?

Mitochondrial Eve and Y-chromosome Adam: Who do your genes come from? Mitochondrial Eve and Y-chromosome Adam: Who do your genes come from? 28 July 2010. Joe Felsenstein Evening At The Genome Mitochondrial Eve and Y-chromosome Adam: Who do your genes come from? p.1/39 Evolutionary

More information

Coalescent genealogy samplers: windows into population history

Coalescent genealogy samplers: windows into population history Review Coalescent genealogy samplers: windows into population history Mary K. Kuhner Department of Genome Sciences, University of Washington, Box 355065, Seattle, WA 98195-5065, USA Coalescent genealogy

More information

Coalescents. Joe Felsenstein. GENOME 453, Winter Coalescents p.1/39

Coalescents. Joe Felsenstein. GENOME 453, Winter Coalescents p.1/39 Coalescents Joe Felsenstein GENOME 453, Winter 2007 Coalescents p.1/39 Cann, Stoneking, and Wilson Becky Cann Mark Stoneking the late Allan Wilson Cann, R. L., M. Stoneking, and A. C. Wilson. 1987. Mitochondrial

More information

and 6.855J. Network Simplex Animations

and 6.855J. Network Simplex Animations .8 and 6.8J Network Simplex Animations Calculating A Spanning Tree Flow -6 7 6 - A tree with supplies and demands. (Assume that all other arcs have a flow of ) What is the flow in arc (,)? Calculating

More information

ESTIMATION OF THE NUMBER OF INDIVIDUALS FOUNDING COLONIZED POPULATIONS

ESTIMATION OF THE NUMBER OF INDIVIDUALS FOUNDING COLONIZED POPULATIONS ORIGINAL ARTICLE doi:1.1111/j.1558-5646.7.8.x ESTIMATION OF THE NUMBER OF INDIVIDUALS FOUNDING COLONIZED POPULATIONS Eric C. Anderson 1, and Montgomery Slatkin 3,4 1 Fisheries Ecology Division, Southwest

More information

A Likelihood Method to Estimate/Detect Gene Flow and A Distance Method to. Estimate Species Trees in the Presence of Gene Flow.

A Likelihood Method to Estimate/Detect Gene Flow and A Distance Method to. Estimate Species Trees in the Presence of Gene Flow. A Likelihood Method to Estimate/Detect Gene Flow and A Distance Method to Estimate Species Trees in the Presence of Gene Flow Thesis Presented in Partial Fulfillment of the Requirements for the Degree

More information

can mathematicians find the woods?

can mathematicians find the woods? Eolutionary trees, coalescents, and gene trees: can mathematicians find the woods? Joe Felsenstein Department of Genome Sciences and Department of Biology Eolutionary trees, coalescents, and gene trees:

More information

[CLIENT] SmithDNA1701 DE January 2017

[CLIENT] SmithDNA1701 DE January 2017 [CLIENT] SmithDNA1701 DE1704205 11 January 2017 DNA Discovery Plan GOAL Create a research plan to determine how the client s DNA results relate to his family tree as currently constructed. The client s

More information

TRACK 1: BEGINNING DNA RESEARCH presented by Andy Hochreiter

TRACK 1: BEGINNING DNA RESEARCH presented by Andy Hochreiter TRACK 1: BEGINNING DNA RESEARCH presented by Andy Hochreiter 1-1: DNA: WHERE DO I START? Definition Genetic genealogy is the application of genetics to traditional genealogy. Genetic genealogy uses genealogical

More information

Coalescent Likelihood Methods. Mary K. Kuhner Genome Sciences University of Washington Seattle WA

Coalescent Likelihood Methods. Mary K. Kuhner Genome Sciences University of Washington Seattle WA Coalescent Likelihood Methods Mary K. Kuhner Genome Sciences University of Washington Seattle WA Outline 1. Introduction to coalescent theory 2. Practical example 3. Genealogy samplers 4. Break 5. Survey

More information

Genetic Genealogy Journey DNA Projects by Debbie Parker Wayne, CG SM, CGL SM

Genetic Genealogy Journey DNA Projects by Debbie Parker Wayne, CG SM, CGL SM Genetic Genealogy Journey DNA Projects by Debbie Parker Wayne, CG SM, CGL SM Genealogy can be a solitary pursuit. Genealogists sometimes collaborate to work on common lines, but lone researchers can perform

More information

Walter Steets Houston Genealogical Forum DNA Interest Group April 7, 2018

Walter Steets Houston Genealogical Forum DNA Interest Group April 7, 2018 Ancestry DNA and GEDmatch Walter Steets Houston Genealogical Forum DNA Interest Group April 7, 2018 Today s agenda Recent News about DNA Testing DNA Cautions: DNA Data Used for Forensic Purposes New Technology:

More information

PATTERNS of heritable genetic variation in contem- relationships, but does not provide a basis for assessing

PATTERNS of heritable genetic variation in contem- relationships, but does not provide a basis for assessing Copyright 1998 by the Genetics Society of America Genealogical Inference From Microsatellite Data Ian J. Wilson*, and David J. Balding *School of Biological Sciences, Queen Mary and Westfield College,

More information

MODERN population genetics is data driven and

MODERN population genetics is data driven and Copyright Ó 2009 by the Genetics Society of America DOI: 10.1534/genetics.108.092460 Note Extensions of the Coalescent Effective Population Size John Wakeley 1 and Ori Sargsyan Department of Organismic

More information

The Structure of Genealogies and the Distribution of Fixed Differences Between DNA Sequence Samples From Natural Populations

The Structure of Genealogies and the Distribution of Fixed Differences Between DNA Sequence Samples From Natural Populations Copyright 0 1991 by the Genetics Society of America The Structure of Genealogies the Distribution of Fixed Differences Between DNA Sequence Samples From Natural Populations Department of Biological Sciences,

More information

DNA Testing What you need to know first

DNA Testing What you need to know first DNA Testing What you need to know first This article is like the Cliff Notes version of several genetic genealogy classes. It is a basic general primer. The general areas include Project support DNA test

More information

LANDSCAPE SMOOTHING OF NUMERICAL PERMUTATION SPACES IN GENETIC ALGORITHMS

LANDSCAPE SMOOTHING OF NUMERICAL PERMUTATION SPACES IN GENETIC ALGORITHMS LANDSCAPE SMOOTHING OF NUMERICAL PERMUTATION SPACES IN GENETIC ALGORITHMS ABSTRACT The recent popularity of genetic algorithms (GA s) and their application to a wide range of problems is a result of their

More information

Genetic Diversity and the Structure of Genealogies in Rapidly Adapting Populations

Genetic Diversity and the Structure of Genealogies in Rapidly Adapting Populations Genetic Diversity and the Structure of Genealogies in Rapidly Adapting Populations The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters

More information

How to use MIGRATE or why are Markov chain Monte Carlo programs difficult to use?

How to use MIGRATE or why are Markov chain Monte Carlo programs difficult to use? C:/ITOOLS/WMS/CUP/183027/WORKINGFOLDER/BLL/9780521866309C03.3D 39 [39 77] 20.12.2008 9:13AM How to use MIGRATE or why are Markov chain Monte Carlo programs difficult to use? 3 PETER BEERLI Population genetic

More information

Using Autosomal DNA for Genealogy Debbie Parker Wayne, CG, CGL SM

Using Autosomal DNA for Genealogy Debbie Parker Wayne, CG, CGL SM Using Autosomal DNA for Genealogy Debbie Parker Wayne, CG, CGL SM This is one article of a series on using DNA for genealogical research. There are several types of DNA tests offered for genealogical purposes.

More information

Big Y-700 White Paper

Big Y-700 White Paper Big Y-700 White Paper Powering discovery in the field of paternal ancestry Authors: Caleb Davis, Michael Sager, Göran Runfeldt, Elliott Greenspan, Arjan Bormans, Bennett Greenspan, and Connie Bormans Last

More information

On the nonidentifiability of migration time estimates in isolation with migration models

On the nonidentifiability of migration time estimates in isolation with migration models Molecular Ecology (2011) 20, 3956 3962 doi: 10.1111/j.1365-294X.2011.05247.x NEWS AND VIEWS COMMENT On the nonidentifiability of migration time estimates in isolation with migration models VITOR C. SOUSA,

More information

Walter Steets Houston Genealogical Forum DNA Interest Group January 27, 2018

Walter Steets Houston Genealogical Forum DNA Interest Group January 27, 2018 Using Ancestry DNA and Third-Party Tools to Research Your Shared DNA Segments Walter Steets Houston Genealogical Forum DNA Interest Group January 27, 2018 1 Today s agenda Brief review of previous DIG

More information

Ancient Admixture in Human History

Ancient Admixture in Human History Genetics: Published Articles Ahead of Print, published on September 7, 2012 as 10.1534/genetics.112.145037 Ancient Admixture in Human History Nick Patterson 1, Priya Moorjani 2, Yontao Luo 3, Swapan Mallick

More information

University of Washington, TOPMed DCC July 2018

University of Washington, TOPMed DCC July 2018 Module 12: Comput l Pipeline for WGS Relatedness Inference from Genetic Data Timothy Thornton (tathornt@uw.edu) & Stephanie Gogarten (sdmorris@uw.edu) University of Washington, TOPMed DCC July 2018 1 /

More information

Recent Trends in Population Genetics: More Data! More Math! Simple Models?

Recent Trends in Population Genetics: More Data! More Math! Simple Models? Journal of Heredity 24:95(5):397 45 doi:.93/jhered/esh62 ª 24 The American Genetic Association Recent Trends in Population Genetics: More ata! More Math! Simple Models? J. WAKELEY From the epartment of

More information

Estimating Effective Population Size and Mutation Rate From Sequence Data Using Metropolis-Hastings Sampling

Estimating Effective Population Size and Mutation Rate From Sequence Data Using Metropolis-Hastings Sampling Copyright 0 1995 by the Genetics Society of America Estimating Effective Population Size and Mutation Rate From Sequence Data Using Metropolis-Hastings Sampling Mary K. Kuhner, Jon Yarnato and Joseph Felsenstein

More information

GEDmatch Home Page The upper left corner of your home page has Information about you and links to lots of helpful information. Check them out!

GEDmatch Home Page The upper left corner of your home page has Information about you and links to lots of helpful information. Check them out! USING GEDMATCH Created March 2015 GEDmatch is a free, non-profit site that accepts raw autosomal data files from Ancestry, FTDNA, and 23andme. As such, it provides a large autosomal database that spans

More information

Halley Family. Mystery? Mystery? Can you solve a. Can you help solve a

Halley Family. Mystery? Mystery? Can you solve a. Can you help solve a Can you solve a Can you help solve a Halley Halley Family Family Mystery? Mystery? Who was the great grandfather of John Bennett Halley? He lived in Maryland around 1797 and might have been born there.

More information

Inbreeding and self-fertilization

Inbreeding and self-fertilization Inbreeding and self-fertilization Introduction Remember that long list of assumptions associated with derivation of the Hardy-Weinberg principle that I went over a couple of lectures ago? Well, we re about

More information

Factors affecting phasing quality in a commercial layer population

Factors affecting phasing quality in a commercial layer population Factors affecting phasing quality in a commercial layer population N. Frioni 1, D. Cavero 2, H. Simianer 1 & M. Erbe 3 1 University of Goettingen, Department of nimal Sciences, Center for Integrated Breeding

More information

FOSSIL, archaeological, and genetic data all lend

FOSSIL, archaeological, and genetic data all lend Copyright Ó 2008 by the Genetics Society of America DOI: 10.1534/genetics.107.080432 Testing for Archaic Hominin Admixture on the X Chromosome: Model Likelihoods for the Modern Human RRM2P4 Region From

More information

The program Bayesian Analysis of Trees With Internal Node Generation (BATWING)

The program Bayesian Analysis of Trees With Internal Node Generation (BATWING) Supplementary methods Estimation of TMRCA using BATWING The program Bayesian Analysis of Trees With Internal Node Generation (BATWING) (Wilson et al. 2003) was run using a model of a single population

More information

Paper Presentation. Steve Jan. March 5, Virginia Tech. Steve Jan (Virginia Tech) Paper Presentation March 5, / 28

Paper Presentation. Steve Jan. March 5, Virginia Tech. Steve Jan (Virginia Tech) Paper Presentation March 5, / 28 Paper Presentation Steve Jan Virginia Tech March 5, 2015 Steve Jan (Virginia Tech) Paper Presentation March 5, 2015 1 / 28 2 paper to present Nonparametric Multi-group Membership Model for Dynamic Networks,

More information

Warning: software often displays unrooted trees like this:

Warning: software often displays unrooted trees like this: Warning: software often displays unrooted trees like this: /------------------------------ Chara /-------------------------- Chlorella /---------16 \---------------------------- Volvox +-------------------17

More information