Systematics - BIO 615

Outline
1. Optimality criteria: parsimony, continued
2. Distance vs character methods
3. Building a tree vs finding a tree: clustering vs optimality criterion methods
4. Performance of distance and clustering methods

Derek S. Sikes, University of Alaska

Four steps (each should be explained in the methods)
1. Character (data) selection (not too fast, not too slow). Why did you choose these data?
2. Alignment of data (hypotheses of primary homology). How did you align your data?
3. Analysis selection (choose the best model / method(s)), including data exploration. Why did you choose your analysis method?
4. Conduct analysis

Parsimony: two problems to solve
1. Determine the optimality criterion score (tree length) for each tree (easy / fast)
2. Search over all possible trees to find the tree(s) that is/are the best according to the optimality criterion (e.g. shortest; hard for more than 11 OTUs)

Some relationships
- An unrooted binary tree of n OTUs has 2n-3 branches
- Thus, an unrooted tree with n OTUs has 2n-3 rooted versions (the root can be placed on any branch)

Parsimony & cladistics
- Strict cladists typically use only parsimony methods and justify this choice on philosophical grounds, e.g. it provides the least falsified hypothesis
- Parsimony has also been interpreted as a fast approximation to maximum likelihood. Cavalli-Sforza and Edwards (1967:555) stated that parsimony's success is probably due to the closeness of the solution it gives to the projection of the maximum likelihood tree, and that parsimony certainly cannot be justified on the grounds that evolution proceeds according to some minimum principle.
- Parsimony is often used in conjunction with other methods (by those who use statistical phylogenetic methods)
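The first of the two parsimony problems, scoring a single tree, can be sketched in a few lines. This is a minimal illustration, assuming unordered, equally weighted characters (Fitch parsimony, which the slides do not name explicitly). The function names, the nested-tuple tree encoding, and the four-species alignment are all made up for the example.

```python
def fitch_steps(tree, states):
    """Minimum number of changes for ONE character on a rooted binary tree.

    tree   : nested 2-tuples whose leaves are OTU names (strings)
    states : dict mapping OTU name -> observed state for this character
    """
    def post_order(node):
        if isinstance(node, str):              # leaf: its state set is the observed state
            return {states[node]}, 0
        left_set, left_steps = post_order(node[0])
        right_set, right_steps = post_order(node[1])
        shared = left_set & right_set
        if shared:                             # children agree: no extra change needed
            return shared, left_steps + right_steps
        # children disagree: one change is required somewhere below this node
        return left_set | right_set, left_steps + right_steps + 1

    return post_order(tree)[1]


def tree_length(tree, alignment):
    """Tree length = sum of the per-character step counts over all sites."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_steps(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_sites)
    )


# Hypothetical data: site 1 is constant, sites 2 and 3 vary.
alignment = {"sp1": "AAG", "sp2": "AAG", "sp3": "ACG", "sp4": "ACT"}
tree_a = (("sp1", "sp2"), ("sp3", "sp4"))
tree_b = (("sp1", "sp3"), ("sp2", "sp4"))
print(tree_length(tree_a, alignment), tree_length(tree_b, alignment))
```

Under these assumptions tree_a is the shorter of the two. For unordered characters the length does not depend on where the tree is rooted, so the rooted encoding is only a convenience. The hard part, searching over all trees, is untouched by this sketch.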

Parsimony
- Given a set of characters, such as aligned sequences, parsimony analysis works by determining the fit (number of steps) of each character on a given tree
- The sum over all characters is called the tree length
- Most parsimonious trees (MPTs) have the minimum tree length needed to explain the observed distributions of all the characters

Parsimony: determining tree length
[Figure: character states at site 5 mapped onto a tree of species 1-5; the individual states are not recoverable]
- Sites 1-4 are constant and therefore parsimony uninformative
- Site 6 is an autapomorphy (variable but parsimony uninformative) = 1 change only; ignored by cladistic software, counted by PAUP

Number of trees
For n taxa, the number of unrooted binary trees is

$$\#\,\text{trees} = \frac{(2n-5)!}{2^{\,n-3}\,(n-3)!}$$

Number of OTUs   Number of unrooted trees
 3               1
 4               3
 5               15
 6               105
 7               945
 8               10,395
 9               135,135
10               2,027,025
11               34,459,425
12               654,729,075
13               13,749,310,575
14               316,234,143,225
15               7,905,853,580,625
16               213,458,046,676,875
17               6,190,283,353,629,375
18               191,898,783,962,510,625
19               6,332,659,870,762,850,625
20               221,643,095,476,699,771,875
21               8,200,794,532,637,891,559,375

At 100k trees/sec, PAUP would take over 2 billion years to evaluate all trees for 21 OTUs.

Tree searching: exhaustive search
- If there are 11 or fewer OTUs, an exhaustive search can be done
- This guarantees the shortest tree(s) will be found (an exact solution)
- Every possible tree for n taxa is examined
- Slowest and most rigorous method
- Provides a frequency histogram of tree scores

Exhaustive search
- Step 1: starting tree, any 3 taxa
- Step 2 (2a, 2b, 2c): add the fourth taxon (D) in each of the three possible positions -> three trees
- Add the fifth taxon (E) in each of the five possible positions on each of the three trees -> 15 trees, and so on...

Tree searching: branch and bound
- If the number of OTUs is moderate (more than 11 but no more than about 25), a branch and bound search can be done
- This also guarantees the shortest tree(s) will be found, but not all trees are examined (also an exact solution)
- Families of trees that cannot lead to shorter trees are discarded and not examined, which saves time
- Read the text for details on the method
- Faster than exhaustive search
- No histogram of tree scores
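The tree-count formula and the "over 2 billion years" figure above can be checked directly. The sketch below is only an illustration; the function names are invented here, and the 100,000 trees per second rate is the one quoted on the slide.

```python
from math import factorial


def n_unrooted_trees(n):
    """Number of unrooted binary trees for n OTUs: (2n-5)! / (2^(n-3) * (n-3)!)."""
    if n < 3:
        raise ValueError("the formula applies to 3 or more OTUs")
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))


def years_to_evaluate(n, trees_per_second=100_000):
    """Years needed to score every unrooted tree at a fixed evaluation rate."""
    seconds = n_unrooted_trees(n) / trees_per_second
    return seconds / (365.25 * 24 * 3600)


for n in (5, 11, 21):
    print(n, n_unrooted_trees(n))
print(f"{years_to_evaluate(21):.2e} years for 21 OTUs")
```

The same count follows from the stepwise-addition picture: each new taxon can be attached to any of the 2k-3 branches of a k-taxon tree, giving 1 x 3 x 5 x ... x (2n-5) trees.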

Tree searching: heuristic searching
- For more than 25 OTUs (most datasets), other methods must be used: heuristic searching
- Approximate methods: they do not guarantee that the shortest tree will be found
- Fastest method (but less rigorous)
- Many issues must be considered to employ the best strategy for searching tree space
- Can get trapped in local optima while searching for global optima (shortest trees) (to be continued; see lecture on large datasets)

Tree searching: heuristic searching (approximate methods)
1. A starting tree is obtained by some clustering method, e.g. stepwise addition or neighbor-joining
2. This tree is then subjected to branch swapping (movement of branches to new places on the tree)
- Each swap makes a new topology, which is scored using the optimality criterion
- The hope is to find a more optimal tree through extensive branch swapping

Data types: character data
- OTU x character matrix
[Table: example matrix of Species1-Species4 by nucleotide characters; the sequences are not recoverable]

Data types: distance data
- OTU x OTU matrix
[Table: example matrix of pairwise distances among species1-species4; the values are not recoverable]
- These are uncorrected distances; they can be corrected to improve the chance of finding the correct tree (see upcoming lecture on models of evolution)

Data types: distance methods vs character methods
- Depending on the data, they can prefer the same or different topologies
- Even if the same topology is chosen, only character-based methods allow an estimate of ancestral states (which characters are changing where on the tree)
- Once characters have been converted to distances, we lose this information
- Important for understanding historical forms of molecules or adaptations!
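To make the loss of information concrete, the sketch below converts a character matrix into an uncorrected distance matrix (the proportion of differing sites). It is a minimal illustration: the function names and the three short sequences are hypothetical, and no correction for multiple substitutions is applied.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected distance: proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def distance_matrix(alignment):
    """Convert an OTU x character matrix into a symmetric OTU x OTU distance table."""
    names = sorted(alignment)
    dist = {(name, name): 0.0 for name in names}
    for a, b in combinations(names, 2):
        dist[(a, b)] = dist[(b, a)] = p_distance(alignment[a], alignment[b])
    return dist


# Hypothetical aligned sequences, 6 sites each.
alignment = {"sp1": "ACGTAC", "sp2": "ACGTTC", "sp3": "ACCTTG"}
dist = distance_matrix(alignment)
for (a, b), d in sorted(dist.items()):
    print(a, b, round(d, 3))
```

Each pair of OTUs is reduced to a single number. Which sites differ, and in what way, cannot be recovered from the output, which is why ancestral states cannot be estimated from distance data.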

Optimality criteria vs clustering methods

Methods of phylogenetic inference
- A more fundamental split is not between methods that use distance data vs character data
- But between methods that use optimality criteria vs those that do not (sometimes called algorithmic methods, or clustering methods)
- Clustering methods are fast because they build only one tree ("one-tree methods")
- But often a dataset will be explained equally well by multiple, sometimes thousands of, trees

Building (constructing) a tree vs finding a tree
- Each tree is a hypothesis of relationships
- For n OTUs we know how many alternative hypotheses there are before we do any analysis
- The question to be answered: is the signal in your data strong enough to weed through these hypotheses?
- i.e. can your data reject all but one (or a few) of these hypotheses?

Building a tree vs finding a tree
- Methods that build a tree do not use optimality criteria
- They do not test hypotheses; they do not evaluate the alternative hypotheses for n taxa
- Instead, they create a single tree from the data: they build (create, construct) a tree
- Since hypotheses are not tested, I contend these trees are not scientific; they are good places to start a search for optimal trees, or a means to explore the data
- (But they will find the true tree if the distance matrix is an exact reflection of the true tree)

Building a tree vs finding a tree: comparison
Clustering method
1. Build a tree (e.g. with NJ)
2. Stop
Optimality criterion (e.g. parsimony)
1. Build a tree (e.g. with NJ)
2. Score the tree with the criterion
3. Try to improve the tree with branch swapping
4. Go to step 1 until +/- all trees are scored
5. Stop when the search is done

Clustering methods - UPGMA
- UPGMA (Unweighted Pair-Group Method using Arithmetic averages), for ultrametric data
- Sokal & Michener; phenetics / phenograms
- Assumes / requires equal rates of evolution on all branches, which is false for most real data
- Descendants are equidistant from their ancestor
- Constraints: e1 = e2; e4 = e1 + e3; e5 + e4 = e9 + e8 + e6; etc.
[Figure: ultrametric tree with nodes labeled F-J and branch lengths e1-e9; axis: absolute time or divergence]
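The one-tree nature of UPGMA described above can be seen in a short sketch. This is only an illustration under stated assumptions: the function name, the frozenset-keyed distance table, and the four-OTU ultrametric distances are made up, and ties are broken by whichever pair happens to be encountered first.

```python
def upgma(dist):
    """Build a single tree by UPGMA.

    dist : dict mapping frozenset({a, b}) -> distance between OTUs a and b
    Returns (tree, heights): the tree as nested tuples, and the height of
    each internal node (half the distance between the clusters it joins).
    """
    sizes = {otu: 1 for pair in dist for otu in pair}   # cluster -> number of OTUs in it
    d = dict(dist)
    heights = {}
    while len(sizes) > 1:
        pair = min(d, key=d.get)                # closest pair of clusters
        a, b = sorted(pair, key=str)            # fixed order so output is reproducible
        merged = (a, b)
        heights[merged] = d[pair] / 2           # both descendants equidistant from ancestor
        for other in sizes:
            if other in (a, b):
                continue
            d_a = d.pop(frozenset((a, other)))
            d_b = d.pop(frozenset((b, other)))
            # arithmetic average, weighted by how many OTUs each cluster holds
            d[frozenset((merged, other))] = (
                sizes[a] * d_a + sizes[b] * d_b
            ) / (sizes[a] + sizes[b])
        del d[pair]
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes)), heights


# Hypothetical ultrametric distances among four OTUs.
example = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0, frozenset(("B", "C")): 6.0,
    frozenset(("A", "D")): 10.0, frozenset(("B", "D")): 10.0,
    frozenset(("C", "D")): 10.0,
}
tree, heights = upgma(example)
print(tree)
print(sorted(heights.values()))
```

Note what is absent: no alternative topology is ever scored, so the procedure stops after building its one tree, exactly as in the two-step "clustering method" column of the comparison.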

Clustering methods - UPGMA
- Very sensitive to unequal rates among lineages
- Real data rarely are ultrametric
- Should no longer be used
[Figure: UPGMA result compared with the true tree]

Neighbor-Joining (NJ) - Saitou & Nei (1987)
- Unjustifiably popular method, but better than UPGMA
- Clustering method to deal with non-ultrametric data
- No assumption of clock-like evolution
- Approximation of the Minimum Evolution (ME) tree (thus not "phenetic")
- For many datasets, however, NJ fails to produce the ME tree: it gets close, but better trees often exist
- e.g. 27,000 trees equal to or better than the NJ tree published by Hedges et al. (1992) (Swofford et al.)
- Thus, hard to justify an NJ tree as a reasonable estimate of phylogeny (Swofford & Sullivan, text)

Neighbor-Joining (NJ) - Farris et al. (1996) Cladistics 12:
- NJ obscures ambiguities in the data
- Sensitive to the input order of the OTUs: depending on the sorting of the OTUs, different trees might result
- This problem was addressed by jumbling the input order to build multiple NJ trees
- Typically people do NJ bootstrapping, which reduces but does not eliminate the problem of ignoring ambiguities in the data
- No means to decide which tree is better (no optimality criterion)
[Figure: two NJ trees built from the same dataset with differently sorted OTUs]

Your text - Distance methods (Yves Van de Peer)
- Repeatedly mentions that optimality criterion methods suffer the drawback that all tree topologies have to be investigated
- Clustering methods, in contrast, produce only one tree (I consider this a huge weakness)
- This is like saying: the advantage of NJ is that it ignores alternative, equally good or possibly better, estimates of phylogeny
- Similar to the pheneticists' goal to produce a stable tree that was not necessarily a phylogeny

NJ is often considered inferior, but it is preferred by those who consider a quickly produced tree more important than accuracy or an honest presentation of the signal in their data.

If a more rigorous method, an optimality criterion method, finds the same tree as NJ: publish the more rigorously obtained tree! (The NJ tree in this case tells us nothing extra, and publishing only the NJ tree tells readers that not much effort or care was put into the analysis.)
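One way to see how input order can matter is to look at the pair-selection step of NJ. The sketch below uses the standard criterion Q(i,j) = (n-2) d(i,j) - sum_k d(i,k) - sum_k d(j,k) and reports every pair tied for the minimum. It is an illustration only: the function names and both distance matrices are invented, and the slides themselves do not give this formula.

```python
from itertools import combinations


def make_dist(names, rows):
    """Turn a square list-of-lists into a dict keyed by frozenset({a, b})."""
    return {
        frozenset((a, b)): rows[i][j]
        for (i, a), (j, b) in combinations(list(enumerate(names)), 2)
    }


def nj_tied_pairs(names, dist, tol=1e-12):
    """Return all OTU pairs that minimise the NJ criterion Q for the first join."""
    n = len(names)
    row_sum = {
        i: sum(dist[frozenset((i, k))] for k in names if k != i) for i in names
    }
    q = {
        (i, j): (n - 2) * dist[frozenset((i, j))] - row_sum[i] - row_sum[j]
        for i, j in combinations(names, 2)
    }
    q_min = min(q.values())
    return [pair for pair, value in q.items() if abs(value - q_min) <= tol]


# Hypothetical matrix with a clear signal: one best pair.
names5 = ["a", "b", "c", "d", "e"]
rows5 = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(nj_tied_pairs(names5, make_dist(names5, rows5)))

# Hypothetical matrix with no signal: every pair is tied.
names4 = ["w", "x", "y", "z"]
rows4 = [[0, 2, 2, 2], [2, 0, 2, 2], [2, 2, 0, 2], [2, 2, 2, 0]]
print(nj_tied_pairs(names4, make_dist(names4, rows4)))
```

An implementation that simply takes the first minimum will join whichever tied pair the input order presents first, and it reports one tree with no indication that equally good alternatives existed.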

Clustering methods
Advocates argue that we can do NJ bootstrapping
- A method to assess the strength of the signal in the data, i.e. to assess the precision of the estimate
- Consider a bogus clustering method that builds a tree by alphabetically ordering the OTUs according to their names and ignores the data
- Bootstrapping this method would produce scores of 100% for every branch: obviously meaningless

Clustering methods
NJ is the primary method of the DNA barcoding protocol:
- HEBERT, P.D.N., PENTON, E.H., BURNS, J.M., JANZEN, D.H. & HALLWACHS, W. Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator. Proceedings of the National Academy of Sciences of the USA 101,
- Brower, A.V.Z. Problems with DNA barcodes for species delimitation: ten species of Astraptes fulgerator reassessed (Lepidoptera: Hesperiidae). Systematics and Biodiversity
- Brower bootstrapped their NJ tree and found support for at least 3 but not more than 7 clades that may correspond to cryptic species. Not 10!
- Suspected cryptic species: adults look identical

Optimality criterion for distances: Minimum Evolution (ME)
- Uses distance data (preferably corrected data)
- Optimality criterion (searches tree space; much more rigorous and time consuming than NJ)
- The tree that minimizes the sum of the lengths of the branches is the best estimate of phylogeny
- Parsimony using distance data
- Better than NJ but still weaker than character methods

Minimum Evolution example
Phillips, M. J., F. Delsuc, D. Penny. Genome-scale phylogeny and the detection of systematic biases. MBE 21(7):
- Reevaluated Rokas et al.'s (2003) dataset of 8 yeast genomes (106 nuclear genes, 127,026 nucleotides)
- Compared the two types of statistical error:
- Stochastic (random) error: deviation between estimate and true value due to sampling; by definition it will vanish with infinite data; assessed using branch support measures
- Systematic error: deviation between estimate and true value due to violated assumptions in the estimation method; it will not vanish with infinite data; branch support tells us nothing
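The bogus alphabetical clustering method mentioned under NJ bootstrapping can be simulated directly, and it also illustrates the point that branch support says nothing about systematic error. The sketch below is a toy: all function names and sequences are hypothetical, clades are treated as rooted groups for simplicity, and the "method" deliberately ignores the data.

```python
import random


def alphabetical_tree(alignment):
    """Bogus clustering method: ladder tree in alphabetical name order; data ignored."""
    names = sorted(alignment)
    tree = names[0]
    for name in names[1:]:
        tree = (tree, name)
    return tree


def clades(tree):
    """Set of clades (as frozensets of OTU names), one per internal node."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = leaves(node[0]) | leaves(node[1])
        found.add(members)
        return members

    leaves(tree)
    return found


def bootstrap_support(alignment, build_tree, replicates=100, seed=0):
    """Percent of bootstrap replicates in which each clade of the original tree recurs."""
    rng = random.Random(seed)
    n_sites = len(next(iter(alignment.values())))
    counts = dict.fromkeys(clades(build_tree(alignment)), 0)
    for _ in range(replicates):
        # resample characters (columns) with replacement
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = {
            name: "".join(seq[c] for c in cols) for name, seq in alignment.items()
        }
        for clade in clades(build_tree(resampled)):
            if clade in counts:
                counts[clade] += 1
    return {clade: 100.0 * k / replicates for clade, k in counts.items()}


# Hypothetical sequences; their content is irrelevant to the bogus method.
alignment = {"ant": "ACGTAC", "bee": "TTGTAC", "cat": "ACGGGC", "dog": "ACGTAA"}
support = bootstrap_support(alignment, alphabetical_tree)
for clade, pct in support.items():
    print(sorted(clade), pct)
```

Every clade receives 100% because the method returns the same tree regardless of which characters are resampled. High bootstrap values therefore measure the repeatability of a method on the data, not whether the method is capable of recovering the true tree.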

Minimum Evolution example
- Random error +/- gone with the huge dataset of 127,026 characters
- Systematic error evident in the ME analysis (tree on right) (even with corrected distance data!) (& loss of data)
- 100% branch support values indicate no random error
[Figure: trees from the reanalysis; the ME tree is on the right]

Minimum Evolution example
Zwickl, D. J. & D. M. Hillis. Increased taxon sampling greatly reduces phylogenetic error. Syst. Biol. 51(4):
- Investigated the impact of taxon sampling on the accuracy of estimates of phylogenies
- All optimality criteria saw a reduction in error with increased taxon sampling
- However, Minimum Evolution saw the smallest benefit and had the highest error rates overall
- Another source of phylogenetic error: use of a method that has a higher failure rate than other methods

Summary
1. Two data types: distance & character
2. Two ways to get a tree: clustering methods & optimality criterion methods (building vs searching)
3. All clustering methods are distance based, but not all distance methods are clustering methods (Minimum Evolution)

                 Tree method
Data type        Optimality criterion             Clustering algorithm
Distances        Minimum evolution                UPGMA, NJ
Characters       Parsimony, Maximum Likelihood    -

Summary: distance methods
- Should only be used with corrected data (see lecture on models of evolution)
- Phylogenetic error can result from use of distances that don't reflect true distances
- Assessment of strength of signal is critical to modern phylogenetics, but is seemingly the opposite of the goal of clustering (one-tree) methods
- Character evolution / ancestral state data are lost
- Branch lengths are sometimes estimated as less than the minimum possible (observed = minimum)
- Less capable with a difficult dataset (e.g. Rokas et al.) than character methods (e.g. parsimony)

Summary
6. If the dataset is clean (strong historical signal, low rate heterogeneity), distance methods typically do fine
7. If a comparison is made with more powerful, non-distance methods, as it should be, why bother using distance methods at all?
8. The only justification I can think of, which is weak, is that distance methods are very efficient [= fast] (and if you are lucky, they will choose the same tree as the more powerful methods)
9. Due to their speed, these methods are ideal for data exploration prior to final analyses
10. Clustering methods (e.g. NJ) do not try to find globally optimal trees

Terms - from lecture & readings
- 2n-3
- tree length
- MPT (most parsimonious tree(s))
- exhaustive (exact) search
- branch & bound search
- heuristic search
- tree space
- tree islands
- branch swapping
- ancestral states
- optimality criteria
- clustering method
- distance data
- character (discrete) data
- UPGMA
- ultrametric data
- Neighbor-joining
- minimum evolution
- stochastic (random) error
- systematic error

Study questions
- What does 'tree length' mean for a parsimony analysis?
- What are the differences between the 3 different means of searching for optimal trees? When would you use each? Pros & cons?
- What are tree islands & why are they important? How does NJ deal with tree islands?
- Compare & contrast optimality criterion methods with clustering methods. Is the requirement that tree space be extensively searched a drawback or an advantage of optimality criterion methods?
- What are the requirements of the data for an ultrametric method? Are these requirements typically met by real data?
- Why is it hard to justify an NJ tree as a reasonable estimate of phylogeny?
- Compare & contrast stochastic error with systematic error.
- All clustering methods use (distance? or character?) data.
- All distance methods are clustering methods (True or False)?


More information

LESSON 2. Opening Leads Against Suit Contracts. General Concepts. General Introduction. Group Activities. Sample Deals

LESSON 2. Opening Leads Against Suit Contracts. General Concepts. General Introduction. Group Activities. Sample Deals LESSON 2 Opening Leads Against Suit Contracts General Concepts General Introduction Group Activities Sample Deals 40 Defense in the 21st Century General Concepts Defense The opening lead against trump

More information

Using Artificial intelligent to solve the game of 2048

Using Artificial intelligent to solve the game of 2048 Using Artificial intelligent to solve the game of 2048 Ho Shing Hin (20343288) WONG, Ngo Yin (20355097) Lam Ka Wing (20280151) Abstract The report presents the solver of the game 2048 base on artificial

More information

Game-Playing & Adversarial Search

Game-Playing & Adversarial Search Game-Playing & Adversarial Search This lecture topic: Game-Playing & Adversarial Search (two lectures) Chapter 5.1-5.5 Next lecture topic: Constraint Satisfaction Problems (two lectures) Chapter 6.1-6.4,

More information

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis

More information

Coding for Efficiency

Coding for Efficiency Let s suppose that, over some channel, we want to transmit text containing only 4 symbols, a, b, c, and d. Further, let s suppose they have a probability of occurrence in any block of text we send as follows

More information

Ancestral Recombination Graphs

Ancestral Recombination Graphs Ancestral Recombination Graphs Ancestral relationships among a sample of recombining sequences usually cannot be accurately described by just a single genealogy. Linked sites will have similar, but not

More information

Chapter 2 Distributed Consensus Estimation of Wireless Sensor Networks

Chapter 2 Distributed Consensus Estimation of Wireless Sensor Networks Chapter 2 Distributed Consensus Estimation of Wireless Sensor Networks Recently, consensus based distributed estimation has attracted considerable attention from various fields to estimate deterministic

More information

6.047/6.878 Lecture 21: Phylogenomics II

6.047/6.878 Lecture 21: Phylogenomics II Guest Lecture by Matt Rasmussen Orit Giguzinsky and Ethan Sherbondy December 13, 2012 1 Contents 1 Introduction 3 2 Inferring Orthologs/Paralogs, Gene Duplication and Loss 3 2.1 Species Tree..............................................

More information

Lecture 4: Chapter 4

Lecture 4: Chapter 4 Lecture 4: Chapter 4 C C Moxley UAB Mathematics 19 September 16 4.2 Basic Concepts of Probability Procedure Event Simple Event Sample Space 4.2 Basic Concepts of Probability Procedure Event Simple Event

More information

Do You Understand Evolutionary Trees? By T. Ryan Gregory

Do You Understand Evolutionary Trees? By T. Ryan Gregory Do You Understand Evolutionary Trees? By T. Ryan Gregory A single figure graces the pages of Charles Darwin's groundbreaking work On the Origin of Species, first published in 1859. The figure in question

More information

CS510 \ Lecture Ariel Stolerman

CS510 \ Lecture Ariel Stolerman CS510 \ Lecture04 2012-10-15 1 Ariel Stolerman Administration Assignment 2: just a programming assignment. Midterm: posted by next week (5), will cover: o Lectures o Readings A midterm review sheet will

More information

PROCESS-VOLTAGE-TEMPERATURE (PVT) VARIATIONS AND STATIC TIMING ANALYSIS

PROCESS-VOLTAGE-TEMPERATURE (PVT) VARIATIONS AND STATIC TIMING ANALYSIS PROCESS-VOLTAGE-TEMPERATURE (PVT) VARIATIONS AND STATIC TIMING ANALYSIS The major design challenges of ASIC design consist of microscopic issues and macroscopic issues [1]. The microscopic issues are ultra-high

More information

Coalescents. Joe Felsenstein. GENOME 453, Winter Coalescents p.1/39

Coalescents. Joe Felsenstein. GENOME 453, Winter Coalescents p.1/39 Coalescents Joe Felsenstein GENOME 453, Winter 2007 Coalescents p.1/39 Cann, Stoneking, and Wilson Becky Cann Mark Stoneking the late Allan Wilson Cann, R. L., M. Stoneking, and A. C. Wilson. 1987. Mitochondrial

More information

Population Genetics using Trees. Peter Beerli Genome Sciences University of Washington Seattle WA

Population Genetics using Trees. Peter Beerli Genome Sciences University of Washington Seattle WA Population Genetics using Trees Peter Beerli Genome Sciences University of Washington Seattle WA Outline 1. Introduction to the basic coalescent Population models The coalescent Likelihood estimation of

More information

LECTURE VI: LOSSLESS COMPRESSION ALGORITHMS DR. OUIEM BCHIR

LECTURE VI: LOSSLESS COMPRESSION ALGORITHMS DR. OUIEM BCHIR 1 LECTURE VI: LOSSLESS COMPRESSION ALGORITHMS DR. OUIEM BCHIR 2 STORAGE SPACE Uncompressed graphics, audio, and video data require substantial storage capacity. Storing uncompressed video is not possible

More information

Chapter 7: Sorting 7.1. Original

Chapter 7: Sorting 7.1. Original Chapter 7: Sorting 7.1 Original 3 1 4 1 5 9 2 6 5 after P=2 1 3 4 1 5 9 2 6 5 after P=3 1 3 4 1 5 9 2 6 5 after P=4 1 1 3 4 5 9 2 6 5 after P=5 1 1 3 4 5 9 2 6 5 after P=6 1 1 3 4 5 9 2 6 5 after P=7 1

More information

Artificial Intelligence Lecture 3

Artificial Intelligence Lecture 3 Artificial Intelligence Lecture 3 The problem Depth first Not optimal Uses O(n) space Optimal Uses O(B n ) space Can we combine the advantages of both approaches? 2 Iterative deepening (IDA) Let M be a

More information

Regulatory Motif Finding II

Regulatory Motif Finding II Regulatory Motif Finding II Lectures 13 Nov 9, 2011 CSE 527 Computational Biology, Fall 2011 Instructor: Su-In Lee TA: Christopher Miles Monday & Wednesday 12:00-1:20 Johnson Hall (JHN) 022 1 Outline Regulatory

More information

Search then involves moving from state-to-state in the problem space to find a goal (or to terminate without finding a goal).

Search then involves moving from state-to-state in the problem space to find a goal (or to terminate without finding a goal). Search Can often solve a problem using search. Two requirements to use search: Goal Formulation. Need goals to limit search and allow termination. Problem formulation. Compact representation of problem

More information

Similarity & Link Analysis. Stony Brook University CSE545, Fall 2016

Similarity & Link Analysis. Stony Brook University CSE545, Fall 2016 Similarity & Link nalysis Stony rook University SE545, Fall 6 Finding Similar Items? (http://blog.soton.ac.uk/hive//5//r ecommendation-system-of-hive/) (http://www.datacommunitydc.org/blog/ 3/8/entity-resolution-for-big-data)

More information

Coalescents. Joe Felsenstein. GENOME 453, Autumn Coalescents p.1/48

Coalescents. Joe Felsenstein. GENOME 453, Autumn Coalescents p.1/48 Coalescents p.1/48 Coalescents Joe Felsenstein GENOME 453, Autumn 2015 Coalescents p.2/48 Cann, Stoneking, and Wilson Becky Cann Mark Stoneking the late Allan Wilson Cann, R. L., M. Stoneking, and A. C.

More information

Introduction to Algorithms / Algorithms I Lecturer: Michael Dinitz Topic: Algorithms and Game Theory Date: 12/4/14

Introduction to Algorithms / Algorithms I Lecturer: Michael Dinitz Topic: Algorithms and Game Theory Date: 12/4/14 600.363 Introduction to Algorithms / 600.463 Algorithms I Lecturer: Michael Dinitz Topic: Algorithms and Game Theory Date: 12/4/14 25.1 Introduction Today we re going to spend some time discussing game

More information

4. Games and search. Lecture Artificial Intelligence (4ov / 8op)

4. Games and search. Lecture Artificial Intelligence (4ov / 8op) 4. Games and search 4.1 Search problems State space search find a (shortest) path from the initial state to the goal state. Constraint satisfaction find a value assignment to a set of variables so that

More information

Entropy Coding. Outline. Entropy. Definitions. log. A = {a, b, c, d, e}

Entropy Coding. Outline. Entropy. Definitions. log. A = {a, b, c, d, e} Outline efinition of ntroy Three ntroy coding techniques: Huffman coding rithmetic coding Lemel-Ziv coding ntroy oding (taken from the Technion) ntroy ntroy of a set of elements e,,e n with robabilities,

More information

Instability of Scoring Heuristic In games with value exchange, the heuristics are very bumpy Make smoothing assumptions search for "quiesence"

Instability of Scoring Heuristic In games with value exchange, the heuristics are very bumpy Make smoothing assumptions search for quiesence More on games Gaming Complications Instability of Scoring Heuristic In games with value exchange, the heuristics are very bumpy Make smoothing assumptions search for "quiesence" The Horizon Effect No matter

More information

The Queen of Sheba Comes to Visit Solomon

The Queen of Sheba Comes to Visit Solomon The Queen of Sheba Comes to Visit Solomon Ian C. McKay, 20 April 2011 I recently examined and compared four ancient versions of the story of the census of Israel and Judah ordered by King David, with a

More information

Department of Statistics and Operations Research Undergraduate Programmes

Department of Statistics and Operations Research Undergraduate Programmes Department of Statistics and Operations Research Undergraduate Programmes OPERATIONS RESEARCH YEAR LEVEL 2 INTRODUCTION TO LINEAR PROGRAMMING SSOA021 Linear Programming Model: Formulation of an LP model;

More information

Jitter Analysis Techniques Using an Agilent Infiniium Oscilloscope

Jitter Analysis Techniques Using an Agilent Infiniium Oscilloscope Jitter Analysis Techniques Using an Agilent Infiniium Oscilloscope Product Note Table of Contents Introduction........................ 1 Jitter Fundamentals................. 1 Jitter Measurement Techniques......

More information

GENOMIC REARRANGEMENT ALGORITHMS

GENOMIC REARRANGEMENT ALGORITHMS GENOMIC REARRANGEMENT ALGORITHMS KAREN LOSTRITTO Abstract. In this paper, I discuss genomic rearrangement. Specifically, I describe the formal representation of these genomic rearrangements as well as

More information

On the GNSS integer ambiguity success rate

On the GNSS integer ambiguity success rate On the GNSS integer ambiguity success rate P.J.G. Teunissen Mathematical Geodesy and Positioning Faculty of Civil Engineering and Geosciences Introduction Global Navigation Satellite System (GNSS) ambiguity

More information

A Brief Introduction to Information Theory and Lossless Coding

A Brief Introduction to Information Theory and Lossless Coding A Brief Introduction to Information Theory and Lossless Coding 1 INTRODUCTION This document is intended as a guide to students studying 4C8 who have had no prior exposure to information theory. All of

More information

CS 229 Final Project: Using Reinforcement Learning to Play Othello

CS 229 Final Project: Using Reinforcement Learning to Play Othello CS 229 Final Project: Using Reinforcement Learning to Play Othello Kevin Fry Frank Zheng Xianming Li ID: kfry ID: fzheng ID: xmli 16 December 2016 Abstract We built an AI that learned to play Othello.

More information

CMPT 310 Assignment 1

CMPT 310 Assignment 1 CMPT 310 Assignment 1 October 16, 2017 100 points total, worth 10% of the course grade. Turn in on CourSys. Submit a compressed directory (.zip or.tar.gz) with your solutions. Code should be submitted

More information

Notes on 4-coloring the 17 by 17 grid

Notes on 4-coloring the 17 by 17 grid otes on 4-coloring the 17 by 17 grid lizabeth upin; ekupin@math.rutgers.edu ugust 5, 2009 1 or large color classes, 5 in each row, column color class is large if it contains at least 73 points. We know

More information

Dice Games and Stochastic Dynamic Programming

Dice Games and Stochastic Dynamic Programming Dice Games and Stochastic Dynamic Programming Henk Tijms Dept. of Econometrics and Operations Research Vrije University, Amsterdam, The Netherlands Revised December 5, 2007 (to appear in the jubilee issue

More information

Study guide for Graduate Computer Vision

Study guide for Graduate Computer Vision Study guide for Graduate Computer Vision Erik G. Learned-Miller Department of Computer Science University of Massachusetts, Amherst Amherst, MA 01003 November 23, 2011 Abstract 1 1. Know Bayes rule. What

More information

Variant Calling. Michael Schatz. Feb 20, 2018 Lecture 7: Applied Comparative Genomics

Variant Calling. Michael Schatz. Feb 20, 2018 Lecture 7: Applied Comparative Genomics Variant Calling Michael Schatz Feb 20, 2018 Lecture 7: Applied Comparative Genomics Mission Impossible 1. Setup VirtualBox 2. Initialize Tools 3. Download Reference Genome & Reads 4. Decode the secret

More information

LANDSCAPE SMOOTHING OF NUMERICAL PERMUTATION SPACES IN GENETIC ALGORITHMS

LANDSCAPE SMOOTHING OF NUMERICAL PERMUTATION SPACES IN GENETIC ALGORITHMS LANDSCAPE SMOOTHING OF NUMERICAL PERMUTATION SPACES IN GENETIC ALGORITHMS ABSTRACT The recent popularity of genetic algorithms (GA s) and their application to a wide range of problems is a result of their

More information

OPTIMIZING APPLICATIONS AND DATA LINKS FOR HF RADIO INTERMEDIATE TERM VARIATION: CAN YOU RIDE THE WAVE?

OPTIMIZING APPLICATIONS AND DATA LINKS FOR HF RADIO INTERMEDIATE TERM VARIATION: CAN YOU RIDE THE WAVE? OPTIMIZING APPLICATIONS AND DATA LINKS FOR HF RADIO INTERMEDIATE TERM VARIATION: CAN YOU RIDE THE WAVE? Steve Kille Isode Ltd Hampton, UK steve.kille@isode.com SUMMARY HF Radio transmission is subject

More information

Opportunistic Routing in Wireless Mesh Networks

Opportunistic Routing in Wireless Mesh Networks Opportunistic Routing in Wireless Mesh Networks Amir arehshoorzadeh amir@ac.upc.edu Llorenç Cerdá-Alabern llorenc@ac.upc.edu Vicent Pla vpla@dcom.upv.es August 31, 2012 Opportunistic Routing in Wireless

More information

Adversarial Search. CS 486/686: Introduction to Artificial Intelligence

Adversarial Search. CS 486/686: Introduction to Artificial Intelligence Adversarial Search CS 486/686: Introduction to Artificial Intelligence 1 Introduction So far we have only been concerned with a single agent Today, we introduce an adversary! 2 Outline Games Minimax search

More information

Viral epidemiology and the Coalescent

Viral epidemiology and the Coalescent Viral epidemiology and the Coalescent Philippe Lemey and Marc A. Suchard Department of Microbiology and Immunology K.U. Leuven, and Departments of Biomathematics and Human Genetics David Geffen School

More information

Proposed Graduate Course at ANU: Statistical Communication Theory

Proposed Graduate Course at ANU: Statistical Communication Theory Proposed Graduate Course at ANU: Statistical Communication Theory Mark Reed mark.reed@nicta.com.au Title of the course: Statistical Communication Theory Course Director: Dr. Mark Reed (ANU Adjunct Fellow)

More information

Frequency Hopping Pattern Recognition Algorithms for Wireless Sensor Networks

Frequency Hopping Pattern Recognition Algorithms for Wireless Sensor Networks Frequency Hopping Pattern Recognition Algorithms for Wireless Sensor Networks Min Song, Trent Allison Department of Electrical and Computer Engineering Old Dominion University Norfolk, VA 23529, USA Abstract

More information

Hypothesis Tests. w/ proportions. AP Statistics - Chapter 20

Hypothesis Tests. w/ proportions. AP Statistics - Chapter 20 Hypothesis Tests w/ proportions AP Statistics - Chapter 20 let s say we flip a coin... Let s flip a coin! # OF HEADS IN A ROW PROBABILITY 2 3 4 5 6 7 8 (0.5) 2 = 0.2500 (0.5) 3 = 0.1250 (0.5) 4 = 0.0625

More information