DNA: Statistical Guidelines

Frequency calculations for STR analysis

When a probative association between an evidence profile and a reference profile is made, a frequency estimate is calculated to give weight to the association. Frequency estimates are calculated for at least three major population groups, generally Caucasian, African American, and Hispanic. Estimates for additional population/ethnic groups known to be relevant to the case, and for which data are available, may also be calculated if deemed appropriate or if requested.

Reference data for allele frequencies

AmpFℓSTR Identifiler Plus frequency estimates will use empirical values tabulated from data in the following references:

- Budowle, B., et al., "Population data on the thirteen CODIS core short tandem repeat loci in African Americans, U.S. Caucasians, Hispanics, Bahamians, Jamaicans, and Trinidadians," Journal of Forensic Sciences, 1999, 44(6), pp. 1277-1286.
- Budowle, B., "Genotype Profiles for Five Population Groups at the Short Tandem Repeat Loci D2S1338 and D19S433," Forensic Science Communications, 2001, 3(3).
- Moretti, T., et al., "Erratum," Journal of Forensic Sciences, 2015, 60(4), pp. 1114-1116.

Statistical calculations

Based on the interpretation of the profile, the analyst may apply one of the following statistical calculations:

Random Match Probability (RMP)
The RMP estimates the probability that a profile from a random person in the population is consistent with the profile from the evidence sample.

Combined Probability of Inclusion (CPI)
The CPI estimates the proportion of the population that would be included as a possible contributor to an observed mixture, that is, the probability that a randomly selected person would be included.

Random match probability

Depending on the mixture type (refer to Steps for Profile Interpretation, Step 4), either a restricted or an unrestricted random match probability may be applied.

The restricted RMP is conditioned on the number of contributors and takes into account quantitative peak height information and inferred contributor mixture ratios. A restricted approach limits the genotypic combinations of possible contributors.

The unrestricted RMP is also conditioned on the number of contributors, but is performed without consideration of quantitative peak height information or inference of contributor mixture ratios.

Both use the following formulas:

Heterozygote genotype frequency: $2pq$
Homozygote genotype frequency: $p^2 + p(1 - p)\theta$
Obligate allele with dropout ("2p"): $2p - p^2$

where
$p$ = the frequency of allele p
$q$ = the frequency of allele q
$\theta$ = homozygote correction factor (see Correction factor for homozygotes, below)

The appropriate calculation to estimate the frequency of all genotypes that include an obligate allele (with a frequency of $p$) is 2p. The laboratory will use the expanded 2p formula, $2p - p^2$. If the 2p formula is used more than once at a locus, the frequency of one of the duplicated heterozygote genotypes will be removed.

When the interpretation for a locus or a profile is inconclusive, that locus or profile will not be used for statistical analysis.

When the interpretation at a locus includes only one genotype, the appropriate formula above is used to calculate the genotype frequency.

When the interpretation at a locus includes more than one genotype, the RMP is the sum of the individual frequencies for the genotypes included following mixture interpretation. Adding the frequencies of each genotype provides the frequency of genotype A or genotype B.
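The per-locus genotype formulas above can be sketched in Python as follows. This is a minimal illustration, not the laboratory's software: the function names and the example allele frequencies (0.1 and 0.2) are hypothetical, and the default correction factor of 0.01 is the value this document gives for the general case.

```python
def heterozygote_freq(p, q):
    """Heterozygote genotype frequency: 2pq."""
    return 2 * p * q


def homozygote_freq(p, theta=0.01):
    """Homozygote genotype frequency with correction: p^2 + p(1 - p)*theta."""
    return p * p + p * (1 - p) * theta


def obligate_allele_freq(p):
    """Expanded '2p' formula for an obligate allele with dropout: 2p - p^2."""
    return 2 * p - p * p


def two_obligate_alleles_freq(p, q):
    """'2p' applied to two obligate alleles at one locus.

    The heterozygote pq is counted by both terms, so one copy (2pq)
    is removed, as the procedure describes.
    """
    return obligate_allele_freq(p) + obligate_allele_freq(q) - heterozygote_freq(p, q)


def locus_frequency(genotype_freqs):
    """Sum the frequencies of all genotypes included at one locus."""
    return sum(genotype_freqs)


# Hypothetical allele frequencies, for illustration only.
p, q = 0.1, 0.2
print(heterozygote_freq(p, q))
print(homozygote_freq(p))
print(obligate_allele_freq(p))
print(two_obligate_alleles_freq(p, q))
print(locus_frequency([heterozygote_freq(p, q), homozygote_freq(p)]))
```

Passing `theta=0.03` to `homozygote_freq` would apply the larger correction factor that the procedure uses for isolated sub-populations.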

The frequencies calculated for all of the individual loci are then multiplied together, using the product rule, to give the estimated probability of the profile as a whole. If adding the individual genotype frequencies at a locus generates a number greater than one, round the number to 1.0. For examples, refer to DNA: Statistical Calculations.

Combined Probability of Inclusion (CPI)

CPI is applied to mixture profiles where the contributions of individual donors cannot be resolved. There are no assumptions about the number of contributors when using CPI. Loci with alleles below the stochastic threshold may not be used for statistical purposes.

The probability of inclusion (PI) calculation provides an estimate, at a locus, of the portion of the population that has a genotype that is represented in the mixed profile, and therefore would be included as a possible contributor to the mixed profile.

Example: If evidence includes three alleles (A1, A2, A3) at a locus, with frequencies $a_1$, $a_2$, and $a_3$, then:

$PI = (a_1 + a_2 + a_3)^2$
$PI = a_1^2 + a_2^2 + a_3^2 + 2a_1a_2 + 2a_1a_3 + 2a_2a_3$

In this example, the only genotypes that would be included in the PI calculation and as possible contributors to the mixture would be:

A1A1, A1A2, A2A2, A1A3, A3A3, A2A3

The PI at each locus is first determined, and then the PIs from all of the included loci are multiplied together, using the product rule, to give the combined probability of inclusion (CPI).

A locus that contains a single allele will not be included in a PI calculation if allele dropout is suspected. If the analyst concludes that there is no allele dropout, and all contributors are represented by the single allele (all homozygous), then the allele can be included in a PI calculation.
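The product rule for the RMP and the PI/CPI calculation described above can be sketched as follows. This is an illustrative sketch only; the function names and all allele frequencies are hypothetical and do not come from the referenced population data.

```python
from math import prod


def profile_rmp(locus_freqs):
    """Multiply per-locus frequencies (product rule).

    Any per-locus sum greater than one is rounded down to 1.0 first,
    as the procedure describes.
    """
    return prod(min(f, 1.0) for f in locus_freqs)


def probability_of_inclusion(allele_freqs):
    """PI at one locus: the square of the summed allele frequencies."""
    return sum(allele_freqs) ** 2


def combined_probability_of_inclusion(loci):
    """CPI: product of the per-locus PI values over all included loci."""
    return prod(probability_of_inclusion(freqs) for freqs in loci)


# Hypothetical mixture: three alleles at one locus, two at another.
mixture = [
    [0.1, 0.2, 0.3],
    [0.25, 0.25],
]
print(probability_of_inclusion(mixture[0]))
print(combined_probability_of_inclusion(mixture))
print(profile_rmp([0.04, 0.0109, 1.2]))
```

Loci excluded from the statistic (for example, those with alleles below the stochastic threshold) would simply be left out of the list passed to `combined_probability_of_inclusion`.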

Conservative calculations

The following concepts were implemented to ensure that the frequency estimates are conservatively calculated:

Correction factor for homozygotes
To account for non-random mating, $\theta$ is applied to the calculation for a homozygote genotype. Empirical studies have shown that a conservative value for $\theta$ is 0.01. A value of 0.03 is applied when calculating a homozygote genotype for an isolated sub-population such as a Native American population.

Five-event minimum allele frequency
A five-event minimum allele frequency is used for rare alleles. For each individual allele, an observed allele count less than five is raised to five. This modified allele count is converted to a frequency and used for all subsequent genotype calculations.

Conclusions

The interpretation and comparison of evidence profiles to reference profiles will lead to the conclusions the analyst makes. Refer to DNA: Profile Interpretation for additional information.

NOTE: A statistical calculation less than 1 in 2 for any of the three population groups should not be used to make an inclusion. The profile can, however, be used to exclude an individual.

Haplotype statistics for Y-STR analysis

A consolidated United States Y-STR population database (www.usystrdatabase.org), consisting of anonymous Y-STR profiles from various population/ethnic groups, has been established and should be used for reporting the significance of a Y-STR inclusion. The website also has the statistical formulas that are used to calculate frequency estimates.

A search of the database provides the number of times a specific haplotype is observed in the database. The basis for the haplotype frequency estimation is the counting method. Applying the upper bound of a confidence interval accounts for sampling uncertainty. Typically, upper bound frequency estimates for African Americans, Caucasians, and Hispanics will be used for reporting purposes.

Examples of frequency estimate calculations can be found in DNA: Statistical Calculations.
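The five-event minimum and the counting method with an upper confidence bound can be sketched as follows. Several points here are assumptions rather than statements from this procedure: the denominator for the allele frequency is taken to be the total number of alleles typed at the locus, and the upper bound uses a common normal-approximation form for observed haplotypes and $1 - \alpha^{1/N}$ for unobserved ones. The formulas actually used are those published with the Y-STR database, which may differ. All names and numbers are hypothetical.

```python
import math


def min_allele_frequency(count, total_alleles, minimum=5):
    """Raise an observed allele count below `minimum` to `minimum`,
    then convert it to a frequency.

    Assumption: `total_alleles` is the number of alleles typed at the
    locus in the database (twice the number of individuals).
    """
    return max(count, minimum) / total_alleles


def ystr_upper_bound(observed, database_size, alpha=0.05, z=1.96):
    """Upper-bound haplotype frequency estimate by the counting method.

    Assumption: normal-approximation bound when the haplotype has been
    observed, and 1 - alpha**(1/N) when it has not. This is one common
    approach, not necessarily the database's own formula.
    """
    if observed == 0:
        return 1 - alpha ** (1 / database_size)
    p = observed / database_size
    return p + z * math.sqrt(p * (1 - p) / database_size)


# Hypothetical counts, for illustration only.
print(min_allele_frequency(2, 400))    # count of 2 is raised to 5
print(min_allele_frequency(20, 400))   # count of 20 is used as observed
print(ystr_upper_bound(0, 1000))
print(ystr_upper_bound(10, 1000))
```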

Generating paternity statistics

The CODIS Popstats software will be used to generate the Combined Paternity Index (CPI) and Probability of Paternity (PP) (see sections that follow) for the three major population groups reported:

- Caucasian
- African American
- Hispanic

For Hispanics, the data for the Southwestern Hispanic population group are reported. Data for the Southeastern Hispanic, Chamorro, and Filipino population groups are automatically generated, but these data will not be reported.

Popstats compares the child and alleged parent profiles at each locus and automatically computes the following values:

- Parentage Index
- Probability of Exclusion
- Probability of Parentage

This laboratory will only enter the mother as the known parent and the alleged father as the alleged parent. Therefore, the Parentage Index generated will be reported as the Paternity Index, and the Probability of Parentage will be reported as the Probability of Paternity. The Probability of Exclusion will not be reported.

Popstats calculations

In order to perform parentage calculations using Popstats, three profiles must be entered:

- mother
- child
- suspected father

The following procedure is used to obtain results.

Step  Action
1     Log on to CODIS and open the Analyst Workbench program. Select Popstats.
2     Select Parentage Calculation.
3     Enter the profiles of:
      - Biological parent (mother)
      - Child/product of conception
      - Alleged parent (suspected father)
4     Click on Calculate and the window will display the Parentage Statistics.

Paternity index

The Paternity Index (PI) is a likelihood ratio based on two conditional probabilities:

$PI_{\text{locus}} = \dfrac{P(\text{the alleged father passed an allele to the child})}{P(\text{a randomly selected man passed an allele to the child})}$

The Paternity Index reflects how many times more likely it is to observe a particular set of alleles under the hypothesis that the alleged father is the biological father than under the hypothesis that a randomly selected man is the biological father. The Paternity Index is based on the assumption that the randomly selected man has a similar ethnic background to the alleged father.

Formulas for the Paternity Index are listed in the Parentage Formula Table in the Popstats software. The exact formula for the Paternity Index depends on the obligate paternal allele and the homozygosity of the alleged father. Obligate paternal alleles are alleles that the biological father is required to have based on the relationship between the mother and the child.
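The likelihood ratio behind the per-locus Paternity Index can be sketched from first principles as follows. This is not the Popstats Parentage Formula Table and does not reproduce Popstats itself. It assumes simple Mendelian transmission with no mutation, and the function names and allele frequencies are hypothetical.

```python
def transmission(genotype):
    """Probability that a parent with this genotype passes on each allele."""
    probs = {}
    for allele in genotype:
        probs[allele] = probs.get(allele, 0.0) + 0.5
    return probs


def paternity_index(mother, child, alleged_father, freqs):
    """Per-locus PI for a mother/child/alleged-father trio.

    Numerator: probability of the child's genotype if the alleged
    father is the biological father.
    Denominator: the same probability if a random man, whose alleles
    follow the population frequencies `freqs`, is the father.
    Assumption: no mutation.
    """
    child = tuple(sorted(child))
    mom = transmission(mother)
    dad = transmission(alleged_father)

    numerator = sum(
        p_m * p_d
        for m, p_m in mom.items()
        for d, p_d in dad.items()
        if tuple(sorted((m, d))) == child
    )
    denominator = sum(
        p_m * f_a
        for m, p_m in mom.items()
        for a, f_a in freqs.items()
        if tuple(sorted((m, a))) == child
    )
    if denominator == 0:
        raise ValueError("child genotype is inconsistent with the mother")
    return numerator / denominator


# Hypothetical allele frequencies at one locus.
freqs = {10: 0.2, 11: 0.3, 12: 0.1, 13: 0.4}
# Obligate paternal allele is 12; the alleged father is heterozygous for it.
print(paternity_index((10, 11), (11, 12), (12, 13), freqs))
# Same case with a homozygous alleged father.
print(paternity_index((10, 11), (11, 12), (12, 12), freqs))
```

When the alleged father carries no obligate paternal allele, the function returns 0, which is the case where the procedure turns to a mutation rate and mean power of exclusion instead.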

Combined paternity index

The Combined Paternity Index (CPI) for a DNA profile is calculated using the product rule by multiplying the individual PIs for each locus tested.

Probability of paternity

The Probability of Paternity (PP) is based upon Bayes' Theorem. The probability of paternity tests the hypothesis that the alleged father is the biological father by incorporating a prior probability that the alleged father is the true biological father.

$PP = \dfrac{CPI \times (\text{prior probability})}{CPI \times (\text{prior probability}) + [1 - (\text{prior probability})]}$

The laboratory uses a prior probability set to a neutral value of 0.5, which simplifies the formula to:

$PP = \dfrac{CPI}{1 + CPI}$

For example, a probability of paternity of 99% reflects a 99% probability that the hypothesis that the alleged father is the biological father is correct and a 1% probability that this hypothesis is incorrect.
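The two formulas above can be sketched as follows. The function names and the per-locus PI values are hypothetical; only the formulas and the neutral prior of 0.5 come from the procedure.

```python
from math import prod


def combined_paternity_index(locus_pis):
    """CPI: product of the per-locus Paternity Index values."""
    return prod(locus_pis)


def probability_of_paternity(cpi, prior=0.5):
    """PP from Bayes' Theorem; with prior = 0.5 this equals CPI / (1 + CPI)."""
    return cpi * prior / (cpi * prior + (1 - prior))


# Hypothetical per-locus PIs, for illustration only.
cpi = combined_paternity_index([5.0, 10.0, 1.25])
print(cpi)
print(probability_of_paternity(cpi))
print(probability_of_paternity(99))
```

With the neutral prior, a CPI of 99 corresponds to a probability of paternity of 0.99, which matches the 99% example in the text.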

Paternal mutation rates and mean power of exclusion

In cases where there is a paternal mutation, Popstats requires a mutation rate and mean power of exclusion to be entered for that locus. The current mutation rates and mean powers of exclusion for each population group can be found in the following table:

Locus     Paternal        Mean Power of Exclusion
          Mutation Rate   African American   Caucasian   Hispanic
D8S1179   0.002031        0.5747954          0.6122379   0.6027166
D21S11    0.001709        0.7233475          0.702609    0.6412956
D7S820    0.00134         0.5757591          0.61646     0.5622929
CSF1PO    0.002021        0.5777266          0.492312    0.4524626
D3S1358   0.001691        0.5433942          0.5644      0.491526
TH01      0.00007         0.5111277          0.5719272   0.5316202
D13S317   0.001743        0.4655392          0.56405     0.6544299
D16S539   0.001127        0.6025692          0.5579725   0.5609566
D2S1338   0.001526        0.70132            0.737616    0.675526
D19S433   0.000745        0.6923103          0.5729251   0.675710
VWA       0.00325         0.6239527          0.6252292   0.562939
TPOX      0.00013         0.55445            0.376957    0.360941
D18S51    0.00253         0.7442273          0.743559    0.7463515
D5S818    0.001742        0.5050401          0.4260790   0.493753
FGA       0.003713        0.7279926          0.7170999   0.7515565