Nessie is alive! Gerco Onderwater. Role of statistics, bias and reproducibility in scientific research

Transcription:

Nessie is alive! Role of statistics, bias and reproducibility in scientific research Gerco Onderwater c.j.g.onderwater@rug.nl

4/23/15 2 Loch Ness, Scotland

4/23/15 3 Legendary monster
Saint Adomnán of Iona describes the wonders worked by Saint Columba in 565 AD:
"[At] the river Ness a poor unfortunate little fellow, whom some water monster had a little before snatched at as he was swimming, and bitten with a most savage bite..."

4/23/15 4 The surgeon's photo April 19, 1934

4/23/15 5 Sightings aplenty!

4/23/15 6 Controversy Legend, fact or fake?

4/23/15 7 Some opinions
"I think it is just a lot of tripe!"
"I don't quite believe it."
"I'm wondering if it's a stunt"
"If I stay here much longer... I shall see it"
"It may possibly exist"
"I don't know what to think"

4/23/15 8 Getting beyond opinions Time for research!

4/23/15 9 Scientific method
1. Define a question
2. Gather information and resources (observe)
3. Form an explanatory hypothesis
4. Test the hypothesis by performing an experiment and collecting data in a reproducible manner
5. Analyze the data
6. Interpret the data and draw conclusions that serve as a starting point for a new hypothesis
7. Publish results
8. Retest (frequently done by other scientists)

4/23/15 11 Experimentation
Cannot deal with full range of possibilities
Select a representative sample
Perform measurement
Infer properties of full population

4/23/15 12 Probability
Using the properties of the population and the selection process, the probability for an outcome is predictable with certainty; the outcome itself is subject to chance.
[Diagram: one Truth leading to many Observations]

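4/23/15 13 Demo
Probability that no two of n people share a birthday:
$p_0(n) = \frac{365}{365} \cdot \frac{364}{365} \cdot \frac{363}{365} \cdot \frac{362}{365} \cdots \frac{365-n+1}{365} \approx 0.03$ for $n = 50$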
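The demo appears to be the classic birthday problem: the chance that none of n people share a birthday. A minimal sketch, assuming 365 equally likely birthdays and ignoring leap years (the function name `prob_no_shared_birthday` is made up for illustration):

```python
def prob_no_shared_birthday(n, days=365):
    """Probability that n people all have different birthdays.

    Assumes every day of the year is equally likely (an idealization).
    """
    p = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p *= (days - k) / days
    return p


if __name__ == "__main__":
    for n in (10, 23, 50):
        print(n, round(prob_no_shared_birthday(n), 4))
```

The probability itself is exact under these assumptions; whether a particular group of 50 people contains a shared birthday is still left to chance.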

4/23/15 14 Likelihood
In research we have to do the reverse: which explanation is most likely, given this observation?
[Diagram: one Observation traced back to candidate explanations: Truth, Speculation, Wild guess, Lie]

4/23/15 15 Statistics
Branch of mathematics dealing with the collection, analysis, interpretation, and presentation of numerical data
Descriptive statistics: summarize data for a concise overview ("the mean grade of HC students is...")
Inferential statistics: make claims about a population based on a sample ("my HC students were OK, so most likely they're all")

4/23/15 16 Descriptive statistics
[Slide shows the raw list of 1000 die rolls (values 1 to 6), summarized in the table below]

n    N
1    171
2    153
3    169
4    176
5    174
6    157

μ = 3.50, σ = 1.69
χ²/ndf = 2.75/5, p = 0.74
1000 = 6 x 166 + 4
Deviations can & must be there!
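The summary numbers on this slide can be checked from the tally alone. The sketch below recomputes the mean, the standard deviation, and a Pearson χ² against a fair die (expected count 1000/6 per face). Using Pearson's definition is an assumption; it gives a χ² close to, but not exactly, the 2.75 on the slide, which may have used a different uncertainty per bin. The helper `chi2_sf_5dof` is a hypothetical name for the closed-form tail probability at 5 degrees of freedom.

```python
import math

# Tally of the 1000 die rolls from the slide: face -> number of occurrences.
counts = {1: 171, 2: 153, 3: 169, 4: 176, 5: 174, 6: 157}

n_total = sum(counts.values())
mean = sum(face * n for face, n in counts.items()) / n_total
variance = sum(n * (face - mean) ** 2 for face, n in counts.items()) / n_total
sigma = math.sqrt(variance)

# Pearson chi-square against the fair-die expectation.
expected = n_total / len(counts)
chi2 = sum((n - expected) ** 2 / expected for n in counts.values())
ndf = len(counts) - 1  # the total is fixed, so one degree of freedom is lost


def chi2_sf_5dof(x):
    """Upper-tail probability of a chi-square variable with 5 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 / math.pi) * math.exp(-x / 2) * (
        math.sqrt(x) + x ** 1.5 / 3
    )


p_value = chi2_sf_5dof(chi2)

if __name__ == "__main__":
    print(f"mean = {mean:.2f}, sigma = {sigma:.2f}")
    print(f"chi2/ndf = {chi2:.2f}/{ndf}, p = {p_value:.2f}")
```

A p-value of this size says the observed deviations from 166.67 per face are entirely ordinary for a fair die.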

4/23/15 17 Descriptive statistics (sample)
Includes fitting
Parameters calculated from observations
Parameters (thus) have uncertainty
Functional form is assumed...
My first publication...

4/23/15 18 Inferential statistics
Check whether your assumptions are correct
Best match doesn't mean good match (just that nothing else was better)
Statistical fluctuations are predictable, so goodness-of-fit can be tested, e.g. with χ²
χ² itself also fluctuates; it follows the χ²-distribution
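That χ² itself fluctuates can be seen by repeating a simulated experiment many times. The sketch below is an illustration only: the numbers of rolls and repetitions, the seed, and the function name `chi2_fair_die` are arbitrary choices. It rolls a fair die, computes the Pearson χ² each time, and looks at the spread. For 5 degrees of freedom the values should scatter around 5.

```python
import random


def chi2_fair_die(n_rolls, rng):
    """Roll a fair die n_rolls times and return the Pearson chi-square
    of the observed tally against the uniform expectation."""
    counts = [0] * 6
    for _ in range(n_rolls):
        counts[rng.randrange(6)] += 1
    expected = n_rolls / 6
    return sum((c - expected) ** 2 / expected for c in counts)


rng = random.Random(1)  # fixed seed so the run is repeatable
values = [chi2_fair_die(600, rng) for _ in range(500)]
mean_chi2 = sum(values) / len(values)

if __name__ == "__main__":
    print(f"smallest chi2: {min(values):.2f}")
    print(f"largest chi2:  {max(values):.2f}")
    print(f"mean chi2:     {mean_chi2:.2f}  (expected about 5 for 5 degrees of freedom)")
```

A single χ² far from its number of degrees of freedom is therefore not automatically alarming; it has to be judged against this spread.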

4/23/15 19 Getting a good fit
Great challenge: getting a good fit with 10^10 events

4/23/15 20 Quality testing
Put X's in the grid
Give each square a 50% chance for an X
Count the number of X's

0-6, 19-25 : 1
7, 8, 17, 18 : 5
9, 10, 15, 16 : 16
11-14 : 29

Humans are ill suited for randomness
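The four groups on this slide can be compared with a binomial distribution. The sketch below assumes a grid of 25 squares, which is inferred from the largest count listed and is not stated on the slide, and computes the probability of landing in each group when every square gets an X with probability 0.5. Under that assumption the probabilities are roughly in the ratio 1 : 5 : 16 : 29, which suggests the slide's right-hand column shows expected frequencies.

```python
from math import comb

N_SQUARES = 25  # assumption: 5 x 5 grid, inferred from the 0-25 range
P_X = 0.5       # each square independently gets an X with 50% chance


def binom_pmf(k, n=N_SQUARES, p=P_X):
    """Probability of exactly k X's among n independent squares."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# The four groups of X-counts, as listed on the slide.
groups = {
    "0-6, 19-25": list(range(0, 7)) + list(range(19, 26)),
    "7, 8, 17, 18": [7, 8, 17, 18],
    "9, 10, 15, 16": [9, 10, 15, 16],
    "11-14": list(range(11, 15)),
}

group_prob = {name: sum(binom_pmf(k) for k in ks) for name, ks in groups.items()}

if __name__ == "__main__":
    for name, prob in group_prob.items():
        print(f"{name:>14}: {prob:.3f}")
```

People filling in a grid "at random" by hand tend to land near the middle group far more often than chance predicts, which is the point of the exercise.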

4/23/15 21 Inferential statistics
Decide between multiple truths (hypotheses)
Match observation with expectation (with likelihood)
The likelihood, too, can be calculated with certainty
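Deciding between hypotheses by likelihood can be sketched with the die tally from the descriptive-statistics slide. Two candidate "truths" are compared: a fair die, and a hypothetical loaded die in which six comes up 25% of the time and every other face 15%. The loaded-die probabilities are invented for illustration. The multinomial factor common to both hypotheses is omitted because it cancels in the comparison.

```python
import math

# Observed tally for faces 1..6 (from the 1000 die rolls shown earlier).
counts = [171, 153, 169, 176, 174, 157]

# Two competing hypotheses about the die.
fair = [1 / 6] * 6
loaded = [0.15] * 5 + [0.25]  # hypothetical: six is favoured


def log_likelihood(counts, probs):
    """Log-likelihood of the tally under the given face probabilities,
    up to a constant that is the same for every hypothesis."""
    return sum(n * math.log(p) for n, p in zip(counts, probs))


ll_fair = log_likelihood(counts, fair)
ll_loaded = log_likelihood(counts, loaded)
log_ratio = ll_fair - ll_loaded  # positive means the fair die explains the data better

if __name__ == "__main__":
    print(f"log L(fair)   = {ll_fair:.1f}")
    print(f"log L(loaded) = {ll_loaded:.1f}")
    print(f"difference    = {log_ratio:.1f}")
```

Each likelihood is an exact number once the hypothesis is fixed; the uncertainty lies in which hypothesis is true, not in the calculation.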

4/23/15 22 Einstein Many experiments may prove me right, but it takes only one to prove me wrong! Make sure you pick the right one!

4/23/15 24 Einstein Many experiments may prove me right, but it takes only one to prove me wrong! Make sure you pick the right one! risk for bias

4/23/15 25 Types of bias
Intellectual phase locking
Experimental imperfections
Correlations
Find what you want to find
Stop looking at positive 'proof'
Keep looking until positive 'proof'
Fix problems until positive 'proof'

4/23/15 26 Reproduce independently
Support claim of discovery
Expose unfortunate mistakes
Avoid fraud!

4/23/15 27 So what about Nessie? Wishful thinking or historically founded?

4/23/15 28 Systematic observation

4/23/15 29 Deep-scan Systematic scan with sonar

4/23/15 30 Hoax?

4/23/15 31 Nessie in Queensland, AUS she's on vacation!

4/23/15 32 Thank you for your attention!