Statistical tests. Paired t-test

Statistical tests

- Gather data to assess some hypothesis (e.g., does this treatment have an effect on this outcome?).
- Form a test statistic for which large values indicate a departure from the hypothesis.
- Compare the observed value of the statistic to its distribution under the null hypothesis.

Paired t-test

Pairs $(X_1, Y_1), \dots, (X_n, Y_n)$ independent, with $X_i \sim \text{normal}(\mu_A, \sigma_A)$ and $Y_i \sim \text{normal}(\mu_B, \sigma_B)$.

Test $H_0: \mu_A = \mu_B$ vs $H_a: \mu_A \ne \mu_B$.

Let $D_i = Y_i - X_i$. Then $D_1, \dots, D_n$ are iid $\text{normal}(\mu_B - \mu_A, \sigma_D)$.

With sample mean $\bar{D}$ and sample SD $s_D$:

$$T = \frac{\bar{D}}{s_D / \sqrt{n}}$$

Compare to the t distribution with $n - 1$ d.f.
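The paired t-test can be sketched in R as follows. This is only an illustration: it assumes the eleven (X, Y) pairs listed as "Actual data" on the permutation-test slide of these notes are the data behind the worked example, and the variable names (`x`, `y`, `d`, `t_obs`, `p_val`) are chosen here for convenience.

```r
# Paired data: the eleven (X, Y) pairs from the permutation-test slide
# (assumed to be the data behind the paired t-test example).
x <- c(117.3, 100.1, 94.5, 135.5, 92.9, 118.9, 144.8, 103.9, 103.8, 153.6, 163.1)
y <- c(145.9, 94.8, 108.0, 122.6, 130.2, 143.9, 149.9, 138.5, 91.7, 162.6, 202.5)

d <- y - x            # differences D_i = Y_i - X_i
n <- length(d)

# Test statistic T = mean(D) / (sd(D) / sqrt(n))
t_obs <- mean(d) / (sd(d) / sqrt(n))

# Two-sided P-value from the t distribution with n - 1 d.f.
p_val <- 2 * (1 - pt(abs(t_obs), df = n - 1))

# The built-in paired test should agree with the hand calculation.
fit <- t.test(y, x, paired = TRUE)

print(round(c(mean_d = mean(d), sd_d = sd(d), T = t_obs, P = p_val), 3))
```

Computing the statistic by hand and then checking it against `t.test(..., paired = TRUE)` is a cheap way to confirm that the differences were taken in the intended direction.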

Example

[Figure: plots of the paired X and Y values and of the differences D]

$\bar{D} = 14.7$, $s_D = 19.6$, $n = 11$

$T = 2.50$

P = 2*(1-pt(2.50,10)) = 0.031

Assumptions

- Random sample from the target populations
  - Hard to check
  - Need a well-designed study
- Underlying population follows a normal distribution
  - Not necessary if the sample size is large (but "large" is relative)
  - Checkable, but really only if the sample size is large

Assessing normality

To assess the assumption that the underlying population follows a normal distribution, we often use a QQ plot.

For a sample of size n, look at n values evenly distributed between 0 and 1:

$$\frac{0.5}{n},\ \frac{1.5}{n},\ \frac{2.5}{n},\ \dots,\ \frac{n - 0.5}{n}$$

Look at the corresponding quantiles of the normal distribution:

qnorm(0.5/n) qnorm(1.5/n) qnorm(2.5/n) ... qnorm((n-0.5)/n)

i.e., qnorm( ((1:n)-0.5)/n )

Plot the sorted data values against these idealized draws from a normal distribution. Look for a straight line.

QQ plots

[Figure: ten normal quantiles (-1.64, -1.04, -0.67, -0.39, -0.13, 0.13, 0.39, 0.67, 1.04, 1.64) and QQ plots of sorted data against normal quantiles]
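A minimal sketch of the QQ-plot recipe in R. The choice n = 10 matches the ten quantile labels that survive from the figure; the data plotted at the end are the paired differences from the example in these notes, used here only as a convenient illustration.

```r
# Idealized normal quantiles for a sample of size n
n <- 10
probs <- ((1:n) - 0.5) / n      # 0.05, 0.15, ..., 0.95
q <- qnorm(probs)
print(round(q, 2))

# Illustrative data: the eleven paired differences D = Y - X from the example
d <- c(28.6, -5.3, 13.5, -12.9, 37.3, 25.0, 5.1, 34.6, -12.1, 9.0, 39.4)
theo <- qnorm((seq_along(d) - 0.5) / length(d))

# Sorted data against normal quantiles; look for a straight line.
# The plot is drawn only in an interactive session.
if (interactive()) {
  plot(theo, sort(d), xlab = "Normal quantiles", ylab = "Sorted data")
}
```

Base R's `qqnorm()` produces a similar plot, although it uses slightly different plotting positions for small samples.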

Examples

[Figure: QQ plots (sorted data against normal quantiles) for a skewed distribution and for a distribution with heavy tails]

Sign test

Suppose we are concerned about the normal assumption.

$(X_1, Y_1), \dots, (X_n, Y_n)$ independent

Test $H_0$: the X's and Y's have the same distribution.

Another statistic: $S = \#\{i : X_i < Y_i\} = \#\{i : D_i > 0\}$ (the number of pairs for which $X_i < Y_i$)

Under $H_0$, $S \sim \text{binomial}(n, p = 0.5)$.

Suppose $S_{obs} > n/2$. Then

P-value = $2 \Pr(S \ge S_{obs} \mid H_0)$ = 2 * (1 - pbinom(sobs - 1, n, 0.5))
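The sign test needs only the signs of the differences. A short sketch in R, assuming the eleven paired differences from the example in these notes (variable names are illustrative):

```r
# Paired differences D = Y - X from the example
d <- c(28.6, -5.3, 13.5, -12.9, 37.3, 25.0, 5.1, 34.6, -12.1, 9.0, 39.4)
n <- length(d)

# S = number of pairs with Y > X
s_obs <- sum(d > 0)

# Two-sided P-value; this form applies when s_obs > n/2
p_val <- 2 * (1 - pbinom(s_obs - 1, n, 0.5))

# The same test via the built-in exact binomial test
fit <- binom.test(s_obs, n, p = 0.5)

print(c(S = s_obs, P = round(p_val, 4)))
```

If `s_obs` were below n/2, the lower tail `2 * pbinom(s_obs, n, 0.5)` would be used instead; `binom.test()` handles both cases.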

Example

For our example, 8 out of 11 pairs had $Y_i > X_i$.

P-value = 2*(1 - pbinom(7, 11, 0.5)) = 23%

Or type binom.test(8, 11, 0.5).

(Compare this to P = 3% for the t-test.)

Signed rank test

Another nonparametric test. (Also called the Wilcoxon signed rank test.)

Rank the differences according to their absolute values.

R = sum of ranks of positive (or negative) values

D      28.6  -5.3  13.5  -12.9  37.3  25.0   5.1  34.6  -12.1   9.0  39.4
rank      8     2     6      5    10     7     1     9      4     3    11

R = 2 + 4 + 5 = 11 (the ranks of the three negative differences)

Compare this to the distribution of R when each rank has an equal chance of being positive or negative.

In R: wilcox.test(d)

P = 0.054
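The signed rank calculation can be reproduced in R along these lines. The differences are those from the paired example (D = Y - X); note that `wilcox.test()` reports the sum of the ranks of the positive differences, which it calls V, rather than the sum for the negative ones.

```r
# Paired differences D = Y - X from the example
d <- c(28.6, -5.3, 13.5, -12.9, 37.3, 25.0, 5.1, 34.6, -12.1, 9.0, 39.4)

# Rank the differences by absolute value
ranks <- rank(abs(d))

r_neg <- sum(ranks[d < 0])   # sum of ranks of the negative differences
r_pos <- sum(ranks[d > 0])   # sum of ranks of the positive differences

# Built-in Wilcoxon signed rank test (exact for small n without ties)
fit <- wilcox.test(d)

print(c(R_neg = r_neg, R_pos = r_pos, P = round(fit$p.value, 3)))
```

The two rank sums always add up to n(n+1)/2, so either one carries the same information.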

Permutation test

From the pairs $(X_1, Y_1), \dots, (X_n, Y_n)$, compute the observed statistic $T_{obs}$.

Randomly flip the pairs. (For each pair, toss a fair coin. If heads, switch X and Y; if tails, do not switch.)

Compare the observed T statistic to the distribution of the T statistic when the pairs are flipped at random.

If the observed statistic is extreme relative to this permutation/randomization distribution, then reject the null hypothesis (that the X's and Y's have the same distribution).

Actual data:
(117.3,145.9) (100.1,94.8) (94.5,108.0) (135.5,122.6) (92.9,130.2) (118.9,143.9) (144.8,149.9) (103.9,138.5) (103.8,91.7) (153.6,162.6) (163.1,202.5)
$T_{obs} = 2.50$

Example shuffled data:
(117.3,145.9) (94.8,100.1) (108.0,94.5) (135.5,122.6) (130.2,92.9) (118.9,143.9) (144.8,149.9) (138.5,103.9) (103.8,91.7) (162.6,153.6) (163.1,202.5)
$T = -0.19$

Permutation distribution

[Figure: permutation distribution of the t-statistic]

P-value = $\Pr(|T^\star| \ge |T_{obs}|)$, where $T^\star$ is the statistic computed after flipping pairs at random.

Small n: Look at all $2^n$ possible flips
Large n: Look at a sample (w/ repl) of 1000 such flips

Example data: All $2^{11}$ permutations: P = 0.037; sample of 1000: P = 0.040
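A sketch of the paired permutation test in R, covering both the full enumeration of all $2^{11} = 2048$ flips and a random sample of 1000 flips. Flipping a pair is the same as changing the sign of its difference, which is what the code does. The helper `t_stat`, the tolerance `tol`, and the seed are choices made here for illustration; the sampled P-value depends on the seed and will not match the slide's 0.040 exactly.

```r
# The eleven (X, Y) pairs from the "Actual data" list
x <- c(117.3, 100.1, 94.5, 135.5, 92.9, 118.9, 144.8, 103.9, 103.8, 153.6, 163.1)
y <- c(145.9, 94.8, 108.0, 122.6, 130.2, 143.9, 149.9, 138.5, 91.7, 162.6, 202.5)
d <- y - x
n <- length(d)

# Paired t-statistic computed from a vector of differences
t_stat <- function(d) mean(d) / (sd(d) / sqrt(length(d)))
t_obs <- t_stat(d)

# All 2^n sign patterns; switching X and Y in a pair negates its difference
signs <- as.matrix(expand.grid(rep(list(c(1, -1)), n)))
t_all <- apply(signs, 1, function(s) t_stat(s * d))

tol <- 1e-9   # guards against floating-point ties with the observed value
p_exact <- mean(abs(t_all) >= abs(t_obs) - tol)

# Monte Carlo version: 1000 random flips, sampled with replacement
set.seed(1)   # arbitrary seed, for reproducibility only
t_samp <- replicate(1000, t_stat(sample(c(1, -1), n, replace = TRUE) * d))
p_samp <- mean(abs(t_samp) >= abs(t_obs) - tol)

print(round(c(T_obs = t_obs, P_exact = p_exact, P_sample = p_samp), 3))
```

Enumerating all flips is feasible here because 2048 is small; for larger n the sampled version is the practical route.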

Paired comparisons

At least four choices:
- Paired t-test
- Sign test
- Signed rank test
- Permutation test with the t-statistic

Which to use?
- Paired t-test depends on the normality assumption
- Sign test is pretty weak
- Signed rank test ignores some information
- Permutation test is recommended

The fact that the permutation distribution of the t-statistic is generally well-approximated by a t distribution recommends the ordinary t-test. But if you can estimate the permutation distribution, do it.

2-sample t-test

$X_1, \dots, X_n$ iid $\text{normal}(\mu_A, \sigma)$
$Y_1, \dots, Y_m$ iid $\text{normal}(\mu_B, \sigma)$

Test $H_0: \mu_A = \mu_B$ vs $H_a: \mu_A \ne \mu_B$.

Test statistic:

$$T = \frac{\bar{X} - \bar{Y}}{s_p \sqrt{\frac{1}{n} + \frac{1}{m}}} \quad \text{where} \quad s_p = \sqrt{\frac{s_A^2 (n-1) + s_B^2 (m-1)}{n + m - 2}}$$

Compare to the t distribution with $n + m - 2$ degrees of freedom.
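The pooled two-sample statistic can be computed directly from its definition. The sketch below assumes the fifteen values in the rank-sum table of these notes are the data for the two-sample example, with the X group taken to be the observations whose ranks (1, 2, 3, 6, 8, 9) are summed there; that assignment is an inference from the notes, not something they state explicitly.

```r
# Two-sample data (group membership inferred from the rank-sum table)
x <- c(35.0, 38.2, 43.3, 50.0, 57.1, 61.2)
y <- c(46.8, 49.7, 51.9, 74.1, 75.1, 84.5, 90.0, 95.1, 101.5)
n <- length(x)
m <- length(y)

# Pooled SD: s_p = sqrt((s_A^2 (n-1) + s_B^2 (m-1)) / (n + m - 2))
s_p <- sqrt((var(x) * (n - 1) + var(y) * (m - 1)) / (n + m - 2))

# T = (mean(X) - mean(Y)) / (s_p * sqrt(1/n + 1/m))
t_obs <- (mean(x) - mean(y)) / (s_p * sqrt(1 / n + 1 / m))

# Two-sided P-value from the t distribution with n + m - 2 d.f.
p_val <- 2 * pt(-abs(t_obs), df = n + m - 2)

# Built-in equal-variance t-test for comparison
fit <- t.test(x, y, var.equal = TRUE)

print(round(c(s_p = s_p, T = t_obs, P = p_val), 3))
```

Note that `t.test()` defaults to the Welch (unequal-variance) test, so `var.equal = TRUE` is needed to match the pooled statistic defined above.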

Example

[Figure: dot plots of the X and Y samples]

$\bar{X} = 47.5$, $s_A = 10.5$, $n = 6$
$\bar{Y} = 74.3$, $s_B = 20.6$, $m = 9$

$s_p = 17.4$, $T = -2.93$

P = 2*pt(-2.93, 6+9-2) = 0.011

Wilcoxon rank-sum test

Rank the X's and Y's from smallest to largest (1, 2, ..., n+m)

R = sum of ranks for X's

(Also known as the Mann-Whitney test)

    X       Y    rank
 35.0               1
 38.2               2
 43.3               3
         46.8       4
         49.7       5
 50.0               6
         51.9       7
 57.1               8
 61.2               9
         74.1      10
         75.1      11
         84.5      12
         90.0      13
         95.1      14
        101.5      15

R = 1 + 2 + 3 + 6 + 8 + 9 = 29

P-value = 0.026 (use wilcox.test())

Note: The distribution of R (given that the X's and Y's have the same distribution) is calculated numerically.
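A short R sketch of the rank-sum calculation, assuming the X group consists of the observations with ranks 1, 2, 3, 6, 8 and 9, as in the sum for R above. One detail worth knowing: `wilcox.test()` reports the Mann-Whitney form of the statistic, W = R - n(n+1)/2, rather than the rank sum R itself.

```r
# Two-sample data (X = observations with ranks 1, 2, 3, 6, 8, 9)
x <- c(35.0, 38.2, 43.3, 50.0, 57.1, 61.2)
y <- c(46.8, 49.7, 51.9, 74.1, 75.1, 84.5, 90.0, 95.1, 101.5)
n <- length(x)

# Rank all n + m observations together, then sum the ranks of the X's
ranks <- rank(c(x, y))
r_x <- sum(ranks[seq_along(x)])

# Built-in rank-sum test (exact for small samples without ties)
fit <- wilcox.test(x, y)

# Relationship between R and the statistic W reported by wilcox.test()
w_from_r <- r_x - n * (n + 1) / 2

print(c(R = r_x, W = unname(fit$statistic), P = round(fit$p.value, 3)))
```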

Permutation test

Observed data              Group status shuffled
X or Y   group             X or Y   group
X_1      1                 X_1      2
X_2      1                 X_2      2
...      1                 ...      1
X_n      1                 X_n      2
Y_1      2                 Y_1      1
Y_2      2                 Y_2      2
...      2                 ...      1
Y_m      2                 Y_m      1
gives T_obs                gives T*

Compare the observed t-statistic to the distribution obtained by randomly shuffling the group status of the measurements.

Permutation distribution

[Figure: permutation distribution of the t-statistic]

P-value = $\Pr(|T^\star| \ge |T_{obs}|)$

Small n & m: Look at all $\binom{n+m}{n}$ possible shuffles
Large n & m: Look at a sample (w/ repl) of 1000 such shuffles

Example data: All 5005 permutations: P = 0.015; sample of 1000: P = 0.013
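For the two-sample example, all $\binom{15}{6} = 5005$ relabellings can be enumerated with `combn()`. The sketch below is one way to do it, again assuming the X group is the six observations with ranks 1, 2, 3, 6, 8 and 9; the helper `t_stat` and the tolerance are illustrative choices.

```r
# Two-sample data (group membership inferred from the rank-sum table)
x <- c(35.0, 38.2, 43.3, 50.0, 57.1, 61.2)
y <- c(46.8, 49.7, 51.9, 74.1, 75.1, 84.5, 90.0, 95.1, 101.5)
z <- c(x, y)
n <- length(x)
m <- length(y)

# Pooled two-sample t-statistic
t_stat <- function(a, b) {
  na <- length(a)
  nb <- length(b)
  sp <- sqrt((var(a) * (na - 1) + var(b) * (nb - 1)) / (na + nb - 2))
  (mean(a) - mean(b)) / (sp * sqrt(1 / na + 1 / nb))
}
t_obs <- t_stat(x, y)

# Each column of idx picks which n observations are labelled "X"
idx <- combn(n + m, n)
t_all <- apply(idx, 2, function(i) t_stat(z[i], z[-i]))

# Two-sided permutation P-value (small tolerance for floating-point ties)
p_exact <- mean(abs(t_all) >= abs(t_obs) - 1e-9)

print(round(c(T_obs = t_obs, P_exact = p_exact), 3))
```

For larger samples, replacing the enumeration with something like `sample(n + m, n)` repeated 1000 times gives the sampled version described above.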

Estimating the permutation P-value

Let P = true P-value (if we do all possible shuffles).

Do N shuffles, and let X = # times the statistic after shuffling $\ge$ the observed statistic.

$\hat{P} = X / N$ where $X \sim \text{binomial}(N, P)$

$E(\hat{P}) = P$

$SD(\hat{P}) = \sqrt{\frac{P(1 - P)}{N}}$

If the true P-value P = 5% and we do N = 1000 shuffles, $SD(\hat{P}) = 0.7\%$.

Summary

The t-test relies on a normality assumption. If this is a worry, consider:

Paired data:
- Sign test
- Signed rank test
- Permutation test

Unpaired data:
- Rank-sum test
- Permutation test

Crucial assumption: independence

The fact that the permutation distribution of the t-statistic is often closely approximated by a t distribution is good support for just doing t-tests.
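Returning to the precision of an estimated permutation P-value: the binomial standard deviation is easy to tabulate, which helps when choosing the number of shuffles N. A small sketch (the function name `sd_phat` is made up here):

```r
# SD of the estimated P-value: sqrt(P (1 - P) / N)
sd_phat <- function(P, N) sqrt(P * (1 - P) / N)

# True P-value of 5% with 1000 shuffles: about 0.7%
sd_1000 <- sd_phat(0.05, 1000)

# Ten times as many shuffles shrinks the SD by a factor of sqrt(10)
sd_10000 <- sd_phat(0.05, 10000)

print(round(c(N_1000 = sd_1000, N_10000 = sd_10000), 4))
```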