Statistical Hypothesis Testing

Statistical hypothesis testing is a kind of inference: given a sample, say something about the population.

Examples:
- Given a sample of classifications by a decision tree, test the hypothesis that classification performance is no better than chance (testing that a population parameter has a value).
- Given a sample of KOSO and KOSO* trials, test the hypothesis that KOSO* generally has shorter run-times than KOSO (comparison of two populations based on samples).
- Given a sample of knowledge engineers, each entering x pieces of knowledge in a fixed time t, what is the value of x/t for any knowledge engineer, and what is your confidence in it? (parameter estimation)

Statistical Hypothesis Testing: More examples

- Given runtimes for KOSO and KOSO* under a factorial arrangement of conditions (four levels of number of processors, four levels of alpha), test the hypothesis that the superiority of KOSO* increases as the number of processors decreases (looking for interaction effects).
- Given a correlation between log(runtime) and the size of a job, test whether the population value of the correlation is ρ.
- Calculate a confidence interval around your estimate of the population value of the correlation.

Statistical Inference

Let's start with an example. Fred tosses a coin 20 times and it comes up heads 15 times. Fred wonders whether the coin is fair.

The logic:
- Let R be the result: 15 out of 20 heads.
- Assume the coin is fair (Π = .5).
- If R is very unlikely under the assumption that the coin is fair, Pr(R | Π = .5) ≤ α, then Fred is inclined to reject the hypothesis that the coin is fair.
- Fred picks a value of α (say, .01) and calculates the conditional probability p = Pr(R | Π = .5).
- Fred's residual uncertainty that the coin is fair is less than or equal to α.
- p is not the probability that the coin is fair, nor is 1 - p the probability that the coin is not fair.

Running the example

- Write down the null and alternative hypotheses: H0: Π = .5, H1: Π ≠ .5.
- Decide on a significance level α (e.g., α = .05).
- Calculate the sampling distribution of the number of heads in 20 tosses under the assumption that the coin is fair.
- Calculate the probability p of the sample result from the sampling distribution.
- If p ≤ α, reject the null hypothesis.
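For the coin example, the sampling distribution under the null hypothesis can be computed exactly from the binomial formula as well as by simulation. The following is a minimal sketch, not part of the original slides; the names choose, exact-sampling-distribution and upper-tail are made up for this illustration.

```lisp
;; Illustrative sketch; all function names here are hypothetical.

(defun choose (n k)
  "Binomial coefficient: the number of ways to pick k items out of n."
  (let ((result 1))
    ;; After step i, result equals C(n-k+i, i), which is always an integer.
    (loop for i from 1 to k
          do (setf result (/ (* result (+ (- n k) i)) i)))
    result))

(defun exact-sampling-distribution (n p)
  "List whose k-th element (k = 0..n) is Pr(k heads in n tosses) for bias p."
  (loop for k from 0 to n
        collect (* (choose n k) (expt p k) (expt (- 1 p) (- n k)))))

(defun upper-tail (k dist)
  "Pr(k or more heads), given a distribution list indexed by number of heads."
  (reduce #'+ (nthcdr k dist)))

;; Fair coin, 20 tosses. Passing the rational 1/2 keeps the arithmetic exact.
(defparameter *fair-coin-dist* (exact-sampling-distribution 20 1/2))

(format t "Pr(exactly 15 heads) = ~,4f~%" (float (nth 15 *fair-coin-dist*)))
(format t "Pr(15 or more heads) = ~,4f~%" (float (upper-tail 15 *fair-coin-dist*)))
```

Under H0 the exact values are 15504/2^20 (about .0148) for exactly 15 heads and 21700/2^20 (about .0207) for 15 or more heads.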

The probability of 15 out of 20 tosses coming up heads under the assumption that the coin is fair

    (defun toss-coin (p)
      "Returns 1 with probability p, 0 with probability 1-p."
      (if (< (random 1.0) p) 1 0))

    (defun sampling-distribution-of-heads (p n k)
      "p is the bias of the coin, n is the number of tosses in a sample,
    and k is the size of the sampling distribution (the number of sample
    values of the number of heads)."
      ;; Each inner loop tosses the coin n times and sums the heads;
      ;; the outer loop collects k such sums.
      (loop repeat k
            collect (loop repeat n sum (toss-coin p))))

The probability of 15 out of 20 tosses coming up heads under the assumption that the coin is fair

    (sampling-distribution-of-heads .5 20 500)

[Figure: histogram of the 500 simulated head counts; horizontal axis ticks run from 4 to 16, vertical axis ticks from 10 to 90]

    ? (setf sorted-dist (sort dist '<))
    ? (what-percentile? 15 sorted-dist)
    0.0040000000000000036
    ? (setf critical-value (quantile sorted-dist .95))
    13.0

(dist presumably holds the result of the call above.)
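The transcript calls what-percentile? and quantile, which the slides do not define. The sketch below shows one way such helpers could be written. The names upper-tail-proportion and empirical-quantile are hypothetical stand-ins, and their behavior is an assumption that may differ from the original helpers.

```lisp
;; Hypothetical stand-ins for the undefined helpers in the transcript.

(defun upper-tail-proportion (x values)
  "Fraction of VALUES that are greater than or equal to X."
  (/ (count-if (lambda (v) (>= v x)) values)
     (length values)))

(defun empirical-quantile (sorted-values q)
  "Smallest element of SORTED-VALUES (ascending order) such that at least
a fraction Q of the values are less than or equal to it."
  (let* ((n (length sorted-values))
         (index (max 0 (1- (ceiling (* q n))))))
    (nth (min index (1- n)) sorted-values)))

;; Usage on a small hand-made list of head counts (not simulated data):
(defparameter *example-counts* (sort (list 9 12 10 15 8 11 10 16 7 10) #'<))

(upper-tail-proportion 15 *example-counts*)   ; 2 of the 10 values are >= 15
(empirical-quantile *example-counts* 9/10)    ; 9th smallest of the 10 values
```

Whether the observed value itself is counted matters. For a fair coin and 20 tosses, the exact probability of 15 or more heads is about .0207, while the exact probability of more than 15 heads is about .0059. An estimate from only 500 samples is also noisy, so a figure near .004 is closer to what the strict version would give.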

Statistical Inference

Fred tosses a coin 20 times and it comes up heads 15 times. Fred wonders whether the coin is fair.

The logic:
- Let R be the result: 15 out of 20 heads.
- Assume the coin is fair (Π = .5).
- If R is very unlikely under the assumption that the coin is fair, Pr(R | Π = .5) ≤ α, then Fred is inclined to reject the hypothesis that the coin is fair.
- Fred picks a value of α (say, .01) and calculates the conditional probability p = Pr(R | Π = .5) = .004.
- Fred's residual uncertainty that the coin is fair is less than or equal to α.
- .004 is not the probability that the coin is fair, nor is 1 - .004 the probability that the coin is not fair.

Statistical Hypothesis Testing: Sampling Distributions

- Suppose you have a sample of size N and you calculate a statistic φ on that sample (e.g., the mean).
- If you had drawn a different sample of size N, you would have a different value of φ.
- The distribution of φ for all possible samples of size N is called the sampling distribution of φ.
- The distribution of φ for some possible samples of size N, obtained by Monte Carlo sampling (or other computer-intensive methods), is called the empirical sampling distribution of φ.
- The empirical sampling distribution estimates the sampling distribution.
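The idea of an empirical sampling distribution can be sketched for any statistic φ. The example below is an illustration only. It assumes the population is available as a vector and that samples are drawn uniformly with replacement; the names mean, draw-sample and empirical-sampling-distribution are hypothetical.

```lisp
;; Illustrative sketch. Assumption: the population is a vector we can
;; sample from uniformly, with replacement.

(defun mean (values)
  "Arithmetic mean of a non-empty list of numbers."
  (/ (reduce #'+ values) (length values)))

(defun draw-sample (population n)
  "Draw N items from the vector POPULATION uniformly at random, with replacement."
  (loop repeat n
        collect (aref population (random (length population)))))

(defun empirical-sampling-distribution (statistic population n k)
  "Collect K values of STATISTIC, each computed on a fresh random sample of size N."
  (loop repeat k
        collect (funcall statistic (draw-sample population n))))

;; 1000 sample means, each from a sample of size 5 (made-up population values):
(defparameter *sample-means*
  (empirical-sampling-distribution #'mean #(2 4 4 5 7 9 10 12) 5 1000))
```

Passing the statistic as a function argument means the same routine works for the mean, the median, or any other φ; a larger k gives a closer estimate of the sampling distribution.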

Statistical Hypothesis Testing: Sampling Distributions and the Null Hypothesis

- When we get a sampling distribution for hypothesis testing, it should be the sampling distribution for φ under the assumption that H0 is true.
- That way, we can calculate the probability p of the sample result (the particular value of φ) under H0.
- We reject H0 if p ≤ α.

[Figure: sampling distribution of φ under H0, Pr(φ | H0), plotted against φ, with α marked]
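The decision rule can be sketched as a single routine that takes the null hypothesis as a function producing one value of φ under H0. This is a sketch under stated assumptions, not the author's implementation: it uses a one-tailed (upper) test, and the names monte-carlo-test and heads-in-fair-tosses are hypothetical.

```lisp
;; Illustrative sketch. Assumption: large values of phi count against H0,
;; so p is estimated from the upper tail only.

(defun monte-carlo-test (observed draw-statistic-under-h0 k alpha)
  "Estimate p = Pr(phi >= OBSERVED | H0) from K simulated values of phi and
compare it with ALPHA. Returns two values: the estimate of p, and T if H0
is rejected (p <= ALPHA), NIL otherwise."
  (let* ((null-values (loop repeat k
                            collect (funcall draw-statistic-under-h0)))
         (p (/ (count-if (lambda (v) (>= v observed)) null-values) k)))
    (values p (<= p alpha))))

(defun heads-in-fair-tosses (n)
  "Number of heads in N tosses of a fair coin: one draw of phi under H0."
  (loop repeat n count (< (random 1.0) 0.5)))

;; Fred's example: 15 heads observed, H0 says the coin is fair.
(monte-carlo-test 15 (lambda () (heads-in-fair-tosses 20)) 10000 0.05)
```

Because the sampler encodes H0, testing a different null hypothesis only requires passing a different function; the comparison of p with α stays the same.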