Chapter 12 Summary Sample Surveys


What have we learned?

A representative sample can offer us important insights about populations.
  o It's the size of the sample, not its fraction of the larger population, that determines the precision of the statistics it yields.
There are several ways to draw samples, all based on the power of randomness to make them representative of the population of interest:
  o Simple Random Sample, Stratified Sample, Cluster Sample, Systematic Sample, Multistage Sample
Bias can destroy our ability to gain insights from our sample:
  o Nonresponse bias can arise when sampled individuals will not or cannot respond.
  o Response bias arises when respondents' answers might be affected by external influences, such as question wording or interviewer behavior.
Bias can also arise from poor sampling methods:
  o Voluntary response samples are almost always biased and should be avoided and distrusted.
  o Convenience samples are likely to be flawed for similar reasons.
  o Even with a reasonable design, sampling frames may not be representative. Undercoverage occurs when individuals from a subgroup of the population are selected less often than they should be.
Finally, we must look for biases in any survey we find and be sure to report our methods whenever we perform a survey so that others can evaluate the fairness and accuracy of our results.

Background

We have learned ways to display, describe, and summarize data, but have been limited to examining the particular batch of data we have.
We'd like (and often need) to stretch beyond the data at hand to the world at large.
Let's investigate three major ideas that will allow us to make this stretch.

Idea 1: Examine a Part of the Whole

The first idea is to draw a sample.
We'd like to know about an entire population of individuals, but examining all of them is usually impractical, if not impossible.
We settle for examining a smaller group of individuals (a sample) selected from the population.
Sampling is a natural thing to do. Think about sampling something you are cooking: you taste (examine) a small part of what you're cooking to get an idea about the dish as a whole.
Opinion polls are examples of sample surveys, designed to ask questions of a small group of people in the hope of learning something about the entire population.
  o Professional pollsters work quite hard to ensure that the sample they take is representative of the population.
  o If not, the sample can give misleading information about the population.

AP Statistics Page 1 2007

Unit 3 Planning a Study

Bias

Samples that don't represent every individual in the population fairly are said to be biased.
  o Bias is the bane of sampling, the one thing above all to avoid.
  o There is usually no way to fix a biased sample and no way to salvage useful information from it.
The best way to avoid bias is to select individuals for the sample at random.
  o The value of deliberately introducing randomness is one of the great insights of Statistics.

Idea 2: Randomize

Randomization can protect you against factors that you know are in the data.
  o It can also help protect against factors you are not even aware of.
Randomizing protects us from the influences of all the features of our population, even ones that we may not have thought about.
  o Randomizing makes sure that on the average the sample looks like the rest of the population.
Not only does randomizing protect us from bias, it actually makes it possible for us to draw inferences about the population when we see only a sample.
Such inferences are among the most powerful things we can do with Statistics.
But remember, it's all made possible because we deliberately choose things randomly.

Idea 3: It's the Sample Size

How large a random sample do we need for the sample to be reasonably representative of the population?
It's the size of the sample, not the size of the population, that makes the difference in sampling.
  o Exception: If the population is small enough and the sample is more than 10% of the whole population, the population size can matter.
The fraction of the population that you've sampled doesn't matter. It's the sample size itself that's important.

Does a Census Make Sense?

Why bother determining the right sample size?
Wouldn't it be better to just include everyone and sample the entire population?
  o Such a special sample is called a census.
There are problems with taking a census:
  o It can be difficult to complete a census: there always seem to be some individuals who are hard to locate or hard to measure.
  o Populations rarely stand still. Even if you could take a census, the population changes while you work, so it's never possible to get a perfect measure.
  o Taking a census may be more complex than sampling.

Populations and Parameters

Models use mathematics to represent reality.
  o Parameters are the key numbers in those models.
A parameter that is part of a model for a population is called a population parameter.
We use data to estimate population parameters.
  o Any summary found from the data is a statistic.
  o The statistics that estimate population parameters are called sample statistics.
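Idea 3 and the parameter/statistic distinction can be checked with a small simulation. The following is a minimal Python sketch, assuming a made-up yes/no population in which half the individuals answer "yes"; the function name `spread_of_sample_proportions` and all the numbers are illustrative assumptions, not part of the chapter. It draws many simple random samples and measures how much the sample proportion (a statistic) varies around the population proportion (a parameter).

```python
import random
import statistics


def spread_of_sample_proportions(population_size, sample_size,
                                 true_share=0.5, trials=1000, seed=1):
    """Return the standard deviation of the sample proportion over repeated SRSs."""
    rng = random.Random(seed)
    # Hypothetical population: 1 = "yes", 0 = "no".
    # The population parameter is the share of 1s (true_share).
    yes_count = int(population_size * true_share)
    population = [1] * yes_count + [0] * (population_size - yes_count)
    proportions = []
    for _ in range(trials):
        sample = rng.sample(population, sample_size)   # SRS without replacement
        proportions.append(sum(sample) / sample_size)  # a sample statistic
    return statistics.pstdev(proportions)


if __name__ == "__main__":
    for population_size in (10_000, 1_000_000):
        for sample_size in (100, 400):
            spread = spread_of_sample_proportions(population_size, sample_size)
            print(f"N={population_size:>9,}  n={sample_size:>3}  spread={spread:.4f}")
```

Under these assumptions the spread should be nearly the same for both population sizes at a given sample size, and noticeably smaller when the sample size grows from 100 to 400, which is the point of Idea 3.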

Notation

We typically use Greek letters to denote parameters and Latin letters to denote statistics.

Simple Random Samples

We draw samples because we can't work with the entire population.
  o We need to be sure that the statistics we compute from the sample reflect the corresponding parameters accurately.
  o A sample that does this is said to be representative.
We will insist that every possible sample of the size we plan to draw has an equal chance to be selected.
  o Such samples also guarantee that each individual has an equal chance of being selected.
  o With this method each combination of people has an equal chance of being selected as well.
  o A sample drawn in this way is called a Simple Random Sample (SRS).
An SRS is the standard against which we measure other sampling methods, and the sampling method on which the theory of working with sampled data is based.
To select a sample at random, we first need to define where the sample will come from.
  o The sampling frame is a list of individuals from which the sample is drawn.
Once we have our sampling frame, the easiest way to choose an SRS is with random numbers.
Samples drawn at random generally differ from one another.
  o Each draw of random numbers selects different people for our sample.
  o These differences lead to different values for the variables we measure.
  o We call these sample-to-sample differences sampling variability.

Stratified Sampling

Simple random sampling is not the only fair way to sample.
More complicated designs may save time or money or help avoid sampling problems.
All statistical sampling designs have in common the idea that chance, rather than human choice, is used to select the sample.
Designs used to sample from large populations are often more complicated than simple random samples.
Sometimes the population is first sliced into homogeneous groups, called strata, before the sample is selected.
Then simple random sampling is used within each stratum before the results are combined.
This common sampling design is called stratified random sampling.
Stratified random sampling can reduce bias.
Stratifying can also reduce the variability of our results.
  o When we restrict by strata, additional samples are more like one another, so statistics calculated for the sampled values will vary less from one sample to another.
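The two designs described so far can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population values, the stratum names "underclass" and "upperclass", and the helper names are all hypothetical. The strata are made equal in size and sampled equally, so the plain mean of the combined stratified sample is a fair estimate of the population mean. The sketch compares the sampling variability of the mean under an SRS and under stratified random sampling.

```python
import random
import statistics


def simple_random_sample(frame, n, rng):
    """Every possible sample of size n from the frame is equally likely."""
    return rng.sample(frame, n)


def stratified_sample(strata, n_per_stratum, rng):
    """Draw an SRS within each stratum, then combine the results."""
    combined = []
    for members in strata.values():
        combined.extend(rng.sample(members, n_per_stratum))
    return combined


def spread_of_means(draw, trials=500):
    """Sampling variability: std. deviation of the sample mean over repeated draws."""
    return statistics.pstdev([statistics.mean(draw()) for _ in range(trials)])


rng = random.Random(42)
# Hypothetical population: two equally sized strata whose values differ a lot
# between strata but little within each stratum.
strata = {
    "underclass": [rng.gauss(20, 3) for _ in range(500)],
    "upperclass": [rng.gauss(40, 3) for _ in range(500)],
}
frame = strata["underclass"] + strata["upperclass"]  # the sampling frame

srs_spread = spread_of_means(lambda: simple_random_sample(frame, 20, rng))
stratified_spread = spread_of_means(lambda: stratified_sample(strata, 10, rng))
print(f"SRS spread of the mean:        {srs_spread:.2f}")
print(f"Stratified spread of the mean: {stratified_spread:.2f}")
```

With strata this homogeneous, the stratified means should vary far less from sample to sample than the SRS means, even though both designs use 20 individuals.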

Cluster and Multistage Sampling

Sometimes stratifying isn't practical and simple random sampling is difficult.
Splitting the population into similar parts or clusters can make sampling more practical.
  o Then we could select one or a few clusters at random and perform a census within each of them.
  o This sampling design is called cluster sampling.
  o If each cluster fairly represents the full population, cluster sampling will give us an unbiased sample.
Cluster sampling is not the same as stratified sampling.
  o We stratify to ensure that our sample represents different groups in the population, and sample randomly within each stratum. Strata are homogeneous, but differ from one another.
  o Clusters are more or less alike, each heterogeneous and resembling the overall population. We select clusters to make sampling more practical or affordable.
Sometimes we use a variety of sampling methods together.
Sampling schemes that combine several methods are called multistage samples.
Most surveys conducted by professional polling organizations use some combination of stratified and cluster sampling as well as simple random sampling.

Systematic Samples

Sometimes we draw a sample by selecting individuals systematically.
  o For example, you might survey every 10th person on an alphabetical list of students.
To make it random, you must still start the systematic selection from a randomly selected individual.
When there is no reason to believe that the order of the list could be associated in any way with the responses sought, systematic sampling can give a representative sample.
Systematic sampling can be much less expensive than true random sampling.
When you use a systematic sample, you need to justify the assumption that the systematic method is not associated with any of the measured variables.

Who's Who?

The Who of a survey can refer to different groups, and the resulting ambiguity can tell you a lot about the success of a study.
To start, think about the population of interest. Often, you'll find that this is not really a well-defined group.
  o Even if the population is clear, it may not be a practical group to study.
Second, you must specify the sampling frame.
  o Usually, the sampling frame is not the group you really want to know about.
  o The sampling frame limits what your survey can find out.
Then there's your target sample.
  o These are the individuals for whom you intend to measure responses.
  o You're not likely to get responses from all of them; nonresponse is a problem in many surveys.

Finally, there is your sample: the actual respondents.
  o These are the individuals about whom you do get data and can draw conclusions.
  o Unfortunately, they might not be representative of the target sample, the sampling frame, or the population.
At each step, the group we can study may be constrained further.
The Who keeps changing, and each constraint can introduce biases.
A careful study should address the question of how well each group matches the population of interest.
One of the main benefits of simple random sampling is that it never loses its sense of who's Who.
  o The Who in an SRS is the population of interest from which we've drawn a representative sample. (That's not always true for other kinds of samples.)

What Can Go Wrong? Or, How to Sample Badly

Sample Badly with Volunteers:
  o In a voluntary response sample, a large group of individuals is invited to respond, and all who do respond are counted. Voluntary response samples are almost always biased, and so conclusions drawn from them are almost always wrong.
  o Voluntary response samples are often biased toward those with strong opinions or those who are strongly motivated.
  o Since the sample is not representative, the resulting voluntary response bias invalidates the survey.
Sample Badly, but Conveniently:
  o In convenience sampling, we simply include the individuals who are convenient. Unfortunately, this group may not be representative of the population.
  o Convenience sampling is not only a problem for students or other beginning samplers.
  o In fact, it is a widespread problem in the business world: the easiest people for a company to sample are its own customers.
Sample from a Bad Sampling Frame:
  o An SRS from an incomplete sampling frame introduces bias because the individuals included may differ from the ones not in the frame.
Undercoverage:
  o Many of these bad survey designs suffer from undercoverage, in which some portion of the population is not sampled at all or has a smaller representation in the sample than it has in the population.
  o Undercoverage can arise for a number of reasons, but it's always a potential source of bias.
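The effect of an incomplete sampling frame can be illustrated with a short Python sketch. Everything here is an assumption made for the example: a hypothetical population of 10,000 people, of whom 3,000 are missing from the frame and hold a different opinion on average. Even a perfectly drawn SRS of 2,000 from the bad frame lands near the frame's share rather than the population's share.

```python
import random

rng = random.Random(7)

# Hypothetical population of 10,000 people; 1 = supports a proposal, 0 = does not.
in_frame = [1] * 2800 + [0] * 4200      # 7,000 people on the list, 40% support
not_in_frame = [1] * 2400 + [0] * 600   # 3,000 people missing from the list, 80% support
population = in_frame + not_in_frame

true_share = sum(population) / len(population)  # the population parameter


def srs_estimate(frame, n):
    """Sample proportion from a simple random sample of size n drawn from frame."""
    sample = rng.sample(frame, n)
    return sum(sample) / n


estimate_from_bad_frame = srs_estimate(in_frame, 2000)
estimate_from_full_frame = srs_estimate(population, 2000)

print(f"True share:               {true_share:.3f}")
print(f"SRS from incomplete frame: {estimate_from_bad_frame:.3f}")
print(f"SRS from complete frame:   {estimate_from_full_frame:.3f}")
```

The large sample size does not rescue the first estimate, because the error comes from who could be selected, not from how many were selected.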

What Else Can Go Wrong?

Watch out for nonrespondents.
  o A common and serious potential source of bias for most surveys is nonresponse bias.
  o No survey succeeds in getting responses from everyone. The problem is that those who don't respond may differ from those who do. And they may differ on just the variables we care about.
Don't bore respondents with surveys that go on and on and on and on.
  o Surveys that are too long are more likely to be refused, reducing the response rate and biasing all the results.
Work hard to avoid influencing responses.
  o Response bias refers to anything in the survey design that influences the responses.
  o For example, the wording of a question can influence the responses.

How to Think About Biases

Look for biases in any survey you encounter; there's no way to recover from a biased sample or a survey that asks biased questions.
Spend your time and resources reducing biases.
If you possibly can, pretest your survey.
Always report your sampling methods in detail.
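Nonresponse bias can be sketched the same way. The Python example below assumes a hypothetical population in which 30% are dissatisfied, and assumes (purely for illustration) that dissatisfied people respond 60% of the time while satisfied people respond 20% of the time. The target sample is a proper SRS, yet the respondents alone give a distorted picture.

```python
import random

rng = random.Random(11)

# Hypothetical population: 1 = dissatisfied, 0 = satisfied (30% dissatisfied).
population = [1] * 30_000 + [0] * 70_000
# Assumed response rates: dissatisfied people answer more often.
RESPONSE_RATE = {1: 0.6, 0: 0.2}

# The target sample is a simple random sample of 5,000 individuals.
target_sample = rng.sample(population, 5000)
# The actual respondents are those who happen to answer.
respondents = [x for x in target_sample if rng.random() < RESPONSE_RATE[x]]

share_in_target_sample = sum(target_sample) / len(target_sample)
share_among_respondents = sum(respondents) / len(respondents)

print(f"Dissatisfied share in target sample: {share_in_target_sample:.3f}")
print(f"Dissatisfied share among respondents: {share_among_respondents:.3f}")
print(f"Response rate: {len(respondents) / len(target_sample):.1%}")
```

Under these assumed response rates the respondents overstate dissatisfaction by a wide margin, which is why those who don't respond matter most when they differ on the very variable being measured.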