MARKETING TOOLS
Buyer Behavior and Market Analysis
Sampling Procedures

Sampling Terminology
Population: all possible entities (known or unknown) of a group being studied.
Census: a study containing data from all population members.
Sample: the subset of the population from whom data are collected.

Two Basic Types of Samples

Nonprobability Samples
Nonprobability samples are not selected using a random process. They are usually selected based on convenience or judgment.
Convenience samples: the researcher studies the population members who are most available.
Judgment samples: the researcher selects sample members who s/he believes will best represent the population of interest.
Quota samples: the researcher seeks to build a sample based on selected (usually demographic) respondent characteristics.

Probability Samples
Also called random samples.
Use processes based on chance to select samples.
The random process must allow calculation of the odds of a given population member being selected.
Probability samples allow generalizations to the population.

Incidence
Incidence rate: the percent of the total population actually appearing in the sample frame.
Error (in the sample frame): the percent of the population not in the sample frame plus the percent of the sample frame not in the population.
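The two sample frame measures defined above are simple percentages, so they can be worked through with a small sketch. The function name frame_coverage and the ID ranges below are hypothetical and chosen only for illustration; the arithmetic follows the slide's definitions.

```python
def frame_coverage(population: set, frame: set) -> dict:
    """Compare a population with a sample frame, using the slide's definitions."""
    # Incidence rate: percent of the population that appears in the frame.
    incidence_rate = 100 * len(population & frame) / len(population)
    # Percent of the population that the frame misses.
    population_not_in_frame = 100 * len(population - frame) / len(population)
    # Percent of the frame that does not belong to the population.
    frame_not_in_population = 100 * len(frame - population) / len(frame)
    return {
        "incidence_rate": incidence_rate,
        "error": population_not_in_frame + frame_not_in_population,
    }


# Hypothetical data: 100 population members, and a 100-entry frame that
# misses 20 of them and lists 20 entries from outside the population.
population_ids = set(range(1, 101))
frame_ids = set(range(21, 121))
coverage = frame_coverage(population_ids, frame_ids)
```

In this made-up case the incidence rate is 80% and the error is 40% (20% of the population missing from the frame plus 20% of the frame not in the population).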
Two Random Sampling Procedures

Simple Random Sampling
Examples of random processes: rolling dice, flipping a coin, using a random number generator (MS Excel).

Systematic Sampling

Stratified Sampling
Examples:
Divide a population of hospital administrators into strata based on the number of beds in their hospitals.
Divide a population of students into strata based on their current GPA, then randomly sample from each.

Consider stratification when...
... a research objective calls for comparing subgroups.
... subgroups have different statistical properties.
... data collection costs differ by subgroups.
... a variable of interest is known to differ by subgroup.
... strata sizes differ substantially.

Cluster Sampling
Examples:
Randomly select 5 of the 100 largest cities. Ask questions of the hospital administrators in those cities.
Randomly select a sample of zip codes from a large metro area. Draw a sample of addresses from each selected zip code.

[Figure: 77 zip codes, from which a random selection is drawn.]
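The sampling procedures on this page can be sketched in a few lines of Python. This is a minimal illustration, not the course's own method: all function names, the hospital and zip code data, and the sample sizes are hypothetical. Systematic sampling is taken here in its usual sense of a random start followed by every k-th member of an ordered list, which the slides do not spell out.

```python
import random


def simple_random_sample(members, n, rng):
    """Simple random sampling: every member has the same chance of selection."""
    return rng.sample(members, n)


def systematic_sample(members, n, rng):
    """Systematic sampling (assumed definition): random start, then every k-th member."""
    step = len(members) // n      # sampling interval k; requires n <= len(members)
    start = rng.randrange(step)   # random start within the first interval
    return [members[start + i * step] for i in range(n)]


def stratified_sample(members, stratum_of, n_per_stratum, rng):
    """Stratified sampling: group members into strata, then sample randomly from each."""
    strata = {}
    for member in members:
        strata.setdefault(stratum_of(member), []).append(member)
    return {name: rng.sample(group, min(n_per_stratum, len(group)))
            for name, group in strata.items()}


def cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Cluster sampling: randomly pick clusters, then sample members within each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
            for name in chosen}


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical hospitals, stratified by number of beds.
hospitals = [{"id": i, "beds": 50 + 10 * i} for i in range(60)]


def size_band(hospital):
    if hospital["beds"] < 200:
        return "small"
    if hospital["beds"] < 400:
        return "medium"
    return "large"


by_size = stratified_sample(hospitals, size_band, 5, rng)

# Hypothetical metro area with 77 zip codes of 30 addresses each.
zip_codes = {f"zip_{z:02d}": [f"zip_{z:02d}_address_{a}" for a in range(30)]
             for z in range(77)}
by_zip = cluster_sample(zip_codes, 10, 5, rng)
```

The choice of 10 zip codes and 5 addresses per zip code is arbitrary. Note that the stratified sketch draws from every stratum, while the cluster sketch draws only from the clusters that were randomly chosen.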
Consider clustering when...
... significant travel costs might otherwise be incurred.
... a list of clusters is more readily available than a list of population members.

Multistage Sampling
Multistage sampling occurs when multiple sampling strategies are used in tandem.
Example:
Randomly select 50 of the 100 largest cities (cluster).
Divide hospitals in those cities into small, medium and large and randomly select from each group (stratify).
Divide administrators into male and female and randomly select from each group (stratify).

Sample Bias versus Sampling Error
Sample bias refers to error (departure from the truth) that results from poorly drawn samples.
Sampling error refers to the natural error (departure from the truth) that results from not taking a census, even if the sample was perfectly drawn.
Sample bias cannot be mathematically estimated; sampling error can be estimated.

Determining Needed Sample Size
What you are determining: "I want to be A% confident that my estimate of the population proportion or mean falls within plus or minus B percent of the true population proportion or mean. How big a sample will I need?"

For example: Suppose a cell phone company found with a small pilot study of 150 Chicago households that about 40% of land line users felt they spent too much on long distance. Now management says, "We want a nationwide study of long distance telephone bill satisfaction, and we want to be 99% confident that the pilot study results are accurate within plus or minus 3%."

1. The confidence level, expressed as a z-score.
z-score: the number of standard deviations from the mean needed to account for a given percentage of the total area under a normal distribution.
for 95% confidence: z = 1.96
for 99% confidence: z = 2.58
2. The proportion of the population expected to have some characteristic of interest (P).
We can estimate this proportion based on intuition, previous research, secondary data, or pilot studies. An estimate of 50% will always produce the largest required sample size (and therefore should be used when no estimate or guess is available).

3. The decision maker's desired precision of the estimate (Φ).
This is simply the plus or minus percentage within which the decision maker wants the estimate to fall of the actual population parameter. Greater precision requires larger samples and more money. Three to five percent is common.

The formula for sample size:

$$n = \frac{z_{CL}^{2}\,[P\,(100 - P)]}{\Phi^{2}}$$

where:
n = needed sample size
z_CL = z-score for the specified confidence level
P = estimated proportion of the population with the characteristic (in percent)
Φ = specified level of precision (in percent)

Example: A long distance company found with a small study of 150 Chicago households that about 40% spent more than $100 per month on long distance. Now management says, "We want a nationwide study of long distance telephone usage patterns, and we want to be 99% confident that the results are accurate within plus or minus 3%."

confidence level: 99%, so z = 2.58

$$n = \frac{2.58^{2} \times 40 \times 60}{3^{2}}$$
$$n = 6.656 \times 266.667$$

$$n = 1774.93, \text{ or about } 1775$$
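The sample size calculation above can be reproduced with a short Python sketch. The function names z_for_confidence and required_sample_size are hypothetical. The z-score is taken from the standard normal distribution using the standard library's statistics.NormalDist, which assumes a two-sided confidence interval.

```python
from statistics import NormalDist


def z_for_confidence(confidence_pct: float) -> float:
    """z-score that leaves (100 - confidence_pct)/2 percent in each tail."""
    tail = (1 - confidence_pct / 100) / 2
    return NormalDist().inv_cdf(1 - tail)


def required_sample_size(z: float, p_pct: float, precision_pct: float) -> float:
    """n = z^2 * P * (100 - P) / precision^2, with P and precision in percent."""
    return z ** 2 * p_pct * (100 - p_pct) / precision_pct ** 2


# The slide's example: 99% confidence, P = 40%, precision of plus or minus 3%.
z_99 = round(z_for_confidence(99), 2)        # rounded to two decimals, as on the slide
n_needed = required_sample_size(z_99, 40, 3)

# With no estimate of P available, 50% gives the largest (safest) sample size.
n_worst_case = required_sample_size(z_99, 50, 3)
```

With z rounded to 2.58, the function returns about 1775.04, in line with the figure of roughly 1775 above; the 1774.93 shown there comes from rounding the intermediate terms before multiplying. Using the unrounded z-score gives a slightly smaller number, so the rounding convention is worth stating when reporting a required sample size.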