Sampling, Part 2 (AP Statistics, Chapter 12)
Bias vs. Error: BIAS is something that causes your measurements to systematically miss in the same direction, every time. Sampling error is just sampling variation. (If you flip a coin 10 times, you won't ALWAYS get 5 heads and 5 tails; some variation is inevitable with randomness.)
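A quick simulation can make the distinction concrete. This is only a sketch: the 0.7 heads probability for the "biased" coin, the seed, and the function names are assumptions made for illustration, not something from the slides. A fair coin gives head counts that scatter around 5 (sampling variation), while a biased coin gives counts that are centered in the wrong place (bias).

```python
import random


def count_heads(n_flips, p_heads, rng):
    """Flip a coin n_flips times and return the number of heads."""
    return sum(rng.random() < p_heads for _ in range(n_flips))


def simulate(n_samples, n_flips, p_heads, seed=0):
    """Repeat the n_flips experiment n_samples times; return all head counts."""
    rng = random.Random(seed)
    return [count_heads(n_flips, p_heads, rng) for _ in range(n_samples)]


# Fair coin: individual results vary, but they center on 5 heads.
fair = simulate(n_samples=1000, n_flips=10, p_heads=0.5)

# Hypothetical biased coin (0.7 is an assumed value): results miss high on average.
biased = simulate(n_samples=1000, n_flips=10, p_heads=0.7)

fair_mean = sum(fair) / len(fair)
biased_mean = sum(biased) / len(biased)

print("fair coin:   counts from", min(fair), "to", max(fair), "| average", fair_mean)
print("biased coin: counts from", min(biased), "to", max(biased), "| average", biased_mean)
```

Taking more samples does not fix the biased coin: its average stays near 7 heads, not 5. More data shrinks sampling variation, but it does nothing about bias.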
[Figure: examples labeled "bias!", "sampling variation/error", and "neither!"]
So if samples are prone to sampling error, why not conduct a census EVERY TIME? Because even with a census, mistakes will be made, whether through human error or, as we'll learn next class, through bias in the way the survey questions are asked.
Types of data: Numerical vs. Categorical

Name             Job Type     Age   Gender   Race       Salary   Zip Code
Jose Cedillo     Technical    27    Male     Hispanic   52,300   90630
Amanda Childers  Clerical     42    Female   White      27,500   90521
Tonia Chen       Management   51    Female   Asian      83,600   90629

Numerical: Age, Salary
Categorical: Job Type, Gender, Race, Zip Code*

*A variable is numerical when the count, or amount, is what matters most. With a zip code, the numerical value of the number isn't important. Nobody cares if someone else has a bigger zip code than you! You can also think of a numerical variable as one whose values it would make sense to average. Why would we find an average zip code?
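The "does an average make sense?" test can be tried directly on the three employees from the table. The sketch below is one possible way to set it up; the dictionary keys and the choice to store zip codes as strings are assumptions made for illustration. Numerical variables are averaged, while categorical variables are counted by category.

```python
from collections import Counter
from statistics import mean

# The three employees from the slide. Zip codes are stored as strings
# (an illustrative choice) because they are labels, not amounts.
employees = [
    {"name": "Jose Cedillo", "job_type": "Technical", "age": 27,
     "gender": "Male", "race": "Hispanic", "salary": 52300, "zip_code": "90630"},
    {"name": "Amanda Childers", "job_type": "Clerical", "age": 42,
     "gender": "Female", "race": "White", "salary": 27500, "zip_code": "90521"},
    {"name": "Tonia Chen", "job_type": "Management", "age": 51,
     "gender": "Female", "race": "Asian", "salary": 83600, "zip_code": "90629"},
]

NUMERICAL = ["age", "salary"]                             # averaging makes sense
CATEGORICAL = ["job_type", "gender", "race", "zip_code"]  # counting makes sense

# Average each numerical variable.
means = {var: mean(e[var] for e in employees) for var in NUMERICAL}

# Count how many employees fall in each category.
counts = {var: Counter(e[var] for e in employees) for var in CATEGORICAL}

for var in NUMERICAL:
    print(var, "mean:", means[var])
for var in CATEGORICAL:
    print(var, "counts:", dict(counts[var]))
```

Because the zip codes are strings here, trying to average them raises an error instead of quietly producing a meaningless "average zip code".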
A research group wishes to know the mean GPA of all 2600(ish) students at McNeil High School. To estimate this, they take a random sample of 100 students from those that have 1st period classes in the C wing, and pull those records. The mean GPA of the students in the sample is 2.98. According to the school registrar, the mean GPA of all 2600 students at McNeil High School is 3.09. Identify the following:
a) Population (of interest): all 2600(ish) students at McNeil High School
b) Parameter of interest: mean GPA of all students at McNeil
c) Sampling frame: students who have a 1st period class in the C wing
d) Sample: the 100 students who were randomly selected
A neighborhood interest group wants to know what proportion of households in Austin watch the TV show Doctor Who. They select a random sample of 59 houses from Northwest Austin, and find that 35.6% of those families watch the program regularly. Local ratings indicate that about 22% of all households watch Doctor Who on a regular basis. Identify the following:
a) Population (of interest): all households in Austin
b) Parameter of interest: proportion of households that watch Doctor Who
c) Sampling frame: households in Northwest Austin
d) Sample: the 59 households that were randomly selected
Parameters (population) vs. Statistics (sample)

                       Means (numerical data)   Proportions (categorical data)
Population parameter   μ                        p
Sample statistic       x̄                        p̂

Stats wear hats!
The McNeil GPA problem deals with means. 2.98 is the mean for the sample, so the symbol we would use is x̄. 3.09 is the mean for the population, so the symbol we would use is μ.
The Doctor Who problem deals with proportions (or percentages). 35.6% is the proportion for the sample, so we would use p̂. 22% is the proportion for the population, so we would use p.
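To see where a sample statistic comes from, here is a minimal sketch of computing x̄ (x-bar, a sample mean) and p̂ (p-hat, a sample proportion). The five GPAs are made up for illustration and are not the McNeil records. The count of 21 households is an inference from the slide's numbers (35.6% of 59), not something the slide states.

```python
from statistics import mean

# --- Sample mean (numerical data) ---
# Hypothetical sample of five GPAs, invented for illustration only.
sample_gpas = [3.2, 2.7, 3.8, 2.9, 3.4]
x_bar = mean(sample_gpas)  # x-bar: the mean of the sample

# --- Sample proportion (categorical data) ---
# The Doctor Who sample: 35.6% of 59 households. The count is inferred
# by rounding, since a count of households must be a whole number.
n_households = 59
n_watching = round(0.356 * n_households)
p_hat = n_watching / n_households  # p-hat: the proportion in the sample

# Population parameters given in the two problems, for comparison.
mu = 3.09  # mean GPA of all McNeil students (registrar)
p = 0.22   # proportion of all households (local ratings)

print("x-bar =", round(x_bar, 2), "(hypothetical sample) vs. mu =", mu)
print("p-hat =", n_watching, "/", n_households, "=", round(p_hat, 3), "vs. p =", p)
```

The statistics (x̄ and p̂) are computed from whatever sample you happened to draw, so they change from sample to sample. The parameters (μ and p) are fixed facts about the population.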
A VERY quick intro to bias. We will definitely revisit this next class. :)
Voluntary Response Bias: people choose to respond, and usually only people with very strong opinions do.
Ann Landers asked: "If you had to do it over again, would you have children?" 10,000 parents responded, and 70% of them said kids are NOT worth it!
A new survey was conducted by Gallup. They used a random sampling method and found that, actually, 90% of parents were happy with their decision to have children. ("I love my kids!")
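A small simulation suggests how the two surveys could disagree so sharply. This is a sketch under invented assumptions, not a reconstruction of either survey. The population size, the seed, and both response rates are hypothetical. The response rates were chosen so that unhappy parents are much more likely to write in. Only the 90% figure comes from the slide.

```python
import random

rng = random.Random(42)

# Assumed setup (hypothetical numbers, except P_HAPPY from the Gallup result).
POPULATION_SIZE = 100_000
P_HAPPY = 0.90                 # 90% of parents happy with their decision
RESPONSE_RATE_HAPPY = 0.01     # assumed: content parents rarely write in
RESPONSE_RATE_UNHAPPY = 0.20   # assumed: unhappy parents often write in

# True = happy with the decision to have children, False = unhappy.
population = [rng.random() < P_HAPPY for _ in range(POPULATION_SIZE)]

# Voluntary response: each parent decides for themselves whether to respond.
volunteers = [
    happy for happy in population
    if rng.random() < (RESPONSE_RATE_HAPPY if happy else RESPONSE_RATE_UNHAPPY)
]

# Simple random sample of the same size: chance decides who is included.
random_sample = rng.sample(population, len(volunteers))


def pct_unhappy(sample):
    """Percentage of a sample that is unhappy (False)."""
    return 100 * sample.count(False) / len(sample)


print("voluntary response:", round(pct_unhappy(volunteers), 1), "% unhappy")
print("random sample:     ", round(pct_unhappy(random_sample), 1), "% unhappy")
```

Under these assumed response rates, the voluntary responders are mostly unhappy even though the population is mostly happy, while the random sample of the same size stays close to the true 10% unhappy.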