Survey of Massachusetts Congressional District #4 Methodology Report

Prepared by Robyn Rapoport and David Dutwin
Social Science Research Solutions
53 West Baltimore Pike
Media, PA 19063

Contents

Overview
Sample Design
Field Preparations
Data Collection Procedures
Weighting Procedures
  1. Phone-Status Correction ($W_{PS}$)
  2. Within-Household Selection Correction ($W_{HC}$)
  3. Post-Stratification Weighting
Response Rate

Overview

The University of Massachusetts Lowell contracted with Social Science Research Solutions (SSRS) to conduct the Massachusetts Congressional District #4 (MA CD-4) Study from February 2 through February 4 and February 6 through February 8, 2012. The purpose of the MA CD-4 Study was to conduct the first valid and reliable poll on the possibility that Joseph Kennedy III would run for Congress in the newly redrawn MA Congressional District #4 for the seat being vacated by Barney Frank. This report provides information about the methods used to collect the data and report the survey results.

The study collected data from a representative sample of 408 registered voters living in the newly redistricted area of MA CD-4. The study consisted of a landline component (n = 304) and a cell phone component (n = 104).

Sample Design

To address concerns about coverage, the study employed a dual-frame landline/cell phone random digit dial (RDD) telephone design. Both samples were generated by SSRS's sister company, Marketing Systems Group (MSG).

The RDD landline sample was drawn from telephone exchanges within the new MA CD-4. Using Marketing Systems Group's Genesys database of telephone exchanges, we were able to select telephone exchanges that would result in a 92 percent incidence of reaching households in MA CD-4. These telephone exchanges cover 95 percent of all households in the District. Following generation, the landline sample was prepared using MSG's proprietary procedures, which not only limit the sample to non-zero banks but also identify and eliminate approximately 90 percent of all nonworking and business numbers and ported cell phones.

For the RDD cell phone sample, numbers were initially drawn from the four switch-points (central routing mechanisms that send cell phone calls to different parts of the country) located in MA CD-4. After the initial sample was drawn, additional analyses were conducted through the Telcordia database in order to better align the cell phone sample with the borders of the new MA CD-4 and improve coverage. Based on these analyses, the cell sample was refined as follows:

- First, the analysis identified a number of 1,000-blocks of telephone numbers connected with switch-points outside of the District that are most often routed to households within the District. These 1,000-blocks were therefore included in the sample file.
- Second, the analysis tagged several 1,000-blocks in the four switch-points within the District as being owned by telephone resellers that typically do not provide numbers used by households. SSRS dialed a portion of these exchanges and confirmed that they are indeed non-residential; these exchanges were therefore excluded from further dialing.
- Third, Telcordia flagged 1,000-blocks within the four in-district switch-points that target households outside of MA CD-4. SSRS also dialed several of these exchanges. After confirmation that none of the households were part of MA CD-4, telephone numbers associated with these 1,000-blocks were removed from the active sample.

Survey incidence before the sample refinements outlined above was less than four percent; following the refinements, the sample attained a 15 percent incidence of registered voters living within MA CD-4, closer to the original estimate of a 20 percent incidence.

Field Preparations

The questionnaire was developed by UMass Lowell in consultation with the SSRS project team. Prior to the field period, SSRS programmed the study into CfMC Computer Assisted Telephone Interviewing (CATI) software. The program was checked extensively to ensure that skip patterns followed the design of the questionnaire.

The field period for this study was February 2 through February 4 and February 6 through February 8, 2012. All interviews were done through the CATI system. The CATI system ensured that questions followed logical skip patterns and that complete dispositions of all call attempts were recorded.

CATI interviewers received both written materials on the survey and formal training. The written materials were provided prior to the beginning of the field period and included an annotated questionnaire that contained information about the goals of the study as well as detailed explanations of why questions were being asked, the meaning and pronunciation of key terms, potential obstacles to be overcome in getting good answers to questions, and respondent problems that could be anticipated ahead of time, along with strategies for addressing them.

Interviewer training was conducted immediately before the survey was officially launched. Call center supervisors and interviewers were walked through each question in the questionnaire. Interviewers were given instructions to help them maximize response rates and ensure accurate data collection.
Data Collection Procedures

Interviews were conducted from February 2 through February 4 and February 6 through February 8, 2012; interviews were not conducted on Sunday, February 5, Super Bowl Sunday, because of the likelihood that cooperation and response rates would be low on that day.

For the landline sample, interviewers asked to speak with the youngest adult male or female currently at home. In order to produce a sample that, when combined with the cell completes, would more closely resemble the general population in the area by gender and age, the program asked for the youngest male first 70 percent of the time. Callbacks were set up if no adult was available to complete the interview at the time of the call.

For the cell phone sample, interviewers first determined whether the person who answered the phone was an adult and then confirmed that the respondent was not driving or doing anything that required their full attention. Where possible, callbacks were set up if the respondent was not able to complete the interview at the time of the call.

Respondents were asked their zip code in order to determine geographic eligibility. Interviews with out-of-area respondents were terminated. Interviews were continued with respondents who provided in-area or borderline zip codes. Borderline zip codes are zip codes associated with residential areas that are both inside and outside the borders of MA CD-4.

Screening questions were asked to determine if the respondent was registered to vote at their current address. Respondents who said that they were not registered to vote or were not certain of their registration status, either in general or at their current address, were asked only the demographic questions necessary for weighting the sample. Registered voters continued with the main interview.

Notably, the survey instrument used the respondent-reported zip code to ascertain whether a respondent resided within MA CD-4. For the majority of the households in MA CD-4, geographic eligibility is knowable based on the zip code alone; for the remaining respondents (those living in households with borderline zip codes), it was necessary to determine geographic eligibility using geocoding information (i.e., 100 block and cross-street) collected at the end of the survey.
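The screening flow described above can be sketched in a few lines of Python. This is only an illustration of the routing logic, not the CATI program SSRS used: the zip code sets are hypothetical placeholders (the report does not list the actual in-area or borderline zip codes), and the function and field names are invented for the example.

```python
# Hypothetical zip code sets, for illustration only; the report does not
# list the actual in-area or borderline zip codes.
IN_AREA_ZIPS = {"00001", "00002"}
BORDERLINE_ZIPS = {"00003"}


def screen_respondent(zip_code, registered_at_address):
    """Return the interview path for one respondent.

    registered_at_address should be True only when the respondent says they
    are registered to vote at their current address; "not registered" and
    "not sure" both count as False.
    """
    in_area = zip_code in IN_AREA_ZIPS
    borderline = zip_code in BORDERLINE_ZIPS

    # Out-of-area respondents are terminated right away.
    if not in_area and not borderline:
        return {"path": "terminate", "needs_geocoding": False}

    # Registered voters get the main interview; everyone else is asked only
    # the demographic questions needed for weighting.
    path = "main_interview" if registered_at_address else "demographics_only"

    # Borderline cases can only be resolved later from the geocoding
    # information (100 block and cross-street) collected at the end.
    return {"path": path, "needs_geocoding": borderline}


print(screen_respondent("00001", True))
print(screen_respondent("00003", False))
print(screen_respondent("99999", True))
```

Keeping the geocoding flag separate from the interview path mirrors the text: eligibility for borderline cases is settled after the interview, not during it.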
Because geographic eligibility for these borderline cases could not be determined programmatically, SSRS needed to conduct additional interviews in order to ensure that the final sample of completed interviews would contain a minimum of 300 landline and 100 cell completes with registered voters known to live in MA CD-4. Overall, SSRS completed 41 full interviews (27 landline and 14 cell) with registered voters living in a borderline zip code and asked demographic questions of 22 respondents living in a borderline zip code who did not qualify for the full survey as registered voters. SSRS mapped geocoding information for each borderline case and compared the location with the boundaries of MA CD-4. Of the borderline completes, SSRS determined that 11 (four landline and seven cell phone) were outside MA CD-4; of the borderline demographic-only interviews, SSRS determined that five (one landline and four cell) were outside the area. Thus, while SSRS completed 419 full interviews and 119 demographic-only interviews, the final sample used for weighting included 408 interviews with registered voters and 114 demographic-only interviews.

In order to maximize survey response, SSRS enacted the following procedures during the field period:

- An average of 3 follow-up attempts were made to contact non-responsive numbers (no answer, busy, answering machine).
- Each non-responsive number was contacted multiple times, varying the times of day and the days of the week that call-backs were placed, using a programmed differential call rule.
- Respondents were offered the option to schedule a call-back.
- Phone numbers received a daytime call attempt, if necessary.

Weighting Procedures

The final data were weighted to correct for variance in the likelihood of selection for a given case and to balance the sample to known population parameters in order to correct for systematic under- or over-representation of meaningful social categories.

Typically, data are weighted to Census parameters via the American Community Survey (ACS) or the Current Population Survey. However, these data are only reliable down to the PUMA (Public Use Microdata Area) level. Because the Congressional District does not perfectly overlap with PUMA boundaries, SSRS utilized counts from Claritas, a Nielsen company, to weight the data for this survey. Claritas takes data from the decennial Census and models them with a variety of other sources to update the 2010 Census counts quarterly, until the next Census in 2020. These data are therefore quite accurate, given our proximity to the 2010 Census.

We selected Claritas data for the block groups in MA CD-4. We then compared demographic frequencies for age, race, education, and gender to the best-fit overlap of PUMAs from the 2010 American Community Survey. The estimates were quite close. This is an important check on the reliability of the Claritas data in providing meaningful weighting targets for our sample, because Claritas provides race and ethnicity separately but these data are weighted in our sample in a single step. In addition, Claritas provides education for the 25+ population; thus, educational attainment for 18-24 year olds needs to be imputed using the ACS estimates to produce counts for the full 18+ adult population.
SSRS has used this procedure for dozens of local-area studies and is quite confident in the accuracy of the results.

Phone use (cell phone only, dual users, and landline only) was modeled utilizing the same procedure used by the National Health Interview Survey (NHIS) to estimate phone use at the state level. Namely, a logistic regression was run within NHIS data, predicting these three phone use types separately. Then, Claritas and ACS estimates for the District were utilized to solve the regression equation for CD-4 specifically. This procedure found that 30.2 percent of CD-4 households are cell phone only, compared to only 12 percent that are landline only.

Demographic data were collected and weighting procedures were executed for all geographically eligible respondents. These steps were necessary because universe counts for registered voters living in MA CD-4 are not available. After weighting the data of all respondents who are geographically eligible to the universe counts, the final step is to remove cases that were not eligible as voters registered to vote at an address located within MA CD-4. This results in a final self-weighted sample of registered voters in CD-4.

The weighting procedure involved the following steps:

1. Phone-Status Correction ($W_{PS}$): Respondents whose household members answer both landlines and cell phones have a higher likelihood of inclusion in the sample. To correct for this, cases from dual-frame households were assigned a weight equal to half the weight assigned to single-mode households.

2. Within-Household Selection Correction ($W_{HC}$): To correct for the fact that only one qualifying adult was selected in any given household, landline cases from households with a single qualifying adult received a weight of 1, those with two received a weight of 2, and those with three or more qualifying adults received a weight of 3. Respondents with missing data were assigned the mean weight. Cell phone respondents received a weight of 1, as there was no within-household selection on the cell phones.

The product of these two stages was the base weight for the sample:

$BW = W_{PS} \times W_{HC}$

3. Post-Stratification Weighting: The base weight was used as the input weight in the iterative proportional fitting (IPF) process, or raking. Universe counts were obtained through the procedure described earlier for age, educational attainment, gender, phone use, and race.

Table 1: Comparison of Benchmark Data, Unweighted Sample, and Weighted Sample

| Parameter | Value Label | Benchmark* | Unweighted* | Weighted* |
|---|---|---|---|---|
| Education | Less than High School | 10.1% | 5.4% | 10.1% |
| Education | High School Graduate | 23.4% | 18.8% | 23.4% |
| Education | Some College | 25.9% | 23.8% | 25.9% |
| Education | College+ | 39.9% | 51.3% | 39.9% |
| Gender | Male | 47.4% | 46.6% | 47.4% |
| Gender | Female | 52.6% | 53.4% | 52.6% |
| Age | 18-24 | 12.8% | 9.6% | 12.8% |
| Age | 25-34 | 14.4% | 8.2% | 14.4% |
| Age | 35-44 | 17.4% | 18.0% | 17.4% |
| Age | 45-54 | 21.2% | 22.4% | 21.2% |
| Age | 55-64 | 16.2% | 18.4% | 16.2% |
| Age | 65+ | 17.0% | 22.4% | 17.0% |
| Race | White | 88.4% | 88.7% | 88.4% |
| Race | Black (non-Hispanic) | 2.0% | 1.5% | 2.0% |
| Race | Hispanic | 3.0% | 3.8% | 3.0% |
| Race | Other (non-Hispanic) | 4.9% | 4.2% | 4.9% |
| Phone Use | Cell phone only | 30.2% | 10.9% | 30.2% |
| Phone Use | Dual Frame | 57.8% | 83.5% | 57.8% |
| Phone Use | Landline only | 12.0% | 5.6% | 12.0% |

* Percentages may not add to 100% because some respondents refused to provide this demographic information.

Weighting procedures increase the variance in the data, with larger weights causing greater variance. Complex survey designs and post-data collection statistical adjustments affect variance estimates and, as a result, tests of significance and confidence intervals. The final design effect for the survey was 1.7, and the margin of sampling error was ±4.85 percentage points (±6.39 with the design effect).
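The weighting steps above can be illustrated with a short Python sketch. It is not SSRS's production code: the eight cases are invented toy data, only two raking margins are used (the gender and phone-use benchmarks from Table 1), missing-data handling is omitted, and the design effect is computed with Kish's approximation, which is an assumption because the report does not say how its design effect was derived. The margin of error uses the usual 95 percent formula at p = 0.5.

```python
import math


def base_weight(frame, phone_use, qualifying_adults):
    """Base weight BW = W_PS * W_HC for one case."""
    # Phone-status correction: dual users get half the single-mode weight.
    w_ps = 0.5 if phone_use == "dual" else 1.0
    # Within-household correction: landline only, capped at 3 adults.
    w_hc = float(min(qualifying_adults, 3)) if frame == "landline" else 1.0
    return w_ps * w_hc


def rake(cases, start_weights, targets, iterations=50):
    """Iterative proportional fitting over one margin at a time."""
    weights = list(start_weights)
    for _ in range(iterations):
        for variable, shares in targets.items():
            total = sum(weights)
            current = {category: 0.0 for category in shares}
            for case, w in zip(cases, weights):
                current[case[variable]] += w
            # Rescale so each category's weighted share hits its target.
            weights = [
                w * shares[case[variable]] * total / current[case[variable]]
                for case, w in zip(cases, weights)
            ]
    return weights


def kish_design_effect(weights):
    """Kish approximation: n * sum(w^2) / (sum(w))^2 (an assumption here)."""
    return len(weights) * sum(w * w for w in weights) / sum(weights) ** 2


def margin_of_error(n, design_effect=1.0, z=1.96):
    """95% margin of sampling error in percentage points at p = 0.5."""
    return 100 * z * math.sqrt(0.25 / n) * math.sqrt(design_effect)


# Toy cases, invented for illustration only.
CASES = [
    {"frame": "landline", "phone_use": "dual", "gender": "male", "adults": 2},
    {"frame": "landline", "phone_use": "dual", "gender": "female", "adults": 1},
    {"frame": "landline", "phone_use": "landline_only", "gender": "female", "adults": 3},
    {"frame": "landline", "phone_use": "dual", "gender": "male", "adults": 4},
    {"frame": "cell", "phone_use": "cell_only", "gender": "male", "adults": 1},
    {"frame": "cell", "phone_use": "dual", "gender": "female", "adults": 2},
    {"frame": "cell", "phone_use": "cell_only", "gender": "female", "adults": 1},
    {"frame": "landline", "phone_use": "landline_only", "gender": "male", "adults": 1},
]
# Benchmark shares for two of the raking variables, taken from Table 1.
TARGETS = {
    "gender": {"male": 0.474, "female": 0.526},
    "phone_use": {"cell_only": 0.302, "dual": 0.578, "landline_only": 0.120},
}

base = [base_weight(c["frame"], c["phone_use"], c["adults"]) for c in CASES]
final = rake(CASES, base, TARGETS)

print(round(margin_of_error(408), 2))       # 4.85, matching the report
print(round(margin_of_error(408, 1.7), 2))  # 6.33 with the rounded design effect
```

With the rounded design effect of 1.7 the adjusted margin comes out near 6.33 rather than the reported 6.39; the reported figure would be consistent with an unrounded design effect of roughly 1.73, though the report does not state that value.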

Response Rate

The landline response rate was 27.3% and the cell phone response rate was 14.0%, for an overall response rate of 20.9%, using AAPOR's RR3 formula. Below is a full disposition of the sample selected for the survey.

Table 2: Sample Dispositions

| Disposition | Landline | Cell | Total |
|---|---|---|---|
| Eligible, interview (Category 1) | | | |
| Complete | 304 | 104 | 408 |
| Eligible, non-interview (Category 2) | | | |
| Refusal (eligible) | 17 | 15 | 32 |
| Break-off | 5 | 7 | 12 |
| Answering machine (eligible) | 0 | 3 | 3 |
| Physically or mentally unable | 0 | 0 | 0 |
| Language problem | 0 | 0 | 0 |
| Unknown eligibility, non-interview (Category 3) | | | |
| Always busy | 90 | 344 | 434 |
| No answer | 2,375 | 7,746 | 10,121 |
| Answering machine, don't know if household | 117 | 4,574 | 4,691 |
| Call blocking | 36 | 396 | 432 |
| Technical phone problems | 3 | 19 | 22 |
| Housing unit, unknown if eligible respondent | 891 | 3,687 | 4,578 |
| No screener completed | 801 | 2,917 | 3,718 |
| Not eligible (Category 4) | | | |
| Fax/data line | 848 | 125 | 973 |
| Non-working number | 9,420 | 4,314 | 13,734 |
| Business, government office, other organizations | 1,076 | 344 | 1,420 |
| No eligible respondent | 94 | 1,509 | 1,603 |
| Total phone numbers used | 16,077 | 26,104 | 42,181 |
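For readers unfamiliar with RR3, the general form is RR3 = I / ((I + P) + (R + NC + O) + e(UH + UO)), where e is the estimated share of unknown-eligibility cases that are actually eligible. The sketch below plugs the Table 2 counts into that form. The report does not state the value of e it used (or whether separate rates were applied at different screening stages), so e is left as a parameter and the values tried here are purely hypothetical; the example does not attempt to reproduce the published rates.

```python
def aapor_rr3(interviews, eligible_noninterviews, unknown_eligibility, e):
    """AAPOR RR3 with a single eligibility rate e for unknown cases.

    Partial interviews are taken to be zero here because Table 2 reports
    break-offs under eligible non-interviews.
    """
    if not 0.0 <= e <= 1.0:
        raise ValueError("e must be between 0 and 1")
    denominator = interviews + eligible_noninterviews + e * unknown_eligibility
    return interviews / denominator


# Counts taken from Table 2 (Categories 1, 2, and 3).
LANDLINE = {
    "interviews": 304,
    "eligible_noninterviews": 17 + 5 + 0 + 0 + 0,
    "unknown_eligibility": 90 + 2375 + 117 + 36 + 3 + 891 + 801,
}
CELL = {
    "interviews": 104,
    "eligible_noninterviews": 15 + 7 + 3 + 0 + 0,
    "unknown_eligibility": 344 + 7746 + 4574 + 396 + 19 + 3687 + 2917,
}

# Hypothetical values of e, chosen only to show how strongly RR3 depends on it.
for e in (0.0, 0.1, 1.0):
    landline_rate = aapor_rr3(e=e, **LANDLINE)
    cell_rate = aapor_rr3(e=e, **CELL)
    print(f"e={e:.1f}  landline={landline_rate:.3f}  cell={cell_rate:.3f}")
```

Setting e to 0 gives the upper bound on the rate and e to 1 gives the lower bound, so the choice of e matters a great deal when unknown-eligibility cases dominate the sample, as they do here.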