Investigation of Variance Estimators for the Survey of Business Owners (SBO)

Investigation of Variance Estimators for the Survey of Business Owners (SBO)
Marilyn Balogh and Sandy Peterson
U.S. Census Bureau
November 5, 2013

Outline
- Background on SBO
- Variance Estimation Methodology
  - Random group (simple and stratum-specific)
  - Delete-a-group jackknife (simple and stratum-specific)
  - Stratified jackknife
- Simulation Study
- Results
- Conclusion

Background on SBO
- Part of the Economic Census, taken every 5 years for years ending in 2 and 7
- The only comprehensive, regularly collected data for businesses and business owners by
  - Gender
  - Race
  - Ethnicity (Hispanic origin of any race)
  - Veteran status

Background on SBO
- SBO universe:
- 9 sampling frames based on modeled likelihoods
- Stratified by frame, state, industry code, and employment status (68,585)
- Firms are selected with certainty or are subjected to systematic sampling

Background on SBO
- Hot-deck donor imputation for unit and item nonresponse
- Estimates are calculated using the Horvitz-Thompson estimator
- Sampling error is estimated using the random group (RG) variance estimator
  - 10 non-certainty random groups
  - fpc adjustment factor
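As an illustration of the Horvitz-Thompson estimator named above, the following minimal sketch weights each sampled value by the inverse of its inclusion probability. The function name and the toy data are hypothetical and are not taken from SBO processing.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total as the sum of y_i / pi_i over sampled units."""
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have the same length")
    return sum(y / p for y, p in zip(values, inclusion_probs))


# Hypothetical sample: one certainty firm (pi = 1) and three firms
# each selected with probability 1/4.
receipts = [500.0, 20.0, 35.0, 45.0]
probs = [1.0, 0.25, 0.25, 0.25]
estimate = horvitz_thompson_total(receipts, probs)
print(estimate)
```

A certainty firm represents only itself, while each firm sampled with probability 1/4 represents four firms in the population.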

Variance Estimation Methodology
Three variance estimators:
- Random group (RG): simple and stratum-specific
- Delete-a-group jackknife (DAG): simple and stratum-specific
- Stratified jackknife (SJK)

Random Group and Delete-a-Group Jackknife Methods
- Divides the non-certainty firms into R random groups
- Creates R replicate estimates
- Calculates the simple variance
- 2 reweighting procedures (simple and stratum-specific)

RG simple method

RG stratum-specific method

RG variance
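The RG formulas from the slides are not reproduced in this transcript. The following is a minimal sketch of the textbook random group estimator for a total, under two assumptions that the source does not confirm: the "simple" reweighting multiplies every weight in a group by R, and the variance is centered on the mean of the replicate estimates. The function names, the `fpc` argument, and the data are hypothetical.

```python
def rg_replicate_totals(weighted_values, groups, n_groups):
    """Replicate r uses only random group r, with weights scaled by R (assumed simple reweighting)."""
    totals = [0.0] * n_groups
    for wy, g in zip(weighted_values, groups):
        totals[g] += n_groups * wy
    return totals


def random_group_variance(replicate_estimates, fpc=1.0):
    """Textbook RG variance: fpc * sum((theta_r - mean)^2) / (R * (R - 1))."""
    r = len(replicate_estimates)
    if r < 2:
        raise ValueError("at least two random groups are required")
    mean = sum(replicate_estimates) / r
    ss = sum((t - mean) ** 2 for t in replicate_estimates)
    return fpc * ss / (r * (r - 1))


# Hypothetical weighted values (w_i * y_i) for six non-certainty firms in R = 3 groups.
weighted_values = [10.0, 12.0, 8.0, 14.0, 11.0, 9.0]
groups = [0, 1, 2, 0, 1, 2]
replicates = rg_replicate_totals(weighted_values, groups, 3)
rg_variance = random_group_variance(replicates)
print(replicates, rg_variance)
```

A stratum-specific version would presumably compute the reweighting factor separately within each stratum rather than using a single factor R. That detail is not recoverable from the transcript.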

DAG simple method

DAG stratum-specific method

DAG variance
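As a rough sketch of the delete-a-group jackknife for a total, the code below drops one random group at a time, scales the remaining weights by R / (R - 1), and sums the squared deviations from the full-sample estimate with the factor (R - 1) / R. This is the textbook form with an assumed simple reweighting, not necessarily the formula on the slides. The names and data are hypothetical.

```python
def dag_replicate_totals(weighted_values, groups, n_groups):
    """Replicate r drops random group r and scales the remaining weights by R / (R - 1)."""
    group_totals = [0.0] * n_groups
    for wy, g in zip(weighted_values, groups):
        group_totals[g] += wy
    full_total = sum(group_totals)
    scale = n_groups / (n_groups - 1)
    return [scale * (full_total - t) for t in group_totals]


def dag_jackknife_variance(replicate_estimates, full_estimate, fpc=1.0):
    """Textbook DAG variance: fpc * (R - 1) / R * sum((theta_(r) - theta)^2)."""
    r = len(replicate_estimates)
    if r < 2:
        raise ValueError("at least two random groups are required")
    ss = sum((t - full_estimate) ** 2 for t in replicate_estimates)
    return fpc * (r - 1) / r * ss


# Hypothetical weighted values (w_i * y_i) for six non-certainty firms in R = 3 groups.
weighted_values = [10.0, 12.0, 8.0, 14.0, 11.0, 9.0]
groups = [0, 1, 2, 0, 1, 2]
dag_replicates = dag_replicate_totals(weighted_values, groups, 3)
dag_variance = dag_jackknife_variance(dag_replicates, sum(weighted_values))
print(dag_replicates, dag_variance)
```

For a linear estimator such as a total, this simple DAG variance equals the simple random group variance computed from the same groups, which is consistent with the identical RG_S and DAG_S columns in Tables 2 and 3.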

Stratified Jackknife Method

SJK method

SJK variance
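The SJK formulas are likewise missing from this transcript. A minimal sketch of the textbook delete-one stratified jackknife for a total follows, assuming that each replicate drops a single sampled unit and reweights the rest of its stratum by n_h / (n_h - 1). The function name, the optional `fpc` mapping, and the data are hypothetical.

```python
def stratified_jackknife_variance(strata, fpc=None):
    """strata maps a stratum label to the list of weighted values w_i * y_i sampled in it."""
    full_total = sum(sum(values) for values in strata.values())
    variance = 0.0
    for label, values in strata.items():
        n_h = len(values)
        if n_h < 2:
            continue  # a one-unit stratum yields no delete-one replicate in this sketch
        stratum_total = sum(values)
        factor = 1.0 if fpc is None else fpc[label]
        ss = 0.0
        for dropped in values:
            # Drop one unit, reweight the rest of the stratum, leave other strata unchanged.
            replicate = (full_total - stratum_total
                         + n_h / (n_h - 1) * (stratum_total - dropped))
            ss += (replicate - full_total) ** 2
        variance += factor * (n_h - 1) / n_h * ss
    return variance


# Hypothetical weighted values in two strata.
strata = {"A": [10.0, 14.0, 12.0], "B": [8.0, 9.0]}
sjk_variance = stratified_jackknife_variance(strata)
print(sjk_variance)
```

Because this approach forms one replicate per sampled unit rather than one per random group, its cost grows with the sample size, which is in line with the long run times reported for the SJK method.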

Simulation Study
- Created a simulated population
- Selected 5 states: Florida, Georgia, Kansas, New York, and North Dakota
- Assigned race, gender, ethnicity, and veteran status
- Selected 5,000 different stratified systematic samples
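To illustrate what one stratified systematic sample looks like, the sketch below takes every k-th unit in each stratum after a random start. It assumes integer sampling intervals and treats certainty strata as having an interval of 1. The frame, the intervals, and the function names are hypothetical and do not describe the actual SBO selection code.

```python
import random


def systematic_sample(units, interval, rng):
    """Select every interval-th unit after a random start in [0, interval)."""
    start = rng.randrange(interval)
    return units[start::interval]


def stratified_systematic_sample(frame, intervals, seed=0):
    """Draw an independent systematic sample within each stratum of the frame."""
    rng = random.Random(seed)
    return {stratum: systematic_sample(units, intervals[stratum], rng)
            for stratum, units in frame.items()}


# Hypothetical frame: a certainty stratum (interval 1) and a 1-in-5 stratum.
frame = {"certainty": list(range(3)), "noncertainty": list(range(20))}
intervals = {"certainty": 1, "noncertainty": 5}
sample = stratified_systematic_sample(frame, intervals, seed=42)
print(sample)
```

Repeating the draw with 5,000 different seeds would give a set of samples analogous in structure to those used in the simulation study.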

Simulation Study
- Assigned sampled units to 10 non-certainty random groups
- Calculated the 5 variance estimators:
  - RG simple (RG_S)
  - RG stratum-specific (RG_ST)
  - DAG simple (DAG_S)
  - DAG stratum-specific extended (DAG_ST)
  - Stratified jackknife (SJK)
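The slides do not say how sampled units are assigned to the 10 random groups. One common approach, shown here purely as an assumption, is to shuffle the non-certainty units and deal them out in rotation so that group sizes differ by at most one. The function name and unit identifiers are hypothetical.

```python
import random


def assign_random_groups(unit_ids, n_groups=10, seed=0):
    """Shuffle the units, then assign them to groups 0..n_groups-1 in rotation."""
    rng = random.Random(seed)
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    return {unit: i % n_groups for i, unit in enumerate(shuffled)}


# Hypothetical sample of 25 non-certainty firms.
group_of = assign_random_groups(range(25), n_groups=10, seed=1)
sizes = [list(group_of.values()).count(g) for g in range(10)]
print(sizes)
```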

Results

CV Sign Test Results
- The median CV of the SJK method is smaller than the median CV of each of the other methods
- The median CV of the RG_ST method is smaller than the median CV of each of the other methods, except SJK
- The median CV of the DAG_ST method is smaller than the median CVs of the simple methods
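The slides do not describe how the sign tests were set up. As a hedged illustration, the sketch below runs a two-sided paired sign test on the ten New York CVs reported in Table 3 for the SJK and RG_S columns. The authors presumably tested a much larger set of estimates, so the p-value here is illustrative only, and the function name is hypothetical.

```python
from math import comb


def sign_test(cvs_a, cvs_b):
    """Two-sided paired sign test; returns (count of a < b, usable pairs, p-value)."""
    diffs = [a - b for a, b in zip(cvs_a, cvs_b) if a != b]  # ties are dropped
    n = len(diffs)
    k = sum(d < 0 for d in diffs)
    tail = min(k, n - k)
    p_value = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return k, n, min(1.0, p_value)


# CVs from Table 3 (all firms within New York), in row order.
sjk = [0.988, 0.014, 0.077, 0.048, 0.146, 0.276, 0.098, 0.065, 0.560, 0.124]
rg_s = [0.643, 0.449, 0.446, 0.454, 0.449, 0.429, 0.442, 0.451, 0.591, 0.490]
wins, pairs, p_value = sign_test(sjk, rg_s)
print(wins, pairs, p_value)
```

In this small example the SJK CV is lower in 9 of the 10 rows; the exception is the "All firms" row.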

Relative Bias Results

Table 2: Relative biases for the firm count by demographic characteristic and variance estimator for all firms within New York

Demographic Characteristic     RG_S    RG_ST    DAG_S   DAG_ST      SJK
All firms                     0.174   31.417    0.174   -0.997   -0.988
Female                       -0.032    0.048   -0.032   -0.040   -0.012
Male                         -0.069    0.222   -0.069   -0.086   -0.076
Hispanic                     -0.050   -0.049   -0.050   -0.058   -0.047
Non-Hispanic                 -0.054    1.664   -0.054   -0.136   -0.146
White                        -0.245    0.414   -0.245   -0.281   -0.276
Black or African American    -0.095   -0.074   -0.095   -0.107   -0.098
AIAN                         -0.078   -0.064   -0.078   -0.079   -0.049
Asian                        -0.552   -0.544   -0.552   -0.555   -0.560
NHOPI                        -0.011   -0.018   -0.011   -0.011   -0.020

Coefficient of Variation Results

Table 3: CVs for the firm count by demographic characteristic and variance estimator for all firms within New York

Demographic Characteristic     RG_S    RG_ST    DAG_S   DAG_ST      SJK
All firms                     0.643   31.419    0.643    0.997    0.988
Female                        0.449    0.445    0.449    0.444    0.014
Male                          0.446    0.487    0.446    0.443    0.077
Hispanic                      0.454    0.451    0.454    0.452    0.048
Non-Hispanic                  0.449    1.714    0.449    0.429    0.146
White                         0.429    0.535    0.429    0.439    0.276
Black or African American     0.442    0.430    0.442    0.438    0.098
AIAN                          0.451    0.445    0.451    0.452    0.065
Asian                         0.591    0.582    0.591    0.593    0.560
NHOPI                         0.490    0.483    0.490    0.491    0.124
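The definitions of relative bias and CV used in the study are not preserved in this transcript. The sketch below shows one plausible pair of definitions for evaluating a variance estimator across simulated samples: relative bias as the mean error relative to the true variance, and CV as the root mean squared error relative to the true variance. Under that assumed CV definition the CV can never be smaller than the absolute relative bias, a pattern that the values in Tables 2 and 3 also follow. The function names and the numbers are hypothetical.

```python
import math


def relative_bias(variance_estimates, true_variance):
    """(mean of the variance estimates - true variance) / true variance."""
    mean_estimate = sum(variance_estimates) / len(variance_estimates)
    return (mean_estimate - true_variance) / true_variance


def cv_of_variance_estimator(variance_estimates, true_variance):
    """Root mean squared error of the variance estimates, relative to the true variance (assumed definition)."""
    mse = sum((v - true_variance) ** 2 for v in variance_estimates) / len(variance_estimates)
    return math.sqrt(mse) / true_variance


# Hypothetical variance estimates from four simulated samples, with true variance 10.
simulated = [8.0, 12.0, 10.0, 6.0]
rb = relative_bias(simulated, 10.0)
cv = cv_of_variance_estimator(simulated, 10.0)
print(rb, cv)
```

In a simulation such as this one, the "true" variance would typically be approximated by the empirical variance of the point estimates over all simulated samples.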

Real Time Results
- The SJK method generally has the lowest CVs
- The run time of the SJK method is extremely high
  - For our small study sample, the SJK method took 12.6 times longer
  - For the full 2007 SBO sample, the SJK method took 73 times longer
- The SJK method would take over a month to run all the estimates

Conclusion
- The SJK variance estimator was the superior method
  - Consistently produced a low CV
  - Showed little difference in relative bias (RB)
- Processing time for the SJK method would be too long
- Recommend future research into more efficient processing for the SJK variance estimator

Acknowledgements
- Sandy Peterson
- Maxwell Mitchell
- Jeffrey Dalzell
- Robin Gibson
- Terry Pennington
- Beth Schlein
- Meijin Ye

Contact Information
- Marilyn Balogh, Mathematical Statistician, U.S. Census Bureau: Marilyn.K.Balogh@census.gov
- Sandy Peterson, Mathematical Statistician, U.S. Census Bureau: Sandra.Peterson@census.gov
- General SBO inquiries
  - Phone: 888.225.4022 or 301.763.3316
  - Email: csd.sbo@census.gov