Using Location-Based Services to Improve Census and Demographic Statistical Data
Deirdre Dalpiaz Bishop
May 17, 2012
U.S. Census Bureau Mission
- To serve as the leading source of quality data about the nation's people and economy
- Central to achieving our mission: data collection, data processing, data tabulation, and data dissemination
Data Collection
- Mail-in surveys
- Census enumerators
- Hand-held devices
Data Tabulation
- Electronic Tabulator
- Data processing
- Hollerith machine
- Optical Sensing Device (FOSDIC)
Data Dissemination
1860 Data Visualization
- Population by Census Tract
- Population by County
The Decennial Census
- Required by the U.S. Constitution, Article 1, Section 2
- A complete count of the population
- Every 10 years
- April 1 is our reference date
- Largest peacetime activity undertaken by the federal government
Data obtained from the Census
- Determines how many representatives a state will have in the U.S. House of Representatives
- Is used to create local districts for elections, schools, and utilities
- Used in part to allocate $300 billion in federal funds every year
- Helps government and private industry make informed decisions
  - Determining where to locate new housing, businesses, and public institutions
  - Examining the demographic characteristics of communities, cities, states, and the nation
The Geographic Foundation
- The MAF/TIGER System (processing environment plus address and spatial data) is the geographic framework that supports the Census Bureau's data collection, processing, tabulation, and dissemination programs
- The MAF/TIGER Database comprises:
  - The Master Address File (MAF), used to create the decennial census Address List
  - The Topologically Integrated Geographic Encoding and Referencing System (TIGER)
The MAF/TIGER Database
Contains:
- For every housing unit known to the Census Bureau
  - An address or physical description
  - Manual or GPS structure points
- Spatial features, including all streets and hydrography
- Geographic areas and their boundaries
  - Legal, statistical, and field assignment areas
Provides the infrastructure for geocoding
The MAF/TIGER Database
[Figure: structure points labeled with tabulation block codes 010230123456780 through 010230123456785 (fictitious data for this graphic)]
- Every structure point has associated address information
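As an illustration of how a tabulation block code such as the fictitious 010230123456780 can be read, the sketch below splits a 15-digit code into its parts. It assumes the slide's codes follow the standard census block identifier layout (2-digit state, 3-digit county, 6-digit tract, 4-digit block); the names `BlockCode` and `parse_block_code` are hypothetical and not part of any Census Bureau software.

```python
from typing import NamedTuple


class BlockCode(NamedTuple):
    """Components of a 15-digit census block identifier (assumed layout)."""
    state: str   # 2 digits
    county: str  # 3 digits
    tract: str   # 6 digits
    block: str   # 4 digits


def parse_block_code(code: str) -> BlockCode:
    """Split a 15-digit block code into state, county, tract, and block."""
    if len(code) != 15 or not code.isdigit():
        raise ValueError(f"expected 15 digits, got {code!r}")
    return BlockCode(code[0:2], code[2:5], code[5:11], code[11:15])


# Fictitious code taken from the slide graphic.
example = parse_block_code("010230123456780")
print(example)
```

Keeping the parts as strings preserves leading zeros, which matter when codes are later used as join keys.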
MAF/TIGER Database Update Sources
- Mailing addresses
  - U.S. Postal Service Delivery Sequence File
    - A computerized file that contains all addresses serviced by the U.S. Postal Service
    - Each address is a separate record
- Additional address information, including GPS or manual coordinates, and streets
  - Partner-supplied GIS files
    - Multiple, ongoing Partnership Programs with tribal, state, and local governments
  - Imagery
  - Census Bureau field data collection operations
    - Address Canvassing (once a decade)
    - Community Address Update System (continuous)
    - Current Surveys (various)
    - The Economic Census (planned updates)
Example address record: 123 Testdata Road, Anytown, CA 94939 (Lat 37 degrees, 9.6 minutes N; Lon 119 degrees, 45.1 minutes W)
Callout labels: Address Updates; Address & Street Updates
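The sample record gives its coordinates in degrees and decimal minutes. A minimal sketch of converting that notation to signed decimal degrees follows; the function name `ddm_to_decimal` is hypothetical, and the sign convention (south and west negative) is the usual GIS convention rather than anything stated on the slide.

```python
def ddm_to_decimal(degrees: int, minutes: float, hemisphere: str) -> float:
    """Convert degrees plus decimal minutes to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    if not 0 <= minutes < 60:
        raise ValueError("minutes must be in the range [0, 60)")
    value = degrees + minutes / 60.0
    # South and west are negative by convention.
    return -value if hemisphere in ("S", "W") else value


# Coordinates of the fictitious "123 Testdata Road" record on the slide.
lat = ddm_to_decimal(37, 9.6, "N")
lon = ddm_to_decimal(119, 45.1, "W")
print(f"{lat:.4f}, {lon:.4f}")
```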
The First Major Decennial Census Operation: Address Canvassing
- Conducted the year prior to the decennial census to build an up-to-date Address List that is used to mail or hand deliver a census questionnaire to every household
- A successful operation translates to a more accurate address list and, ultimately, a more accurate count of the population
- Census Bureau field staff canvass by comparing what they find on the ground to what is already on the Address List
  - Verify, update, or delete addresses currently on the Address List
  - Add addresses missing from the Address List
The First Use of Location Based Services: 2009 Address Canvassing
LBS was first used in last decade's Address Canvassing operation, with the goals of:
- Capturing quality data
  - Collect a GPS structure point for each address, with the ultimate goal of assigning each housing unit to its correct tabulation area 100% of the time
  - Improved positional accuracy requirement
  - Assist field staff in identifying housing units that appear on the Address List two or more times (commonly called "duplicates") with different addresses
  - Road network improvement
- Field data collection efficiency
  - GPS structure points ensure field staff can easily locate the correct housing unit during required follow-up operations
- Cost reduction
  - Ultimately, field staff efficiencies through improved address locatability and decreased travel time (advanced routing and navigation)
  - Handheld computers reduce the large volumes of paper required for past censuses
Using GPS to Identify Duplicate Addresses
[Figure: street map of Main Street and King Street with the labels "107 King Street?" and "Shady Knoll Home"; the same coordinates, Latitude N 34° 18' 00", Longitude W 81° 37' 00", appear three times]
- GPS can help solve the confusion of duplicates by providing an exact location for each address in the MAF/TIGER Database
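The idea that differently named addresses sharing one location are likely duplicates can be sketched as a proximity check. This is an illustration only, not the Census Bureau's matching procedure: the record layout, the function names, the 3 meter threshold, and the "12 Main Street" record are assumptions made for the example. The two labels from the slide are given the slide's coordinates (34° 18' N, 81° 37' W) in decimal degrees.

```python
import math
from itertools import combinations

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, adequate for a sketch


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def likely_duplicates(records, max_distance_m=3.0):
    """Return pairs of differently named addresses whose points nearly coincide."""
    pairs = []
    for a, b in combinations(records, 2):
        close = haversine_m(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_distance_m
        if close and a["address"] != b["address"]:
            pairs.append((a["address"], b["address"]))
    return pairs


records = [
    {"address": "107 King Street", "lat": 34.3, "lon": -81.6167},
    {"address": "Shady Knoll Home", "lat": 34.3, "lon": -81.6167},
    {"address": "12 Main Street", "lat": 34.301, "lon": -81.6167},  # hypothetical neighbor
]
print(likely_duplicates(records))
```

The pairwise loop is quadratic, which is fine for a block-sized list; a national file would need a spatial index instead.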
Preparing for the Successful Use of LBS
- Improving the positional accuracy of the MAF/TIGER Database
  - Realigning existing roads to use GPS technology
  - From variable accuracy (up to 60 meters) to 7.6 meter or better accuracy
  - A 7.6 meter road network supports 3.0 meter GPS structure point collection to ensure placement in the correct census block 99.6% of the time
- Developing Handheld Computers (HHCs)
  - Displayed a GPS-powered You Are Here Indicator (YAHI) for navigating/locating
  - Field staff initiated manual structure point collection
  - Concurrent, behind-the-scenes collection of a GPS structure point; field staff unaware of this collection
  - Ability to collect new roads using GPS
- Additional GPS requirement: post processing
  - Handheld Computers had Wide Area Augmentation System (WAAS) capability, providing 3 meter or better radial accuracy (95%) in an unobstructed environment
  - The 3 meter accuracy coverage was extended by post processing GPS structure points with Continuously Operating Reference Station (CORS) data when the WAAS signal was not available
A Realigned MAF/TIGER Database
[Figure, two panels: "Initial MAF/TIGER Database" and "Realigned MAF/TIGER Database (used in 2009 Address Canvassing)"]
- Inaccurate road locations preclude adopting GPS technology
- Pictured are good GPS structure points over mislocated TIGER street centerlines that place many houses on the wrong side of the road and, therefore, in the wrong census block
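Why a mislocated centerline puts a house in the wrong block can be shown with a planar cross-product test. The sketch rests on assumptions not in the slides: coordinates are in a local planar system in meters, the street segment is a straight line, and the block is chosen by which side of the centerline the point falls on. The function name `side_of_centerline` and the sample coordinates are hypothetical.

```python
def side_of_centerline(start, end, point):
    """Return 'left', 'right', or 'on' for a point relative to a directed segment."""
    (sx, sy), (ex, ey), (px, py) = start, end, point
    # Sign of the 2D cross product of (end - start) and (point - start).
    cross = (ex - sx) * (py - sy) - (ey - sy) * (px - sx)
    if cross > 0:
        return "left"
    if cross < 0:
        return "right"
    return "on"


house = (50.0, 5.0)  # accurate GPS structure point, 5 m north of the true road

# True centerline runs east along y = 0.
true_side = side_of_centerline((0.0, 0.0), (100.0, 0.0), house)

# The same street digitized 10 m too far north (y = 10).
shifted_side = side_of_centerline((0.0, 10.0), (100.0, 10.0), house)

print(true_side, shifted_side)
```

With the true centerline the house falls on the left side; with the shifted centerline the same point falls on the right, so it would be assigned to the block across the street.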
Address Canvassing Results
Addresses:
- 144,890,809 addresses were sent to Address Canvassing
- 134,171,391 addresses went forward to enumeration following Address Canvassing
- Significant reduction of duplicate addresses and incorrect addresses
Roads: a total of 2,756,444 actions
- 623,544 roads added; 97% using GPS
- 758,166 roads deleted
- 1,189,403 road names updated
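The figures above can be checked with a little arithmetic. The sketch uses only the numbers on the slide; the variable names are illustrative. The address difference is a net change (the slide does not break it into adds and deletes), and the road actions beyond the three listed categories are not itemized on the slide.

```python
# Address counts from the slide.
sent_to_canvassing = 144_890_809
forwarded_to_enumeration = 134_171_391

net_reduction = sent_to_canvassing - forwarded_to_enumeration
pct_reduction = 100 * net_reduction / sent_to_canvassing

# Road actions from the slide.
total_road_actions = 2_756_444
listed_actions = {"added": 623_544, "deleted": 758_166, "names_updated": 1_189_403}

listed_total = sum(listed_actions.values())
not_itemized = total_road_actions - listed_total

# Approximate, because the 97% on the slide is itself rounded.
added_with_gps = round(0.97 * listed_actions["added"])

print(f"Net address reduction: {net_reduction:,} ({pct_reduction:.1f}%)")
print(f"Listed road actions: {listed_total:,}; not itemized: {not_itemized:,}")
print(f"Roads added using GPS (approx.): {added_with_gps:,}")
```

The net reduction comes to 10,719,418 addresses, about 7.4% of the list sent to the field.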
Address Canvassing LBS Use: Lessons Learned
- Handheld Computer performance issues impact data collection capabilities
  - Hardware
    - Power requirements (battery life)
    - Usability in extreme conditions (heat/cold)
    - Screen viewing (sunlight)
  - Software
    - Scalability (device freeze due to excessive data)
- GPS technology limitations
  - Signal interference: roofs, porches, awnings, trees, towers, tall buildings
  - Signal distortion: large buildings, large metal objects (trailers, sheds, silos), bare rock faces
  - Collection time length (up to 20 seconds)
- Capture difficulties due to structure characteristics
  - Row houses and closeness factor (to one another and to the road)
  - High-rise buildings; inability to capture indoor front doors
Address Canvassing LBS Use: Lessons Learned
- Procedural issues
  - Where to collect the structure point?
  - GPS structure point capture awareness
    - A new twist on an old problem: curbstoning
  - Determining precedence: manual or GPS structure points
    - Which to trust?
  - Successfully capturing a multitude of structure types
    - Capturing one address for multiple structures
    - Capturing multiple units within one structure
  - Difficulty of GPS road collection
Structure Point Collection: Where?
[Figure: diagram of a structure with its Front Door, Side Door, Back Door, and Garage Door, a Pathway, a Driveway, and the Street, with numbered markers (1 to 4) for the priority GPS capture locations]
- Multiple options resulted in a range of positions relative to the housing unit, with a negative impact on quality
- Census instructions differed from common practices for locating address points in commercial and government organizations
  - Center of rooftop, center of parcel, or at the street
- Impacts the ability to ingest and reconcile data from partner GIS files in the future
Procedure Issue: A Curbstoning Cluster
- According to procedures, field staff should have been at the housing unit while collecting the structure point
- Curbstoning cluster: a group of GPS coordinates representing the field staff's location when recording manual structure points
[Figure legend: manual structure point; GPS location for all manually collected structure points]
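One way to look for such a cluster is to flag device GPS fixes that have many other fixes within a few meters, which suggests that several manual structure points were recorded from one spot. The sketch below is a hypothetical illustration, not the Census Bureau's quality-control procedure: the function names, the 10 meter radius, the minimum of 5 points, and the sample fixes are all assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def approx_distance_m(p, q):
    """Approximate distance in meters between nearby (lat, lon) points."""
    lat1, lon1 = p
    lat2, lon2 = q
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat) * EARTH_RADIUS_M
    dy = math.radians(lat2 - lat1) * EARTH_RADIUS_M
    return math.hypot(dx, dy)


def flag_curbstoning(device_fixes, radius_m=10.0, min_points=5):
    """Return indexes of fixes with at least min_points fixes (itself included) within radius_m."""
    flagged = []
    for i, p in enumerate(device_fixes):
        nearby = sum(1 for q in device_fixes if approx_distance_m(p, q) <= radius_m)
        if nearby >= min_points:
            flagged.append(i)
    return flagged


# Device location at the moment each manual structure point was recorded.
device_fixes = [
    (34.30000, -81.61670), (34.30001, -81.61670), (34.30000, -81.61671),
    (34.30001, -81.61671), (34.30002, -81.61670),  # five fixes within a few meters
    (34.30100, -81.61670), (34.30200, -81.61670),  # fixes taken while moving along the street
]
print(flag_curbstoning(device_fixes))
```

Dense housing such as row houses or apartment buildings would also produce tight groups of fixes, so a flag of this kind could only prompt review, not prove curbstoning.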
Looking (and Planning) Ahead
- Despite some difficulties encountered during the 2009 Address Canvassing, the use of LBS in Census Bureau operations is seen as critical moving forward for producing highly accurate, quality data and supporting the use and analysis of statistical data
- The Census Bureau is moving forward with incorporating LBS as an integral component of all its field data collection operations
  - The Handheld Computer was designed to meet the needs of a single operation: Address Canvassing
  - A second device, the Automated Listing and Mapping Instrument, has been retrofitted with GPS and is being used in the short term for limited current surveys field work
  - There is an acknowledged need for a single Corporate Listing Device and application that has LBS functionality
General Considerations for Future LBS Use
- Accuracy requirements for coordinates that are relevant
  - How accurate is accurate enough?
- Lowest possible cost
- Minimal impact on existing network and equipment
- Support technological innovation throughout the decade
Which Global Navigation Satellite System and Data Correction?
- Carrier Phase
- GPS (L1 & L1 Carrier): USA
- GLONASS: Russia
- GALILEO: EU
- Satellite Based Augmentation Systems (SBAS)
  - Wide Area Augmentation System (WAAS)
- Real Time Kinematic (RTK)
- OmniSTAR
- Data Correction
  - Nationwide Differential GPS (NDGPS)
  - Continuously Operating Reference Stations (CORS)
- NONE?
A Corporate Listing Device
- Must meet the data collection needs of all Census Bureau programs throughout the decade
- We are now developing high-level requirements that define the functions of the Listing and Mapping Application (LiMA)
  - The requirements team includes representatives from all programs that may use the device, to ensure the entire universe of needs is understood
  - The application will be platform independent (executable on many devices)
- There may be multiple devices rather than one; we are currently reviewing devices for:
  - Operating system requirements
  - Data transmission needs
  - GNSS functionality
  - Potential for Bring Your Own Device
Devices?
[Figure labels: HHC; ALMI]
Specific Considerations: Applying Lessons Learned
- Up-front identification of all requirements for all programs
  - Achieve consensus on common requirements
    - Consistency of collection
    - Allows for a single set of procedures used by all programs
  - Also documenting program-specific requirements
  - Programmers participate; alleviates misunderstandings
  - Respond to both via a modular (and Agile) approach to application development
  - Resources and schedule allotted to meet differing program start dates
- The importance of data quality standards
  - Geography Division has developed minimum address standards to ensure the quality of addresses collected in the field
- Refine the use of GNSS structure points
  - A transparent approach: field staff will KNOW when GNSS structure point collection occurs
  - Field staff can initiate more than one attempt at GNSS structure point collection
- The efficiency of collecting roads?
  - An alternate solution is to circle areas of new roads; headquarters staff can then capture them from imagery
- Improved routing / navigation to meet specific field staff needs
The Bigger Picture: Mobile Computing
- Mobile computing is becoming pervasive and presents an organizational challenge
  - Mobile device development cycle is settling down
  - There are more choices for specific devices and functions
  - Our organizational move from in-house development to COTS
  - There are approximately 12 separate, high-level IT activities/approaches to mobile computing at the Census Bureau
- We are stepping back and becoming better organized
  - Developing a Mobile Computing Strategy
  - A roadmap to this decade and beyond
  - Making practical choices
The Census Bureau is committed to the continued use of improved Location Based Services to improve data quality and, ultimately, to provide improved statistical data
Maps from the 2010 Census
How Do Statistics and Spatial Data Combine for Information?
1. Statistical information from the 2010 Decennial Census is reported at the census block level
2. Geographic entities (e.g., census tracts) can be linked to statistical information using the GEO codes, allowing the statistical data to be used for mapping applications
3. Statistical map of Washington, DC, from the Social Explorer website showing the percentage of Black population (76%) within Census Tract 6202 (dark blue polygon)
[Figure labels: White House, Washington, DC; Census Tract 6202]
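Linking geographic entities to statistics through a shared GEO code amounts to a key join. The minimal sketch below uses plain dictionaries; the function name `join_by_geoid`, the record layout, and the geometry placeholder are hypothetical. The 11-digit key 11001006202 is an assumption built from the standard state, county, and tract code layout for Census Tract 6202 in Washington, DC, and the 76% value is the one shown on the slide.

```python
def join_by_geoid(geometries, statistics):
    """Attach statistics to each geographic entity that shares a GEO code."""
    joined = {}
    for geoid, geometry in geometries.items():
        if geoid in statistics:
            joined[geoid] = {"geometry": geometry, **statistics[geoid]}
    return joined


# Spatial side: GEO code mapped to a geometry (placeholder text instead of a polygon).
tract_geometries = {
    "11001006202": "polygon for Census Tract 6202",
}

# Statistical side: GEO code mapped to tabulated values.
tract_statistics = {
    "11001006202": {"pct_black": 76.0},
}

joined = join_by_geoid(tract_geometries, tract_statistics)
print(joined["11001006202"])
```

A mapping application would then shade each polygon by the joined value, as in the Social Explorer example.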
Thank you
Questions?
Leslie Godwin
Assistant Division Chief, Geographic Program Management
Geography Division, U.S. Census Bureau
Leslie.S.Godwin@Census.Gov
Phone: 001-301-763-9077