Assessing Geocoding Solutions

Assessing Geocoding Solutions Carrie Muenks & Chris Lawrence September 9, 2014

Homeland Security Systems Engineering and Development Institute

The Homeland Security Systems Engineering and Development Institute (hereafter "HS SEDI" or "SEDI") is a federally funded research and development center (FFRDC) established by the Secretary of Homeland Security under Section 305 of the Homeland Security Act of 2002. The MITRE Corporation operates SEDI under the Department of Homeland Security (DHS) contract number HSHQDC-09-D-00001.

SEDI's mission is to assist the Secretary of Homeland Security, the Under Secretary for Science and Technology, and the DHS operating elements in addressing national homeland security system development issues where enterprise, lifecycle, and/or acquisition systems engineering expertise is required. SEDI also consults with other government agencies, nongovernmental organizations, institutions of higher education, and nonprofit organizations. SEDI delivers independent and objective analyses and advice to support systems development, decision making, alternative approaches, and new insight into significant acquisition issues. SEDI's research is undertaken by mutual consent with DHS and is organized by Tasks in the annual SEDI Research Plan.

This briefing does not necessarily reflect official DHS opinion or policy. This briefing was prepared for public release.

Approved for Public Release; Distribution Unlimited. Case Number 14-2511

Outline

- Definitions
- Geocoding Requirements
- Methodology
- Quantitative Test Procedure
- Test Datasets
- User Simulation
- Accuracy Analysis
- Qualitative Test Procedure
- Software Requirements
- Scalability of Assessment

Definitions

Geocoding: the process of converting street addresses into spatial data that can be displayed as features on a map.

Geocoding solution: a geocoding engine together with geocoding reference data.

Geocoding engine: the entity in the geocoding framework that drives the geocoding process.
- The engine maps to the reference data source, based on the geographic places (e.g., country code) listed in the non-spatial data file.
- Then, the engine determines the appropriate algorithms for standardizing the addresses and matching them to the reference data.
- Finally, the engine defines parameters for reading address data, matching address data to the reference data, and creating output.

Geocoding reference data: data that a geocoding service uses to determine the geometric representations for locations.
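The relationship between an engine and its reference data can be illustrated with a toy sketch. This is not how any assessed product works: the reference table, the abbreviation table, the function names, and the placeholder coordinates are all hypothetical, and real engines use far more elaborate standardization and fuzzy matching.

```python
import re

# Hypothetical reference data: standardized address -> (latitude, longitude).
# The coordinates are placeholders, not surveyed values.
REFERENCE_DATA = {
    "100 MAIN ST, SPRINGFIELD, VA": (38.0, -77.0),
}

# Hypothetical abbreviation table used during standardization.
STREET_TYPES = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD"}


def standardize(address):
    """Uppercase, collapse whitespace, and abbreviate street types."""
    text = re.sub(r"\s+", " ", address.strip().upper())
    tokens = []
    for token in text.split(" "):
        core = token.rstrip(",")        # token without trailing commas
        suffix = token[len(core):]      # the commas that were stripped
        tokens.append(STREET_TYPES.get(core, core) + suffix)
    return " ".join(tokens)


def geocode(address):
    """Exact-match the standardized address against the reference data."""
    key = standardize(address)
    coords = REFERENCE_DATA.get(key)
    if coords is None:
        return None  # no match in the reference data
    return {"latitude": coords[0], "longitude": coords[1], "matched_address": key}


print(geocode("100  main Street, Springfield, va"))
print(geocode("1 Nowhere Rd"))
```

The sketch separates the three engine responsibilities named above: selecting reference data (here a single table), standardizing and matching, and producing output.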

I need a geocoding solution

- What are the requirements for a geocoding solution and/or output?
- Do you just need information from the vendor or do you want to independently test the geocoding solution?
- How soon do you need to make a decision?
- Is cost an issue?

Geocoding Requirements

- Accuracy: latitude and longitude coordinates in relation to what is true on the ground.
- Precision: the level of precision (i.e., decimal places) needed within the latitude and longitude coordinates.
- Positional accuracy: acceptable latitude and longitude coordinates are dependent on the use case, especially for international locations.
- Reference data coverage: the reference data can affect the accuracy, precision, and positional accuracy of the output.
- Geodetic aspects: knowledge of the coordinate system and projection of the output data is needed.
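To make the precision requirement concrete, the sketch below estimates the ground distance represented by the last decimal place of a coordinate. It assumes a spherical approximation of roughly 111,320 meters per degree of latitude; the constant and the function name are illustrative choices, not part of the assessment.

```python
import math

# Approximate length of one degree of latitude in meters (assumed constant;
# the true value varies slightly with latitude).
METERS_PER_DEGREE_LAT = 111_320.0


def resolution_m(decimal_places, latitude_deg=0.0):
    """Return (north-south, east-west) ground distance in meters for one
    unit in the last decimal place of a coordinate at the given latitude."""
    step_deg = 10.0 ** -decimal_places
    north_south = step_deg * METERS_PER_DEGREE_LAT
    # Longitude lines converge toward the poles, so scale by cos(latitude).
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


for places in range(3, 7):
    ns, ew = resolution_m(places, latitude_deg=38.9)
    print(f"{places} decimal places: {ns:.2f} m north-south, {ew:.2f} m east-west")
```

Under this approximation, five decimal places correspond to about one meter north-south, which is one way to decide how many decimal places a use case needs.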

Geocoding Requirements (continued)

- Processing environment: currency (how up to date the data is) and reference data can differ between disconnected and web-based environments.
- Data structure: structured vs. unstructured data formats can affect the ability of the geocoder to assign latitude and longitude coordinates.
- Output information: at a minimum, output should include latitude and longitude coordinates, positional accuracy, the address associated with the coordinates, and a confidence score, so that users can understand the output properly.
- Cost: cost can be a limiting factor and would likely influence any decision for a geocoding solution.

Geocoding Requirements (continued): larger enterprise considerations

- Performance: the number of addresses processed within a specified time and the number of concurrent users.
- Customization: the ability to customize the user experience or processing options. Up-front customizations might reduce the processing time.

Methodology: Three-Phased Approach

Discovery Phase, Review Phase, Analysis Phase

[Figure: COTS products enter the Discovery Phase; some are eliminated before the Review Phase, and more are eliminated before the Analysis Phase.]

Methodology

- Discovery Phase: exploration (i.e., industry survey) of the product space to identify geocoding solutions.
- Review Phase: qualitative study of product capabilities according to vendor-provided and publicly available resources against a set of requirements.
- Analysis Phase: quantitative and qualitative study of product capabilities and performance based on hands-on use.
- Cost: proposals would be solicited during the Analysis Phase.

Quantitative Test Procedure

The test procedure (Analysis Phase) comprised two approaches:

- User Simulation: a three-tiered method to simulate increasing degrees of the user's perceived understanding in using each of the geocoders.
- Accuracy Assessment: generate the latitude and longitude coordinates for each grouping of test data and assess the additional output fields.

[Figure: Geocoding Processes]

Test Datasets

Truth Data: records obtained from authoritative sources where the latitude, longitude, and positional accuracy of the addresses were considered trusted.

Synthetic Data: truth data records where elements within the address were intentionally and systematically altered to simulate dirty data. Alterations:
- Nonexistent state
- Incorrect state abbreviation
- Nonexistent country code
- Incorrect postal code
- Spacing
- Incorrect street type
- Incorrect ordering
- Incomplete address
- Misspelling
- Multiple issues
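A minimal sketch of how synthetic records could be derived from a truth record follows. The Address fields, the specific alterations, and the sample address are hypothetical; the briefing does not describe how the alterations were actually produced. Each alteration is deterministic, so the same truth record always yields the same dirty variants.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Address:
    """Hypothetical structured truth record."""
    number: str
    street: str
    street_type: str
    city: str
    state: str
    postal_code: str

    def one_line(self):
        return (f"{self.number} {self.street} {self.street_type}, "
                f"{self.city}, {self.state} {self.postal_code}")


def misspell(word):
    """Swap the first two characters to simulate a typo."""
    return word[1] + word[0] + word[2:] if len(word) > 1 else word


def make_variants(truth):
    """Return a few systematically altered versions of one truth address."""
    wrong_type = "Ave" if truth.street_type != "Ave" else "St"
    return {
        "nonexistent_state": replace(truth, state="ZZ").one_line(),
        "incorrect_street_type": replace(truth, street_type=wrong_type).one_line(),
        "misspelling": replace(truth, street=misspell(truth.street)).one_line(),
        # Remove the first space to simulate a spacing error.
        "spacing": truth.one_line().replace(" ", "", 1),
        # Drop the postal code to simulate an incomplete address.
        "incomplete_address": replace(truth, postal_code="").one_line().strip(),
    }


truth = Address("100", "Main", "St", "Springfield", "VA", "22150")
for issue, text in make_variants(truth).items():
    print(f"{issue}: {text}")
```

Keeping the issue name alongside each variant lets the later accuracy analysis report results per type of dirty data.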

User Simulation

Designed to simulate the various skill levels of users and how they typically approach interacting with new software.

1. The tester launched the geocoder software and attempted to geocode the test dataset without any external guidance or documentation.
   - Simulates a user who tries to figure out the software on his or her own.
2. The tester read the geocoder's documentation and then attempted to geocode.
   - Simulates a user who wanted to be more informed, and was driven to be so, prior to working with a new software product.
   - Simulates a user who had failed attempts with the first, non-methodical approach.
3. The tester contacted the vendor to confirm the recommended approach to processing the test dataset via the geocoder.
   - Simulates a trained or less novice user.
   - Simulates a user who experienced failed attempts with the previous methods and was now seeking help desk support.

Accuracy Analysis

Results were binned according to the positional accuracy of the test data for each record and the positional accuracy of the geocoder's output for that same record. The bins were directional: a record where the geocoder's output was at a less coarse level (e.g., parcel) than the truth data (e.g., street range) went into a different bin from a record where the truth data was less coarse than the output.

Bin   Truth Positional Accuracy   Geocoder Positional Accuracy
1     Parcel                      Parcel
2     Parcel                      Street Centerpoint
3     Street Range                Street Centerpoint
4     Street Range                Parcel
5

Accuracy was determined by calculating the distance between the truth latitude and longitude coordinates and the geocoder's output latitude and longitude coordinates for the same address.
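The distance calculation and directional binning can be sketched as follows. The briefing does not state which distance formula was used; this example assumes the haversine great-circle formula on a sphere with a mean radius of 6,371 km, and the record layout and function names are hypothetical.

```python
import math
from collections import defaultdict

# Mean Earth radius in meters (assumption of the spherical model).
EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def group_errors(records):
    """Group distance errors by the ordered pair
    (truth positional accuracy, geocoder positional accuracy).
    The pair is ordered, so the bins are directional."""
    bins = defaultdict(list)
    for truth_level, t_lat, t_lon, out_level, o_lat, o_lon in records:
        error = haversine_m(t_lat, t_lon, o_lat, o_lon)
        bins[(truth_level, out_level)].append(error)
    return dict(bins)


# Hypothetical records: (truth level, truth lat, truth lon,
#                        output level, output lat, output lon)
records = [
    ("Parcel", 38.9000, -77.2000, "Parcel", 38.9001, -77.2000),
    ("Street Range", 38.9000, -77.2000, "Parcel", 38.9000, -77.2000),
]
for key, errors in group_errors(records).items():
    print(key, [round(e, 1) for e in errors])
```

Within each bin, the list of distances can then be summarized (for example, by median or maximum) to compare geocoders on like-for-like records.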

Qualitative Assessment

- The qualitative analysis (Analysis Phase) focused on areas that are not easily quantifiable but were important to this assessment.
- Total scores for each geocoding solution were calculated based on answers to the qualitative questions.
- The importance of a factor was handled through weighting.
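A weighted total of this kind might be computed as in the sketch below. The factor names, the weights, and the 0 to 5 scoring scale are illustrative assumptions; the briefing does not give the actual questions, scale, or weights.

```python
def weighted_total(scores, weights):
    """Sum of weight * score over all weighted factors.

    Raises ValueError if a weighted factor has no score, so that a missing
    answer cannot silently lower a solution's total.
    """
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[factor] * scores[factor] for factor in weights)


# Hypothetical weights (importance) and scores (0 to 5) for one solution.
weights = {"usability": 3, "documentation": 2, "security": 5}
scores = {"usability": 4, "documentation": 2, "security": 3}

print(weighted_total(scores, weights))  # 3*4 + 2*2 + 5*3 = 31
```

Applying the same weights to every candidate keeps the totals comparable across geocoding solutions.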

Qualitative Requirements

- Availability: amount of time the system must be operational and available for use.
- Reliability: the probability of failure on demand.
- Data Retention: amount of time data must be stored and archived.
- Robustness: the degree to which the system is able to handle error conditions gracefully, without failure.
- Scalability: the ability to handle a wide variety of system configuration sizes.

Qualitative Requirements (continued)

- Interoperability: the ability of two or more diverse systems or components to exchange information and use the information that has been exchanged.
- Maintainability: the ability of a system to be maintained through updates, upgrades, and failure.
- Portability: a property of software that enables it to be transferred from one environment to another.
- Security: the ability to protect systems, information, and services from unintended or unauthorized access, change, or destruction.
- Auditability: the ability to log, review, and analyze events, transactions, and effectiveness.

Qualitative Requirements (continued)

- Transition: the ability to load required data from various sources into the system for operations, with data changed as necessary for system use.
- Usability/Human Factors/User Interface/Aesthetics: the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction.
- Documentation: conditions for user-focused and/or technical materials that describe the use and the operation of the system.
- Resources/Resource Management: conditions that address responsibilities for acquisition or monitoring of personnel and equipment for the development, operation, and support of the product.
- Regulatory/Programmatic: specific laws and regulations that constrain COTS solution selection.

Scalability of Assessment

- Number of primary requirements (functional and non-functional)
- Consideration of secondary requirements
- User's characteristics
- Extent of testing needed
- Time frame and manpower
- Availability of a test environment and trial licenses
- Lifespan of the solution
- Needed scalability of the solution within your organization

Questions?

Carrie Muenks, cmuenks@mitre.org
Chris Lawrence, clawrence@mitre.org
McLean, Virginia