Health Record Linkage at Statistics Canada


Nicole Aitken, Philippe Finès
Statistics Canada
Thursday, November 16th, 2017
www.statcan.gc.ca
Telling Canada's story in numbers

Why use linked data?
- Harnessing the full potential of data
- Innovations in linking data
- Improve the care and health of Canadians
- High analytical potential: allows researchers to fill data gaps

What is Record Linkage?
- A process whereby personal identifiers are used to identify the same people in different data sources
- Example identifiers: name, date of birth, health card number, postal code
- Example linkage: Canadian Health Measures Survey to Canadian Cancer Registry

Record linkage at Statistics Canada
- Secure virtual linkage environment that stores only personal identifiers in a protected depository, which is used to generate linkage keys across data sources
- Keys are stored separately from data
- Does NOT create large integrated databases of survey information about individuals
- Strong governance, adherence to policy and privacy requirements
- Suite of services, tools and support for analysts and external researchers conducting record linkage activities within the social domain

How does it work?
[Figure not transcribed]

Linked Data Available to All in the RDCs

Process to access linked RDC data
Secondary use of existing linked data sources
- Have a research question
- Submit a project proposal
- Complete the application form
- Access the data in a Research Data Centre (RDC) following standard RDC procedures

What linked health data are available in the RDC now?
- Census 2006 to Discharge Abstract Database (2006 to 2008)
- Canadian Community Health Survey (CCHS) Annual (2000 to 2011) and Focus (1.2, 2.2 and 4.2) to:
  - Canadian Vital Statistics Deaths (CVSD; 2000 to 2015)
  - Discharge Abstract Database (1999/2000 to 2012/13)
- 1991 and 2001 Canadian Census Health and Environment Cohorts (CanCHEC)
  - Weights will be available by the end of the calendar year
- Perinatal Outcomes (2006 Canadian Birth-Census Cohort)

What linked health data are coming to the RDC?
- DAD (2000/01-2014/15), NACRS (2000/01-2014/15) and OMHRS (2005/06-2014/15) to CVSD (2000-2012)
- CVSD (2008-2014) to DAD (2004/05-2014/15) and NACRS (2004/05-2014/15)
- Canadian Cancer Registry (1992 to 2014) to deaths (1992 to 2014)
- 1996 CanCHEC followed for mortality to 2013 (with weights)
- 2001 CanCHEC followed for cancer to 2013 (with weights)
Note:
  DAD = Discharge Abstract Database
  NACRS = National Ambulatory Care Reporting System
  OMHRS = Ontario Mental Health Reporting System
  CVSD = Canadian Vital Statistics Death Database
  CanCHEC = Canadian Census Health and Environment Cohort

What linked health data are coming to the RDC? (continued)
- Canadian Cancer Registry (CCR; 1992 to 2014) to DAD and NACRS
- Tax (income data)
- 2016 Census
- Longitudinal Immigration Database (IMDB)
- CVSD
- Canadian Community Health Survey (CCHS) Annual (2003-2014) and Focus (1.2, 5.2) to Longitudinal Immigration Database (1980-2013)
- Occupational cohorts:
  - National Dose Registry to CVSD and CCR
  - Newfoundland Fluorspar Miners cohort to CVSD and CCR

For more information
- HSD Record Linkage Mailbox: statcan.hsdrecordlinkage-dsscouplageenregistrements.statcan@canada.ca
- Evan Green: Evan.Green@Canada.ca

Transition to Part II

Summary
- Details on the databases
- 3 linked databases
  - The Census-DAD linked database
  - The CCHS-CMDB-DAD linked database
  - The Census-Tax-Mortality-Cancer linked database

Details on the databases

Canadian Community Health Survey (CCHS)
- Large, biennial, cross-sectional survey (~130,000); after 2007, annual survey (~65,000)
- Covers the household population aged 12+, representing ~98% of the population
  - Excludes members of the regular Forces, the institutionalized population, Indian Reserves, and some remote areas
- Regular collection since 2000/01
  - Core content: health status, risk behaviours, chronic conditions, socioeconomic indicators
- Focus content since 2002
  - Topics include mental health (Cycle 1.2), food intake (Cycle 2.2), aging (Cycle 4.2)
  - Sample size (~30,000)

Census
Long form (20% representative sample of the Canadian household population)
- Income: personal, household, source
- Immigration: time of immigration, world region of birth, generational status
- Ethnicity
- Household composition: marital status, relationship of occupants, living arrangements
- Housing: type, tenure, need of repair
- Collective dwellings: rooming houses, hotels and shelters
- Language: mother tongue, home language, knowledge of official language
- Disability status
- Rural-urban residence
- Indigenous status
- ... and more

Discharge Abstract Database (DAD)
- Obtained from the Canadian Institute for Health Information (CIHI)
  - DAD 2005/06 through 2008/09 used for pre-processing
  - DAD 2006/07 through 2008/09 used for record linkage
- Census of discharges from acute care hospitals (~3 million records per year; excludes Quebec)
- Contains demographic, non-medical administrative and clinical information (diagnostics and interventions)
- Use of resources is captured by the Resource Intensity Weights which, used in combination with costs of hospital stays (per day), can be used to derive costs
- Able to count events and also to create patient histories by linking hospitalizations at the person level using personal health numbers

Mortality and place of residence
- Canadian Vital Statistics Death Database (CVSD)
  - 2000 to 2009
  - Census of deaths in Canada
  - Underlying cause of death, date of death, age at death
- Tax file
  - 1990 to 2009
  - Tax filers
  - Annual place of residence (postal code on tax return)

Some words on validation
Two parts of validation:
- Internal validation: quality of the linkage (error rates)
  - Do the linked pairs represent good links?
  - Are there any missed links among the non-linked pairs?
- External validation: quality of the linked data (representativeness of the analytical file)
  - Do the outcomes in the linked data file represent the experiences of the population of interest?
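The two internal-validation questions correspond to two error rates. The sketch below is illustrative only: it assumes a small validation sample for which the true matches are known (for example, from clerical review), which the slides do not describe. The function name and the record identifiers are hypothetical.

```python
def linkage_error_rates(linked_pairs, true_pairs):
    """Return (false_match_rate, missed_match_rate) for a validation sample.

    linked_pairs: pairs (id_a, id_b) accepted by the linkage.
    true_pairs:   pairs (id_a, id_b) known to be the same person
                  (hypothetical "truth" set, e.g. from clerical review).
    """
    linked_pairs = set(linked_pairs)
    true_pairs = set(true_pairs)
    false_links = linked_pairs - true_pairs   # accepted, but not the same person
    missed_links = true_pairs - linked_pairs  # same person, but not accepted
    false_match_rate = len(false_links) / len(linked_pairs) if linked_pairs else 0.0
    missed_match_rate = len(missed_links) / len(true_pairs) if true_pairs else 0.0
    return false_match_rate, missed_match_rate


# Made-up identifiers, for illustration only.
linked = {("c1", "d1"), ("c2", "d2"), ("c3", "d9"), ("c4", "d4")}
truth = {("c1", "d1"), ("c2", "d2"), ("c3", "d3"), ("c4", "d4"), ("c5", "d5")}
fmr, mmr = linkage_error_rates(linked, truth)
print(f"false match rate: {fmr:.2f}, missed match rate: {mmr:.2f}")
```

The first rate answers "do the linked pairs represent good links?" and the second answers "are there any missed links among the non-linked pairs?".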

1) 2006 Canadian Census and Discharge Abstract Database Linkage

Context
- To better understand the health outcomes and healthcare use of specific sub-populations
  - Immigrants, Indigenous groups
- Identify and quantify differences
- Understand differences in the context of other social determinants of health

Research areas
- Immigrant research
  - Comparative analysis of hospitalizations by immigrant status, source country and time since immigration
  - Use of hospital services among immigrant seniors
  - Multi-generational analysis of cardiovascular-related hospitalizations: is the health advantage lost among the second generation?
- Aboriginal research
  - Comparative analysis of hospitalization rates among Indigenous groups, on and off reserve
  - Impact of housing condition on respiratory-related hospitalizations among First Nations on reserve

2006 Census Cohort: DAD follow-up
- 2006 long-form census: age & sex, education & income, employment, immigration status, ethnicity
- Linked to the Discharge Abstract Database

Step 1: Data Preparation
- Eligibility of records for linkage:
  - Complete (non-missing) date of birth in both Census and DAD
  - Statistical linkage key must be unique in Census: no duplicates (e.g. multiple births removed)
  - Statistical linkage key associated with only one Health Insurance Number (HIN) in DAD
- Hierarchical deterministic linkage
  - Unique statistical linkage key: date of birth, sex, postal code
  - Used postal code information from HSTF as an alternative to capture change in address over time
  - Series of exact matches: a conservative approach, but appropriate given the lack of unique identifying information
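The eligibility rules and the series of exact matches can be sketched as follows. This is a minimal illustration under stated assumptions, not the production procedure: the field names (`dob`, `sex`, `postal`, `tax_postal`), the two-pass ordering, and the rule that a HIN is linked at most once are all hypothetical choices made for the example, and every record is made up.

```python
from collections import Counter, defaultdict


def census_index(records, fields):
    """Map linkage key -> census id, keeping only complete keys that are unique."""
    keys = [tuple(r.get(f) for f in fields) for r in records]
    counts = Counter(keys)
    return {k: r["id"] for k, r in zip(keys, records)
            if None not in k and counts[k] == 1}


def dad_index(records, fields):
    """Map linkage key -> HIN, keeping only keys tied to exactly one HIN."""
    hins = defaultdict(set)
    for r in records:
        k = tuple(r.get(f) for f in fields)
        if None not in k:
            hins[k].add(r["hin"])
    return {k: next(iter(v)) for k, v in hins.items() if len(v) == 1}


def hierarchical_link(census, dad, passes, dad_fields=("dob", "sex", "postal")):
    """Run exact-match passes in order; earlier passes take precedence."""
    d_idx = dad_index(dad, dad_fields)
    links, used_hins = {}, set()
    for pass_no, census_fields in enumerate(passes, start=1):
        for key, cid in census_index(census, census_fields).items():
            hin = d_idx.get(key)
            # Skip records already linked, keys with no DAD match, and HINs
            # already assigned (an assumed extra safeguard, not from the slides).
            if cid in links or hin is None or hin in used_hins:
                continue
            links[cid] = (hin, pass_no)
            used_hins.add(hin)
    return links


census = [
    {"id": "c1", "dob": "1960-01-02", "sex": "F", "postal": "K1A0B1", "tax_postal": "K1A0B1"},
    {"id": "c2", "dob": "1975-05-20", "sex": "M", "postal": "M5V2T6", "tax_postal": "H3Z2Y7"},  # moved
    {"id": "c3", "dob": "1980-07-07", "sex": "F", "postal": "V6B1A1", "tax_postal": "V6B1A1"},  # twin
    {"id": "c4", "dob": "1980-07-07", "sex": "F", "postal": "V6B1A1", "tax_postal": "V6B1A1"},  # twin
    {"id": "c5", "dob": None, "sex": "M", "postal": "R3C0V8", "tax_postal": "R3C0V8"},          # no date of birth
]
dad = [
    {"hin": "h1", "dob": "1960-01-02", "sex": "F", "postal": "K1A0B1"},
    {"hin": "h1", "dob": "1960-01-02", "sex": "F", "postal": "K1A0B1"},  # second stay, same person
    {"hin": "h2", "dob": "1975-05-20", "sex": "M", "postal": "H3Z2Y7"},
    {"hin": "h3", "dob": "1980-07-07", "sex": "F", "postal": "V6B1A1"},
]
passes = [("dob", "sex", "postal"), ("dob", "sex", "tax_postal")]
links = hierarchical_link(census, dad, passes)
print(links)
```

In this toy data, the twins share a key and are excluded, the record with no date of birth is ineligible, and the person who moved is only found in the second pass through the alternative postal code.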

Step 2: Record Linkage
- 2006 Census (keys) linked to 2006-2009 DAD (keys) by hierarchical deterministic linkage
- On a match: deterministic linkage using PHIN and P/T to link to the other 2006-2009 DAD transactions
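Once a census record has been matched to a DAD key, the person's other hospital transactions can be collected with an exact join on PHIN and P/T. The sketch below assumes a hypothetical layout in which each transaction carries `phin`, `prov` and `admit` fields; the P/T code is kept in the key on the assumption that the same health number can occur in more than one jurisdiction.

```python
from collections import defaultdict


def attach_transactions(links, transactions):
    """Return census id -> list of that person's transactions, oldest first.

    links:        census id -> (phin, prov) from the key-level linkage.
    transactions: list of dicts with hypothetical keys "phin", "prov", "admit".
    """
    by_person = defaultdict(list)
    for t in transactions:
        by_person[(t["phin"], t["prov"])].append(t)
    return {cid: sorted(by_person.get(person, []), key=lambda t: t["admit"])
            for cid, person in links.items()}


# Made-up records, for illustration only.
transactions = [
    {"phin": "111", "prov": "ON", "admit": "2007-11-15"},
    {"phin": "111", "prov": "ON", "admit": "2006-05-01"},
    {"phin": "111", "prov": "BC", "admit": "2008-02-03"},
    {"phin": "222", "prov": "ON", "admit": "2006-09-09"},
]
key_links = {"c1": ("111", "ON"), "c2": ("111", "BC")}
histories = attach_transactions(key_links, transactions)
for cid, stays in histories.items():
    print(cid, [s["admit"] for s in stays])
```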

Research Results (Carriere G, Bougie E et al. Health Reports, August 2016)
[Figure: bar chart] Age-standardized acute-care hospitalization rates (ASHR) per 100,000 non-institutionalized population, by Aboriginal identity and by diagnostic chapter, Canada (excluding Quebec), combined fiscal 2006/2007 through 2008/2009
- Axis: rate per 100,000 population
- Diagnostic chapters: digestive; injuries; respiratory; circulatory system; mental and behavioural disorders; endocrine, nutritional, metabolic; genitourinary; musculoskeletal, connective tissue
- Groups: First Nations living on reserve; First Nations living off reserve; Métis; Inuit living in Inuit Nunangat; Non-Aboriginal
Source: Census of Population 2006, Census-linked Discharge Abstract Database 2006/2007, 2007/2008, 2008/2009 pooled.
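An age-standardized rate can be illustrated with direct standardization: age-specific rates are weighted by a standard population's age distribution so that groups with different age structures can be compared. The slide does not state which method or standard population was used, so the sketch below is an assumption, and all counts are made up.

```python
def age_standardized_rate(events, population, standard, per=100_000):
    """Direct standardization: weight each age-specific rate by the
    standard population's share of that age group."""
    total_std = sum(standard.values())
    rate = 0.0
    for age, std_pop in standard.items():
        rate += (std_pop / total_std) * (events[age] / population[age])
    return rate * per


# Hypothetical counts for one group, for illustration only.
events = {"0-39": 50, "40-64": 120, "65+": 300}             # hospitalizations
population = {"0-39": 10_000, "40-64": 8_000, "65+": 5_000}  # persons at risk
standard = {"0-39": 500, "40-64": 300, "65+": 200}           # standard population

crude = sum(events.values()) / sum(population.values()) * 100_000
asr = age_standardized_rate(events, population, standard)
print(f"crude rate: {crude:.1f}, age-standardized rate: {asr:.1f} per 100,000")
```

Here the crude and standardized rates differ because the hypothetical group is older than the standard population.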

2) Canadian Community Health Survey (CCHS) linked to Canadian Vital Statistics Death Database (CVSD) and Discharge Abstract Database (DAD)

Background
- Enhance the capacity of health data to address complex questions with value-added information: fill data gaps
  - Survey data: lots of socio-economic and risk factor information, but no outcomes
  - Administrative data: outcome information (hospitalization, mortality), but limited individual information
- Linked data allow for a population health lens on the study of health care services and outcomes
  - Used to study a wider range of determinants of health care use and outcomes of care
  - Population-based studies on a representative sample of Canadians
  - Large sample sizes: study specific populations and rare events
  - Opportunity for comparisons across provinces and territories

Research examples
1. To understand the interaction between socio-economic and behavioural risk factors and their effect on the use and cost of hospital services
2. To understand the extent to which differences in the prevalence of risk factors in Canada explain the variation in the use of hospital services
3. To examine the interaction between risk factors, ambient air pollution exposures, mortality, and the use of hospital services

Canadian Community Health Survey Cohorts
- CCHS survey cycles
  - Aged 12 or older at time of survey
  - Some population exclusions (~2% of population)
  - Quebec excluded for DAD and NACRS linkages
- Survey content: socioeconomic, ethno-cultural, health status, health behaviours, health care use
- Linked to: Canadian Vital Statistics Death Database, Discharge Abstract Database
- Residential mobility through time

Main strengths & limitations
- Strengths
  - Population based
  - Rich source of information on the cohort characteristics and outcomes
  - Large sample size
  - Able to examine several variables simultaneously
  - Multilevel analysis
- Limitations
  - Information collected at one point in time (changes in risk factors are not captured)
  - Some population exclusions (reserves, children)

3) 1991 Canadian Census Health and Environment Cohort, aka CanCHEC

Context
- Greater focus on understanding potential inequalities in health outcomes
- Vital statistics, registries and health administrative data lack individual identifiers (ethnicity, Indigenous identity) or characteristics
- Identification of differences in mortality across socio-economic characteristics for a number of populations
  - Immigrants, ethnic origins, First Nations, Métis, and Inuit
- Produce baseline indicators of mortality for monitoring health disparities
  - Life expectancy & mortality by detailed population groups (occupation, education, income groups)

Research areas
- Sub-population analysis
  - First Nations, Métis, immigrants (year of immigration), place of birth, ethnic origin, etc.
- Analysis by socioeconomic status
  - Income (source, household, individual), education (years, qualifications), occupation, industry, type of housing, marital status
- Multi-dimensional analysis
- Exposure analysis
  - Assign exposure via postal code representative points
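Exposure assignment via postal code representative points can be sketched as two lookups: postal code to representative point, then point and year to an exposure value. The sketch below is hypothetical throughout: the lookup tables, the grid-cell identifiers, the PM2.5 values and the choice to average over the years with data are all assumptions made for illustration.

```python
def mean_exposure(residence_by_year, rep_points, exposure):
    """Average a person's annual exposure over the years where it can be assigned.

    residence_by_year: year -> postal code (e.g. from annual tax returns).
    rep_points:        postal code -> grid cell of its representative point.
    exposure:          (grid cell, year) -> exposure value.
    Returns None when no year can be assigned an exposure.
    """
    values = []
    for year, postal in sorted(residence_by_year.items()):
        cell = rep_points.get(postal)
        value = exposure.get((cell, year)) if cell is not None else None
        if value is not None:
            values.append(value)
    return sum(values) / len(values) if values else None


# Made-up lookups and values, for illustration only.
rep_points = {"K1A0B1": "cell_17", "M5V2T6": "cell_42"}
pm25 = {("cell_17", 1991): 6.0, ("cell_17", 1992): 7.0, ("cell_42", 1993): 11.0}
residence = {1991: "K1A0B1", 1992: "K1A0B1", 1993: "M5V2T6", 1994: "X0X0X0"}

person_pm25 = mean_exposure(residence, rep_points, pm25)
print(person_pm25)
```

Because the residence history follows the person from year to year, a move between postal codes changes the assigned exposure; the year with an unknown postal code is simply skipped in this sketch.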

1991 Census Cohort: mortality & cancer follow-up
- 1991 long-form census (n = 2.7 million): age & sex, education & income, employment, immigration status, ethnicity
- Linked to: Canadian Vital Statistics Death Database, Canadian Cancer Registry
- Residential mobility through time
- Environmental exposures: satellite-derived PM2.5, NO2, O3; land use regression models; point sources of pollution

1991 census cohort: eligibility
- Enumerated on 1991 census long form (1 in 5 households*)
- Aged 25 or older as of June 4, 1991
- Not a usual resident of an institution
- N = 3,576,487
- Note that 3.4% of the Canadian population of all ages were not enumerated by the census
- Linkage approval for 15% of persons aged 25+
* Note that all residents of Indian Reserves and remote northern communities receive the long-form questionnaire

1991 census cohort: cohort creation
- Eligible census respondents linked to tax filer data (non-financial) in order to get names
  - Matching variables: sex, date of birth, postal code, spousal date of birth
  - Results: 80% linkage rate, 99% correct links
  - Cohort is slightly biased to those of higher socioeconomic status
- Deterministic linkage to annual place of residence and Longitudinal Worker File
- Probabilistic linkage to mortality and cancer

How good was the cohort?

Characteristic                        Cohort      In-scope*
Total (count)                         2,734,835   3,576,485
Sex (%)
  Male                                49.7        48.6
  Female                              50.3        51.4
Age (%)
  25 to 44                            54.5        52.6
  45 to 64                            30.0        30.5
  65+                                 15.4        16.9
Educational attainment (%)
  Less than secondary graduation      34.9        37.8
  Secondary graduation or higher      65.1        62.2
Income adequacy quintile (%)
  Quintile 1 - poorest                17.2        20.0
  Quintile 5 - richest                21.5        20.0

* In-scope refers to all individuals who were enumerated by the long form, were aged 25+, and were not a resident of an institution
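One simple way to read this comparison is to take the difference, in percentage points, between the cohort and the in-scope population for each category. The sketch below uses only the percentages shown on the slide; the variable names are hypothetical.

```python
# (cohort %, in-scope %) as shown on the slide
shares = {
    "Male": (49.7, 48.6),
    "Female": (50.3, 51.4),
    "25 to 44": (54.5, 52.6),
    "45 to 64": (30.0, 30.5),
    "65+": (15.4, 16.9),
    "Less than secondary graduation": (34.9, 37.8),
    "Secondary graduation or higher": (65.1, 62.2),
    "Quintile 1 - poorest": (17.2, 20.0),
    "Quintile 5 - richest": (21.5, 20.0),
}

# Positive values mean the category is over-represented in the cohort.
diffs = {name: round(cohort - in_scope, 1)
         for name, (cohort, in_scope) in shares.items()}

for name, diff in diffs.items():
    print(f"{name:<32} {diff:+.1f}")
```

The signs show that the cohort over-represents men, people aged 25 to 44, secondary graduates and the richest quintile, and under-represents people aged 65+, those with less than secondary graduation and the poorest quintile.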

Results: survival
[Figure: line chart] Percentage surviving to various ages in Canada for 1995-1997 and 2002 (average) compared to cohort for 1991-2006
- Horizontal axis: age, 25 to 90
- Vertical axis: percentage surviving, 0 to 100
- Series: Men (life tables), Men (cohort), Women (life tables), Women (cohort)

Research results: Income and Education
[Figure] Remaining life expectancy (at 25) by educational attainment within each income adequacy quintile and for each sex, 1991-2006 follow-up
Source: 1991 Canadian census cohort: mortality and cancer follow-up study (1991-2006)

Main strengths & limitations
- Strengths
  - Population based
  - Large sample size (rare outcomes, small population groups)
  - Able to examine several variables simultaneously
  - Long latency period required for cancer outcomes
  - Multilevel analysis
  - Captures residential mobility over a 27-year period (environmental exposure via the use of postal code representative points)
- Limitations
  - Census characteristics only measured at baseline (1991)
  - No information on health behaviours
  - Some population exclusions
    - Non tax filers, those under the age of 25, institutional residents at cohort inception, those not enumerated by the 1991 long-form census

Thank you!
Philippe Finès, philippe.fines@canada.ca

Record linkage at StatCan
- General information: http://www.statcan.gc.ca/eng/record/gen
- For Health: http://www.statcan.gc.ca/health-sante/link-coup-eng.htm
- Statistics Canada's official directives on our record linkage activities: http://www.statcan.gc.ca/eng/record/policy4-1
- List and description of previously approved record linkage activities: http://www.statcan.gc.ca/eng/record/summ
- Social Data Linkage Environment (SDLE): http://www.statcan.gc.ca/eng/sdle/index
- Research Data Centers (click on DRD linkage status for a list of data sources that are already linked in which you may be interested)
  - The Research Data Centres (RDC) Program: http://www.statcan.gc.ca/eng/rdc/index
  - List of RDCs: http://www.statcan.gc.ca/eng/rdc/network
  - List of datasets currently available in the RDCs: http://www.statcan.gc.ca/eng/rdc/data
  - Application process and guidelines: http://www.statcan.gc.ca/eng/rdc/process
- HSD Record Linkage Mailbox: statcan.hsdrecordlinkage-dsscouplageenregistrements.statcan@canada.ca