APPENDIX A UNDERSTANDING SOCIETY: THE UK HOUSEHOLD LONGITUDINAL STUDY (UKHLS)

This is a short introduction to Understanding Society: The UK Household Longitudinal Study (UKHLS). It summarises the main characteristics of the study discussed in Longhi and Nandi (2014) A Practical Guide to Using Panel Data. For further details see the user guide: McFall, Stephanie L. (ed.) (2013) Understanding Society UK Household Longitudinal Study: Wave 1-3, 2009-2012, User Manual. Colchester: University of Essex. You can find the user guide, interactive online documentation, questionnaires, fieldwork and technical documents at https://www.understandingsociety.ac.uk/documentation.

The Survey

The UKHLS is a multi-purpose household panel survey of a sample drawn from the non-institutionalised residential population of the UK in 2009. It is similar in a number of ways to the British Household Panel Survey (BHPS, discussed in a separate section of this online Appendix). The key differences from the BHPS include a much larger sample size, an over-sample of ethnic minorities, a health and biomarkers component and a wider geographical spread.

The sample of approximately 30,000 households and 77,000 individuals was drawn from the Postcode Address Small Users File list of domestic addresses (for the Great Britain samples) and the Land and Property Services Agency list of domestic addresses (for the Northern Ireland sample). The sample has several components. One of these is the General Population Sample (GPS): the sub-sample for Great Britain has a clustered and stratified sample design, while the sub-sample for Northern Ireland (NI) has a simple random sample design and a selection probability that is twice that of the Great Britain sub-sample. An additional over-sample of around 4,000 ethnic minority households consisting of 13,000 individuals, the Ethnic Minority Boost (EMB) sample, was selected from areas of Great Britain with a high concentration of ethnic minorities. This too has a clustered and stratified design. From the second wave onwards, households that had not dropped out after the 18th wave of the BHPS became eligible for inclusion in the UKHLS, resulting in an additional 6,600 interviewed households consisting of about 16,500 individuals. The different sub-samples are summarised in Table 1 at the end of this document.

Compared to the BHPS, the UKHLS is much more geographically dispersed and the cluster sizes are smaller: 18 addresses were selected from each of the 2,640 primary sampling units (PSUs). For comparison, in the BHPS sample an average of 33 addresses was selected from each of 400 PSUs (250 in the original sample and 75 each in the Scottish and Welsh boost samples).

The UKHLS is an indefinite life panel survey, with no refreshment sample planned as yet. Sample members are interviewed once a year, and the fieldwork for each wave is spread over 24 months, so the fieldwork periods of consecutive waves overlap. However, given their small sample sizes, the GPS-NI and the BHPS sub-samples are interviewed over the first 12 months of the fieldwork period.

The Survey

All household members of responding households in wave 1 (except for non-ethnic minority members of EMB households) are considered to be Original Sample Members (OSM). OSMs are followed wherever they go as long as they reside in the UK. Children of OSM mothers are also considered to be OSMs. Others who join OSM households, and non-ethnic minority members of EMB households, are considered to be Temporary Sample Members (TSM). TSMs are only interviewed as long as they are co-resident with at least one OSM.
Any TSM who becomes the father of an OSM child becomes a Permanent Sample Member (PSM). PSMs have the same following rules as OSMs.

In the UKHLS, data are collected using a set of survey instruments similar to that of the BHPS:

Household and enumeration grid: interviewers collect information about who lives in the household, their relationships to each other and some basic information about them such as marital status, age and sex.

Household questionnaire: information about the residential property, household expenditures, assets and so on is collected from a knowledgeable adult in the household.

Adult face-to-face interview questionnaire: all adults (aged 16 years or over) are asked for detailed factual and subjective information about themselves.

Proxy interview questionnaire: basic factual information about non-responding adults is collected from their spouse, partner or adult child who knows them well.

Adult self-completion questionnaire: adult respondents are also asked to fill in a self-completion questionnaire, which may include questions on sensitive topics.

Youth self-completion questionnaire: young persons in the household aged between 10 and 15 are asked to fill in a self-completion questionnaire, which includes questions that are particularly important for understanding the lives and experiences of young persons, such as eating habits, experiences of bullying and computer usage.

Currently the UKHLS does not conduct interviews by telephone, except for the BHPS telephone sample members, who continue to be interviewed by telephone after they join the UKHLS sample. Unlike in the BHPS, the telephone questionnaire in the UKHLS is the same as the face-to-face questionnaire.

To allow comparison of ethnic minority groups with each other and with the white majority group, an extra five minutes is set aside for questions of particular importance to ethnicity related research. These "extra five minutes" questions are asked of adults in the following sub-samples: the EMB sample; a random sub-sample of 500 households in the GPS, also referred to as the General Population Comparison (GPC) sample; and all ethnic minority adult respondents in the GPS who were living in low ethnic minority concentration areas at wave 1.

Data files

Understanding Society data are provided in Stata, SPSS and TAB formats. Information collected in each wave is made available in a separate set of data files, each file corresponding to one data source. Across waves, file names have the same root name with a letter prefix identifying the wave, followed by an underscore: a_ for the first wave, b_ for the second wave, and so on. For example, all information collected in the household questionnaire in the first wave is provided in the file a_hhresp, while all information collected from household members aged 16 or above in the third wave is provided in c_indresp.
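The file naming convention can be sketched in a few lines of Python. The helper names wave_prefix and wave_file are hypothetical and not part of any UKHLS tooling; the sketch assumes that waves are numbered from 1 and map to consecutive lowercase letters.

```python
import string


def wave_prefix(wave):
    """Return the letter prefix for a wave number: 1 -> 'a_', 2 -> 'b_', ..."""
    if not 1 <= wave <= 26:
        raise ValueError("wave must be between 1 and 26")
    return string.ascii_lowercase[wave - 1] + "_"


def wave_file(wave, root):
    """Build a wave-specific file name from a root name such as 'hhresp'."""
    return wave_prefix(wave) + root


print(wave_file(1, "hhresp"))   # a_hhresp
print(wave_file(3, "indresp"))  # c_indresp
```

Cross-wave files such as xwavedat carry no prefix, so a helper like this would only apply to the wave-specific files.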
Information about all individuals in responding households, including children and non-respondents, is available in w_indall (here we use w as a placeholder for any wave prefix). To make it easier for users to access fixed information collected at different waves (such as year and country of birth, and highest qualification obtained), this information is stored in an individual level cross-wave file called xwavedat (note the absence of a wave prefix in the file name). A list of file names and their content is provided in Table 2 at the end of this document.

Variable names follow a naming convention similar to that of file names. For example, the variable identifying the main activity status is called a_jbstat in the first wave, b_jbstat in the second wave and so on. All derived or generated variables in the UKHLS have the suffix _dv. For example, the generated variable for usual monthly pay is called w_paygu_dv. The only variables without wave prefixes are the cross-wave unique person identifier pidp, the BHPS cross-wave identifier pid, and the fixed information variables in xwavedat.

The unique within-wave household identifier is called w_hidp. There is no concept of a longitudinal household in the UKHLS and so there is no unique cross-wave household identifier. Within each wave a person can also be uniquely identified by the combination of household identifier and person number (w_hidp and w_pno). These identifiers can be used to merge individual and household level files, as well as individual or household level files across waves. For further details see Chapter 5 of Longhi and Nandi (2014) A Practical Guide to Using Panel Data.

Missing values in the UKHLS are: -1, -2, -7, -8, -9, -10 and -11. These values represent don't know (-1), refusal (-2), not asked in proxy interview (-7), valid skips or not applicable (-8), inconsistent or implausible values (-9), no data from BHPS 1-18 (-10) and no data from UKHLS (-11).

Data for all sub-samples (EMB, GPC, BHPS and so on) are provided in the same files. Cases from the different sub-samples can be identified by the variable hhorig in cross-wave files and w_hhorig in any of the wave specific files. See Table 18 of McFall (2013) for a list of the key variables in the dataset.

BHPS respondents in UKHLS

From wave 2 onwards all individual level datasets contain the variable pid, the individual cross-wave identifier for BHPS sample members. This can be used to link the UKHLS data of the BHPS sample members to their BHPS data for the previous 18 waves. Hence, BHPS respondents have valid data for both pid and pidp (the same person has two different identification numbers, one for the BHPS and one for the UKHLS), while UKHLS respondents who were not part of the BHPS have missing values for pid. Individuals who joined BHPS households for the first time after the BHPS became part of the UKHLS also have missing values for pid.

The 18th wave of the BHPS was fielded from September to December 2008, and the sample members were next interviewed as part of the UKHLS between January and December 2010.
Some sample members were therefore interviewed after approximately one year, and others after about two years. Many panel data methods assume equal time intervals between observations; if you are building a panel dataset from the BHPS that includes the UKHLS component, it may be preferable to treat the 2nd wave of the UKHLS as the 20th wave of the BHPS instead of the 19th.

Identifying other household members

The UKHLS also provides indicator variables for identifying the spouse, partner, parents and grandparents. Other family members can be identified using the file w_egoalt. For details see Chapter 6 of Longhi and Nandi (2014) A Practical Guide to Using Panel Data.
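The identifiers and missing-value codes described under Data files can be combined in a short Python sketch that attaches household level fields to individual records of the same wave and blanks out the missing-value codes. This is an illustration under stated assumptions rather than a recommended workflow: the function names, the records and the household variable a_hhvar_example are invented for the example, and blanket recoding of negative values is only appropriate for variables that cannot legitimately take those values.

```python
# Missing-value codes and their meanings, as listed in the text.
MISSING_CODES = {
    -1: "don't know",
    -2: "refusal",
    -7: "not asked in proxy interview",
    -8: "valid skip or not applicable",
    -9: "inconsistent or implausible value",
    -10: "no data from BHPS 1-18",
    -11: "no data from UKHLS",
}


def clean(value):
    """Return None for a UKHLS missing-value code, else the value unchanged."""
    return None if value in MISSING_CODES else value


def merge_individual_household(ind_rows, hh_rows, prefix):
    """Attach household fields to individual rows, matching on <prefix>hidp."""
    hid = prefix + "hidp"
    households = {row[hid]: row for row in hh_rows}
    merged = []
    for person in ind_rows:
        combined = dict(households.get(person[hid], {}))
        combined.update(person)  # individual values win on shared names
        merged.append({name: clean(val) for name, val in combined.items()})
    return merged


# Hypothetical wave 1 records, invented for illustration only.
individuals = [
    {"pidp": 1, "a_hidp": 100, "a_pno": 1, "a_jbstat": 2},
    {"pidp": 2, "a_hidp": 100, "a_pno": 2, "a_jbstat": -7},
]
households = [{"a_hidp": 100, "a_hhvar_example": 3}]

for row in merge_individual_household(individuals, households, "a_"):
    print(row)
```

Because there is no cross-wave household identifier, a merge like this only makes sense within a single wave; linking individuals across waves would rely on pidp instead.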

Sample Weights

As different sub-samples of the UKHLS were selected with different selection probabilities and there is non-response and attrition, weights are needed to produce unbiased population estimates based on sample statistics. Different weights are provided for different types of analyses. The data provide cross-sectional and longitudinal weights for each wave which correct for unequal selection probability and non-response or attrition, as well as weights that only correct for the sample design. Some weights are to be used for individual level analysis and others for household level analysis. The sampling design variables which represent the strata and primary sampling units are also available. See McFall (2013) for details on the computation of these weights and guidance on which weights to use for specific types of analyses.

History Files

In the UKHLS most of the data are collected by prospective methods. The exceptions are the marital, fertility and employment histories, which are collected retrospectively when respondents are interviewed for the first time. The UKHLS thus collects information on the marital, fertility and employment histories of respondents before they joined the panel, as well as changes in these domains between interviews. The initial employment history was collected for only a quarter of the sample in wave 1 and is collected for the rest of the sample in wave 5. The history files are multi-level files where each row is identified by the individual and the specific spell. They include information on the type of spell, its start and end dates and whether the spell is still ongoing (see Table 2 at the end of this document, and the UKHLS website for details).

Health and Biomarkers

Health and biomarker data were collected by nurses five months after the wave 2 interview from a sub-sample of the wave 2 GPS, and after the wave 3 interview from a sub-sample of the BHPS sample.
The eligible sub-samples were people resident in Great Britain who gave a full interview in English at these waves. During the nurse visits, subject to consent, a range of health measures was collected from adult respondents, such as height, weight, grip strength, lung function and blood spots. This resulted in nurse health assessments for around 20,000 individuals, and health assessments as well as blood samples for around 13,000 individuals. These data are available as a separate set of data files which can be combined with the UKHLS main data using the variable pidp. The data structure and naming conventions of the data files and variables are similar to the main survey files. These data files have a suffix _ns. For more information see the Health Assessment user guide and online documentation at https://www.understandingsociety.ac.uk/documentation.

Innovation Panel

A household panel survey called the Innovation Panel (IP), with a sample of approximately 1,500 households drawn from Great Britain, is fielded one year prior to the UKHLS main survey. For the first few waves the main purpose of this survey was to inform the UKHLS on survey methodological issues. It is now increasingly used for research on more general survey methodology issues related to longitudinal surveys. The Innovation Panel data are available as a separate set of data files. The data structure and naming conventions of the data files and variables are similar to the main survey files. These data files have a suffix _ip. The Innovation Panel should not be used together with the main sample for analysis. For further information on the IP see the user guide and online documentation available at https://www.understandingsociety.ac.uk/documentation.

Table 1 Description of Understanding Society samples

GPS-GB and GPS-NI are the Great Britain and Northern Ireland components of the General Population Sample; EMB is the Ethnic Minority Boost Sample; BHPS is the British Household Panel Survey sample.

Count                                              GPS-GB  GPS-NI     EMB  BHPS (wave 2)    Total
Issued households                                  48,144   2,395  44,769          8,992  104,300
Responding households                              24,797   1,292   4,080          6,692   36,861
Enumerated individuals in responding households    60,597   3,351  13,361         16,562   93,871
Adult respondents (excluding proxy respondents)    39,050   1,997   6,685         11,260   58,992
Proxy respondents                                   2,536      91     635            450    3,712
Telephone respondents                                   -       -       -            326      326
Youth respondents                                   3,783     212     904          1,117    6,016
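As a worked check of the arithmetic in Table 1, the following Python sketch verifies that the four sample components add up to the reported totals, and that the GPS and EMB samples together account for the "approximately 30,000 households and 77,000 individuals" quoted in the description of the sample. The variable and function names are illustrative, and entering the dashes in the telephone row as 0 is an assumption made for the check.

```python
# Counts copied from Table 1, in the order GPS-GB, GPS-NI, EMB, BHPS (wave 2),
# each followed by the reported total. Dashes in the table are entered as 0
# (an assumption made for this check).
TABLE_1 = {
    "Issued households": ([48144, 2395, 44769, 8992], 104300),
    "Responding households": ([24797, 1292, 4080, 6692], 36861),
    "Enumerated individuals": ([60597, 3351, 13361, 16562], 93871),
    "Adult respondents": ([39050, 1997, 6685, 11260], 58992),
    "Proxy respondents": ([2536, 91, 635, 450], 3712),
    "Telephone respondents": ([0, 0, 0, 326], 326),
    "Youth respondents": ([3783, 212, 904, 1117], 6016),
}


def row_is_consistent(components, total):
    """True when the component counts sum to the reported total."""
    return sum(components) == total


for label, (components, total) in TABLE_1.items():
    status = "ok" if row_is_consistent(components, total) else "MISMATCH"
    print(f"{label}: {status}")

# GPS (Great Britain and Northern Ireland) plus EMB, excluding the BHPS sample.
gps_emb_households = sum(TABLE_1["Responding households"][0][:3])
gps_emb_individuals = sum(TABLE_1["Enumerated individuals"][0][:3])
print(gps_emb_households, gps_emb_individuals)  # 30169 77309
```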

Table 2 Description of Understanding Society data files

Substantive information

w_indall
  Contains: Information on the household and enumeration grid
  Responses provided by: Any adult in the household
  Each row uniquely identified by: pidp, or w_hidp and w_pno

w_hhresp
  Contains: Responses from the household questionnaire
  Responses provided by: Knowledgeable adult in the household
  Each row uniquely identified by: w_hidp

w_indresp
  Contains: Responses from adult individual interviews (face-to-face, telephone, self-completion and proxy). Also includes interviewer remarks about the interview process
  Responses provided by: Adults (aged 16 and over) and interviewers
  Each row uniquely identified by: pidp, or w_hidp and w_pno

Responses from adult (face-to-face) individual interviews, provided by adults (aged 16 and over):

w_income
  Contains: Information on all income sources since last interview, one row for each income source of each individual
  Each row uniquely identified by: pidp and w_fiseq, or w_hidp, w_pno and w_fiseq

w_marriage(a), w_cohab(a), w_empstat(a)
  Contains: History of marriages, cohabitation and employment statuses before the start of the survey. One row for each marriage, cohabitation and employment spell of each individual
  Each row uniquely identified by:
    w_marriage(a): w_hidp, w_pno and w_marno
    w_cohab(a): w_hidp, w_pno and w_cohabno
    w_empstat(a): w_hidp, w_pno and w_spellno

w_child(a), w_natchild(a), w_newborn(b,c), w_parstyle(c), w_chmain(c)
  Contains: Information about natural, adopted/step children (including non-resident children) of the adult respondent
  Each row uniquely identified by:
    w_child(a): pidp, or w_hidp and w_pno
    w_natchild(a): w_hidp, w_pno and w_childno
    w_newborn(b,c): pidp and w_newchno, or w_hidp, w_pno and w_newchno
    w_parstyle(c): pidp and w_childpno, or w_hidp, w_pno and w_childpno
    w_chmain(c): pidp and w_absparno, or w_hidp, w_pno and w_absparno

w_youth
  Contains: Responses from the youth questionnaire
  Responses provided by: 10-15 year olds in the household
  Each row uniquely identified by: pidp, or w_hidp and w_pno

Sampling information and paradata

w_hhsamp
  Contains: Sampling information and information from the ARF (includes information on non-responding households)
  Responses provided by: Survey organisation and interviewer
  Each row uniquely identified by: w_hidp

w_indsamp
  Contains: Household location and interview outcome information about every person in fielded households
  Responses provided by: Interviewer
  Each row uniquely identified by: pidp and w_finloc

w_issue
  Contains: Interview outcome at each issue
  Responses provided by: Interviewer
  Each row uniquely identified by: w_hidp and w_issueno

w_callrec
  Contains: Interview outcome at each call
  Responses provided by: Interviewer
  Each row uniquely identified by: w_hidp, w_issueno and w_callno

xivdata
  Contains: Interviewer information
  Responses provided by: Survey organisation
  Each row uniquely identified by: intnum

Derived files

w_egoalt
  Contains: Relationship between every pair of household members
  Each row uniquely identified by: pidp and w_apidp, or w_hidp, w_pno and w_apno

xwavedat
  Contains: Fixed information about everyone in every enumerated household
  Each row uniquely identified by: pidp

xwaveid
  Contains: Interview outcome in every wave
  Each row uniquely identified by: pidp
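The last column of Table 2 matters in practice because several files hold more than one row per person, so merging on pidp alone would duplicate records. A minimal Python sketch of a uniqueness check follows; the function name duplicate_keys and the example rows are hypothetical, and the rows only imitate the shape of a wave 1 income file.

```python
from collections import Counter


def duplicate_keys(rows, key_vars):
    """Return the key tuples that occur more than once in rows."""
    counts = Counter(tuple(row[var] for var in key_vars) for row in rows)
    return [key for key, n in counts.items() if n > 1]


# Hypothetical rows shaped like a_income: one row per income source per person.
income_rows = [
    {"pidp": 1, "a_fiseq": 1},
    {"pidp": 1, "a_fiseq": 2},
    {"pidp": 2, "a_fiseq": 1},
]

print(duplicate_keys(income_rows, ["pidp", "a_fiseq"]))  # []
print(duplicate_keys(income_rows, ["pidp"]))             # [(1,)]
```

An empty list for the full key confirms that each row is uniquely identified, while the second call shows why pidp on its own is not a sufficient key for a file of this kind.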