
UNITED NATIONS

LIMITED
ST/ECLA/Conf.43/L.4
11 July 1972
ORIGINAL: ENGLISH

ECONOMIC COMMISSION FOR LATIN AMERICA

SEMINAR ON THE PREPARATION AND USE OF POPULATION AND HOUSING CENSUS TABULATIONS

Organized by the United Nations, through the Economic Commission for Latin America, the Statistical Office of the United Nations, the United Nations Trust Fund for Population Activities with the collaboration of the Latin American Demographic Centre

Santiago, Chile, 14-19 August 1972

COMPUTER REVIEW AND EDIT OF A POPULATION AND HOUSING CENSUS

By Howard G. Brunsman
United States Bureau of the Census

72-7-1652

The purpose of the error detection and correction programme is to improve the accuracy of the census data. As indicated in the United Nations paper,* errors may arise because the information recorded on the census schedule is not a correct representation of the facts or because of errors in the various stages of processing the data from the enumeration document to the resulting statistical tables.

The United Nations paper refers to the three types of errors that might be detected. These are:

(a) Omission of entries
(b) Inadmissible entries
(c) Inconsistent entries

All three types might occur either in the data recorded on the census schedule or in the processing of the data. An entry on the schedule might be omitted because the respondent did not know the answer to the question. The recorded data might be inadmissible. For example, the reported place of birth might be non-existent. Or the recorded data may be inconsistent with other characteristics. For example, highest grade completed may be too great for the age of the person. Reported bathing equipment may be inconsistent with water supply.

Some of the errors may be corrected by a detailed examination of the schedule. A missing sex report may be supplied by an examination of given name. Inadmissible place of birth might result from relatively minor misspelling. The only way in which many errors may be corrected is by reinterviewing the respondent. Obviously this approach is not feasible on a wholesale basis for a nation-wide census.

Many of the errors result from the various processing operations. The most common errors arise from improper manual coding and in the process of converting data to machine readable form. Errors of these types can be corrected by reexamining the enumeration schedule, but it is very expensive to search out the enumeration schedule and to determine the required correction. And if the error proves to be in the recording of the data on the schedule, nothing is gained by the search. Also, if the errors are relatively rare, the accuracy of the data is not seriously impaired if these are corrected by imputation.

* The Use of Computers for Error Detection and Correction in Population and Housing Census, prepared by the United Nations Statistical Office, April 1971.
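As an illustration only, the three types of errors named above (omission, inadmissible entry, inconsistent entry) can be sketched in Python. The field names, the code list for sex, the age limit and the rule relating grade completed to age are assumptions made for the example and are not taken from the United Nations paper or from any census programme.

```python
# Hypothetical code list and limits; a real census defines its own.
VALID_SEX_CODES = {"1", "2"}
MAX_AGE = 120
SCHOOL_ENTRY_AGE = 5  # assumed: grade g cannot be completed before age g + 5


def check_person(record):
    """Return a list of (item, error_type) pairs for one person record."""
    errors = []

    sex = record.get("sex")
    if sex in (None, ""):
        errors.append(("sex", "omission"))
    elif sex not in VALID_SEX_CODES:
        errors.append(("sex", "inadmissible"))

    age = record.get("age")
    grade = record.get("grade_completed")
    if age is None:
        errors.append(("age", "omission"))
    elif not 0 <= age <= MAX_AGE:
        errors.append(("age", "inadmissible"))
    elif grade is not None and age < grade + SCHOOL_ENTRY_AGE:
        # Each entry is admissible alone, but the pair is not plausible.
        errors.append(("grade_completed", "inconsistent"))

    return errors
```

A record with sex code "4", age 7 and grade 9 completed would be reported with one inadmissible entry and one inconsistent entry; the function only detects errors and leaves the correction to a later step.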

Inconsistencies in the data might be revealed by a case-by-case examination of each record or by the preparation of a detailed cross tabulation of the variables. For example, inconsistencies in the relationship of grade attending to age might be detected by tabulating detailed grade by detailed age. Adjustments might then be made in the table before publication. But such adjustments might produce inconsistencies with the tables for smaller areas that show grade attending but not by age. Therefore it is better to locate the inconsistent cases and change them before any tabulations are prepared.

As indicated in the United Nations paper, the role of the computer can be restricted to the detection of records with errors without the automatic correction of the errors. In the days of the unit count tabulator, the tabulator could process a batch of punch cards and sort into a separate pocket all cards with a specific type of omission or inconsistency. These cards were corrected manually and the corrected records inserted in the file. In corresponding manner the computer programme can print a report showing the content of all defective records and the nature of the defect. The programme can produce duplicates of defective cards so that only the defective items need be corrected.

The fact that the computer is able to detect certain types of errors must not be used as an excuse to eliminate all verification. The recorded information may be incorrect even though it is fully consistent. The computer error detection programme might be used to identify the work units with an excessive number of errors. The presence of errors that the computer is able to identify may suggest the presence of additional errors. Work units with excessive errors might then be subjected to a more thorough verification.

In many cases the true nature of an error is not immediately apparent. An inadmissible sex code may result from the puncher skipping an item. In such a case the relationship code of "4" might appear in the sex column where only codes of "1" or "2" are acceptable. An error of this type will place incorrect data in many of the characteristics. It is better to reprocess such records rather than supply the data by allocation. Also an improper housing unit identification code on the record for one person may create an apparent error in the count of persons in the housing unit, and the appearance of several groups of persons with the same identification but no housing unit. Cases of this type also should be taken care of by reprocessing rather than allocation.
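The use of the detection programme to single out work units for fuller verification might be sketched as follows. This is a minimal Python sketch under stated assumptions: the record layout, the `has_error` test supplied by the caller and the 5 percent threshold are all hypothetical, since the text does not say what level of error is to be regarded as excessive.

```python
from collections import defaultdict


def flag_work_units(records, has_error, threshold=0.05):
    """Return the work units whose share of defective records exceeds threshold.

    records   -- iterable of dicts, each with a 'work_unit' key (assumed layout)
    has_error -- function taking a record and returning True if it is defective
    threshold -- assumed cut-off for an "excessive" error rate
    """
    totals = defaultdict(int)
    defective = defaultdict(int)
    for record in records:
        unit = record["work_unit"]
        totals[unit] += 1
        if has_error(record):
            defective[unit] += 1
    return sorted(
        unit for unit in totals if defective[unit] / totals[unit] > threshold
    )
```

The function only counts the errors that the computer can see; as the text notes, a high count is a signal that the unit may also contain errors that no consistency check can detect.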

The automatic correction of errors may consist of treating inadmissible and inconsistent entries the same as omitted entries. For many characteristics this is the best procedure. This is especially true for characteristics with many categories, such as occupation or industry or country of birth.

The inclusion of a "Not Reported" category forces the user to exercise judgement in the interpretation of the data. For example, in some age groups, the number of persons not reporting school attendance might be greater than the number not attending school. The interpretation of such data would be quite different if it is assumed that the "Not Reported" group is attending school, is not attending, or is distributed in the same proportion as those reporting on attendance. Thus the inclusion of a "Not Reported" category warns the user to exercise caution in the use of the data. This is a warning that might be lacking if the computer programme (or, more realistically, the programmer) had decided how these cases should be classified.

For certain characteristics it is preferable to have the programme assign a code in all cases where errors are detected. The programme is able to assign values to missing variables by taking account of other characteristics of the record. It is preferable to have the programme assign a code for sex in all such cases in order to avoid increasing the cells in each table by one third to provide a "Not Reported" category as well as Total, Male and Female. The programme can take account of whether the person is reported as wife, or is a head with wife present, and also of whether fertility is reported or the activity is shown as housewife. Even when these clues are not available, it is preferable to assign sex taking account of relationship to head and marital status.

The allocation of age avoids the difficult decision of whether to include or exclude persons with age not reported from tables that are restricted to persons 15 years old and over, 5 years old and over, 5 to 29 years old, etc. The programme is also able to impute a missing age on the basis of characteristics of the person or the presence and characteristics of related persons. Economic activity is not reported for children. School attendance and education are not reported for persons under 5. There is usually a relatively close relationship between the age of the wife of head and the age of the head, and between the age of the wife of head and the age of the oldest child. There is also a close relationship between age and grade attending for a person attending school.

In corresponding manner, it is better to assign a value to marital status and relationship when they are not reported.
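The earlier point about the "Not Reported" category for school attendance can be made concrete with a small calculation. The Python sketch below computes the attendance rate for one age group under the three assumptions mentioned in the text; the function name and the counts in the usage note are invented for illustration and are not census figures.

```python
def attendance_rates(attending, not_attending, not_reported):
    """Attendance rate under three assumptions about the 'Not Reported' group."""
    total = attending + not_attending + not_reported
    reported = attending + not_attending
    return {
        # Every person not reporting is assumed to attend school.
        "all_attending": (attending + not_reported) / total,
        # Every person not reporting is assumed not to attend.
        "none_attending": attending / total,
        # Persons not reporting are distributed like those who reported.
        "proportional": attending / reported,
    }
```

With hypothetical counts of 600 attending, 100 not attending and 300 not reported, the three assumptions give rates of 90 percent, 60 percent and about 86 percent, which shows how much the user's choice can matter when the unreported group is large.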

School attendance might be shown for persons 6 to 29 years of age. The programme might be able to take account of educational attainment and age for persons not reporting attendance. The group not reporting might be concentrated in persons 25 to 29 years of age who are seldom attending school. Thus the values assigned by the computer may be more reasonable than those assumed by the user.

The analysts are far more likely to accept the computer assignment if they are given a report showing the characteristics that have been changed and the nature of the changes. This report can be a diary summary. It can also include a case-by-case report of all changes that have been made. The analyst can review this report and submit revisions for those records for which the action of the computer is not acceptable.

Within the past year, I have worked with the Executive Office of the Census of Nicaragua in the preparation of a computer edit programme for their 1971 Census of Population and Housing. This programme is being run on an IBM 360/25 with 32K. The input of the programme is in the form of punch cards with a separate card for each housing unit and a separate card for each person. The card for the housing unit is followed by the cards for the persons in the unit. Persons are grouped by household within the housing unit. The output is a binary tape record with each variable expressed as a one or two byte binary number. The output also contains an indication of which variables have been changed by the edit process. Thus it is possible to prepare tabulations showing the number of cases where the variable has been changed, the distribution of the variable for cases where no change has been made, and the distribution of cases supplied by the edit process.

The programme prints a report for each work unit showing:

1. Total number of records.
2. Number of records with one or more errors, and as a percent of total records.
3. Total number of errors, and as a percent of total records.
4. Total number of housing records.
5. Number of housing records with one or more errors.
6. The total number of housing errors and the distribution of housing errors by type.
7. Similar numbers and distribution for population records.

In addition to the diary by work units the programme is able to produce a record of the content of the input record before edit and an indication of which variables have been changed and their value after the change. This record-by-record report may be shown for all records, or for the records for a housing unit and its occupants when any allocation has been made for the group, or the programme can suppress the report for the separate records. The record-by-record report may be reviewed manually. In most cases it is found that the computer revisions are satisfactory. When they are not satisfactory a supplementary set of cards may be prepared for these housing units and their occupants. These records may be inserted in the basic record in place of the defective records.

Unfortunately, we have not been able to construct an edit programme that is actuated by parameter cards in the same manner as the CENTS programme. We have taken account of the desirability of facilitating the process of adapting the programme to other countries. We have made very liberal use of subroutines and macros. The variables are referenced with mnemonic labels. Age is referenced as AGE (or should it be EDAD?). Only minor changes are required to adjust for the fact that the variable is in a different location in the input or output record. The Computer Methods Laboratory of the U.S. Census Bureau would be able to adapt the programme to the census of some other country in a small fraction of the time required to prepare the Nicaragua programme.

The edit process is performed by reading the record for the housing unit and for the members of the first household in the unit, and processing this first household. Then it reads records for the next household and processes it. The record for the housing unit is edited after a new housing unit is encountered.
Then the record for the housing unit and all of its occupants are written on tape. By this method we are able to check characteristics of related persons for consistency and to use relationships between persons in the Hot Deck process. For example, when a wife of head is present, the missing age of head is supplied by assuming that his age bears the same relationship to that of his wife as that of the preceding head with wife present. Occasionally, this approach leads to inappropriate results and must be rejected. For example, let us assume that a 20 year old head with a 50 year old wife is followed by a head with unknown age and a 25 year old wife. The 20 year old is 30 years younger than his 50 year old wife, but the second head cannot be 30 years younger than his 25 year old wife.

I should like to outline some of the edits that are performed by the programme.

Sex

Sex is obtained by allocation when it is not reported, using the following rules. Assign as female if reporting one or more children ever born or if reported as wife of head. Assign head as male if wife is present in household. Otherwise assign sex to head from Hot Deck taking account of presence of own children in the household, and of marital status if no children are present.

Assign sex to child of head from Hot Deck. When sex of children is not shown it is likely to be missing for more than one child in a household. Therefore the Hot Deck for this item contains the sex of the last five children. These values are rotated to avoid strings with the same sex.

Otherwise, assign sex from Hot Deck taking account of relationship to head, age and marital status.

Age

The review and edit of age is among the most complex in the programme. A reported age of less than 10 years for head or wife of head is canceled. In corresponding manner a child of the head may be only 15 to 49 years younger than the head and the parent of the head must be at least 15 years older than the head.

Assign age of wife from age of head and difference between age of previous head and wife. If this fails to yield valid age, assign age of wife from age of oldest child and difference between age of previous oldest child and wife. If this fails, assign from previous wife.

Assign age of head from age of wife and difference between age of previous wife and head. If no wife, assign from age of oldest child and difference between age of previous child and head. If no child, assign from previous head by sex and marital status.
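The assignment of the age of head from the age of wife, including the rejection of an impossible result, might be sketched in Python as follows. This is not the Nicaragua programme, which was not written in Python. The class and its starting values are hypothetical, and falling back to the age of the previous head when the difference gives an invalid age is an assumption, since the text states the fall-back order only for the wife. The lower limit of 10 years comes from the rule that a reported age of less than 10 for a head is canceled.

```python
class HeadAgeHotDeck:
    """Keeps the head-minus-wife age difference of the last complete couple."""

    def __init__(self, start_difference=3, start_head_age=35, min_head_age=10):
        # Starting values are assumed; they serve only until the first
        # couple with both ages reported has been read.
        self.difference = start_difference
        self.last_head_age = start_head_age
        self.min_head_age = min_head_age

    def process(self, head_age, wife_age):
        """Return (age of head, True if the age was allocated)."""
        if head_age is not None:
            # Both ages reported: refresh the Hot Deck and change nothing.
            self.difference = head_age - wife_age
            self.last_head_age = head_age
            return head_age, False

        candidate = wife_age + self.difference
        if candidate >= self.min_head_age:
            return candidate, True

        # The difference gives an impossible age, so it is rejected
        # (assumed fall-back: age of the previous head).
        return self.last_head_age, True
```

For the example in the text, processing a head of 20 with a wife of 50 stores a difference of minus 30; the next head, with a wife of 25, would receive an age of minus 5, which is rejected, and under the assumed fall-back he is given the age of the previous head instead.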

For others attending school, assign age from grade attending and difference between age and grade of previous person attending school. For children of head not attending school, assign age from age of wife (or head by sex if no wife) and difference between age of child and previous wife or head. As in the case of sex of child, these differences are retained and rotated for five cases to avoid strings of repetitions. For parent or grandchild, assign age from age of head and difference between previous parent or grandchild and head by relationship. For all others, and when the previous procedures fail to yield a consistent age, assign from previous person by relationship and assumed age.

If literacy is not reported, assign as literate if education is 3 years or more; as illiterate if not.

The subjects covered by the housing edit include the following:

Occupancy status

Change occupancy status to be consistent with presence (or absence) of person records for the unit if inconsistent.

Persons in unit

Change number of persons in unit to agree with number of person records when inconsistent.

Rooms and bedrooms

Hot deck number of rooms from previous unit by type when neither number of rooms nor number of bedrooms is reported. Hot deck number of bedrooms from number of rooms and difference between rooms and bedrooms in a previous unit.

Plumbing

Hot deck water supply by type of structure when no plumbing items are reported. Hot deck water supply from previous unit with toilet or bath for exclusive use, shared toilet or bath, or none when not reported or inconsistent. Hot deck toilet facilities and bathing equipment from previous unit by water supply.
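The Hot Deck rules for rooms and bedrooms listed above might be sketched in Python as follows. The sketch is illustrative only: the record layout, the starting values and the rule that the number of bedrooms is never allowed to fall below zero are assumptions, and a unit reporting bedrooms but not rooms is left unchanged because the text gives no rule for that case.

```python
class RoomsHotDeck:
    """Hot Deck for rooms (kept by type of unit) and the rooms-bedrooms gap."""

    def __init__(self, start_rooms=3, start_difference=1):
        self.rooms_by_type = {}              # last reported rooms, by unit type
        self.start_rooms = start_rooms       # assumed value before any donor
        self.difference = start_difference   # rooms minus bedrooms, last donor

    def edit(self, unit):
        """Edit one housing record in place; return the set of changed items.

        unit is a dict with keys 'type', 'rooms' and 'bedrooms' (assumed
        layout); an unreported item holds None.
        """
        changed = set()
        if unit["rooms"] is not None and unit["bedrooms"] is not None:
            # Fully reported unit: it becomes the donor for later units.
            self.rooms_by_type[unit["type"]] = unit["rooms"]
            self.difference = unit["rooms"] - unit["bedrooms"]
            return changed

        if unit["rooms"] is None and unit["bedrooms"] is None:
            unit["rooms"] = self.rooms_by_type.get(unit["type"], self.start_rooms)
            changed.add("rooms")

        if unit["bedrooms"] is None and unit["rooms"] is not None:
            # Assumed floor of zero so that the result is never negative.
            unit["bedrooms"] = max(unit["rooms"] - self.difference, 0)
            changed.add("bedrooms")

        return changed
```

Returning the set of changed items mirrors the practice described earlier of recording which variables have been changed by the edit, so that allocated values can be tabulated separately from reported ones.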