Citizen Information Project


Annex 2: Stakeholder processes, systems and data 2D:

Final Report: Annex 2D: Version Control

Date of Issue: 14th June 2005
Version Number: 1.0

Version history
Date: /06/05
Issued by: PJ Maycock
Status: Final report

Metadata

Coverage: UK
Creator: Office for National Statistics, General Register Office, Team
Date Issued: 13/6/05
Language: English
Publisher: Office for National Statistics, 1 Drummond Gate, London, SW1V 2QQ
Status: Approved by Project Manager
Subject: Data quality, sharing and processing
Subject.category:
Title: Annex 2D: Final report:

Contents

1. Preface
2. Related documents
3. Data trial objectives
4. Scope and methodology
5. Coverage profiles
6. Critical quality characteristics
   Overview
   Scope and size of datasets
   Duplicate records
   Name and address verification
7. Address quality / validity
   Summary
   PAF compliance
   Address matches across postcode demographic
   Language impacts
   Foreign addresses
8. Address cleansing
9. Linking to NLPG
   Overview of NLPG
10. Dataset matching - methodology
11. Matching by date of birth, name and address details
   Match results
   Interpretation of results
   Matches against each stakeholder dataset
   Identification of duplicate records
   Family composition
   Matching results by demographic
   Influence of address cleansing on overall matching statistics
   Influence of other datasets on matching
12. Matching by date of birth and name elements
   Analysis of address changes
   False matches
   Family composition - matching by date of birth and names

1. Preface

The Final Report recommends the creation of an adult population register that will deliver benefits by sharing basic contact information (name, address, date of birth, etc.) across the public sector. The report recommends that the development of a population register be implemented as part of the ID Cards Scheme by utilising the National Identity Register (NIR) and that, in the interim, a range of short-term data sharing initiatives be explored further.

2. Related documents

Annex 2: Stakeholder processes, systems and data comprises the following documents:
Annex 2A: Overview
Annex 2B: Data quality framework
Annex 2C: Stakeholder profiles
Annex 2D: This document
Annex 2E: Data trial: Appendices
Annex 2F: Current data sharing across government
Annex 2G: Other data quality initiatives

This document provides:
A summary of the objectives, scope and methodology of the data trial.
A summary of the comparative coverage, demographics, quality indexes and matching of the sample datasets.

Detailed results of the comparative analysis are given in Annex 2E: Data trial: Appendices. The analysis of each specific dataset is detailed in Annex 2C: Stakeholder profiles and accompanying appendices.

3. Data trial objectives

The overall objective was to assess the relative and combined quality of basic contact data held within stakeholders' operational systems. This included examining the cost effectiveness of cleaning, matching and quality scoring techniques using samples of stakeholder data, and assessing the implications of applying these techniques to the complete datasets.

To achieve this overall objective, the trial aimed to:

Provide further understanding of the characteristics and anomalies of identity details (e.g. names, date of birth) and contact details held in stakeholders' operational systems.
Identify fitness for purpose of records and fields in the individual and merged datasets by determining appropriate quality level indicator(s).
Obtain a statistical assessment of the matching records between stakeholders' datasets. This includes the percentage of records that can be automatically matched and those that have a reasonable probability of being matched and may justify manual inspection.
Develop best practice guidance on data quality and matching.

4. Scope and methodology

Nine demographics were identified as sample areas, each selected by one of three criteria: name, address or date of birth. The total estimated population across the selected demographics was between 20,000 and 90,000 depending on the dataset, and the composition of these demographics was reviewed with the ONS Methodology group.

The sample sets were chosen to ensure that the following demographics would be covered by the trial:

Demographic
s1: Typical dataset by name
s2: Typical dataset by name
s3: Typical suburban dataset by geographic area (postcode and area name)
s4: Covers name issues and address issues on houses that have been converted into flats (postcode)
s5: Covers a rural area in Scotland (postcode)
s6: Covers issues around Welsh names and addresses (postcode and area name)
s7: Covers issues related to high density urban areas and high rise flat blocks
s8: Dataset by specific date of birth
s9: Covers issues around nominated date of birth being 1st January

The Electoral Roll (2003) contains 83% of the 18+ UK population and, apart from datasets maintained in the private sector, is the closest and most representative available dataset to a comprehensive population register against which other datasets can be compared. Demographics based on the same criteria as the other datasets have been applied to the Electoral Roll and the current population for each demographic determined. The relative size of the Electoral Roll vs Census 2001 was used to correlate the date of birth profiles of the sample datasets with the Census 2001 date of birth profile.

A data sharing protocol was produced and reviewed with the Information Commissioner to provide a robust framework for the legal, secure and confidential sharing of personal information for the trial. A fundamental principle was that the trial outputs will be anonymous and mainly statistical and that the data will be destroyed at the end of the trial. The contractor's data security protocols were audited, inspected and approved by ONS.

The following key stakeholders were identified to participate in the trial by providing sample contact data based on the demographics described above:
Department for Work and Pensions
Driver and Vehicle Licensing Agency
General Register Office
General Register Office for Scotland
HM Revenue and Customs
National Health Service Information Authority
United Kingdom Passport Service

Legal vires for data sharing with the stakeholders were agreed, with the exception of the DWP and the NHSIA, both of which were subsequently unable to provide sample data to participate in the trial.

A procurement exercise identified Siemens Business Systems as a specialist contractor with extensive skills in the areas required to perform the technical aspects of the trial in the most economically effective way.

The participating stakeholders were given the same data-extract specification and provided sample data-extracts to the specialist contractor. The extracts covered basic contact details, such as current name, address and date of birth, for the population within the selected demographics and, in the case of HM Revenue and Customs, historical names and addresses.

The contractor reported on:
Detailed analysis of all input data; for addresses this included comparison with the electoral register dataset and external address datasets (PAF and NLPG)
Assessment of the address data cleansing possible
Analysis of data matching between the sample datasets
Development and application of a data quality index methodology

Subsequently the Atkins Technical Team carried out further analysis of the results and correlation with other information, e.g. database sizes, demographics and census profiles. An Excel model of all de-personalised data and results was created and used to determine and generate:
Appropriate weightings for each dataset and demographic (s1-s7)
Comparative profiles for all datasets
Comparative profiles of demographics within each dataset
Matched profiles for selected demographics and different match criteria

5. Coverage profiles

Analysis of the sample datasets by demographic, and correlation of these results against the same demographics from the Electoral Roll, enabled coverage profiles against date of birth to be generated. These highlighted the following issues:

DVLA dataset demographics s5 Scotland and s7 Birmingham were unrepresentative (4% and 6% of expected population, compared with 79-87% for other demographics). The most likely explanation is that the extract of the data for demographics s3-s7 was substantially based on postcodes and that s5 and s7 comprised postcodes containing a padding character, which invalidated the extract. As a result, s5 and s7 were excluded from the coverage profiles.

The HMRC data extract included historical names and addresses, and the nature of the data structure resulted in additional citizen records being returned for those no longer living within the geographical criteria or no longer meeting the name criteria for the demographic. The data provided by HMRC contained sufficient information to identify these additional identities and exclude them from the analysis.

Where possible the datasets were modified to exclude all citizens known to be deceased, to provide results comparable to the Electoral Roll / Census. However, this information was not available for the DVLA and UKPS datasets and may not be fully current for the HMRC datasets.

There are significant variations of profile between the demographics of the sample datasets, e.g. the age profile varies significantly between s4 London and s6 Wales. This was expected, as demographics s4-s7 were deliberately chosen to reflect atypical situations.

Weightings were applied at dataset and demographic level and a sensitivity analysis was carried out to ensure the most acceptable correlation between the sample datasets and other information, e.g. the Census 2001 profile, the same demographics extracted from the Electoral Roll (representing approx. 83% of the 18+ population), and database sizes relative to the current population.

The coverage profiles are based on the following parameters:
50% of records based on typical name demographics s1 and s2
20% of records based on typical geographical demographic s3 Bournemouth
20% of records based on demographic s4 London
10% of records based on demographic s6 Wales
Dataset weightings to correlate results with actual database sizes obtained from data suppliers (data quality questionnaire):
DVLA: 94%
GRO/S: 84%
HMRC: 108%
UKPS: 96%
Electoral Roll / Census 2001: 110%
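The report does not give the formula used to combine the weightings, so the following is only a minimal sketch of one way the demographic and dataset weightings listed above could be applied to year-of-birth counts. The function name, the pooled group label "s1+s2", the normalisation of each group to proportions and the sample counts are all assumptions made for illustration.

```python
from collections import defaultdict

# Demographic weightings quoted in the text (s1 and s2 pooled as one group).
DEMOGRAPHIC_WEIGHTS = {"s1+s2": 0.50, "s3": 0.20, "s4": 0.20, "s6": 0.10}

# Dataset weightings quoted in the text.
DATASET_WEIGHTS = {"DVLA": 0.94, "GRO/S": 0.84, "HMRC": 1.08, "UKPS": 0.96}


def weighted_profile(counts_by_group, dataset):
    """Combine per-group year-of-birth counts into one weighted profile.

    counts_by_group maps a group name to {year_of_birth: record_count}.
    Assumption: each group is normalised to proportions before its
    demographic weighting and the dataset weighting are applied.
    """
    profile = defaultdict(float)
    scale = DATASET_WEIGHTS[dataset]
    for group, weight in DEMOGRAPHIC_WEIGHTS.items():
        counts = counts_by_group.get(group, {})
        total = sum(counts.values())
        if total == 0:
            continue  # no records for this group, so it contributes nothing
        for year, count in counts.items():
            profile[year] += scale * weight * count / total
    return dict(profile)


# Hypothetical counts, for illustration only (not trial data).
sample = {
    "s1+s2": {1960: 50, 1970: 50},
    "s3": {1960: 100},
    "s4": {1970: 10},
    "s6": {1960: 1, 1970: 3},
}
profile = weighted_profile(sample, "DVLA")
```

With this normalisation the weighted shares for a dataset sum to its dataset weighting (0.94 for DVLA here), which makes profiles from differently sized extracts directly comparable.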

[Figure: Coverage profile by date of birth. Horizontal axis: year of birth. Series: Census 2001, DVLA (Drivers), GRO + GROS (Births), HMRC (NIRS2), UKPS (PASS).]

Notes on the figure:
Census is the most accurate estimate of the whole population.
HMRC and DVLA include emigrants but exclude children; they are greater than Census due to emigrants.
GRO + GRO(S) includes everyone born in Scotland from 1974 and in England and Wales from 1993.
DVLA only includes those with a driving licence.
UKPS (PASS) includes new and renewed UK passports since 1998 (60% of total UK passports).

6. Critical quality characteristics

6.1 Overview

CIP completed a review of the existing data held in key public service systems through a data trial supplemented by a detailed questionnaire. The results are summarised below (detailed results are given within each stakeholder profile).

Dataset | Citizen records | Estimated duplicates | Name verification | Address verification | Up to date address | Address validity
DfES Student Loans | 5m | < 2% | High | Initially high > low | Low | High
DVLA (Drivers) | 40m | 0.17% | High | Nil | ~ 62% | High
DVLA (Vehicles) | 18m #1 | - | Medium | Nil | 90-95% | High
DWP (DCI) | 84m | ~ 0.07 as per NIRS2 | Medium | Low | Medium | High
GRO / GROS (Births) | 10m | 0.66% (GRO) | Not applicable | Nil | Not updated | Low
HMRC (CID) | 60m | - | Low | Nil | Medium | High
HMRC (NIRS2) | 72m | 0.07% | Low | Low | Medium | High
UKPS (Main) | 70m #2 | Passport renewals | High | Low | ~ 56% | High
UKPS (PASS) | 24m #2 | Passport renewals | High | Low | 70% > 56% | High
Identity Cards (Requirements) | 40 / 48m (adults) | 0% | High | Low | 90-95% | High

Key: Data trial results; Quality questionnaire response; Target.

Notes:
#1 Of the 30 million records only 18 million vehicles have individual citizens as owner.
#2 Database is passport centric rather than person centric.

6.2 Scope and size of datasets

Dataset | Citizen records | Comments
DfES Student Loans | 5m |
DVLA (Drivers) | 40m | Active drivers, but includes emigrants and some deceased
DVLA (Vehicles) | 18m #1 | 30 million records, approx. 18m individuals' names (remainder registered with organisations)
DWP (DCI) | 84.5m | 47 million live adult records in UK; 1 million live social security benefit recipients living abroad; 15 million deceased records (date of death verified); 1.5 million deceased records (date of death not verified); 5.5 million abroad, not in receipt of benefit; 2 million inactive but not categorised; and 12.5 million child records
GRO / GROS (Births) | 10m | GRO only available electronically since 1993, GRO(S) since 1974
HMRC (CID) | 60m |
HMRC (NIRS2) | 72m | Similar to DCI: 6.5m emigrants, 2m inactive and 15m deceased. No children.
UKPS (Main) | 70m #2 | Records relate to passports (duplicate records on renewal)
UKPS (PASS) | 24m #2 | As above, only populated since 1998 (60% of all passport holders)
Identity Cards (Requirements) | 40 / 48m (adults) | Target of 40m without compulsion, 48m with compulsion

6.3 Duplicate records

Dataset | Estimated duplicates | Comments
DfES Student Loans | < 2% |
DVLA (Drivers) | 0.17% | Likely to be mainly associated with paper licences
DVLA (Vehicles) | - |
DWP (DCI) | ~ 0.07 as per NIRS2 | Based on close similarities with NIRS2
GRO / GROS (Births) | 0.66% (GRO) |
HMRC (CID) | | No details available to CIP. 6.7% of citizens' records are, for a limited period, duplicated with a temporary and a permanent NINO. This is part of the business process, and the use of these temporary NINOs is being phased out.
HMRC (NIRS2) | 0.07% |
UKPS (Main) and UKPS (PASS) | Passport renewals | Legitimate duplicates due to renewals
Identity Cards (Requirements) | 0% |

6.4 Name and address verification

Verification: supporting documents or processes that confirm the information, e.g. name verified by presentation of passport.
Validation: checking that a value is within range or exists, e.g. checking an address against the Postcode Address File (PAF).

Name verification:
Critical to many processes
Striving for Gold standard

Address verification:
Low quality
Less onerous, fewer critical processes, niche requirement
Difficult to e-enable

Dataset | Name verification | Address verification
DfES Student Loans | High | Initially high > low
DVLA (Drivers) | High | Nil
DVLA (Vehicles) | Medium | Nil
DWP (DCI) | Medium | Low
GRO / GROS (Births) | Not applicable | Nil
HMRC (CID) | Low | Nil
HMRC (NIRS2) | Medium | Low
UKPS (Main) | High | Low
UKPS (PASS) | High | Low
Identity Cards (Requirements) | High | Low

7. Address quality / validity

7.1 Summary

Dataset | Address validity
DfES Student Loans | High
DVLA (Drivers) | High
DVLA (Vehicles) | High
DWP (DCI) | High
GRO / GROS (Births) | Low
HMRC (CID) | High
HMRC (NIRS2) | High
UKPS (Main) | High
UKPS (PASS) | High
Identity Cards (Requirements) | High

Generally high quality addresses - effectively 90% (assessed using QAS).
Automatic address cleansing is limited to marginally improving existing good quality addresses.
Tentative matches: significant numbers can be resolved rapidly by visual inspection.
Application of Unique Property Reference Number verification (UPRN-NLPG, now NSAI, the National Spatial Address Infrastructure).
As NLPG is validated by more LAs and becomes integral with other systems, data quality will improve.

7.2 PAF compliance

QAS was used to assess the percentage of addresses compliant with PAF. The results are shown below:

[Figure: Percentage of PAF compliant addresses by stakeholder (DVLA, GRO, GROS, HMRC, UKPS).]

DVLA and UKPS both achieved 95% compliance with PAF, which is above the 90% matching level at which the Post Office will start offering mailing discounts. However, for the DVLA results only 6% were actually matched as Verified Correct, because the DVLA generally omits the town name from its address format. This resulted in QAS making an automatic adjustment to the address format and classifying those records only as a Good Match.

In the HMRC dataset, 89% of addresses complied with PAF, but this was the only dataset to include all historical addresses, and its overall score suffered from the obsolescent nature of some of those addresses.

GROS data, taken as a whole for births and deaths, reached a compliance percentage of 68%. This lower figure is caused largely by the relatively high number of tenement addresses in Scottish towns and cities and the number of rural addresses outside cities.

7.2.5 GRO produced the poorest results, with fewer than 58% of raw data addresses complying automatically with PAF. The GRO result can be attributed to the concatenated address data in its sample, which QAS had difficulty matching automatically to PAF.

7.3 Address matches across postcode demographic

The postcode demographics achieving the highest match rates were s3 (Bournemouth) and s4 (London). The results for s3 were not unexpected, given that this was a typical suburban area with limited scope for problem addresses. The scores for s4 were expected to be lower than those actually recorded because of the number of flat conversions in this area. However, it would appear that the format of flat addresses in s4 did not have a significant impact on the ability of QAS to match addresses.

The s7 (Birmingham) demographic achieved results slightly lower than s3 and s4. This was primarily caused by the concentration of high rise tower block accommodation in this area, the formatting of which did result in QAS recording lower levels of Verified Correct and Good Full matches.

The lowest match rates were in s5 (Scotland) and s6 (Wales). The s5 demographic was particularly adversely affected by the combination of poor rural address formats and the high number of obsolete addresses resulting from a housing estate redevelopment, whilst the poor results for s6 were primarily due to problems with rural address formats only. In fact, the good match percentages for s6 were higher than those for s5 because of the lower predominance of rural addresses.

7.4 Language impacts

Demographic s6 was a Welsh postcode area which was partly rural. Possible issues with the use of Welsh language names had been predicted but, apart from a few records where Welsh names had been spelt incorrectly, the use of Welsh in raw address data was not a major factor hindering the overall matching process.

7.5 Foreign addresses

The level of Foreign Address matches was low, with figures at around or below 0.1% for all but the HMRC dataset. Foreign Address matches for HMRC were actually reduced because many were given a match type of Unmatched. There were particular issues with addresses giving the country name as Ireland, which was not recognised, instead of Eire or the Irish Republic.

Overall, foreign addresses only accounted for 0.3% of total addresses and did not have a material impact on address match rates.

8. Address cleansing

Automatic improvement of addresses can only be applied with confidence to matches that already qualify as good or better (i.e. Verified Correct and Good Full matches). To maximise the quality of address data and increase the overall figures for PAF compliance, manual matching will be necessary.

The match type groupings produced by QAS Batch confer confidence levels on the matches it provides, and separate analysis has shown that addresses with match types of Tentative and Partial offer considerable potential for increasing the overall number of address matches through a separate exercise of manual matching. Whilst some of this manual matching can be accomplished quite easily (less than one minute per record), it has not been possible to assess accurately the total effort required to undertake a complete manual review of all records that QAS has not classified as a Verified Correct or Good Full match.

9. Linking to NLPG

9.1 Overview of NLPG

The National Land and Property Gazetteer (NLPG) is a single, comprehensive list of addresses that was initially generated from Valuation Office records. The validation and maintenance of these addresses has been devolved to each Local Authority, which maintains a Local Land and Property Gazetteer (LLPG) that is synchronised with the NLPG. All the data is held in a common format and each property is assigned a Unique Property Reference Number (UPRN) and geographical grid references. These co-ordinates allow individual properties to be accurately identified within ad hoc boundaries (e.g. Primary Care Trust catchment areas, and areas defined for Neighbourhood Statistics) using geographical information systems, and enable dwellings in remote areas to be accurately located where one postcode might cover a very wide area.

Difficulties in obtaining NLPG data

Obtaining access to the NLPG dataset for use on the CIP trial proved extremely problematic. This was primarily due to the difficulties Siemens encountered in obtaining the necessary approvals for the release of this data: licensing issues meant that it was not possible to obtain the complete national NLPG dataset, and the local authorities whose areas were covered by the trial demographics were reluctant to release such data to the CIP trial. As a result, further delays were encountered, and the local authority datasets that were eventually delivered to Siemens and could be used on the trial were restricted to the following:
Wandsworth
Bournemouth
Poole
Pembrokeshire

9.1.3 A major learning point for any similar future exercise requiring access to NLPG data is that careful consideration may have to be given to how best to gain access to such data. A separate lobbying process may be required to win the support and cooperation of local authorities and other relevant Government agencies, so that these bodies release data willingly and in a timely manner. It is hoped that the launch of the NSAI (National Spatial Address Infrastructure), which seeks to integrate NLPG, Royal Mail and OS address data, will provide impetus to LAs validating and using a single address register and to the adoption of the UPRN.

Objectives

The sample datasets were matched against the NLPG data using i/lytics to identify the level of address matching possible, to enable the allocation of Unique Property Reference Numbers (UPRN). The results were compared with similar matching using QAS to establish whether NLPG data might be used to improve the quality of addresses (completeness, consistency, format and validity).

Results

Due to the limited number of available datasets, the NLPG data used in the trial only covered the s3, s4 and s6 samples, and results were limited to these demographics. Consequently, the results did not include any matches with the GROS dataset.

The actual matching levels obtained, as a percentage of s3, s4 and s6 data, are shown below.

Stakeholder | NLPG match level
DVLA | 68.76%
GRO | 33.38%
GROS | -
HMRC | 69.69%
UKPS | 67.62%
Percentage of address records in s3, s

Compared with QAS matching levels, NLPG matched between 70% and 80% of addresses in demographics s3, s4 and s6. This could be partly due to NLPG data not having identical boundaries to postcode areas and some of the demographics falling outside the NLPG area.

Stakeholder | NLPG matches as % of QAS matches
DVLA | 72.56%
GRO | 78.53%
GROS | -
HMRC | 74.79%
UKPS | 70.95%
NLPG matches as % QAS matches for s3, s

Stakeholder addresses matched NLPG data in broadly the same proportions as they were matched by QAS, with the single address field format of the GRO data achieving considerably fewer matches.

The conclusion from the partial NLPG matching is that QAS gives levels of address matching approximately 25% higher. However, these figures should be treated with some caution because of the limited scope of the NLPG analysis, which resulted from the limited amount of NLPG data made available to the trial.
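As a worked check on how the two sets of NLPG figures relate, the sketch below derives the QAS match level implied for each stakeholder by dividing the NLPG match level by the NLPG-to-QAS ratio. The implied QAS levels are derived here, not reported in the source, and assigning the four quoted values to DVLA, GRO, HMRC and UKPS in that order (GROS having no NLPG matches) is an assumption.

```python
# NLPG match level (% of address records) and NLPG matches as % of QAS
# matches, as quoted in the text. Assumption: the four values belong to
# DVLA, GRO, HMRC and UKPS in that order, since GROS had no NLPG matches.
nlpg_rate = {"DVLA": 68.76, "GRO": 33.38, "HMRC": 69.69, "UKPS": 67.62}
nlpg_as_pct_of_qas = {"DVLA": 72.56, "GRO": 78.53, "HMRC": 74.79, "UKPS": 70.95}

# QAS match level implied by the two quoted figures (derived, not reported).
implied_qas = {
    name: 100.0 * nlpg_rate[name] / nlpg_as_pct_of_qas[name]
    for name in nlpg_rate
}

# Gap between QAS and NLPG match levels, in percentage points.
gap_points = {name: implied_qas[name] - nlpg_rate[name] for name in nlpg_rate}

for name in nlpg_rate:
    print(f"{name}: implied QAS {implied_qas[name]:.1f}%, "
          f"gap {gap_points[name]:.1f} points")
```

On this reading the gap for DVLA, HMRC and UKPS is between roughly 23 and 28 percentage points, which is one way to interpret the "approximately 25% higher" conclusion; the GRO gap is much smaller because both methods matched few GRO addresses.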

We recommend that the use of NLPG data (or the subsequent National Address Infrastructure) and the allocation of a UPRN to all citizen addresses should be pursued, as this will yield significant benefit when sharing data and will limit the manual matching effort to the initial allocation.

Currently 81% of Local Authorities in England, Scotland and Wales have validated their LLPG data, and 55% of LAs are actively maintaining this data. Assuming that this initiative continues across all LAs and that LAs, as they adopt CRM solutions, use their LLPG data across all their applications, the quality of this data will improve significantly and reach a level similar to PAF.

10. Dataset matching - methodology

The raw datasets (175,000 records) were rationalised into a common format. Where alternative or historical names and addresses existed, these were converted into 145,000 additional records (i.e. a record was created for each combination of name and address in the original record).

All datasets were then matched with the i/lytics tool, using the ranking criteria described in Appendix 3.10, which utilised all the primary data items (including date of birth, names and addresses).

The i/lytics system sorts and compares all the records using exact and fuzzy matching and applies heuristic rules related to abbreviations and permutations of name and address elements. Groups of similar records, called families, are created. The record with the most complete information is identified as the parent and all the other records in the family are termed members. Each member is compared against the parent. The type of similarity between the parent and each member is termed the rank of the match, and is a complex combination of matching rules associated with each data item. For more details refer to Appendix.

Automatic ranks are those where the similarity between two records is high enough that the records can be considered duplicates without any further analysis or inspection. An initial automatic match rate of 25% was achieved with exact matching. This was subsequently raised to 49% by including fuzzy matching and optimising the ranks, yielding satisfactory results and very low probabilities of false matching.

Each family group is then de-duplicated using the unique id allocated to the raw records. This re-combines the permutations of name and address, while ensuring that matching has been attempted using all of these permutations. Family groups may then be classified as:
Parent records with no children: original records that do not match any others
Parent records with children from different datasets: legitimate matches
Parent records with more than one child from the same dataset: potential duplicate records, i.e. the same date of birth, name and address but with a different stakeholder id (NINO, licence, etc.)
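The de-duplication and classification steps above can be sketched as follows. This is an illustrative simplification, not the i/lytics implementation: the `Record` type, the function name and the three labels are hypothetical, and a family is treated simply as a list of records that have already been grouped by the matching tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    dataset: str    # e.g. "DVLA", "HMRC", "UKPS"
    unique_id: str  # stakeholder id (NINO, licence number, ...)


def classify_family(records):
    """Classify one match family after de-duplicating on (dataset, unique id).

    Name/address permutations of the same raw record share a unique id,
    so they collapse to a single entry before classification.
    """
    distinct = set(records)
    if len(distinct) <= 1:
        return "no match"
    ids_per_dataset = Counter(record.dataset for record in distinct)
    if any(count > 1 for count in ids_per_dataset.values()):
        # Two different ids from the same dataset in one family.
        return "potential duplicate"
    return "legitimate match"


# Hypothetical families, for illustration only.
permutations_only = [Record("DVLA", "D1"), Record("DVLA", "D1")]
cross_dataset = [Record("DVLA", "D1"), Record("UKPS", "P1")]
same_dataset = [Record("HMRC", "N1"), Record("HMRC", "N2"), Record("DVLA", "D1")]
```

Here `permutations_only` classifies as "no match" because both entries are permutations of one raw record, `cross_dataset` as "legitimate match", and `same_dataset` as "potential duplicate". For UKPS, where the unique id is a passport number, the last case would more likely indicate a renewal than a true duplicate.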

The members within each family are then analysed and a family composition report is generated, identifying the combinations of matching. These results are aggregated to give match rates for all combinations of datasets.

In addition to matching on all primary fields, the process has been repeated using more relaxed matching criteria:
date of birth + names
date of birth + surname

From these additional matches the following can be derived:
extent of identities (i.e. matching on date of birth and names) with different addresses
extent of missed identity matches, by broadening the criteria to date of birth and surname
some indication of false matches, by inspecting the occurrences within each match group (family composition)
no-match records: those that will never match, e.g. a citizen with only a driving licence and no passport

There are a limited number of scenarios not considered, e.g. change of name due to marriage / divorce, but these are not likely to be significant (i.e. the number of marriages / divorces in a year is small relative to the total population).

11. Matching by date of birth, name and address details

11.1 Match results

Number of stakeholder records matched as a percentage of all stakeholder records (considering all datasets and demographics, without any weightings):

Stakeholder | All records | Matched records
All datasets | 175,268 | 84,646 (48%)
DVLA | 39,004 | 27,123 (69%)
GRO (B+D) | 12,969 | 4,371 (33%)
GROS (B+D) | 5,187 | 902 (17%)
HMRC | 93,580 | 34,152 (36%)
UKPS | 24,528 | 18,098 (73%)
Births (GRO+GROS) | 11,428 | 2,226 (19%)
Deaths (GRO+GROS) | 6,728 | 3,047 (45%)
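The percentages in the match results table can be recomputed directly from the record counts. The short sketch below does this; the variable and function names are illustrative. The whole-number percentages in the table correspond to the computed values truncated, rather than rounded, to an integer.

```python
# (all records, matched records) per stakeholder, as quoted in the table.
RECORDS = {
    "All datasets": (175_268, 84_646),
    "DVLA": (39_004, 27_123),
    "GRO (B+D)": (12_969, 4_371),
    "GROS (B+D)": (5_187, 902),
    "HMRC": (93_580, 34_152),
    "UKPS": (24_528, 18_098),
    "Births (GRO+GROS)": (11_428, 2_226),
    "Deaths (GRO+GROS)": (6_728, 3_047),
}


def match_rate(total, matched):
    """Matched records as a percentage of all records."""
    return 100.0 * matched / total


def truncated_rate(total, matched):
    """Whole-number percentage, truncated as in the table."""
    return matched * 100 // total


for name, (total, matched) in RECORDS.items():
    print(f"{name}: {match_rate(total, matched):.2f}% "
          f"(table: {truncated_rate(total, matched)}%)")
```

The five stakeholder rows (DVLA, GRO, GROS, HMRC, UKPS) sum to the "All datasets" totals, and the births and deaths rows are an alternative split of the GRO and GROS records.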

In the above table a record refers to a person with a unique id (e.g. NINO, licence number), except in the case of UKPS, where it refers to a passport number, which changes on renewal.

[Figure: Pairwise overlap between stakeholder datasets.
UKPS and DVLA: 37% of UKPS records match DVLA (63% do not); 23% of DVLA records match UKPS (77% do not).
HMRC and UKPS: 15% of HMRC records match UKPS (85% do not); 59% of UKPS records match HMRC (41% do not).
HMRC and DVLA: 27% of HMRC records match DVLA (73% do not); 66% of DVLA records match HMRC (34% do not).]

11.2 Interpretation of results

It is important to recognise that the match percentages reported are more heavily influenced by the nature of the datasets than by the efficacy of the matching process, e.g. children in the UKPS dataset can never be matched to DVLA data, which applies only to over 16s.

Example: UKPS matched against DVLA

The following match profiles were based on the same dataset and demographic weightings used to analyse comparative coverage. The match percentages between all unweighted datasets and demographics are not significantly different from the following.

[Figure: Automatic matches by year of birth, DVLA (Drivers) and UKPS (PASS). Vertical axis: number of records in sample dataset. Series: Census 2001; grey match (dob, surname); full match (dob, name, address).]

[Figure: Matching between DVLA (Drivers) and UKPS (PASS) datasets. Annotations: DVLA drivers with no matches, i.e. drivers without records in the PASS database; 23% of DVLA records automatically match UKPS (PASS) records; UKPS records with no matches, i.e. passport holders (in PASS) without a driving licence; 37% of UKPS (PASS) records automatically match DVLA records.]
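The difference between a full match and the more relaxed match levels (date of birth + names, date of birth + surname) can be illustrated with a small sketch. It uses exact matching on normalised fields only, whereas the trial used fuzzy matching and ranked rules in i/lytics; the field names, level names and sample records are all hypothetical.

```python
from collections import defaultdict

# Fields compared at each match level (layout is an assumption).
LEVELS = {
    "full": ("dob", "forename", "surname", "address"),
    "dob_names": ("dob", "forename", "surname"),
    "dob_surname": ("dob", "surname"),
}


def cross_dataset_families(records, level):
    """Group records on the fields for `level`.

    Only groups containing records from two or more datasets are returned.
    """
    groups = defaultdict(list)
    for record in records:
        key = tuple(str(record[field]).strip().lower() for field in LEVELS[level])
        groups[key].append(record)
    return [g for g in groups.values() if len({r["dataset"] for r in g}) > 1]


# Hypothetical records: the same person has moved house between the DVLA
# and UKPS extracts, and HMRC holds a different spelling of the forename.
records = [
    {"dataset": "DVLA", "dob": "1970-01-01", "forename": "Ann",
     "surname": "Example", "address": "1 High Street"},
    {"dataset": "UKPS", "dob": "1970-01-01", "forename": "Ann",
     "surname": "Example", "address": "2 Low Road"},
    {"dataset": "HMRC", "dob": "1970-01-01", "forename": "Anne",
     "surname": "Example", "address": "2 Low Road"},
]

counts = {level: len(cross_dataset_families(records, level)) for level in LEVELS}
```

In this example no family forms on the full criteria, the DVLA and UKPS records join when the address is dropped, and all three records join on date of birth and surname. The broadest level recovers the most identities but is also where false matches become possible, which is why such matches would need inspection.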

11.3 Matches against each stakeholder dataset

[Figure: Unique IDs in input and unique IDs in match families for each stakeholder: DVLA, GRO (births + deaths), GROS (births + deaths), IR, UKPS.]

The above graph shows the numbers of input records and matched records per stakeholder and gives an indication of the percentage match rate of each stakeholder against all records. These figures are discussed below.

As can be seen, the matching levels within the merged dataset revealed a sizeable disparity between stakeholders, with far higher percentage match rates from DVLA and UKPS, of 69.54% and 73.39% respectively. Matching levels for birth and death records were substantially lower, whilst HMRC records, although having more records matched than any other stakeholder, only matched 36.49% of distinct records.

The disparity of matching levels between stakeholders can be attributed to a number of identifiable factors specific to one or more demographics, as listed below.

DVLA
Factors with a positive influence on matching rates: high level of PAF compliant addresses; current and updated data.
Factors with a negative influence on matching rates: not all citizens have a driving licence; not applicable to under 16s.

UKPS
Factors with a positive influence on matching rates: high level of PAF compliant addresses; no data over six years old, i.e. prior to 1998.
Factors with a negative influence on matching rates: not all citizens have a passport; only 60% of passport holders on this database.

HMRC
Factors with a positive influence on matching rates: large coverage.
Factors with a negative influence on matching rates: not applicable to under 16s; temporary residents working in the UK; older data now obsolete, e.g. deaths predated other stakeholder data; older data now outside of sampled demographics, e.g. a person living in s3 and moving before the creation of other stakeholders' datasets.

GRO
Factors with a negative influence on matching rates: persons born before 1993 not in dataset; poor PAF compliance due to concatenation of address elements; birth data on children too young to appear in other data.

GROS
Factors with a negative influence on matching rates: persons born before 1973 not in dataset; low PAF compliance due to the more complex nature of Scottish addresses; low numbers of people in the Scottish postcode s5 demographic in other stakeholders' datasets; birth data on children too young to appear in other data.

11.4 Identification of duplicate records

The number of matched family records per dataset is shown in the chart below, with a count showing the number of matches within a dataset. For example, there are 65 matches of identity within the GRO dataset and 1 example of four UKPS records in the same match family.

From inspection, all these records are duplicate records, i.e. a person having more than one unique id within a dataset. The exception is UKPS, where a record relates to a passport rather than a citizen and the matches indicate passport renewals.

[Figure: Matched family records per dataset by prevalence count. Series: DEATHS (GRO+GROS), BIRTHS (GRO+GROS), GROS (BIRTHS+DEATHS), GRO (BIRTHS+DEATHS), UKPS, IR, DVLA. Legible data labels: 34,012; 26,929; 17,072.]

11.5 Family composition

The following graph identifies the matches between different datasets.

[Chart: The composition of families by stakeholders. Counts by family size for each combination of stakeholder datasets, from single-dataset families (DVLA only, GRO only, GROS only, IR only, UKPS only) through two- and three-dataset combinations to four-dataset families (DVLA, GRO, IR, UKPS and DVLA, GROS, IR, UKPS), with further series in which GRO and GROS records are split into births and deaths]

The above confirms that when matching on all primary fields, the occurrence of false matches is negligible and due solely to duplicate identities.
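The family composition analysis can be illustrated with a minimal sketch. It assumes each match family is held as a list of (stakeholder, unique ID) pairs; that representation, the function names and the sample IDs are all hypothetical and are not taken from the trial's tooling. The sketch labels each family by the datasets it spans, counts families by label and size, and flags families in which one dataset contributes more than one record, which is the pattern the report attributes to duplicate identities (or, for UKPS, passport renewals).

```python
from collections import Counter


def composition(family):
    """Label a match family by the distinct stakeholder datasets it spans."""
    return ", ".join(sorted({stakeholder for stakeholder, _ in family}))


def composition_table(families):
    """Count families by (composition label, family size)."""
    counts = Counter()
    for family in families:
        counts[(composition(family), len(family))] += 1
    return counts


def has_repeated_dataset(family):
    """True when one dataset contributes more than one record to the family."""
    stakeholders = [stakeholder for stakeholder, _ in family]
    return len(stakeholders) > len(set(stakeholders))


# Hypothetical match families, for illustration only.
families = [
    [("DVLA", "d1"), ("UKPS", "u1")],
    [("DVLA", "d2"), ("IR", "i2"), ("UKPS", "u2")],
    [("GRO", "g3"), ("GRO", "g4")],
    [("DVLA", "d5"), ("UKPS", "u5"), ("UKPS", "u6")],
]

table = composition_table(families)
flags = [has_repeated_dataset(family) for family in families]
```

Under this representation, a family labelled "DVLA, UKPS" with size 3 stands out immediately as containing a repeated dataset, which is how a composition chart exposes duplicates.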

11.6 Matching results by demographic

The matching results obtained by demographic split are shown below:

[Chart: unique IDs in input and unique IDs in match families for each demographic s1 to s9; vertical axis 0 to 60,000]

Across demographics, the match level percentages for unique ID records are shown in the table below:

Demographic                      Total      Matched    Match %
All datasets and demographics    175,268    84,646     48%
s1 Surname beginning XXX         12,162     6,959      57%
s2 Surname beginning YYY         9,579      5,989      63%
s3 Bournemouth                   32,087     18,220     57%
s4 London                        50,482     23,076     46%
s5 Scotland                      14,589     3,816      26%
s6 Wales                         12,692     8,699      69%
s7 Birmingham                    26,908     7,450      28%
s8 DOB - Random                  6,471      4,200      65%
s9 DOB 1/1 from mid 70s          10,298     6,237      61%

The results for s5 and s7 reflect the very low numbers of records retrieved from the DVLA drivers database for those demographics and should be disregarded.

Matching levels also reflect how consistent the data is over time. The date of birth demographics s8 and s9, based on data that should never change, show high matching percentages. This may be due to the date of birth being static throughout a person's lifetime, whereas address data, and even name data, can be prone to change. A higher percentage of s8 records matched than s9 records, which may indicate that birth dates of 1st January are often guessed or approximated and are not used consistently elsewhere. The s1 and s2 demographics are based on fairly consistent name data, but changes in surname will reduce the number of matches. Address data for an individual can change often, which reduces matching levels. Areas such as s3 and s6, which could be expected to have a more static population, show much better matching levels than s4.

11.7 Influence of address cleansing on overall matching statistics

QAS address cleansing had minimal effect on increasing matches. Removing address data from the matching criteria increased matches by just over 5 percentage points, from 48.29% to 53.41%, indicating that address quality was not hugely significant in securing matches, owing to the overall good quality of addresses in the DVLA and UKPS datasets.

11.8 Influence of other datasets on matching

The matching of all the datasets by date of birth, names and addresses was repeated with CACI Enhanced Electoral Roll data included. This resulted in an increased match rate of 7% for the HMRC dataset, 3% for DVLA and 1.5% for UKPS, with only a nominal effect on GRO / GROS.

12. Matching by date of birth and name elements

Relaxing the matching criteria to exclude address details results in almost 10% more matches than previously. However, some measure of the false matches occurring may be derived from the family composition diagram, where there is a small increase in the occurrence of families with more members than should be expected. For example, where matching occurs between DVLA, UKPS and IR (three datasets), there are 6 families of size 4, which indicates that there are 6 x (4 - 3) = 6 false records.

By further relaxing the criteria to just date of birth and surname, the increase in matches will include any matches missed in the previous analyses, but there will be more false matches. This gives an indication of the grey area of matching for this sample size, i.e. the difference between the records that conclusively match (based on extensive criteria) and people who are unlikely ever to match (their date of birth and surname are unique), e.g. because they have only a driving licence and no passport.

Matching by date of birth and names and matching by date of birth and surname alone produced only slightly different results. This is likely to be due to the small size of the data samples.

This result cannot be directly extrapolated to a large dataset: there may well be only one Smith born on a specific day in a dataset of 100 members, but there will be a number of Smiths born on that day in a dataset of 10 million. However, from analysis of surname and date of birth statistics it is known that within the UK population 90% of people have a unique combination of date of birth and surname. Thus the extrapolated no match result cannot fall below 90% of the extrapolated value.

This enables a matching percentage to be derived which is related only to the efficacy of the match and is not skewed by members who will never match.

13. Analysis of address changes

The following results were obtained for the limited and weighted demographics / datasets used in the coverage profiling:

Columns: DVLA (No, %), IR (No, %), UKPS (No, %), Adjusted

Dob + Name + Address
All records 11,941 16,036 6,560
DVLA 8, % 3, % 80.4%
IR 8, % 0.0% 3, % 98.2%
UKPS 3, % 3, %
GRO

Dob + Surname
DVLA 9, % 3, % 88.0%
IR 9, % 4, % 107.8%
UKPS 3, % 4, %
GRO

People with different addresses
DVLA 1, % % 7.6%
IR 1, % % 9.7%
UKPS % %

As a % of matched addresses
DVLA 12.5% 8.7% 14.4%
IR 12.5% 9.0% 14.9%
UKPS 8.7% 9.0%
UKPS adjusted 14.4% 14.9%

The results put the number of differing addresses at between 9% and 15% of matched records. These are the records shown in the diagram below. What remains unknown is the number of records where both databases hold out-of-date addresses.

[Diagram: two overlapping sets, UKPS (passport holders) and DVLA (drivers), with the following regions:
  UKPS only: passport holders with a current address; passport holders with old address
  DVLA only: drivers with a current address; drivers with old address
  Overlap (drivers and passport holders with old and current address):
    - old address (UKPS) and current address (DVLA)
    - old address (DVLA) and current address (UKPS)
    - old address in both databases
    - current address in both databases]

14. False matches

Inspection of the family composition results shows an increased number of false matches, as expected.

                                  Match by         Matches by DoB     Matches by
                                  DoB + surname    + name (fuzzy)     DoB + surname
DVLA   Matches + false matches    46%              43%                38%
       No matches                 54%              57%                62%
UKPS   Matches + false matches    29%              27%                24%
       No matches                 71%              73%                76%

The combination of false and missed matches as the match criteria are relaxed is illustrated in the following diagram:

Name         Address        Date of birth   ID no     Stakeholder   dob + name + address   dob + name   dob + surname
James Doe    20 High St     01/01/1950      ID 1000   UKPS          Correct                Correct      Correct
James Doe    20 High St     01/01/1950      ID 1000   DVLA          Correct                Correct      Correct
John Doe     20 High St     01/01/1950      ID 1001   UKPS          Missed                 Correct      False match
John Doe     12 Bridge St   01/01/1950      ID 1001   DVLA          Missed                 Correct      False match
Susan Doe    12 Bridge St   01/01/1950      ID 1002   UKPS          Missed                 Missed       False match
Sue Doe      12 Bridge St   01/01/1950      ID 1002   DVLA          Missed                 Missed       False match
Ann Doe      10 Kings Rd    01/01/1950      ID 1003   UKPS          Correct                Correct      False match
John Jones   80 Main St     01/01/1950      ID 1004   DVLA          Correct                Correct      Correct

Summary per matching criterion:
  Matching by dob + name + address: Automatic match 1; No matches 1-6
  Matching by dob + name: Automatic match 2; No matches 1-4
  Matching by dob + surname: Automatic match 1; No match 1

Diagram annotations:
  - Full matches: % of population with a full match and who hold a passport and a driver's licence, i.e. the match criteria are so strict that no false matches exist. Proportion remains unchanged as the sample is scaled up.
  - Grey matches: the ratio of missed / false / correct matches varies as the sample is scaled up.
  - No matches: % of population with a unique combination of dob and surname and who hold either a passport or a driver's licence (i.e. the ratio of passports to drivers). Proportion reduces as the sample is scaled up but can never go below this minimum.
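The effect of relaxing the criteria on the eight example records in the diagram can be reproduced with a short sketch. It assumes exact string comparison on each field, which is a simplification: the trial also used fuzzy name matching, which is not modelled here. The function and variable names are hypothetical. Records sharing a key form a match family; a family is counted as false when it contains more than one true identity (the ID no column).

```python
from collections import defaultdict

# Example records from the diagram: (name, address, dob, true_id, stakeholder).
RECORDS = [
    ("James Doe", "20 High St", "01/01/1950", 1000, "UKPS"),
    ("James Doe", "20 High St", "01/01/1950", 1000, "DVLA"),
    ("John Doe", "20 High St", "01/01/1950", 1001, "UKPS"),
    ("John Doe", "12 Bridge St", "01/01/1950", 1001, "DVLA"),
    ("Susan Doe", "12 Bridge St", "01/01/1950", 1002, "UKPS"),
    ("Sue Doe", "12 Bridge St", "01/01/1950", 1002, "DVLA"),
    ("Ann Doe", "10 Kings Rd", "01/01/1950", 1003, "UKPS"),
    ("John Jones", "80 Main St", "01/01/1950", 1004, "DVLA"),
]


def surname(name):
    """Take the last word of the name as the surname (assumption)."""
    return name.split()[-1]


# Each criterion maps a record to the key it must share to be matched.
CRITERIA = {
    "dob+name+address": lambda r: (r[2], r[0], r[1]),
    "dob+name": lambda r: (r[2], r[0]),
    "dob+surname": lambda r: (r[2], surname(r[0])),
}


def match_families(records, key):
    """Group records by key and keep groups with more than one record."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)
    return [group for group in groups.values() if len(group) > 1]


def summarise(records, key):
    """Count families, matched records, and families mixing true identities."""
    families = match_families(records, key)
    return {
        "families": len(families),
        "matched_records": sum(len(f) for f in families),
        "false_families": sum(1 for f in families if len({r[3] for r in f}) > 1),
    }


results = {name: summarise(RECORDS, key) for name, key in CRITERIA.items()}
```

With these records, the strictest criterion links only the two James Doe records, adding the relaxed name criterion also links the two John Doe records, and matching on date of birth and surname alone merges all seven Doe records into a single family containing four different people.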

14.1 Family composition - matching by date of birth and names

This analysis identifies the increased level of matching and the occurrence of a small number of false matches as a result of relaxing the matching criteria to exclude address.

[Chart: composition of families by stakeholders when matching by date of birth and names. Counts by family size (up to 8) for each combination of stakeholder datasets, from single-dataset families (DVLA only, GRO only, GROS only, IR only, UKPS only) to four-dataset families, with further series in which GRO and GROS records are split into births and deaths; vertical axis 0 to 20,000]
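A rough way to read false matches off a family composition chart of this kind follows the arithmetic used in section 12, where 6 x (4 - 3) = 6 false records. The sketch below assumes that example means 6 families of size 4 spanning 3 datasets; the function name and parameters are hypothetical. It treats every member beyond one per dataset as a false record, so it will overstate false matches wherever the excess is really a duplicate identity or a passport renewal.

```python
def estimated_false_records(family_count, family_size, datasets_spanned):
    """Estimate false records as family count times excess members per family.

    Excess members are those beyond one record per dataset in the family.
    """
    excess_per_family = max(family_size - datasets_spanned, 0)
    return family_count * excess_per_family


# The worked example from section 12: DVLA, UKPS and IR, families of size 4.
example = estimated_false_records(family_count=6, family_size=4, datasets_spanned=3)

# Families with exactly one record per dataset contribute no false records.
clean = estimated_false_records(family_count=10, family_size=3, datasets_spanned=3)
```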

The Census questions. factsheet 9. A look at the questions asked in Northern Ireland and why we ask them

The Census questions. factsheet 9. A look at the questions asked in Northern Ireland and why we ask them factsheet 9 The Census questions A look at the questions asked in Northern Ireland and why we ask them The 2001 Census form contains a total of 42 questions in Northern Ireland, the majority of which only

More information

How a People Classification Can Add Value to Census Data. Simon Perry

How a People Classification Can Add Value to Census Data. Simon Perry How a People Classification Can Add Value to Census Data Simon Perry Presentation outline Why the census is useful and what s better this time Disclosure protection and spatial analysis What the census

More information

Country Paper : Macao SAR, China

Country Paper : Macao SAR, China Macao China Fifth Management Seminar for the Heads of National Statistical Offices in Asia and the Pacific 18 20 September 2006 Daejeon, Republic of Korea Country Paper : Macao SAR, China Government of

More information

The progress in the use of registers and administrative records. Submitted by the Department of Statistics of the Republic of Lithuania

The progress in the use of registers and administrative records. Submitted by the Department of Statistics of the Republic of Lithuania Working Paper No. 24 ENGLISH ONLY STATISTICAL COMMISSION and ECONOMIC COMMISSION FOR EUROPE STATISTICAL OFFICE OF THE EUROPEAN COMMUNITIES (EUROSTAT) CONFERENCE OF EUROPEAN STATISTICIANS Joint ECE/Eurostat

More information

Prepared for: CACI Acorn microsite Prepared by: CACI Product Development Team Date issued: 15th March Acorn technical document

Prepared for: CACI Acorn microsite Prepared by: CACI Product Development Team Date issued: 15th March Acorn technical document Prepared for: CACI Acorn microsite Prepared by: CACI Product Development Team Date issued: 15th March 2013 Acorn technical document Table of Contents 1. Introduction... 3 1.1. What is Acorn?... 3 1.2.

More information

Methodology Statement: 2011 Australian Census Demographic Variables

Methodology Statement: 2011 Australian Census Demographic Variables Methodology Statement: 2011 Australian Census Demographic Variables Author: MapData Services Pty Ltd Version: 1.0 Last modified: 2/12/2014 Contents Introduction 3 Statistical Geography 3 Included Data

More information

Report to Frack Free Frodsham & Helsby. Survey Analysis and Report of Residents Attitudes Towards Shale Gas Fracking in Helsby Parish Council Area

Report to Frack Free Frodsham & Helsby. Survey Analysis and Report of Residents Attitudes Towards Shale Gas Fracking in Helsby Parish Council Area Report to Frack Free Frodsham & Helsby Survey Analysis and Report of Residents Attitudes Towards Shale Gas Fracking in Helsby Parish Council Area Author: John Murray BSc (hons) FBCS FSS CITP CEng Date:

More information

2011 Census quality assurance: The estimation process

2011 Census quality assurance: The estimation process CIS2012-03 2011 Census quality assurance: The estimation process July 2012 Introduction This briefing outlines the census estimation process for the 2011 Census estimates. The data it draws upon was released

More information

Some Indicators of Sample Representativeness and Attrition Bias for BHPS and Understanding Society

Some Indicators of Sample Representativeness and Attrition Bias for BHPS and Understanding Society Working Paper Series No. 2018-01 Some Indicators of Sample Representativeness and Attrition Bias for and Peter Lynn & Magda Borkowska Institute for Social and Economic Research, University of Essex Some

More information

Overview of Civil Registration and Vital Statistics systems

Overview of Civil Registration and Vital Statistics systems Overview of Civil Registration and Vital Statistics systems Training Workshop on CRVS ESCAP, Bangkok 9-13 January 2016 Helge Brunborg Statistics Norway Helge.Brunborg@gmail.com Outline Civil Registration

More information

2.0 INTERFACE OF CR SYSTEM WITH THE VITAL STATISTICS SYSTEM AND NPD

2.0 INTERFACE OF CR SYSTEM WITH THE VITAL STATISTICS SYSTEM AND NPD Computerization of the Civil Status and Vital Statistics Systems of the Republic of Seychelles and its Integration with the National Population Database 1 1. INTRODUCTION 1.1 The Civil Status records were

More information

THE SCOTTISH LONGITUDINAL STUDY Tracing rates and sample quality for the 1991 Census SLS sample

THE SCOTTISH LONGITUDINAL STUDY Tracing rates and sample quality for the 1991 Census SLS sample THE SCOTTISH LONGITUDINAL STUDY Tracing s and quality for the 1991 Census SLS LSCS Working Paper 2.0 October 2007 Lin Hattersley LSCS & General Register Office for Scotland Gillian Raab LSCS & University

More information

IXIA S PUBLIC ART SURVEY 2013 SUMMARY AND KEY FINDINGS. Published February 2014

IXIA S PUBLIC ART SURVEY 2013 SUMMARY AND KEY FINDINGS. Published February 2014 IXIA S PUBLIC ART SURVEY 2013 SUMMARY AND KEY FINDINGS Published February 2014 ABOUT IXIA ixia is England s public art think tank. We promote and influence the development and implementation of public

More information

SHTG primary submission process

SHTG primary submission process Meeting date: 24 April 2014 Agenda item: 8 Paper number: SHTG 14-16 Title: Purpose: SHTG primary submission process FOR INFORMATION Background The purpose of this paper is to update SHTG members on developments

More information

Census Liaison Managers (CLM) & Assistant Census Liaison Managers (ACLM) monthly update for onward communication by CRCs April 2010

Census Liaison Managers (CLM) & Assistant Census Liaison Managers (ACLM) monthly update for onward communication by CRCs April 2010 Census Liaison Managers (CLM) & Assistant Census Liaison Managers (ACLM) monthly update for onward communication by CRCs April 2010 HEADLINES : i) Address check: May - August 2010 - ONS address checking

More information

JRC Response to the Consultation on. More Radio Spectrum for the Internet of Things

JRC Response to the Consultation on. More Radio Spectrum for the Internet of Things JRC Response to the Consultation on More Radio Spectrum for the Internet of Things JRC Ltd Dean Bradley House 52 Horseferry Road London SW1P 2AF United Kingdom +44 (0)20 7706 5199 +44 (0)20 7222 0100 info@jrc.co.uk

More information

Economic and Social Council

Economic and Social Council United Nations Economic and Social Council Distr.: General 18 December 2017 Original: English Statistical Commission Forty-ninth session 6 9 March 2018 Item 4 (a) of the provisional agenda* Items for information:

More information

A Guide to Linked Mortality Data from Hospital Episode Statistics and the Office for National Statistics

A Guide to Linked Mortality Data from Hospital Episode Statistics and the Office for National Statistics A Guide to Linked Mortality Data from Hospital Episode Statistics and the Office for National Statistics June 2015 Version History Version Changes Date Issued Number 1 14/Dec/2010 1.1 Modified Appendix

More information

Family Tree Analyzer Part II Introduction to the Menus & Tabs

Family Tree Analyzer Part II Introduction to the Menus & Tabs Family Tree Analyzer Part II Introduction to the Menus & Tabs Getting Started If you haven t already got FTAnalyzer installed and running you should see the guide Family Tree Analyzer Part I Installation

More information

5 TH MANAGEMENT SEMINARS FOR HEADS OF NATIONAL STATISTICAL OFFICES (NSO) IN ASIA AND THE PACIFIC SEPTEMBER 2006, DAEJEON, REPUBLIC OF KOREA

5 TH MANAGEMENT SEMINARS FOR HEADS OF NATIONAL STATISTICAL OFFICES (NSO) IN ASIA AND THE PACIFIC SEPTEMBER 2006, DAEJEON, REPUBLIC OF KOREA Malaysia 5 TH MANAGEMENT SEMINARS FOR HEADS OF NATIONAL STATISTICAL OFFICES (NSO) IN ASIA AND THE PACIFIC. 18 20 SEPTEMBER 2006, DAEJEON, REPUBLIC OF KOREA 1. Overview of the Population and Housing Census

More information

Analysis of the December 2014 electoral registers in England and Wales

Analysis of the December 2014 electoral registers in England and Wales Analysis of the December 2014 electoral registers in England and Wales The implementation of Individual Electoral Registration: progress report February 2015 Executive summary... 4 Data issues affecting

More information

Economic and Social Council

Economic and Social Council United Nations Economic and Social Council Distr.: General 21 March 2012 ECE/CES/2012/22 Original: English Economic Commission for Europe Conference of European Statisticians Sixtieth plenary session Paris,

More information

ABI Framework for the Management of Gone-Away Customers in the Life and Pensions Market

ABI Framework for the Management of Gone-Away Customers in the Life and Pensions Market 1 Association of British Insurers ABI Framework for the Management of Gone-Away Customers in the Life and Pensions Market ABI Framework for the Management of Gone-Away Customers in the Life and Pensions

More information

Estimation of the number of Welsh speakers in England

Estimation of the number of Welsh speakers in England Estimation of the number of ers in England Introduction The number of ers in England is a topic of interest as they must represent the major part of the -ing diaspora. Their numbers have been the matter

More information

Panel Study of Income Dynamics: Mortality File Documentation. Release 1. Survey Research Center

Panel Study of Income Dynamics: Mortality File Documentation. Release 1. Survey Research Center Panel Study of Income Dynamics: 1968-2015 Mortality File Documentation Release 1 Survey Research Center Institute for Social Research The University of Michigan Ann Arbor, Michigan December, 2016 The 1968-2015

More information

Supplementary questionnaire on the 2011 Population and Housing Census SWITZERLAND

Supplementary questionnaire on the 2011 Population and Housing Census SWITZERLAND Supplementary questionnaire on the 2011 Population and Housing Census SWITZERLAND Supplementary questionnaire on the 2011 Population and Housing Census Fields marked with are mandatory. INTRODUCTION As

More information

Airwave Service Government Security Classifications Guidance

Airwave Service Government Security Classifications Guidance Airwave Service Government Security Classifications Guidance Author: John Stark, Head of Communications Security Home Office Airwave Management Team john.stark@homeoffice.gsi.gov.uk 07718 805804 Enquiries:

More information

Tonga - National Population and Housing Census 2011

Tonga - National Population and Housing Census 2011 Tonga - National Population and Housing Census 2011 Tonga Department of Statistics - Tonga Government Report generated on: July 14, 2016 Visit our data catalog at: http://pdl.spc.int/index.php 1 Overview

More information

BROADCASTING (RADIO MULTIPLEX SERVICES) BILL EXPLANATORY NOTES

BROADCASTING (RADIO MULTIPLEX SERVICES) BILL EXPLANATORY NOTES BROADCASTING (RADIO MULTIPLEX SERVICES) BILL EXPLANATORY NOTES What these notes do These Explanatory tes relate to the Broadcasting (Radio Multiplex Services) Bill as introduced in the House of. These

More information

Record Linkage between the 2006 Census of the Population and the Canadian Mortality Database

Record Linkage between the 2006 Census of the Population and the Canadian Mortality Database Proceedings of Statistics Canada Symposium 2016 Growth in Statistical Information: Challenges and Benefits Record Linkage between the 2006 Census of the Population and the Canadian Mortality Database Mohan

More information

RE: Land at Boundary Hall, Aldermaston Road, Tadley. INSPECTORATE REF: APP/H1705/V/10/

RE: Land at Boundary Hall, Aldermaston Road, Tadley. INSPECTORATE REF: APP/H1705/V/10/ APPLICATION BY: Cala Homes RE: Land at Boundary Hall, Aldermaston Road, Tadley. INSPECTORATE REF: APP/H1705/V/10/2124548 LOCAL AUTHORITY REF: BDB/67609 Prepared by: Mr Geoff Gosling Intelligence Officer,

More information

METHODOLOGY NOTE Population and Dwelling Stock Estimates, , and 2015-Based Population and Dwelling Stock Forecasts,

METHODOLOGY NOTE Population and Dwelling Stock Estimates, , and 2015-Based Population and Dwelling Stock Forecasts, METHODOLOGY NOTE Population and Dwelling Stock Estimates, 2011-2015, and 2015-Based Population and Dwelling Stock Forecasts, 2015-2036 JULY 2017 1 Cambridgeshire Research Group is the brand name for Cambridgeshire

More information

Collection and dissemination of national census data through the United Nations Demographic Yearbook *

Collection and dissemination of national census data through the United Nations Demographic Yearbook * UNITED NATIONS SECRETARIAT ESA/STAT/AC.98/4 Department of Economic and Social Affairs 08 September 2004 Statistics Division English only United Nations Expert Group Meeting to Review Critical Issues Relevant

More information

; ECONOMIC AND SOCIAL COUNCIL

; ECONOMIC AND SOCIAL COUNCIL Distr.: GENERAL ECA/DISD/STAT/RPHC.WS/ 2/99/Doc 1.4 2 November 1999 UNITED NATIONS ; ECONOMIC AND SOCIAL COUNCIL Original: ENGLISH ECONOMIC AND SOCIAL COUNCIL Training workshop for national census personnel

More information

MODERN CENSUS IN POLAND

MODERN CENSUS IN POLAND United Nations International Seminar on Population and Housing Censuses: Beyond the 2010 Round 27-29 November 2012 Seoul, Republic of Korea SESSION 7: Use of modern technologies for censuses MODERN CENSUS

More information

Maintaining knowledge of the New Zealand Census *

Maintaining knowledge of the New Zealand Census * 1 of 8 21/08/2007 2:21 PM Symposium 2001/25 20 July 2001 Symposium on Global Review of 2000 Round of Population and Housing Censuses: Mid-Decade Assessment and Future Prospects Statistics Division Department

More information

It s good to share... Understanding the quality of the 2011 Census in England and Wales

It s good to share... Understanding the quality of the 2011 Census in England and Wales It s good to share... Understanding the quality of the 2011 Census in England and Wales SRA Conference, London, December 2012 Adriana Castaldo Andrew Charlesworth AGENDA Context: 2011 Census quality assurance

More information

HOW TO BUILD GEODEMOGRAPHICS FROM BIG DATA. March 2016 Graham Smith, Associate Director

HOW TO BUILD GEODEMOGRAPHICS FROM BIG DATA. March 2016 Graham Smith, Associate Director HOW TO BUILD GEODEMOGRAPHICS FROM BIG DATA March 2016 Graham Smith, Associate Director WELCOME BIG DATA & GEODEMS THE STORY SO FAR NEW OPPORTUNITIES FOR GEODEMOGRAPHICS DATA PRIVACY & KEY CONSIDERATIONS

More information

Your response. Our case is set out in the attachment below:

Your response. Our case is set out in the attachment below: Your response Question 1: Do you agree with our proposed approach towards registered fixed link and satellite earth stations users of the 3.6GHz to 3.8GHz band? Yes, in principle, but we believe that if

More information

Estimating the number of rooms and bedrooms in the 2021 Census for England and Wales. An alternative approach using Valuation Office Agency (VOA) data

Estimating the number of rooms and bedrooms in the 2021 Census for England and Wales. An alternative approach using Valuation Office Agency (VOA) data Estimating the number of rooms and bedrooms in the 2021 Census for England and Wales An alternative approach using Valuation Office Agency (VOA) data Marie Haythornthwaite Administrative Data Census Team

More information

GOVERNING BODY MEETING in Public 25 April 2018 Agenda Item 3.2

GOVERNING BODY MEETING in Public 25 April 2018 Agenda Item 3.2 GOVERNING BODY MEETING in Public 25 April 2018 Paper Title Paper Author(s) Jerry Hawker Accountable Officer NHS Eastern Cheshire CCG The Future of CCG Commissioning in Cheshire Alison Lee Accountable Officer

More information

The Demographic situation of the Traveller Community 1 in April 1996

The Demographic situation of the Traveller Community 1 in April 1996 Statistical Bulletin, December 1998 237 Demography The Demographic situation of the Traveller Community 1 in April 1996 Age Structure of the Traveller Community, 1996 Age group Travellers Total Population

More information

RadioCentre s response to the BBC Trust review of the BBC s national radio stations in Northern Ireland, Scotland and Wales

RadioCentre s response to the BBC Trust review of the BBC s national radio stations in Northern Ireland, Scotland and Wales RadioCentre s response to the BBC Trust review of the BBC s national radio stations in Northern Ireland, Scotland and Wales 1. Executive summary 1.1. We welcome the fact that a significant degree of scrutiny

More information

Strategies for the 2010 Population Census of Japan

Strategies for the 2010 Population Census of Japan The 12th East Asian Statistical Conference (13-15 November) Topic: Population Census and Household Surveys Strategies for the 2010 Population Census of Japan Masato CHINO Director Population Census Division

More information

Census 2000 and its implementation in Thailand: Lessons learnt for 2010 Census *

Census 2000 and its implementation in Thailand: Lessons learnt for 2010 Census * UNITED NATIONS SECRETARIAT ESA/STAT/AC.97/9 Department of Economic and Social Affairs 08 September 2004 Statistics Division English only United Nations Symposium on Population and Housing Censuses 13-14

More information

Across the Divide Tackling Digital Exclusion in Glasgow. Douglas White

Across the Divide Tackling Digital Exclusion in Glasgow. Douglas White Across the Divide Tackling Digital Exclusion in Glasgow Douglas White 2 Across the Divide Tackling Digital Exclusion in Glasgow Executive Summary Why does having an internet connection matter? Evidence

More information

Research Specification: understanding consumer experience of first tier complaints

Research Specification: understanding consumer experience of first tier complaints Research Specification: understanding consumer experience of first tier complaints Purpose To gain an understanding of consumers experience of first-tier complaints handling by approved persons. This includes:

More information

Essential requirements for a spectrum monitoring system for developing countries

Essential requirements for a spectrum monitoring system for developing countries Recommendation ITU-R SM.1392-2 (02/2011) Essential requirements for a spectrum monitoring system for developing countries SM Series Spectrum management ii Rec. ITU-R SM.1392-2 Foreword The role of the

More information

Are Northern Ireland s Two Communities Dividing?: Evidence from the Census of Population

Are Northern Ireland s Two Communities Dividing?: Evidence from the Census of Population 5 Are Northern Ireland s Two Communities Dividing?: Evidence from the Census of Population 1971-2001 Ian Shuttleworth and Chris Lloyd Introduction Media coverage after the 1991 Northern Ireland Census

More information

population and housing censuses in Viet Nam: experiences of 1999 census and main ideas for the next census Paper prepared for the 22 nd

population and housing censuses in Viet Nam: experiences of 1999 census and main ideas for the next census Paper prepared for the 22 nd population and housing censuses in Viet Nam: experiences of 1999 census and main ideas for the next census Paper prepared for the 22 nd Population Census Conference Seattle, Washington, USA, 7 9 March

More information

OWA Floating LiDAR Roadmap Supplementary Guidance Note

OWA Floating LiDAR Roadmap Supplementary Guidance Note OWA Floating LiDAR Roadmap Supplementary Guidance Note List of abbreviations Abbreviation FLS IEA FL Recommended Practices KPI OEM OPDACA OSACA OWA OWA FL Roadmap Meaning Floating LiDAR System IEA Wind

More information

End of the Census. Why does the Census need reforming? Seminar Series POPULATION PATTERNS. seeing retirement differently
