Sampling Theory
MODULE XIII
LECTURE - 41
NON SAMPLING ERRORS
DR. SHALABH
DEPARTMENT OF MATHEMATICS AND STATISTICS
INDIAN INSTITUTE OF TECHNOLOGY KANPUR
It is a general assumption in sampling theory that the true value of each unit in the population can be obtained and tabulated without any error. In practice, this assumption may be violated due to several reasons and practical constraints. This results in errors in the observations as well as in the tabulation. Such errors, which are due to factors other than sampling, are called non-sampling errors.

The non-sampling errors are unavoidable in censuses and surveys. The data collected by complete enumeration in a census is free from sampling error but would not remain free from non-sampling error. The data collected through sample surveys can have both sampling errors and non-sampling errors. Non-sampling errors arise because of factors other than the inductive process of inferring about the population from a sample. In general, the sampling errors decrease as the sample size increases, whereas the non-sampling errors increase as the sample size increases. In some situations, the non-sampling errors may be large and deserve greater attention than the sampling errors.

In any survey, it is assumed that the value of the characteristic to be measured has been defined precisely for every population unit. Such a value exists and is unique. This is called the true value of the characteristic for the population unit. In practical applications, the data collected on the selected units are called survey values and may differ from the true values. Such a difference between the true and observed values is termed the observational error or response error. Such an error arises mainly from the lack of precision in measurement techniques and variability in the performance of the investigators.
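The contrast between the two kinds of error can be illustrated with a small simulation. This is only a sketch under assumed conditions: the population, the constant response bias of 2.0 and the noise level are hypothetical choices, not part of the lecture. It shows the weaker point that a response bias does not average out, so even a complete enumeration (n = N) remains off target when the recorded values carry a response error.

```python
import math
import random
from statistics import fmean


def rmse(errors):
    """Root mean squared error of a list of deviations."""
    return math.sqrt(fmean([e * e for e in errors]))


def simulate(seed=0, N=2000, reps=200):
    """Return {n: (rmse using true values, rmse using survey values)}."""
    rng = random.Random(seed)
    # Hypothetical population of true values.
    true_values = [rng.gauss(50.0, 10.0) for _ in range(N)]
    # Hypothetical response error: constant bias 2.0 plus small noise.
    survey_values = [y + 2.0 + rng.gauss(0.0, 1.0) for y in true_values]
    true_mean = fmean(true_values)

    results = {}
    for n in (20, 200, N):  # n = N corresponds to a census
        err_true, err_survey = [], []
        for _ in range(reps):
            idx = rng.sample(range(N), n)  # simple random sample without replacement
            err_true.append(fmean([true_values[i] for i in idx]) - true_mean)
            err_survey.append(fmean([survey_values[i] for i in idx]) - true_mean)
        results[n] = (rmse(err_true), rmse(err_survey))
    return results


if __name__ == "__main__":
    for n, (only_sampling, with_response_error) in simulate().items():
        print(n, round(only_sampling, 3), round(with_response_error, 3))
```

The first column of each pair shrinks to zero as n approaches N, while the second column levels off near the assumed response bias.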
Sources of non-sampling errors:

Non-sampling errors can occur at every stage of the planning and execution of a survey or census. They occur at the planning stage, the field work stage, and the tabulation and computation stage. The main sources of non-sampling errors are lack of proper specification of the domain of study and scope of investigation, incomplete coverage of the population or sample, faulty definitions, defective methods of data collection and tabulation errors.

More specifically, one or more of the following reasons may give rise to non-sampling errors or indicate their presence:
- The data specification may be inadequate and inconsistent with the objectives of the survey or census.
- Due to imprecise definition of the boundaries of area units, incomplete or wrong identification of units, faulty methods of enumeration etc., the data may be duplicated or omitted.
- The methods of interview and observation collection may be inaccurate or inappropriate.
- The questionnaire, definitions and instructions may be ambiguous.
- The investigators may be inexperienced or not trained properly.
- Recall errors may pose difficulty in reporting the true data.
- The scrutiny of data may not be adequate.
- The coding, tabulation etc. of the data may be erroneous.
- There can be errors in presenting and printing the tabulated results, graphs etc.
- In a sample survey, the non-sampling errors arise due to defective frames and faulty selection of sampling units.
These sources are not exhaustive but surely indicate the possible sources of errors.

Non-sampling errors may be broadly classified into three categories.

(a) Specification errors: These errors occur at the planning stage due to various reasons, e.g., inadequate and inconsistent specification of data with respect to the objectives of the survey/census, omission or duplication of units due to imprecise definitions, faulty methods of enumeration/interview, ambiguous schedules etc.

(b) Ascertainment errors: These errors occur at the field stage due to various reasons, e.g., lack of trained and experienced investigators, recall errors and other types of errors in data collection, lack of adequate inspection and lack of supervision of primary staff etc.

(c) Tabulation errors: These errors occur at the tabulation stage due to various reasons, e.g., inadequate scrutiny of data, errors in processing the data, errors in publishing the tabulated results, graphs etc.

Ascertainment errors may be further sub-divided into
i. Coverage errors owing to over-enumeration or under-enumeration of the population or sample, resulting from duplication or omission of units and from non-response.
ii. Content errors relating to wrong entries due to errors on the part of investigators and respondents.

The same division can be made in the case of tabulation errors also. There is a possibility of missing data or repetition of data at the tabulation stage, which gives rise to coverage errors, and also of errors in coding, calculations etc., which give rise to content errors.
Treatment of non-sampling errors:

Some conceptual background is needed for the mathematical treatment of non-sampling errors.

Total error: The difference between the sample survey estimate and the true value of the parameter being estimated is termed the total error.

Sampling error: If complete accuracy can be ensured in procedures such as the determination, identification and observation of sample units and the tabulation of collected data, then the total error would consist only of the error due to sampling, termed the sampling error.

The measure of sampling error is the mean squared error (MSE). The MSE is the expected squared difference between the estimator and the true value, and it has two components:
- the square of the sampling bias,
- the sampling variance.

If the results are also subject to non-sampling errors, then the total error would have both sampling and non-sampling errors.

Total bias: The difference between the expected value of the estimator and the true value is termed the total bias. This consists of the sampling bias and the non-sampling bias.
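The two components of the MSE follow from a one-line expansion. In the sketch below, $\hat{\theta}$ and $\theta$ are generic symbols for an estimator and the true value; they are introduced here only for illustration and are not the lecture's notation.

```latex
\begin{aligned}
\mathrm{MSE}(\hat{\theta})
  &= E(\hat{\theta} - \theta)^2 \\
  &= E\big[(\hat{\theta} - E\hat{\theta}) + (E\hat{\theta} - \theta)\big]^2 \\
  &= E(\hat{\theta} - E\hat{\theta})^2
     + 2\,(E\hat{\theta} - \theta)\,E(\hat{\theta} - E\hat{\theta})
     + (E\hat{\theta} - \theta)^2 \\
  &= \mathrm{Var}(\hat{\theta}) + \big[\mathrm{Bias}(\hat{\theta})\big]^2 ,
\end{aligned}
```

The cross term vanishes because $E(\hat{\theta} - E\hat{\theta}) = 0$.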
Non-sampling bias:

For the sake of simplicity, assume that the following two steps are involved in the randomization:
i. for selecting the sample of units and
ii. for selecting the survey personnel.

Let $\hat{Y}_{sr}$ be the estimate of the population mean based on the $s$-th sample of units supplied by the $r$-th sample of the survey personnel. The conditional expected value of $\hat{Y}_{sr}$, taken over the second step of randomization for a fixed sample of units, is

$E_r(\hat{Y}_{sr}) = \hat{Y}_{so}$,

which may be different from $\hat{Y}_s$ based on the true values of the units in the sample. The expected value of $\hat{Y}_{so}$ over the first step of randomization gives

$E_s(\hat{Y}_{so}) = \bar{Y}^*$,

which is the value for which an unbiased estimator can be had by the specified survey process. The value $\bar{Y}^*$ may be different from the true population mean $\bar{Y}$, and the total bias is given as

$\mathrm{Bias}_t(\hat{Y}_{sr}) = \bar{Y}^* - \bar{Y}$.

The sampling bias is given by

$\mathrm{Bias}_s(\hat{Y}_s) = E_s(\hat{Y}_s) - \bar{Y}$.
The non-sampling bias is

$\mathrm{Bias}_r(\hat{Y}_{sr}) = \mathrm{Bias}_t(\hat{Y}_{sr}) - \mathrm{Bias}_s(\hat{Y}_s) = \bar{Y}^* - E_s(\hat{Y}_s) = E_s(\hat{Y}_{so} - \hat{Y}_s)$,

which is the expected value of the non-sampling deviation.

In the case of complete enumeration, there is no sampling bias and the total bias consists only of the non-sampling bias. In the case of sample surveys, the total bias consists of both the sampling bias and the non-sampling bias.

The non-sampling bias in a census can be estimated by surveying a sample of units in the population using better techniques of data collection and compilation than those adopted under general census conditions. Surveys called post-enumeration surveys, which are usually conducted just after the census to study the quality of census data, may be used for this purpose.

In a large-scale sample survey, the ascertainment bias can be estimated by resurveying a sub-sample of the original sample using better survey techniques.

Another method of checking survey data is to compare the values of the units obtained in two surveys and to reconcile discrepant figures by further investigation. This method of checking is termed a reconciliation (check) survey.
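The split of the total bias into its sampling and non-sampling parts can be traced on a toy population by enumerating both randomization steps. The following is a minimal sketch under assumptions not made in the lecture: every sample of size n is equally likely, each "sample of survey personnel" is reduced to a single hypothetical reporting function, and the estimator is the sample mean. The names `bias_components` and `reporters` are illustrative.

```python
from itertools import combinations
from statistics import fmean


def bias_components(population, n, reporters):
    """Return (total bias, sampling bias, non-sampling bias).

    population : true values of all units
    n          : sample size; every n-subset is taken as equally likely
    reporters  : hypothetical functions, one per personnel sample r, mapping
                 a unit's true value to the value that would be recorded
    """
    true_mean = fmean(population)
    samples = list(combinations(population, n))

    # Estimate from the true values of sample s.
    y_s = [fmean(s) for s in samples]
    # E_r(Y_sr): average over personnel for a fixed sample s.
    y_so = [
        fmean([fmean([report(y) for y in s]) for report in reporters])
        for s in samples
    ]
    # E_s(Y_so): the value the survey process is actually unbiased for.
    y_star = fmean(y_so)

    total_bias = y_star - true_mean
    sampling_bias = fmean(y_s) - true_mean
    nonsampling_bias = total_bias - sampling_bias
    return total_bias, sampling_bias, nonsampling_bias


if __name__ == "__main__":
    # One accurate reporter and one who over-records every unit by 1.
    print(bias_components([4, 8, 12], 2, [lambda y: y, lambda y: y + 1]))
```

With the sample mean under equal-probability sampling the sampling bias is zero, so the whole of the total bias in this toy case is non-sampling bias.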
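Non-sampling variance:

The MSE of $\hat{Y}_{sr}$, based on the $s$-th sample of units and supplied by the $r$-th sample of the survey personnel, is

$\mathrm{MSE}(\hat{Y}_{sr}) = E_{sr}(\hat{Y}_{sr} - \bar{Y})^2$

where $\bar{Y}$ is the true value being estimated. This takes into account both sampling and non-sampling errors, i.e.,

$\mathrm{MSE}(\hat{Y}_{sr}) = \mathrm{Var}(\hat{Y}_{sr}) + [\mathrm{Bias}(\hat{Y}_{sr})]^2 = E_{sr}(\hat{Y}_{sr} - \bar{Y}^*)^2 + (\bar{Y}^* - \bar{Y})^2$

where $\bar{Y}^*$ is the expected value of the estimator taken over both steps of randomization. Taking the variance over the two steps of randomization, we get

$\mathrm{Var}(\hat{Y}_{sr}) = \mathrm{Var}_s\,E_r(\hat{Y}_{sr}) + E_s\,\mathrm{Var}_r(\hat{Y}_{sr}) = \mathrm{Var}_s(\hat{Y}_{so}) + E_s E_r(\hat{Y}_{sr} - \hat{Y}_{so})^2$,

where $\hat{Y}_{so} = E_r(\hat{Y}_{sr})$. The first term is the sampling variance and the second term is the non-sampling variance.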
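Note that

$\hat{Y}_{sr} - \hat{Y}_{so} = (\hat{Y}_{sr} - \hat{Y}_{so} - \hat{Y}_{or} + \bar{Y}^*) + (\hat{Y}_{or} - \bar{Y}^*)$

where $\hat{Y}_{or} = E_s(\hat{Y}_{sr})$. Then

$E_{sr}(\hat{Y}_{sr} - \hat{Y}_{so})^2 = E_{sr}(\hat{Y}_{sr} - \hat{Y}_{so} - \hat{Y}_{or} + \bar{Y}^*)^2 + E_r(\hat{Y}_{or} - \bar{Y}^*)^2$,

where the first term is the interaction between sampling and non-sampling errors and the second term is the variance between survey personnel.

The MSE of an estimator thus consists of the sampling variance, the interaction between sampling and non-sampling errors, the variance between survey personnel, and the square of the sum of the sampling and non-sampling biases.

In a complete census, the MSE is composed of only the non-sampling variance and the square of the non-sampling bias.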
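The four-part composition of the MSE stated above can be checked numerically on a small table of estimates. This is a sketch under stated assumptions: the table `y[s][r]` of estimates is hypothetical, every (sample, personnel) pair is taken as equally likely, and the two randomization steps are independent. The function name `mse_components` is illustrative.

```python
from statistics import fmean


def mse_components(y, true_mean):
    """Split the MSE of y[s][r] into the four parts named in the text.

    y[s][r] is the estimate from the s-th sample of units as supplied by
    the r-th sample of survey personnel (hypothetical numbers).
    """
    S, R = len(y), len(y[0])
    cells = [(s, r) for s in range(S) for r in range(R)]

    y_so = [fmean(row) for row in y]                                 # E_r(Y_sr)
    y_or = [fmean([y[s][r] for s in range(S)]) for r in range(R)]    # E_s(Y_sr)
    y_star = fmean(y_so)                                             # E_s E_r(Y_sr)

    return {
        "sampling_variance": fmean([(v - y_star) ** 2 for v in y_so]),
        "interaction": fmean(
            [(y[s][r] - y_so[s] - y_or[r] + y_star) ** 2 for s, r in cells]
        ),
        "personnel_variance": fmean([(v - y_star) ** 2 for v in y_or]),
        "bias_squared": (y_star - true_mean) ** 2,
        "mse": fmean([(y[s][r] - true_mean) ** 2 for s, r in cells]),
    }


if __name__ == "__main__":
    table = [[6.0, 7.5], [8.0, 8.5], [10.0, 12.0]]
    for name, value in mse_components(table, true_mean=8.0).items():
        print(name, round(value, 4))
```

The directly computed `mse` should equal the sum of the other four entries up to rounding, which is the identity the text describes in words.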
Non-response error:

The non-response error may occur due to refusal by respondents to give information, or because the sampling units are inaccessible. This error arises because the set of units getting excluded may have characteristics so different from the set of units actually surveyed as to make the results biased. This error is termed non-response error since it arises from the exclusion of some of the anticipated units in the sample or population. One way of dealing with the problem of non-response is to make all efforts to collect information from a sub-sample of the units not responding in the first attempt.

Measurement and control of errors:

Some suitable methods and adequate procedures for control can be adopted before initiating the main census or sample survey. Some separate programmes for estimating the different types of non-sampling errors are also required. Some such procedures are as follows:

1. Consistency check: Certain items can be added to the questionnaire which may serve as a check on the quality of the collected data. To locate doubtful observations, the data can be arranged in increasing order of some basic variable and then plotted against each sample unit. Such a graph is expected to follow a certain pattern, and any deviation from this pattern would help in spotting the discrepant values.
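The consistency check in item 1 can be sketched without a plot by ordering the records on the basic variable and flagging values that stray from the overall pattern. The sketch below makes several assumptions that are not in the lecture: the expected pattern is taken to be a straight line fitted by least squares, a value counts as discrepant when its residual exceeds `k` times the median absolute residual, and the record layout `(unit_id, basic, check)` and the name `flag_discrepant` are hypothetical.

```python
from statistics import fmean, median


def flag_discrepant(records, k=3.0):
    """Return ids of units whose check value departs from the linear pattern.

    records : list of (unit_id, basic_variable, check_variable) tuples
    k       : assumed cut-off, in multiples of the median absolute residual
    Assumes the basic variable takes at least two distinct values.
    """
    # Arrange the data in increasing order of the basic variable.
    ordered = sorted(records, key=lambda rec: rec[1])
    xs = [rec[1] for rec in ordered]
    ys = [rec[2] for rec in ordered]

    # Least-squares line as a stand-in for the "certain pattern".
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    typical = median(abs(e) for e in residuals)
    return [rec[0] for rec, e in zip(ordered, residuals) if abs(e) > k * typical]


if __name__ == "__main__":
    # Hypothetical records: (unit, household size, reported expenditure).
    data = [("u3", 3, 31), ("u1", 1, 11), ("u5", 5, 150), ("u2", 2, 19),
            ("u8", 8, 80), ("u4", 4, 40), ("u7", 7, 71), ("u6", 6, 59)]
    print(flag_discrepant(data))
```

A flagged unit is only a candidate for re-examination; whether the value is actually wrong has to be settled by going back to the schedule or the respondent.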
2. Sample check: An independent duplicate census or sample survey can be conducted on a comparatively smaller group by trained and experienced staff. If the sample is properly designed and if the checking operation is efficiently carried out, it is possible to detect the presence of non-sampling errors and to get an idea of their magnitude. Such a procedure is termed the method of sample check.

3. Post-census and post-survey checks: This is a type of sample check in which a sample (or sub-sample) of the units covered in the census (or survey) is selected and re-enumerated or re-surveyed by better trained and more experienced survey staff than those involved in the main investigation. This procedure is called a post-survey check or post-census check. The effectiveness of such check surveys can be increased by
- re-enumerating or re-surveying immediately after the main census to avoid recall error, and
- taking steps to minimize the conditioning effect that the main survey may have on the work of the check-survey.

4. External record check: Take a sample of relevant units from a different source, if available, and check whether all the units have been enumerated in the main investigation and whether there are discrepancies between the values when matched. The list from which the check-sample is drawn for this purpose need not be a complete one.
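The matching step of an external record check (item 4) amounts to looking up each check-sample unit in the main records. The following is a minimal sketch, assuming both sources can be keyed by a common unit identifier; the dictionaries, the `tolerance` parameter and the name `record_check` are hypothetical.

```python
def record_check(main_records, check_records, tolerance=0.0):
    """Compare a check-sample against the main investigation.

    main_records  : {unit_id: value} from the census or survey
    check_records : {unit_id: value} from the external source
    tolerance     : assumed largest difference still treated as agreement
    Returns (units missing from the main records, units whose values differ).
    """
    omitted = sorted(u for u in check_records if u not in main_records)
    discrepant = sorted(
        u for u, value in check_records.items()
        if u in main_records and abs(main_records[u] - value) > tolerance
    )
    return omitted, discrepant


if __name__ == "__main__":
    main = {"h01": 4, "h02": 6, "h04": 3}
    check = {"h01": 4, "h02": 5, "h03": 2}
    print(record_check(main, check))
```

Because the check list need not be complete, units that appear only in the main records are not treated as errors here; the comparison runs from the check-sample to the main investigation only.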
5. Quality control techniques: Tools of statistical quality control, like control charts and acceptance sampling techniques, can be used in assessing the quality of data and in improving the reliability of the final results in large-scale surveys and censuses.

6. Study of recall error: Response errors arise due to various factors like the attitude of respondents towards the survey, the method of interview, the skill of the investigators and recall errors. Recall errors depend on the length of the reporting period and on the interval between the reporting period and the date of the survey. One way of studying recall errors is to collect and analyze data related to more than one reporting period in a sample (or sub-sample) of the units covered in the census or survey.

7. Interpenetrating sub-samples: The use of the interpenetrating sub-sample technique helps in providing an appraisal of the quality of information, as the interpenetrating sub-samples can be used to secure information on non-sampling errors such as differences arising from differential interviewer bias, different methods of eliciting information etc. After the sub-samples have been surveyed by different groups of investigators and processed by different teams of workers at the tabulation stage, a comparison of the final estimates based on the sub-samples provides a broad check on the quality of the survey results.
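The comparison of sub-sample estimates in item 7 can be sketched as follows. The data layout (one list of observations per investigator group), the use of the sub-sample mean as the estimate, and the name `compare_subsamples` are assumptions made for illustration only.

```python
from statistics import fmean


def compare_subsamples(subsamples):
    """Return per-group estimates and the range between them.

    subsamples : {group_name: list of observations}, one interpenetrating
                 sub-sample per group of investigators (hypothetical data)
    """
    estimates = {name: fmean(values) for name, values in subsamples.items()}
    spread = max(estimates.values()) - min(estimates.values())
    return estimates, spread


if __name__ == "__main__":
    groups = {
        "team_A": [10, 12, 14],
        "team_B": [11, 13, 15],
        "team_C": [20, 22, 24],
    }
    estimates, spread = compare_subsamples(groups)
    print(estimates, spread)
```

Since each sub-sample is drawn from the same population in the same way, a spread much larger than sampling fluctuation alone would suggest points to differences between investigator groups or processing teams, which is the broad check described above.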