Confidence Intervals

Inference
We are in the fourth and final part of the course: statistical inference, where we draw conclusions about the population based on the data obtained from a sample chosen from it.
Chapter 7

Our Goal in Inference
If ALL the populations we are interested in were manageable in size, we would just compute the population parameter directly. Then there would be no need for inference.

Confidence Intervals (CI)
The goal: to give a range of plausible values for the unknown population parameter:
- the population mean, $\mu$
- the population proportion, $p$
- the population standard deviation, $\sigma$
We start with our best guess, the sample statistic:
- the sample mean, $\bar{x}$
- the sample proportion, $\hat{p}$
- the sample standard deviation, $s$
Sample statistic = point estimate

Confidence Intervals (CI)
CI = point estimate ± margin of error

Confidence Intervals (CI) to estimate:
- Population MEAN: point estimate $\bar{x}$
- Population PROPORTION: point estimate $\hat{p}$
- Population STANDARD DEVIATION: point estimate $s$
Margin of error
- Shows how accurate we believe our estimate is.
- The smaller the margin of error, the more precise our estimate of the true parameter.
- Formula: $E = \text{critical value} \times \text{standard deviation of the statistic}$

Confidence Interval (CI) for a Mean
Suppose a random sample of size $n$ is taken from a normal population of values for a quantitative variable whose mean $\mu$ is unknown, when the population standard deviation $\sigma$ is known. A confidence interval (CI) for $\mu$ is:
CI = point estimate ± margin of error
$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$$
Here $\bar{x}$ is the point estimate and $z^* \frac{\sigma}{\sqrt{n}}$ is the margin of error ($m$ or $E$).

So what is z*?
A confidence interval is associated with a confidence level. We will say: "the 95% confidence interval for the population mean is ..."
The most common choices for a confidence level are:
- 90%: z* = 1.645
- 95%: z* = 1.96
- 99%: z* = 2.576
Statement (memorize!): We are ___% confident that the true mean [context] lies within the interval ___ and ___.

Using the calculator
Calculator: STAT, TESTS, 7:ZInterval
Inpt: Data / Stats
- Data: use this when you have data in one of your lists.
- Stats: use this when you know $\sigma$ and $\bar{x}$.

The Trade-off
There is a trade-off between the level of confidence and the precision with which the parameter is estimated:
- higher level of confidence: wider confidence interval
- lower level of confidence: narrower confidence interval
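The trade-off can be checked numerically. The sketch below is only an illustration: the function names `z_star` and `z_interval` and the inputs (a sample mean of 50, $\sigma$ = 10, $n$ = 25) are hypothetical and not taken from the slides. It recovers the three multipliers from the standard normal distribution and shows that the interval widens as the confidence level rises.

```python
from math import sqrt
from statistics import NormalDist

def z_star(confidence):
    """Two-sided critical value z* for a confidence level such as 0.95."""
    # Leave (1 - confidence) / 2 in each tail of the standard normal curve.
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def z_interval(xbar, sigma, n, confidence):
    """Return (low, high) for xbar +/- z* * sigma / sqrt(n)."""
    margin = z_star(confidence) * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical inputs, chosen only to show the trade-off.
for level in (0.90, 0.95, 0.99):
    low, high = z_interval(xbar=50.0, sigma=10.0, n=25, confidence=level)
    print(f"{level:.0%}: z* = {z_star(level):.3f}, width = {high - low:.2f}")
```

The same sample gives a wider interval at 99% than at 90%, because only z* changes.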
95% confident means:
In 95% of all possible samples of this size, $\mu$ will indeed fall in our confidence interval. In only 5% of samples would the interval miss $\mu$.

The Margin of Error
The width (or length) of the CI is exactly twice the margin of error: width = $2E$. The margin of error is therefore "in charge" of the width of the confidence interval.

Comment
The margin of error is
$$E = z^* \frac{\sigma}{\sqrt{n}}$$
and since $n$, the sample size, appears in the denominator, increasing $n$ will reduce the margin of error for a fixed z*.

How can you make the margin of error smaller?
- z* smaller (lower confidence level)
- $\sigma$ smaller (less variation in the population). We really cannot change this!
- $n$ larger (to cut the margin of error in half, $n$ must be 4 times as big)

Margin of Error and the Sample Size
In situations where a researcher has some flexibility as to the sample size, the researcher can calculate in advance the sample size needed in order to report a confidence interval with a certain level of confidence and a certain margin of error.

Calculating the Sample Size
$$E = z^* \frac{\sigma}{\sqrt{n}} \quad\Rightarrow\quad n = \left(\frac{z^* \sigma}{E}\right)^2$$
Clearly, the sample size $n$ must be an integer. The calculation may give us a non-integer result. In these cases, we should always round up to the next highest integer.
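As a minimal sketch of the sample-size rule, the helper below solves $E = z^* \sigma / \sqrt{n}$ for $n$ and rounds up. The function name `sample_size_for_mean` and the inputs ($\sigma$ = 10, margins of 2 and 1) are hypothetical, chosen only to illustrate that halving the margin of error roughly quadruples the required sample size.

```python
from math import ceil

def sample_size_for_mean(z_star, sigma, margin):
    """Smallest integer n with z* * sigma / sqrt(n) <= margin."""
    # Solve E = z* * sigma / sqrt(n) for n, then round up to an integer.
    return ceil((z_star * sigma / margin) ** 2)

# Hypothetical inputs: 95% confidence, sigma = 10.
n_full = sample_size_for_mean(1.96, sigma=10.0, margin=2.0)
n_half = sample_size_for_mean(1.96, sigma=10.0, margin=1.0)
print(n_full, n_half)  # the second value is about four times the first
```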
Example
IQ scores are known to vary normally with standard deviation 15. How many students should be sampled if we want to estimate the population mean IQ at 99% confidence with a margin of error equal to 2?
$$n = \left(\frac{z^* \sigma}{E}\right)^2 = \left(\frac{2.576 \times 15}{2}\right)^2 = 373.26$$
Rounding up gives $n = 374$. They should take a sample of 374 students.

Assumptions for the validity of $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$
- The sample must be random.
- The standard deviation, $\sigma$, is known, and either
  - the sample size must be large ($n \ge 30$), or
  - for smaller samples, the variable of interest must be normally distributed in the population.

Steps to follow
1. Check conditions: SRS, $\sigma$ is known, and either $n \ge 30$ or the population distribution is normal.
2. Calculate the CI for the given confidence level.
3. Interpret the CI.

Example 1
A college admissions director wishes to estimate the mean age of all students currently enrolled. In a random sample of 20 students, the mean age is found to be 22.9 years. From past studies, the standard deviation is known to be 1.5 years and the population is normally distributed. Construct a 90% confidence interval of the population mean age.

Step 1: Check conditions
- SRS
- $\sigma$ is known
- The population is normally distributed

Step 2: Calculate the 90% CI using the formula
$\bar{x} = 22.9$, $\sigma = 1.5$, $n = 20$, $z^* = 1.645$
$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} = 22.9 \pm 1.645 \cdot \frac{1.5}{\sqrt{20}} = 22.9 \pm 0.6 = (22.3, 23.5)$$
Step 2: Calculate the 90% CI using the calculator
Calculator: STAT, TESTS, 7:ZInterval
Inpt: Stats
$\sigma$ = 1.5
$\bar{x}$ = 22.9
$n$ = 20
C-Level: .90
Calculate
ZInterval: (22.3, 23.5)

Step 3: Interpretation
We are 90% confident that the mean age of all students at that college is between 22.3 and 23.5 years.

Example 1 (continued)
How many students should he ask if he wants the margin of error to be no more than 0.5 years with 99% confidence?
$$n = \left(\frac{z^* \sigma}{E}\right)^2 = \left(\frac{2.576 \times 1.5}{0.5}\right)^2 = 59.72$$
Thus, he needs to have at least 60 students in his sample.

Example 2
A scientist wants to know the density of bacteria in a certain solution. He makes measurements of 10 randomly selected samples:
24, 31, 29, 25, 27, 27, 32, 25, 26, 29 ($\times 10^6$ bacteria/ml).
From past studies the scientist knows that the distribution of bacteria level is normal and the population standard deviation is $2 \times 10^6$ bacteria/ml.

a. What is the point estimate of $\mu$?
$\bar{x} = 27.5 \times 10^6$ bacteria/ml.

b. Find the 95% confidence interval for the mean level of bacteria in the solution.
Step 1: Check conditions: SRS, normal distribution, $\sigma$ is known. All satisfied.
Step 2: CI:
$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} = 27.5 \pm 1.96 \cdot \frac{2}{\sqrt{10}} = 27.5 \pm 1.24 = (26.26, 28.74)$$
Step 3: Interpret: we are 95% confident that the mean bacteria level in the whole solution is between 26.26 and 28.74 ($\times 10^6$ bacteria/ml).

Using the calculator:
Enter the numbers into one of the lists, say L1.
STAT, TESTS, 7:ZInterval
Inpt: Data
$\sigma$: 2
List: L1
Freq: 1 (it's always 1)
C-Level: .95
Calculate
(26.26, 28.74)
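The bacteria interval can be reproduced from the raw measurements. This is a rough check rather than part of the original slides: the variable names are illustrative, the ten values are the ones consistent with the stated sample mean of 27.5, and z* = 1.96 is the tabulated 95% multiplier.

```python
from math import sqrt
from statistics import mean

# Measurements in units of 10^6 bacteria/ml (values consistent with xbar = 27.5).
densities = [24, 31, 29, 25, 27, 27, 32, 25, 26, 29]
sigma = 2.0     # known population standard deviation, same units
z_star = 1.96   # tabulated multiplier for 95% confidence

xbar = mean(densities)                           # point estimate of mu
margin = z_star * sigma / sqrt(len(densities))   # margin of error E
low, high = xbar - margin, xbar + margin

print(f"xbar = {xbar}, E = {margin:.2f}, CI = ({low:.2f}, {high:.2f})")
```

This mirrors the calculator's Data input mode: the mean is computed from the list, while $\sigma$ still has to be supplied separately.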
c. What is the margin of error?
From part b:
$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} = 27.5 \pm 1.96 \cdot \frac{2}{\sqrt{10}} = 27.5 \pm 1.24 = (26.26, 28.74)$$
Thus, the margin of error is $E = 1.24 \times 10^6$ bacteria/ml.

d. How many measurements should he make to obtain a margin of error of at most $0.5 \times 10^6$ bacteria/ml with a confidence level of 95%?
$$n = \left(\frac{z^* \sigma}{E}\right)^2 = \left(\frac{1.96 \times 2 \times 10^6}{0.5 \times 10^6}\right)^2 = 61.4656$$
Thus, he needs to take 62 measurements.

Assumptions for the validity of $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$
- The sample must be random.
- The standard deviation, $\sigma$, is known, and either
  - the sample size must be large ($n \ge 30$), or
  - for smaller samples, the variable of interest must be normally distributed in the population.
The only situation when we cannot use this confidence interval, then, is when the sample size is small and the variable of interest is not known to have a normal distribution. In that case, other methods, called nonparametric methods, need to be used.

Example 3
In a randomized comparative experiment on the effect of calcium on blood pressure, researchers divided 54 healthy, white males at random into two groups, taking calcium or placebo. The paper reports a mean seated systolic blood pressure of 114.9 with standard deviation of 9.3 for the placebo group. Assume systolic blood pressure is normally distributed. Can you find a z-interval for this problem? Why or why not?

So what if $\sigma$ is unknown?
Well, there is some good news and some bad news!
The good news is that we can easily replace the population standard deviation, $\sigma$, with the sample standard deviation $s$.
The bad news is that once $\sigma$ has been replaced by $s$, we lose the Central Limit Theorem together with the normality of $\bar{X}$, and therefore the confidence multipliers z* for the different levels of confidence are (generally) not accurate any more. The new multipliers come from a different distribution called the "t distribution" and are therefore denoted by t* (instead of z*).
CI for the population mean when $\sigma$ is unknown
The confidence interval for the population mean $\mu$ when $\sigma$ is unknown is therefore:
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$$

z* vs. t*
There is an important difference between the confidence multipliers we have used so far (z*) and those needed for the case when $\sigma$ is unknown (t*).
- z* depends only on the level of confidence.
- t* depends on both the level of confidence and on the sample size (for example: the t* used in a 95% confidence interval when $n = 10$ is different from the t* used when $n = 40$).

t-distribution
There is a different t distribution for each sample size. We specify a particular t distribution by giving its degrees of freedom. The degrees of freedom for the one-sample t statistic come from the sample standard deviation $s$, which appears in the standard error in the denominator of t. Since $s$ has $n - 1$ degrees of freedom, the t-distribution has $n - 1$ degrees of freedom.

t-distribution
- The t-distribution is bell shaped and symmetric about the mean.
- The total area under the t-curve is 1.
- The mean, median, and mode of the t-distribution are equal to zero.
- The tails in the t-distribution are thicker than those in the standard normal distribution.
- As the df (sample size) increases, the t-distribution approaches the normal distribution. After 29 df the t-distribution is very close to the standard normal z-distribution.

Historical Reference
William Gosset (1876-1937) developed the t-distribution while employed by the Guinness Brewing Company in Dublin, Ireland. Gosset published his findings using the name Student. The t-distribution is, therefore, sometimes referred to as Student's t-distribution.

[Figure: Density of the t-distribution (red and green) for 1, 2, 3, 5, 10, and 30 df compared to the normal distribution (blue)]
Calculator
Calculator: STAT, TESTS, 8:TInterval
Inpt: Data / Stats
- Data: use this when you have data in one of your lists.
- Stats: use this when you know $s$ and $\bar{x}$.

Example
To study the metabolism of insects, researchers fed cockroaches measured amounts of a sugar solution. After 2, 5, and 10 hours, they dissected some of the cockroaches and measured the amount of sugar in various tissues. Five roaches fed the sugar solution and dissected after 10 hours had the following amounts of sugar in their hindguts:
55.95, 68.24, 52.73, 21.50, 23.78

Find the 95% CI for the mean amount of sugar in cockroach hindguts:
$\bar{x} = 44.44$, $s = 20.741$
The degrees of freedom are df = $n - 1$ = 4, and from the table we find that for 95% confidence, t* = 2.776. Then
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 44.44 \pm 2.776 \cdot \frac{20.741}{\sqrt{5}} = (18.69, 70.19)$$
The large margin of error is due to the small sample size and the rather large variation among the cockroaches.

Calculator: Put the data in L1.
STAT, TESTS, 8:TInterval
Inpt: Data
List: L1
Freq: 1
C-Level: .95

Example: which interval applies? You take a sample of size:
- $n$ = 4, the data are normally distributed, $\sigma$ is known: normal distribution, with $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$
- $n$ = 14, the data are normally distributed, $\sigma$ is unknown: t-distribution, with $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$
- $n$ = 34, the data are not normally distributed, $\sigma$ is unknown: normal distribution (large sample), with $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$
- $n$ = 1, the data are not normally distributed, $\sigma$ is unknown: cannot use the normal distribution or the t-distribution
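The cockroach interval can be re-derived from the five measurements as a quick check. This sketch is not from the original slides: the variable names are illustrative, and because Python's standard library has no t-distribution quantile function, t* = 2.776 is hard-coded from the table value quoted in the example (df = 4, 95% confidence).

```python
from math import sqrt
from statistics import mean, stdev

# Sugar amounts in the hindguts of the five roaches.
sugar = [55.95, 68.24, 52.73, 21.50, 23.78]
t_star = 2.776   # table value for df = 4 at 95% confidence (assumed from the example)

n = len(sugar)
xbar = mean(sugar)   # sample mean
s = stdev(sugar)     # sample standard deviation (divides by n - 1)
margin = t_star * s / sqrt(n)
low, high = xbar - margin, xbar + margin

print(f"xbar = {xbar:.2f}, s = {s:.3f}, CI = ({low:.2f}, {high:.2f})")
```

Note that `statistics.stdev` uses the $n - 1$ denominator, which matches the $n - 1$ degrees of freedom of the t-distribution.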
Some Cautions:
- The data MUST be an SRS from the population.
- The formula is not correct for more complex sampling designs, e.g., stratified sampling.
- There is no way to correct for bias in the data.
- Outliers can have a large effect on the confidence interval.
- You must know $\sigma$ to do a z-interval, which is unrealistic in practice.

Estimating a Population Proportion
When the variable of interest is categorical, the population parameter that we will infer about is a population proportion ($p$) associated with that variable. For example, if we are interested in studying opinions about the death penalty among U.S. adults, and thus our variable of interest is "death penalty (in favor/against)," we'll choose a sample of U.S. adults and use the collected data to make an inference about $p$, the proportion of U.S. adults who support the death penalty.

Suppose that we are interested in the opinions of U.S. adults regarding legalizing the use of marijuana. In particular, we are interested in the parameter $p$, the proportion of U.S. adults who believe marijuana should be legalized.
Suppose a poll of 1000 U.S. adults finds that 560 of them believe marijuana should be legalized.

If we wanted to estimate $p$, the population proportion, by a single number based on the sample, it would make intuitive sense to use the corresponding quantity in the sample, the sample proportion $\hat{p}$ = 560/1000 = 0.56. We say in this case that 0.56 is the point estimate for $p$, and that in general, we'll always use $\hat{p}$ as the point estimator for $p$.
Note, again, that when we talk about the specific value (0.56), we use the term estimate, and when we talk in general about the statistic $\hat{p}$ we use the term estimator.

[Figure: visual summary of this example]

Back to the example: Suppose a poll of 1000 U.S. adults finds that 560 of them believe marijuana should be legalized.
The CI for p
Thus, the confidence interval for $p$ is
$$\hat{p} \pm E = \hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
- For a 90% CI use z* = 1.645
- For a 95% CI use z* = 1.96
- For a 99% CI use z* = 2.576
Calculator: STAT, TESTS, A:1-PropZInt
$x$ is the number of successes: $x = n\hat{p}$

Conditions
The CI is reasonably accurate when three conditions are met:
- The sample was a simple random sample (SRS) from a binomial population.
- Both $n\hat{p} \ge 10$ and $n(1 - \hat{p}) \ge 10$.
- The size of the population is at least 10 times the size of the sample.

Example
Suppose you have a random sample of 40 buses from a large city and find that 24 buses have a safety violation. Find the 90% CI for the proportion of all buses that have a safety violation.

Conditions:
- SRS
- Both $n\hat{p} = 40 \cdot \frac{24}{40} = 24 \ge 10$ and $n(1 - \hat{p}) = 40\left(1 - \frac{24}{40}\right) = 16 \ge 10$
- The size of the population (all the buses) is at least 10 times the size of the sample (40)

90% CI
$\hat{p} = \frac{24}{40} = 0.6$
For a 90% CI, z* = 1.645
$$\hat{p} \pm E = \hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 0.6 \pm 1.645 \sqrt{\frac{0.6(1 - 0.6)}{40}} = 0.6 \pm 0.13 = (0.47, 0.73)$$

Interpretation
1. What is it that you are 90% sure is in the confidence interval?
The proportion of all of the buses in this population that have safety violations, if we could check them all.
2. What is the meaning (or interpretation) of the confidence interval of 0.47 to 0.73?
We are 90% confident that if we could check all of the buses in this population, between 47% and 73% of them would have safety violations.
3. What is the meaning of 90% confidence?
If we took 100 random samples of buses from this population and computed the 90% confidence interval from each sample, then we would expect about 90 of these intervals to contain the proportion of all buses in this population that have safety violations. In other words, we are using a method that captures the true population proportion 90% of the time.
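The bus example can be sketched in a few lines. The function names `proportion_interval` and `large_sample_ok` are hypothetical helpers, not calculator or library functions; the check covers only the two count conditions, since the SRS and population-size conditions cannot be verified from the counts alone.

```python
from math import sqrt

def large_sample_ok(successes, n):
    """Check n*p_hat >= 10 and n*(1 - p_hat) >= 10 using the raw counts."""
    return successes >= 10 and (n - successes) >= 10

def proportion_interval(successes, n, z_star):
    """Return (low, high) for p_hat +/- z* * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Bus example: 24 of 40 sampled buses have a violation, 90% confidence.
assert large_sample_ok(24, 40)
low, high = proportion_interval(24, 40, z_star=1.645)
print(f"90% CI for p: ({low:.2f}, {high:.2f})")
```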
Margin of Error and Sample Size
When we have some level of flexibility in determining the sample size, we can set a desired margin of error for estimating the population proportion and find the sample size that will achieve it.
For example, a final poll on the day before an election would want the margin of error to be quite small (with a high level of confidence) in order to be able to predict the election results with the most precision. This is particularly relevant when it is a close race between the candidates. The polling company needs to figure out how many eligible voters it needs to include in its sample in order to achieve that.
Let's see how we do that.

Margin of Error and Sample Size
The confidence interval for $p$ is
$$\hat{p} \pm E = \hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
Thus, the margin of error is
$$E = z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
Using some algebra we have
$$n = \left(\frac{z^*}{E}\right)^2 \hat{p}(1 - \hat{p})$$

If you have a good estimate $\hat{p}$ of $p$, use it in this formula; otherwise take the conservative approach by setting $\hat{p} = \frac{1}{2}$.
You have to decide on a level of confidence so you know what value of z* to use (the most common one is the 95% level).
Also, obviously, you have to set the margin of error (the most common one is 3%).

What sample size should we use for a survey if we want the margin of error to be at most 3%?
Let us use 95% confidence here, so z* = 1.96. Also, since we don't have an estimate of $p$, we will use $\hat{p} = 0.5$. Then
$$n = \left(\frac{z^*}{E}\right)^2 \hat{p}(1 - \hat{p}) = \left(\frac{1.96}{0.03}\right)^2 (0.5)(1 - 0.5) = 1067.111$$
Because you must have a sample size of at least 1067.111, round up to 1068. So $n$ should be at least 1068.

Summary: CI for
- a population proportion: $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
- a population mean, $\sigma$ is known, and normally distributed population or $n \ge 30$: $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$
- a population mean, $\sigma$ is unknown, and normally distributed population or $n \ge 30$: $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$
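The survey calculation can be sketched as follows. The function name `sample_size_for_proportion` and the alternative guess of 0.3 are hypothetical, used only to show why $\hat{p} = 0.5$ is called the conservative choice: $\hat{p}(1 - \hat{p})$ is largest at 0.5, so any other guess asks for a smaller sample.

```python
from math import ceil

def sample_size_for_proportion(z_star, margin, p_guess=0.5):
    """Smallest integer n with z* * sqrt(p(1 - p) / n) <= margin."""
    # p_guess = 0.5 maximizes p * (1 - p), giving the largest (safest) n.
    return ceil((z_star / margin) ** 2 * p_guess * (1 - p_guess))

n_conservative = sample_size_for_proportion(1.96, 0.03)             # no prior estimate
n_with_guess = sample_size_for_proportion(1.96, 0.03, p_guess=0.3)  # hypothetical prior estimate
print(n_conservative, n_with_guess)
```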