The Institute of Chartered Accountants of Sri Lanka
Postgraduate Diploma in Business and Finance
Quantitative Techniques for Business
Handout 02: Presentation and Analysis of Data

Presentation of Data

The Stem and Leaf Display

The stem and leaf plot is a device to group data while displaying most of the original data. Each score is considered to have two parts (i.e. a stem and a leaf). The leading digit of a score is called the stem, and the trailing digit is the leaf.

E.g. Construct the stem and leaf diagram for the following data.

13, 46, 87, 16, 91, 25, 44, 27, 22, 10, 76, 23, 65, 3, 35, 43, 59, 75, 56, 64, 28, 36, 47, 53, 70, 28, 49.

Once the data have been collected, they must be processed in some way so that important patterns become apparent.

Tabulation of Data

The process of placing classified data into tabular form is known as tabulation. A table is a systematic arrangement of statistical data in rows and columns. Rows are horizontal arrangements whereas columns are vertical arrangements. A table may be simple, double or complex depending upon the type of classification.

Types of Tabulation:

(1) Simple Tabulation or One-way Tabulation: When the data are tabulated according to one characteristic, it is said to be simple tabulation or one-way tabulation. For example: tabulation of data on the population of the world classified by one characteristic, such as religion, is an example of simple tabulation.
(2) Double Tabulation or Two-way Tabulation: When the data are tabulated according to two characteristics at a time, it is said to be double tabulation or two-way tabulation. For example: tabulation of data on the population of the world classified by two characteristics, such as religion and sex, is an example of double tabulation.

(3) Complex Tabulation: When the data are tabulated according to many characteristics, it is said to be complex tabulation. For example: tabulation of data on the population of the world classified by three characteristics, such as religion, sex and literacy, is an example of complex tabulation.

Frequency Distributions

A distribution is a collection, array or group of numerical values. A frequency distribution is a list of data classes or categories along with the number of values that fall into each.

Steps for constructing a frequency distribution

E.g. Marks obtained for mathematics by 50 students in a school are listed as follows:

61 42 48 64 54 65 56 51 52 70
37 51 58 42 58 48 59 56 56 59
53 62 54 49 56 43 57 33 40 76
62 53 68 40 38 63 56 65 54 57
55 62 52 56 68 51 45 73 55 65

Prepare a frequency distribution using equal class intervals 30-34, 35-39 and so on.

Relative Frequency

Relative frequencies are calculated by dividing the actual frequency for each class by the total number of observations being classified. Multiply the relative frequency by 100 to arrive at the percentage relative frequency.
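The tally and the relative frequencies for the 50 mathematics marks can be sketched in Python as below. This is only an illustration: the function name `frequency_distribution` and its parameters `start` and `width` are made up for this example and are not part of the handout.

```python
from collections import Counter

# The 50 mathematics marks listed in the example above.
marks = [
    61, 42, 48, 64, 54, 65, 56, 51, 52, 70,
    37, 51, 58, 42, 58, 48, 59, 56, 56, 59,
    53, 62, 54, 49, 56, 43, 57, 33, 40, 76,
    62, 53, 68, 40, 38, 63, 56, 65, 54, 57,
    55, 62, 52, 56, 68, 51, 45, 73, 55, 65,
]


def frequency_distribution(values, start, width):
    """Tally values into equal classes start..start+width-1, and so on.

    Hypothetical helper: assumes every value is an integer >= start.
    Returns a list of (lower limit, upper limit, frequency) tuples.
    """
    counts = Counter((v - start) // width for v in values)
    table = []
    for k in range(max(counts) + 1):
        lower = start + k * width
        upper = lower + width - 1
        table.append((lower, upper, counts.get(k, 0)))
    return table


table = frequency_distribution(marks, start=30, width=5)
total = sum(f for _, _, f in table)

for lower, upper, f in table:
    relative = f / total  # relative frequency = class frequency / total
    print(f"{lower}-{upper}: f={f:2d}  relative={relative:.2f}  "
          f"percentage={relative * 100:.0f}%")
```

The relative frequencies always add up to 1 (100%), which is a quick check that no observation was missed in the tally.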
Cumulative frequency distribution

A cumulative frequency distribution shows the total number of occurrences that lie above or below certain key values. There are two types of distributions:

1. Less than cumulative frequency distribution.
2. More than cumulative frequency distribution.

A) Diagrammatic representation

Bar diagram
Pie chart
Pictogram

B) Graphic representation

Histogram

A histogram is "a representation of a frequency distribution by means of rectangles whose widths represent class intervals and whose areas are proportional to the corresponding frequencies."

How Shall We Look at Histograms?

Part of the power of histograms is that they allow us to analyze extremely large datasets by reducing them to a single graph that can show primary, secondary and tertiary peaks in the data, as well as give a visual representation of the statistical significance of those peaks.

Frequency polygon

Frequency polygons are a graphical device for understanding the shapes of distributions. They serve the same purpose as histograms, but are especially helpful for comparing sets of data. Frequency polygons are also a good choice for displaying cumulative frequency distributions.

To create a frequency polygon, start just as for histograms, by choosing a class interval. Then draw an X-axis representing the values of the scores in your data. Mark the middle of each class interval with a tick mark, and label it with the middle value represented by the class. Draw the Y-axis to indicate the frequency of each class. Place a point in the middle of each class interval at the height corresponding to its frequency. Finally, connect the points.
You should include one class interval below the lowest value in your data and one above the highest value. The graph will then touch the X-axis on both sides.

Cumulative Frequency Curve (Ogive)

This is the graphical representation of a cumulative frequency distribution. There are two types of ogives, namely the less than ogive and the more than ogive.

In an ogive, data may be expressed using a single line. An ogive (a cumulative line graph) is best used when you want to display the total at any given time. The relative slopes from point to point indicate greater or lesser increases; for example, a steeper slope means a greater increase than a more gradual slope. An ogive, however, is not the ideal graphic for showing comparisons between categories, because it simply combines the values in each category, thus indicating an accumulation (a growing or lessening total).

Percentage ogive: a percentage ogive plots the cumulative frequencies expressed as percentages of the total number of observations.
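The points plotted on the less than, more than and percentage ogives are running totals of the class frequencies. The sketch below shows one way to compute them. The frequencies are an assumption: they were tallied by hand from the 50 mathematics marks given earlier in the handout, using classes 30-34, 35-39 and so on, and should be re-checked before being relied on.

```python
from itertools import accumulate

# Class limits and frequencies, tallied by hand from the 50 mathematics
# marks in the handout (assumption: re-check the tally before reuse).
lower_limits = [30, 35, 40, 45, 50, 55, 60, 65, 70, 75]
upper_limits = [34, 39, 44, 49, 54, 59, 64, 69, 74, 79]
freqs = [1, 2, 5, 4, 10, 14, 6, 5, 2, 1]
total = sum(freqs)

# Less than type: running total from the lowest class upward.
less_than = list(accumulate(freqs))

# More than type: running total from the highest class downward.
more_than = list(accumulate(reversed(freqs)))[::-1]

# Percentage ogive: cumulative frequency as a percentage of the total.
percent_less_than = [100 * cf / total for cf in less_than]

for upper, cf, pct in zip(upper_limits, less_than, percent_less_than):
    print(f"{upper} or less: {cf:2d}  ({pct:.0f}%)")

for lower, cf in zip(lower_limits, more_than):
    print(f"{lower} or more: {cf:2d}")
```

The last less than value and the first more than value both equal the total number of observations, which is a useful check on the arithmetic.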
Analysis of Data

Measure of Central Location

To investigate a set of quantitative data it is useful to define numerical measures that describe important features of the data. One of the important ways of describing a group of measurements, whether it be a sample or a population, is by the use of an average.

A single number that is used to represent a set of numbers is called an average. When the data are arranged in increasing or decreasing order of magnitude, the average value lies at the center or close to the center of the set. Hence the measurement of the average is known as the measurement of central tendency.

The most commonly used measures of central location are:

Mean
Median
Mode

Mean

The arithmetic mean is obtained by summing all of the observations and dividing by the number of observations.

For a set of observations $x_1, x_2, x_3, \ldots, x_n$:

$$\text{Arithmetic mean} = \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

If the observations $x_1, x_2, x_3, \ldots, x_n$ occur with frequencies $f_1, f_2, f_3, \ldots, f_n$, then

$$\text{Mean} = \bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n} = \frac{\sum f x}{\sum f}$$

Example:

1. The numbers of employees at five different stores are 3, 5, 6, 4, 6. Find the mean number of employees for the five stores.

$$\text{Mean} = \bar{x} = \frac{3 + 5 + 6 + 4 + 6}{5} = \frac{24}{5} = 4.8$$
2. Find the mean of the numbers 8, 6, 6, 5, 12, 9, 5, 8, 8, 8.

Tabulate the data as follows:

Number (x)    Frequency (f)    fx
5             2                10
6             2                12
8             4                32
9             1                9
12            1                12
Total         Σf = 10          Σfx = 75

$$\text{Mean} = \frac{\sum f x}{\sum f} = \frac{75}{10} = 7.5$$

Weighted Mean

In calculations of the simple mean, all items in a series are given equal importance. In practical life this may not be so. If some items in the distribution carry more importance than others, the simple mean is not a truly representative average. To obtain a representative average in such a case, a weight is assigned to each item or value according to its importance. Weights assigned to the various values are either estimated or arbitrarily fixed, and are not the actual frequencies as given in a frequency distribution.

The weighted mean is defined as the average obtained by multiplying the various values in a series by certain values known as weights, and then dividing the total of the products so obtained by the total weight.

Example:

1. An interview was conducted to identify a suitable candidate for the post of Accountant's Assistant. The marks obtained in the test are as follows: Mathematics 60, Economics 50, Accounts 34, Law 40. The weights assigned are 2, 3, 4, 1 respectively. Calculate the average marks of the candidate.

$$\text{Weighted mean} = \bar{x}_w = \frac{(2 \times 60) + (3 \times 50) + (4 \times 34) + (1 \times 40)}{2 + 3 + 4 + 1} = \frac{446}{10} = 44.6$$
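The three worked examples (simple mean, mean from a frequency table, weighted mean) can be checked with a few lines of Python. This is a minimal sketch; the function names `mean` and `weighted_mean` are chosen for this illustration only. Note that the mean from a frequency table is the same calculation as a weighted mean with the frequencies used as weights.

```python
def mean(values):
    """Arithmetic mean: sum of the observations / number of observations."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Sum of (weight x value) divided by the total weight."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Example 1: employees at five stores.
employees = [3, 5, 6, 4, 6]
mean_employees = mean(employees)

# Example 2: mean from a frequency table (frequencies act as weights).
numbers = [5, 6, 8, 9, 12]
freqs = [2, 2, 4, 1, 1]
mean_numbers = weighted_mean(numbers, freqs)

# Weighted mean example: Mathematics, Economics, Accounts, Law.
subject_marks = [60, 50, 34, 40]
subject_weights = [2, 3, 4, 1]
mean_marks = weighted_mean(subject_marks, subject_weights)

print(mean_employees, mean_numbers, mean_marks)
```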
Median

The median is defined as the value that has an equal number of observations on either side of it when the observations are arranged in ascending or descending order. This applies directly when there is an odd number of observations. When there is an even number of observations, the arithmetic mean of the two middle values can be taken as the median.

Example: Ages of elderly people in the home are 90, 87, 84, 78 and 63. The median is 84.

Median for the grouped data

Consider the frequency distribution (grouped data).

$$\text{Median} = l + \frac{\frac{n}{2} - \sum f_1}{f_m} \times c$$

Where:
$l$ : lower limit of the median class
$n$ : number of observations
$\sum f_1$ : summation of frequencies up to the median class
$f_m$ : frequency of the median class
$c$ : class width

Mode

The mode is the most common observation in the data. If there are two most common values then the distribution is said to be bimodal, and it has two separate peaks. The mode does not always exist.

Example: Find the mode of 9, 10, 5, 9, 7, 9, 6, 8, 10, 11, 9.

The mode is 9.

Mode for the grouped data

$$\text{Mode} = l + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times c$$

Where:
$l$ : lower limit of the modal class
$\Delta_1$ : frequency difference between the modal class and the class below it
$\Delta_2$ : frequency difference between the modal class and the class above it
$c$ : class width
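The grouped-data formulas for the median and the mode can be sketched in Python as follows. The function names and the grouped distribution used here are hypothetical, invented purely for illustration, and do not come from the handout. The ungrouped examples use the ages and the mode data given above.

```python
import statistics


def grouped_median(lower_limits, freqs, width):
    """Median = l + ((n/2 - cumulative freq before median class) / f_m) * c."""
    n = sum(freqs)
    cumulative = 0  # summation of frequencies before the current class
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= n / 2:  # first class reaching n/2 is the median class
            return lower + (n / 2 - cumulative) / f * width
        cumulative += f


def grouped_mode(lower_limits, freqs, width):
    """Mode = l + (d1 / (d1 + d2)) * c, using the class with the highest frequency.

    Assumes a single modal class whose frequency exceeds at least one
    neighbour (otherwise d1 + d2 would be zero).
    """
    m = freqs.index(max(freqs))
    below = freqs[m - 1] if m > 0 else 0
    above = freqs[m + 1] if m < len(freqs) - 1 else 0
    d1 = freqs[m] - below
    d2 = freqs[m] - above
    return lower_limits[m] + d1 / (d1 + d2) * width


# Hypothetical distribution (not from the handout): classes 0-10, 10-20, ...
lower_limits = [0, 10, 20, 30, 40]
freqs = [4, 6, 10, 7, 3]

median_grouped = grouped_median(lower_limits, freqs, width=10)
mode_grouped = grouped_mode(lower_limits, freqs, width=10)

# Ungrouped examples from the text.
median_ages = statistics.median([90, 87, 84, 78, 63])
mode_values = statistics.mode([9, 10, 5, 9, 7, 9, 6, 8, 10, 11, 9])

print(median_grouped, round(mode_grouped, 2), median_ages, mode_values)
```

In the hypothetical data the median class and the modal class are both 20-30, but in general they need not coincide, which is why the two functions locate their class separately.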
Measures of Dispersion

Range

The range is the difference between the maximum and minimum values in a data set.

Range = Maximum value - Minimum value

Example: The largest monthly return of an organization from January 1980 to March 2005 is 42.56 percent and the smallest is -29.73 percent. The range of returns is therefore 72.29 percent (42.56 percent - (-29.73 percent)).

Mean deviation

The deviation from the mean can be calculated as follows:

$$\text{Mean deviation} = \frac{\sum_{i=1}^{n} \lvert x_i - \bar{x} \rvert}{n}$$

In general, the mean deviation can be calculated as follows by considering the frequencies $f_i$:

$$\text{Mean deviation} = \frac{\sum f_i \lvert x_i - \bar{x} \rvert}{\sum f_i}$$

Standard deviation

The standard deviation measures the amount by which the values in a data collection differ from the mean.

$$\text{Standard deviation} = \sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N}}$$

Or, in the presence of frequencies:

$$\text{Standard deviation} = \sigma = \sqrt{\frac{\sum_{i=1}^{N} f_i (x_i - \bar{x})^2}{\sum f_i}}$$
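A minimal Python sketch of the range, mean deviation and standard deviation follows. It assumes the usual definitions: absolute deviations from the mean for the mean deviation, and division by the number of observations N for the standard deviation. The function names are chosen for this illustration only. The data are the five store employee counts used earlier in the handout and the two returns from the range example.

```python
from math import sqrt


def value_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)


def mean_deviation(values):
    """Average absolute deviation of the values from their mean."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)


def standard_deviation(values):
    """Square root of the average squared deviation from the mean (divides by N)."""
    m = sum(values) / len(values)
    return sqrt(sum((x - m) ** 2 for x in values) / len(values))


# Range example from the text: largest and smallest monthly returns (percent).
returns_range = value_range([42.56, -29.73])

# Employees at five stores (mean 4.8).
employees = [3, 5, 6, 4, 6]
employees_range = value_range(employees)
employees_md = mean_deviation(employees)
employees_sd = standard_deviation(employees)

print(round(returns_range, 2), employees_range,
      round(employees_md, 2), round(employees_sd, 4))
```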
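Variance

The variance is the average of the squared differences between the data values and the mean.

$$\text{Variance} = \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N}$$

Or

$$\text{Variance} = \sigma^2 = \frac{\sum_{i=1}^{N} f_i (x_i - \bar{x})^2}{\sum f_i}$$

Coefficient of Variation

$$\text{Coefficient of variation} = \frac{\sigma}{\bar{x}} \times 100\%$$

Skewness

Skewness is a measure of the degree of asymmetry of a distribution. If the left tail (the tail at the small end of the distribution) is more pronounced than the right tail (the tail at the large end of the distribution), the function is said to have negative skewness. If the reverse is true, it has positive skewness. If the two are equal, it has zero skewness.

The Pearson mode skewness is defined by

$$\frac{\text{mean} - \text{mode}}{\sigma}$$

or

$$\frac{3\,(\text{mean} - \text{median})}{\sigma}$$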
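The variance, coefficient of variation and skewness can be computed with Python's standard `statistics` module, as sketched below. The sketch assumes the common definitions: population variance (dividing by N), coefficient of variation as the standard deviation over the mean times 100, and Pearson's skewness coefficients (mean - mode)/σ and 3(mean - median)/σ. The data are the ten numbers from the frequency-table mean example earlier in the handout.

```python
import statistics

# Ten numbers from the earlier mean example (mean 7.5).
data = [8, 6, 6, 5, 12, 9, 5, 8, 8, 8]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Population variance and standard deviation (divide by N, not N - 1).
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)

# Coefficient of variation, expressed as a percentage of the mean.
cv_percent = std_dev / mean * 100

# Pearson's skewness coefficients (assumed definitions, see lead-in).
skew_mode = (mean - mode) / std_dev
skew_median = 3 * (mean - median) / std_dev

print(f"variance={variance:.2f}  sd={std_dev:.4f}  CV={cv_percent:.2f}%")
print(f"skewness (mode)={skew_mode:.4f}  skewness (median)={skew_median:.4f}")
```

Here the mean (7.5) lies below both the median and the mode (8), so both coefficients come out negative, indicating a slight negative skew.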
Exercises

1. A standard test was administered to 30 students to determine their IQ scores. These scores are recorded as follows.

Class interval (CI)      Frequency (f)
115 to less than 120     6
120 - 125                3
125 - 130                8
130 - 135                7
135 - 140                3
140 - 145                3

Construct:
1. A histogram
2. A frequency polygon
3. A less than cumulative frequency diagram (ogive)
4. A more than cumulative frequency diagram (ogive)

2. The following frequency distribution represents the number of days employees were absent during a year.

Number of days    Number of employees
0-2               5
3-5               10
6-8               20
9-11              10
12-14             5

1. Construct a cumulative frequency diagram for the above data.
2. How many employees were absent for less than 3 days during the year?
3. How many employees were absent for more than 8 days during the year?
4. Draw a frequency polygon for this data.
3. A large retailer is studying the lead time (the time between the receipt of the order and shipment of the merchandise) for a sample of 40 orders received in the previous month. The lead time in days is reported as follows:

Lead time (days)    Frequency
0-3                 6
4-7                 7
8-11                13
12-15               8
16-19               4
19-22               2
Total               40

a) Draw a frequency polygon for this data.
b) Draw a histogram for this data.
c) How many orders were delivered in less than 12 days?
d) Convert the frequency distribution into a less than cumulative frequency distribution.
e) About 65% of the orders were delivered in less than how many days?
f) How many orders had a lead time of 12 days or more?

5. The following are the scores for the mid-term exam given to 13 students in statistics.

42, 42, 68, 80, 75, 54, 62, 89, 72, 80, 80, 75, 65

Calculate:
a. Mean
b. Median
c. Mode

6. The following data represent the monthly salary (in Rs. '000) of seven employees of the factory.

26, 30, 26, 29, 28, 60, X

The mean salary of an employee was computed as Rs. 33,000. Compute X.

7. Because of a special sale on men's suits, a survey indicated that 24 suits were sold between 10:00 am and 11:00 am on the sale day at Macy's. These suits were of the following sizes:

42, 38, 42, 48, 42, 42, 45, 42
46, 43, 50, 38, 42, 43, 42, 36
39, 41, 42, 45, 39, 39, 49, 42
Compute:
a) The mean
b) The mode
c) The median

Which of these three measures of central tendency most accurately represents the average suit size?

8. The following figures represent the weights (in pounds) of 10 newborn babies at Flushing Hospital on a given day:

8, 6, 7, 7, 7, 5, 9, 7, 8, 6

a) Compute the mean, the mode and the median.
b) What is the range of the data?
c) Compute the standard deviation.
d) Compute the coefficient of variation.
e) Are these data skewed? If so, how?

9. An elevator is designed to carry a maximum load of 3,200 pounds. If 18 passengers are riding in the elevator with an average weight of 154 pounds, is there any danger that the elevator might be overloaded?

10. In a car assembly plant, the cars were diagnostically checked after assembly and before shipping them to the dealers. All cars with any defect were returned for extra work. The number of such defective cars returned each day over a 16-day period is given below:

30, 34, 10, 16, 28, 9, 22, 2, 6, 23, 25, 10, 15, 10, 8, 24

a) Find the average number of defective cars returned for extra work per day.
b) Find the median number of defective cars per day.
c) Find the mode for defective cars per day.
d) Compute the standard deviation.
11. Middle-aged persons were encouraged to have their blood pressure checked free of charge on a given day by the Nursing Department of our college. The systolic blood pressure of the first 20 persons is recorded as follows:

120, 118, 152, 160, 150, 134, 125, 145, 135, 139
160, 142, 139, 156, 135, 140, 126, 136, 148, 130

a) Compute the mean, the mode and the median systolic blood pressure.
b) What is the range of the data?
c) Compute the interquartile range.
d) Compute the coefficient of skewness.

12. The following distribution represents the period in months that Die Hard car batteries lasted before being replaced.

Class interval (CI)    Frequency (f)
45 and up to 50        10
50 and up to 55        14
55 and up to 60        13
60 and up to 65        11
65 and up to 70        2
Total                  50

a) Compute the average number of months that a battery lasted.
b) Compute the median of the data.
c) Compute the variance and the standard deviation of the data.
d) Compute the interquartile range.
e) Compute the coefficient of variation.
f) Compute the coefficient of skewness.
13. Professor Alexander has 50 students in his Statistics class. In the final exam, the marks obtained by his students range from a low of 20 to a high of 98. He arranged the scores into 4 groups with a class interval of 20 points. The data are presented in a frequency distribution as follows:

Class interval (CI)      Frequency (f)
20 and less than 40      5
40 and less than 60      15
60 and less than 80      23
80 and less than 100     7
Total                    50

a) Find the average score in the final exam.
b) If Professor Alexander fails every student who gets a score of 40 or less, can we find, from the distribution as given, the percentage of students that failed the exam?
c) Compute the variance and the standard deviation of this data.
d) Find the median score from this data.