Review: Observational study vs. experiment. Experimental designs: In an experiment, there is one variable of primary interest, and several other factors may affect the measured result. Blocking: create homogeneous blocks so that other factors are held constant and only the factor of interest varies. A randomized block design ensures each treatment group has an equal proportion of subjects from each block.
Test the effects of sleep deprivation on memory recall. What other factors may influence your results? Working hours? Majors? Five groups (1-hour loss; 3-hour loss; 5-hour loss; 7-hour loss; no loss). 1st night: 9 hr sleep; 2nd night: sleep loss; 3rd night: 9 hr sleep.
1. Parameter 2. Qualitative 3. Interval 5. Simple random sample
Section 2.1: Frequency Distributions and Their Graphs
Frequency Distribution A table that shows classes of data with a count of the number of entries in each class. The frequency, f, of a class is the number of data entries in the class. Each class has a lower class limit and an upper class limit. Class width: 6 − 1 = 5 (the difference between consecutive lower class limits).
Constructing a Frequency Distribution
1. Decide on the number of classes. Usually between 5 and 20; otherwise, it may be difficult to detect any patterns.
2. Find the class width. Determine the range of the data, divide the range by the number of classes, and round up to the next convenient number. Make sure every data point belongs to exactly one class!
3. Find the class limits. You can use the minimum data entry as the lower limit of the first class. Find the remaining lower limits by adding the class width to the lower limit of the preceding class. Then find the upper limit of the first class (remember that classes cannot overlap) and the remaining upper class limits.
4. Make a tally mark for each data entry in the row of the appropriate class.
5. Count the tally marks to find the frequency f for each class.
A frequency distribution can have many correct versions.
Example: Constructing a Frequency Distribution The following sample data set lists the prices (in dollars) of 30 portable global positioning system (GPS) navigators. Construct a frequency distribution that has seven classes. 90 130 400 200 350 70 325 250 150 250 275 270 150 130 59 200 160 450 300 130 220 100 200 400 200 250 95 180 170 150
Solution: Constructing a Frequency Distribution 1. Number of classes = 7 (given). 2. Find the class width: $\text{class width} = \frac{\text{max} - \text{min}}{\text{number of classes}} = \frac{450 - 59}{7} = \frac{391}{7} \approx 55.86$. Round up to 56.
Solution: Constructing a Frequency Distribution
3. Use the minimum value, 59, as the first lower limit. Add the class width of 56 to get the lower limit of the next class, and repeat to find the remaining lower limits. The upper limit of the first class is one less than the lower limit of the second class: 115 − 1 = 114. Add the class width of 56 to each upper limit to get the upper limit of the next class, and repeat to find the remaining upper limits.
Lower limit        Upper limit
59                 114
59 + 56 = 115      114 + 56 = 170
115 + 56 = 171     170 + 56 = 226
171 + 56 = 227     226 + 56 = 282
227 + 56 = 283     282 + 56 = 338
283 + 56 = 339     338 + 56 = 394
339 + 56 = 395     394 + 56 = 450
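The class width and class limits above can be reproduced with a short Python sketch. The variable names are illustrative, and `math.ceil` stands in for "round up to the next convenient number"; it would not round up if the quotient were already a whole number, so treat it as an assumption that fits this data set.

```python
import math

# Prices (in dollars) of the 30 GPS navigators from the example.
prices = [90, 130, 400, 200, 350, 70, 325, 250, 150, 250,
          275, 270, 150, 130, 59, 200, 160, 450, 300, 130,
          220, 100, 200, 400, 200, 250, 95, 180, 170, 150]
num_classes = 7

# Class width: range divided by the number of classes, rounded up.
data_range = max(prices) - min(prices)              # 450 - 59 = 391
class_width = math.ceil(data_range / num_classes)   # 391 / 7 is about 55.86

# Lower limits start at the minimum and step by the class width.
lower_limits = [min(prices) + i * class_width for i in range(num_classes)]
# Each upper limit is one less than the next class's lower limit.
upper_limits = [lower + class_width - 1 for lower in lower_limits]

for lower, upper in zip(lower_limits, upper_limits):
    print(f"{lower}-{upper}")
```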
Solution: Constructing a Frequency Distribution
4. Make a tally mark for each data entry in the row of the appropriate class.
5. Count the tally marks to find the total frequency f for each class.
Class      Tally       Frequency, f
59-114     |||||       5
115-170    ||||||||    8
171-226    ||||||      6
227-282    |||||       5
283-338    ||          2
339-394    |           1
395-450    |||         3
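Steps 4 and 5 amount to counting how many entries fall between each pair of class limits. A minimal Python sketch, assuming the seven classes found in step 3 (the names `classes` and `frequencies` are illustrative):

```python
# Prices (in dollars) of the 30 GPS navigators from the example.
prices = [90, 130, 400, 200, 350, 70, 325, 250, 150, 250,
          275, 270, 150, 130, 59, 200, 160, 450, 300, 130,
          220, 100, 200, 400, 200, 250, 95, 180, 170, 150]

# (lower limit, upper limit) for each of the seven classes.
classes = [(59, 114), (115, 170), (171, 226), (227, 282),
           (283, 338), (339, 394), (395, 450)]

# Tally: count the entries that lie within each class, limits included.
frequencies = [sum(lower <= p <= upper for p in prices)
               for lower, upper in classes]

for (lower, upper), f in zip(classes, frequencies):
    print(f"{lower}-{upper}: {f}")
```

Because the classes do not overlap and cover the whole range, the frequencies should add up to the sample size.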
Features of a Frequency Distribution: Midpoint Midpoint of a class: $\text{midpoint} = \frac{(\text{lower class limit}) + (\text{upper class limit})}{2}$. For the first three classes: $\frac{59 + 114}{2} = 86.5$, $\frac{115 + 170}{2} = 142.5$, $\frac{171 + 226}{2} = 198.5$.
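The same midpoint formula can be applied to all seven classes at once. A small Python sketch, assuming the class limits from the GPS example (the list name `classes` is illustrative):

```python
# (lower limit, upper limit) for each class in the GPS example.
classes = [(59, 114), (115, 170), (171, 226), (227, 282),
           (283, 338), (339, 394), (395, 450)]

# Midpoint = (lower class limit + upper class limit) / 2.
midpoints = [(lower + upper) / 2 for lower, upper in classes]
print(midpoints)
```

Consecutive midpoints differ by the class width, which is a quick way to check the arithmetic.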
Features of a Frequency Distribution: Relative Frequency Relative frequency of a class: the portion or percentage of the data that falls in a particular class. $\text{relative frequency} = \frac{\text{class frequency}}{\text{sample size}} = \frac{f}{n}$. For the first three classes: $\frac{5}{30} \approx 0.17$, $\frac{8}{30} \approx 0.27$, $\frac{6}{30} = 0.2$.
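A short Python sketch of the same calculation for every class. The frequencies are the counts obtained by tallying the 30 GPS prices into the seven classes; rounding to two decimals is a presentation choice made here.

```python
# Class frequencies tallied from the 30 GPS prices.
frequencies = [5, 8, 6, 5, 2, 1, 3]
n = sum(frequencies)  # sample size

# Relative frequency = class frequency / sample size.
relative = [f / n for f in frequencies]
rounded = [round(r, 2) for r in relative]
print(rounded)
```

The unrounded relative frequencies sum to exactly 1; the rounded values may miss 1 by a small rounding error.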
Features of a Frequency Distribution: Cumulative Frequency Cumulative frequency of a class: the sum of the frequencies of that class and all previous classes. The cumulative frequency of the last class is equal to the sample size. For the first three classes: 5, 5 + 8 = 13, 13 + 6 = 19.
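The running sum can be sketched in Python as a simple loop. The frequencies are those tallied from the 30 GPS prices; the variable names are illustrative.

```python
# Class frequencies tallied from the 30 GPS prices.
frequencies = [5, 8, 6, 5, 2, 1, 3]

# Add each class frequency to the total of all previous classes.
cumulative = []
running_total = 0
for f in frequencies:
    running_total += f
    cumulative.append(running_total)

print(cumulative)
```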
Expanded Frequency Distribution for Prices (in dollars) of GPS Navigators Column totals: $\sum f = 30$, $\sum \frac{f}{n} \approx 1$
Graphs of Frequency Distribution Frequency histogram Frequency polygon Relative Frequency Histogram Cumulative Frequency Graph (Ogive)
Graphs of Frequency Distributions Frequency Histogram: A bar graph that represents the frequency distribution. The horizontal scale (x-axis) is quantitative and measures the data values. The vertical scale (y-axis) measures the frequencies of the classes. No spaces between bars.
Class Boundaries The numbers that separate classes without forming gaps between them. The distance from the upper limit of the first class to the lower limit of the second class is 115 − 114 = 1. Half this distance is 0.5. First class lower boundary = 59 − 0.5 = 58.5. First class upper boundary = 114 + 0.5 = 114.5.
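Extending the same rule to every class gives the full set of boundaries. A minimal Python sketch, assuming the seven GPS classes (the names are illustrative):

```python
# (lower limit, upper limit) for each class in the GPS example.
classes = [(59, 114), (115, 170), (171, 226), (227, 282),
           (283, 338), (339, 394), (395, 450)]

# Gap between one class's upper limit and the next class's lower limit.
gap = classes[1][0] - classes[0][1]   # 115 - 114 = 1
half_gap = gap / 2                    # 0.5

# Subtract half the gap from lower limits, add it to upper limits.
boundaries = [(lower - half_gap, upper + half_gap)
              for lower, upper in classes]

for lower_b, upper_b in boundaries:
    print(f"{lower_b}-{upper_b}")
```

Each class's upper boundary coincides with the next class's lower boundary, which is why histogram bars drawn at the boundaries touch.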
Example: Frequency Histogram Construct a frequency histogram for the global positioning system (GPS) navigators.
Solution: Frequency Histogram You can mark the horizontal scale either at the midpoints or at the class boundaries. From either histogram, you can see that about two-thirds of the GPS navigators are priced below $226.50.
Graphs of Frequency Distributions Frequency Polygon: A line graph that emphasizes the continuous change in frequencies. Horizontal scale: class midpoints. Vertical scale: frequency. The graph should begin and end on the horizontal axis, so extend the left side to one class width before the first class midpoint and extend the right side to one class width after the last class midpoint.
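The points of a frequency polygon can be listed before plotting. A Python sketch for the GPS example, assuming its class midpoints, tallied frequencies, and class width of 56 (the names are illustrative):

```python
# Midpoints and frequencies of the seven GPS classes.
midpoints = [86.5, 142.5, 198.5, 254.5, 310.5, 366.5, 422.5]
frequencies = [5, 8, 6, 5, 2, 1, 3]
class_width = 56

# Extend one class width to each side with frequency 0 so the
# polygon begins and ends on the horizontal axis.
start = (midpoints[0] - class_width, 0)
end = (midpoints[-1] + class_width, 0)
points = [start] + list(zip(midpoints, frequencies)) + [end]

for x, y in points:
    print(x, y)
```

Connecting these points in order with line segments gives the polygon.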
Example: Frequency Polygon
Solution: Frequency Polygon You can see that the frequency of GPS navigators increases up to $142.50 and then decreases.
Graphs of Frequency Distributions Relative Frequency Histogram: Has the same shape and the same horizontal scale as the corresponding frequency histogram. The vertical scale measures the relative frequencies, not frequencies.
Expanded Frequency Distribution for Prices (in dollars) of GPS Navigators 90 130 400 200 350 70 325 250 150 250 275 270 150 130 59 200 160 450 300 130 220 100 200 400 200 250 95 180 170 150
Solution: Relative Frequency Histogram From this graph you can see that 20% of GPS navigators are priced between $170.50 and $226.50.
Graphs of Frequency Distributions Cumulative Frequency Graph or Ogive: A line graph that displays the cumulative frequency of each class at its upper class boundary. The upper boundaries are marked on the horizontal axis. The cumulative frequencies are marked on the vertical axis.
Constructing an Ogive 1. Specify the horizontal and vertical scales. 2. Plot points that represent the upper class boundaries and their corresponding cumulative frequencies. 3. Connect the points in order from left to right. 4. The graph should start at the lower boundary of the first class (cumulative frequency is zero) and should end at the upper boundary of the last class (cumulative frequency is equal to the sample size).
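The four steps can be sketched in Python for the GPS example. The boundaries and frequencies come from that example; the search for the steepest segment is an extra check added here, not part of the construction steps.

```python
from itertools import accumulate

# Upper class boundaries and frequencies for the GPS example.
upper_boundaries = [114.5, 170.5, 226.5, 282.5, 338.5, 394.5, 450.5]
frequencies = [5, 8, 6, 5, 2, 1, 3]
first_lower_boundary = 58.5

# Step 2: pair each upper boundary with its cumulative frequency.
cumulative = list(accumulate(frequencies))
# Step 4: start at the first lower boundary with cumulative frequency 0.
ogive_points = [(first_lower_boundary, 0)] + list(zip(upper_boundaries, cumulative))

# Extra check: the segment with the largest rise in cumulative frequency.
segments = list(zip(ogive_points, ogive_points[1:]))
(x0, y0), (x1, y1) = max(segments, key=lambda seg: seg[1][1] - seg[0][1])
steepest = (x0, x1)

print(ogive_points)
print("Greatest increase between", steepest)
```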
Construct an ogive for the GPS navigators frequency distribution. From the ogive, you can see that about 25 of the GPS navigators cost $300 or less. The greatest increase occurs between $114.50 and $170.50.
Example: The ages of the 50 most powerful women in the world in 2012: 26, 31, 35, 37, 43, 43, 43, 44, 45, 47, 48, 48, 49, 50, 51, 51, 51, 51, 52, 54, 54, 54, 54, 55, 55, 55, 56, 57, 57, 57, 58, 58, 58, 58, 59, 59, 59, 62, 62, 63, 64, 65, 65, 65, 66, 66, 67, 67, 72, 86. Group them into 7 classes. Then construct an expanded frequency distribution, a relative frequency histogram, a frequency polygon, and an ogive.
Section 2.2: More Graphs and Displays Use stem-and-leaf plots and dot plots for quantitative data. Use pie charts and Pareto charts for qualitative data. Use scatter plots and time series charts for paired data sets.
Graphing Quantitative Data Sets Stem-and-leaf plot: Each number is separated into a stem and a leaf. Still contains the original data values. Similar to a histogram. Data: 21, 25, 25, 26, 27, 28, 30, 36, 36, 45. For example, 26 has stem 2 and leaf 6.
2 | 1 5 5 6 7 8
3 | 0 6 6
4 | 5
Example: Constructing a Stem-and-Leaf Plot The following are the numbers of text messages sent last month by the cellular phone users on one floor of a college dormitory. Display the data in a stem-and-leaf plot. 155 159 144 129 105 145 126 116 130 114 122 112 112 142 126 118 118 108 122 121 109 140 126 119 113 117 118 109 109 119 139 139 122 78 133 126 123 145 121 134 124 119 132 133 124 129 112 126 148 147
Solution: Constructing a Stem-and-Leaf Plot 155 159 144 129 105 145 126 116 130 114 122 112 112 142 126 118 118 108 122 121 109 140 126 119 113 117 118 109 109 119 139 139 122 78 133 126 123 145 121 134 124 119 132 133 124 129 112 126 148 147 The data entries go from a low of 78 to a high of 159. Use the rightmost digit as the leaf. For instance, 78 = 7 | 8 and 159 = 15 | 9. List the stems, 7 to 15, to the left of a vertical line. For each data entry, list a leaf to the right of its stem.
Solution: Constructing a Stem-and-Leaf Plot Don't forget to include a key to identify the data entries. From the display, you can identify outliers, which are unusual data entries. You can conclude that more than 50% of the cellular phone users sent between 110 and 130 text messages.
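An ordered stem-and-leaf display for the text message data can be produced with a short Python sketch. It assumes the 50-entry version of the data set, splits each value with `divmod(value, 10)`, and prints a key line whose wording is chosen here for illustration.

```python
from collections import defaultdict

# Numbers of text messages sent last month (50 entries).
messages = [155, 159, 144, 129, 105, 145, 126, 116, 130, 114,
            122, 112, 112, 142, 126, 118, 118, 108, 122, 121,
            109, 140, 126, 119, 113, 117, 118, 109, 109, 119,
            139, 139, 122, 78, 133, 126, 123, 145, 121, 134,
            124, 119, 132, 133, 124, 129, 112, 126, 148, 147]

# The rightmost digit is the leaf; the remaining digits are the stem.
stems = defaultdict(list)
for value in sorted(messages):
    stem, leaf = divmod(value, 10)
    stems[stem].append(leaf)

print("Key: 15 | 5 = 155")
# List every stem from lowest to highest, including empty ones.
for stem in range(min(stems), max(stems) + 1):
    leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
    print(f"{stem:>2} | {leaves}")
```

Stems 8 and 9 print with no leaves, which makes the isolated entry 78 easy to spot.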
Graphing Quantitative Data Sets Dot plot: Each data entry is plotted above a horizontal scale. The number of points above a number represents its frequency. Data: 21, 25, 25, 26, 27, 28, 30, 36, 36, 45 (Dot plot: horizontal scale from 20 to 45.)
Solution: Constructing a Dot Plot 155 159 144 129 105 145 126 116 130 114 122 112 112 142 126 118 118 108 122 121 109 140 126 119 113 117 118 109 109 119 139 139 122 78 133 126 123 145 121 134 124 119 132 133 124 129 112 126 148 147 From the dot plot, you can see that most values cluster between 105 and 148 and the value that occurs the most is 126. You can also see that 78 is an unusual data value.
Example: Constructing a dot plot 15 shoppers were asked how many bags they were carrying back to their car: 0, 1, 2, 1, 3, 0, 2, 1, 3, 0, 0, 4, 1, 0, 3
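One way to check a hand-drawn dot plot for the shopping bag data is a text version built in Python. This is a sketch: it uses `*` for each dot and assumes single-digit values so that the columns line up.

```python
from collections import Counter

# Number of bags carried by each of the 15 shoppers.
bags = [0, 1, 2, 1, 3, 0, 2, 1, 3, 0, 0, 4, 1, 0, 3]
counts = Counter(bags)  # frequency of each value

# Build the plot from the top row down; a value gets a dot in a row
# when its frequency reaches that height.
scale = range(min(bags), max(bags) + 1)
rows = []
for height in range(max(counts.values()), 0, -1):
    rows.append(" ".join("*" if counts[x] >= height else " " for x in scale))
rows.append(" ".join(str(x) for x in scale))  # horizontal scale

print("\n".join(rows))
```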
Graphing Qualitative Data Sets Pie Chart: A circle is divided into sectors that represent categories. The area of each sector is proportional to the frequency of each category. Central angle of each category: relative frequency × 360°.
Example: Constructing a Pie Chart The numbers of earned degrees conferred (in thousands) in 2007 are shown in the table. Use a pie chart to organize the data. (Source: U.S. National Center for Educational Statistics)
Solution: Constructing a Pie Chart 1. Find the relative frequency (percent) of each category. For example, $\frac{728}{3007} \approx 0.24$.
Solution: Constructing a Pie Chart 2. Find the central angle that corresponds to each category. To find the central angle, multiply 360° by the category's relative frequency. For example, the central angle for associate's degrees is 360°(0.24) ≈ 86°.
Solution: Constructing a Pie Chart Central angles: 360°(0.24) ≈ 86°, 360°(0.51) ≈ 184°, 360°(0.20) = 72°, 360°(0.03) ≈ 11°.
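The central angle arithmetic can be checked with a few lines of Python. This sketch uses only the numbers that appear on the slides (the count 728 out of 3007 and the four listed relative frequencies); the variable names are illustrative, and no category labels are assumed beyond what the slides state.

```python
# Relative frequency for the one category whose counts are shown.
count = 728
total = 3007
relative_frequency = round(count / total, 2)

# Relative frequencies listed on the slide, in the order shown.
relative_frequencies = [0.24, 0.51, 0.20, 0.03]

# Central angle = relative frequency x 360 degrees, to the nearest degree.
central_angles = [round(360 * r) for r in relative_frequencies]

print(relative_frequency)
print(central_angles)
```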
Solution: Constructing a Pie Chart From the pie chart, you can see that almost one-half of the degrees conferred in 2007 were bachelor's degrees.
Graphing Qualitative Data Sets Pareto Chart: A vertical bar graph in which the height of each bar represents frequency or relative frequency. The bars are positioned in order of decreasing height, with the tallest bar positioned at the left. The horizontal axis shows the categories and the vertical axis shows the frequency.
Constructing a Pareto Chart Main Causes of Inventory Shrinkage: From the graph, it is easy to see that the causes of inventory shrinkage that should be addressed first are employee theft and shoplifting.
Graphing Paired Data Sets Paired Data Sets: Each entry in one data set corresponds to one entry in a second data set. Graph using a scatter plot: the ordered pairs are graphed as points in a coordinate plane. Used to show the relationship between two quantitative variables. (Example plot axes: temperature (°C) and elevation (m).)
Interpreting a Scatter Plot Fisher's Iris data set describes various physical characteristics, such as petal length and petal width (in millimeters), for three species of iris. The petal lengths form the first data set and the petal widths form the second data set. (Source: Fisher, R. A., 1936) As the petal length increases, the petal width also tends to increase. This is correlation (Chapter 9), not causation.
Graphing Paired Data Sets Time Series: A data set composed of quantitative entries taken at regular intervals over a period of time.
Example: Constructing a Time Series Chart The table lists the number of cellular telephone subscribers (in millions) for the years 1998 through 2008. Construct a time series chart for the number of cellular subscribers. (Source: Cellular Telecommunication & Internet Association)
Solution: Constructing a Time Series Chart The graph shows that the number of subscribers has been increasing since 1998, with greater increases recently.
What kind of plots have you learned? Homework: 1.2 (22) Homework 1 due on 2/1.