Department of Quantitative Methods & Information Systems Business Statistics: Chapter 2 Graphs, Charts, and Tables Describing Your Data QMIS 120 Dr. Mohammad Zainal
Chapter Goals. After completing this chapter, you should be able to: construct a frequency distribution both manually and with a computer; construct and interpret a histogram; create and interpret bar charts, pie charts, and stem-and-leaf diagrams; present and interpret data in line charts and scatter diagrams. QMIS 120, by Dr. M. Zainal Chap 2-2
Raw data. Ages (in years) of 20 students selected from CBA are reported in the order in which they were collected. The data values are recorded in the following table. Ages of 20 Students: 21 19 24 18 20 19 30 22 24 23 20 21 22 25 23 19 20 18 24 25
Raw data. The same students were asked about their status. The responses of the sample are recorded in the following table. Status of 20 Students: J F S F J F S J S S F J J S J F J F S S
Frequency Distributions. What is a frequency distribution? A frequency distribution is a list or a table containing the values of a variable (or a set of ranges within which the data fall) and the corresponding frequencies with which each value occurs (or with which data fall within each range).
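As a minimal sketch of this definition, the status responses from the raw-data table can be tallied with Python's standard `collections.Counter`. The variable names are illustrative only and do not come from the slides.

```python
from collections import Counter

# Status responses of the 20 students, copied from the raw-data table.
statuses = "J F S F J F S J S S F J J S J F J F S S".split()

# Counter tallies how many times each distinct value occurs,
# which is exactly a frequency distribution for categorical data.
status_freq = Counter(statuses)

for status, count in sorted(status_freq.items()):
    print(f"{status}: {count}")
print("Total:", sum(status_freq.values()))
```

The total of the frequencies should always equal the number of observations, which is a quick check on any hand-built table.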
Frequency Distributions: Weekly Earnings of 100 Employees of a Company
Weekly Earnings (dollars)   Number of Employees (f)
401 to 600                  9
601 to 800                  22
801 to 1000                 39
1001 to 1200                15
1201 to 1400                9
1401 to 1600                6
Why Use Frequency Distributions? A frequency distribution is a way to summarize data. The distribution condenses the raw data into a more useful form and allows for a quick visual interpretation of the data.
Frequency Distribution: Discrete Data. Discrete data: possible values are countable. Example: An advertiser asks 200 customers how many days per week they read the daily newspaper. Raw data: 5,6,1,2,4,5,7,2,3,5,1,3,2,5,0,2,2,0,7,7,1,2,4,3,5,6,7,1,1,1,1,2,5,0,0,0,1,20,7,5,3,6,2,1,6,2,1,4,2,4,5,3,1,0,2,3,6,5,7,4,1,2,3,5,6,1,0, 0,0,0,0,1,1,1,2,3,5,1,4..
Frequency Distribution: Discrete Data. This is called the single-value approach.
Number of days read   Frequency
0                     44
1                     24
2                     18
3                     16
4                     20
5                     22
6                     26
7                     30
Total                 200
Relative Frequency: What proportion is in each category?
Number of days read   Frequency   Relative Frequency
0                     44          .22
1                     24          .12
2                     18          .09
3                     16          .08
4                     20          .10
5                     22          .11
6                     26          .13
7                     30          .15
Total                 200         1.00
44/200 = .22: 22% of the people in the sample report that they read the newspaper 0 days per week.
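The relative frequencies can be reproduced by dividing each frequency by the total. The following is a small sketch using the newspaper readership frequencies; the dictionary and variable names are illustrative.

```python
# Frequencies from the newspaper readership table (days read -> count).
days_freq = {0: 44, 1: 24, 2: 18, 3: 16, 4: 20, 5: 22, 6: 26, 7: 30}

total = sum(days_freq.values())

# Relative frequency = class frequency / total number of observations.
rel_freq = {days: count / total for days, count in days_freq.items()}

for days, count in days_freq.items():
    print(f"{days}  {count:3d}  {rel_freq[days]:.2f}")
print(f"Total {total}  {sum(rel_freq.values()):.2f}")
```

Relative frequencies always sum to 1, apart from rounding in the displayed values.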
Frequency Distribution: Discrete Data. Example: Construct a frequency distribution table for the following data.
Home Runs Hit by Major League Baseball Teams During the 2002 Season
Team                 Home Runs   Team                 Home Runs
Anaheim              152         Milwaukee            139
Arizona              165         Minnesota            167
Atlanta              164         Montreal             162
Baltimore            165         New York Mets        160
Boston               177         New York Yankees     223
Chicago Cubs         200         Oakland              205
Chicago White Sox    217         Philadelphia         165
Cincinnati           169         Pittsburgh           142
Cleveland            192         St. Louis            175
Colorado             152         San Diego            136
Detroit              124         San Francisco        198
Florida              146         Seattle              152
Houston              167         Tampa Bay            133
Kansas City          140         Texas                230
Los Angeles          155         Toronto              187
Frequency Distribution: Continuous Data. Continuous data: may take on any value in some interval. Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature: 24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27. Temperature is a continuous variable because it could be measured to any degree of precision desired.
Grouping Data by Classes
Frequency Distribution Example. Data from low to high: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
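A grouped frequency distribution for these temperatures might be computed as in the sketch below. The class limits are an assumption made for illustration (classes of width 10 starting at 10, each including its lower endpoint and excluding its upper endpoint), since the slide's own table is not reproduced here. The function name `class_frequencies` is hypothetical.

```python
temps = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
         32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

CLASS_WIDTH = 10   # assumed class width
FIRST_LOWER = 10   # assumed lower limit of the first class


def class_frequencies(values, first_lower, width):
    """Count how many values fall in each class [lower, lower + width)."""
    freq = {}
    for v in values:
        # Lower limit of the class that contains v.
        lower = first_lower + ((v - first_lower) // width) * width
        freq[lower] = freq.get(lower, 0) + 1
    return dict(sorted(freq.items()))


grouped = class_frequencies(temps, FIRST_LOWER, CLASS_WIDTH)
for lower, count in grouped.items():
    print(f"{lower} but under {lower + CLASS_WIDTH}: {count}")
```

Because each value belongs to exactly one class, the class frequencies add up to the 20 observations.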
Frequency Histograms. The classes or intervals are shown on the horizontal axis, and frequency is measured on the vertical axis. Bars of the appropriate heights can be used to represent the number of observations within each class. Such a graph is called a histogram.
Frequency Histogram Example. Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58. [Figure: histogram of the temperature data, with classes on the horizontal axis and frequency on the vertical axis; annotations mark the class midpoints and class endpoints.] There are no gaps between bars, since the data are continuous.
Questions for Grouping Data into Classes. 1. How wide should each interval be? (How many classes should be used?) 2. How should the endpoints of the intervals be determined? These questions are often answered by trial and error, subject to user judgment. The goal is to create a distribution that is neither too "jagged" nor too "blocky", so that it appropriately shows the pattern of variation in the data.
How Many Class Intervals? Many (narrow class intervals): may yield a very jagged distribution with gaps from empty classes, which can give a poor indication of how frequency varies across classes. [Figure: histogram of temperature with many narrow classes; vertical axis: frequency; x-axis labels are upper class endpoints.] Few (wide class intervals): may compress variation too much and yield a blocky distribution, which can obscure important patterns of variation. [Figure: histogram of temperature with few wide classes; vertical axis: frequency; x-axis labels are upper class endpoints.]
General Guidelines
Number of Data Points   Number of Classes
under 50                5 to 7
50 to 100               6 to 10
100 to 250              7 to 12
over 250                10 to 20
Class widths can typically be reduced as the number of observations increases. Distributions with numerous observations are more likely to be smooth and have gaps filled, since data are plentiful.
Class Width. The class width is the distance between the lowest possible value and the highest possible value for a frequency class. The class width is W = (Largest Value - Smallest Value) / Number of Classes.
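As a worked instance of the class width formula, the sketch below applies it to the 20 daily high temperatures from the continuous-data example. The choice of 5 classes follows the guideline for fewer than 50 data points and is an assumption here, as is rounding the result up to a whole number for convenience.

```python
import math

temps = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
         32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

num_classes = 5  # assumed; the guideline suggests 5 to 7 for under 50 points

# W = (Largest Value - Smallest Value) / Number of Classes
raw_width = (max(temps) - min(temps)) / num_classes

# Round up so that the classes are guaranteed to cover the full range.
width = math.ceil(raw_width)

print(f"Raw width: {raw_width:.1f}, rounded up to: {width}")
```

Rounding up rather than to the nearest value matters: a width rounded down could leave the largest observation outside the last class.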
Histograms in Excel. 1. Select the Data tab. 2. Choose Data Analysis. 3. Choose Histogram. 4. Input the data and bin ranges, and select Chart Output.
Ogives. An ogive is a graph of the cumulative relative frequencies from a relative frequency distribution. Ogives are sometimes shown in the same graph as a relative frequency histogram.
Ogives (continued). Data: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
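The points plotted on an ogive are the cumulative percentages at each upper class endpoint. The sketch below computes them for the temperature data, assuming classes of width 10 with upper endpoints 20, 30, 40, 50, and 60 (an illustrative choice, not taken from the slide).

```python
from itertools import accumulate

temps = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
         32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

class_upper = [20, 30, 40, 50, 60]  # assumed upper class endpoints
width = 10                          # assumed class width

# Frequency of each class [upper - width, upper).
freq = [sum(1 for t in temps if upper - width <= t < upper)
        for upper in class_upper]

# Running total of frequencies, then converted to percentages.
cum_freq = list(accumulate(freq))
cum_pct = [100 * c / len(temps) for c in cum_freq]

for upper, pct in zip(class_upper, cum_pct):
    print(f"under {upper}: {pct:.0f}%")
```

The final cumulative percentage is always 100%, because every observation lies below the last upper endpoint.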
Ogive Example. [Figure: histogram of the temperature data with the ogive overlaid; left vertical axis: frequency; right vertical axis: cumulative frequency (%), from 0 to 100; horizontal axis: classes, with class midpoints and class endpoints marked.]
Ogives in Excel. Excel will show the ogive graphically if the Cumulative Percentage option is selected in the Histogram dialog box.
Other Graphical Presentation Tools. Categorical data: bar charts and pie charts. Quantitative data: stem-and-leaf diagrams.
Bar and Pie Charts. Bar charts and pie charts are often used for qualitative (category) data. The height of the bar or the size of the pie slice shows the frequency or percentage for each category.
Bar Chart Example 1: Investor's Portfolio. [Figure: horizontal bar chart with one bar each for Savings, CD, Bonds, and Stocks; horizontal axis: amount in $1000's, from 0 to 50.] (Note that bar charts can also be displayed with vertical bars.)
Bar Chart Example 2: Newspaper readership per week
Number of days read   Frequency
0                     44
1                     24
2                     18
3                     16
4                     20
5                     22
6                     26
7                     30
Total                 200
[Figure: vertical bar chart; horizontal axis: number of days newspaper is read per week (0 to 7); vertical axis: frequency (0 to 50).]
Pie Chart Example: Current Investment Portfolio
Investment Type   Amount (in thousands $)   Percentage
Stocks            46.5                      42.27
Bonds             32.0                      29.09
CD                15.5                      14.09
Savings           16.0                      14.55
Total             110                       100
[Figure: pie chart with slices Stocks 42%, Bonds 29%, Savings 15%, CD 14%.] The variables are qualitative. Percentages in the chart are rounded to the nearest percent.
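The percentage column can be checked by dividing each amount by the portfolio total. The sketch below also converts each share to a slice angle out of 360 degrees, which is how a pie slice would be sized if drawn by hand; the names used are illustrative.

```python
# Investment amounts in thousands of dollars, from the table above.
portfolio = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}

total = sum(portfolio.values())

# Percentage share of each investment type.
pct = {name: 100 * amount / total for name, amount in portfolio.items()}

# Slice angle: the same share applied to a full circle of 360 degrees.
angle = {name: 360 * amount / total for name, amount in portfolio.items()}

for name in portfolio:
    print(f"{name:8s} {portfolio[name]:5.1f}  {pct[name]:6.2f}%  "
          f"{angle[name]:6.1f} degrees")
```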
Tabulating and Graphing Multivariate Categorical Data. Investment in thousands of dollars:
Investment Category   Investor A   Investor B   Investor C   Total
Stocks                46.5         55           27.5         129
Bonds                 32.0         44           19.0         95
CD                    15.5         20           13.5         49
Savings               16.0         28           7.0          51
Total                 110.0        147          67.0         324
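The row and column totals of this table can be verified with a short sketch. The nested dictionary layout and the single-letter investor keys are illustrative choices, not part of the slides.

```python
# Investment in thousands of dollars: category -> investor -> amount.
investments = {
    "Stocks":  {"A": 46.5, "B": 55.0, "C": 27.5},
    "Bonds":   {"A": 32.0, "B": 44.0, "C": 19.0},
    "CD":      {"A": 15.5, "B": 20.0, "C": 13.5},
    "Savings": {"A": 16.0, "B": 28.0, "C": 7.0},
}

# Row totals: total invested in each category across investors.
row_totals = {cat: sum(by_investor.values())
              for cat, by_investor in investments.items()}

# Column totals: total invested by each investor across categories.
col_totals = {inv: sum(investments[cat][inv] for cat in investments)
              for inv in ("A", "B", "C")}

grand_total = sum(row_totals.values())

print(row_totals)
print(col_totals)
print("Grand total:", grand_total)
```

The grand total can be reached by summing either margin, which is a useful consistency check on any multivariate table.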
Tabulating and Graphing Multivariate Categorical Data (continued). Side-by-side charts: Comparing Investors. [Figure: horizontal side-by-side bar chart of Savings, CD, Bonds, and Stocks, with one bar each for Investor A, Investor B, and Investor C; horizontal axis from 0 to 60.]
Side-by-Side Chart Example. Sales by quarter for three sales territories:
Territory   1st Qtr   2nd Qtr   3rd Qtr   4th Qtr
East        20.4      27.4      59        20.4
West        30.6      38.6      34.6      31.6
North       45.9      46.9      45        43.9
[Figure: side-by-side bar chart of East, West, and North sales for each quarter; vertical axis from 0 to 60.]
Dot Plot. One of the simplest methods for graphing and understanding quantitative data is to create a dot plot. A horizontal axis shows the range of values for the observations. Each data point is represented by a dot placed above the axis.
Dot Plot. Dot plots can help us detect outliers (also called extreme values) in a data set. Outliers are the values that are extremely large or extremely small with respect to the rest of the data values.
Dot Plot. Example: The following table lists the number of runs batted in (RBIs) during the 2004 Major League Baseball playoffs by members of the Boston Red Sox team with at least one at-bat. Create a dot plot for these data.
Dot Plot. Step 1. Draw a horizontal number line that includes the minimum and the maximum values in the data set. Step 2. For each RBI value listed in the table, place a dot above that value on the number line.
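The two steps can be sketched as a text-based dot plot. Because the RBI table is not reproduced here, the sketch substitutes the ages of the 20 students from the raw-data slide; that substitution, and the function name `dot_plot`, are assumptions made purely for illustration.

```python
from collections import Counter

# Stand-in data: ages of the 20 students from the raw-data table
# (the RBI values themselves are not available in this text).
ages = [21, 19, 24, 18, 20, 19, 30, 22, 24, 23,
        20, 21, 22, 25, 23, 19, 20, 18, 24, 25]


def dot_plot(values):
    """Return a text dot plot: one dot per observation above its value."""
    counts = Counter(values)
    # Step 1: a number line from the minimum to the maximum value.
    axis = list(range(min(values), max(values) + 1))
    tallest = max(counts.values())
    rows = []
    # Step 2: stack one dot above the axis for each occurrence of a value.
    for level in range(tallest, 0, -1):
        row = " ".join(" ." if counts[v] >= level else "  " for v in axis)
        rows.append(row.rstrip())
    rows.append(" ".join(f"{v:2d}" for v in axis))
    return "\n".join(rows)


plot = dot_plot(ages)
print(plot)
```

In the printed plot the single dot at 30 sits well apart from the cluster between 18 and 25, which is the kind of possible outlier a dot plot makes easy to spot.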
Stem and Leaf Diagram. Another simple way to see distribution details from quantitative data. Method: 1. Separate the sorted data series into leading digits (the stem) and trailing digits (the leaves). 2. List all stems in a column from low to high. 3. For each stem, list all associated leaves.
Example: Data sorted from low to high: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58. Here, use the tens digit for the stem unit: 12 is shown as stem 1, leaf 2; 35 is shown as stem 3, leaf 5.
Example: Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Completed stem-and-leaf diagram:
Stem | Leaves
1    | 2 3 7
2    | 1 4 4 6 7 7
3    | 0 2 5 7 8
4    | 1 3 4 6
5    | 3 8
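The three-step method can be sketched in a few lines. This example uses the temperature data as recorded in the continuous-data example (which contains two readings of 27) and the tens digit as the stem; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

temps = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
         32, 35, 37, 38, 41, 43, 44, 46, 53, 58]


def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units)."""
    stems = defaultdict(list)
    for v in sorted(values):
        # Step 1: split each value into leading and trailing digits.
        stem, leaf = divmod(v, 10)
        stems[stem].append(leaf)
    return dict(stems)


diagram = stem_and_leaf(temps)

# Steps 2 and 3: list stems from low to high, each with its leaves.
for stem in sorted(diagram):
    print(f"{stem} | {' '.join(str(leaf) for leaf in diagram[stem])}")
```

Unlike a histogram, the diagram keeps every original value recoverable: stem 4 with leaf 6 is the observation 46.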
Using other stem units. Using the hundreds digit as the stem, round off to the tens digit to form the leaves: 613 would become stem 6, leaf 1; 776 would become stem 7, leaf 8; ... 1224 becomes stem 12, leaf 2.
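A minimal sketch of this rounding rule follows, assuming non-negative integer data and rounding halves upward. The helper name `hundreds_stem` is hypothetical.

```python
def hundreds_stem(value):
    """Split a non-negative integer into (stem, leaf) with a hundreds stem.

    The value is first rounded to the nearest ten (halves round up),
    then the tens digit becomes the leaf and the rest becomes the stem.
    """
    rounded_tens = (value + 5) // 10   # e.g. 776 -> 78 (i.e. 780)
    return divmod(rounded_tens, 10)    # e.g. 78 -> (7, 8)


for v in (613, 776, 1224):
    stem, leaf = hundreds_stem(v)
    print(f"{v} would become stem {stem}, leaf {leaf}")
```

Integer arithmetic is used instead of Python's built-in `round`, which rounds halves to the nearest even digit and would therefore treat values such as 625 differently.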
Line Charts and Scatter Diagrams. Line charts show values of one variable vs. time; time is traditionally shown on the horizontal axis. Scatter diagrams show points for bivariate data: one variable is measured on the vertical axis and the other variable is measured on the horizontal axis. A trend line is a line that provides an approximation of the relationship between the two variables.
Line Chart Example: U.S. Inflation Rate
Year   Inflation Rate (%)
1985   3.56
1986   1.86
1987   3.65
1988   4.14
1989   4.82
1990   5.40
1991   4.21
1992   3.01
1993   2.99
1994   2.56
1995   2.83
1996   2.95
1997   2.29
1998   1.56
1999   2.21
2000   3.36
2001   2.85
2002   1.59
2003   2.27
2004   2.68
2005   3.39
2006   3.24
[Figure: line chart of the U.S. inflation rate; horizontal axis: year (1984 to 2006); vertical axis: inflation rate (%), from 0 to 6.]
Scatter Diagram Example
Volume per day   Cost per day
23               125
26               140
29               146
33               160
38               167
42               170
50               188
55               195
60               200
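One common way to obtain a trend line for such a scatter diagram is an ordinary least-squares fit. The slides do not say which method is intended, so the sketch below is an assumption; the function name `fit_trend_line` is illustrative.

```python
volume = [23, 26, 29, 33, 38, 42, 50, 55, 60]          # volume per day
cost = [125, 140, 146, 160, 167, 170, 188, 195, 200]   # cost per day


def fit_trend_line(x, y):
    """Least-squares line y = intercept + slope * x (assumed method)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_trend_line(volume, cost)
print(f"cost = {intercept:.2f} + {slope:.2f} * volume")
```

A positive slope here corresponds to the upward pattern visible in the table: days with higher volume also have higher cost.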
Types of Relationships. Linear relationships. [Figure: two scatter diagrams of Y against X showing linear relationships.]
Types of Relationships (continued). Curvilinear relationships. [Figure: two scatter diagrams of Y against X showing curvilinear relationships.]
Types of Relationships (continued). No relationship. [Figure: two scatter diagrams of Y against X showing no relationship.]
Scatter Diagram Example: Draw a scatter diagram for the following data, which list the total amount spent in KD by customers in a restaurant.
x (persons)   1   1   2    2    3    3    4    4    5    5
y (KD)        8   7   14   18   20   22   21   26   29   33
Cross-tabulation. A cross tab is a tabular summary of data for two variables, usually presented in a matrix format. Unlike a frequency distribution, which describes one variable, a contingency table describes the distribution of two or more variables simultaneously. Each cell shows the number of respondents that gave a specific combination of responses. It can be used with any level of data (what are they?).
Cross-tabulation example. Example: In a survey of the quality rating and the meal price conducted by a consumer restaurant review agency, the following table was produced:
Restaurant   Quality Rating   Meal Price
1            Good             18
2            Very Good        22
3            Good             28
4            Excellent        38
5            Very Good        33
6            Good             28
...
Cross-tabulation example. Quality rating is a qualitative variable with the rating categories good, very good, and excellent.
Cross-tabulation example. We can also find the row percentages.
Cross-tabulation example. Dividing the totals in the right margin of the cross tab by the grand total provides the relative and percentage frequency distributions for the quality rating variable.
Cross-tabulation example. Try it for the meal price (column totals).
Cross-tabulation example. Example: The following data are for 30 observations involving two qualitative variables, x (A, B, and C) and y (1 and 2).
Obs.  x  y     Obs.  x  y     Obs.  x  y
1     A  1     11    A  1     21    C  2
2     B  1     12    B  1     22    B  1
3     B  1     13    C  2     23    C  2
4     C  2     14    C  2     24    A  1
5     B  1     15    C  2     25    B  1
6     C  2     16    B  2     26    C  2
7     B  1     17    C  1     27    C  2
8     C  2     18    B  1     28    A  1
9     A  1     19    C  1     29    B  1
10    B  1     20    B  1     30    B  2
1. Construct a cross tabulation for the data. 2. Calculate the row percentages.
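One way to work this exercise is sketched below: each observation is encoded as its x value followed by its y value, the pairs are tallied into cells, and each cell is divided by its row total. The encoding and variable names are illustrative choices.

```python
from collections import Counter

# The 30 observations in order, each written as x followed by y.
observations = (
    "A1 B1 B1 C2 B1 C2 B1 C2 A1 B1 "
    "A1 B1 C2 C2 C2 B2 C1 B1 C1 B1 "
    "C2 B1 C2 A1 B1 C2 C2 A1 B1 B2"
).split()

pairs = [(obs[0], int(obs[1])) for obs in observations]

# 1. Cross tabulation: count each (x, y) combination.
cells = Counter(pairs)

# 2. Row percentages: each cell as a share of its row (x) total.
row_pct = {}
for x in "ABC":
    row_total = cells[(x, 1)] + cells[(x, 2)]
    row_pct[x] = {y: 100 * cells[(x, y)] / row_total for y in (1, 2)}
    print(f"{x}: y=1 {cells[(x, 1)]:2d} ({row_pct[x][1]:.1f}%), "
          f"y=2 {cells[(x, 2)]:2d} ({row_pct[x][2]:.1f}%), "
          f"total {row_total}")
```

The row percentages make the pattern easy to read: category A occurs only with y = 1, B mostly with y = 1, and C mostly with y = 2.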
Chapter Summary. Data in raw form are usually not easy to use for decision making; some type of organization is needed, such as a table or a graph. Techniques reviewed in this chapter: frequency distributions, histograms, and ogives; bar charts and pie charts; stem-and-leaf diagrams; line charts and scatter diagrams.
Copyright. The materials of this presentation were mostly taken from the PowerPoint files accompanying Business Statistics: A Decision-Making Approach, 7e, 2008, Prentice-Hall, Inc.