Types of data

Numerical: Data with quantity
  Discrete: Whole number answers
    Example: How many siblings do you have?
  Continuous: Answers can fall anywhere between two whole numbers. Usually any type of measurement (weight, distance, time, etc.)
    Example: Height of trees in a forest.

Categorical: Data with descriptive attributes
  Nominal: Descriptive data that has no order associated.
    Example: What colour is your hair?
  Ordinal: Descriptive data with an implied order. Responses are usually ranked.
    Example: How often do you use SnapChat? Never - Sometimes - Quite frequently

Frequency table - Discrete data

The shoe sizes of 20 people are listed below:
  5 11 8 8 10 12 6 8 11 8 7 10 11 9 11 8 7 10 6 5

The score column (x) represents what the data is: in this case, shoe size.
The frequency column (f) represents the number of times each score appears.
The cumulative frequency column (cf) represents the running total of the frequencies. Add the previous frequencies to the current frequency.
The percentage column represents what percent of the total each score accounts for.

  Shoe size (x)   Frequency (f)   Cumulative frequency (cf)   Percentage (%)
  5               2               2                           10
  6               2               2 + 2 = 4                   10
  7               2               4 + 2 = 6                   10
  8               5               6 + 5 = 11                  25
  9               1               11 + 1 = 12                 5
  10              3               12 + 3 = 15                 15
  11              4               15 + 4 = 19                 20
  12              1               19 + 1 = 20                 5
  TOTAL           20                                          100

  TOTAL of the frequency column: Sum of the frequencies
  TOTAL of the percentage column: Total of the percentage column (100)
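The frequency table for the shoe sizes can also be built by a short program. The following is a minimal Python sketch, not part of the original notes; the function name `frequency_table` is an assumption made for illustration. It counts each score, keeps a running total for the cumulative frequency, and works out each score's percentage of the total.

```python
from collections import Counter

# Shoe sizes of the 20 people, as listed in the notes.
shoe_sizes = [5, 11, 8, 8, 10, 12, 6, 8, 11, 8,
              7, 10, 11, 9, 11, 8, 7, 10, 6, 5]


def frequency_table(scores):
    """Return rows of (score, f, cf, percentage), sorted by score."""
    counts = Counter(scores)      # how many times each score appears
    total = len(scores)           # sum of the frequencies
    rows = []
    running_total = 0             # cumulative frequency so far
    for score in sorted(counts):
        f = counts[score]
        running_total += f        # add previous frequencies to the current one
        rows.append((score, f, running_total, 100 * f / total))
    return rows


table = frequency_table(shoe_sizes)
print("x   f   cf   %")
for x, f, cf, pct in table:
    print(f"{x:<3} {f:<3} {cf:<4} {pct:g}")
```

The last cumulative frequency should always equal the number of data values, which is a quick check that no score was missed.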
Frequency table - Continuous data

The Class represents the individual groups of data.
The midpoint (x) represents the middle value of each group.
The frequency column (f) represents the number of times each score appears.
(xf): Multiply the midpoint by the frequency.
The cumulative frequency column (cf) represents the running total of the frequencies. Add the previous frequencies to the current frequency.

Columns: Class | Mid-point (x) | Frequency (f) | (xf) | Cumulative frequency (cf)

  0 - <10     5
  10 - <20    2
  20 - <30    10
  30 - <40    3
  40 - <50    1
  TOTAL

Σf: Sum of the frequencies. E.g. amount of data.
Note: Σxf: Sum of the xf column. E.g. sum of all data.

Bar chart: Used for categorical data
  Bars represent the frequency for each category
  Gaps between each bar
  Frequency scale: remember to title the axis
  Categories: remember to title the axis
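The columns of a grouped frequency table can be worked out step by step. The following is a minimal Python sketch, not part of the original notes. The class boundaries come from the table, but the frequencies used here are hypothetical values chosen only for illustration, and the function name `grouped_table` is an assumption.

```python
# Class boundaries as in the notes: 0 - <10, 10 - <20, ...
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]

# HYPOTHETICAL frequencies, made up for this illustration only.
frequencies = [4, 6, 5, 3, 2]


def grouped_table(classes, frequencies):
    """Return rows of (low, high, midpoint x, f, xf, cf) for each class."""
    rows = []
    cf = 0                                # running total of the frequencies
    for (low, high), f in zip(classes, frequencies):
        midpoint = (low + high) / 2       # middle value of the class
        cf += f
        rows.append((low, high, midpoint, f, midpoint * f, cf))
    return rows


rows = grouped_table(classes, frequencies)
sum_f = sum(row[3] for row in rows)       # sum of f: amount of data
sum_xf = sum(row[4] for row in rows)      # sum of the xf column

for low, high, x, f, xf, cf in rows:
    print(f"{low} - <{high}: x = {x:g}, f = {f}, xf = {xf:g}, cf = {cf}")
print("Sum of f:", sum_f, " Sum of xf:", sum_xf)
```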
Histogram: Used for numerical data
  Bars represent the frequency for each score
  NO gaps between each bar
  Frequency scale: remember to title the axis
  Score: remember to title the axis

Cumulative frequency curve (ogive)
A cumulative frequency is a running total of the results. The curve (ogive) should always be sloping up to the right or be horizontal between two values (dots).
  Diagram labels: Axes are titled; Curve always slopes up to the right; Position of median; Median value

The median can easily be found on a cumulative frequency curve.
  1) Add one to the greatest cumulative frequency value and halve the result. (position of median = (n + 1) / 2, where n is the greatest cumulative frequency value)
  2) Draw a horizontal line from the value you obtained, starting at the vertical axis (cumulative frequency), until you touch the ogive.
  3) Draw a vertical line down to the horizontal axis. This is the value of the median.
  Example: From the diagram, the median value is 61.

Stem and leaf plot:
  Title
  Key: What the stem and leaf represent. In this example the stem is tens and the leaf is units.
  Stem: In this example it represents tens.
  Leaf: All the numbers sorted in tens (according to the key). On the leaf side there should only be one digit per entry.
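The steps for reading the median from a cumulative frequency curve can be mirrored on a cumulative frequency table. The following is a minimal Python sketch, not part of the original notes; it uses the shoe-size scores and cumulative frequencies from the discrete table, and the function names are assumptions. Instead of drawing lines, it finds the first score whose cumulative frequency reaches the median position.

```python
import math

# Scores and cumulative frequencies from the shoe-size table.
scores = [5, 6, 7, 8, 9, 10, 11, 12]
cumulative = [2, 4, 6, 11, 12, 15, 19, 20]


def score_at_position(scores, cumulative, position):
    """Return the score of the data value at a given position (1-based)."""
    for score, cf in zip(scores, cumulative):
        if cf >= position:        # first row whose running total reaches it
            return score
    raise ValueError("position is beyond the last data value")


def median_from_cumulative(scores, cumulative):
    """Median position is (n + 1) / 2, where n is the greatest cf value."""
    n = cumulative[-1]
    position = (n + 1) / 2
    # For an even n the position ends in .5, so average the two neighbours.
    lower = score_at_position(scores, cumulative, math.floor(position))
    upper = score_at_position(scores, cumulative, math.ceil(position))
    return (lower + upper) / 2


print("Median shoe size:", median_from_cumulative(scores, cumulative))
```

For the shoe sizes the position is (20 + 1) / 2 = 10.5, and both the 10th and 11th values are size 8.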
Dot plot:
  Dots represent frequencies
  Scale
  Title

Mean: The average of the data values
  Example: 0 1 1 3 4 4 4 7 8 9
  Mean = sum of the data values / number of data values = 41 / 10 = 4.1

Median: The middle value of a data set
  Data needs to be ordered from lowest to highest to find the median.
  Example 1 (odd number of data values): 2 5 6 7 8 10 10 12 15
    The median is the 5th piece of data.
    Median = 8
  Example 2 (even number of data values): 0 1 1 3 5 9 11 15 17 20
    The median falls between the 5th and 6th pieces of data.
    Median = (5 + 9) / 2 = 7
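The mean and median examples can be checked with a short program. The following is a minimal Python sketch, not part of the original notes; the function names `mean` and `median` are written here for illustration and follow the rules given above (sort first, then take the middle value, or the average of the two middle values).

```python
def mean(data):
    """Sum of the data values divided by the number of data values."""
    return sum(data) / len(data)


def median(data):
    """Middle value of the ordered data (average of two middles if even)."""
    ordered = sorted(data)            # data must be ordered lowest to highest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                    # odd: a single middle value
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2   # even: average


mean_example = [0, 1, 1, 3, 4, 4, 4, 7, 8, 9]
odd_example = [2, 5, 6, 7, 8, 10, 10, 12, 15]
even_example = [0, 1, 1, 3, 5, 9, 11, 15, 17, 20]

print("Mean:", mean(mean_example))
print("Median (odd):", median(odd_example))
print("Median (even):", median(even_example))
```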
Mode: The most frequently occurring value
  Example: 0 1 1 3 4 4 4 7 8 9
  Mode = 4

5-Figure Summary
The 5-figure summary includes the minimum, first quartile, median, third quartile and maximum value.
Data needs to be sorted from lowest value to greatest value.
  Example: 5 5 6 6 7 7 8 8 8 9 9 10 10 10 11 11 11 11
    Minimum = 5      Lowest value
    Q1 = 7           Median of the lower half
    Median = 8.5     Middle value of the data
    Q3 = 10          Median of the upper half
    Maximum = 11     Greatest value

Range and Interquartile range (IQR):
  Range: Measures the spread of the data
  Interquartile range: Measures the spread of the middle 50% of the data
  Example: 1 15 16 16 18 19 19 20 21 35
    Minimum = 1, Q1 = 16, Median = 18.5, Q3 = 20, Maximum = 35
    Range = Maximum - Minimum = 35 - 1 = 34
      i.e. The data has a spread of 34, ranging from 1 to 35.
    IQR = Q3 - Q1 = 20 - 16 = 4
      i.e. The middle 50% of the data has a spread of 4, ranging from 16 to 20.
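The 5-figure summary, range and IQR can be worked out with a short program. The following is a minimal Python sketch, not part of the original notes; the function name `five_figure_summary` is an assumption. It uses the "median of each half" method described above. For an odd number of data values it leaves the middle value out of both halves, which is one common convention and an assumption here, since both examples in the notes have an even number of values.

```python
from statistics import median


def five_figure_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) using median-of-halves."""
    ordered = sorted(data)            # sort from lowest to greatest value
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]       # values below the median position
    upper_half = ordered[n - half:]   # values above the median position
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])


first = [5, 5, 6, 6, 7, 7, 8, 8, 8, 9, 9, 10, 10, 10, 11, 11, 11, 11]
second = [1, 15, 16, 16, 18, 19, 19, 20, 21, 35]

print("Summary 1:", five_figure_summary(first))

minimum, q1, med, q3, maximum = five_figure_summary(second)
data_range = maximum - minimum        # spread of all the data
iqr = q3 - q1                         # spread of the middle 50% of the data
print("Summary 2:", (minimum, q1, med, q3, maximum))
print("Range:", data_range, " IQR:", iqr)
```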
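Outliers:
  Extreme value: A data value that is not representative of the majority of the data.
  Example: 1 15 16 16 18 19 19 20 21 35
    Q1 = 16, Median = 18.5, Q3 = 20, IQR = 20 - 16 = 4
    Lower fence = Q1 - 1.5 × IQR = 16 - 1.5 × 4 = 10
    Upper fence = Q3 + 1.5 × IQR = 20 + 1.5 × 4 = 26
    Any value below the lower fence or above the upper fence is an outlier. Here 1 and 35 are outliers.

Box plots:
  Diagram labels: Minimum; Q1; Median; Q3; Maximum and outlier; Scale; Maximum value not including any outliers

Centre and spread:
  The following differ in centre (look at the median): [diagram]
  The following differ in spread (look at the range and IQR): [diagram]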
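The outlier check can be written as a short program. The following is a minimal Python sketch, not part of the original notes; it assumes the commonly used 1.5 × IQR rule for the fences, and the function name `find_outliers` is an assumption. The quartiles Q1 = 16 and Q3 = 20 are the ones found for this data set in the IQR example.

```python
def find_outliers(data, q1, q3):
    """Return (lower_fence, upper_fence, outliers) using 1.5 x IQR fences."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr      # assumed rule: Q1 - 1.5 x IQR
    upper_fence = q3 + 1.5 * iqr      # assumed rule: Q3 + 1.5 x IQR
    outliers = [value for value in data
                if value < lower_fence or value > upper_fence]
    return lower_fence, upper_fence, outliers


data = [1, 15, 16, 16, 18, 19, 19, 20, 21, 35]
lower_fence, upper_fence, outliers = find_outliers(data, q1=16, q3=20)

print("Lower fence:", lower_fence)
print("Upper fence:", upper_fence)
print("Outliers:", outliers)
```

On a box plot, the whisker would then stop at the greatest value that is not an outlier (21 for this data), with 35 marked separately.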
Symmetry and Skew:
  Both of the following are symmetric (look for an even bell curve): [diagram]
  The following are skewed data: [diagram]