Statistics is the study of the collection, organization, analysis, interpretation and presentation of data.

What is Data?
Data is a collection of facts, such as values or measurements. It can be numbers, words, measurements, observations or even just descriptions of things.

Qualitative vs Quantitative
Qualitative data is descriptive information (it describes something: colour, shape, etc.).
Quantitative data is numerical information (numbers).

Quantitative data can also be discrete or continuous:
- Discrete data can only take certain values (like whole numbers), for example the number of pupils in a class.
- Continuous data can take any value (within a range), for example a person's height.

What do we know about the dog?
Qualitative:
- He is brown and white
- He has short hair
- He has lots of energy
Quantitative:
- Discrete: he has 4 legs; he has 2 brothers
- Continuous: he weighs 12.5 kg; he is 80 cm tall

Put simply: Discrete data is counted. Continuous data is measured.

When we need to analyse data, they must first be collected and organised in a table. Data can be categorical or numerical.
- Examples of categorical data: colour, kind of music, foods, our favourite subject.
- Numerical data can be discrete or continuous.
- Examples of discrete numerical data: shoe size, number of brothers and sisters.
- Examples of continuous numerical data: height, measures of length.

The main terms used in statistics
Population: the set that is made up of all the elements that we want to study.
Census: when you collect data for every member of the group (the whole "population").

Sample: a part of the population that we study. Taking a sample means collecting data just for selected members of the group.
A census is accurate, but hard to do. A sample is not as accurate, but may be good enough, and is a lot easier.

Example: there are 120 people in your local football club. You can ask everyone (all 120) what their age is. That is a census. Or you could just ask the people that are there this afternoon. That is a sample.
[Figure: the population (120 people) and a sample taken from it]

Variable: the quality we want to study in the sample. Remember that variables can be quantitative or qualitative.
- Qualitative variable: your friend's favourite colour.
- Quantitative variable: discrete (your friend's number of siblings) or continuous (the dog's weight).
Individual: each element of the population or sample.
Size: the number of elements in the sample or population.

Organise and analyse data
When we need to analyse data, they must be collected and organised in a table. It is advisable to follow these steps:
1. Collect data.
2. Organise data and display them in a frequency table.
3. Draw a graph.

Example: 30 students have been asked how many brothers and sisters there are in their families. These are the answers:
1. Collect data: 1, 2, 1, 3, 6, 3, 2, 1, 1, 1, 2, 2, 3, 2, 3, 2, 2, 4, 2, 3, 3, 2, 2, 3, 4, 2, 2, 3, 1 and 2.
2. Organise data into a frequency table:

Number of B or S   Tally                  Frequency
1                  II II II               6
2                  II II II II II II I    13
3                  II II II II            8
4                  II                     2
5                                         0
6                  I                      1
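
The tallying step can be sketched in Python. This is only an illustration of the counting idea, using `collections.Counter` from the standard library; the variable names are made up for the example.

```python
from collections import Counter

# Answers listed in the example above (number of brothers and sisters).
answers = [1, 2, 1, 3, 6, 3, 2, 1, 1, 1, 2, 2, 3, 2, 3,
           2, 2, 4, 2, 3, 3, 2, 2, 3, 4, 2, 2, 3, 1, 2]

counts = Counter(answers)

# One row per possible value, including values nobody gave (frequency 0).
frequency_table = {value: counts.get(value, 0)
                   for value in range(min(answers), max(answers) + 1)}

for value, frequency in frequency_table.items():
    # Print the value, one tally mark per answer, and the frequency.
    print(f"{value}: {'I' * frequency} ({frequency})")
```

The frequencies should add up to the number of answers collected, which is a quick way to check a tally.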

Frequency is how often something occurs. We count things or different situations (variables) and say how often they occur. By counting frequencies we can make a Frequency Distribution table.
Example: we count the number of books pupils from 1º ESO read during the summer, and we get the tally and total:

Data: 1, 0, 1, 3, 4, 2, 2, 3, 2, 2, 0, 0, 1, 0, 2, 2, 3, 1, 1, 0, 2, 1, 2, 1

Books   Tally        Total
0       IIII I       5
1       IIII III     7
2       IIII IIII    8
3       III          3
4       I            1

From the table some conclusions are drawn:
- Only one of the pupils from 1º ESO has read 4 books.
- The most common numbers of books read are 2 (8 pupils) and 1 (7 pupils).

Each count in the books distribution is called an ABSOLUTE FREQUENCY: the number of times a value occurs. It is represented by the symbol $f_i$.
The sum of the absolute frequencies of all the values is equal to the total number of data:
$f_1 + f_2 + f_3 + \dots = N$
The RELATIVE FREQUENCY is the quotient of the absolute frequency by the total number of data, $h_i = f_i / N$. Its symbol is $h_i$.
The sum of the relative frequencies of all the values is always equal to 1:
$h_1 + h_2 + h_3 + \dots = 1$

The frequencies and the data from a distribution can be organised in a frequency table.
Example: build a frequency table for the tally from the last example.

Data     Absolute frequency $f_i$   Relative frequency $h_i$
0        5                          5/24
1        7                          7/24
2        8                          8/24
3        3                          3/24
4        1                          1/24
Total    N = 24                     1
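
As a small sketch, the relative frequencies of the books example can be computed with exact fractions. The dictionary name `absolute` is just an illustrative choice; `fractions.Fraction` is part of the Python standard library.

```python
from fractions import Fraction

# Absolute frequencies f_i from the books example: {number of books: pupils}.
absolute = {0: 5, 1: 7, 2: 8, 3: 3, 4: 1}

n = sum(absolute.values())  # total number of data, N

# Relative frequency h_i = f_i / N, kept as an exact fraction.
relative = {x: Fraction(f, n) for x, f in absolute.items()}

for x, f in absolute.items():
    # Show the unreduced form f_i/N, as in the table.
    print(f"{x}: f = {f}, h = {f}/{n}")

print("sum of h:", sum(relative.values()))
```

Note that `Fraction` simplifies automatically (8/24 is stored as 1/3), which is why the loop prints the unreduced form by hand.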

Cumulative frequencies
To obtain cumulative frequencies, just add up the frequencies as you go down the table. You can only obtain cumulative frequencies for quantitative variables, because you need to put the data in order.
Cumulative absolute frequency is represented by $F_i$.
Cumulative relative frequency is represented by $H_i$.
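
"Adding up as you go" is a running total. A minimal sketch, reusing the frequency table of the books example (values 0 to 4 with frequencies 5, 7, 8, 3, 1) and the standard-library function `itertools.accumulate`; the list names are illustrative:

```python
from fractions import Fraction
from itertools import accumulate

values = [0, 1, 2, 3, 4]   # x_i, already in increasing order
f = [5, 7, 8, 3, 1]        # absolute frequencies f_i

n = sum(f)

# Cumulative absolute frequencies F_i: running total of the f_i.
F = list(accumulate(f))

# Cumulative relative frequencies H_i = F_i / N.
H = [Fraction(total, n) for total in F]

for x, fi, Fi in zip(values, f, F):
    print(f"x = {x}: f = {fi}, F = {Fi}, H = {Fi}/{n}")
```

The last cumulative absolute frequency is always N, and the last cumulative relative frequency is always 1.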

Obtain the cumulative frequencies for the distribution below.
The shoe sizes of 20 students in a class are: 43, 42, 41, 39, 41, 37, 40, 43, 44, 40, 39, 36, 38, 41, 40, 39, 38, 39, 39, 40

$x_i$    $f_i$   $h_i$    $F_i$   $H_i$
36       1       1/20     1       1/20
37       1       1/20     2       2/20
38       2       2/20     4       4/20
39       5       5/20     9       9/20
40       4       4/20     13      13/20
41       3       3/20     16      16/20
42       1       1/20     17      17/20
43       2       2/20     19      19/20
44       1       1/20     20      20/20
Total    20      1

Now do exercises 8, 11 and 14 from pages 248, 249 and 250.

Finding a central value for a data distribution
We can find the central value for a data distribution in different ways:
- Calculating the MEAN ($\bar{x}$)
- Calculating the MEDIAN (Me)
- Calculating the MODE (Mo)

1. Calculating the MEAN
The mean is the average of the numbers: a calculated "central" value of a set of numbers. You only have to add up the numbers and divide by how many numbers there are.
Example: Tom wants to know the mean number of hours per week that his children spend playing videogames.
Tom: 4 h/week
Ana: 2 h/week
Bob: 5 h/week
Paul: 5 h/week
Peter: 3 h/week
Betty: 4 h/week
Add up all the hours, and divide by 6 (because there are 6 numbers):
(4 + 2 + 5 + 5 + 3 + 4) / 6 = 23 / 6 ≈ 3.83 hours per week

It is advisable to follow these steps:
1. Build a frequency table.
2. Multiply each value of the variable by its absolute frequency.
3. Add up all the products.
4. Divide the result by N (total number of data).
You can apply the formula:
$\bar{x} = \frac{\sum x_i f_i}{N}$

Number of hours per week $x_i$   Absolute frequency $f_i$   $x_i \cdot f_i$
2                                1                          2
3                                1                          3
4                                2                          8
5                                2                          10
Total                            N = 6                      Sum = 23

$\bar{x} = \frac{\sum x_i f_i}{N} = \frac{23}{6} \approx 3.83$
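
The four steps can be written out directly. This is a minimal sketch for the videogame hours (one child plays 2 h, one 3 h, two 4 h and two 5 h); the name `hours_table` is only illustrative.

```python
# Step 1: frequency table {value x_i: absolute frequency f_i}.
hours_table = {2: 1, 3: 1, 4: 2, 5: 2}

# Steps 2 and 3: multiply each value by its frequency and add the products.
total = sum(x * f for x, f in hours_table.items())

# N is the total number of data, i.e. the sum of the frequencies.
n = sum(hours_table.values())

# Step 4: divide the sum of products by N.
mean = total / n

print(f"sum = {total}, N = {n}, mean = {mean:.2f}")
```

A common slip is to divide by the sum of the values (2 + 3 + 4 + 5) instead of the sum of the frequencies; N must count the data, here the 6 children.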

2. Calculating the MODE
The mode is the value that occurs most often.
In the example before (Tom: 4 h/week, Ana: 2 h/week, Bob: 5 h/week, Paul: 5 h/week, Peter: 3 h/week, Betty: 4 h/week), 4 and 5 each occur twice and every other value occurs once, so there are two modes: 4 and 5.
The mode can be tricky: there can sometimes be more than one mode.
Example: what is the mode of 3, 4, 4, 5, 6, 6, 7?
Well... 4 occurs twice but 6 also occurs twice. So both 4 and 6 are modes.
When there are two modes the distribution is called "bimodal"; when there are three or more modes we call it "multimodal".

3. Calculating the MEDIAN
You could also use the median: simply list all the numbers in order and choose the middle one.
Example: at a birthday party there are 11 kids of different ages: 5 kids aged 10, 4 kids aged 8 and 2 kids aged 7. We obtain the median by listing the data in order and choosing the middle number:
10, 10, 10, 10, 10, 8, 8, 8, 8, 7, 7
The middle (6th) number is 8, so the median age is 8... so let's go to the cinema!

Sometimes there are two middle numbers. Just average them.
Example: what is the median of 3, 4, 7, 9, 12, 15?
There are two numbers in the middle: 7 and 9.
So we average them: (7 + 9) / 2 = 16 / 2 = 8
The median is 8.
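
To check a list for one or several modes, Python's standard library offers `statistics.multimode` (available from Python 3.8). A short sketch with the list 3, 4, 4, 5, 6, 6, 7 from the example:

```python
from statistics import multimode

data = [3, 4, 4, 5, 6, 6, 7]

# multimode returns every value that shares the highest frequency,
# in the order in which the values first appear in the list.
modes = multimode(data)

print(modes)  # [4, 6]

if len(modes) == 2:
    print("bimodal")
elif len(modes) > 2:
    print("multimodal")
```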
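
Both cases (one middle number, or two to average) fit in a few lines. This is a plain sketch of the rule as described, with an illustrative function name; the standard library's `statistics.median` follows the same rule.

```python
def median(values):
    """Return the middle value of the ordered data (average of the two middle values if there are two)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of data: a single middle value.
        return ordered[middle]
    # Even number of data: average the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


party_ages = [10, 10, 10, 10, 10, 8, 8, 8, 8, 7, 7]
party_median = median(party_ages)

even_median = median([3, 4, 7, 9, 12, 15])

print(party_median, even_median)  # 8 8.0
```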

Finding the position values for a data distribution: Quartiles
A position value gives us the place of a value in the ordered data distribution.
Quartiles are the values (Q1, Q2 and Q3) that divide a list of numbers into quarters:
- First put the list of numbers in order
- Then cut the list into four equal parts
- The quartiles are at the "cuts"

  25%  |  25%  |  25%  |  25%
       Q1      Q2      Q3

Example: 4, 7, 6, 5, 5, 2, 4, 7, 5, 8, 8, 8
- Put the list of numbers in order: 2, 4, 4, 5, 5, 5, 6, 7, 7, 8, 8, 8
- Then cut the list into four equal parts: 2, 4, 4 | 5, 5, 5 | 6, 7, 7 | 8, 8, 8
- The quartiles are at the "cuts": Q1 (lower quartile) at the first cut, Q2 (middle quartile, the median) at the second cut and Q3 (upper quartile) at the third cut.

Sometimes a "cut" is between two numbers: the quartile is then the average of the two numbers.
Example: 3, 3, 4, 5, 5, 5, 6, 6, 7, 8, 8, 8, 8
The numbers are already in order. There are 13 of them, so Q2 (the middle quartile, the median) is the 7th number: Q2 = 6.
That leaves six numbers on each side of it: 3, 3, 4, 5, 5, 5 and 6, 7, 8, 8, 8, 8.
The cut for Q1 (lower quartile) is halfway between 4 and 5: Q1 = (4 + 5) / 2 = 4.5
The cut for Q3 (upper quartile) is between 8 and 8: Q3 = (8 + 8) / 2 = 8
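
The "cut the ordered list" idea can be sketched as follows. It assumes one common convention: Q2 is the median, and Q1 and Q3 are the medians of the lower and upper halves, leaving the middle value out of both halves when the number of data is odd. Other conventions exist and can give slightly different quartiles. The function name is illustrative.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd number of data, skip the middle value itself.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)


# The 12 values of the first quartile example.
q1, q2, q3 = quartiles([4, 7, 6, 5, 5, 2, 4, 7, 5, 8, 8, 8])
print(q1, q2, q3)  # 4.5 5.5 7.5
```

With 12 values every cut falls between two numbers, so each quartile here is the average of its two neighbours.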

In other words:
- Q1 corresponds to 25% of the data: Q1 is the first value whose cumulative frequency is higher than 25% · N.
- Q2 corresponds to 50% of the data: Q2 is the first value whose cumulative frequency is higher than 50% · N. This value is the same as the median.
- Q3 corresponds to 75% of the data: Q3 is the first value whose cumulative frequency is higher than 75% · N.

Finding dispersion measurements for a data distribution
Dispersion in statistics is a way of describing how spread out a set of data is. The spread of a data set can be described by:
- Range (R)
- Mean deviation (DM)
- Variance ($\sigma^2$)
- Standard deviation ($\sigma$)
- Variation coefficient (CV)

The Range (R) is the difference between the highest and lowest values.
Mean deviation (DM) is the mean of the distances of each value from the mean.
Step 1: Find the mean: $\bar{x}$
Step 2: Find the distance of each value from that mean: $|x_i - \bar{x}|$
Step 3: Find the mean of those distances:
$DM = \frac{\sum f_i |x_i - \bar{x}|}{N}$
It tells us how far, on average, all values are from the middle.

Variance ($S^2$ or $\sigma^2$) is the average of the squared differences from the mean. To calculate the variance follow these steps:
- Work out the mean (the simple average of the numbers).
- Then for each number: subtract the mean and square the result (the squared difference).
- Then work out the average of those squared differences.

Standard deviation ($\sigma$) is just the square root of the variance, so:
$\sigma = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{N}}$
Variation coefficient (CV) is the quotient between the standard deviation and the mean:
$CV = \frac{\sigma}{\bar{x}}$
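
As a quick illustration, the range and the mean deviation can be computed for the six videogame values used earlier (4, 2, 5, 5, 3, 4 hours per week). The variable names are illustrative; each value is listed individually, so every frequency is 1.

```python
hours = [4, 2, 5, 5, 3, 4]
n = len(hours)

# Range: highest value minus lowest value.
data_range = max(hours) - min(hours)

# Step 1: the mean.
mean = sum(hours) / n

# Step 2: distance of each value from the mean (absolute value).
distances = [abs(x - mean) for x in hours]

# Step 3: the mean of those distances.
mean_deviation = sum(distances) / n

print(f"R = {data_range}, mean = {mean:.2f}, DM = {mean_deviation:.2f}")
```

The absolute value matters: without it the positive and negative differences cancel out and the result is always 0.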
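
A short sketch of variance, standard deviation and variation coefficient for the same six videogame values (4, 2, 5, 5, 3, 4). It uses the standard-library functions `statistics.pvariance` and `statistics.pstdev`, which divide by N as in the definition above (the functions `variance` and `stdev` divide by N - 1 instead and would give different results).

```python
from math import sqrt
from statistics import mean, pstdev, pvariance

hours = [4, 2, 5, 5, 3, 4]

x_bar = mean(hours)

# Variance by hand: the average of the squared differences from the mean.
variance = sum((x - x_bar) ** 2 for x in hours) / len(hours)

# Standard deviation: square root of the variance.
std_dev = sqrt(variance)

# Variation coefficient: standard deviation divided by the mean.
cv = std_dev / x_bar

print(f"variance = {variance:.3f}, sigma = {std_dev:.3f}, CV = {cv:.3f}")

# The library functions should agree with the hand calculation.
print(abs(variance - pvariance(hours)) < 1e-9, abs(std_dev - pstdev(hours)) < 1e-9)
```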

How to show data: Graphs
Besides tables, graphs make it very easy to present data. Different graphs can be built from the data.
Bar Graph: a Bar Graph (also called Bar Chart) is a graphical display of data using bars of different heights.
Example: imagine you just did a survey of your friends to find which kind of sport they liked best:

Sport    Football   Basketball   Tennis   Athletics   Handball
$f_i$    8          12           6        10          4

We can show that on a bar graph like this:
[Bar graph "sports": one bar per sport, vertical axis from 0 to 14]
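
Without any drawing tools, the idea of a bar graph can still be sketched as text: one bar per category, with a length equal to its frequency. The function name and the `#` symbol are arbitrary choices for this illustration, and the bars run horizontally rather than vertically.

```python
# Survey results: {sport: absolute frequency}.
sports = {"Football": 8, "Basketball": 12, "Tennis": 6,
          "Athletics": 10, "Handball": 4}


def text_bar_chart(frequencies, symbol="#"):
    """Return a horizontal bar chart as text, one line per category."""
    width = max(len(label) for label in frequencies)
    lines = []
    for label, frequency in frequencies.items():
        # Pad the label so that all bars start in the same column.
        lines.append(f"{label.ljust(width)} | {symbol * frequency} {frequency}")
    return "\n".join(lines)


chart = text_bar_chart(sports)
print(chart)
```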

Frequency polygon: if we join the midpoint of the top of each column of a frequency graph, we obtain the frequency polygon (drawn in red).
[Figure "sports": the bar graph above with the frequency polygon drawn over it, vertical axis from 0 to 14]

Histograms: when we have a data distribution grouped into intervals we use a histogram. It is a graphical display of data using bars of different heights. It is similar to a bar chart, but a histogram groups numbers into ranges. See example number 8 from page 252.

Pie Chart: a pie chart is a special chart that uses "pie slices" to show relative sizes of data.
Example: imagine you survey your friends to find the kind of books they like best:

Topic    Adventure   Sci Fi   Drama   Biography   History
$f_i$    8           12       6       10          4

You can show the data with this pie chart:
[Pie chart "Topic": one slice each for Adventure, Sci Fi, Drama, Biography and History]
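
To draw the pie chart by hand, each slice needs an angle. The usual rule (not spelled out above) is that a slice takes the same share of the 360 degrees as its relative frequency, so the angle is 360 · f_i / N. A minimal sketch with the book survey; the names are illustrative.

```python
# Survey results: {topic: absolute frequency}.
books = {"Adventure": 8, "Sci Fi": 12, "Drama": 6,
         "Biography": 10, "History": 4}

total = sum(books.values())  # N = 40 friends

# Angle of each slice in degrees: 360 * f_i / N.
angles = {topic: 360 * f / total for topic, f in books.items()}

for topic, angle in angles.items():
    print(f"{topic}: {angle:.0f} degrees")
```

The angles should add up to 360 degrees, which is a useful check before drawing with a protractor.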