Student's height (in)


Psych 315, Winter 2018, Homework 4 Answer Key
Due Wednesday, January 31, either in section or in your TA's mailbox by 4pm.
Name ID Section [AA Kit] [AB Kit] [AC Kelly] [AD Kelly]

The scatterplot below plots female students' heights and their mothers' heights for the 11 students who chose Green as their favorite color. Round all answers to 2 decimal places.

[Figure: scatterplot for the 11 students; axis label "Mother's height (in)", with ticks from 62 to 72.]

1) Use R to load the survey data, select the 11 students who chose Green as their favorite color, and calculate:

x̄: mean of the students' heights
ȳ: mean of the mothers' heights
sx: standard deviation of the students' heights
sy: standard deviation of the mothers' heights
r: correlation between the students' and the mothers' heights

Giant Hint: here's how to do this for the female students who chose Red as their favorite color:

    # Load the survey data
    survey <- read.csv("http://www.courses.washington.edu/psy315/datasets/psych315w18survey.csv")
    # Find female students that chose "Red"
    student.red <- survey$gender == "Female" & survey$color == "red"
    # Find the heights of these students (call it x):
    x <- survey$height[student.red]
    # Find their mothers' heights (call it y):
    y <- survey$mheight[student.red]
    # Find the pairs where neither x nor y is NA:
    goodid <- !is.na(x) & !is.na(y)
    # Only include these pairs
    x <- x[goodid]
    y <- y[goodid]
    # Means of x and y:
    mx <- mean(x); my <- mean(y)
    # Standard deviations of x and y:
    sx <- sd(x); sy <- sd(y)
    # Correlation of x and y:
    r <- cor(x, y)

Output for the Red group:

    mx
    [1] 64.36364
    my
    [1] 63.36364
    sx
    [1] 2.500909
    sy
    [1] 1.858641
    r
    [1] 0.3774603
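The same steps can be wrapped in a small function so they can be reused for any color group. This is only a sketch: `pair_summary` is a hypothetical helper that is not part of the assignment, and the toy vectors are made up purely to show the NA handling.

```r
# Hypothetical helper (not from the assignment): summarise paired heights
# after dropping every pair in which either value is missing.
pair_summary <- function(x, y) {
  keep <- !is.na(x) & !is.na(y)   # TRUE only where both values are present
  x <- x[keep]
  y <- y[keep]
  list(n = length(x),
       mx = mean(x), my = mean(y),
       sx = sd(x),   sy = sd(y),
       r  = cor(x, y))
}

# Made-up toy data, for illustration only (two pairs contain an NA)
x_toy <- c(62, 64, NA, 66, 68)
y_toy <- c(61, 65, 64, NA, 67)
s <- pair_summary(x_toy, y_toy)
print(s)
```

Note that R's `sd()` divides by n - 1, whereas the standard error of the estimate in problem 3 divides by n.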

2) Use R or your calculator to find the equation of the regression line and draw it by hand on the scatterplot.

The slope is $m = r\left(\frac{s_y}{s_x}\right)$ and the y-intercept is $b = \bar{Y} - m\bar{X}$.

Here's how to do it in R:

    # slope:
    m <- r*sy/sx
    # intercept:
    b <- my - m*mx

mean of x: 65.82, mean of y: 65.27
sx = 3.27, sy = 3.65
Slope: $m = (0.81)\frac{3.65}{3.27} = 0.9$
Intercept: $b = 65.27 - (0.9)(65.82) = 6.03$
$Y' = 0.9X + 6.03$

3) Use R or your calculator to find the standard error of the estimate by calculating the sum of the squared residuals:

$S_{YX} = \sqrt{\frac{\sum (Y - Y')^2}{n}}$

In R:

    # Find y on the regression line for every value of x:
    yprime <- m*x + b
    # Find the residuals:
    residual <- y - yprime
    # Use the residuals to calculate syx:
    syx <- sqrt(sum(residual^2)/length(x))

Y' = 68.13, 64.53, 62.73, 63.63, 69.93, 68.13, 61.83, 68.13, 61.83, 67.23 and 61.83
Y - Y' = -1.13, -2.53, -0.73, 4.37, -0.93, -2.13, -0.83, 3.87, 0.17, 0.77 and -0.83
$\sum (Y - Y')^2 = 1.28 + 6.4 + \dots + 0.69 = 49.69$
$S_{YX} = \sqrt{\frac{49.69}{11}} = 2.13$

4) Use the correlation as another way of calculating the standard error of the estimate. Your answer should be close, but not exactly the same, because of rounding error.

$S_{YX} = S_Y\sqrt{1 - r^2} = (3.65)\sqrt{1 - 0.81^2} = 2.14$ inches
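As a cross-check, the arithmetic in problems 2 to 4 can be reproduced from the rounded summary statistics and the residuals listed in the key. This is a sketch under one assumption: the slope is rounded to 2 decimal places before the intercept is computed, which is what the key's arithmetic implies. Carrying the unrounded slope forward would give a noticeably different intercept.

```r
# Summary statistics reported in the key for the Green group
r  <- 0.81
mx <- 65.82; my <- 65.27
sx <- 3.27;  sy <- 3.65

# Problem 2: slope (rounded as in the key) and intercept
m <- round(r * sy / sx, 2)
b <- my - m * mx

# Problem 3: residuals (Y - Y') exactly as listed in the key
res <- c(-1.13, -2.53, -0.73, 4.37, -0.93, -2.13,
         -0.83, 3.87, 0.17, 0.77, -0.83)
syx_resid <- sqrt(sum(res^2) / length(res))

# Problem 4: the same quantity from the correlation
syx_corr <- sy * sqrt(1 - r^2)

print(round(c(m = m, b = b, syx_resid = syx_resid, syx_corr = syx_corr), 2))
```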

5) Use the regression line to predict the mother's height for a female student who is 62.25 inches tall.

$Y' = mX + b = (0.9)(62.25) + 6.03 = 62.06$ inches

6) Assuming homoscedasticity, find the range of mothers' heights that covers the middle 50% of the heights of mothers of women who are 62.25 inches tall. Hint: the heights of the mothers of women who are 62.25 inches tall should be distributed normally with a mean determined by the regression line (problem 5) and a standard deviation equal to the standard error of the estimate (problem 4).

The mothers' heights should be distributed normally with a mean of 62.06 and a standard deviation of 2.13. Using Table A, the z-scores covering the middle 50 percent of the normal distribution are z = +/- 0.67. Converting to heights, the range is between 62.06 - (0.67)(2.13) and 62.06 + (0.67)(2.13), which is between 60.63 and 63.49 inches.

7) Repeat problems 5 and 6, but for students who are 69 inches tall. Note that, because of homoscedasticity, the range above and below the predicted height should not change.

$Y' = mX + b = (0.9)(69) + 6.03 = 68.13$ inches

The mothers' heights should be distributed normally with a mean of 68.13 and a standard deviation of 2.13. Using Table A, the z-scores covering the middle 50 percent of the normal distribution are z = +/- 0.67. Converting to heights, the range is between 68.13 - (0.67)(2.13) and 68.13 + (0.67)(2.13), or between 66.7 and 69.56 inches.

8) You should see that for any female student's height, the middle 50% of the corresponding mothers' heights should fall within the same range above and below the regression line. Draw two parallel lines on the scatterplot, one above and one below the regression line, that should cover the middle 50% of the mothers' heights. Use the values from problems 6 and 7 as points on the lines.

9) Look at the scatterplot and calculate the actual percent of data points that fall between these two parallel lines. How closely does it match 50%?

7 of the 11 points fall between the parallel lines. This is $100 \times \frac{7}{11} = 63.64$ percent of the points. This is pretty close.
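The interval calculations in problems 5 to 7 and the percentage in problem 9 can be sketched in R as follows. `middle50` is a hypothetical helper name, and z = 0.67 is the table value used in the key (R's `qnorm(0.75)` gives about 0.6745). The results should agree with the key to within rounding, since the key rounds the predicted height before building the interval.

```r
m   <- 0.9    # slope from problem 2
b   <- 6.03   # intercept from problem 2
syx <- 2.13   # standard error of the estimate from problem 3
z   <- 0.67   # table value for the middle 50%

# Hypothetical helper: predicted mother's height and the band that should
# cover the middle 50% of mothers' heights for a student of height x
middle50 <- function(x) {
  pred <- m * x + b
  c(lower = pred - z * syx, predicted = pred, upper = pred + z * syx)
}

r62 <- middle50(62.25)   # problems 5 and 6
r69 <- middle50(69)      # problem 7
print(round(rbind(r62, r69), 2))

# Problem 9: share of the 11 points that fall between the two lines
pct_inside <- 100 * 7 / 11
print(round(pct_inside, 2))
```

The half-width z * syx is the same for both heights, which is the point of problem 8.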

10) The correlation between SAT scores and IQ is around 0.5. Assume that SAT scores are normally distributed with a mean of 915 and a standard deviation of 88.24, and IQ scores are normally distributed with a mean of 100 and a standard deviation of 15.

a) Find the equation of the regression line that predicts IQ from SAT score. Hint: use the equations from problem 2. Give your answer in slope-intercept form.

Let X be SAT scores and Y be IQ.
The slope is $r\,\frac{s_y}{s_x} = (0.5)\frac{15}{88.24} = 0.08$
The line goes through the means, so:
$Y' = (0.08)(X - 915) + 100$
$Y' = (0.08)X + 26.8$

b) What is the expected IQ of a student with an SAT score of 1000?

IQ = (0.08)(1000) + 26.8 = 106.8

c) What is the proportion of variance of Y explained by X (the coefficient of determination)?

The coefficient of determination is $r^2 = 0.5^2 = 0.25$

d) What is the total variance in the IQ scores?

The variance is the standard deviation squared: $15^2 = 225$

e) From parts c and d, calculate the amount of variance in IQ scores that is explained by SAT scores.

The amount of variance explained by SAT scores is equal to the total variance in IQ scores multiplied by the proportion of variance accounted for, which is $r^2$: (225)(0.25) = 56.25
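Problem 10 needs only summary statistics, so all five parts can be checked in a few lines of R. This sketch assumes the slope is rounded to 2 decimal places before it is used, as in the key; the variable names are illustrative.

```r
r        <- 0.5
mean_sat <- 915; sd_sat <- 88.24   # X: SAT scores
mean_iq  <- 100; sd_iq  <- 15      # Y: IQ scores

# a) regression line predicting IQ from SAT (slope rounded as in the key)
slope     <- round(r * sd_iq / sd_sat, 2)
intercept <- mean_iq - slope * mean_sat

# b) expected IQ for an SAT score of 1000
iq_at_1000 <- slope * 1000 + intercept

# c) to e) proportion of variance explained, total variance, explained variance
r2        <- r^2
var_iq    <- sd_iq^2
explained <- var_iq * r2

print(c(slope = slope, intercept = intercept, iq_at_1000 = iq_at_1000,
        r2 = r2, var_iq = var_iq, explained = explained))
```

Rounding the slope matters here: with the unrounded slope (about 0.085) the predicted IQ at an SAT score of 1000 comes out near 107.2 rather than 106.8.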

11) Explain why the correlation between parents' heights and all students' heights might be lower than the correlations you'd find for just the female or just the male students. Draw a picture if it helps.

While there may be a strong correlation within each gender, combining the students adds variance in the students' heights that is not explained by the parents' heights. This leads to a lower correlation for the whole group than for either gender alone.

[Figure: two scatterplots with axis label "Parent's height (in)". Panel "Students by gender" (Male, Female): within-gender correlations of r = 0.84 and r = 0.85. Panel "All students": r = 0.52.]
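The effect described in problem 11 can be illustrated with a small simulation. Every number below (group means, slope, noise level, sample size) is a made-up assumption chosen for illustration and is not taken from the survey data.

```r
set.seed(1)
n <- 500                                    # simulated students per gender
gender <- rep(c("Male", "Female"), each = n)

# Assumed: parents' heights have the same distribution for both genders
parent <- rnorm(2 * n, mean = 67, sd = 3)

# Assumed: the same linear relationship within each gender,
# but male students are shifted upward by 5.5 inches
offset  <- ifelse(gender == "Male", 70, 64.5)
student <- offset + 0.9 * (parent - 67) + rnorm(2 * n, sd = 1.5)

is_male  <- gender == "Male"
r_male   <- cor(parent[is_male],  student[is_male])
r_female <- cor(parent[!is_male], student[!is_male])
r_all    <- cor(parent, student)

print(round(c(male = r_male, female = r_female, all = r_all), 2))
```

The gender shift adds spread to the students' heights that the parents' heights cannot account for, so the pooled correlation comes out lower than either within-gender correlation.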

12) Explain why the correlation between students' heights and video game playing time might be stronger for the whole group than the correlations within the male and female students. Again, draw a picture if it helps.

Suppose there is no correlation between height and video game playing within each gender. Since men play games more than women, and men are taller than women, height and playing time are correlated in the combined distribution.

[Figure: two scatterplots with axis label "Video game playing (hours/week)". Panel "Students by gender" (Male, Female): within-gender correlations of r = 0.01 and r = 0.00. Panel "All students": r = 0.72.]
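A short simulation shows how two uncorrelated groups can produce a correlated whole. All means and standard deviations here are made-up assumptions, not estimates from the survey. Simulated playing times are drawn from a normal distribution for simplicity, so a few may be negative.

```r
set.seed(1)
n <- 500                                    # simulated students per gender
gender  <- rep(c("Male", "Female"), each = n)
is_male <- gender == "Male"

# Assumed: within each gender, height and playing time are independent;
# only the group means differ
height <- rnorm(2 * n, mean = ifelse(is_male, 70, 64.5), sd = 3)
games  <- rnorm(2 * n, mean = ifelse(is_male, 6, 2),     sd = 1)

r_male   <- cor(height[is_male],  games[is_male])
r_female <- cor(height[!is_male], games[!is_male])
r_all    <- cor(height, games)

print(round(c(male = r_male, female = r_female, all = r_all), 2))
```

The within-gender correlations come out near zero, while the pooled correlation is clearly positive because both variables shift together between the two groups.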