Khan Academy Video Notes: (5 minutes)
Caffeine Example Using Computer Output Information:
Cheryl Dixon wants to know whether students who consume more caffeine also tend to study more.
She randomly selects 20 students at her school and records their caffeine intake (mg) and the number of hours spent studying.
A scatterplot of the data showed a linear relationship.
Watch the Khan Academy video (5 min).
Use the following link: https://bit.ly/37fW2gE
You may need to copy and paste it into your browser.
Watch the video, then work through the problems below.
Use information from the video to answer #1-7.
1. Read the information above carefully. The relationship between which two variables is being analyzed?
Select both from the list below.
2. Use the 'show your work' section to circle the data that we are interested in when looking at computer output at this time.
Which variable is the explanatory variable, doing the predicting?
Which variable is the response variable, being predicted?
Create the Linear Model Equation for the Line of Best Fit (LSRL) for predicting values of hours studying from the caffeine consumption of a student.
Use meaningful words (not x and y like in the video) and proper notation to show what is predicted.
Use 'hours' and 'caffeine' for the variables in the equation.
No spaces.
Give the correlation coefficient (r). r=
You will need to take a square root. Round to three places past the decimal point.
Remember:
1. R squared is given as a percent; convert it to a decimal first.
2. R squared is never negative, so read carefully to determine whether the relationship is positive or negative, and use that sign on the correlation coefficient.
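The two reminders above can be sketched in a few lines of Python. This is only an illustration: the function name `r_from_r_squared` and the sample inputs (64% and a slope of -2.0) are made up for the example and do not come from any computer output in this worksheet.

```python
import math

def r_from_r_squared(r_sq_percent, slope):
    """Return r, rounded to three decimals, from R squared (as a percent) and the slope."""
    r_sq = r_sq_percent / 100        # reminder 1: percent -> decimal
    magnitude = math.sqrt(r_sq)      # the square root gives the size of r
    # reminder 2: r takes the same sign as the slope of the line
    return round(math.copysign(magnitude, slope), 3)

# Hypothetical values, not taken from the worksheet
print(r_from_r_squared(64.0, -2.0))  # -0.8
```

Using the sign of the slope is one convenient way to decide the sign of r, since the two always agree.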
Use the Linear Model from #4 to predict the amount of studying a student will do if they consume 80 mg of caffeine (found in 8 oz of Starbucks Cold Brew).
Round the hours to two places past the decimal.
If the student in #6 actually does 4.5 hours of studying, what is the residual?
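As a reminder, a residual is the actual value minus the predicted value. A minimal sketch, assuming made-up numbers (the helper name `residual` and the values 6.0 and 5.25 are hypothetical and are not the answer to this problem):

```python
def residual(actual, predicted):
    """Residual = actual value - predicted value."""
    return actual - predicted

# Hypothetical numbers for illustration only
print(residual(actual=6.0, predicted=5.25))  # 0.75
```

A positive residual means the actual value was above the model's prediction; a negative residual means it was below.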
A study published in 1995 investigated the use of marijuana and other drugs.
Data from 11 countries are summarized in the regression analysis below.
The data showed an association between the percentage of a country’s ninth graders who report having smoked marijuana and the percentage who have used other drugs such as LSD, amphetamines, and cocaine.
The computer output is shown below.
Think about which two variables are being compared in this situation.
Classify the numbers above as 'we are interested in' or 'we will not be using'.
Hint: the two categories do not have the same number of items, but ALL of the numbers should be categorized.
75.6%
0.078
2.413
0.615
84.7%
2.204
7.85
-1.39
-3.068
We are interested in:
We will not be using:
Use the information from #8 to calculate the correlation coefficient (remember, the signs of the slope and correlation coefficient need to be the same).
Match the values below.
| Draggable item | Corresponding item |
|---|---|
| 0.615 | Explanatory Variable (Predictor) |
| 0.869 | Response Variable |
| % other drug use | Intercept: _____________ percent other drug use |
| -3.068 | Slope: ____________ % other drug use per percent marijuana use |
| % Marijuana Use | Correlation coefficient |
Use the information above to write the equation for predicting '% other drug use' from the '% marijuana use' by teenagers.
Make sure to use meaningful words and correct notation to show which variable is being predicted.
Use 'other' and 'marijuana' for the variables.
No spaces.
Round the numbers to three places past the decimal point.
Think about the intercept.
Does this point have meaning?
If so, explain the meaning. Or is it just a starting point?
Select both answers.
A country reported that 22% of its teens smoked marijuana.
Use the equation from #10 to predict the percent that use other drugs as well.
Keep the 22% as a percent in the equation (don't change it to a decimal here).
Round your answer to two places past the decimal.
If the country had collected data and actually reported that 13% used other drugs as well,
calculate the residual and explain its meaning.
Select both correct answers.
A consumer organization reported test data for 50 car models. They examined the association between the weight of the car (in thousands of pounds) and the fuel efficiency (in miles per gallon).
See the chart below for the computer regression analysis for this data.
Calculate the correlation coefficient.
Round to three places past the decimal.
Be careful with the sign. Remember from last time: the signs of the correlation coefficient and the slope MUST BE THE SAME!
Match the numbers to their corresponding statistical role.
| Draggable item | Corresponding item |
|---|---|
| Fuel Efficiency (mpg) | Intercept: _________________ mpg |
| Car Weight | Slope: ______________ mpg per thousand pounds |
| 48.739 | Correlation coefficient |
| -0.935 | Explanatory Variable (predictor) |
| -8.214 | Response Variable |
Use the computer output to write an equation for the Linear Model predicting the fuel efficiency of a car from its weight.
Be sure to use meaningful words and proper notation.
Use 'weight' and 'mpg' for the variable names.
Explain what the correlation coefficient tells us about the relationship between car weight and fuel efficiency.
1. Describe the relationship.
2. Interpret the relationship:
'As the car weight...'
A Toyota 4Runner weighs 4,400 pounds.
Hint: how many 'thousands of pounds' is this?
Calculate the predicted fuel efficiency.
Round your answer to three places past the decimal.
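The hint matters because the model expects weight in thousands of pounds, so the weight in pounds has to be divided by 1,000 before it goes into the equation. A rough sketch of that step, assuming placeholder coefficients (the function name `predict_mpg`, the intercept of 50.0, and the slope of -10.0 are hypothetical and are not the values from the computer output):

```python
def predict_mpg(intercept, slope, weight_pounds):
    """Predict mpg from a weight given in pounds, using a model fit on thousands of pounds."""
    weight_thousands = weight_pounds / 1000   # convert pounds to thousands of pounds
    return round(intercept + slope * weight_thousands, 3)

# 4,400 pounds expressed in thousands of pounds
print(4400 / 1000)  # 4.4

# Placeholder coefficients and weight, for illustration only
print(predict_mpg(intercept=50.0, slope=-10.0, weight_pounds=3000))  # 20.0
```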
If the 4Runner actually gets 15 mpg, calculate the residual and explain its meaning.
Select both answers below.
If you were purchasing a car, would you rather have a car that had a positive residual for price or negative residual for price?
Why?
A 1998 paper in the American Journal of Sports Medicine examined how age affected the number of days before 10 weight lifters were able to return to their sport after arthroscopic shoulder surgery.
Computer output for the regression is given below.
1. Calculate the correlation coefficient. Remember to write r-sq as a decimal first, then take the square root.
Round to three places past the decimal.
Match the values below.
| Draggable item | Corresponding item |
|---|---|
| 0.558 | Intercept: ________ days |
| Days to return to sport | Slope: ________ days per year |
| 5.054 | Explanatory variable (predictor) |
| Age in years | Response variable |
| 0.272 | Correlation Coefficient |
Create the linear model equation for predicting the number of days to return to a sport after arthroscopic shoulder surgery.
Be careful to use meaningful words and correct notation to show which variable is predicted.
Use 'age' and 'days' for the variable names.
No spaces.
Predict the time to return to weight lifting for a 15-year-old athlete.
Round to three places past the decimal.
If it actually took 12 days to return to weight lifting after surgery, calculate the residual and explain the meaning.
Medical researchers have noted that adolescent females are much more likely to deliver low birth weight babies than are adult females.
Because low birth weight babies have higher mortality rates, a number of studies have examined the relationship between birth weight (in grams) and mother’s age (in years) for babies born to young mothers. Regression output for the data in one such study is given below. (n=138)
Classify the numbers above as 'we are interested in' or 'we will not be using'.
Hint: the two categories do not have the same number of items, but ALL of the numbers should be categorized.
245.15
0.176
78.1%
-1163.4
-1.49
75.4%
205.308
45.91
783.1
We are interested in:
We will not be using:
Use the computer output information to create the linear model equation for predicting a baby's weight from the mother's age.
Be sure to use meaningful words and correct notation (-hat) to show the variable that is being predicted.
Use 'weight' and 'age' for the variables.
Calculate r and explain what it tells us about the relationship between mother’s age and baby birth weight.
Round the value to three places past the decimal.
Select both answers below:
Use the equation from #27:
An 18-year-old mother gave birth. Use the linear model to predict how much her baby will weigh.
If the residual was -195.6 grams, what was the actual weight of the baby?
Set up the residual equation and use algebra to calculate.
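The algebra here starts from residual = actual - predicted, which rearranges to actual = predicted + residual. A small sketch of that rearrangement, assuming made-up numbers (the function name `actual_from_residual` and the values 3000.0 and -150.5 are hypothetical and are not taken from this problem):

```python
def actual_from_residual(predicted, residual):
    """Solve residual = actual - predicted for the actual value."""
    # residual = actual - predicted  =>  actual = predicted + residual
    return predicted + residual

# Hypothetical values for illustration only
print(actual_from_residual(predicted=3000.0, residual=-150.5))  # 2849.5
```

A negative residual gives an actual value below the prediction, as in the example output.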