5/26 FA 9.2 Causation vs Correlation

Last updated over 2 years ago
28 questions

OBJECTIVES & STANDARDS

Math Objectives
  • Define correlation, causation, samples, and populations
  • Explain the difference between correlation and causation
  • Recognize the purposes of and differences among sample surveys, experiments, and observational studies; explain how randomization relates to each.
Common Core Math Standards
  • Link to all CCSS Math
  • CCSS.PRACTICE.MP3
  • CCSS.HSS.IC.A.1
  • CCSS.HSS.IC.B.3
  • CCSS.HSS.IC.B.4
Personal Finance Objectives
  • Compare college testing and admissions statistics
  • Examine the relationship between college cost and graduation rates
National Standards for Personal Financial Education
Spending
  • 2a: Select a product or service and describe the various factors that may influence a consumer’s purchase decision

DISTRIBUTION & PLANNING

Distribute to students
  • Student Activity Packet
  • Application Problems

Intro

CONSIDER: Drawing Conclusions

Here you see graphs that show some unexpected relationships between seemingly unconnected things. Take some time to analyze each graph, and then answer the questions.

1

Choose one of these graphs and draw a conclusion about the relationship between the variables.

1

What is one action you could recommend based on this information?

Learn It

EDPUZZLE: Correlation - The Basic Idea Explained

The previous graphs all showed two different variables that were correlated. Correlation is one of the measures we use when comparing two or more variables. It is important to measure, but it is also easily misinterpreted if you’re not careful. Watch this Edpuzzle video, and then answer the questions.
1

What is an important thing to remember about correlation and causation?

4
For each statement below about the INTRO graphs, decide if it describes correlation or causation.
a. Sociology doctorates and space launches both increased at the same rate from 2007 to 2009. __________
b. When consumption of chicken decreases, the U.S. imports less crude oil. __________
c. The less margarine that is consumed per capita, the lower the divorce rates in Maine will be. __________
d. The rise in arcade revenue corresponds to the increase in computer science doctorates awarded in the U.S. __________
1

Give an example of two variables that are correlated, but one variable does not cause the other.

ARTICLE: What is Considered to Be a "Strong" Correlation?

A correlation shows a relationship (though NOT necessarily a causal one) between variables. Often, the only measure we get of correlation is a graph or a correlation coefficient from an analysis program. But how do you interpret the correlation coefficient in real terms for a set of data? Read the article through the section on technology fields to get an idea of how to interpret the correlation coefficient, and then answer the questions.
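If you are curious how an analysis program arrives at a correlation coefficient, here is a minimal sketch of the standard Pearson formula. The function name `pearson_r` and the data values are made up for illustration; they do not come from the article or from the INTRO graphs.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much x and y vary together, and how much each varies on its own.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    x_variation = sum((x - mean_x) ** 2 for x in xs)
    y_variation = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(x_variation * y_variation)

# Hypothetical data: hours studied and quiz scores for five students.
hours_studied = [1, 2, 3, 4, 5]
quiz_score = [52, 60, 61, 70, 78]

r = pearson_r(hours_studied, quiz_score)
print(round(r, 3))
```

The result always falls between -1 and 1. The sign tells you the direction of the relationship, and the distance from 0 tells you how closely the points follow a line. It says nothing about whether one variable causes the other.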
1
What is the typical cutoff for a correlation showing a “strong” linear relationship?_______
1
Below what level of correlation would typically be considered “no relationship” between variables?_______
3

Match each description to the correlation it describes:

Descriptions (draggable items)
  • There is no discernible pattern to the data, and it is scattered relatively evenly around the scatterplot
  • The data are extremely scattered, but overall they slope downward very steeply
  • There is a very slight upward slope to the data points, which all lie very nearly on a line
Correlation types (corresponding items)
  • Weak negative correlation
  • Strong positive correlation
  • No correlation
1

Why might different fields of study have different interpretations for the same correlation levels?

For each of the following graphs, decide whether it shows a strong or weak, positive or negative correlation, or no correlation, and estimate its correlation coefficient.
3
______________________________
1
______________________________
1

Your friend Derek polls his classmates and finds that the coefficient of correlation, r, for hours spent hanging out with friends and GPA is 0.93. “I knew it!” he exclaims. “This proves that if I just spend more time with my friends, I’m bound to get an A eventually!” What mistake is he making? Give one way he could revise his claim or explain the data differently.

Learn It

VIDEO: Types Of Studies Explained

We’ve established that correlation does not imply causation. There are different types of studies that can lead us to different conclusions. Two of these are observational studies and randomized experiments. Watch this video to learn the difference, and then answer the questions.
1
Observational studies can only show __________, while experiments can determine __________.
1

What is the main difference in design between an observational study and an experiment?

2

Your classmate, Katarina, says: “Experiments seem WAY better than observational studies. I’m only going to do experiments to collect data from now on!” Give TWO reasons or examples of why her claim might be difficult to keep.

Samples vs. Populations

Whether you’re working with observational studies or experiments, it’s important to know the difference between a sample and a population.
A population is a set of all items or events which are of interest for some question or experiment. It is generally the group you are trying to make predictions or learn something about. For most studies, it is either impossible or impractical to obtain data on an entire population. This is why you need to use a smaller selection of items.
A sample is a selection of observations from a population. We measure data in a known sample to make a prediction, or inference, about the population.
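To see how a sample can stand in for a population, here is a small simulation sketch. All of the numbers are assumptions made up for illustration: a grade of 500 students, heights centered on 165 cm, and a random sample of 30.

```python
import random

random.seed(42)  # fixed seed so the sketch gives the same result each run

# Hypothetical population: heights (cm) of 500 students in a grade.
population = [random.gauss(165, 8) for _ in range(500)]

# Sample: 30 students chosen at random from that population.
sample = random.sample(population, 30)

population_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)

print(f"Population mean: {population_mean:.1f} cm")
print(f"Sample mean:     {sample_mean:.1f} cm")
```

The sample mean will usually land close to the population mean, but not exactly on it. That gap is why a result from a sample is called an estimate, or inference, rather than a measurement of the population.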

Good and Bad Samples

You want to know the average height of students in your whole grade. To get a good estimate, you measure the average height of students in your class.
Population: Students in your grade
Sample: Students in your class
A sample that would NOT fit would be:
Sample: Students in the grade below yours
Since height varies with age for adolescents, a younger student is not likely to have the same height as an older student. Because they are not part of the population, they would not provide an accurate estimate.
But, there are many other samples you could take for that same population. Other samples might be:
  • The girls in your grade
  • The students in the other class
  • Your friend group
  • Just you
1

Which of these samples is likely to give the LEAST accurate estimate of the average height of students in your grade?

1

Give one more example of a sample that would fit this population.

1

Give an example of a sample that would NOT fit this population.

APPLICATION: College Tuition and Graduation Statistics

College Tuition Costs

Colleges often advertise their graduation rate, or their four-year graduation rate. We can use statistics to evaluate whether those rates are worth any higher tuition that might accompany them.
1
Choose Observational Study (Obs), Experiment (Exp), both, or neither for each statement:
__________ Can determine causation between variables

__________ Researcher does not affect the subjects

__________ Researcher affects the subjects

__________ Can show a linear relationship between variables

__________ Requires collection of data
Take a look at the following graph showing the relationship between the different states’ and territories’ average college tuition and graduation rates. Then, answer the questions.

1

Describe any correlation you see between graduation rate and average tuition amount using vocabulary from this lesson.

1

Phillip looks at the graph of the data and concludes: “Going to a more expensive school increases your chances of graduating!” Explain why Phillip’s statement is incorrect.

1

This data set contains the average values for each U.S. state and territory, and it is designed to estimate the population of all colleges across the U.S. What is one population that this data would NOT be a good sample for?

Level 2 Outlier Effects

Outliers can have dramatic effects not only on the mean but also on the apparent correlation between variables. Take a look at this graph of graduation rate versus acceptance rate.
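As a rough illustration of how much one point can matter, the sketch below computes r for a tiny made-up data set with and without a single outlier. The values and the helper name `pearson_r` are hypothetical; they are not taken from the graduation rate data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    x_variation = sum((x - mean_x) ** 2 for x in xs)
    y_variation = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(x_variation * y_variation)

# Hypothetical points with almost no linear pattern.
xs = [1, 2, 3, 4, 5]
ys = [3, 1, 4, 2, 3]
r_without_outlier = pearson_r(xs, ys)

# The same points plus one outlier far from the rest, at (20, 20).
r_with_outlier = pearson_r(xs + [20], ys + [20])

print(f"r without the outlier: {r_without_outlier:.2f}")
print(f"r with the outlier:    {r_with_outlier:.2f}")
```

With only a handful of points, a single extreme value can make unrelated data look strongly correlated. Larger data sets are less sensitive, but the same effect can still appear.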


1

How would you describe the correlation between these two variables?

1

What is the approximate correlation coefficient, r, for these two variables? Round to the nearest thousandth. (To find r, calculate the square root of R^2 to find the magnitude of r and then decide if it should be positive or negative based on the data.)
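The calculation described above can be sketched in a few lines. The function name and the R^2 value of 0.25 are placeholders for illustration, not values read from the graph.

```python
import math

def r_from_r_squared(r_squared, slope_is_positive):
    """Recover r from R^2: the square root gives the size, the trend gives the sign."""
    magnitude = math.sqrt(r_squared)
    return magnitude if slope_is_positive else -magnitude

# Hypothetical example: R^2 = 0.25 on a graph that slopes downward.
print(round(r_from_r_squared(0.25, slope_is_positive=False), 3))  # -0.5
```

The square root alone is never negative, which is why you have to look at the direction of the trend to choose the sign.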

1

Now, look at the same data with just 6 outliers removed: the states or territories with the top 2 and bottom 4 acceptance rates.

a. What has changed about the correlation now?
b. Calculate the value of r.

1

Now, look at the same data with a different 6 data points removed. This time, the removed points were specifically chosen to make the data tell a different story.

a. What has changed about the correlation now?
b. Calculate the value of r.

1

Give an argument for or against the practice of excluding outliers from your sample data.

1

After seeing this outlier effect and the sensitivity of this data set, how might it change your attitude in general toward statistics you see reported in various places? What is one thing you could do to make sure you’re interpreting a data set well?