6/6 FA 9.8 Evaluating Statistical Claims
Last updated over 2 years ago
29 questions
OBJECTIVES & STANDARDS
Math Objectives
- Understand the limitations of two-way tables
- Understand sampling error and how sample size impacts accuracy
- Investigate selection bias and its effect on statistical error
- Explore four sampling techniques to reduce selection bias
- Understand the p-value in the context of hypothesis testing
- Interpret conclusions correctly for hypothesis testing
Common Core Math Standards
- Link to all CCSS Math
- CCSS.PRACTICE.MP3
- CCSS.HSS.IC.A.1
- CCSS.HSS.IC.A.2
- CCSS.HSS.ID.A.4
Personal Finance Objectives
- Explore different factors in college admissions and scholarship awards
National Standards for Personal Financial Education
Spending
- 2a: Select a product or service and describe the various factors that may influence a consumer’s purchase decision
Managing Credit
- 4a: Describe the different sources of funding for postsecondary education
- 5a: Compare federal and private student loans based on interest rates, repayment rules, and other characteristics
DISTRIBUTION & PLANNING
Distribute to students
- Student Activity Packet
- Application Problems
Intro
ANALYZE: A New Treatment
You are a medical researcher analyzing the results of a new experimental treatment. 1000 patients with the same illness were assigned to one of two groups, 500 to treatment 1 and 500 to treatment 2. Use the chart of patient recovery rates by treatment type to answer the questions.
1
What is the recovery % for treatment 1?_______ % For treatment 2?_______ %
1
What recommendation would you give about the two treatments from this data? Explain your reasoning.
Learn It
Sampling Error
Recall that we use sample data instead of a population due to practical limitations. Measurements of a sample are called statistics. When we calculate statistics, we use those statistics as our best estimate for the corresponding values in the population. Measurements of a population are called parameters, which are typically impossible or impractical to measure, but hold more interest because they allow us to make predictions.
Here is a chart of some common variables for sample statistics and population parameters.
Many different samples could be taken from a given population, and your estimate will differ depending on which particular sample you use. This difference between your sample statistic and the population parameter is called sampling error. While you cannot completely eliminate it, the larger the sample you take, the smaller your sampling error will tend to be, and the more accurate your estimate is likely to be.
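The effect of sample size on sampling error can be seen in a small simulation. The sketch below is only an illustration: the population of starting salaries is made up (a bell-shaped spread around $55,000 is an assumption, not real data), and `spread_of_sample_means` is a hypothetical helper name. It draws many samples of each size and measures how much the sample averages vary from one sample to the next.

```python
import random
import statistics


def spread_of_sample_means(population, sample_size, trials=500, seed=0):
    """Draw many random samples and return the standard deviation of their means."""
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(trials)
    ]
    return statistics.stdev(means)


# Hypothetical population of 10,000 starting salaries (made-up numbers).
population_rng = random.Random(42)
population = [population_rng.gauss(55_000, 12_000) for _ in range(10_000)]

# Larger samples should give averages that vary less from sample to sample.
for n in (1, 10, 100, 1000):
    print(f"sample size {n:>4}: spread of sample means = "
          f"{spread_of_sample_means(population, n):,.0f}")
```

The printed spread shrinks as the sample size grows, which is the sense in which a larger sample tends to have a smaller sampling error.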
Example:
You may want to know what the average starting salary is for a graduate with a Bachelor’s degree. You could ask 1000 recently graduated students from a nearby university what their starting salary is and find the average, or you could just ask your brother who recently graduated what his salary is.
1
Which of the two samples would you put more trust in, the 1000 students or your brother? Explain.
1
Let’s say that you asked a different set of 1000 students from that same university their salaries. Would you expect that the average for this second group was close to the average for the first? Why or why not?
1
Instead of your brother, you ask one other random graduate their starting salary. Would you expect their salary to be close to your brother’s salary? Why or why not?
Selection Bias
While summaries of data can be very helpful, simply looking at the data one way does not always give us the whole picture. The particular situation from the intro is an example of what is called Simpson’s Paradox, sometimes called association reversal.
One reason this can occur is a particular kind of error called selection bias, which arises when the sample is not an accurate representation of the population. Selection bias can be introduced any time you choose a sample from a population. This is why randomization is crucial to the integrity of any statistical study: it reduces this bias and the overall error in your estimates.
Example
Imagine you got more detailed results from the study in the intro. You know that major cases have a lower rate of recovery than minor cases no matter what treatment type you use.
- Of the 500 patients assigned treatment 1, 400 had a minor case and 100 had a major case.
- Of the 500 patients assigned treatment 2, 50 had a minor case and 450 had a major case.
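To see how an association can reverse when groups are combined, here is a minimal sketch. The group sizes match the breakdown above, but the recovery counts are hypothetical numbers chosen only to illustrate the effect; they are not the study's results, so use the chart for your own calculations.

```python
# (recovered, total) for each treatment and case severity.
# Group sizes follow the breakdown above; recovery counts are HYPOTHETICAL.
counts = {
    "treatment 1": {"minor": (360, 400), "major": (30, 100)},
    "treatment 2": {"minor": (48, 50), "major": (180, 450)},
}


def rate(recovered, total):
    """Recovery rate as a percentage."""
    return 100 * recovered / total


def overall_rate(groups):
    """Recovery rate with minor and major cases pooled together."""
    recovered = sum(r for r, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return rate(recovered, total)


for name, groups in counts.items():
    by_severity = {sev: round(rate(*rt), 1) for sev, rt in groups.items()}
    print(name, "| overall:", round(overall_rate(groups), 1), "| by severity:", by_severity)
```

With these made-up counts, treatment 1 has the higher recovery rate overall, yet treatment 2 has the higher rate within both the minor and the major subgroup. The reversal comes from the very different mix of minor and major cases in the two groups.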
2
What percentage of the patients with treatment 1 had a major case of the illness?_______
What percentage of the patients with treatment 2 had a major case of the illness?_______
4
Calculate the recovery % and complete the table for the patients divided by subgroups. (show your work)
1
Would your conclusions about treatment 1 and treatment 2 change? Why?
1
What is a major difference between the groups that would result in treatment 1 appearing more effective when the data is taken as one group?
ARTICLE: Sampling Methods | Types, Techniques & Examples
When you take a sample in order to make accurate claims about the underlying population, it is important to do everything possible to avoid selection bias. Here are four different sampling methods that can help make sure your sample provides meaningful conclusions. Note that there is no single “best” sampling method; the right choice depends on your situation and goals. Read the article, stopping at “Non-probability sampling methods”. Then, answer the questions.
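The four probability sampling methods usually covered under this heading are simple random, systematic, stratified, and cluster sampling; that list is an assumption here, so check it against the article. The sketch below shows one simplified way each could be carried out on a small roster. The roster, the strata, the cluster names, and the function names are all hypothetical.

```python
import random


def simple_random_sample(population, k, rng):
    """Every member has an equal chance of being chosen."""
    return rng.sample(population, k)


def systematic_sample(population, k, rng):
    """Pick a random starting point, then take every step-th member."""
    step = len(population) // k
    start = rng.randrange(step)
    return population[start::step][:k]


def stratified_sample(strata, fraction, rng):
    """Sample the same fraction at random from every subgroup (stratum)."""
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Choose whole clusters at random and keep everyone inside them."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]


rng = random.Random(1)
roster = list(range(1, 49))  # hypothetical roster: students numbered 1 to 48
strata = {"A": roster[:24], "B": roster[24:36], "C": roster[36:]}
clusters = {f"room {i + 1}": roster[i * 12:(i + 1) * 12] for i in range(4)}

print("simple random:", sorted(simple_random_sample(roster, 8, rng)))
print("systematic:   ", systematic_sample(roster, 8, rng))
print("stratified:   ", sorted(stratified_sample(strata, 0.25, rng)))
print("cluster:      ", cluster_sample(clusters, 1, rng))
```

Note the difference between the last two: stratified sampling takes some members from every subgroup, while cluster sampling takes every member from only some of the groups.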
20
Fill in this summary chart for the four types of random sampling techniques from the article. (Complete it on paper for a 20-point grade; if you are absent, fill it in here.)
Practice It
Dealing with Selection Bias
Your classmate, Lyra, wants to know the average number of scholarships awarded to American college students. She knows that there are many different scholarships affiliated with racial background, so she wants to be sure that her sample is representative of the overall American college demographics. She decides to survey her 48-person senior class and divides them into four groups to use cluster sampling. Below is the actual breakdown of enrollment by race in a chart and pie graph.
1
How would cluster sampling help to ensure that the sample data was a good estimate of the population parameter?
1
What might be difficult for Lyra about implementing cluster sampling with her 48-person senior class?
1
Which sampling technique or techniques would you recommend and why?
1
Lyra says, “I want to make sure my sample is completely unbiased because I want to be sure that my studies can’t be contested by anyone!” How might you respond to Lyra’s statement to help her understand selection bias better?
Learn It 2
Hypothesis Testing
Your friend says that 50% of students at your local community college are majoring in business, but you think that there are fewer than that. You decide to go ask 100 students at lunchtime what their major is and see if that is enough to prove your friend wrong.
Very often in statistical research, we have some hypothesis we are trying to prove or disprove, and we collect statistics to see if we were correct. A statistical test like this begins by forming your Null Hypothesis (H0) - the default assumption that the claimed population value is correct and that any difference seen in your sample is due to chance. For our example, that would be your friend’s claim of 50% business majors at the college.
You can then form your Alternative Hypothesis (H1 or Ha) - the claim you are testing against the null hypothesis. For our example here, your claim is that fewer than 50% are business students, so that is your alternative hypothesis.
Example
For your sample of 100 community college students, you can express the null and alternative hypotheses as follows:
- H0: 𝜇 = 50 students
- H1: 𝜇 < 50 students (Note: H1: 𝜇 ≠ 50 and H1: 𝜇 > 50 are different alternative hypotheses)
4
Identify the Hypotheses
1. You read in a Forbes article that the median student loan debt for college borrowers was $28,950. From what you’ve heard from family, that seems too low to you.
- H0: _______
- H1: _______
2. You saw a statistic from the Education Data Initiative that college tuition increases by an average of 8% each year. You don’t know if that’s too high or too low, but you think that it’s definitely not 8%.
- H0: _______
- H1: _______
VIDEO: Understanding the p-Value and What It Tells Us
We already know that any time you take a sample, you introduce sampling error. So how can you tell whether your sample differs from the null hypothesis by chance or because the null hypothesis is not true? We can use a number called a p-value to tell whether your results are statistically significant. Essentially, your sample needs to be large enough, and its results different enough from what the null hypothesis predicts, to claim with any authority that the null hypothesis is incorrect. Watch this video to learn more about the p-value and how it is used in a realistic example. Then, answer the questions.
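As a rough illustration of what a p-value measures, the sketch below returns to the business-major example. It assumes, hypothetically, that 40 of the 100 students surveyed turned out to be business majors, and it uses an exact binomial calculation, which is one of several ways such a test could be run. The function name `lower_tail_p_value` is made up for this sketch.

```python
from math import comb


def lower_tail_p_value(successes, n, p0):
    """Probability of seeing `successes` or fewer out of n if the true rate is p0."""
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(successes + 1)
    )


# Hypothetical survey result: 40 business majors out of 100 students.
# Null hypothesis: the true proportion of business majors is 50%.
p_value = lower_tail_p_value(40, 100, 0.5)
print(f"p-value = {p_value:.4f}")
```

With these made-up numbers the p-value comes out below 0.05: if exactly half of all students really were business majors, a sample with 40 or fewer would be uncommon. A count closer to 50 would give a larger p-value.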
1
Sample statistics may possibly differ from the population parameters by mere chance. The p-value tells us the probability that _____
1
Write the null and alternative hypotheses for this Choconutty test.
H0:_______
H1:_______
1
Assuming the null hypothesis is true, would it become increasingly more likely or increasingly less likely to find a sample mean weight further and further away from 70g?
1
What would happen to the p-value of your test as the mean of the samples gets further and further away from 70g?
1
The significance level is the threshold we choose for how unlikely is unlikely enough to reject the null hypothesis. The most common level of significance is 0.05. If the p-value is less than 0.05, then results at least as extreme as ours would occur less than 5% of the time if the null hypothesis were true, so we reject the null hypothesis. Give an example where you might either want more confidence or accept less confidence in your test.
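The decision rule described here can be written as a short sketch. The function name `decide` and the two p-values passed to it are hypothetical; the key assumption is that the significance level is fixed before the data are collected.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value to a significance level chosen BEFORE the test."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# Hypothetical p-values, for illustration only.
print(decide(0.01))               # below the 0.05 threshold
print(decide(0.20))               # above the 0.05 threshold
print(decide(0.03, alpha=0.01))   # a stricter threshold changes the decision
```

Note that the second outcome is worded as "fail to reject" rather than "accept": a large p-value means the sample did not give enough evidence against the null hypothesis, not that the null hypothesis has been shown to be true.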
APPLICATION: Analyzing College Statistics
10
Desmos: ACTIVITY: Card Sort: Sampling Methods
College Student Aid
You perform a hypothesis test with a null hypothesis that the average percentage of undergraduate students who receive federal student financial aid is 44.5%. Your alternative hypothesis is that more students receive financial aid than that, and you are testing with a significance level of 0.05. You perform your test and come up with a p-value of 0.0224.
1
What does this p-value signify in real terms for your test?
1
Give TWO different ways to word your statement to accurately describe the results of your test.
2
When you collected data for your sample, you referenced the list of the largest public universities in the nation and used simple random sampling from that list to get 400 data points.
a. What is one possible error in your sample due to selection bias?
b. Explain how you might select a better sample to reduce selection bias.
Level 2
Ethics in Testing
You decide to perform a statistical analysis to test the claim that 62.5% of undergraduate students receive some form of federal, state, or local financial aid for college. You suspect that it’s something different, either higher or lower, but you’re not sure. First you need to decide how you are getting your sample data.
1
You have a list of all undergraduate universities in the country available and you decide to use a stratified sampling method to select your sample data. List at least one way you could divide the universities into groups and explain why you chose those groups.
You collect your data and use a sample of 500 universities out of over 5,000 and perform your tests. You find that the p-value is 0.055 and you were using a significance level of 0.05.
1
What result would you report for your statistical test?
2
List TWO ways you could word the interpretation of the result of your test.
1
You decide instead to change how you report your results. You claim that “Our data supports the rejection of the null hypothesis at the 94.5% confidence level.” Changing your chosen significance level after performing your test is a form of ‘cherry-picking’, and it is scientifically unethical. Another group gets a p-value of 0.10 and reports that they reject the null hypothesis with 90% confidence. A third group with a p-value of 0.22 publishes their data, saying it shows a statistically significant difference from the null hypothesis at the 78% confidence level. What do you think the problem is with changing your confidence level after you perform your test analysis?
1
You feel you are really close and are still confident that the null hypothesis is incorrect. What changes could you make in a follow-up experiment given the results of your original experiment with a p-value of 0.055?