[AP Statistics] 12.2b Chi Square Test for Independence

By Oliver Khamky
Last updated about 1 year ago
31 Questions
First, we learned the Chi-Square test for Goodness of Fit. This test is used when you have 1 sample (Harvard Applicants in 2019) and 1 variable (Racial Identity).

Last time, we learned the Chi-Square test for Homogeneity, which is used when you have 2 samples (my two classes) and 1 variable (which test did you do best on?).

This time, we will learn the Chi-Square test for Independence, which is used when you have 1 sample and 2 variables.

All three have similar conditions and mechanics. The hypotheses and conclusions are worded a bit differently. We will work on distinguishing the three in a bit. Right now, you will learn the Chi-Square test for Independence.

Here is a link to the data (might load slowly)

https://docs.google.com/spreadsheets/d/1j5GRWrH8ikqMqwKi8Y9WHGjbxCPvL0_VkdBaeNMtU0g/edit#gid=0



https://www.random.org/integers/

[Stop 1] Race: _______ Force Level: _______
[Stop 2] Race: _______ Force Level: _______
[Stop 3] Race: _______ Force Level: _______
[Stop 4] Race: _______ Force Level: _______
[Stop 5] Race: _______ Force Level: _______
[Stop 6] Race: _______ Force Level: _______
[Stop 7] Race: _______ Force Level: _______
[Stop 8] Race: _______ Force Level: _______
[Stop 9] Race: _______ Force Level: _______
[Stop 10] Race: _______ Force Level: _______


Note: There is a key on the second sheet of the data explaining the codes

https://www.nyc.gov/site/nypd/bureaus/patrol/find-your-precinct.page

The goal of our study is to determine if there is an ASSOCIATION between race and use of force in our data.

We will use a Chi-Square test for Independence, which takes 1 sample (NY Police Stop/Frisk Data), compares 2 variables (Racial Identity and Police Force), and checks whether those variables are independent or associated within this sample.

In general terms, what is the null hypothesis of our study?

In general terms, what is the alternative hypothesis of our study?

All three Chi-Square Tests have the same conditions



What information do we need before we can check the Large Counts condition?

Here is the class data




What is the expected count for White/None? This is calculated the same way as last time.


Round to 1 decimal place
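
If you want to check your arithmetic, here is a minimal Python sketch of the expected count rule (row total times column total, divided by the grand total). The function name `expected_count` and the totals in the example are made up for illustration; they are not the class data.

```python
def expected_count(row_total, col_total, grand_total):
    """Expected count for one cell if the two variables were independent."""
    return row_total * col_total / grand_total

# Made-up totals for illustration only (NOT the class data):
# 8 stops in the cell's row, 15 stops in its column, 40 stops overall.
print(round(expected_count(8, 15, 40), 1))  # 3.0
```

Swap in the row total, column total, and grand total from the class table to check your own answer.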

In order to finish checking our conditions, we want to input the data in our calculator.

Put the observed counts into Matrix A (2nd -> Matrix -> Edit -> [A]) and then run the χ²-Test on your calculator. Then open Matrix B in Edit mode to see the expected counts.

Do we meet the large counts condition?
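
For reference, the Large Counts check can also be sketched in Python. This assumes the usual rule that every expected count must be at least 5. The helper names `expected_table` and `large_counts_ok` and both example tables are made up for illustration; they are not the class data.

```python
def expected_table(observed):
    """Expected counts for every cell of a two-way table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def large_counts_ok(observed, minimum=5):
    """True only if every expected count is at least `minimum`."""
    return all(e >= minimum for row in expected_table(observed) for e in row)

# Made-up tables for illustration only (NOT the class data).
small_sample = [[6, 2], [10, 2]]    # some expected counts fall below 5
large_sample = [[30, 20], [20, 30]]  # every expected count is 25

print(large_counts_ok(small_sample))  # False
print(large_counts_ok(large_sample))  # True
```

Note that the check uses the expected counts, not the observed counts.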


All calculations for this test work the same as the Test for Homogeneity. The main difference is the null/alternative hypothesis, shown above.

What is the χ² test statistic from this study?

Round to 1 decimal place

How many degrees of freedom?

What is the p-value?

Round to 2 decimal places
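
To see what the calculator is doing behind the scenes, here is a rough Python sketch of the three numbers above: the χ² statistic (the sum of (observed - expected)² / expected over every cell), the degrees of freedom ((rows - 1) × (columns - 1)), and the p-value (the area to the right of the statistic). The function names and the example table are made up for illustration, and the p-value routine is just one standard-library way to get the upper-tail area, not necessarily how the calculator computes it.

```python
import math

def chi_square_p_value(statistic, df):
    """Upper-tail area P(X >= statistic) for a chi-square distribution."""
    half = statistic / 2
    # Known starting values: df = 2 for even df, df = 1 for odd df.
    if df % 2 == 0:
        p, current_df = math.exp(-half), 2
    else:
        p, current_df = math.erfc(math.sqrt(half)), 1
    # Step up two degrees of freedom at a time until we reach df.
    while current_df < df:
        p += half ** (current_df / 2) * math.exp(-half) / math.gamma(current_df / 2 + 1)
        current_df += 2
    return p

def chi_square_test(observed):
    """Return (statistic, df, p_value) for a two-way table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - exp) ** 2 / exp
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df, chi_square_p_value(statistic, df)

# Made-up 2x2 table for illustration only (NOT the class data).
statistic, df, p_value = chi_square_test([[30, 20], [20, 30]])
print(round(statistic, 1), df, round(p_value, 2))  # 4.0 1 0.05
```

Your calculator's χ²-Test reports the same three values, so you can use this to double-check what you typed into Matrix A.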

What is your conclusion at the 0.05 level?

Ultimately, our sample size was too small to see a difference. Here is the data for the full dataset:



Even though the difference was statistically significant and an association was found, it does NOT mean that there is causation. Further studies are required to discount the effects of confounding variables.
Sample:






STATE

PLAN (will need to input data into your calculator to get the expected values)

DO

CONCLUDE


Explain


Explain


Explain


Explain


Are there any problems above that you would like to go over in class? Indicate the question numbers below (Numbers refer to the Formative question number). I'll try to cover anything that is highly requested.