5/17 FA 8.2 Box Plots

Last updated over 2 years ago
23 questions

OBJECTIVES & STANDARDS

Math Objectives
  • Calculate measures of central tendency
  • Identify outliers and explain their effects on central tendency calculations
  • Create a 5 number summary of a data set
  • Create a box plot to represent a data set
  • Analyze data using a box plot, range, and interquartile range
Common Core Math Standards
  • Link to all CCSS Math
  • CCSS.HSS.ID.A.1
  • CCSS.HSS.ID.A.2
  • CCSS.HSS.ID.A.3
Personal Finance Objectives
  • Compare student loan debt change
  • Analyze outstanding mortgage balances
  • Graph and summarize auto loan data
National Standards for Personal Financial Education
  • There are no relevant personal finance standards for this lesson

Intro

CALCULATE: Finding the Mean and Median

Consider the following three sets of numbers.
Set 1: {1, 11, 21, 31, 41, 51, 61}
Set 2: {28, 29, 30, 31, 32, 33, 34}
Set 3: {0, 31, 31, 31, 31, 31, 62}

Hint:

What are the mean, median, and mode?

The mean is the number you get by dividing the sum of a set of values by the number of values in the set.

In contrast, the median is the middle number in a set of values when those values are arranged from smallest to largest.

The mode of a set of values is the most frequently repeated value in the set.
1
For each set of numbers, find the mean and median.
Set 1: Mean_______ Median_______
Set 2: Mean_______ Median_______
Set 3: Mean_______ Median_______
Learn It

TEACHER TIP: This lesson assumes students have previously learned about median.  If you feel that you need a review of this concept, watch this video.

Describing a Data Set

Mean and median are measures of the center of a set of data, but they don’t tell us the whole story about the data. The three sets in the intro are all very different, yet they have the same mean and median.
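
The claim above can be checked with a short Python sketch using the standard `statistics` module. The dictionary name `sets` and the "spread" column (maximum minus minimum) are illustrative choices, not part of the lesson.

```python
from statistics import mean, median

# The three sets from the intro.
sets = {
    "Set 1": [1, 11, 21, 31, 41, 51, 61],
    "Set 2": [28, 29, 30, 31, 32, 33, 34],
    "Set 3": [0, 31, 31, 31, 31, 31, 62],
}

for name, values in sets.items():
    # Each set has the same center, but the distance from the
    # smallest to the largest value differs from set to set.
    spread = max(values) - min(values)
    print(name, "mean:", mean(values), "median:", median(values), "spread:", spread)
```

Running it prints one line per set, so the centers and the spreads can be compared side by side.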

Box Plots and the 5 Number Summary

To get a clearer picture of a set of data, we can make a box plot.  A box plot is a visual representation of 5 numbers that represent the spread and skew of the data.  These numbers are:
  • Minimum - the smallest number in the set
  • Maximum - the largest number in the set
  • Median - the middle number of the set
  • 1st Quartile (Q1) - the number below which 25% of the data points are found. You can find this by locating the middle number of the ordered data BELOW the median
  • 3rd Quartile (Q3) - the number below which 75% of the data points are found. You can find this by locating the middle number of the ordered data ABOVE the median
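
The five definitions above can be sketched in Python as follows. This is a minimal sketch that assumes the "middle of each half" method described in this lesson; software packages may use other quartile conventions and give slightly different values. The function name `five_number_summary` and the list `scores` are hypothetical and not taken from the lesson.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) using the halves method."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]           # ordered data BELOW the median
    upper = values[half + n % 2:]   # ordered data ABOVE the median
    # With an odd count, the median itself is left out of both halves.
    return (values[0], median(lower), median(values), median(upper), values[-1])

# Hypothetical data, not from the lesson.
scores = [2, 4, 4, 5, 7, 8, 9, 10, 12]
print(five_number_summary(scores))  # (2, 4.0, 7, 9.5, 12)
```

When a half has an even number of values, `median` averages the two middle values, which matches the rule used for the median itself.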

Example

As a part of a final project for school, Nick asked 11 of his classmates how much student loan debt they will have after graduating college and received the following responses:

1
What is the median of Nick’s data?_______
3
To find Q1, find the middle of the data BELOW the median.  Just like with median, if two numbers share the middle, average them together.  Here’s the first half of Nick’s data:
$5,000
$11,000
$19,500
$25,000
$32,000

a. What is the 3rd quartile value? 3rd Quartile: $_______
b. What are the minimum and maximum of Nick’s data set?
Minimum: $_______
Maximum: $_______
4

Creating a Box Plot

A box plot is a visual representation of the 5 number summary.  Here are the steps to create a box plot of Nick’s student loan debt data.

  1. Create a number line that covers the entire range of values
  2. Place a line above each value of the 5 number summary
  3. Create a box from Q1 to Q3 that includes the median in between
  4. Draw lines from the middle of the box to the minimum and maximum values
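
The four steps above can be imitated with a rough text drawing. The sketch below is only an illustration under stated assumptions: the function name `ascii_box_plot`, the 41-character width, and the sample summary (0, 10, 20, 30, 40) are all hypothetical, and the maximum must be larger than the minimum.

```python
def ascii_box_plot(minimum, q1, med, q3, maximum, width=41):
    """Draw a one-line text box plot from a 5 number summary."""
    def column(value):
        # Step 1: place the value on a number line `width` characters wide.
        return round((value - minimum) / (maximum - minimum) * (width - 1))

    line = [" "] * width
    # Step 4: lines (whiskers) reaching out to the minimum and maximum.
    for i in range(column(minimum), column(maximum) + 1):
        line[i] = "-"
    # Step 3: the box from Q1 to Q3, drawn over the whisker line.
    for i in range(column(q1), column(q3) + 1):
        line[i] = "="
    # Step 2: a mark at each value of the 5 number summary.
    for value in (minimum, q1, med, q3, maximum):
        line[column(value)] = "|"
    return "".join(line)

# Hypothetical 5 number summary, not from the lesson.
print(ascii_box_plot(0, 10, 20, 30, 40))
```

The code draws the whiskers and box first and the marks last so that the five marks stay visible; on paper the order of the steps listed above works just as well.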

Practice It
Timor is looking into retirement investments and surveys 15 people about the age at which they plan to retire.  He got the following responses:
35        45        50        51        52        52        64        64        65        65        66        66        66        67        70
1
Create a 5 number summary of the data.
Minimum:_______

Q1:_______

Median:_______

Q3:_______

Maximum:_______
1

Create an appropriate number line for this data, then draw the box plot.

Learn It 2

Analyzing a Box Plot

When looking at a set of data, we can ask some important questions:
  1. How spread out is the data?  In the intro, the first set changed by 10s and spread from 1 to 61.  The second set only changed by 1s and was grouped around the center.  This is called dispersion.
  2. Is there any data that looks like it doesn’t belong?  If you are a typical A student and most of your grades are in the 90s but you get one bad grade because you were sick that week, that one bad grade would stand out.  We call that type of data an outlier.
  3. How is the data grouped?  If we looked at data on age of retirement, most of the data would be near the high end of ages but if we looked at the age that someone got their driver's license, most of the data would be near the low end.  This is called skewness. (You’ll learn more about this one in a couple of lessons)
Now that we have a visual representation of Nick’s data, let’s look at the conclusions that Nick can draw about his data set to include in his presentation.  Here’s the box plot again for reference.



A box plot breaks the data up into quartiles, which means 4 equal parts.  They might not look equal on the graph but each region represents 25% of the data:

This allows us to analyze where the data is located. For example, Nick adds the following analysis to his presentation:
“75% of survey respondents said that they will have less than $42,000 of student loan debt when they graduate.”
Because $42,000 is Q3 of Nick’s data, three of the four 25% sections lie below that value.
1

If Nick wanted to include information about the bottom 25% of respondents, what statement could he make?

1

Are there any outliers in Nick’s data?

1
There are two other useful pieces of information that we can calculate about a set of data to learn about its dispersion:
  • The range is the difference between the minimum and maximum values
  • The interquartile range (IQR) is the difference between the first and third quartiles.   It tells us how spread out the middle 50% of data is.
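
Both definitions are simple subtractions, as the short Python sketch below shows. The function names and the sample 5 number summary are hypothetical and are not Nick's data.

```python
def data_range(minimum, maximum):
    """Range: the difference between the maximum and minimum values."""
    return maximum - minimum

def interquartile_range(q1, q3):
    """IQR: the difference between the third and first quartiles."""
    return q3 - q1

# Hypothetical 5 number summary, not from the lesson.
minimum, q1, med, q3, maximum = 2, 4, 7, 9.5, 12
print("Range:", data_range(minimum, maximum))  # Range: 10
print("IQR:", interquartile_range(q1, q3))     # IQR: 5.5
```

In this made-up example the whole data set spans 10 units, while the middle 50% spans only 5.5.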
What are the range _______ and interquartile range (IQR)_______ of Nick’s data?
CREATE: Box Plots Jigsaw
How can we use box plots to compare similar data sets? In this activity, you will work with your groupmates to create a box plot for your designated region then compare your results with another region to analyze the results.
Part 1: Create Your Box Plot
The ordered tables below represent average credit card debt of states in different regions of the US. Follow your teacher's directions to create a box plot for your assigned region.




6
  1. Find the 5 number summary of the data for your region.
Assigned Region: _______
Min: _______
Q1:_______
Median:_______
Q3:_______
Max:_______
10

Use the number line to create a box plot, then create a poster with your box plot to share with the class.

Part 2: Compare Box Plots
Compare your region’s box plot to the other regions that your classmates worked on. Use these comparisons to answer the following questions.
1

Which region had the highest median credit card debt?

1

Which region had the smallest interquartile range? What does this mean about the region?

1

Compare the Midwest and Southern regions and summarize their similarities and differences.

4
Complete the following statement about each region:
a. 75% of states had less than _______ in average credit card debt in the Western region

b. 75% of states had less than _______ in average credit card debt in the Midwest region

c. 75% of states had less than _______ in average credit card debt in the Southern region

d. 75% of states had less than _______ in average credit card debt in the Northeast region
1

What general statement could you make about the US based on your answer to question 6?

1

If the US gained a 51st state that had $0 in credit card debt, what would that data do to the box plot?

Application Problems

Level 1

Comparing Student Loan Debt

The box plots below use the data on state average student loan debt for all 50 US states from 2016 and 2021. Use them to answer the questions that follow.

1
What was the approximate maximum average student loan debt in 2016?_______
1
Estimate how much the median student loan debt increased from 2016 to 2021._______
1

What is the approximate interquartile range of each set of data?  Draw a conclusion about what this means about student loan borrowers in 2016 and 2021.

1
What percentage of states had an average of more than $5,000 in student loan debt in 2021?_______
1
How many states are represented by the box portion of each diagram?_______
1
Fill in the blank: The top 25% of borrowers in 2016 had between $_______ and $_______ in debt.