Central Tendency and Dispersion

Last updated about 3 years ago
15 questions


You may have seen statements like "Patrick Mahomes' stats are higher than expected for a 24-year-old quarterback" or "Guillermo Diaz is a prominent gamer on YouTube mainly through his 'Minecraft' videos". Statements like these present conclusions based on data. Frequently, these conclusions are derived from a measure of central tendency, a number that marks the "middle" of the data.

The three most important measures of central tendency are mean, median, and mode.
  • The mean, also known as the average, is the sum of the data values divided by the number of values.
  • The median is the number in the middle of a set of data values when the values are arranged in order.
  • The mode is the value that occurs most often in a set of data values.
The table below shows a snippet of Patrick Mahomes' game stats for the 2019-2020 NFL season. The example walks through how to determine his mean, median, and mode for passing yards during the regular season.
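The three measures can also be checked with a few lines of Python. The sketch below is only an illustration: the passing-yard values are hypothetical (they are not Patrick Mahomes' actual stats), and it relies on Python's built-in statistics module.

```python
from statistics import mean, median, mode

# Hypothetical passing yards for seven games (NOT real NFL data).
passing_yards = [310, 275, 340, 275, 298, 360, 242]

# Mean: the sum of the values divided by the number of values (2100 / 7).
yards_mean = mean(passing_yards)
# Median: the middle value once the values are sorted.
yards_median = median(passing_yards)
# Mode: the value that occurs most often (275 appears twice).
yards_mode = mode(passing_yards)

print(sorted(passing_yards))  # [242, 275, 275, 298, 310, 340, 360]
print(yards_mean, yards_median, yards_mode)  # 300 298 275
```

Sorting the values first makes the median and mode easy to confirm by eye.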
1

Which statement is true based on Patrick Mahomes' passing record?

3

The following data was collected about two college athletic teams. Use the data to correctly match the mean, median, and mode for each team.

Draggable item / Corresponding item
mean: height of football team
80 inches
median: height of basketball team & mode: height of the football team
74 inches
mode: height of basketball team
74.1 inches
mean: height of basketball team
72.6 inches
median: height of the football team
73 inches
By comparing the heights of the players on the football and basketball teams, we could conclude that the basketball players are taller than the football players. In fact, when you directly compare the measures of central tendency for each team, every measure is greater for the basketball team than for the football team. However, their means and medians differ by only about one inch, whereas their modes differ by 6 inches.

For a more complete comparison of the data sets, we can also study the distribution of the values in each set. To study distribution, we first arrange the values in order from least to greatest, just like we did to determine the median. We can display this arrangement on a dot plot:

A dot plot can reveal several things about a data set. Because the values are arranged from least to greatest, we can determine the range (the difference between the lowest and highest values). We can also see how the values group together or are spaced apart. For example, any outliers in a data set will be quite obvious on a dot plot. An outlier is a value that is much less or much greater than the other values in a data set. When using dot plots to compare data sets, we can also see any areas where the data in the two sets overlap.

To determine the range for a data set, subtract the minimum from the maximum value.
  • The minimum is the data point with the lowest or smallest value.
  • The maximum is the data point with the greatest or highest value.
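As a rough sketch of the range calculation, the Python below sorts a small list, picks out the minimum and maximum, and subtracts. The heights are hypothetical values chosen only for illustration; they are not the team data from this lesson.

```python
# Hypothetical heights in inches (illustrative only, not the lesson's team data).
heights = [73, 70, 74, 71, 80, 74, 72]

# Arrange the values from least to greatest, as on a dot plot.
ordered = sorted(heights)

minimum = ordered[0]    # the smallest value
maximum = ordered[-1]   # the greatest value
data_range = maximum - minimum

print(ordered)                        # [70, 71, 72, 73, 74, 74, 80]
print(minimum, maximum, data_range)   # 70 80 10
```

In this made-up list, 80 sits well apart from the other values, which is the kind of gap a dot plot makes easy to spot.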
1

Which statement is true about the heights of the football and basketball players in our sample?

3

Explain your reasoning behind your answer to #3. Think about how you used the provided data to arrive at your answer.

1

Use the information below to answer questions 5 - 8.

A company asks 6 employees to turn in receipts for their travel expenses. The expenses were separated into transportation and lodging.

Calculate the company's transportation mean.

1

Calculate the company's lodging mean.

1

Which statement is true about the distribution of the company's expenses?

3

The company is planning to send another employee on a business trip. About how much money, in all, should the company expect the employee to spend on the trip? Explain how you determined the answer.

Quartiles and interquartile range are more difficult to work with and understand than mean, median, mode, and range, but they provide important ways of summarizing data.

Because the median is the middle number in a set of data, you can think of the median as dividing the data into two halves, a lower half and an upper half. Frequently, people who use statistics decide that it is also valuable to divide data into fourths. When data are divided into fourths, the divisions between the groups of data are called quartiles.
  • The first quartile is the median of the lower half of the data.
  • The second quartile is the median of all the data.
  • The third quartile is the median of the upper half of the data.
  • The interquartile range, also called IQR, is the difference between the third and first quartiles (third quartile minus first quartile).
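The definitions above can be sketched in Python for a data set with an odd number of values. This is a minimal illustration, not the only way to compute quartiles: the function name quartiles and the scores are hypothetical, and the median is left out of both halves because it is itself a member of the set.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd count, the median is a value in the set and is
    # excluded from both halves; with an even count, nothing is excluded.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

# Hypothetical scores (illustrative only).
scores = [22, 12, 31, 17, 26, 15, 20]

q1, q2, q3 = quartiles(scores)
iqr = q3 - q1

print(sorted(scores))   # [12, 15, 17, 20, 22, 26, 31]
print(q1, q2, q3, iqr)  # 15 20 26 11
```

Other tools (spreadsheets, graphing calculators, statistics libraries) may use slightly different quartile rules, so their results can differ from this method on some data sets.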
Let's use the data set for the heights of the college basketball team to determine how to find each quartile.

The interquartile range for the heights of the college basketball team can be determined by subtracting the first quartile from the third quartile: 79 - 70 = 9. The interquartile range for the heights of the college basketball team is 9 inches.

Like the range, the IQR is used to describe the variation in a data set. Using the heights of the college football team, we can determine the first quartile to be 70 inches and the third quartile to be 75 inches, giving an IQR of 5 inches. Since the IQR of the basketball team, 9 inches, is greater than the IQR of the football team, 5 inches, we can conclude that the heights of the basketball team vary more than the heights of the football team.
2

The scores Terrence got on the last seven video games he played are listed below.

11,992 12,430 13,094 10,876

9,455 7,097 8,934

What is the interquartile range of his scores?

All of the data sets we have looked at so far have an odd number of data points. Whenever the set, or the upper or lower half of a set, contains an even number of values, you must calculate the median as the mean of the two middle values. In such cases, the calculated value may not actually be a number in the set. Look at this set of six values.

The second quartile, 9.5, is not actually a number in the set. It is the mean of the two middle numbers: (9 + 10) ÷ 2 = 9.5. Since it is not actually part of the set, it does not need to be discarded to find the medians of the lower and upper halves. There is already an equal number of values in the halves.
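The even-count case can be sketched the same way. The six values below are hypothetical; they were chosen so that the two middle numbers are 9 and 10, matching the calculation above, but they are not necessarily the values in the lesson's set.

```python
from statistics import median

# Hypothetical set of six values whose two middle numbers are 9 and 10.
values = [4, 7, 9, 10, 13, 16]

ordered = sorted(values)
half = len(ordered) // 2

# With an even count, the median is the mean of the two middle values.
q2 = median(ordered)        # (9 + 10) / 2

# Nothing is discarded: the set already splits into two equal halves.
lower_half = ordered[:half]
upper_half = ordered[half:]
q1 = median(lower_half)
q3 = median(upper_half)
iqr = q3 - q1

print(q2)                      # 9.5
print(lower_half, upper_half)  # [4, 7, 9] [10, 13, 16]
print(q1, q3, iqr)             # 7 13 6
```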
1

A scientist recorded the temperature, in degrees Celsius, in 12 different parts of the rainforest. Her results were:
11, 14, 12, 15, 8, 16, 21, 10, 11, 17, 13, 10
What is the interquartile range (IQR), in degrees Celsius, of the temperatures?

3

She repeats the experiment in another rainforest by recording the temperature in 12 different parts of the second rainforest. The IQR of the second rainforest is 8°C. Which rainforest shows a greater variation in temperatures? How does comparing the IQR of each rainforest help you to determine your answer?

1

A box plot provides the best way to study the IQR. A box plot is a statistical diagram that summarizes data using five plotted points on a number line. The five numbers are listed below. Use what you have discovered about each number to correctly match them to the best description.

Draggable item / Corresponding item
first quartile
the lowest value in a data set
minimum
the highest value in a data set
maximum
the middle number in a data set
third quartile
the median of the lower half of a data set
median
the median of the upper half of a data set
We will look again at the heights of the basketball team to create our first box plot.

Step 1: Identify the five numbers: minimum, maximum, first and third quartiles, and median.

Step 2: On a number line:
  • Draw a line for each quartile and the median.
  • Draw a point for the minimum and the maximum.

Step 3: Draw your connections:
  • Create the IQR box by connecting the tops and the bottoms of the two quartile lines.
  • Connect the minimum point to the middle of the first quartile line.
  • Connect the maximum point to the middle of the third quartile line.
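The three steps can be roughed out in Python as well. This is only a sketch under stated assumptions: the function names five_number_summary and text_box_plot are hypothetical, the data values are made up, the quartiles use the median-of-halves method from this lesson, and the "plot" is a single line of text rather than a real drawing. It also assumes the maximum is greater than the minimum.

```python
from statistics import median

def five_number_summary(values):
    """Step 1: return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Leave the median out of both halves when the count is odd.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

def text_box_plot(summary, width=41):
    """Steps 2 and 3: draw a rough one-line box plot.

    '*' marks the minimum and maximum, '|' marks the quartiles and median,
    '=' fills the IQR box, and '-' draws the connecting lines.
    """
    low, q1, med, q3, high = summary

    def column(value):
        # Scale a data value to a character position on the line.
        return round((value - low) / (high - low) * (width - 1))

    row = ["-"] * width
    for i in range(column(q1), column(q3) + 1):
        row[i] = "="
    for value in (q1, med, q3):
        row[column(value)] = "|"
    row[0] = "*"
    row[-1] = "*"
    return "".join(row)

# Hypothetical data (illustrative only).
data = [60, 50, 90, 54, 70, 58, 66]

summary = five_number_summary(data)
plot = text_box_plot(summary)

print(summary)  # (50, 54, 60, 70, 90)
print(plot)
```

In the printed line, a long run of dashes on the right shows that the maximum lies far from the third quartile, which is the same thing a hand-drawn box plot would reveal.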



Use the information below to answer questions 13 - 15.

For each data set, identify the minimum, maximum, first and third quartiles, and median. Use the values to create a box plot of each data set. Then use the box plots to compare the two data sets.

Ticket Sales for The Beast Destroyer
400, 385, 380, 386, 370, 385, 375, 360, 355, 340, 346, 344, 350, 356

Ticket Sales for A Romance Remembered
378, 380, 374, 365, 370, 366, 245, 288, 300, 274, 256, 255, 280, 271
5

Create a box plot for The Beast Destroyer's ticket sales.

5

Create a box plot for A Romance Remembered's ticket sales.

3

Write three comparison statements for the ticket sales of the two movies. Include at least one statement to describe the difference between their:
  • measures of center (mean, median, mode)
  • distribution (range, IQR)