AP CSP Extracting Information

Last updated 8 months ago
20 questions
Question 1
1.

Question 2
2.

Question 3
3.

Based on the question above, why is it better to use a student ID to uniquely identify a student? Why can't we use the full name?

Question 4
4.

Question 5
5.

Examine the chart shown in the image, and type in your observations.

Question 6
6.

Examine the chart shown in the image, and type in your observations.

Question 7
7.

Consider the relationship between the speed of a car and the time it takes to reach a destination.
(Is the relationship positive or negative? Explain your answer choice.)

Question 8
8.

Question 9
9.

Question 10
10.

Question 11
11.

Question 12
12.

Question 13
13.

Question 14
14.

Question 15
15.

Question 16
16.

Question 17
17.

Question 18
18.

Differentiate between data and information.

Question 19
19.

age = [23, 21, 20, 22, 19, 18, 22, 25, 17, 24, 21]
What is the mean / average age?

Question 20
20.

age = [23, 21, 20, 22, 19, 18, 22, 25, 17, 24, 21]

Suppose the list above represents the ages of students at a bootcamp.
Write a function that returns the average age of students above 18 years old (use any language of your choice; do not use built-in functions).
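
One possible approach to the question above, sketched in Python. It assumes "above 18" means strictly greater than 18 and reads "no built-in functions" as avoiding helpers such as sum() and len(); the function name average_age_above and its threshold parameter are illustrative choices, not part of the question.

```python
def average_age_above(ages, threshold=18):
    """Average of the ages strictly greater than threshold, using a manual loop."""
    total = 0
    count = 0
    for a in ages:
        if a > threshold:      # 18 itself is excluded under this reading
            total += a
            count += 1
    if count == 0:
        return None            # no qualifying students, so no average exists
    return total / count


age = [23, 21, 20, 22, 19, 18, 22, 25, 17, 24, 21]
result = average_age_above(age)   # 197 / 9, roughly 21.89
```

Returning None for an empty selection avoids a division by zero; raising an error would be an equally reasonable design.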

A researcher wants to investigate whether there is a relationship between the number of extracurricular activities a student participates in and their grade point average. The researcher has access to two databases with the following information:

Database 1:
  • First name
  • Last name
  • Grade point average (on a 0.0 to 4.0 scale)
  • Grade level (9, 10, 11, or 12)
Database 2:
  • First name
  • Last name
  • Number of extracurricular activities
  • Total hours spent on extracurricular activities
What step should the researcher take first to analyze the data?
Combine the data from both databases using first name and last name as common keys.
Calculate the average number of extracurricular activities per grade level.
Find the correlation coefficient between grade level and grade point average.
Determine the percentage of students with a grade point average above 3.5.
A school administrator wants to investigate whether the number of hours students spend studying per week affects their performance in math. The administrator has access to the following databases:
Database 1:
  • Student ID
  • Full name
  • Grade level (9, 10, 11, or 12)
  • Math test scores
Database 2:
  • Student ID
  • Number of hours spent studying per week
What should the administrator use as the common key to combine the data from both databases?
Grade level
Math test scores
Student ID
Number of hours spent studying
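The idea of combining two databases on a shared key can be sketched in Python as follows. All records, field names, and the function join_on_student_id are hypothetical and exist only for illustration. Two students deliberately share a full name, to show why a name is a weaker key than an ID.

```python
# Hypothetical rows from Database 1, keyed by Student ID.
scores = {
    1001: {"full_name": "Alex Kim", "grade_level": 9, "math_score": 88},
    1002: {"full_name": "Alex Kim", "grade_level": 11, "math_score": 72},
    1003: {"full_name": "Sam Lee", "grade_level": 10, "math_score": 95},
}

# Hypothetical rows from Database 2: Student ID -> study hours per week.
hours = {1001: 6, 1002: 3, 1004: 5}


def join_on_student_id(scores, hours):
    """Keep only students present in both sources and merge their fields."""
    combined = []
    for student_id, record in scores.items():
        if student_id in hours:
            row = dict(record)              # copy so the input is not modified
            row["student_id"] = student_id
            row["study_hours"] = hours[student_id]
            combined.append(row)
    return combined


merged = join_on_student_id(scores, hours)
```

Here the two students named "Alex Kim" stay distinct because each has a different ID; joining on the name would have mixed up their records.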
A researcher wants to analyze the relationship between students’ sleep habits and their grade point averages. The researcher has collected the following information:
  • Average hours of sleep per night
  • Grade point average (GPA)
  • Grade level
What type of graph would best represent the relationship between average hours of sleep per night and GPA?
Bar graph
Pie chart
Scatter plot
Histogram
A company collects customer feedback from two sources:
  • Source 1: Feedback stored as text (e.g., "Good," "Excellent").
  • Source 2: Feedback stored as numerical ratings (e.g., 4 out of 5).
Which of the following is the best approach to analyze the data?
Convert all text-based feedback into numerical ratings using a predefined scale.
Use only the numerical ratings, as they are easier to analyze.
Ignore the differences and treat both datasets as text.
Combine the datasets without modification and run a statistical analysis.
A researcher is collecting data from two sources. One source stores students’ grades as percentages (e.g., 85, 92) and the other stores grades as letter grades (e.g., B, A).
What should the researcher do to combine the data effectively?
Use only the percentage grades, since they are numerical.
Convert all letter grades to percentages or vice versa.
Discard the data because it is stored in different formats.
Combine the data as is and analyze the differences.
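Bringing two sources into one common format before combining them might look like the Python sketch below. The letter-to-percentage scale is an assumed mapping chosen for illustration; a real scale would come from the school's grading policy. The names LETTER_TO_PERCENT, to_percent, and combine_grades are likewise hypothetical.

```python
# Assumed conversion scale (illustrative only).
LETTER_TO_PERCENT = {"A": 95, "B": 85, "C": 75, "D": 65, "F": 50}


def to_percent(grade):
    """Return a percentage, converting letter grades with the assumed scale."""
    if isinstance(grade, str):
        return LETTER_TO_PERCENT[grade.strip().upper()]
    return grade                 # already a percentage


def combine_grades(percent_grades, letter_grades):
    """Merge both sources into a single list of percentages."""
    return [to_percent(g) for g in percent_grades + letter_grades]


all_grades = combine_grades([85, 92], ["B", "A"])
```

Converting a letter to a single number loses precision, since a "B" covers a range of percentages, so any analysis of the combined list should keep that approximation in mind.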
A company is training an AI system to recognize human faces. The training data consists primarily of images of adults from one geographic region. After deployment, users report that the system struggles to recognize faces of children and people from other regions.
What is the most likely cause of this issue?
The AI system was not programmed correctly.
The dataset used for training was biased and not representative of all groups.
The system requires more processing power to recognize diverse faces.
The users did not provide enough feedback during testing.
A language translation app often provides inaccurate translations for dialects and less commonly spoken languages. The app performs well for widely spoken languages like English and Spanish.
What is the most likely reason for this disparity?
The app is not updated frequently enough.
The app's algorithm prioritizes commonly spoken languages.
The training data did not include sufficient examples of less common languages.
The app's interface is not user-friendly for non-English speakers.
A political organization is conducting a survey to measure public support for a policy. The survey is distributed only to people who are known to support the organization’s views. The results show overwhelming support for the policy.

What is the issue with this approach to data collection?
The data is insufficient to analyze trends.
The data collection introduces intentional bias by targeting specific groups.
The sample size is too small to be meaningful.
The organization used the wrong type of data analysis.
A school distributes a survey about cafeteria food satisfaction, asking students to include their names. Many students who dislike the food provide positive responses.

What is the main issue with this survey?
The survey targets the wrong group of students.
The survey’s lack of anonymity may lead to unintentional response bias.
The questions are too difficult for students to understand.
The survey data is too small to draw conclusions.
Intentional bias occurs when the data collection process is deliberately skewed to favor a particular outcome.
True
False
Metadata can sometimes provide more insight than the content itself. For example, metadata from online content (such as articles or Twitter posts) often includes:
  • Author's name
  • Publish date
  • Word count
  • Keywords
What type of analysis could be performed exclusively using this metadata?
Sentiment analysis of the article’s content.
Identifying trending topics based on keywords and publish dates.
Determining the writing style of the author.
Summarizing the article’s main arguments.
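A metadata-only analysis never needs to read the article text. The Python sketch below counts keywords among items published in a given date window. The sample records, field names, and the function trending_keywords are hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical metadata records; the article bodies are never used.
articles = [
    {"author": "A. Rivera", "publish_date": date(2024, 5, 1), "keywords": ["ai", "privacy"]},
    {"author": "B. Chen", "publish_date": date(2024, 5, 2), "keywords": ["ai", "bias"]},
    {"author": "C. Osei", "publish_date": date(2024, 3, 1), "keywords": ["sports"]},
]


def trending_keywords(articles, start, end, top_n=3):
    """Count keywords of articles published between start and end, inclusive."""
    counts = Counter()
    for article in articles:
        if start <= article["publish_date"] <= end:
            counts.update(article["keywords"])
    return counts.most_common(top_n)


top = trending_keywords(articles, date(2024, 5, 1), date(2024, 5, 31), top_n=1)
```

With this sample data, "ai" appears in both May articles, so it is the top keyword for that window.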
A photographer uploads an image to a social media platform. The platform automatically uses the image's metadata to tag the location where it was taken. The metadata includes the GPS coordinates of the photo.
What is a potential risk associated with this use of metadata?
The image quality may degrade when metadata is read.
The photographer might lose copyright ownership of the image.
The photographer's location can be revealed without their consent.
The metadata may distort the appearance of the image.
A school’s website was designed to handle 500 simultaneous users. On the first day of online class registration, 5,000 users attempt to access the site at the same time, causing it to crash.
Which of the following best describes the scalability issue?
The website’s database is corrupt.
The website is not designed to scale up for higher traffic.
The website lacks proper encryption for user data.
The website is using too much bandwidth.