A data set has the following values:
Which of the following is the best reason for cleansing the data?
Each month an analyst needs to execute a data pull for the two prior months. Which of the following is the most efficient function for the analyst to use?
A database consists of one fact table that is composed of multiple dimensions. Depending on the dimension, each one can be represented by a denormalized table or multiple normalized tables. This structure is an example of a:
A data analyst is creating a report that will provide information about various regions, products, and time periods. Which of the following formats would be the most efficient way to deliver this report?
Given the following table:
Which of the following methods is the best way to describe the changes in the values in the table?
An analyst is reporting on the average income for a county and is reviewing the following data:
Which of the following is the reason the analyst would need to cleanse the data in this data set?
Given the table below:
Which of the following variable types BEST describes the “Year” column?
Which of the following would be considered non-personally identifiable information?
Which of the following data manipulation techniques should an analyst use to hide unnecessary data during analysis?
An analyst has conducted a review of business questions. Which of the following should the analyst do next to conduct an analysis?
Which of the following are reasons to create and maintain a data dictionary? (Choose two.)
An organization would like to add a secondary email field to its customer database in order to enrich the customer profiles. Which of the following data manipulation techniques should the analyst use to add this information?
Given the table below:
Which of the following boxes indicates that a Type II error has occurred?
A data analyst has been asked to derive a new variable labeled “Promotion_flag” based on the total quantity sold by each salesperson. Given the table below:
Which of the following functions would the analyst consider appropriate to flag “Yes” for every salesperson who has a number above 1,000,000 in the Quantity_sold column?
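A conditional (IF-style) rule of the kind described can be sketched in Python. The rows below are hypothetical, because the original table is not reproduced here; only the column names Quantity_sold and Promotion_flag and the 1,000,000 threshold come from the question.

```python
# HYPOTHETICAL rows: the original table is not available.
sales = [
    {"Salesperson": "A", "Quantity_sold": 1_250_000},
    {"Salesperson": "B", "Quantity_sold": 1_000_000},  # equal to, not above, the threshold
    {"Salesperson": "C", "Quantity_sold": 430_000},
]

THRESHOLD = 1_000_000  # taken from the question text


def add_promotion_flag(rows, threshold=THRESHOLD):
    """Return copies of the rows with Promotion_flag set to "Yes" or "No"."""
    return [
        {**row, "Promotion_flag": "Yes" if row["Quantity_sold"] > threshold else "No"}
        for row in rows
    ]


flagged = add_promotion_flag(sales)
```

The comparison is strictly greater than, so a salesperson with exactly 1,000,000 units is flagged "No".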
Which of the following is the BEST reason to use database views instead of tables?
Which of the following file formats is best suited to start exploratory analysis within statistical software?
Consider this dataset showing the retirement age of 11 people, in whole years:
54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60
This table shows a simple frequency distribution of the retirement age data.
Which of the following techniques should an analyst use to analyze a data set to get a snapshot of basic measures of central tendency?
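As an illustration, the frequency distribution and the basic measures of central tendency for the 11 retirement ages listed above can be computed with Python's standard library. This is a minimal sketch; the variable names are chosen here for illustration.

```python
from collections import Counter
import statistics

# Retirement ages from the question text.
ages = [54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60]

# Frequency distribution: how many people retired at each age.
frequency = Counter(ages)

# Basic measures of central tendency.
mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
mode_age = statistics.mode(ages)

for age in sorted(frequency):
    print(age, frequency[age])
print(mean_age, median_age, mode_age)
```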
An analyst wants to extract data from a variety of sources and store the data in a cloud-based environment prior to cleaning. Which of the following integration techniques should the analyst use?
A data analyst needs to collect a similar proportion of data from every state. Which of the following sampling methods would be the most appropriate?
An employer needs to maintain adequate office staffing during the winter and wants to track storm data. Which of the following data collection methods should the employer use?
A data scientist wants to see which products make the most money and which products attract the most customer purchasing interest in their company.
Which of the following data manipulation techniques would he use to obtain this information?
An analyst has written the following code:
SELECT *
FROM Cust_table
WHERE age > 60 AND City = "New York"
Which of the following criteria is the analyst retrieving?
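To see how the two conditions combine, the query can be run against a small in-memory SQLite table. The sketch below assumes a hypothetical schema and hypothetical rows for Cust_table, and it writes the string literal with single quotes, which is the standard SQL form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# HYPOTHETICAL schema and rows; only the table name, age, and City come from the question.
conn.execute("CREATE TABLE Cust_table (name TEXT, age INTEGER, City TEXT)")
conn.executemany(
    "INSERT INTO Cust_table VALUES (?, ?, ?)",
    [
        ("Ann", 72, "New York"),
        ("Bob", 60, "New York"),  # excluded: age is not greater than 60
        ("Cy", 65, "Boston"),     # excluded: wrong city
        ("Dee", 61, "New York"),
    ],
)

# AND keeps only rows where both conditions are true.
rows = conn.execute(
    "SELECT * FROM Cust_table WHERE age > 60 AND City = 'New York'"
).fetchall()
conn.close()

matched_names = sorted(row[0] for row in rows)
```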
Which of the following descriptive statistical methods are measures of central tendency? (Choose two.)
During data profiling, an analyst decides to recode the status column in the following data set:
Which of the following data concerns explains why the analyst wants to take this action?
An analyst needs to provide a chart to identify the composition between the categories of the survey response data set:
Which of the following charts would be BEST to use?
An analyst is designing a dashboard to determine which site has the highest percentage of new customers. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:
Which of the following types of charts should be considered to BEST display the data?
A recurring event is being stored in two databases that are housed in different geographical locations. A data analyst notices the event is being logged three hours earlier in one database than in the other database. Which of the following is the MOST likely cause of the issue?
A database administrator needs to increase performance on a large dimension table. Which of the following is the best way to accomplish this task?
Which one of the following would not normally be considered a summary statistic?
A development company is constructing a new unit in its apartment complex. The complex has the following floor plans:
Using the average cost per square foot of the original floor plans, which of the following should be the price of the Rose unit?
Randy scored 76 on a math test, Katie scored 86 on a science test, Ralph scored 80 on a history test, and Jean scored 80 on an English test. The table below contains the mean and standard deviation of the scores for each of the courses:
Using this information, which of the following students had the BEST score?
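Scores from different tests are usually compared by converting each one to a z-score, (score - mean) / standard deviation. The sketch below uses the four scores from the question, but the means and standard deviations are HYPOTHETICAL placeholders because the original table is not reproduced here, so the result does not indicate the answer to the question.

```python
def z_score(score, mean, std_dev):
    """Number of standard deviations a score lies above its course mean."""
    return (score - mean) / std_dev


# Scores come from the question text.
scores = {"Randy": 76, "Katie": 86, "Ralph": 80, "Jean": 80}

# HYPOTHETICAL (mean, standard deviation) for each student's course.
# Replace these with the values from the original table.
course_stats = {
    "Randy": (70, 4),
    "Katie": (80, 6),
    "Ralph": (75, 10),
    "Jean": (82, 4),
}

z_scores = {name: z_score(scores[name], *course_stats[name]) for name in scores}

# The best relative score is the one with the highest z-score.
best_student = max(z_scores, key=z_scores.get)
```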
What analytics suite is offered by Microsoft and directly integrates with SQL Server Databases?
A table in a hospital database has a column for patient height in inches and a column for patient height in centimeters. This is an example of:
Given the following data table:
Which of the following are appropriate reasons to undertake data cleansing? (Select two).
Which of the following statements would be used to append two tables that have the same number of columns?
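One common way to append tables with matching columns is a SQL set operation. The sketch below uses an in-memory SQLite database with two hypothetical tables (sales_q1 and sales_q2 are names invented for illustration) to contrast UNION ALL, which keeps every row, with UNION, which removes duplicate rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# HYPOTHETICAL tables with the same two columns.
conn.executescript(
    """
    CREATE TABLE sales_q1 (id INTEGER, amount REAL);
    CREATE TABLE sales_q2 (id INTEGER, amount REAL);
    INSERT INTO sales_q1 VALUES (1, 10.0), (2, 20.0);
    INSERT INTO sales_q2 VALUES (2, 20.0), (3, 30.0);
    """
)

# UNION ALL stacks the rows of both tables, duplicates included.
union_all_rows = conn.execute(
    "SELECT id, amount FROM sales_q1 UNION ALL SELECT id, amount FROM sales_q2"
).fetchall()

# UNION also stacks the rows but drops exact duplicates.
union_rows = conn.execute(
    "SELECT id, amount FROM sales_q1 UNION SELECT id, amount FROM sales_q2"
).fetchall()
conn.close()
```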
An analyst is working with a data set that lists individuals' first and last names in separate columns. Which of the following processes should the analyst use to combine the first and last names into a single spreadsheet cell?
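Joining two text columns into one is a concatenation. A minimal Python sketch, with hypothetical names standing in for the data set:

```python
# HYPOTHETICAL rows: (first name, last name).
rows = [("Ada", "Lovelace"), ("Alan", "Turing")]

# Concatenate the two columns with a single space between them.
full_names = [f"{first} {last}" for first, last in rows]
```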
A data analyst needs to perform a full outer join of a customer's orders using the tables below:
Which of the following is the mean of the order quantity?
An analyst is building a monthly report for production and wants to ensure the audience is aware of its once-a-month cadence. Which of the following is the MOST important to convey that information?
Given the following table:
Which of the following describes the data quality issues with the age data?
Jhon is working on an ELT process that sources data from six different source systems.
Looking at the source data, he finds that data about the same people exists in two of the six systems.
What does he have to make sure he checks for in his ELT process?
Choose the best answer.
An analyst needs to conduct a quick analysis. Which of the following is the FIRST step the analyst should perform with the data?
A database consists of one fact table that is composed of multiple dimensions. Each dimension is represented by a denormalized table. This structure is an example of a:
An analyst is creating a resource to improve users' experience when they select specific records based on particular dates. Which of the following should the analyst use to create a resource that best meets user needs?
An analyst is designing a dashboard that will provide a story of the sales and sales customer ratio. The following data is available:
Which of the following charts should the analyst consider including in the dashboard?
An analyst is compiling a series of reports for the new executive board to review. Which of the following elements provides a snapshot of what is contained in the reports for the executives who do not have time to focus on the details?
A county in Illinois is conducting a survey to determine the mean annual income per household. The county is 427 sq mi (about 1,106 sq km). Which of the following sampling methods would MOST likely result in a representative sample?
Given the following report:
Which of the following components need to be added to ensure the report is point-in-time and static? (Select two).
A database administrator needs to ensure only approved users can access specific database tables to perform financial functions. Which of the following is the best access control method for the administrator to use?
A sales manager wants quarterly sales reports broken down by unit and week. Which of the following data output lists includes the most necessary information?
The current date is July 14, 2020. A data analyst has been asked to create a report that shows the company's year-over-year Q2 2020 sales. Which of the following reports should the analyst compare?
A business intelligence team wants to create a new dashboard in order to solve a problem statement. Which of the following is the correct order of steps the team should take?
Which of the following best describes a business analytics tool with interactive visualization and business capabilities and an interface that is simple enough for end users to create their own reports and dashboards?
An analyst is working on a project for a director. During this process, the analyst pulled the data, created summarized tables and graphs with descriptions, created a report summary, and inserted all items into a report. After writing the report, which of the following would be the most appropriate next step?
Which of the following report types is most appropriate for a high-level, year-end report requested by a Chief Executive Officer?
A publishing group has requested a dashboard to track submissions before publication. A key requirement is that all changes are tracked, as multiple users will be checking out documents and editing them before submissions are considered final. Which of the following is the BEST way to meet this stakeholder requirement?
A site reliability team wants to monitor the stability of their website so they can proactively diagnose issues when they occur. Which of the following deliverables would best suit their needs?
An analyst wants to create a historical data set for the past five years with each year in its own data set. Which of the following methods is the best way to create this historical data set?
A data analyst needs to create a data visualization that aids in understanding the cumulative impact of sequentially introduced values that are positive or negative. Which of the following data visualization methods should the analyst use?
An analyst reviews the following data:
7, 3, 5, 2, 3, 7, 7, 10
Which of the following is the value of the mode?
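The mode is the value that occurs most often. A short check with Python's standard library, using the eight values listed above:

```python
import statistics
from collections import Counter

# Values from the question text.
values = [7, 3, 5, 2, 3, 7, 7, 10]

counts = Counter(values)               # occurrences of each value
mode_value = statistics.mode(values)   # most frequent value
```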
A customer survey reveals 90% positive feedback. Which of the following statistical methods would be best to utilize to determine the reliability of a data set and predict how a larger sample of customers over the same time period might respond?
Encryption is a mechanism for protecting data.
When should encryption be applied to data?
Choose the best answer.
A financial institution is reporting on sales performance to a company at the account level. Due to the sensitive nature of the government the company deals with, some account information is not shown. Which of the following fields should be masked?
A marketing analytics team received customer transaction data from two different sources. The data is complete and accurate; however, the field names appear to be inconsistent. Given the following tables:
Which of the following is considered best practice if the team wants to consolidate the files and conduct further analysis?
A user imports a data file into the accounts payable system each day. On a regular basis, the field input is not what the system is expecting, so it results in an error for the row and a broken import process. To resolve the issue, the user opens the file, finds the error in the row, and manually corrects it before attempting the import again. The import sometimes breaks on subsequent attempts, though. Which of the following changes should be made to this process to reduce the number of errors?
An analyst has been asked to validate data quality. Which of the following are the BEST reasons to validate data for quality control purposes? (Choose two.)
An analyst conducted a preliminary analysis for a data set and identified several patterns and anomalies. Which of the following analysis techniques did the analyst use?
Taylor wants to investigate how manufacturing, marketing, and sales expenditures impact overall profitability for her company.
Which of the following systems is the most appropriate?
Which of the following values is the range (a measure of dispersion) of the scores of ten students in a test?
The scores of ten students in a test are 17, 23, 30, 36, 45, 51, 58, 66, 72, 77.
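The range is the difference between the largest and smallest values. A minimal Python sketch using the ten scores above:

```python
# Scores from the question text.
scores = [17, 23, 30, 36, 45, 51, 58, 66, 72, 77]

# Range = maximum - minimum.
score_range = max(scores) - min(scores)
```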
A salesperson who is prospecting potential clients collected the following data:
Which of the following is an issue with this data?
A data analyst received a large amount of third-party data that needs to be joined with in-house data files. After the data is joined, the analyst notices three columns all contain dates. Which of the following should the analyst do to maintain data consistency?
A data analyst has been asked to create an ad-hoc sales report for the Chief Executive Officer (CEO).
Which of the following should be included in the report?
A data analyst is asked to create a sales report for the second-quarter 2020 board meeting, which will include a review of the business’s performance through the second quarter. The board meeting will be held on July 15, 2020, after the numbers are finalized. Which of the following report types should the data analyst create?
An analyst is updating a customer contacts database with information obtained from a survey of new customers. Which of the following data manipulation techniques should the analyst use?
You are working with a dataset and need to swap the values in rows with those in columns.
What action do you need to perform?
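Swapping rows and columns is a transpose. The sketch below shows one way to do it in plain Python with zip; the small table is hypothetical.

```python
# HYPOTHETICAL table: each inner list is one row.
table = [
    ["Region", "Q1", "Q2"],
    ["North", 10, 12],
    ["South", 8, 9],
]

# zip(*table) groups the i-th item of every row, turning columns into rows.
transposed = [list(column) for column in zip(*table)]
```

Transposing the result again returns the original table.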
Given the following data sample:
Which of the following best describes the data quality issue?
A data analyst received the information in the table below from a recently completed marketing campaign:
Which of the following is the total order conversion rate?
An analyst needs to join two data sets that compare vehicle weights. One data set is in pounds, and the other has various units of measure. Which of the following should the analyst do first to the data prior to any type of join?
A data analyst is designing a dashboard that will provide a story of sales and determine which site is providing the highest sales volume per customer. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:
Which of the following types of charts should be considered?
A company wants to know how its customers interact with an e-commerce website based on clicks over items. Which of the following is the primary requirement for this report?
A data analyst is performing a data merge within a spreadsheet using the tables below:
The analyst is attempting to pull the addresses from Table 2 into Table 1 using the last names and is receiving an error message. Which of the following steps can the analyst perform to fix the error?
Consider two different datasets, one with gas prices and the other with food prices. Which of the following measures is most affected by outliers?
When analyzing the values of two variables, you decide to convert both variables so they are on a scale of 0 to 1.
What term describes this action?
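Rescaling values to the interval from 0 to 1 is commonly done with min-max scaling, (x - min) / (max - min). A minimal sketch with hypothetical inputs; the function name is chosen here for illustration.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # All values are identical; map them to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# HYPOTHETICAL variables measured on very different scales.
income = [20_000, 50_000, 80_000]
age = [25, 40, 55]

scaled_income = min_max_scale(income)
scaled_age = min_max_scale(age)
```

After scaling, both variables share the same 0-to-1 range and can be compared directly.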
A data analyst needs to create a master file that includes customer information from the tables below:
Given the three tables above, the analyst wants to filter down the information prior to joining it together. In which of the following orders should this data manipulation be approached for the most efficient result?
A sales director has requested that a report on individual team members within the division be developed. The director would like the report to be shared with all team members, but individual team members should not be identifiable within the report. Which of the following access requirements would support the director's needs?
Which of the following tools would be best to use to calculate the interquartile range, median, mean, and standard deviation of a column in a table that has 5,000,000 rows?
Angela is aggregating data from a CRM system with data from an employee system.
While performing an initial quality check, she realizes that her employee ID is not associated with her identifier in the CRM system.
What kind of issues is Angela facing?
Choose the best answer.
Which of the following data types must be used when working with variables that require classification into two or more groups before analysis?
A data analyst for a media company needs to determine the most popular movie genre. Given the table below:
Which of the following must be done to the Genre column before this task can be completed?
Five dogs have the following heights in millimeters:
300, 430, 170, 470, 600
Which of the following is the standard deviation for the five dogs?
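The calculation can be checked with Python's standard library. The question does not say whether the five dogs are a whole population or a sample, so this sketch computes both forms: the population standard deviation divides the summed squared deviations by n, and the sample form divides by n - 1.

```python
import statistics

# Heights in millimeters, from the question text.
heights_mm = [300, 430, 170, 470, 600]

mean_height = statistics.mean(heights_mm)

# Population standard deviation: sqrt(sum((x - mean)^2) / n).
population_sd = statistics.pstdev(heights_mm)

# Sample standard deviation: sqrt(sum((x - mean)^2) / (n - 1)).
sample_sd = statistics.stdev(heights_mm)

print(mean_height, round(population_sd, 2), round(sample_sd, 2))
```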