
DAS-C01 Exam Questions

Question 28

A company ingests a large set of sensor data in nested JSON format from different sources and stores it in an Amazon S3 bucket. The sensor data must be joined with performance data currently stored in an Amazon Redshift cluster.

A business analyst with basic SQL skills must build dashboards and analyze this data in Amazon QuickSight. A data engineer needs to build a solution to prepare the data for use by the business analyst. The data engineer does not know the structure of the JSON file. The company requires a solution with the least possible implementation effort.

Which combination of steps will create a solution that meets these requirements? (Select THREE.)

Options:

A. Use an AWS Glue ETL job to convert the data into Apache Parquet format and write to Amazon S3.

B. Use an AWS Glue crawler to catalog the data.

C. Use an AWS Glue ETL job with the ApplyMapping class to un-nest the data and write to Amazon Redshift tables.

D. Use an AWS Glue ETL job with the Relationalize class to un-nest the data and write to Amazon Redshift tables.

E. Use QuickSight to create an Amazon Athena data source to read the Apache Parquet files in Amazon S3.

F. Use QuickSight to create an Amazon Redshift data source to read the native Amazon Redshift tables.
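For reference, here is a minimal PySpark sketch of the un-nesting technique option D names, assuming the JSON has already been crawled into the Glue Data Catalog. The database, table, connection, and S3 path names are placeholders, not values from the question.

```python
from awsglue.context import GlueContext
from awsglue.transforms import Relationalize
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

# Read the sensor data as cataloged by the Glue crawler
# ("sensor_db" and "raw_sensor_json" are assumed names).
sensors = glue_context.create_dynamic_frame.from_catalog(
    database="sensor_db", table_name="raw_sensor_json")

# Relationalize walks the nested JSON and returns a collection of flat
# DynamicFrames, one per nested array, keyed by path from the root.
flattened = Relationalize.apply(
    frame=sensors,
    staging_path="s3://example-bucket/glue-staging/",  # scratch space
    name="root")

# Write the top-level flattened frame to Redshift through a Glue
# connection ("redshift-conn" and the table/database are assumptions).
glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=flattened.select("root"),
    catalog_connection="redshift-conn",
    connection_options={"dbtable": "sensor_flat", "database": "analytics"},
    redshift_tmp_dir="s3://example-bucket/redshift-temp/")
```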

Question 29

A marketing company has an application that stores event data in an Amazon RDS database. The company is replicating this data to Amazon Redshift for reporting and business intelligence (BI) purposes. New event data is continuously generated and ingested into the RDS database throughout the day and captured by a change data capture (CDC) replication task in AWS Database Migration Service (AWS DMS). The company requires that the new data be replicated to Amazon Redshift in near-real time.

Which solution meets these requirements?

Options:

A. Use Amazon Kinesis Data Streams as the destination of the CDC replication task in AWS DMS. Use an AWS Glue streaming job to read changed records from Kinesis Data Streams and perform an upsert into the Redshift cluster.

B. Use Amazon S3 as the destination of the CDC replication task in AWS DMS. Use the COPY command to load data into the Redshift cluster.

C. Use Amazon DynamoDB as the destination of the CDC replication task in AWS DMS. Use the COPY command to load data into the Redshift cluster.

D. Use Amazon Kinesis Data Firehose as the destination of the CDC replication task in AWS DMS. Use an AWS Glue streaming job to read changed records from Kinesis Data Firehose and perform an upsert into the Redshift cluster.
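For reference, a minimal boto3 sketch of the first half of the pattern option A describes: registering a Kinesis Data Streams target endpoint for the DMS CDC task. The endpoint identifier, stream ARN, and IAM role ARN below are placeholders.

```python
import boto3

dms = boto3.client("dms")

# Register a Kinesis Data Streams target endpoint for the CDC task
# (all identifiers and ARNs here are placeholders).
endpoint = dms.create_endpoint(
    EndpointIdentifier="cdc-to-kinesis",
    EndpointType="target",
    EngineName="kinesis",
    KinesisSettings={
        "StreamArn": "arn:aws:kinesis:us-east-1:123456789012:stream/cdc-events",
        "MessageFormat": "json",  # each change record arrives as a JSON message
        "ServiceAccessRoleArn": "arn:aws:iam::123456789012:role/dms-kinesis-role",
    },
)
print(endpoint["Endpoint"]["EndpointArn"])
```

A DMS replication task created with this endpoint as its target streams change records into Kinesis, where a Glue streaming job can consume them and upsert into Redshift.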

Question 30

A company developed a new elections reporting website that uses Amazon Kinesis Data Firehose to deliver full logs from AWS WAF to an Amazon S3 bucket. The company now seeks a low-cost way to perform infrequent analysis and visualization of these logs with minimal development effort.

Which solution meets these requirements?

Options:

A. Use an AWS Glue crawler to create and update a table in the AWS Glue Data Catalog from the logs. Use Amazon Athena to perform ad-hoc analyses and use Amazon QuickSight to develop data visualizations.

B. Create a second Kinesis Data Firehose delivery stream to deliver the log files to Amazon Elasticsearch Service (Amazon ES). Use Amazon ES to perform text-based searches of the logs for ad-hoc analyses and use Kibana for data visualizations.

C. Create an AWS Lambda function to convert the logs into .csv format. Then add the function to the Kinesis Data Firehose transformation configuration. Use Amazon Redshift to perform ad-hoc analyses of the logs using SQL queries and use Amazon QuickSight to develop data visualizations.

D. Create an Amazon EMR cluster and use Amazon S3 as the data source. Create an Apache Spark job to perform ad-hoc analyses and use Amazon QuickSight to develop data visualizations.
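For reference, a minimal boto3 sketch of the Athena side of option A, run after a Glue crawler has cataloged the WAF logs. The database, table, struct field names, and output bucket are assumptions based on the typical WAF log layout, not values from the question.

```python
import boto3

athena = boto3.client("athena")

# Ad-hoc query over the crawler-created WAF log table: top client IPs.
# "waf_db", "waf_logs", and the httprequest fields are assumed names.
response = athena.start_query_execution(
    QueryString="""
        SELECT httprequest.clientip AS client_ip, COUNT(*) AS requests
        FROM waf_logs
        GROUP BY httprequest.clientip
        ORDER BY requests DESC
        LIMIT 20
    """,
    QueryExecutionContext={"Database": "waf_db"},
    ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
)
print(response["QueryExecutionId"])
```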

Question 31

A company has collected more than 100 TB of log files in the last 24 months. The files are stored as raw text in a dedicated Amazon S3 bucket. Each object has a key of the form year-month-day_log_HHmmss.txt, where HHmmss represents the time the log file was initially created. A table was created in Amazon Athena that points to the S3 bucket. Ad hoc queries are run against a subset of columns in the table several times an hour.

A data analyst must make changes to reduce the cost of running these queries. Management wants a solution with minimal maintenance overhead.

Which combination of steps should the data analyst take to meet these requirements? (Choose three.)

Options:

A. Convert the log files to Apache Avro format.

B. Add a key prefix of the form date=year-month-day/ to the S3 objects to partition the data.

C. Convert the log files to Apache Parquet format.

D. Add a key prefix of the form year-month-day/ to the S3 objects to partition the data.

E. Drop and recreate the table with the PARTITIONED BY clause. Run the ALTER TABLE ADD PARTITION statement.

F. Drop and recreate the table with the PARTITIONED BY clause. Run the MSCK REPAIR TABLE statement.
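For reference, a minimal boto3 sketch combining the techniques in options C, B, and F: a Parquet-backed table declared with PARTITIONED BY over date=-prefixed keys, followed by MSCK REPAIR TABLE to register the partitions. The column names, database, and bucket are placeholders, since the log schema is not given.

```python
import boto3

athena = boto3.client("athena")

# Recreate the table partitioned on the date= key prefix; the two
# columns below are placeholders for the real log schema.
ddl = """
CREATE EXTERNAL TABLE logs_parquet (
    log_time timestamp,
    message  string
)
PARTITIONED BY (`date` string)
STORED AS PARQUET
LOCATION 's3://example-log-bucket/parquet/'
"""

# MSCK REPAIR TABLE scans the date=.../ prefixes and registers each one
# as a partition, avoiding per-partition ALTER TABLE maintenance.
for statement in (ddl, "MSCK REPAIR TABLE logs_parquet"):
    athena.start_query_execution(
        QueryString=statement,
        QueryExecutionContext={"Database": "logs_db"},
        ResultConfiguration={
            "OutputLocation": "s3://example-log-bucket/athena-results/"},
    )
```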
