Selected MLS-C01 AWS Certified Specialty Questions and Answers

Page: 2 / 19
Question 8

A Data Scientist is developing a binary classifier to predict whether a patient has a particular disease based on a series of test results. The Data Scientist has data on 400 patients randomly selected from the population. The disease is seen in 3% of the population.

Which cross-validation strategy should the Data Scientist adopt?

Options:

A.

A k-fold cross-validation strategy with k=5

B.

A stratified k-fold cross-validation strategy with k=5

C.

A k-fold cross-validation strategy with k=5 and 3 repeats

D.

An 80/20 stratified split between training and validation
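
To make the numbers in Question 8 concrete, the sketch here counts how many positive cases land in each fold under a plain shuffle-and-cut split versus a split that deals each class out evenly. It assumes the sample contains exactly 12 positive patients (3% of 400), which is only the expected value for a random sample; the helper names `plain_kfold`, `stratified_kfold`, and `positives_per_fold` are hypothetical and written for illustration, not taken from any library.

```python
import random

def plain_kfold(labels, k, seed=0):
    """Shuffle indices and cut them into k folds, ignoring the labels."""
    idx = list(range(len(labels)))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def stratified_kfold(labels, k, seed=0):
    """Deal each class's indices round-robin so every fold keeps the class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    offset = 0  # continue dealing where the previous class stopped
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[(offset + j) % k].append(i)
        offset += len(members)
    return folds

def positives_per_fold(labels, folds):
    """Count the positive (label 1) examples in each fold."""
    return [sum(labels[i] for i in fold) for fold in folds]

# Assumption: exactly 12 of the 400 sampled patients have the disease.
labels = [1] * 12 + [0] * 388

plain = positives_per_fold(labels, plain_kfold(labels, 5))
strat = positives_per_fold(labels, stratified_kfold(labels, 5))

print("plain k-fold positives per fold:     ", plain)
print("stratified k-fold positives per fold:", strat)
```

With only about a dozen positives, the plain split may leave some folds with very few of them, depending on the shuffle, while the stratified split keeps every fold within one positive of the others.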

Question 9

A financial services company is building a robust serverless data lake on Amazon S3. The data lake should be flexible and meet the following requirements:

* Support querying old and new data on Amazon S3 through Amazon Athena and Amazon Redshift Spectrum.

* Support event-driven ETL pipelines.

* Provide a quick and easy way to understand metadata.

Which approach meets these requirements?

Options:

A.

Use an AWS Glue crawler to crawl S3 data, an AWS Lambda function to trigger an AWS Glue ETL job, and an AWS Glue Data Catalog to search and discover metadata.

B.

Use an AWS Glue crawler to crawl S3 data, an AWS Lambda function to trigger an AWS Batch job, and an external Apache Hive metastore to search and discover metadata.

C.

Use an AWS Glue crawler to crawl S3 data, an Amazon CloudWatch alarm to trigger an AWS Batch job, and an AWS Glue Data Catalog to search and discover metadata.

D.

Use an AWS Glue crawler to crawl S3 data, an Amazon CloudWatch alarm to trigger an AWS Glue ETL job, and an external Apache Hive metastore to search and discover metadata.

Question 10

A company that manufactures mobile devices wants to determine and calibrate the appropriate sales price for its devices. The company is collecting the relevant data and is determining data features that it can use to train machine learning (ML) models. There are more than 1,000 features, and the company wants to determine the primary features that contribute to the sales price.

Which techniques should the company use for feature selection? (Choose three.)

Options:

A.

Data scaling with standardization and normalization

B.

Correlation plot with heat maps

C.

Data binning

D.

Univariate selection

E.

Feature importance with a tree-based classifier

F.

Data augmentation
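
As a rough illustration of what univariate selection (one of the options in Question 10) does, the sketch here scores each feature on its own by the absolute Pearson correlation with the sales price and keeps the top scorers. Everything in it is an assumption made for the example: the data is synthetic, the feature names (`storage_gb`, `camera_mp`, `noise_a`, `noise_b`) are invented, and the helpers `pearson` and `top_k_features` are hypothetical rather than part of any library.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def top_k_features(columns, target, k):
    """Rank features one at a time by |correlation| with the target; keep the best k."""
    scores = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Synthetic data (assumption): price depends on two features, not on the noise columns.
rng = random.Random(0)
n = 200
columns = {
    "storage_gb": [rng.uniform(32, 512) for _ in range(n)],
    "camera_mp": [rng.uniform(8, 108) for _ in range(n)],
    "noise_a": [rng.gauss(0, 1) for _ in range(n)],
    "noise_b": [rng.gauss(0, 1) for _ in range(n)],
}
price = [
    1.0 * s + 2.0 * c + rng.gauss(0, 10)
    for s, c in zip(columns["storage_gb"], columns["camera_mp"])
]

selected = top_k_features(columns, price, 2)
print("selected features:", selected)
```

The same per-feature correlations are what a correlation heat map displays visually. Scoring features independently scales to 1,000 or more columns, but it cannot detect features that matter only in combination, which is where model-based importance scores can help.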

Question 11

A data scientist has developed a machine learning translation model for English to Japanese by using Amazon SageMaker's built-in seq2seq algorithm with 500,000 aligned sentence pairs. While testing with sample sentences, the data scientist finds that the translation quality is reasonable for an example as short as five words. However, the quality becomes unacceptable if the sentence is 100 words long.

Which action will resolve the problem?

Options:

A.

Change preprocessing to use n-grams.

B.

Add more nodes to the recurrent neural network (RNN) than the largest sentence's word count.

C.

Adjust hyperparameters related to the attention mechanism.

D.

Choose a different weight initialization type.

Exam Code: MLS-C01
Exam Name: AWS Certified Machine Learning - Specialty
Last Update: Mar 28, 2024
Questions: 281