Weekend Sale Special Limited Time 65% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: bigdisc65

CCA175 Exam Dumps - Cloudera Certified Associate CCA Questions and Answers

Page: 1 / 3
Questions 4

Problem Scenario 78 : You have been given MySQL DB with following details.

user=retail_dba

password=cloudera

database=retail_db

table=retail_db.orders

table=retail_db.order_items

jdbc URL = jdbc:mysql://quickstart:3306/retail_db

Columns of order table : (orderid , order_date , order_customer_id, order_status)

Columns of ordeMtems table : (order_item_td , order_item_order_id , order_item_product_id, order_item_quantity,order_item_subtotal,order_item_product_price)

Please accomplish following activities.

1. Copy "retail_db.orders" and "retail_db.order_items" table to hdfs in respective directory p92_orders and p92_order_items .

2. Join these data using order_id in Spark and Python

3. Calculate total revenue perday and per customer

4. Calculate maximum revenue customer

Options:

Buy Now
Questions 5

Problem Scenario 21 : You have been given log generating service as below.

startjogs (It will generate continuous logs)

tailjogs (You can check , what logs are being generated)

stopjogs (It will stop the log service)

Path where logs are generated using above service : /opt/gen_logs/logs/access.log

Now write a flume configuration file named flumel.conf , using that configuration file dumps logs in HDFS file system in a directory called flumel. Flume channel should have following property as well. After every 100 message it should be committed, use non-durable/faster channel and it should be able to hold maximum 1000 events

Solution :

Step 1 : Create flume configuration file, with below configuration for source, sink and channel.

#Define source , sink , channel and agent,

agent1 .sources = source1

agent1 .sinks = sink1

agent1.channels = channel1

# Describe/configure source1

agent1 .sources.source1.type = exec

agent1.sources.source1.command = tail -F /opt/gen logs/logs/access.log

## Describe sinkl

agentl .sinks.sinkl.channel = memory-channel

agentl .sinks.sinkl .type = hdfs

agentl .sinks.sink1.hdfs.path = flumel

agentl .sinks.sinkl.hdfs.fileType = Data Stream

# Now we need to define channell property.

agent1.channels.channel1.type = memory

agent1.channels.channell.capacity = 1000

agent1.channels.channell.transactionCapacity = 100

# Bind the source and sink to the channel

agent1.sources.source1.channels = channel1

agent1.sinks.sink1.channel = channel1

Step 2 : Run below command which will use this configuration file and append data in hdfs.

Start log service using : startjogs

Start flume service:

flume-ng agent -conf /home/cloudera/flumeconf -conf-file /home/cloudera/flumeconf/flumel.conf-Dflume.root.logger=DEBUG,INFO,console

Wait for few mins and than stop log service.

Stop_logs

Options:

Buy Now
Questions 6

Problem Scenario 20 : You have been given MySQL DB with following details.

user=retail_dba

password=cloudera

database=retail_db

table=retail_db.categories

jdbc URL = jdbc:mysql://quickstart:3306/retail_db

Please accomplish following activities.

1. Write a Sqoop Job which will import "retaildb.categories" table to hdfs, in a directory name "categories_targetJob".

Options:

Buy Now
Questions 7

Problem Scenario 33 : You have given a files as below.

spark5/EmployeeName.csv (id,name)

spark5/EmployeeSalary.csv (id,salary)

Data is given below:

EmployeeName.csv

E01,Lokesh

E02,Bhupesh

E03,Amit

E04,Ratan

E05,Dinesh

E06,Pavan

E07,Tejas

E08,Sheela

E09,Kumar

E10,Venkat

EmployeeSalary.csv

E01,50000

E02,50000

E03,45000

E04,45000

E05,50000

E06,45000

E07,50000

E08,10000

E09,10000

E10,10000

Now write a Spark code in scala which will load these two tiles from hdfs and join the same, and produce the (name.salary) values.

And save the data in multiple tile group by salary (Means each file will have name of employees with same salary). Make sure file name include salary as well.

Options:

Buy Now
Page: 1 / 3
Exam Code: CCA175
Exam Name: CCA Spark and Hadoop Developer Exam
Last Update: May 14, 2024
Questions: 96
CCA175 pdf

CCA175 PDF

$28  $80
CCA175 Engine

CCA175 Testing Engine

$33.25  $95
CCA175 PDF + Engine

CCA175 PDF + Testing Engine

$45.5  $130