Databricks Associate-Developer-Apache-Spark Actual Free Exam Questions & Community Discussion
Which of the following are valid execution modes?
Correct Answer: E
Vote an answer
Explanation: Only visible for EduDump members. You can sign-up / login (it's free).
The code block shown below should return a DataFrame with columns transactionsId, predError, value, and f from DataFrame transactionsDf. Choose the answer that correctly fills the blanks in the code block to accomplish this.
transactionsDf.__1__(__2__)
transactionsDf.__1__(__2__)
Correct Answer: B
Vote an answer
Explanation: Only visible for EduDump members. You can sign-up / login (it's free).
Which of the following code blocks returns a new DataFrame in which column attributes of DataFrame itemsDf is renamed to feature0 and column supplier to feature1?
Correct Answer: D
Vote an answer
Explanation: Only visible for EduDump members. You can sign-up / login (it's free).
The code block displayed below contains an error. The code block should read the csv file located at path data/transactions.csv into DataFrame transactionsDf, using the first row as column header and casting the columns in the most appropriate type. Find the error.
First 3 rows of transactions.csv:
1.transactionId;storeId;productId;name
2.1;23;12;green grass
3.2;35;31;yellow sun
4.3;23;12;green grass
Code block:
transactionsDf = spark.read.load("data/transactions.csv", sep=";", format="csv", header=True)
First 3 rows of transactions.csv:
1.transactionId;storeId;productId;name
2.1;23;12;green grass
3.2;35;31;yellow sun
4.3;23;12;green grass
Code block:
transactionsDf = spark.read.load("data/transactions.csv", sep=";", format="csv", header=True)
Correct Answer: A
Vote an answer
Explanation: Only visible for EduDump members. You can sign-up / login (it's free).
The code block shown below should set the number of partitions that Spark uses when shuffling data for joins or aggregations to 100. Choose the answer that correctly fills the blanks in the code block to accomplish this.
spark.sql.shuffle.partitions
__1__.__2__.__3__(__4__, 100)
spark.sql.shuffle.partitions
__1__.__2__.__3__(__4__, 100)
Correct Answer: D
Vote an answer
Explanation: Only visible for EduDump members. You can sign-up / login (it's free).
The code block displayed below contains an error. The code block should produce a DataFrame with color as the only column and three rows with color values of red, blue, and green, respectively.
Find the error.
Code block:
1.spark.createDataFrame([("red",), ("blue",), ("green",)], "color")
Instead of calling spark.createDataFrame, just DataFrame should be called.
Find the error.
Code block:
1.spark.createDataFrame([("red",), ("blue",), ("green",)], "color")
Instead of calling spark.createDataFrame, just DataFrame should be called.
Correct Answer: C
Vote an answer
Explanation: Only visible for EduDump members. You can sign-up / login (it's free).
The code block displayed below contains an error. The code block should arrange the rows of DataFrame transactionsDf using information from two columns in an ordered fashion, arranging first by column value, showing smaller numbers at the top and greater numbers at the bottom, and then by column predError, for which all values should be arranged in the inverse way of the order of items in column value. Find the error.
Code block:
transactionsDf.orderBy('value', asc_nulls_first(col('predError')))
Code block:
transactionsDf.orderBy('value', asc_nulls_first(col('predError')))
Correct Answer: A
Vote an answer
Explanation: Only visible for EduDump members. You can sign-up / login (it's free).
Which of the following describes tasks?
Correct Answer: E
Vote an answer
Explanation: Only visible for EduDump members. You can sign-up / login (it's free).
Which of the following code blocks returns a DataFrame with approximately 1,000 rows from the 10,000-row DataFrame itemsDf, without any duplicates, returning the same rows even if the code block is run twice?
Correct Answer: E
Vote an answer
Explanation: Only visible for EduDump members. You can sign-up / login (it's free).
Which of the following code blocks returns a 2-column DataFrame that shows the distinct values in column productId and the number of rows with that productId in DataFrame transactionsDf?
Correct Answer: C
Vote an answer
Explanation: Only visible for EduDump members. You can sign-up / login (it's free).
Which of the following code blocks returns the number of unique values in column storeId of DataFrame transactionsDf?
Correct Answer: A
Vote an answer
Explanation: Only visible for EduDump members. You can sign-up / login (it's free).
0
0
0
10
