HOTSPOT

You plan to develop a dataset named Purchases by using Azure Databricks. Purchases will contain the following columns:

• ProductID

• ItemPrice

• LineTotal

• Quantity

• StoreID

• Minute

• Month

• Hour

• Year

• Day

You need to store the data to support hourly incremental load pipelines that will vary for each StoreID. The solution must minimize storage costs.

How should you complete the code? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.


Suggested Answer:

Explanation:

Box 1: partitionBy

We should overwrite at the partition level.

Example:

df.write.partitionBy("y", "m", "d")
  .mode(SaveMode.Append)
  .parquet("/data/hive/warehouse/db_name.db/" + tableName)

Box 2: ("StoreID", "Year", "Month", "Day", "Hour")

Box 3: parquet("/Purchases")
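To see why the Box 2 column order matters: Spark's partitionBy writes each row under nested Hive-style key=value directories, in the order the columns are listed, so leading with StoreID lets each store's hourly load read or overwrite only its own subtree. A minimal stdlib-only Python sketch of the directory path Spark would derive (the partition_path helper is illustrative, not a Spark API):

```python
# Illustrative sketch of the Hive-style directory layout produced by
# Spark's partitionBy. The partition_path helper is a hypothetical
# stand-in for what Spark does internally; no Spark installation needed.
from posixpath import join

def partition_path(base, partition_cols, row):
    """Build the key=value directory path a row would be written under."""
    parts = [f"{col}={row[col]}" for col in partition_cols]
    return join(base, *parts)

cols = ["StoreID", "Year", "Month", "Day", "Hour"]
row = {"StoreID": 42, "Year": 2024, "Month": 6, "Day": 1, "Hour": 13,
       "ProductID": 7, "Quantity": 2, "ItemPrice": 9.99}

print(partition_path("/Purchases", cols, row))
# /Purchases/StoreID=42/Year=2024/Month=6/Day=1/Hour=13
```

Because StoreID is the outermost directory, an hourly pipeline for one store touches a single narrow path, and Parquet's columnar compression keeps storage costs low.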
