
PROFESSIONAL-DATA-ENGINEER Exam Questions & Answers

Exam Code: PROFESSIONAL-DATA-ENGINEER

Exam Name: Professional Data Engineer on Google Cloud Platform

Updated:

Q&As: 331

At Passcerty.com, we pride ourselves on the comprehensive nature of our PROFESSIONAL-DATA-ENGINEER exam dumps, designed meticulously to encompass all key topics and nuances you might encounter during the real examination. Regular updates are a cornerstone of our service, ensuring that our dedicated users always have their hands on the most recent and relevant Q&A dumps. Behind every meticulously curated question and answer lies the hard work of our seasoned team of experts, who bring years of experience and knowledge into crafting these premium materials. And while we are invested in offering top-notch content, we also believe in empowering our community. As a token of our commitment to your success, we're delighted to offer a substantial portion of our resources for free practice. We invite you to make the most of the following content, and wish you every success in your endeavors.


Download Free Google PROFESSIONAL-DATA-ENGINEER Demo

Experience Passcerty.com exam material in PDF version.
Simply submit your e-mail address below to get started with our PDF real exam demo of your Google PROFESSIONAL-DATA-ENGINEER exam.

Instant download
Latest update demo according to real exam


* Our demo shows only a few questions from your selected exam for evaluating purposes

Free Google PROFESSIONAL-DATA-ENGINEER Dumps

Practice These Free Questions and Answers to Pass the Google Certifications Exam

Question 1

Your software uses a simple JSON format for all messages. These messages are published to Google Cloud Pub/Sub, then processed with Google Cloud Dataflow to create a real-time dashboard for the CFO. During testing, you notice that some messages are missing in the dashboard. You check the logs, and all messages are being published to Cloud Pub/Sub successfully. What should you do next?

A. Check the dashboard application to see if it is not displaying correctly.

B. Run a fixed dataset through the Cloud Dataflow pipeline and analyze the output.

C. Use Google Stackdriver Monitoring on Cloud Pub/Sub to find the missing messages.

D. Switch Cloud Dataflow to pull messages from Cloud Pub/Sub instead of Cloud Pub/Sub pushing messages to Cloud Dataflow.

Show Answer
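
If you want to try the approach in option B locally, the sketch below swaps the Pub/Sub source for a fixed in-memory dataset so the pipeline's output can be inspected directly instead of through the dashboard. The message fields and the aggregation are illustrative assumptions, not details from the exam scenario.

```python
# A minimal sketch of running a fixed dataset through a Beam pipeline (option B).
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Fixed dataset standing in for the live Pub/Sub feed; field names are assumed.
FIXED_MESSAGES = [
    '{"region": "EMEA", "revenue": 120.5}',
    '{"region": "APAC", "revenue": 98.0}',
    '{"region": "EMEA", "revenue": 42.25}',
]

def parse_message(raw):
    """Parse one JSON message into a (region, revenue) pair."""
    record = json.loads(raw)
    return record["region"], record["revenue"]

with beam.Pipeline(options=PipelineOptions()) as p:
    (
        p
        | "FixedInput" >> beam.Create(FIXED_MESSAGES)   # replaces ReadFromPubSub
        | "Parse" >> beam.Map(parse_message)
        | "SumPerRegion" >> beam.CombinePerKey(sum)
        | "Print" >> beam.Map(print)                    # inspect output instead of the dashboard
    )
```

Because the input is known, any rows missing from the printed totals point to a bug in the transforms rather than in Pub/Sub or the dashboard.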
Question 2

You want to rebuild your batch pipeline for structured data on Google Cloud. You are using PySpark to conduct data transformations at scale, but your pipelines are taking over twelve hours to run. To expedite development and pipeline run time, you want to use a serverless tool and SQL syntax. You have already moved your raw data into Cloud Storage.

How should you build the pipeline on Google Cloud while meeting speed and processing requirements?

A. Convert your PySpark commands into SparkSQL queries to transform the data, and then run your pipeline on Dataproc to write the data into BigQuery.

B. Ingest your data into Cloud SQL, convert your PySpark commands into SparkSQL queries to transform the data, and then use federated queries from BigQuery for machine learning.

C. Ingest your data into BigQuery from Cloud Storage, convert your PySpark commands into BigQuery SQL queries to transform the data, and then write the transformations to a new table

D. Use Apache Beam Python SDK to build the transformation pipelines, and write the data into BigQuery

Show Answer
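
For reference, the sketch below shows the ingest-then-transform flow described in option C using the BigQuery Python client: load the raw files from Cloud Storage into a staging table, then express the former PySpark logic as a SQL transformation that writes a new table. The bucket, dataset, table, and column names are hypothetical.

```python
# A minimal sketch of option C, assuming hypothetical dataset/table/bucket names.
from google.cloud import bigquery

client = bigquery.Client()

# 1. Ingest raw files from Cloud Storage into a staging table.
load_job = client.load_table_from_uri(
    "gs://my-raw-bucket/orders/*.json",               # assumed path
    "my_project.staging.orders_raw",                  # assumed table
    job_config=bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        autodetect=True,
    ),
)
load_job.result()  # wait for the load to finish

# 2. Express the transformation as BigQuery SQL and write the result to a new table.
query = """
CREATE OR REPLACE TABLE `my_project.analytics.orders_clean` AS
SELECT order_id, customer_id, SUM(amount) AS total_amount
FROM `my_project.staging.orders_raw`
GROUP BY order_id, customer_id
"""
client.query(query).result()
```

Both steps run entirely inside BigQuery's serverless engine, which is what removes the cluster management and long Spark run times described in the question.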
Question 3

You use BigQuery as your centralized analytics platform. New data is loaded every day, and an ETL pipeline modifies the original data and prepares it for the final users. This ETL pipeline is regularly modified and can generate errors, but sometimes the errors are detected only after 2 weeks. You need to provide a method to recover from these errors, and your backups should be optimized for storage costs. How should you organize your data in BigQuery and store your backups?

A. Organize your data in a single table, export, and compress and store the BigQuery data in Cloud Storage.

B. Organize your data in separate tables for each month, and export, compress, and store the data in Cloud Storage.

C. Organize your data in separate tables for each month, and duplicate your data on a separate dataset in BigQuery.

D. Organize your data in separate tables for each month, and use snapshot decorators to restore the table to a time prior to the corruption.

Show Answer
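
To make the two recovery mechanisms the options refer to more concrete, here is a short sketch using the BigQuery Python client: a compressed export of a monthly table to Cloud Storage (the option B style backup) and a point-in-time restore via BigQuery time travel, the GoogleSQL counterpart of the legacy snapshot decorators named in option D. Project, dataset, table, and bucket names are placeholders, and the time-travel window is limited, so older corruption may not be recoverable this way.

```python
# A minimal sketch of the backup/restore mechanisms referenced by the options,
# with hypothetical project/dataset/table names.
from google.cloud import bigquery

client = bigquery.Client()

# Option B style backup: export a monthly table to Cloud Storage, compressed.
extract_job = client.extract_table(
    "my_project.analytics.sales_202401",              # assumed monthly table
    "gs://my-backup-bucket/sales_202401/*.csv.gz",    # assumed bucket
    job_config=bigquery.ExtractJobConfig(compression="GZIP"),
)
extract_job.result()

# Option D style restore: rebuild the table as it looked before the corruption,
# subject to the time-travel window.
restore_sql = """
CREATE OR REPLACE TABLE `my_project.analytics.sales_202401_restored` AS
SELECT *
FROM `my_project.analytics.sales_202401`
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 3 DAY)
"""
client.query(restore_sql).result()
```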
Question 4

You are selecting services to write and transform JSON messages from Cloud Pub/Sub to BigQuery for a data pipeline on Google Cloud. You want to minimize service costs. You also want to monitor and accommodate input data volume that will vary in size with minimal manual intervention. What should you do?

A. Use Cloud Dataproc to run your transformations. Monitor CPU utilization for the cluster. Resize the number of worker nodes in your cluster via the command line.

B. Use Cloud Dataproc to run your transformations. Use the diagnose command to generate an operational output archive. Locate the bottleneck and adjust cluster resources.

C. Use Cloud Dataflow to run your transformations. Monitor the job system lag with Stackdriver. Use the default autoscaling setting for worker instances.

D. Use Cloud Dataflow to run your transformations. Monitor the total execution time for a sampling of jobs. Configure the job to use non-default Compute Engine machine types when needed.

Show Answer
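
As a point of comparison for options C and D, the sketch below launches a streaming Dataflow job that reads from Pub/Sub and writes to BigQuery while leaving worker autoscaling at its default setting; system lag would then be watched in Cloud Monitoring rather than managed by hand. The project, topic, bucket, and table names are assumptions.

```python
# A minimal sketch of a streaming Dataflow job with default autoscaling (option C).
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",
    region="us-central1",
    temp_location="gs://my-temp-bucket/tmp",
    streaming=True,
    max_num_workers=10,   # upper bound only; autoscaling itself stays at the default
)

with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadMessages" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/events")
        | "Parse" >> beam.Map(lambda m: json.loads(m.decode("utf-8")))
        | "WriteToBQ" >> beam.io.WriteToBigQuery(
            "my-project:analytics.events",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
        )
    )
```

Letting the service scale workers with input volume is what keeps manual intervention to a minimum as message rates vary.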
Question 5

You need ads data to serve AI models, and historical data for analytics. Longtail and outlier data points need to be identified.

You want to cleanse the data in near-real time before running it through AI models.

What should you do?

A. Use BigQuery to ingest, prepare, and then analyze the data, and then run queries to create views.

B. Use Cloud Storage as a data warehouse, shell scripts for processing, and BigQuery to create views for desired datasets.

C. Use Dataflow to identify longtail and outlier data points programmatically, with BigQuery as a sink.

D. Use Cloud Composer to identify longtail and outlier data points, and then output a usable dataset to BigQuery

Show Answer
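
To illustrate the pattern named in option C, the sketch below tags outlier records in a Beam pipeline with a simple fixed-threshold rule and writes them to BigQuery. The threshold, schema, table name, and in-memory source are illustrative stand-ins; a real cleansing job would read from the live ads source and apply a proper outlier model.

```python
# A minimal sketch of option C: tag outliers in Beam and sink them to BigQuery.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

CLICK_THRESHOLD = 10_000  # assumed cut-off separating longtail/outlier ads

def tag_outlier(record):
    """Flag outliers instead of dropping them, so historical analytics keep every row."""
    record["is_outlier"] = record["clicks"] > CLICK_THRESHOLD
    return record

ads_rows = [
    {"ad_id": "a1", "clicks": 120},
    {"ad_id": "a2", "clicks": 250_000},   # outlier
]

with beam.Pipeline(options=PipelineOptions()) as p:
    (
        p
        | "ReadAds" >> beam.Create(ads_rows)            # stands in for the real source
        | "TagOutliers" >> beam.Map(tag_outlier)
        | "WriteToBQ" >> beam.io.WriteToBigQuery(
            "my-project:ads.cleansed_ads",
            schema="ad_id:STRING, clicks:INTEGER, is_outlier:BOOLEAN",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
        )
    )
```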

Viewing Page 1 of 3 pages. Download PDF or Software version with 331 questions