Matteo Perego
Member since 2023
Gold League
28380 points
In the last installment of the Dataflow course series, we will introduce the components of the Dataflow operational model. We will examine tools and techniques for troubleshooting and optimizing pipeline performance. We will then review testing, deployment, and reliability best practices for Dataflow pipelines. We will conclude with a review of Templates, which make it easy to scale Dataflow pipelines to organizations with hundreds of users. These lessons will help ensure that your data platform is stable and resilient to unanticipated circumstances.
While the traditional approaches of using data lakes and data warehouses can be effective, they have shortcomings, particularly in large enterprise environments. This course introduces the concept of a data lakehouse and the Google Cloud products used to create one. A lakehouse architecture uses open-standard data sources and combines the best features of data lakes and data warehouses, which addresses many of their shortcomings.
This course is part 1 of a 3-course series on Serverless Data Processing with Dataflow. In this first course, we start with a refresher of what Apache Beam is and its relationship with Dataflow. Next, we talk about the Apache Beam vision and the benefits of the Beam Portability framework. The Beam Portability framework achieves the vision that a developer can use their favorite programming language with their preferred execution backend. We then show you how Dataflow allows you to separate compute and storage while saving money, and how identity, access, and management tools interact with your Dataflow pipelines. Lastly, we look at how to implement the right security model for your use case on Dataflow.
This course helps learners create a study plan for the PDE (Professional Data Engineer) certification exam. Learners explore the breadth and scope of the domains covered in the exam. Learners assess their exam readiness and create their individual study plan.
Complete the intermediate Manage Data Models in Looker skill badge course to demonstrate skills in the following: maintaining LookML project health; utilizing SQL runner for data validation; employing LookML best practices; optimizing queries and reports for performance; and implementing persistent derived tables and caching policies.
Complete the introductory Build LookML Objects in Looker skill badge course to demonstrate skills in the following: building new dimensions and measures, views, and derived tables; setting measure filters and types based on requirements; updating dimensions and measures; building and refining Explores; joining views to existing Explores; and deciding which LookML objects to create based on business requirements.
Data Catalog is deprecated and will be discontinued on January 30, 2026. You can still complete this course if you want to. For steps to transition your Data Catalog users, workloads, and content to Dataplex Catalog, see Transition from Data Catalog to Dataplex Catalog (https://cloud.google.com/dataplex/docs/transition-to-dataplex-catalog). Data Catalog is a fully managed and scalable metadata management service that empowers organizations to quickly discover, understand, and manage all of their data. In this quest, you will start small by learning how to search and tag data assets and metadata with Data Catalog. After learning how to build your own tag templates that map to BigQuery table data, you will learn how to build MySQL, PostgreSQL, and SQL Server to Data Catalog connectors.
In this course, you will get hands-on experience applying advanced LookML concepts in Looker. You will learn how to use Liquid to customize and create dynamic dimensions and measures, create dynamic SQL derived tables and customized native derived tables, and use extends to modularize your LookML code.
Complete the intermediate Create ML Models with BigQuery ML skill badge to demonstrate skills in creating and evaluating machine learning models with BigQuery ML to make data predictions.
Complete the introductory Derive Insights from BigQuery Data skill badge course to demonstrate skills in the following: writing SQL queries; querying public tables; loading sample data into BigQuery; troubleshooting common syntax errors with the query validator in BigQuery; and creating reports in Looker Studio by connecting to BigQuery data.
Complete the introductory Prepare Data for Looker Dashboards and Reports skill badge course to demonstrate skills in the following: filtering, sorting, and pivoting data; merging results from different Looker Explores; and using functions and operators to build Looker dashboards and reports for data analysis and visualization.
Complete the introductory Prepare Data for ML APIs on Google Cloud skill badge to demonstrate skills in the following: cleaning data with Dataprep by Trifacta, running data pipelines in Dataflow, creating clusters and running Apache Spark jobs in Dataproc, and calling ML APIs including the Cloud Natural Language API, Google Cloud Speech-to-Text API, and Video Intelligence API.
This course empowers you to develop scalable, performant LookML (Looker Modeling Language) models that provide your business users with the standardized, ready-to-use data that they need to answer their questions. Upon completing this course, you will be able to start building and maintaining LookML models to curate and manage data in your organization’s Looker instance.
In this course, you learn how to do the kind of data exploration and analysis in Looker that would formerly be done primarily by SQL developers or analysts. Upon completion of this course, you will be able to leverage Looker's modern analytics platform to find and explore relevant content in your organization’s Looker instance, ask questions of your data, create new metrics as needed, and build and share visualizations and dashboards to facilitate data-driven decision making.
In this beginner-level course, you will learn about the Data Analytics workflow on Google Cloud and the tools you can use to explore, analyze, and visualize data and share your findings with stakeholders. Using a case study along with hands-on labs, lectures, and quizzes/demos, the course will demonstrate how to go from raw datasets to clean data to impactful visualizations and dashboards. Whether you already work with data and want to learn how to be successful on Google Cloud, or you’re looking to progress in your career, this course will help you get started. Almost anyone who performs or uses data analysis in their work can benefit from this course.