Data Engineer Learning Path

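13 activities

Last updated 13 days ago

Managed by Google Cloud

Description: A data engineer designs and builds systems that collect and transform the data used to inform business decisions. This learning path guides you through a series of on-demand courses, labs, and skill badges that provide real-world, hands-on experience with the Google Cloud technologies essential to the data engineer role. Once you complete the path, check out the Google Cloud Data Engineer certification to further build your expertise.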

01 A Tour of Google Cloud Hands-on Labs

Lab

45 minutes

Introductory

In this first hands-on lab, you access the Google Cloud console and use these basic Google Cloud features: projects, resources, IAM users, roles, permissions, and APIs.

02 Preparing for Your Professional Data Engineer Journey - Japanese version

Course

5 hours

Intermediate

This course helps learners create a study plan for the Professional Data Engineer (PDE) certification exam. Learners explore the scope of the exam, assess their exam readiness, and create an individual study plan.

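03 Modernizing Data Lakes and Data Warehouses with Google Cloud - Japanese version

Course

16 hours

Introductory

Every data pipeline has two key components: a data lake and a data warehouse. This course introduces the use cases for each type of storage and covers, in technical detail, the data lake and data warehouse solutions available on Google Cloud. It also describes the role of a data engineer, the benefits an effective data pipeline brings to business operations, and why data engineering should be done in a cloud environment. This is the first course in the "Data Engineering on Google Cloud" series. After completing it, enroll in the "Building Batch Data Pipelines on Google Cloud" course.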

04 Building Batch Data Pipelines on Google Cloud - Japanese version

Course

24 hours

Introductory

Data pipelines typically fall under one of three paradigms: Extract and Load (EL); Extract, Load, and Transform (ELT); or Extract, Transform, and Load (ETL). This course describes which paradigm to use, and when, for batch data. It also covers several Google Cloud technologies for data transformation, including BigQuery, running Spark on Dataproc, pipeline graphs in Cloud Data Fusion, and serverless data processing with Dataflow. Learners get hands-on experience building data pipeline components on Google Cloud using Qwiklabs.
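
The three paradigms named above differ mainly in whether, and where, the transform step runs. The following is a minimal sketch in plain Python, not course material: the source rows, the list standing in for a warehouse table, and all function names are hypothetical and chosen only for illustration.

```python
# Hypothetical in-memory stand-in for a source system (illustration only).
SOURCE_ROWS = [" alice ", "BOB", " Carol"]


def extract():
    """Read raw rows from the hypothetical source."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Example cleanup: trim whitespace and lowercase each value."""
    return [row.strip().lower() for row in rows]


def run_el(warehouse):
    """Extract and Load: rows are loaded as-is, with no transformation."""
    warehouse.extend(extract())
    return warehouse


def run_elt(warehouse):
    """Extract, Load, Transform: load raw rows first, then transform
    them in place inside the destination."""
    warehouse.extend(extract())
    warehouse[:] = transform(warehouse)
    return warehouse


def run_etl(warehouse):
    """Extract, Transform, Load: transform in the pipeline, then load
    only the cleaned rows."""
    warehouse.extend(transform(extract()))
    return warehouse
```

In this sketch, `run_elt` and `run_etl` end with the same cleaned rows; what differs is whether the raw data ever lands in the destination, which is one of the considerations when choosing a paradigm.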

05 Building Resilient Streaming Analytics Systems on Google Cloud - Japanese version

Course

24 hours

Introductory

Processing streaming data is becoming increasingly common because streaming lets businesses get real-time metrics on their operations. This course covers how to build streaming data pipelines on Google Cloud. It describes Pub/Sub for handling incoming streaming data, how to aggregate or transform streaming data with Dataflow, and how to store processed records in BigQuery or Cloud Bigtable for analysis. Learners get hands-on experience building streaming data pipeline components on Google Cloud using Qwiklabs.

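06 Smart Analytics, Machine Learning, and AI on Google Cloud - Japanese version

Course

40 hours

Introductory

Incorporating machine learning into data pipelines increases a business's ability to extract insights from its data. This course covers several ways to include machine learning in data pipelines on Google Cloud, depending on the level of customization required. For little to no customization, it introduces AutoML. For more tailored machine learning capabilities, it introduces Notebooks and BigQuery machine learning (BigQuery ML). It also covers how to bring machine learning solutions into production using Kubeflow. Learners get hands-on experience building machine learning models on Google Cloud using Qwiklabs.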

07 Serverless Data Processing with Dataflow: Foundations - Japanese version

Course

8 hours

Introductory

This course is part 1 of a three-course series on serverless data processing with Dataflow. It starts with a refresher on what Apache Beam is and how it relates to Dataflow. Next, it covers the Apache Beam vision and the benefits of the Beam Portability framework, which fulfills the vision that developers can use their preferred programming language with their preferred execution backend. It then shows how Dataflow lets you separate compute and storage while saving money, and how identity, access, and management tools interact with Dataflow pipelines. Finally, it looks at how to implement the right security model for your use case on Dataflow.

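08 Serverless Data Processing with Dataflow: Developing Pipelines - Japanese version

Course

8 hours

Intermediate

This second installment of the Dataflow course series goes deeper into developing pipelines with the Beam SDK. It starts with a review of Apache Beam concepts. Next, it covers processing streaming data using windows, watermarks, and triggers. It then covers options for sources and sinks in your pipelines, schemas for expressing structured data, and how to do stateful transformations using the State and Timer APIs. It moves on to review best practices for maximizing pipeline performance. Toward the end of the course, it introduces SQL and DataFrames for representing business logic in Beam, and shows how to develop pipelines iteratively using Beam notebooks.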

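09 Serverless Data Processing with Dataflow: Operations - Japanese version

Course

13 hours

Intermediate

The last course in the Dataflow series introduces the components of the Dataflow operational model. It examines tools and techniques for troubleshooting and optimizing pipeline performance, then reviews best practices for testing, deployment, and reliability of Dataflow pipelines. It concludes with a review of templates, which make it easy to scale Dataflow pipelines to organizations with hundreds of users. These lessons help you keep your data platform stable and resilient to unanticipated circumstances.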

10 Prepare Data for ML APIs on Google Cloud

Course

6 hours 30 minutes

Introductory

Complete the introductory Prepare Data for ML APIs on Google Cloud skill badge to demonstrate skills in the following: cleaning data with Dataprep by Trifacta, running data pipelines in Dataflow, creating clusters and running Apache Spark jobs in Dataproc, and...

11 Build a Data Warehouse with BigQuery

Course

5 hours 15 minutes

Intermediate

Complete the intermediate Build a Data Warehouse with BigQuery skill badge to demonstrate skills in the following: joining data to create new tables, troubleshooting joins, appending data with unions, creating date-partitioned tables, and working with JSON, arrays, and structs in...

12 Engineer Data for Predictive Modeling with BigQuery ML

Course

5 hours 30 minutes

Intermediate

Complete the intermediate Engineer Data for Predictive Modeling with BigQuery ML skill badge to demonstrate skills in the following: building data transformation pipelines to BigQuery using Dataprep by Trifacta; using Cloud Storage, Dataflow, and BigQuery to build extract, transform, and...

13 Build a Data Mesh with Dataplex

Course

5 hours 30 minutes

Introductory

Complete the introductory Build a Data Mesh with Dataplex skill badge to demonstrate skills in the following: building a data mesh with Dataplex to facilitate data security, governance, and discovery on Google Cloud. You practice and test your skills in...

Google Cloud Skills Boost
