Setup Dataplex Universal Catalog

Lab · 1 hour · Credits: 5 · Level: Introductory
Note: This lab may incorporate AI tools to support your learning.

Overview

Dataplex Universal Catalog enables centralized discovery, management, monitoring, and governance of data and AI artifacts across your data platform, providing access to trusted data and powering analytics and AI at scale.

Dataplex Universal Catalog (previously named Dataplex, Dataplex Catalog, and BigQuery Universal Catalog) is a fully managed, scalable metadata management service that you can use to tag data assets using aspects and to search for the assets to which you have access. An aspect attaches a set of metadata fields to a specific data asset for easy identification and retrieval (for example, marking certain assets as containing protected or sensitive data). The fields come from a template called an aspect type; in addition to the predefined aspect types, you can create custom ones to apply to different data assets.

In this lab, you learn how to use Dataplex Universal Catalog to attach aspects to a data asset and then search for assets.

What you'll do

  • Enable the Dataplex and BigQuery APIs
  • Create a lake, zone, and asset in Dataplex Universal Catalog
  • Create a custom aspect type
  • Apply aspects based on an aspect type to assets
  • Search for assets using aspects

Setup and requirements

Before you click the Start Lab button

Read these instructions. Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources will be made available to you.

This hands-on lab lets you do the lab activities yourself in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials that you use to sign in and access Google Cloud for the duration of the lab.

What you need

To complete this lab, you need:

  • Access to a standard internet browser (Chrome browser recommended).
  • Time to complete the lab.
Note: If you have a personal Google Cloud account or project, do not use it for this lab.
Note: If you are using a Pixelbook, open an Incognito window to run this lab.

Log in to Google Cloud Console

  1. Using the browser tab or window you are using for this lab session, copy the Username from the Connection Details panel and click the Open Google Console button.
Note: If you are asked to choose an account, click Use another account.
  2. Paste in the Username, and then the Password as prompted.
  3. Click Next.
  4. Accept the terms and conditions.

Because this is a temporary account that lasts only as long as this lab:

  • Do not add recovery options
  • Do not sign up for free trials
  5. Once the console opens, view the list of services by clicking the Navigation menu at the top-left.

Verify or enable required APIs

  1. In the Google Cloud Console, enter Cloud Dataplex API in the top search bar.

  2. Click on the result for Cloud Dataplex API under Marketplace.

  3. If the API is not already enabled, click Enable to enable the API.

  4. Repeat steps 1-3 for BigQuery API.
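
The same two APIs can also be enabled from a terminal. The following is a minimal sketch, assuming the gcloud CLI is installed and authenticated for the lab project; it only builds and prints the `gcloud services enable` command. The helper name `build_enable_command` and the project ID `my-lab-project` are illustrative placeholders, not part of the lab.

```python
import shlex

# Service names for the Cloud Dataplex API and the BigQuery API.
REQUIRED_SERVICES = ["dataplex.googleapis.com", "bigquery.googleapis.com"]


def build_enable_command(project_id: str, services=REQUIRED_SERVICES) -> list[str]:
    """Return the argv that enables the given services in one project."""
    return ["gcloud", "services", "enable", *services, f"--project={project_id}"]


if __name__ == "__main__":
    # "my-lab-project" is a placeholder; use the project ID shown in the lab panel.
    print(shlex.join(build_enable_command("my-lab-project")))
```

Enabling a service that is already enabled is harmless, so the command is safe to run even if the console shows the APIs as active.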

Check IAM permissions

  1. From the navigation menu select IAM & Admin and from the flyout submenu select IAM.

  2. Find your entry, which looks like student-xx-xxxxxxxxxxxx@qwiklabs.net.

  3. Verify that you have the Dataplex Administrator and Dataplex Catalog Admin roles.
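
The role check can be expressed against an exported IAM policy as well. The sketch below assumes the policy was saved as JSON (for example with `gcloud projects get-iam-policy PROJECT_ID --format=json`) and that the two console role names map to the role IDs `roles/dataplex.admin` and `roles/dataplex.catalogAdmin`; treat those IDs and the sample member as assumptions to confirm in your own project.

```python
# Role IDs assumed to match the console names
# "Dataplex Administrator" and "Dataplex Catalog Admin".
REQUIRED_ROLES = {"roles/dataplex.admin", "roles/dataplex.catalogAdmin"}


def missing_roles(policy: dict, member: str, required=REQUIRED_ROLES) -> set[str]:
    """Return the required roles that `member` is not bound to in `policy`."""
    granted = {
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    }
    return set(required) - granted


if __name__ == "__main__":
    # Hand-written sample policy; a real one comes from the exported JSON.
    sample = {
        "bindings": [
            {"role": "roles/dataplex.admin", "members": ["user:student@example.com"]},
        ]
    }
    print(missing_roles(sample, "user:student@example.com"))
```

An empty result means both roles are bound directly to the member; roles inherited through a group or a parent folder would not appear in this simple check.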

Task 1. Create a lake, zone, and asset

In this task, you create a new Dataplex Universal Catalog lake to store customer order information, add a curated zone to the lake, and then attach a pre-created BigQuery dataset as a new asset in the zone.

Create a lake

  1. In the Google Cloud Console, in the Navigation menu, navigate to View All Products > Analytics > Dataplex Universal Catalog.

If prompted with Welcome to the new Dataplex experience, click Close.

  2. Under Manage lakes, click Manage.

  3. Click Create.

  4. Enter the required information to create a new lake:

Property        Value
Display Name    Orders Lake
ID              Leave the default value.
Region

Leave the other default values, including Metastore service as None.

  5. Click Create.

It can take up to 3 minutes for the lake to be created.

Add a zone to the lake

  1. On the Manage tab, click on the name of your lake.

  2. Click Add zone.

  3. Enter the required information to create a new zone:

Property          Value
Display Name      Customer Curated Zone
ID                Leave the default value.
Type              Curated zone
Data locations    Regional

Leave the other default values.

For example, the option for Enable metadata discovery under Discovery settings is enabled by default and allows authorized users to discover the data in the zone.

  4. Click Create.

It can take up to 2 minutes for the zone to be created.

You can perform the next task once the status of the zone is Active.

Attach an asset to a zone

  1. On the Zones tab, click on the name of your zone.

  2. On the Assets tab, click Add assets.

  3. Click Add an asset.

  4. Enter the required information to attach a new asset:

Property        Value
Type            BigQuery dataset
Display Name    Customer Details Dataset
ID              Leave the default value.
Dataset         .customers

Leave the other default values.

  5. Click Done.

  6. Click Continue.

  7. For Discovery settings, select Inherit to inherit the Discovery settings from the zone level, and then click Continue.

  8. Click Submit.

Click Check my progress to verify the task you performed: Create a lake, zone, and asset in Dataplex.
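
For reference, the console steps in this task correspond roughly to three `gcloud dataplex` commands. The sketch below only builds and prints them. It assumes the resource IDs `orders-lake`, `customer-curated-zone`, and `customer-details-dataset` (the lab keeps the console defaults, which are not shown), that the `customers` dataset lives in the lab project, and placeholder project and region values. It does not set discovery options on the asset, so check that behavior before relying on it.

```python
import shlex


def build_commands(project: str, region: str) -> list[list[str]]:
    """Return gcloud argv lists that create the lake, zone, and asset in order."""
    # Assumed IDs: the lab leaves the console defaults, which are not shown.
    lake = "orders-lake"
    zone = "customer-curated-zone"
    asset = "customer-details-dataset"
    scope = [f"--project={project}", f"--location={region}"]
    return [
        ["gcloud", "dataplex", "lakes", "create", lake,
         "--display-name=Orders Lake", *scope],
        ["gcloud", "dataplex", "zones", "create", zone, f"--lake={lake}",
         "--display-name=Customer Curated Zone", "--type=CURATED",
         "--resource-location-type=SINGLE_REGION", "--discovery-enabled", *scope],
        ["gcloud", "dataplex", "assets", "create", asset, f"--lake={lake}",
         f"--zone={zone}", "--display-name=Customer Details Dataset",
         "--resource-type=BIGQUERY_DATASET",
         f"--resource-name=projects/{project}/datasets/customers", *scope],
    ]


if __name__ == "__main__":
    # Placeholder project and region; use the values assigned by the lab.
    for argv in build_commands("my-lab-project", "my-lab-region"):
        print(shlex.join(argv))
```

The order matters: a zone needs an existing lake, and an asset needs an active zone, which mirrors the waiting periods noted in the console steps.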

Task 2. Dataplex Universal Catalog setup

We will now work with Dataplex Universal Catalog and enrich the BigQuery asset you just configured.

  1. On the left menu click Search.

  2. The search platform will automatically be Dataplex Universal Catalog. Ignore any messages about the prior product Data Catalog.

  3. In the Filters section go to the Systems section and select BigQuery.

  4. Once the BigQuery assets load on the right, search for Customers, star the result, and then click the Customers entry that has the Type alias DATASET. Changing Sort by to Last modified (Recent first) makes it easier to find.

  5. Click the LIST link in the header.

  6. Click customer_details.

  7. In the Tags & aspects section, under Required aspects, click the downward-facing show more arrow on the right side of the Storage entry. Notice the metadata under Resource Name, which shows the complete path to the BigQuery table.

  8. In the header, click SCHEMA and notice the field names and other metadata for the fields.

  9. In the header, click Data Profile, and then click QUICK DATA PROFILE. Click CONFIRM in the message box.

  10. In the header, click Data Quality, and then click CREATE DATA QUALITY SCAN. Enter the required information:

Property           Value
Display Name       Customer Detail
Sampling size *    20%

  11. Verify that Publish results to BigQuery and Dataplex Catalog UI is checked. Leave the rest of the fields as their defaults. Click CONTINUE.

  12. Click ADD RULES.

  13. Select Profile based recommendations in the drop-down menu.

  14. In the field Choose columns *, click BROWSE. In the pop-up, check the box to the left of Name to select all the fields, and then click the SELECT button at the bottom. If no data fields appear, wait a couple of minutes for the data profile scan you just created to run, and then start again at step 12.

  15. Check the box to the left of Column name to select all rows.

  16. Scroll to the bottom and click SELECT.

  17. On the next screen, scroll to the bottom and click CONTINUE.

  18. In the box Select BigQuery dataset, click BROWSE.

  19. Select the radio button for Customers and click SELECT.

  20. In the box BigQuery table, enter customer_details_quality_scan.

  21. Click RUN SCAN.

  22. You will now add a new aspect type. Under the Manage Metadata menu on the left, click Catalog.

  23. Click CREATE ASPECT TYPE.

  24. Click the box labeled Data Sensitivity.

  25. Click USE EXAMPLE.

  26. Set the Location field .

  27. Click SAVE.

  28. You will now add the new aspect and connect it to fields in the table. Click Search in the left menu bar. Choose the search platform by clicking Dataplex Universal Catalog on the right of the header bar.

  29. Select the box for BigQuery in the Systems section under Filters.

  30. Select customer_details.

  31. You can connect an aspect at the table level on the DETAILS tab by using + ADD next to Optional tags & aspects at the bottom of the page, but in this lab you connect it to specific fields instead.

  32. To add or edit aspects on individual columns of the table, click the Schema tab.

  33. Select the zip field.

  34. Click + ADD TAG OR ASPECT and select Data_Sensitivity on the pop-up menu.

  35. In the fly-out, set Is Encrypted to False, Has PII to True, and PII Type to Other, and then click SAVE.

  36. Repeat steps 33-35 for each of the following fields, using the same values for Is Encrypted and Has PII as above:

Name         PII Type
age          Other
email        EMAIL
latitude     Other
city         Address
longitude    Other
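
The per-column settings above amount to a small lookup table, which can be handy for double-checking your work against the SCHEMA tab. A minimal sketch of that data follows; the keys `is_encrypted`, `has_pii`, and `pii_type` are hypothetical stand-ins for the fields shown in the console fly-out, not confirmed field names of the Data Sensitivity aspect type.

```python
# PII Type per column as listed in the lab; zip comes from the first pass.
PII_TYPES = {
    "zip": "Other",
    "age": "Other",
    "email": "EMAIL",
    "latitude": "Other",
    "city": "Address",
    "longitude": "Other",
}


def column_aspects(pii_types=PII_TYPES) -> dict[str, dict]:
    """Expand each column's PII Type into the full set of aspect values."""
    return {
        # Is Encrypted and Has PII are the same for every column in this lab.
        column: {"is_encrypted": False, "has_pii": True, "pii_type": pii_type}
        for column, pii_type in pii_types.items()
    }


if __name__ == "__main__":
    for column, values in column_aspects().items():
        print(column, values)
```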

Click Check my progress to verify the task you performed: Dataplex Universal Catalog setup.

Task 3. Searching Dataplex Universal Catalog

  1. On the left menu click Search.

  2. In the Filters section scroll down to Aspects and select Data Sensitivity.

  3. Once the BigQuery assets load on the right, click customer_details which has the Type alias of TABLE.

  4. Click on the SCHEMA tab. Notice the fields marked Data Sensitivity.

  5. Click on one of the Data Sensitivity tags and notice the setting in the pop-out. Click CLOSE on the pop-out.

  6. You now know how to locate data by the aspect associated with it.
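
The Aspects filter has a text equivalent that can be typed into the search box. The sketch below assumes the `aspect:` qualifier from the Dataplex Universal Catalog search syntax, which matches a substring of the aspect type path. The aspect type ID `data-sensitivity` is a guess at the ID created from the example template, so confirm the real ID on the aspect type's details page before using it.

```python
def aspect_query(aspect_type_id: str, *extra_terms: str) -> str:
    """Build a search string that filters by aspect type, plus optional keywords."""
    return " ".join([f"aspect:{aspect_type_id}", *extra_terms])


if __name__ == "__main__":
    # "data-sensitivity" is an assumed aspect type ID, not taken from the lab.
    print(aspect_query("data-sensitivity", "customer_details"))
```

Adding a plain keyword such as the table name narrows the results in the same way as clicking through the result list.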

Congratulations!

You configured and used Dataplex Universal Catalog to enrich the metadata for specific fields in a table. You then used that metadata to locate the data fields.
