BigQuery in JupyterLab on Vertex AI

Lab: 1 hour 15 minutes, 5 credits, introductory level

Overview

This lab shows how to create a Jupyter notebook instance on Google Cloud's Vertex AI service. The demonstration uses a dataset of flight departure and arrival times.

Objectives

In this lab, you will learn how to perform the following tasks:

  • Instantiate a Jupyter notebook on Vertex AI.
  • Execute a BigQuery query from within a Jupyter notebook and process the output using Pandas.

Setup and requirements

For each lab, you get a new Google Cloud project and set of resources for a fixed time at no cost.

  1. Sign in to Qwiklabs using an incognito window.

  2. Note the lab's access time (for example, 1:15:00), and make sure you can finish within that time.
    There is no pause feature. You can restart if needed, but you have to start at the beginning.

  3. When ready, click Start lab.

  4. Note your lab credentials (Username and Password). You will use them to sign in to the Google Cloud Console.

  5. Click Open Google Console.

  6. Click Use another account and copy/paste credentials for this lab into the prompts.
    If you use other credentials, you'll receive errors or incur charges.

  7. Accept the terms and skip the recovery resource page.

Open BigQuery Console

  1. In the Google Cloud Console, on the Navigation menu, click BigQuery.
    The Welcome to BigQuery in the Cloud Console dialog opens. This dialog provides a link to the quickstart guide and lists UI updates.

  2. Click Done to close the dialog.

Task 1. Launch Vertex AI Workbench instance

  1. In the Google Cloud console, from the Navigation menu, select Vertex AI > Dashboard.

  2. Click Enable All Recommended APIs.

  3. In the Navigation menu, click Workbench.

    At the top of the Workbench page, ensure you are in the Instances view.

  4. Click Create New.

  5. Configure the Instance:

    • Name: lab-workbench
    • Region: Set the region to
    • Zone: Set the zone to
    • Advanced Options (Optional): If needed, click "Advanced Options" for further customization (e.g., machine type, disk size).

    [Screenshot: Create a Vertex AI Workbench instance]

  6. Click Create.

    Creating the instance takes a few minutes. A green checkmark appears next to its name when it is ready.

  7. Click Open JupyterLab next to the instance name to launch the JupyterLab interface. This opens a new tab in your browser.

    [Screenshot: Workbench Instance Deployed]

  8. Click the Python 3 icon to launch a new Python notebook.

    [Screenshot: Open the Jupyter Notebook]

  9. Right-click on the Untitled.ipynb file in the menu bar and select Rename Notebook to give it a meaningful name.

    [Screenshot: Rename the notebook]

Your environment is set up. You are now ready to start working with your Vertex AI Workbench notebook.

[Screenshot: Vertex Notebook ready for use]

Click Check my progress to verify the objective: Launch Vertex AI Workbench instance.

Task 2. Execute a BigQuery query

  1. Enter the following query in the first cell of the notebook:
%%bigquery df --use_rest_api
SELECT
  depdelay AS departure_delay,
  COUNT(1) AS num_flights,
  APPROX_QUANTILES(arrdelay, 10) AS arrival_delay_deciles
FROM
  `cloud-training-demos.airline_ontime_data.flights`
WHERE
  depdelay IS NOT NULL
GROUP BY
  depdelay
HAVING
  num_flights > 100
ORDER BY
  depdelay ASC

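The command uses the cell magic %%bigquery. Magic functions are special commands provided by the notebook's IPython kernel rather than Python statements, and a %% prefix means the magic applies to the whole cell. Here, %%bigquery runs the cell's query in BigQuery and stores the output in a Pandas DataFrame object named df. The --use_rest_api flag downloads the results through the BigQuery REST API instead of the BigQuery Storage API.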

  2. With the cursor in the cell, press Shift + Enter to run it. Alternatively, open the Run menu and click Run Selected Cells; the menu also shows the keyboard shortcut for this action in case it is not Shift + Enter. The command should produce no output.

Click Check my progress to verify the objective: Execute a BigQuery query.

  3. View the first five rows of the query's output by executing the following code in a new cell:
df.head()

[Output: five rows of data under the headings departure_delay, num_flights, and arrival_delay_deciles]
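
Where the %%bigquery magic is not available, for example in a plain Python script, the same query can be run through the google-cloud-bigquery client library. The following is a minimal sketch, not a description of what the magic does internally. It assumes the library is installed with its pandas support and that default credentials and a project are configured, as they are on a Workbench instance. The names QUERY and run_query are illustrative and not part of the lab.

```python
# Sketch: run the lab's query with the BigQuery client library
# instead of the %%bigquery cell magic.
QUERY = """
SELECT
  depdelay AS departure_delay,
  COUNT(1) AS num_flights,
  APPROX_QUANTILES(arrdelay, 10) AS arrival_delay_deciles
FROM
  `cloud-training-demos.airline_ontime_data.flights`
WHERE
  depdelay IS NOT NULL
GROUP BY
  depdelay
HAVING
  num_flights > 100
ORDER BY
  depdelay ASC
"""


def run_query(sql=QUERY):
    """Run sql in BigQuery and return the result as a pandas DataFrame."""
    # Imported lazily so this file loads even without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes default credentials and project
    return client.query(sql).to_dataframe()


# In a notebook cell, the equivalent of the magic would be:
#   df = run_query()
#   df.head()
```

Keeping the SQL in a module-level string makes it easy to reuse the query in scripts or tests, at the cost of the convenience the magic offers inside a notebook.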

Task 3. Make a plot with Pandas

We will use the Pandas DataFrame containing the query output to build a plot that shows how arrival delays correspond to departure delays. If you are unfamiliar with Pandas, the Ten Minute Getting Started Guide is recommended reading before you continue.
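
Before wrangling the output, it may help to see what APPROX_QUANTILES(arrdelay, 10) places in each row: an array of 11 boundaries, running from the approximate minimum through nine decile cut points to the approximate maximum. The sketch below reproduces that shape exactly on made-up data with Python's standard library. BigQuery's values are approximate, so this only illustrates the layout. The names arrival_delays and decile_boundaries are hypothetical.

```python
import statistics

# Synthetic arrival delays in minutes (made up for illustration): 0, 1, ..., 100.
arrival_delays = list(range(101))


def decile_boundaries(values):
    """Return 11 boundaries: minimum, nine decile cut points, maximum."""
    cuts = statistics.quantiles(values, n=10, method="inclusive")
    return [min(values), *cuts, max(values)]


boundaries = decile_boundaries(arrival_delays)

# Position i in the array corresponds to the (i * 10)% boundary.
labels = [f"{i * 10}%" for i in range(len(boundaries))]

print(len(boundaries))        # 11
print(labels[0], labels[-1])  # 0% 100%
```

Eleven boundaries rather than ten are why the wrangled table ends up with columns labelled 0% through 100%.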

  1. To get a DataFrame containing the data we need, we first have to wrangle the raw query output. Enter the following code in a new cell to convert the list of arrival_delay_deciles into a Pandas Series object. The code also renames the resulting columns.

import pandas as pd

percentiles = df['arrival_delay_deciles'].apply(pd.Series)
percentiles.rename(columns=lambda x: '{0}%'.format(x * 10), inplace=True)
percentiles.head()

  2. Since we want to relate departure delay times to arrival delay times, we have to concatenate our percentiles table to the departure_delay field in our original DataFrame. Execute the following code in a new cell:

df = pd.concat([df['departure_delay'], percentiles], axis=1)
df.head()

  3. Before plotting the contents of the DataFrame, drop extreme values stored in the 0% and 100% fields. Execute the following code in a new cell:

df.drop(labels=['0%', '100%'], axis=1, inplace=True)
df.plot(x='departure_delay', xlim=(-30, 50), ylim=(-50, 50));

[Plot: line graph of arrival delay versus departure delay]
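
The three wrangling steps in this task can be tried without a BigQuery connection. The sketch below applies them to a small synthetic DataFrame that mimics the query output. All values are invented, and expand_deciles is a hypothetical helper name. It builds the percentile columns from a list of lists instead of using .apply(pd.Series) as the lab does. That is a substitution on our part which should give the same columns, and it leaves out the plotting call.

```python
import pandas as pd

# Synthetic stand-in for the query output (values are invented):
# each row holds 11 quantile boundaries, from minimum to maximum.
raw = pd.DataFrame({
    "departure_delay": [-5, 10],
    "num_flights": [1200, 800],
    "arrival_delay_deciles": [
        [-40, -20, -15, -12, -10, -8, -6, -3, 0, 5, 60],
        [-25, -5, 0, 3, 5, 8, 10, 13, 18, 25, 90],
    ],
})


def expand_deciles(frame):
    """Spread the decile arrays into columns and drop the extremes."""
    # One column per array position, labelled 0%, 10%, ..., 100%.
    percentiles = pd.DataFrame(
        frame["arrival_delay_deciles"].tolist(), index=frame.index
    )
    percentiles = percentiles.rename(columns=lambda i: f"{i * 10}%")
    # Put departure_delay alongside the percentile columns.
    wide = pd.concat([frame["departure_delay"], percentiles], axis=1)
    # Drop the minimum and maximum, which hold the extreme values.
    return wide.drop(columns=["0%", "100%"])


wide = expand_deciles(raw)
print(wide.columns.tolist())
```

Returning a new DataFrame rather than modifying df in place means the cell can be re-run safely, whereas the lab's in-place drop fails on a second run because the 0% and 100% columns are already gone.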

End your lab

When you have completed your lab, click End Lab. Qwiklabs removes the resources you’ve used and cleans the account for you.

You will be given an opportunity to rate the lab experience. Select the applicable number of stars, type a comment, and then click Submit.

The number of stars indicates the following:

  • 1 star = Very dissatisfied
  • 2 stars = Dissatisfied
  • 3 stars = Neutral
  • 4 stars = Satisfied
  • 5 stars = Very satisfied

You can close the dialog box if you don't want to provide feedback.

For feedback, suggestions, or corrections, please use the Support tab.

Copyright 2022 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.
