Training a model with Vertex AI Forecast

Lab · 1 hour · 5 Credits · Introductory

Note: This lab may incorporate AI tools to support your learning.

Overview

In this lab, you build and train a forecasting model with Vertex AI AutoML.

Objectives:

  • Create a managed dataset.
  • Import data from a BigQuery table.
  • Update the column metadata for appropriate use with AutoML.
  • Train a model using options such as a budget and optimization objective.

Setup and requirements

Before you click the Start Lab button

Note: Read these instructions.

Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources will be made available to you.

This Qwiklabs hands-on lab lets you do the lab activities yourself in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials that you use to sign in and access Google Cloud for the duration of the lab.

What you need

To complete this lab, you need:

  • Access to a standard internet browser (Chrome browser recommended).
  • Time to complete the lab.
Note: If you already have your own personal Google Cloud account or project, do not use it for this lab.

Note: If you are using a Pixelbook, open an Incognito window to run this lab.

How to start your lab and sign in to the Console

  1. Click the Start Lab button. If you need to pay for the lab, a pop-up opens for you to select your payment method. On the left is a panel populated with the temporary credentials that you must use for this lab.

  2. Copy the username, and then click Open Google Console. The lab spins up resources, and then opens another tab that shows the Choose an account page.

    Note: Open the tabs in separate windows, side-by-side.
  3. On the Choose an account page, click Use Another Account. The Sign in page opens.

  4. Paste the username that you copied from the Connection Details panel. Then copy and paste the password.

Note: You must use the credentials from the Connection Details panel. Do not use your Google Cloud Skills Boost credentials. If you have your own Google Cloud account, do not use it for this lab; this avoids incurring charges to your account.

  5. Click through the subsequent pages:
  • Accept the terms and conditions.
  • Do not add recovery options or two-factor authentication (because this is a temporary account).
  • Do not sign up for free trials.

After a few moments, the Cloud console opens in this tab.

Note: You can view the menu with a list of Google Cloud Products and Services by clicking the Navigation menu at the top-left.

Activate Google Cloud Shell

Google Cloud Shell is a virtual machine loaded with development tools. It offers a persistent 5 GB home directory and runs on Google Cloud.

Google Cloud Shell provides command-line access to your Google Cloud resources.

  1. In Cloud console, on the top right toolbar, click the Open Cloud Shell button.

  2. Click Continue.

It takes a few moments to provision and connect to the environment. When you are connected, you are already authenticated, and the project is set to your PROJECT_ID.

gcloud is the command-line tool for Google Cloud. It comes pre-installed on Cloud Shell and supports tab-completion.

  • You can list the active account name with this command:

gcloud auth list

Output:

Credentialed accounts:
 - <myaccount>@<mydomain>.com (active)

Example output:

Credentialed accounts:
 - google1623327_student@qwiklabs.net

  • You can list the project ID with this command:

gcloud config list project

Output:

[core]
project = <project_ID>

Example output:

[core]
project = qwiklabs-gcp-44776a13dea667a6

Note: Full documentation of gcloud is available in the gcloud CLI overview guide.

Task 1. Review the data

This lab uses data from the Iowa Liquor Sales dataset from BigQuery Public Datasets. This dataset consists of wholesale liquor purchases in the US state of Iowa since 2012.

Once you are in the public dataset, you can browse the original raw data by clicking View Dataset. To access the table, in the left navigation pane, expand the bigquery-public-data project, expand the iowa_liquor_sales dataset (you may have to click More Results at the bottom of the list first), and then click the sales table. You can select the Preview tab to see a selection of rows from the dataset.

For the purposes of this lab, some basic data pre-processing has already been done to group the purchases by day. You use a CSV file that is extracted from the BigQuery table.

The columns in the CSV file are:

  • ds: The date.
  • y: The sum of all purchases for that day in dollars.
  • holiday: A boolean indicating whether the date is a US holiday.
  • id: A time-series identifier (to support multiple time-series, e.g. by store or by product).

In this case, you forecast the overall purchases in one time-series, so the id column is set to 0 for each row.
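
If you want to inspect the raw public table programmatically rather than through the console, here is a minimal sketch using the BigQuery client library for Python (assuming the google-cloud-bigquery package is available, as it is in Cloud Shell):

from google.cloud import bigquery

client = bigquery.Client()

# Preview a few rows of the public Iowa liquor sales table.
query = """
SELECT date, store_name, sale_dollars
FROM `bigquery-public-data.iowa_liquor_sales.sales`
LIMIT 5
"""
for row in client.query(query).result():
    print(row.date, row.store_name, row.sale_dollars)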

Task 2. Import the data

In this task, you import the data from the Cloud Storage bucket configured for your project into a dataset.

Enable the APIs

  1. In the Google Cloud Console, on the Navigation menu, navigate to Vertex AI > Dashboard.

  2. Click Enable all recommended APIs.

Create a Vertex AI dataset

  1. In the Vertex AI menu, navigate to Datasets.

  2. Click Create Dataset.

  3. On the Create dataset page, set the Dataset name to iowa_daily.

  4. For the data type and objective, click Tabular, and then select Forecasting.

  5. Click Create.

Import data

The next step is to import data into the dataset. Best practice is to import from a BigQuery table; however, the data currently resides in a Cloud Storage bucket supplied to you, so it must first be loaded into a BigQuery table.

  • In Cloud Shell, run the following commands:

# Capture the Cloud Storage bucket supplied with your project
# (assumes the project contains a single bucket).
export STORAGE_BUCKET=$(gsutil ls)

# Copy the pre-processed daily sales CSV into your home directory.
gsutil cp gs://automl-demo-240614-lcm/iowa_liquor/iowa_daily.csv .

# Keep all but the last 70 lines of the CSV.
head -n -70 iowa_daily.csv > iowa_data.csv

# Upload the trimmed file to your bucket and record its full path.
gsutil cp iowa_data.csv $STORAGE_BUCKET
export DATA_INPUT_FILE=$(gsutil ls $STORAGE_BUCKET)
echo DATA_INPUT_FILE: $DATA_INPUT_FILE

You may need to click Authorize for Cloud Shell to continue.

Task 3. Create a BigQuery dataset

In this task, you import training data into BigQuery.

  1. In the Google Cloud Console, on the Navigation menu, right-click on BigQuery and select Open link in new tab.

  2. In the BigQuery tab in your browser, in the Explorer pane, click the vertical ellipsis (⋮) next to your project and select Create dataset.

  3. In the Create dataset pop-up page, set the Dataset ID to iowa_daily.

  4. Set the Location type to Region.

  5. Set the Region to us-central1 (Iowa).

  6. Leave the remaining settings on their default values and click Create Dataset.

  7. Expand the node for your project in the Explorer pane, click the vertical ellipsis (⋮) next to the iowa_daily dataset and select Create table.

  8. In the Create table page, under Source, in the Create table from field, select Google Cloud Storage.

  9. Paste the value of DATA_INPUT_FILE displayed in Cloud Shell earlier into the Select file from Cloud Storage bucket field, excluding the "gs://" prefix.

  10. Select CSV in the File format field.

  11. In the Destination section, enter forecasting into the Table field.

  12. In the Schema section, check the Auto detect box.

  13. Leave the remaining settings on their default values and click Create Table.
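
If you prefer the command line to the console, the same dataset and table can be created from Cloud Shell. Here is a minimal sketch using the BigQuery client library for Python; the bucket URI is a placeholder for your own DATA_INPUT_FILE value:

from google.cloud import bigquery

client = bigquery.Client(location="us-central1")

# Create the iowa_daily dataset, matching the console steps above.
dataset = client.create_dataset("iowa_daily", exists_ok=True)

# Load the trimmed CSV into a table named "forecasting", letting
# BigQuery auto-detect the schema.
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    autodetect=True,
)
load_job = client.load_table_from_uri(
    "gs://YOUR_BUCKET/iowa_data.csv",  # placeholder: use your DATA_INPUT_FILE
    dataset.table("forecasting"),
    job_config=job_config,
)
load_job.result()  # wait for the load to finish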

Task 4. Import data into a Vertex AI dataset

In this task, you configure Vertex AI to import training data from BigQuery.

  1. Return to the Vertex AI browser tab and, on the Source tab, select Select a table or view from BigQuery.

  2. Select a path in the BigQuery path field by clicking Browse.

  3. In the Select path page, enter forecasting in the search box and click Search.

  4. Check the forecasting table associated with your Project ID and click Select.

  5. Click Continue.
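
For reference, the managed dataset creation and BigQuery import in Tasks 2 through 4 can also be expressed with the Vertex AI SDK for Python. This is a minimal sketch with a placeholder project ID, not part of the required lab steps:

from google.cloud import aiplatform

aiplatform.init(project="YOUR_PROJECT_ID", location="us-central1")  # placeholder project ID

# Create a managed time-series dataset directly from the BigQuery table.
dataset = aiplatform.TimeSeriesDataset.create(
    display_name="iowa_daily",
    bq_source="bq://YOUR_PROJECT_ID.iowa_daily.forecasting",
)
print(dataset.resource_name)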

Task 5. Create the model

In this task, you create a model and configure it to use the imported data.

Configure model features

After a few minutes, AutoML notifies you that the import has been completed. At that point, you can configure the model features.

  1. Select the Series identifier column to be id. There is only one time series in this dataset, so this is a formality.

  2. Select the Timestamp column to be ds.

  3. Click Generate Statistics. This process takes a few minutes; when it completes, you see a summary of the data to be imported. You may continue to the next step while the statistics are generated.

Create the model

  1. Click Train New Model to begin the training process.

  2. Select the AutoML radio button.

  3. Click Continue.

Define the model properties

  1. Select the Target column to be y. This is the value that you are predicting.

  2. If not already set earlier, set the Series identifier column to id and the Timestamp column to ds.

  3. Set Data granularity to Daily and Forecast horizon to 7. The forecast horizon specifies the number of periods that the model can predict into the future.

  4. Set the Context window to 7. The model will use data from the previous 7 days to make a prediction. There are trade-offs between shorter and longer windows; in general, a value between 1x and 10x the forecast horizon is recommended.

  5. Check the box to Export test dataset to BigQuery.

  6. In the BigQuery path field, enter your Project ID, iowa_daily, and test_data, separated by periods, following the format hint shown in the field.

Note: If you leave the BigQuery path blank, it will automatically create a dataset and table in your project.
  7. Click Continue.

Note: The Data Split option Manual requires that you add an extra field to the data as shown below.

The Data split column enables you to select specific rows to be used for training, validation, and testing. When you create your training data, you add a column that can contain one of the following (case sensitive) values:

  • TRAIN
  • VALIDATE
  • TEST
  • UNASSIGNED

The values in this column must be one of the two following combinations:

  • All of TRAIN, VALIDATE, and TEST
  • Only TEST and UNASSIGNED

Every row must have a value for this column; it cannot be the empty string.

For example, with all sets specified:

"TRAIN","John","Doe","555-55-5555" "TEST","Jane","Doe","444-44-4444" "TRAIN","Roger","Rogers","123-45-6789" "VALIDATE","Sarah","Smith","333-33-3333"

With only the test set specified:

"UNASSIGNED","John","Doe","555-55-5555" "TEST","Jane","Doe","444-44-4444" "UNASSIGNED","Roger","Rogers","123-45-6789" "UNASSIGNED","Sarah","Smith","333-33-3333"

The Data split column can have any valid column name; its transformation type can be Categorical, Text, or Auto.

If the value of the Data split column is UNASSIGNED, Vertex AI automatically assigns that row to the training or validation set.

When you train your model, you should select a Manual data split and specify this column as the data split column.
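
For illustration, here is a minimal pandas sketch of the "Only TEST and UNASSIGNED" combination applied to this lab's CSV. The column name split and the 30-row cutoff are arbitrary choices for this example, and the rows are assumed to be in date order:

import pandas as pd

# Add a manual data-split column, holding out the most recent rows as TEST.
df = pd.read_csv("iowa_data.csv")
df["split"] = "UNASSIGNED"                 # Vertex AI assigns these rows itself
df.loc[df.index[-30:], "split"] = "TEST"   # last 30 rows as the test set
df.to_csv("iowa_data_split.csv", index=False)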

Set the training options

In this step, you will specify more details about how you'd like to train the model.

  1. Set the holiday column Transformation to Automatic.

  2. Set the holiday column Feature type to Covariate.

  3. Set the Available at forecast column to Available, because you know whether a given date is a holiday in advance.

Do not edit the Weight column; by default, all rows are weighted evenly.

  4. Expand Advanced Options and set the Optimization objective to RMSE.

  5. Click Continue.

Task 6. Initiate the model training

In this task, you initiate the model training.

  1. Set a budget of your choice. For the purposes of this lab, 1 node hour is sufficient to train the model, but you should be aware that this is only reasonable for small 'toy' datasets. Typically, in production, this value would be something in the range of 6-12 node hours.

  2. Click Start Training to begin the training process.

  3. You can click Vertex AI > Training in the Google Cloud Console menu to see the progress of the model training.

You complete this lab up to this point without waiting for the training result. Depending on available resources, training takes 1-2 hours to complete. In production, you would receive an email notifying you that training has finished. When the model is ready, you can evaluate its performance and make predictions with it. You will continue with model evaluation and model prediction in the next two labs of this course.
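
For reference, the model configuration from Tasks 5 and 6 maps onto the Vertex AI SDK for Python roughly as follows. This is a minimal sketch, not the lab's required path: the display name is a placeholder, and dataset is the TimeSeriesDataset object from the earlier sketch.

from google.cloud import aiplatform

aiplatform.init(project="YOUR_PROJECT_ID", location="us-central1")

# AutoML forecasting job with the RMSE optimization objective selected above.
job = aiplatform.AutoMLForecastingTrainingJob(
    display_name="iowa_daily_forecast",   # placeholder display name
    optimization_objective="minimize-rmse",
)

model = job.run(
    dataset=dataset,                      # TimeSeriesDataset created earlier
    target_column="y",                    # the value being predicted
    time_column="ds",
    time_series_identifier_column="id",
    available_at_forecast_columns=["ds", "holiday"],  # holiday is known in advance
    unavailable_at_forecast_columns=["y"],
    forecast_horizon=7,
    context_window=7,
    data_granularity_unit="day",
    data_granularity_count=1,
    budget_milli_node_hours=1000,         # 1 node hour
)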

Congratulations!

In this lab, you practiced with data preparation, model building, and model training with Vertex AI AutoML.

You're ready to build your own forecasting model!

End your lab

When you have completed your lab, click End Lab. Google Cloud Skills Boost removes the resources you’ve used and cleans the account for you.

You will be given an opportunity to rate the lab experience. Select the applicable number of stars, type a comment, and then click Submit.

The number of stars indicates the following:

  • 1 star = Very dissatisfied
  • 2 stars = Dissatisfied
  • 3 stars = Neutral
  • 4 stars = Satisfied
  • 5 stars = Very satisfied

You can close the dialog box if you don't want to provide feedback.

For feedback, suggestions, or corrections, please use the Support tab.

Copyright 2022 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.
