Visualizing Data with Google Data Studio

1 hour 30 minutes 5 Credits

GSP197

Overview

This lab demonstrates how to use Google Data Studio to visualize data stored in Google BigQuery.

The US Bureau of Transportation Statistics provides datasets on commercial aviation, multimodal freight activity, and transportation economics that can be used to demonstrate a wide range of data science concepts and techniques. This lab uses a dataset containing historical information about domestic flights in the United States.

Objectives

  • Create BigQuery views

  • Create a BigQuery data source in Google Data Studio

  • Create a Data Studio report with a date range control

  • Create multiple charts using BigQuery views

Setup and requirements

Before you click the Start Lab button

Read these instructions. Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources will be made available to you.

This hands-on lab lets you do the lab activities yourself in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials that you use to sign in and access Google Cloud for the duration of the lab.

To complete this lab, you need:

  • Access to a standard internet browser (Chrome browser recommended).
Note: Use an Incognito or private browser window to run this lab. This prevents conflicts between your personal account and the Student account, which could otherwise cause extra charges to your personal account.
  • Time to complete the lab. Remember, once you start, you cannot pause a lab.
Note: If you already have your own personal Google Cloud account or project, do not use it for this lab to avoid extra charges to your account.

How to start your lab and sign in to the Google Cloud Console

  1. Click the Start Lab button. If you need to pay for the lab, a pop-up opens for you to select your payment method. On the left is the Lab Details panel with the following:

    • The Open Google Console button
    • Time remaining
    • The temporary credentials that you must use for this lab
    • Other information, if needed, to step through this lab
  2. Click Open Google Console. The lab spins up resources, and then opens another tab that shows the Sign in page.

    Tip: Arrange the tabs in separate windows, side-by-side.

    Note: If you see the Choose an account dialog, click Use Another Account.
  3. If necessary, copy the Username from the Lab Details panel and paste it into the Sign in dialog. Click Next.

  4. Copy the Password from the Lab Details panel and paste it into the Welcome dialog. Click Next.

    Important: You must use the credentials from the left panel. Do not use your Google Cloud Skills Boost credentials.
    Note: Using your own Google Cloud account for this lab may incur extra charges.
  5. Click through the subsequent pages:

    • Accept the terms and conditions.
    • Do not add recovery options or two-factor authentication (because this is a temporary account).
    • Do not sign up for free trials.

After a few moments, the Cloud Console opens in this tab.

Note: You can view the menu with a list of Google Cloud Products and Services by clicking the Navigation menu at the top-left.

Activate Cloud Shell

Cloud Shell is a virtual machine that is loaded with development tools. It offers a persistent 5GB home directory and runs on Google Cloud. Cloud Shell provides command-line access to your Google Cloud resources.

  1. In the Cloud Console, in the top right toolbar, click the Activate Cloud Shell button.

  2. Click Continue.

It takes a few moments to provision and connect to the environment. When you are connected, you are already authenticated, and the project is set to your PROJECT_ID. The output contains a line that declares the PROJECT_ID for this session:

Your Cloud Platform project in this session is set to YOUR_PROJECT_ID

gcloud is the command-line tool for Google Cloud. It comes pre-installed on Cloud Shell and supports tab-completion.

  3. (Optional) You can list the active account name with this command:

gcloud auth list

(Output)

ACTIVE: *
ACCOUNT: student-01-xxxxxxxxxxxx@qwiklabs.net
To set the active account, run:
    $ gcloud config set account `ACCOUNT`

  4. (Optional) You can list the project ID with this command:

gcloud config list project

(Output)

[core]
project = <project_ID>

(Example output)

[core]
project = qwiklabs-gcp-44776a13dea667a6

For full documentation of gcloud, see the gcloud command-line tool overview in the Cloud SDK documentation.

Task 1. Prepare your environment

This lab uses a dataset, code samples, and scripts developed for Data Science on the Google Cloud Platform, 2nd Edition from O'Reilly Media, Inc. It covers the data visualization tasks from Chapter 3, "Creating Compelling Dashboards".

Clone the Data Science on Google Cloud repository

  1. In the Cloud Shell, enter the following command to clone the repository:

git clone https://github.com/GoogleCloudPlatform/data-science-on-gcp/

  2. Change to the repository directory:

cd data-science-on-gcp/03_sqlstudio

Schema exploration

This lab uses a BigQuery dataset that has been pre-loaded with two months of sample flight data, for January and February 2015, obtained from the US Bureau of Transportation Statistics. The flight data is in a table called flights_raw in the dsongcp dataset.

  1. In the Cloud Console, on the Navigation menu, click BigQuery.

  2. In the Explorer panel on the left, expand your project and dsongcp dataset, then select the flights_raw table.

  3. On the right side of the window, select the Schema tab to see the schema of the flights_raw table.

For a quick look at a BigQuery table, use the Preview functionality.

Outside of this lab environment, Preview is free, whereas running a query, for example SELECT * FROM … LIMIT 10, incurs a query cost.

  4. Click the Preview tab to view sample rows of the flights_raw table.

Create BigQuery views

Create some table views to easily see flights that are delayed by 10, 15 and 20 minutes respectively. You'll use these views later in the lab.
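To make the idea of threshold views concrete, here is a minimal sketch that builds one CREATE VIEW statement per threshold. It is not the content of create_views.sh: the source table name (flights) and the filter on DEP_DELAY are assumptions made for illustration, and view_ddl is a hypothetical helper.

```python
# Hypothetical helper: builds the SQL for one threshold view.
# Assumption: each view keeps flights whose DEP_DELAY is at or above the threshold.
def view_ddl(threshold, dataset="dsongcp", source="flights"):
    """Return a CREATE VIEW statement for flights delayed by at least `threshold` minutes."""
    return (
        f"CREATE OR REPLACE VIEW {dataset}.delayed_{threshold} AS\n"
        f"SELECT * FROM {dataset}.{source}\n"
        f"WHERE DEP_DELAY >= {threshold};"
    )


# One statement per threshold used in this lab.
statements = [view_ddl(t) for t in (10, 15, 20)]

for statement in statements:
    print(statement, end="\n\n")
```

Generating the statements from a single template keeps the three views identical apart from the threshold, which makes the later charts directly comparable.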

  1. In Cloud Shell, run the script ./create_views.sh.

./create_views.sh
  2. Run the following script to compute the contingency table for various thresholds:

./contingency.sh
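As an illustration of what a contingency table for a delay threshold can look like, the sketch below counts flights by whether the departure delay reached a threshold and whether the arrival was late (15 minutes or more, matching the is_late formula used later in this lab). This is an assumption about the general technique, not a transcription of contingency.sh, and the sample rows are invented.

```python
from collections import Counter

ARRIVAL_LATE_MINUTES = 15  # same cut-off as the is_late formula in this lab


def contingency(flights, dep_threshold, arr_threshold=ARRIVAL_LATE_MINUTES):
    """Count flights in each (departure delayed?, arrival late?) cell."""
    counts = Counter()
    for dep_delay, arr_delay in flights:
        dep_late = dep_delay >= dep_threshold
        arr_late = arr_delay >= arr_threshold
        counts[(dep_late, arr_late)] += 1
    return {
        "dep_late_arr_late": counts[(True, True)],
        "dep_late_arr_on_time": counts[(True, False)],
        "dep_ok_arr_late": counts[(False, True)],
        "dep_ok_arr_on_time": counts[(False, False)],
    }


# Invented (DEP_DELAY, ARR_DELAY) pairs in minutes, for illustration only.
sample = [(5, 2), (12, 20), (18, 14), (25, 30), (0, -5), (11, 16)]

tables = {t: contingency(sample, t) for t in (10, 15, 20)}
for threshold, table in tables.items():
    print(threshold, table)
```

Comparing the cells across thresholds shows the trade-off: a higher departure threshold flags fewer flights but misses more late arrivals.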

Task 2. Connect to Data Studio to visually analyze the dataset

  1. Click to open Google Data Studio in a new browser tab.

  2. If needed, click Use it for free.

  3. Click Data sources in the top menu.

  4. On the top left, click + Create > Data source.

  5. Select a Country and provide a Company name, agree to the Terms of service and click Continue.

  6. Select No for all email preferences, then click Continue.

  7. In the list of Google Connectors, click the BigQuery tile.

  8. Click AUTHORIZE to enable access from Data Studio to your Cloud sources.

  9. If needed, be sure your lab account is selected, click ALLOW.

  10. Click to select MY PROJECTS > [Project-ID] > dsongcp > flights.

  11. Click the blue CONNECT button on the upper right of the screen.

Task 3. Create a scatter chart using Data Studio

  1. Click CREATE REPORT at the top right of the page.

  2. Click ADD TO REPORT to confirm that you want to add the flights table as a data source.

  3. Replace Untitled Report in the top left with your name for this report.

  4. Since you'll create your own charts, click to select, then delete the automatically created chart.

  5. Click Add a chart > Scatter chart, then draw a rectangle on the report canvas to hold the chart.

  6. In the right panel, the DATA tab lists the data properties. In the DATA tab, click the field for each setting below and change it to the following value:

    Field        Value
    Dimension    UNIQUE CARRIER
    Metric X     DEP_DELAY
    Metric Y     ARR_DELAY

  7. Hover your mouse over the data type icon (SUM) of the Metric X property.

  8. Click the pencil icon to edit the aggregation type of Metric X.

  9. Change the aggregation type to Average.

  10. Click outside the aggregation type box to return to the property pane.

  11. Do the same for Metric Y to change the aggregation type from SUM to Average.

  12. Click the STYLE tab.

  13. In the Style menu click the Trendline drop-down and select Linear.

  14. In the ribbon above the report, click Add a control > Date range control.

  15. Draw a rectangle the size of a label below the chart to add the Date range control.
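The scatter chart configured above plots one point per carrier, using the average departure and arrival delay, with a linear trendline through the points. The following minimal sketch reproduces that calculation outside Data Studio. It assumes the trendline is an ordinary least-squares fit, and the carrier rows are invented for illustration.

```python
from collections import defaultdict


def average_by_carrier(rows):
    """Return {carrier: (avg DEP_DELAY, avg ARR_DELAY)} from (carrier, dep, arr) rows."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for carrier, dep_delay, arr_delay in rows:
        acc = sums[carrier]
        acc[0] += dep_delay
        acc[1] += arr_delay
        acc[2] += 1
    return {c: (dep / n, arr / n) for c, (dep, arr, n) in sums.items()}


def linear_trendline(points):
    """Ordinary least-squares fit y = slope * x + intercept over (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented (carrier, DEP_DELAY, ARR_DELAY) rows, for illustration only.
rows = [
    ("AA", 10, 8), ("AA", 20, 22),
    ("DL", 0, -2), ("DL", 10, 6),
    ("UA", 30, 35), ("UA", 20, 25),
]

averages = average_by_carrier(rows)
slope, intercept = linear_trendline(list(averages.values()))
print(averages)
print(slope, intercept)
```

Averaging rather than summing matters here: a SUM would mostly reflect how many flights each carrier operates, while an average makes carriers of different sizes comparable.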

Try it out!

  1. Set a date range between January 1, 2015 and February 28, 2015 by either:

  • Clicking Auto date range in the Date range control Properties panel on the right.

  • Clicking the Date range control rectangle you added under the scatter chart.

  2. Click the VIEW button on the upper right to change to the interactive report view to test this control.
Note: You see data only if the range includes dates between January 1, 2015 and February 28, 2015, because the dataset is limited to those dates in this lab.
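The effect of the date range control can be sketched as a simple inclusive filter on the flight date. This is only an illustration of why a range outside January and February 2015 shows no data; the rows and the filter_by_date helper are hypothetical.

```python
from datetime import date


def filter_by_date(rows, start, end):
    """Keep rows whose flight date falls within [start, end], inclusive."""
    return [row for row in rows if start <= row[0] <= end]


# Invented (flight date, carrier) rows inside the lab's two-month window.
flights = [
    (date(2015, 1, 1), "AA"),
    (date(2015, 1, 15), "DL"),
    (date(2015, 2, 28), "UA"),
]

# A range that overlaps the data returns the matching rows.
in_range = filter_by_date(flights, date(2015, 1, 10), date(2015, 3, 31))

# A range entirely outside the data returns nothing, so charts appear empty.
out_of_range = filter_by_date(flights, date(2016, 1, 1), date(2016, 12, 31))

print(len(in_range), len(out_of_range))
```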

Task 4. Adding additional chart types to your report

  1. Click Edit on the upper right to add more chart items.

  2. Click Add a chart > Pie chart, then draw a rectangle on the report canvas to hold the pie chart.

  3. With the pie chart selected, click ADD A FIELD on the bottom right of the Data tab in the right panel.

  4. Click the <- ALL FIELDS to view the field property summary.

  5. Click the context menu icon (three dots) to the right of the ARR_DELAY field and select Duplicate.

  6. Click + ADD A FIELD on the top right of the section.

  7. Name the field is_late.

  8. In the Formula text box enter the following formula:

CASE WHEN ( Copy of ARR_DELAY <15) THEN "ON TIME" ELSE "LATE" END

The field name must register correctly. If the field name is not highlighted in the formula editor, double-check the formula or use the Available Fields selector on the right to select the Copy of ARR_DELAY field.

  9. Click SAVE and then click DONE.

  10. In the DATA tab in the right panel, change the Dimension for the Pie chart to the new is_late calculated field.

The pie chart now displays the percentage of on time and late flights.
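The is_late calculated field and the resulting pie chart shares can be sketched as follows. The function mirrors the CASE WHEN formula (arrival delay under 15 minutes is "ON TIME", anything else is "LATE"); the delay values are invented for illustration.

```python
from collections import Counter


def is_late(arr_delay, threshold=15):
    """Mirror of the calculated field: under `threshold` minutes counts as on time."""
    return "ON TIME" if arr_delay < threshold else "LATE"


def share_by_label(arr_delays):
    """Return the percentage of flights per label, as a pie chart would show."""
    counts = Counter(is_late(d) for d in arr_delays)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


# Invented ARR_DELAY values in minutes, for illustration only.
delays = [-3, 0, 14, 15, 42, 7, 9, 120]

shares = share_by_label(delays)
print(shares)
```

Note that a delay of exactly 15 minutes falls into "LATE", because the formula uses a strict less-than comparison.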

Add a bar (column) chart

  1. Click Add a chart > Column chart, and then draw a rectangle on the report canvas to hold the bar chart.

  2. In the DATA tab, click the field for each setting below and change it to the following value:

    Field                          Value
    Dimension                      UNIQUE CARRIER
    Metric 1 (Default)             DEP_DELAY
    Metric 2 (click Add metric)    ARR_DELAY
    Sort                           UNIQUE CARRIER
    Sort Order                     Ascending

Task 5. Creating additional dashboard items for different departure delay thresholds

You've created 3 database table views. Now create charts to display the delay thresholds for these tables.

Add an additional data source for the Delayed_10 database table view

  1. Copy the pie chart and the bar chart so that you now have two sets. The report canvas should now look similar to this:

  2. Select the second pie chart and click flights in the Data Source in the property list.

  3. Click + ADD DATA at the bottom of the menu.

  4. Click BigQuery in the Google Connectors section of the selection pane.

  5. Select MY PROJECTS > [Project-ID] > dsongcp.

  6. Click the delayed_10 table to select it and then click the ADD button on the bottom right of the screen.

  7. Click ADD TO REPORT.

Recreate the copy of the ARR_DELAY field and the is_late calculated field

  1. Click + ADD A FIELD on the bottom right of the screen. You may need to make sure you have selected the DATA property tab on the right hand side of the screen to see this.

  2. If you cannot see the full list of fields with their data type and Aggregation type displayed then click the <- ALL FIELDS to go to the field property summary.

  3. Click the context menu icon to the right of the ARR_DELAY field and select Duplicate.

  4. Click + ADD A FIELD on the right side of the screen.

  5. Enter is_late in the Field Name text box.

  6. Enter the following formula in the Formula text box:

CASE WHEN ( Copy of ARR_DELAY <15) THEN "ON TIME" ELSE "LATE" END

The field name must register correctly. If the field name is not highlighted in the formula editor, double-check the formula or use the Available Fields selector on the right to select the Copy of ARR_DELAY field.

  7. Click SAVE and then click DONE.

  8. Now change the Dimension for the Delayed_10 pie chart to the new is_late calculated field.

The second pie chart now displays the percentage of on time and late flights for the Delayed_10 view.

Creating the remaining dashboard views (optional)

Optionally, repeat the last two sections to add data sources and charts for the Delayed_15 and Delayed_20 views.

Congratulations!

You used Google Data Studio to visualize data stored in BigQuery tables and views.

Finish your Quest

This self-paced lab is part of the Quest, Data Science on Google Cloud. A Quest is a series of related labs that form a learning path. Completing this Quest earns you a badge to recognize your achievement. You can make your badge (or badges) public and link to them in your online resume or social media account. Enroll in this Quest and get immediate completion credit if you've taken this lab.

Take your next lab

Continue your Quest with Processing Data with Google Cloud Dataflow.

Manual Last Updated April 5, 2022
Lab Last Tested March 08, 2022

Copyright 2022 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.