arrow_back

Troubleshooting Common SQL Errors with BigQuery v1.5

Accédez à plus de 700 ateliers et cours

Troubleshooting Common SQL Errors with BigQuery v1.5

Atelier 50 minutes universal_currency_alt 5 crédits show_chart Débutant
info Cet atelier peut intégrer des outils d'IA pour vous accompagner dans votre apprentissage.
Accédez à plus de 700 ateliers et cours

Overview

BigQuery is Google's fully managed, NoOps, low cost analytics database. With BigQuery you can query terabytes and terabytes of data without having any infrastructure to manage or needing a database administrator. BigQuery uses SQL, and you can take advantage of the pay-as-you-go model. BigQuery allows you to focus on analyzing data to find meaningful insights.

A newly available ecommerce dataset that has millions of Google Analytics records for the Google Merchandise Store has been loaded into BigQuery. You have a copy of that dataset for this lab and will explore the available fields and row for insights.

This lab steps you through the logic of troubleshooting queries. It provides activities within the context of a real-world scenario. Throughout the lab, imagine you're working with a new data analyst on your team, and they've provided you with their queries below to answer some questions on your ecommerce dataset. Use the answers to fix their queries to get a meaningful result.

What you'll do

In this lab, learn how to perform the following tasks:

  • Query the data-to-insights public dataset
  • Use the BigQuery Query editor to troubleshoot common SQL errors
  • Use the Query Validator
  • Troubleshoot syntax and logical SQL errors

Setup and requirements

For each lab, you get a new Google Cloud project and set of resources for a fixed time at no cost.

  1. Sign in to Qwiklabs using an incognito window.

  2. Note the lab's access time (for example, 1:15:00), and make sure you can finish within that time.
    There is no pause feature. You can restart if needed, but you have to start at the beginning.

  3. When ready, click Start lab.

  4. Note your lab credentials (Username and Password). You will use them to sign in to the Google Cloud Console.

  5. Click Open Google Console.

  6. Click Use another account and copy/paste credentials for this lab into the prompts.
    If you use other credentials, you'll receive errors or incur charges.

  7. Accept the terms and skip the recovery resource page.

Open BigQuery and Pin a project to the Resource tree

In this section, you add the data-to-insights project to your environment resources.

Open the BigQuery console

  1. In the Google Cloud console, from the Navigation menu (Navigation menu), select BigQuery.

The Welcome to BigQuery in the Cloud Console message box opens. This message box provides a link to the quickstart guide and the release notes.

  1. Click Done.

  2. Navigate to Explorer pane, and click + Add data.

  3. Select Star a project by name.

Expanded Add data menu with expanded Pin a project option highlighted and Enter project name submenu option highlighted

  1. For Project name, enter data-to-insights.

  2. Click Star.

In the explorer pane, under Show starred only you will see the data-to-insights project pinned.

BigQuery Code editor

For each activity in the following sections, this lab provides queries with common errors for you to troubleshoot. The lab directs you what to look at and suggests how to correct the syntax and return meaningful results.

To follow along with the troubleshooting and suggestions, copy and paste the query into the BigQuery EDITOR. If there are errors you see a red exclamation point at the line containing the error and in the query validator (bottom corner).

query_editor_2.png

If you run the query with the errors, the query fails and the error is specified in the Job information.

query_fail_2.png

When the query is error free, you see a green checkmark in the query validator. When you see the green checkmark, click RUN to run the query to view what you get for results.

For information about syntax, see Standard SQL Query Syntax.

Find the total number of customers who went through checkout

Your goal in this section is to construct a query that gives you the number of unique visitors who successfully went through the checkout process for your website. The data is in the rev_transactions table which your data analyst team has provided. They have also given you example queries to help you get started in your analysis but you're not sure they're written correctly.

Troubleshoot queries that contain query validator, alias, and comma errors

Look at the below query and answer the following question

#standardSQL SELECT FROM `data-to-inghts.ecommerce.rev_transactions` LIMIT 1000

What about this updated query?

#standardSQL SELECT * FROM [data-to-insights:ecommerce.rev_transactions] LIMIT 1000

What about this query that uses Standard SQL?

#standardSQL SELECT FROM `data-to-insights.ecommerce.rev_transactions`

What about now? This query has a column.

#standardSQL SELECT fullVisitorId FROM `data-to-insights.ecommerce.rev_transactions`

What about now? The following query has a page title.

#standardSQL SELECT fullVisitorId hits_page_pageTitle FROM `data-to-insights.ecommerce.rev_transactions` LIMIT 1000

What about now? The missing comma has been corrected.

#standardSQL SELECT fullVisitorId , hits_page_pageTitle FROM `data-to-insights.ecommerce.rev_transactions` LIMIT 1000

Answer: This returns results, but are you sure visitors aren't counted twice? Also, returning only one row answers the question of how many unique visitors reached checkout. In the next section you find a way to aggregate your results.

Troubleshoot queries that contain logic errors, GROUP BY statements, and wildcard filters

Aggregate the following query to answer the question: How many unique visitors reached checkout?

#standardSQL SELECT fullVisitorId , hits_page_pageTitle FROM `data-to-insights.ecommerce.rev_transactions` LIMIT 1000

What about this? An aggregation function, COUNT(), was added.

#standardSQL SELECT COUNT(fullVisitorId) AS visitor_count , hits_page_pageTitle FROM `data-to-insights.ecommerce.rev_transactions`

In this next query, GROUP BY and DISTINCT statements were added.

#standardSQL SELECT COUNT(DISTINCT fullVisitorId) AS visitor_count , hits_page_pageTitle FROM `data-to-insights.ecommerce.rev_transactions` GROUP BY hits_page_pageTitle

Results

query_result.png

Great! The results are good, but they look strange. Filter to just "Checkout Confirmation" in the results.

#standardSQL SELECT COUNT(DISTINCT fullVisitorId) AS visitor_count , hits_page_pageTitle FROM `data-to-insights.ecommerce.rev_transactions` WHERE hits_page_pageTitle = "Checkout Confirmation" GROUP BY hits_page_pageTitle

List the cities with the most transactions with your ecommerce site

Troubleshoot ordering, calculated fields, and filtering after aggregating errors

Complete the partially written query:

SELECT geoNetwork_city, totals_transactions, COUNT( DISTINCT fullVisitorId) AS distinct_visitors FROM `data-to-insights.ecommerce.rev_transactions` GROUP BY

Possible solution

#standardSQL SELECT geoNetwork_city, SUM(totals_transactions) AS totals_transactions, COUNT( DISTINCT fullVisitorId) AS distinct_visitors FROM `data-to-insights.ecommerce.rev_transactions` GROUP BY geoNetwork_city

Update your previous query to order the top cities first.

Possible solution

#standardSQL SELECT geoNetwork_city, SUM(totals_transactions) AS totals_transactions, COUNT( DISTINCT fullVisitorId) AS distinct_visitors FROM `data-to-insights.ecommerce.rev_transactions` GROUP BY geoNetwork_city ORDER BY distinct_visitors DESC

Update your query and create a new calculated field to return the average number of products per order by city.

Possible solution

#standardSQL SELECT geoNetwork_city, SUM(totals_transactions) AS total_products_ordered, COUNT( DISTINCT fullVisitorId) AS distinct_visitors, SUM(totals_transactions) / COUNT( DISTINCT fullVisitorId) AS avg_products_ordered FROM `data-to-insights.ecommerce.rev_transactions` GROUP BY geoNetwork_city ORDER BY avg_products_ordered DESC

Results

query_result_2.png

Filter your aggregated results to only return cities with more than 20 avg_products_ordered.

What's wrong with the following query?

#standardSQL SELECT geoNetwork_city, SUM(totals_transactions) AS total_products_ordered, COUNT( DISTINCT fullVisitorId) AS distinct_visitors, SUM(totals_transactions) / COUNT( DISTINCT fullVisitorId) AS avg_products_ordered FROM `data-to-insights.ecommerce.rev_transactions` WHERE avg_products_ordered > 20 GROUP BY geoNetwork_city ORDER BY avg_products_ordered DESC

Possible solution

#standardSQL SELECT geoNetwork_city, SUM(totals_transactions) AS total_products_ordered, COUNT( DISTINCT fullVisitorId) AS distinct_visitors, SUM(totals_transactions) / COUNT( DISTINCT fullVisitorId) AS avg_products_ordered FROM `data-to-insights.ecommerce.rev_transactions` GROUP BY geoNetwork_city HAVING avg_products_ordered > 20 ORDER BY avg_products_ordered DESC

Find the total number of products in each product category

Find the top selling products by filtering with NULL values

What's wrong with the following query? How can you fix it?

#standardSQL SELECT hits_product_v2ProductName, hits_product_v2ProductCategory FROM `data-to-insights.ecommerce.rev_transactions` GROUP BY 1,2

What is wrong with the following query?

#standardSQL SELECT COUNT(hits_product_v2ProductName) as number_of_products, hits_product_v2ProductCategory FROM `data-to-insights.ecommerce.rev_transactions` WHERE hits_product_v2ProductName IS NOT NULL GROUP BY hits_product_v2ProductCategory ORDER BY number_of_products DESC

Update the previous query to only count distinct products in each product category.

Possible solution

#standardSQL SELECT COUNT(DISTINCT hits_product_v2ProductName) as number_of_products, hits_product_v2ProductCategory FROM `data-to-insights.ecommerce.rev_transactions` WHERE hits_product_v2ProductName IS NOT NULL GROUP BY hits_product_v2ProductCategory ORDER BY number_of_products DESC LIMIT 5

Congratulations!

You have troubleshooted and fixed broken queries in BigQuery standard SQL. Remember to use the Query Validator for incorrect query syntax but also to be critical of your query results even if your query executes successfully.

End your lab

When you have completed your lab, click End Lab. Google Cloud Skills Boost removes the resources you’ve used and cleans the account for you.

You will be given an opportunity to rate the lab experience. Select the applicable number of stars, type a comment, and then click Submit.

The number of stars indicates the following:

  • 1 star = Very dissatisfied
  • 2 stars = Dissatisfied
  • 3 stars = Neutral
  • 4 stars = Satisfied
  • 5 stars = Very satisfied

You can close the dialog box if you don't want to provide feedback.

For feedback, suggestions, or corrections, please use the Support tab.

Copyright 2022 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.

Avant de commencer

  1. Les ateliers créent un projet Google Cloud et des ressources pour une durée déterminée.
  2. Les ateliers doivent être effectués dans le délai imparti et ne peuvent pas être mis en pause. Si vous quittez l'atelier, vous devrez le recommencer depuis le début.
  3. En haut à gauche de l'écran, cliquez sur Démarrer l'atelier pour commencer.

Utilisez la navigation privée

  1. Copiez le nom d'utilisateur et le mot de passe fournis pour l'atelier
  2. Cliquez sur Ouvrir la console en navigation privée

Connectez-vous à la console

  1. Connectez-vous à l'aide des identifiants qui vous ont été attribués pour l'atelier. L'utilisation d'autres identifiants peut entraîner des erreurs ou des frais.
  2. Acceptez les conditions d'utilisation et ignorez la page concernant les ressources de récupération des données.
  3. Ne cliquez pas sur Terminer l'atelier, à moins que vous n'ayez terminé l'atelier ou que vous ne vouliez le recommencer, car cela effacera votre travail et supprimera le projet.

Ce contenu n'est pas disponible pour le moment

Nous vous préviendrons par e-mail lorsqu'il sera disponible

Parfait !

Nous vous contacterons par e-mail s'il devient disponible

Un atelier à la fois

Confirmez pour mettre fin à tous les ateliers existants et démarrer celui-ci

Utilisez la navigation privée pour effectuer l'atelier

Ouvrez une fenêtre de navigateur en mode navigation privée pour effectuer cet atelier. Vous éviterez ainsi les conflits entre votre compte personnel et le compte temporaire de participant, qui pourraient entraîner des frais supplémentaires facturés sur votre compte personnel.