arrow_back

Answering Complex Questions Using Native Derived Tables with LookML

Join Sign in
Test and share your knowledge with our community!
done
Get access to over 700 hands-on labs, skill badges, and courses

Answering Complex Questions Using Native Derived Tables with LookML

Lab 1 hour universal_currency_alt No cost show_chart Intermediate
Test and share your knowledge with our community!
done
Get access to over 700 hands-on labs, skill badges, and courses

GSP935

Google Cloud self-paced labs logo

Overview

Looker is a modern data platform in Google Cloud that lets you analyze and visualize your data interactively. You can use Looker to do in-depth data analysis, integrate insights across different data sources, build actionable data-driven workflows, and create custom data applications.

In this lab, you will learn how to leverage native derived tables to answer complex questions, address advanced use cases, and customize them using built-in parameters.

What you'll learn

You will learn how to:

  • Create native derived tables to answer complex questions using derived columns.
  • Update a native derived table to generate dynamic values using built-in filter parameters.
  • Understand how business users leverage customized native derived tables to answer complex questions.

Prerequisites:

To maximize your learning, familiarity with LookML is necessary and completing the Understanding LookML in Looker skill badge course is recommended before beginning this lab.

Setup and requirements

Before you click the Start Lab button

Read these instructions. Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources will be made available to you.

This hands-on lab lets you do the lab activities yourself in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials that you use to sign in and access Google Cloud for the duration of the lab.

To complete this lab, you need:

  • Access to a standard internet browser (Chrome browser recommended).
Note: Use an Incognito or private browser window to run this lab. This prevents any conflicts between your personal account and the Student account, which may cause extra charges incurred to your personal account.
  • Time to complete the lab---remember, once you start, you cannot pause a lab.
Note: If you already have your own personal Google Cloud account or project, do not use it for this lab to avoid extra charges to your account.

How to start your lab and sign in to Looker

  1. When ready, click Start Lab button.

    A new panel will appear with the temporary credentials that you must use for this lab.

    If you need to pay for the lab, a pop-up will open for you to select your payment method.

  2. Note your lab credentials in the left pane. You will use them to sign in to the Looker instance for this lab.

    Note: If you use other credentials, you will get errors or incur charges.
  3. Click Open Looker.

  4. Enter the provided Username and Password in the Email and Password fields.

    Important: You must use the credentials from the Connection Details panel on this page. Do not use your Google Cloud Skills Boost credentials. If you have your own Looker account, do not use it for this lab.
  5. Click Log In.

    After a successful login, you will see the Looker instance for this lab.

Task 1. Customize native derived tables using derived columns

Native derived tables are derived tables that perform the same function as a written SQL query, but are expressed natively in the LookML language.

Why should you use native derived tables? Think about the user_facts SQL derived table you made in the previous lab. You did a COUNT of order IDs as lifetime_order_count, and a SUM of sale_price as lifetime_revenue. If you didn't already notice, these aggregations already exist in your model as measures! Your order_items view already has an order_count and a total_sales.

Native derived tables are great because they embody the core LookML principle of reusability. They allow you to inherit already existing dimensions, measures and even Explores and join logic. This makes your model much more maintainable in the long run, since you’re minimizing the number of “hard-coded” database references.

In this section, you will create a native derived table named brand_order_facts that contains a derived column to rank brands by total revenue and that can be filtered using a dynamic date range and/or user inputs. You will also create new dimensions that label rows as being in the top 5 brands or not (i.e. lump all brands ranked 6+ as one brand name of “6) Other”).

Create a native derived table that ranks the top 5 brands by total sale

  1. First, on the bottom left of the Looker User Interface, click the toggle button to enter Development mode.

  2. On the Looker navigation menu, click Explore.

  3. Under E-Commerce Training, click Order Items.

  4. Under the Inventory Items view, click the Product Brand dimension.

  5. Under the Order Items view, click the Total Revenue measure.

  6. Click Run.

  7. Click on the settings gear icon (settings-gear-icon.png) next to Run (top right of page), and select Get LookML.

  8. Tab over to Derived Table, click on the LookML code in the box, and copy it to your clipboard.

  9. Navigate to the Looker IDE (Develop > qwiklabs-ecommerce), click on the plus (+) icon next to File Browser, and choose Create View.

  10. Name the new view brand_order_facts and click Create.

  11. Click brand_order_facts.view and drag it under the views folder.

  12. Erase all the sample auto-generated code, and paste in the copied code from the Explore. Don’t forget to fix the auto-generated view name to brand_order_facts. Your view should resemble:

File browser page showing brand_order_facts.view code

  1. Click Save Changes.

Add a brand rank derived column

So now you have the foundation for your native derived table. The next task is to rank the brands, this can be achieved in most SQL dialects through what is called a ROW_NUMBER() function.

To accomplish this, you'll need to add a derived_column to your native derived tables' explore_source. In a native derived table, you can use derived_column to specify a column that does not yet exist in the Explore specified by the explore_source parameter. For this example, you will call it brand_rank.

  1. Under the column: total_revenue {} definition, start by defining the brand_rank derived column:
derived_column: brand_rank {}
  1. Next, in the curly braces, add the following sql parameter to it:
derived_column: brand_rank { sql: row_number() over (order by total_revenue desc) ;; }

Whenever you make a derived column, you also need to add a dimension for it. It is the same as when you have a column in your regular database table; the column needs to be represented in LookML as a dimension. Have you noticed the auto-generated dimensions don’t have a sql parameter? That is because, when you don’t specify the sql for a dimension, Looker assumes it should point to a column in the underlying data with exactly the same name as the dimension. That can be a useful shortcut for other areas of your project if you’d like, although generally it is a better practice to be explicit whenever possible. In this case, you should at least specify the type. If you don’t, Looker defaults to string, which you do not want here.

  1. Outside of the derived_table definition, just above the product_brand dimension, add the following code:
dimension: brand_rank { type: number }

Your new view should now resemble the following:

File browser page showing brand_order_facts.view code

  1. Click Save Changes.

  2. Then, from the same page, click on the training_ecommerce.model file inside of the model folder to modify its contents.

  3. Locate the explore: order_items definition.

  4. In the explore: order_items definition, add a new join for brand_order_facts by specifying:

join: brand_order_facts { type: left_outer sql_on: ${inventory_items.product_brand} = ${brand_order_facts.product_brand} ;; relationship: many_to_one }
  1. Click Save Changes.

  2. Your model file should now resemble the following:

File browser page showing training_ecommerce.model code

  1. Now that you've joined the brand_order_facts view to the Explore, navigate to the Explore page for Order Items.

  2. Under the Brand Order Facts view, select the Brand Rank, Product Brand, and Total Revenue dimensions.

  3. Set the Row Limit to 10.

  4. Click Run. Your result should look like the following:

Output result table displaying 10 rows of data under the column headings: Brand Order Facts Brand Rank, Brand Order Facts Product Brand, and Brand Order Facts Total Revenue

So far so good! But what if your business users want to see the brand names displayed like "1) Example Brand", and not just "Example Brand" alone. How would this be accomplished? In this case, you could create a dimension that concatenates two other dimension values.

  1. Navigate back to the brand_order_facts view.

  2. Make another dimension called brand_rank_concat, which concatenates the brand rank and the product brand:

dimension: brand_rank_concat { type: string sql: ${brand_rank} || ') ' || ${product_brand} ;; }
  1. Hide brand_rank since business users would probably just find the rank number in our new brand_rank_concat, and not want to use a separate field:
dimension: brand_rank { hidden: yes type: number }
  1. Add a label to brand_rank_concat so it appears more user-friendly. Use the label “Brand Name”:
dimension: brand_rank_concat { label: "Brand Name" type: string sql: ${brand_rank} || ') ' || ${product_brand} ;; }

For the final step, you'll need to lump all brands at rank 6 and beyond into an “Other” classification. To do this, you will first make a “stepping-stone” dimension that evaluates whether a brand’s rank is in the top 5 or not.

  1. In the same brand_order_facts view, create a new dimension named brand_rank_top_5 with the following parameters:
dimension: brand_rank_top_5 { hidden: yes type: yesno sql: ${brand_rank} <= 5 ;; }
  1. Next, create a new dimension named brand_rank_grouped and incorporate the brand_rank_top_5 into it with the following code:
dimension: brand_rank_grouped { label: "Brand Name Grouped" type: string sql: case when ${brand_rank_top_5} then ${brand_rank_concat} else '6) Other' end ;; }
  1. Click Save Changes.

Your view should now resemble the following:

File browser page showing brand_order_facts.view code

  1. Navigate back to the Explore page for Order Items.

  2. Under the Brand Order Facts view, select the Brand Name Grouped dimension.

Under the Order Items view, select the Total Revenue measure. Set the Row Limit to 10.

  1. Click Run.

  2. Make sure the Brand Name Grouped column is ordered from first to last, then under the Visualization tab, click Pie Chart.

  3. Verify your visualization resembles the following:

Output result pie chart displaying Example Brand groupings

  1. Click on the settings gear icon (settings-gear-icon.png) next to Run (top right of page), and select Save > As a Look.

  2. Title your Look: Ranked Brand Revenue.

  3. Click Save.

  4. Navigate back to the brand_order_facts view.

Commit changes and deploy to production

  1. Click Validate LookML and then click Commit Changes & Push.

  2. Add a commit message and click Commit.

  3. Lastly, click Deploy to Production.

Great! Hopefully this helps you appreciate how useful it can be to break down a use case or desired logic into separate basic dimensions, that you can then combine or build upon to answer specific business questions. It is very common in best-practice LookML development to have lots of hidden “stepping-stone” dimensions and measures like these.

Click Check my progress to verify that you've performed the above task. Customize native derived tables using derived columns

Task 2. Customize native derived tables using filters

Now, say the business only cares about recent orders made in the past 365 days. Maybe some of these top 5 brands were very popular years ago due to different trends, but the rankings could have shifted in the past one year.

In this section, you will explore the different types of filters to use for your native derived table in LookML. Filters can be used to apply filters to the derived table, similar to a filtered measure. It adds a WHERE or HAVING clause.

Add a static date filter

  1. First, navigate back to the brand_order_facts view.

  2. Under the derived_column definition, add a filter to restrict the native derived table to orders created in the past 365 days:

filters: [order_items.created_date: "365 days"]
  1. Click Save Changes. Your file should now resemble the following:

File browser page showing brand_order_facts.view code

  1. Navigate back to the Order Items Explore.

  2. Under the Brand Order Facts view, select the Brand Name Grouped.

  3. Under the Order Items view, select the Total Revenue measure.

  4. Click Run.

  5. On the Data bar, click the SQL tab to see how the filter gets used in the query.

Since you added a filter on the Ordered Items Created Date to look at only the orders made in the past 365 days, the WHERE condition is only generated in what is called the outer query. This is the default behavior for any dimension filter; you can’t tell it to go inside the common table expression for the derived table, or make the outer WHERE “trickle down” to the inner query. This is where adding a filter to the NDT itself comes in handy.

Add a bind filter

What if the business finds it too rigid to restrict the data to only orders from the past 365 days? Maybe sometimes users want to analyze the rankings from the past two years. With filters: [order_items.created_date: "365 days"], you are hard-coding the time frame.

This is where bind_filters can be a more useful parameter than simply filters. You can indicate which field from the outer Explore you want to “trickle down” to the inner query of the native derived table (the from_field) and which native derived table field it should map to (the to_field). The vast majority of the time, these two should be the same.

The bind_filters sub-parameter of explore_source passes a specific filter from the Explore query into the native derived table subquery:

  • The to_field is the field in the native derived table to which the filter is applied. The to_field must be a field from the underlying explore_source.
  • The from_field specifies the field in the Explore from which to get the filter, if the user specifies a filter at runtime.
  1. Navigate back to the brand_order_facts view.

  2. To use the bind filter, start by removing the static date filter inside of the derived table definition you created in the previous section.

  3. Next, add the following template of the bind_filters under the derived_column definition:

bind_filters: { from_field: # The field the end user interacts with via the filters area to_field: # The field which should be filtered inside the NDT }

In this case, you will want to grab the filter from_field: order_items.created_date and make it influence or apply to_field: order_items.created_date.

  1. Add the following code for the bind filter fields:
bind_filters: { from_field: order_items.created_date to_field: order_items.created_date } Note: this is equivalent to adding a templated filter to the SQL derived table. As you saw in the previous lab, a templated filter was added to a SQL derived table, so that users could choose a date and see the values get updated based on that date.
  1. Click Save Changes. Your file should resemble the following:

File browser page showing brand_order_facts.view code

  1. Navigate back to the Order Items Explore.

  2. Under the Brand Order Facts view, select the Brand Name Grouped.

  3. Under the Order Items view, select the Total Revenue measure.

  4. Also under the Order Items view, Created Date dimension, select the Date field and then select the filter button next to Date.

  5. In the filter definition, specify the filter to be: is in the past 1000 days. For demonstration purposes, you are using 1000 days to ensure that the filter is not too restrictive, and that it will capture the past 3 years.

  6. Click Run.

Filters(1) section

  1. On the Data bar, click the SQL tab to see how the filter gets used in the query. Notice the generated SQL now dynamically updates the WHERE condition within the common table expression for the derived table as well as the outer WHERE condition.

As you can see, this is a lot more flexible! If you filter by orders created in the past 3 quarters, the native derived table will calculate the rankings from the past 3 quarters accordingly. And if you filter by orders created in a particular date range, the native derived table will also use that same date range in its WHERE condition.

  1. Now, under the Users field, select Country and Age and add a filter for them; set them to Country is equal to USA and Age is greater than 21.

  2. Click Run.

Filter(s) section

  1. Finally, click the SQL tab.

Notice how the WHERE condition of the derived table is unaffected. What if the business users have other criteria besides Ordered Items Created Date? What if they only want to see the rankings for orders made by customers in the USA, or by male customers?

You could certainly keep adding bind_filters, but look how many fields you have in the Order Items Explore. It would take forever to add bind_filters for all of them. This is where yet another parameter can be extremely useful: bind_all_filters.

Click Check my progress to verify that you've performed the above task. Customize native derived tables using filters

Using bind_all_filters

The easiest way to pass filters from an Explore to a native derived table subquery is to specify bind_all_filters: yes in the native derived table’s explore_source parameter. This will pass all of an Explore’s runtime filters into the native derived table subquery.

If you want to use the native derived table in a different Explore, use the bind_filters parameter instead, as described in the previous section.

  1. Start by removing the bind_filter inside of the derived table definition you created in the previous section.

  2. Add the bind_all_filters: yes definition under the derived_column definition to not only bind the order_created_date to itself, but every filter to itself:

bind_all_filters: yes
  1. Click Save Changes. Your file should resemble the following:

File browser page showing brand_order_facts.view code

  1. Navigate back to the Order Items Explore.

  2. Under the Brand Order Facts view, select the Brand Name Grouped.

  3. Under the Order Items view, select the Total Revenue measure.

  4. Also under the Order Items view, find the Created Date dimension, and click the filter button next to Date.

  5. In the filter definition, specify the filter to be: is in the past 365 days.

  6. Under the Users view, add a filter on the Country and Age; set them to Country is equal to USA and Age is greater than 21.

  7. Click Run.

  8. Click the SQL tab. Notice how the WHERE condition of the derived table now dynamically updates!

Although bind_all_filters is great, it only works when you have joined the native derived table to its explore_source. In other words, you are only able to use this here because you joined brand_order_facts back to the same Explore as the explore_source, order_items.

Why? Because bind_all_filters means Looker needs to know how to generate a WHERE condition for any field in the whole Explore. If your native derived table is using an explore_source of order_items but you join it to a different Explore, that other Explore could have any number of joined views and fields that don’t exist in order_items, and therefore wouldn’t make any sense in the world of order_items. Looker would not know how to filter the derived table with those other fields.

Now that you have seen the bind_all_filters in action, play around with a couple of different Explore filters and see how they affect the way the native derived table compiles.

  1. Navigate back to the brand_order_facts view.

Commit changes and deploy to production

  1. Click Validate LookML and then click Commit Changes & Push.

  2. Add a commit message and click Commit.

  3. Lastly, click Deploy to Production.

Congratulations!

In this lab, you used native derived tables to answer complex questions and address advanced use cases using derived columns, and updated them to generate dynamic values using built-in filter parameters. You also learned how business users leverage customized native derived tables to answer complex questions.

Next steps / learn more

Google Cloud training and certification

...helps you make the most of Google Cloud technologies. Our classes include technical skills and best practices to help you get up to speed quickly and continue your learning journey. We offer fundamental to advanced level training, with on-demand, live, and virtual options to suit your busy schedule. Certifications help you validate and prove your skill and expertise in Google Cloud technologies.

Manual Last Updated March 4, 2024

Lab Last Tested March 4, 2024

Copyright 2024 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.