Analyze and Reason on Multimodal Data with Gemini: Challenge Lab

Lab: 1 hour 30 minutes, 5 Credits, Intermediate
This lab may incorporate AI tools to support your learning.

GSP524

Overview

In a challenge lab you’re given a scenario and a set of tasks. Instead of following step-by-step instructions, you will use the skills learned from the labs in the course to figure out how to complete the tasks on your own! An automated scoring system (shown on this page) will provide feedback on whether you have completed your tasks correctly.

When you take a challenge lab, you will not be taught new Google Cloud concepts. You are expected to extend your learned skills, like changing default values and reading and researching error messages to fix your own mistakes.

To score 100% you must successfully complete all tasks within the time period!

This lab is recommended for students who have enrolled in the Analyze and Reason on Multimodal Data with Gemini course. Are you ready for the challenge?

Prerequisites

Before starting this lab, you should be familiar with:

  • Basic Python programming.
  • General API concepts.
  • Running Python code in a Jupyter notebook on Vertex AI Workbench.

Topics tested

In this challenge, you use the Gemini 2.0 Flash and Gemini 2.5 Flash models to:

  • Construct and execute complex multimodal prompts to analyze text, images, audio, and video data.
  • Extract structured information (e.g., sentiment scores, key themes, object detection, audio characteristics, action recognition) from multimodal data.
  • Synthesize insights from multiple data modalities to draw meaningful conclusions and provide actionable recommendations.
  • Format the output of the models into a structured Markdown report for effective communication of findings.

Setup and requirements

Before you click the Start Lab button

Read these instructions. Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources are made available to you.

This hands-on lab lets you do the lab activities in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials you use to sign in and access Google Cloud for the duration of the lab.

To complete this lab, you need:

  • Access to a standard internet browser (Chrome browser recommended).
Note: Use an Incognito (recommended) or private browser window to run this lab. This prevents conflicts between your personal account and the student account, which could cause extra charges to be incurred on your personal account.
  • Time to complete the lab—remember, once you start, you cannot pause a lab.
Note: Use only the student account for this lab. If you use a different Google Cloud account, you may incur charges to that account.

Challenge scenario

Cymbal Direct: Analyzing social media engagement for a new product launch

Cymbal Direct just launched a new line of athletic apparel designed for enhanced performance during various activities. To gauge public perception and potential market impact, Cymbal Direct is tasked with analyzing social media engagement across multiple platforms. This analysis involves:

  • Text: Analyzing customer reviews and social media posts for sentiment and key themes.
  • Image: Analyzing images posted by influencers and customers wearing the apparel to identify style trends and usage patterns.
  • Audio: Analyzing an audio clip from a podcast episode featuring a recent interview about Cymbal Direct's new product launch.

The goal is to provide Cymbal Direct with actionable insights to refine their marketing strategy, improve their products, and bolster product positioning. Are you ready for the challenge?

Task 1. Open the notebook in Vertex AI Workbench

  1. In the Google Cloud console, on the Navigation menu (Navigation menu icon), click Vertex AI > Workbench.

  2. Find the instance and click on the Open JupyterLab button.

The JupyterLab interface for your Workbench instance opens in a new browser tab.

Note: If you do not see notebooks in JupyterLab, please follow these additional steps to reset the instance:

1. Close the browser tab for JupyterLab, and return to the Workbench home page.

2. Select the checkbox next to the instance name, and click Reset.

3. After the Open JupyterLab button is enabled again, wait one minute, and then click Open JupyterLab.

  3. Click on the file.

  4. In the Select Kernel dialog, choose Python 3 from the list of available kernels.

  5. Complete Task 1 in the notebook to import the libraries and install the Gen AI SDK.

Once you have completed Task 1 and set up your environment, you are ready to move on to the next sections.

For the following tasks, you need to complete the missing parts of each cell to progress to the next section. The missing parts are denoted with TODO and an instruction to complete.

Click Check my progress to verify the objective. Import the required libraries and set up the Gen AI SDK.
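As a rough orientation for Task 1, setup usually amounts to installing the `google-genai` package and creating a client that targets Vertex AI. The sketch below is an assumption about what the notebook expects, not its actual contents: the default location and the helper names `client_settings` and `make_client` are placeholders chosen for illustration.

```python
import os


def client_settings(project_id=None, location="us-central1"):
    """Collect keyword arguments for a Vertex AI-backed Gen AI client.

    Falls back to the GOOGLE_CLOUD_PROJECT environment variable when no
    project ID is passed. The default location is an assumption.
    """
    project = project_id or os.environ.get("GOOGLE_CLOUD_PROJECT")
    if not project:
        raise ValueError("Pass a project ID or set GOOGLE_CLOUD_PROJECT.")
    return {"vertexai": True, "project": project, "location": location}


def make_client(project_id=None, location="us-central1"):
    """Create the client. Requires `pip install google-genai` beforehand."""
    # Imported lazily so client_settings() can be used without the SDK installed.
    from google import genai

    return genai.Client(**client_settings(project_id, location))
```

In a notebook cell you would then call `client = make_client()` once and reuse that client in the later tasks.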

Task 2. Analyze and reason on customer feedback (text)

In this task, you gather information about Cymbal Direct's new athletic apparel using the Gemini 2.0 Flash and Gemini 2.5 Flash models to analyze customer reviews and social media posts in text format. You then save the findings from the model into a Markdown file to use for a comprehensive report in the last task.

Note: Your tasks are labeled with a #TODO section in the cell. Read each cell carefully and ensure you are filling them out correctly! Check your progress to be sure you have completed the cells correctly.

In the notebook, use the cells in the Task 2. Analyze and reason on customer feedback (text) section for this task.
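Each analysis task ends by saving the model's findings to a Markdown file for the final report. A minimal sketch of that step, assuming a hypothetical helper `save_findings` and a file name of your choosing (the notebook may prescribe its own):

```python
from pathlib import Path


def save_findings(title, body, path):
    """Write a titled Markdown section to `path` and return the path."""
    path = Path(path)
    # A level-1 heading followed by the model output, trimmed of stray whitespace.
    path.write_text(f"# {title}\n\n{body.strip()}\n", encoding="utf-8")
    return path
```

For example, `save_findings("Text analysis", response_text, "text_analysis.md")` would store one task's output, where `response_text` stands for whatever the model returned.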

Initial analysis with Gemini 2.0 Flash

In the Initial Analysis with Gemini 2.0 Flash section of the notebook:

  1. In the notebook cell, under 3. Construct the prompt for Gemini, fill in the TODOs to construct a prompt that instructs the Gemini model to analyze the customer reviews and social media posts.

  2. In the notebook cell, under 4. Send the prompt to Gemini, fill in the TODOs to send the prompt and text data to the Gemini model.

Click Check my progress to verify the objective. Initial Analysis with Gemini 2.0 Flash.
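The two steps above (construct a prompt, then send it) could look roughly like the following. This is an illustrative sketch, not the notebook's solution: the prompt wording, the function names, and the model ID `gemini-2.0-flash` are assumptions, and `client` is expected to be a Gen AI SDK client created during setup.

```python
def build_text_analysis_prompt(reviews):
    """Build one prompt that asks for sentiment and key themes."""
    numbered = "\n".join(
        f"{i}. {text}" for i, text in enumerate(reviews, start=1)
    )
    return (
        "You are analyzing customer feedback for Cymbal Direct's new "
        "athletic apparel line.\n"
        "For the reviews and social media posts below:\n"
        "- Rate the sentiment of each item as positive, negative, or neutral.\n"
        "- List the key themes that recur across items.\n"
        "- Format the answer as Markdown.\n\n"
        f"Feedback:\n{numbered}"
    )


def analyze_text(client, reviews, model="gemini-2.0-flash"):
    """Send the prompt to the model and return the response text."""
    response = client.models.generate_content(
        model=model,
        contents=build_text_analysis_prompt(reviews),
    )
    return response.text
```

Keeping prompt construction separate from the API call makes it easy to inspect the prompt before spending a request on it.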

Deep dive with Gemini 2.5 Flash: Reasoning on customer sentiment

In this section, you use the Thinking model to delve deeper into customer sentiment and identify key areas for improvement. Of particular interest are the reasoning behind positive and negative reviews and any recurring themes that might not be immediately apparent.

You also configure a thinking budget, which gives the model guidance on the number of thinking tokens it can use when generating a response. A greater number of tokens is typically associated with more detailed thinking, which is needed for solving more complex tasks. Setting the budget to 0 turns off thinking, so the model behaves as a non-thinking model suited to simpler tasks.

  1. In the notebook, in the Deep Dive with Gemini-2.5-flash: Reasoning on Customer Sentiment section, under 1. Construct the prompt for Gemini, fill in the TODOs to construct a prompt that instructs the Gemini model to analyze the customer reviews and social media posts in more detail.

  2. Under 2. Send the prompt to the Gemini Thinking model, fill in the TODOs to send the prompt and text data to the Gemini model.

Click Check my progress to verify the objective. Deep dive with Gemini 2.5 Flash: Reasoning on customer sentiment.
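To make the thinking budget concrete, here is a hedged sketch of passing one through the Gen AI SDK's generation config. It uses the plain-dictionary form of the config; the SDK also offers typed equivalents (`types.GenerateContentConfig` and `types.ThinkingConfig`). The helper names, the default budget of 1024 tokens, and the model ID `gemini-2.5-flash` are assumptions for illustration.

```python
def thinking_config(budget_tokens):
    """Return a generation config dict carrying the thinking budget.

    A budget of 0 turns thinking off, as described in this section.
    """
    if not isinstance(budget_tokens, int) or budget_tokens < 0:
        raise ValueError("budget_tokens must be a non-negative integer")
    return {"thinking_config": {"thinking_budget": budget_tokens}}


def deep_dive(client, prompt, budget_tokens=1024, model="gemini-2.5-flash"):
    """Send a prompt to the thinking model with a capped thinking budget."""
    response = client.models.generate_content(
        model=model,
        contents=prompt,
        config=thinking_config(budget_tokens),
    )
    return response.text
```

Trying the same prompt with a budget of 0 and then with a larger value is a quick way to see how much the extra reasoning changes the answer.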

Task 3. Analyze and reason on visual content: Style trends and customer behavior

In this task, you use the Gemini 2.0 Flash and Gemini 2.5 Flash models to analyze images related to Cymbal Direct's new athletic apparel line. The goal is to identify style trends and customer behavior based on the images. You save the findings from the model into a Markdown file to use for a comprehensive report in the last task.

Note: Your tasks are labeled with a #TODO section in the cell. Read each cell carefully and ensure you are filling them out correctly! Check your progress on this page to ensure you have completed the cells correctly.

In the notebook, use the cells in the Task 3. Analyze and reason on visual content: Style trends and customer behavior section for this task.

Initial Analysis with Gemini 2.0 Flash

  1. In the notebook, in the Initial Analysis with Gemini 2.0 Flash section, under 3. Construct the prompt for Gemini, fill in the TODOs to construct a prompt that instructs the Gemini model to analyze the images of Cymbal Direct's new athletic apparel line.

  2. Under 4. Send the prompt and images to Gemini, fill in the TODOs to send the prompt and images to the Gemini model.

Click Check my progress to verify the objective. Initial Analysis with Gemini 2.0 Flash.
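Sending images alongside a prompt generally means wrapping each image in a `Part` and passing the prompt and parts together as `contents`. The sketch below assumes the images live in Cloud Storage and are referenced by `gs://` URIs, which may not match how the notebook loads them; `image_part_specs` and `analyze_images` are hypothetical helper names.

```python
import mimetypes


def image_part_specs(image_uris):
    """Pair each image URI with a MIME type guessed from its extension."""
    specs = []
    for uri in image_uris:
        mime_type, _ = mimetypes.guess_type(uri)
        if mime_type is None or not mime_type.startswith("image/"):
            raise ValueError(f"Not a recognized image file: {uri}")
        specs.append({"file_uri": uri, "mime_type": mime_type})
    return specs


def analyze_images(client, prompt, image_uris, model="gemini-2.0-flash"):
    """Send the prompt plus all images in a single request."""
    # Imported lazily so image_part_specs() works without the SDK installed.
    from google.genai import types

    parts = [types.Part.from_uri(**spec) for spec in image_part_specs(image_uris)]
    response = client.models.generate_content(
        model=model,
        contents=[prompt, *parts],
    )
    return response.text
```

If the images are local files instead, `types.Part.from_bytes(data=..., mime_type=...)` is the corresponding way to build each part.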

Reasoning on image trends with Gemini 2.5 Flash

You'll now use the Thinking model to perform a more in-depth analysis of the visual elements, inferring context, target audience, and potential marketing implications.

  1. In the notebook, in the Reasoning on image trends with Gemini-2.5-flash section, under 1. Construct the prompt for Gemini, fill in the TODOs to construct a prompt that instructs the Gemini model to analyze the images in more detail.

  2. Under 2. Send the prompt and images to the Gemini Thinking model, fill in the TODOs to send the prompt and images to the Gemini model.

Click Check my progress to verify the objective. Reasoning on Image Trends with Gemini 2.5 Flash.

Task 4. Analyze and reason on audio content: Customer perceptions

In this task, you use the Gemini 2.0 Flash and Gemini 2.5 Flash models to analyze a podcast about Cymbal Direct's new clothing line. You extract information and sentiment from it and use them to generate insights for the company. You then save the findings from the model into a Markdown file to use for a comprehensive report in the last task.

This audio clip is from a podcast episode featuring an interview with a Cymbal Direct representative discussing the new athletic apparel line. The conversation covers various aspects of the apparel, including design, features, target audience, and marketing strategy.

Note: Your tasks are labeled with a #TODO section in the cell. Read each cell carefully and ensure you are filling them out correctly! Check your progress to ensure you have completed the cells correctly.

In the notebook, use the cells in the Task 4. Analyze and reason on audio content: Customer perceptions section for this task.

Initial analysis with Gemini 2.0 Flash

  1. In the notebook, in the Initial analysis with Gemini 2.0 Flash section, under 1. Construct the prompt for Gemini, fill in the TODOs to construct a prompt that instructs the Gemini model to analyze the audio recording of the conversation about Cymbal Direct's new athletic apparel line.

  2. Under 2. Send the prompt and audio to Gemini, fill in the TODOs to send the prompt and audio data to the Gemini model.

Click Check my progress to verify the objective. Initial Analysis with Gemini 2.0 Flash.
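One way to pull structured information out of the audio is to ask the model for JSON and parse its reply. The following is a sketch under stated assumptions: the prompt text, the JSON keys, the `audio/mpeg` MIME type, and the helper names are all hypothetical, and the notebook may ask for Markdown output instead.

```python
import json

FENCE = "`" * 3  # Markdown code fence that models sometimes wrap JSON in

AUDIO_PROMPT = (
    "Listen to this podcast interview about Cymbal Direct's new athletic "
    "apparel line. Reply with a JSON object that has the keys "
    '"overall_sentiment", "topics", and "notable_quotes".'
)


def parse_json_reply(reply_text):
    """Parse a JSON object from a model reply, tolerating a code fence."""
    text = reply_text.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line and everything from the closing fence on.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    return json.loads(text)


def analyze_audio(client, audio_uri, mime_type="audio/mpeg",
                  model="gemini-2.0-flash"):
    """Send the prompt and the audio file, then parse the JSON reply."""
    from google.genai import types  # lazy import; needs google-genai installed

    audio = types.Part.from_uri(file_uri=audio_uri, mime_type=mime_type)
    response = client.models.generate_content(
        model=model,
        contents=[AUDIO_PROMPT, audio],
    )
    return parse_json_reply(response.text)
```

Parsing defensively matters here because a model may or may not wrap its JSON in a fenced block, depending on the prompt.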

Reasoning on audio insights with Gemini 2.5 Flash

In this section, you use the Thinking model to analyze the conversation at a deeper level, reason about customer satisfaction, deduce influencing factors, and generate data-driven recommendations.

  1. In the notebook, in the Reasoning on Audio Insights with Gemini 2.5 Flash section, under 1. Construct the prompt for Gemini, fill in the TODOs to construct a prompt that instructs the Gemini model to analyze the audio recording in more detail.

  2. Under 2. Send the prompt and audio to the Gemini Thinking model, fill in the TODOs to send the prompt and audio data to the Gemini model.

Click Check my progress to verify the objective. Reasoning on Audio Insights with Gemini 2.5 Flash.

Task 5. Synthesize multimodal insights: Generate a comprehensive report

In this final task, you synthesize the insights gained from your previous analyses of text, images, and audio data. You use the Gemini 2.5 Flash model to generate a comprehensive report that consolidates the findings from each modality, providing a holistic view of customer sentiment, style preferences, and key trends related to Cymbal Direct's new athletic apparel line.

You then save the final report generated by the model into a Markdown file, which you upload to Cloud Storage for review and evaluation. The report gives Cymbal Direct a consolidated view of customer perceptions and market trends on which to base decisions about its strategy.

Note: Your tasks are labeled with a #TODO section in the cell. Read each cell carefully and ensure you are filling them out correctly! Check your progress on this page to ensure you have completed the cells correctly.

  1. In the notebook, in the Task 5. Synthesize multimodal insights: Generate a comprehensive report section, under 3. Construct the prompt for Gemini, fill in the TODOs to construct a prompt to instruct the Gemini model to generate a comprehensive report based on the combined analysis results.

  2. Under 4. Send the prompt to Gemini, fill in the TODOs to send the prompt to the Gemini model.

Click Check my progress to verify the objective. Synthesize multimodal insights: generate a comprehensive report.
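The synthesis step can be pictured as stitching the saved per-modality findings into one prompt, then uploading the resulting report. The sketch below is illustrative only: the prompt wording, section headings, and the names `build_report_prompt` and `upload_report` are assumptions, and the bucket and object names must come from your lab environment.

```python
def build_report_prompt(findings):
    """Combine per-modality findings into one synthesis prompt.

    `findings` maps a modality name (for example "Text") to the Markdown
    analysis saved in the earlier tasks.
    """
    sections = "\n\n".join(
        f"## {name} findings\n{text.strip()}" for name, text in findings.items()
    )
    return (
        "Using the analyses below, write a comprehensive Markdown report for "
        "Cymbal Direct covering customer sentiment, style preferences, key "
        "trends, and actionable recommendations.\n\n" + sections
    )


def upload_report(local_path, bucket_name, blob_name):
    """Upload the report file to Cloud Storage and return its gs:// URI."""
    # Lazy import; needs the google-cloud-storage package installed.
    from google.cloud import storage

    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    blob.upload_from_filename(local_path, content_type="text/markdown")
    return f"gs://{bucket_name}/{blob_name}"
```

The prompt returned by `build_report_prompt` would be sent to the Gemini 2.5 Flash model in the same way as the earlier prompts.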

Congratulations!

Congratulations! In this lab, you have successfully utilized the Gemini 2.0 Flash and Gemini 2.5 Flash models to analyze multimodal data, including text, images, and audio, to gain valuable insights for Cymbal Direct's new athletic apparel line. You have demonstrated proficiency in constructing effective prompts, leveraging the reasoning and thinking budget capabilities of the Thinking model, and generating a comprehensive report with actionable recommendations.

Next steps / learn more

Check out the following resources to learn more about Gemini:

Google Cloud training and certification

...helps you make the most of Google Cloud technologies. Our classes include technical skills and best practices to help you get up to speed quickly and continue your learning journey. We offer fundamental to advanced level training, with on-demand, live, and virtual options to suit your busy schedule. Certifications help you validate and prove your skill and expertise in Google Cloud technologies.

Manual Last Updated July 11, 2025

Lab Last Tested July 11, 2025

Copyright 2025 Google LLC. All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.
