One of the best tools for improving the quality of responses from large language models (LLMs) is retrieval augmented generation (RAG). RAG is the pattern of retrieving relevant data, often data that is not public, and adding that data to the prompt sent to the LLM. Because the prompt includes this data, the LLM can generate more accurate responses.
You'll use AlloyDB, Google Cloud's scalable and performant PostgreSQL-compatible database, to store and search by a special kind of vector data called vector embeddings. Vector embeddings can be retrieved using a semantic search, which allows retrieval of the available data that is the best match for a user's natural language query. The retrieved data is then passed to the LLM in the prompt.
You'll also use Vertex AI, Google Cloud's fully-managed, unified AI development platform for building and using generative AI. Your application uses Gemini Pro, a multimodal foundation model that supports adding image, audio, video, and PDF files in text or chat prompts and supports long-context understanding.
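The RAG pattern described above can be sketched in a few lines. This is a minimal illustration, not the lab application's code: the function name `build_augmented_prompt`, the prompt wording, and the sample row are all assumptions made up for this example. It only shows the "augment" step, where retrieved rows are placed into the prompt ahead of the user's question.

```python
def build_augmented_prompt(question: str, retrieved_rows: list[str]) -> str:
    """Combine retrieved context rows and the user's question into one prompt."""
    context = "\n".join(f"- {row}" for row in retrieved_rows)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


# Hypothetical row that a retrieval step might return (not real lab data).
rows = ["Example Coffee Bar: espresso and pastries, near gate A6"]
prompt = build_augmented_prompt("Where can I get coffee near gate A6?", rows)
print(prompt)
```

In a real application, the resulting string would be sent to the model, and the rows would come from a database query rather than a hard-coded list.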
What you will learn
In this lab, you'll learn:
How RAG enhances LLM capabilities by retrieving relevant information from a knowledge base.
How AlloyDB can be used to find relevant information using semantic search.
How you can use Vertex AI and Google's foundation models to provide powerful generative AI capabilities to applications.
Setup and requirements
Before you click the Start Lab button
Note: Read these instructions.
Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources will be made available to you.
This Qwiklabs hands-on lab lets you do the lab activities yourself in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials that you use to sign in and access Google Cloud for the duration of the lab.
What you need
To complete this lab, you need:
Access to a standard internet browser (Chrome browser recommended).
Time to complete the lab.
Note: If you already have your own personal Google Cloud account or project, do not use it for this lab.
Note: If you are using a Pixelbook, open an Incognito window to run this lab.
How to start your lab and sign in to the Console
Click the Start Lab button. If you need to pay for the lab, a pop-up opens for you to select your payment method.
On the left is a panel populated with the temporary credentials that you must use for this lab.
Copy the username, and then click Open Google Console.
The lab spins up resources, and then opens another tab that shows the Choose an account page.
Note: Open the tabs in separate windows, side-by-side.
On the Choose an account page, click Use Another Account. The Sign in page opens.
Paste the username that you copied from the Connection Details panel. Then copy and paste the password.
Note: You must use the credentials from the Connection Details panel. Do not use your Google Cloud Skills Boost credentials. If you have your own Google Cloud account, do not use it for this lab, so that you avoid incurring charges.
Click through the subsequent pages:
Accept the terms and conditions.
Do not add recovery options or two-factor authentication (because this is a temporary account).
Do not sign up for free trials.
After a few moments, the Cloud console opens in this tab.
Note: You can view the menu with a list of Google Cloud Products and Services by clicking the Navigation menu at the top-left.
Activate Google Cloud Shell
Google Cloud Shell is a virtual machine that is loaded with development tools. It offers a persistent 5GB home directory and runs on Google Cloud.
Google Cloud Shell provides command-line access to your Google Cloud resources.
In Cloud console, on the top right toolbar, click the Open Cloud Shell button.
Click Continue.
It takes a few moments to provision and connect to the environment. When you are connected, you are already authenticated, and the project is set to your PROJECT_ID. For example:
gcloud is the command-line tool for Google Cloud. It comes pre-installed on Cloud Shell and supports tab-completion.
You can list the active account name with this command:
When the installation is complete, you are left in the virtual Python environment, with a (.venv) prompt.
If the VM SSH session ever times out or the tab is closed, you can SSH into the VM again and use the command source ~/.venv/bin/activate to restart the virtual Python environment.
To confirm the Python version, run the following command:
cd ~/genai-databases-retrieval-app
cat retrieval_service/models/models.py
The Python data models are shown here. The model includes airports, flights, amenities within the terminals, policies, and tickets.
To see an example of the airport data, run the following commands:
head -1 data/airport_dataset.csv; grep SFO data/airport_dataset.csv
These commands show the CSV header that specifies the column names for the airport dataset followed by the row for the San Francisco International airport (SFO). The data in the airport model can be retrieved based on the International Air Transport Association (IATA) code, or by country, city, and airport name. You can use keyword search to find rows in this table, so there are no vector embeddings for this data.
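As a rough illustration of keyword-style retrieval, the lookup by IATA code can be sketched with Python's `csv` module. The column names and the two sample rows below are assumptions for this example only; the real `airport_dataset.csv` defines its own header, which the `head -1` command displays.

```python
import csv
import io

# Hypothetical sample data; the real dataset has its own columns and rows.
SAMPLE_CSV = """iata,name,city,country
SFO,San Francisco International Airport,San Francisco,United States
DFW,Dallas/Fort Worth International Airport,Dallas,United States
"""


def find_by_iata(csv_text: str, code: str) -> list[dict]:
    """Return every row whose 'iata' column exactly matches the given code."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["iata"] == code.upper()]


matches = find_by_iata(SAMPLE_CSV, "sfo")
print(matches)
```

An exact match like this needs no embeddings, which is why the airport data can be searched by keyword alone.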
To see an example of the flight data, run the following commands:
head -1 data/flights_dataset.csv; grep -m10 "SFO" data/flights_dataset.csv
These commands show the CSV header that specifies the column names for the flights dataset followed by the first 10 rows of flights to or from SFO. The data in the flights model can be retrieved based on the airline and flight number, or by the departure and arrival airport codes.
To see an example of the amenities data, run the following command:
head -2 data/amenity_dataset.csv
This command shows the CSV header that specifies the column names for the amenities dataset followed by the first amenity.
You'll notice that the first amenity has several simple values, including name, description, location, terminal, category, and business hours. The next value is content, which incorporates the name, description, and location. The last value is embedding, the vector embedding for the row.
The embedding is an array of 768 numbers that is used when performing a semantic search. These embeddings are calculated using an AI model provided by Vertex AI. When a user provides a query, a vector embedding can be created from the query, and rows whose vector embeddings are close to the query's embedding can be retrieved.
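The idea of retrieving rows whose embeddings are close to a query embedding can be sketched with cosine similarity, one common closeness measure. This is a toy illustration under stated assumptions: the vectors have 3 dimensions instead of 768, the amenity names and numbers are invented, and in the lab the comparison runs inside the database rather than in Python.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; values near 1.0 mean similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented 3-dimensional embeddings standing in for real 768-dimensional ones.
amenities = {
    "coffee shop": [0.9, 0.1, 0.0],
    "luxury boutique": [0.1, 0.9, 0.1],
    "currency exchange": [0.0, 0.2, 0.9],
}
query_embedding = [0.8, 0.2, 0.1]  # pretend embedding of a coffee-related query


def nearest(query: list[float], rows: dict[str, list[float]], k: int = 1) -> list[str]:
    """Return the names of the k rows most similar to the query embedding."""
    ranked = sorted(
        rows, key=lambda name: cosine_similarity(query, rows[name]), reverse=True
    )
    return ranked[:k]


best = nearest(query_embedding, amenities)
print(best)
```

The query never has to share a keyword with the row it matches; only the direction of the two vectors matters.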
The policy data also uses vector embeddings in a similar fashion.
Note: The calculation of embeddings takes a while, so the embeddings have already been provided. The run_generate_embeddings.py script can be examined to see how embeddings are generated.
To create a database configuration file, run the following commands:
export PGUSER={{{project_0.startup_script.gcp_alloydb_user | PG_USER}}}
export PGPASSWORD={{{project_0.startup_script.gcp_alloydb_password | PG_PASSWORD}}}
export PROJECT_ID=$(gcloud config get-value project)
export REGION={{{project_0.default_region | REGION }}}
export ADBCLUSTER={{{project_0.startup_script.gcp_alloydb_cluster_name | CLUSTER}}}
export INSTANCE_IP=$(gcloud alloydb instances describe $ADBCLUSTER-pr --cluster=$ADBCLUSTER --region=$REGION --format="value(ipAddress)")
cd ~/genai-databases-retrieval-app/retrieval_service
cp example-config.yml config.yml
sed -i s/127.0.0.1/$INSTANCE_IP/g config.yml
sed -i s/my-user/$PGUSER/g config.yml
sed -i s/my-password/$PGPASSWORD/g config.yml
sed -i s/my_database/assistantdemo/g config.yml
cat config.yml
The config file config.yml is created with the instance IP address, username, password, and database updated. Your configuration file should now resemble this:
The first command adds all required packages to the Python virtual environment and the second command populates the database with the data.
Task 5. Create a service account for the retrieval service
In this task, you create a service account for the retrieval service.
The retrieval service is responsible for extracting relevant information from the database in response to requests from an AI application. The service account you create here is used as the identity of that Cloud Run service.
Create service account
The SSH user on the VM does not have the project-level permission needed to grant the service account the correct role, so you create the service account using a new Cloud Shell tab.
In Cloud Shell, to open a new Cloud Shell tab, click Open a new tab (+).
To create a service account and grant it the necessary privileges, in the new tab, run the following commands:
This service account is granted the role roles/aiplatform.user, which allows the service to call Vertex AI.
To close the new tab, run the following command:
exit
Task 6. Deploy the retrieval service to Cloud Run
In this task, you deploy the retrieval service to Cloud Run.
To deploy the retrieval service, in the VM SSH Cloud Shell tab, run the following commands:
export REGION={{{project_0.default_region | REGION }}}
cd ~/genai-databases-retrieval-app
gcloud alpha run deploy retrieval-service \
    --source=./retrieval_service/ \
    --no-allow-unauthenticated \
    --service-account retrieval-identity \
    --region $REGION \
    --network=default \
    --quiet
Wait a few minutes until the deployment completes.
To verify the service, run the following command:
curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" $(gcloud run services list --filter="(retrieval-service)" --format="value(URL)")
If you see the "Hello World" message, the service is up and serving requests.
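Because the service is deployed with --no-allow-unauthenticated, each request must carry an identity token in the Authorization header, as the curl command does. A minimal Python sketch of building such a request follows. The URL and token are placeholders, and the helper name `build_authorized_request` is an assumption for this example; the request is constructed but deliberately not sent.

```python
import urllib.request


def build_authorized_request(service_url: str, identity_token: str) -> urllib.request.Request:
    """Create a GET request that carries the identity token as a Bearer token."""
    request = urllib.request.Request(service_url)
    request.add_header("Authorization", f"Bearer {identity_token}")
    return request


# Placeholder values; in the lab, the token comes from
# `gcloud auth print-identity-token` and the URL from `gcloud run services list`.
request = build_authorized_request(
    "https://retrieval-service.example.com", "PLACEHOLDER_TOKEN"
)
print(request.get_header("Authorization"))
```

Passing the request to `urllib.request.urlopen` would send it; a request without a valid token would be rejected by Cloud Run before it reaches the service.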
Task 7. Register the OAuth consent screen
In this task, you register the OAuth consent screen that is presented to users who are logging in.
When you use OAuth 2.0 for authorization, Google displays a consent screen to capture the user's consent to share data with the application.
In the Google Cloud console, select the Navigation menu, and then select APIs & Services > OAuth consent screen.
Click Get Started.
For App name, enter Cymbal Air.
Click User support email, then click the student email, and then click Next.
For Audience, select Internal, and then click Next.
Users with access to the project should be able to log in to the app.
On the left panel of the lab instructions, copy the Username.
For Contact information, paste the copied username.
Click Next.
Select the checkbox to agree to the User Data Policy, then click Continue, and then click Create.
The consent screen is now set up.
Task 8. Create a client ID for the application
In this task, you create a client ID for the application.
The application requires a client ID to use Google's OAuth service. You configure the allowed origins that can make this request, and a redirect URI where the web app is redirected after the user has consented to log in.
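An origin is the scheme, host, and port of a URL, and a redirect URI is a full URL under that origin. The relationship can be sketched as follows. The preview URL and the callback path here are hypothetical values chosen for illustration; the lab derives the actual values in Cloud Shell.

```python
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Return scheme://host[:port] for a URL, dropping path, query, and fragment."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def redirect_uri(url: str, callback_path: str) -> str:
    """Build a redirect URI by appending a callback path to the URL's origin."""
    return origin_of(url) + callback_path


# Hypothetical values, not the lab's real URL or callback path.
preview_url = "https://8080-example.cloudshell.dev/?authuser=0"
print(origin_of(preview_url))
print(redirect_uri(preview_url, "/hypothetical/callback"))
```

Both values must match what is registered for the client ID exactly, so deriving them from one URL helps keep them consistent.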
In the Google Cloud console, select the Navigation menu, and then select APIs & Services > Credentials.
Click + Create Credentials, and then click OAuth client ID.
A client ID is used to identify a single app to Google's OAuth servers.
For Application type, select Web application.
For Name, enter Cymbal Air.
You can generate the JavaScript origin and redirect URI using Cloud Shell.
In Cloud Shell, to open a new Cloud Shell tab, click Open a new tab (+).
To get the origin and redirect URI, in the new tab, run the following commands:
In this task, you run a sample chat application that uses the retrieval service.
Run the application
To install the Python requirements for the chat application, in the VM SSH Cloud Shell tab, run the following commands:
cd ~/genai-databases-retrieval-app/llm_demo
pip install -r requirements.txt
Before starting the application, you need to set up some environment variables. The basic functionality of the application, including querying flights and returning airport amenities, requires an environment variable named BASE_URL to contain the base URL of the retrieval service.
To specify the base URL of the retrieval service, run the following commands:
export BASE_URL=$(gcloud run services list --filter="(retrieval-service)" --format="value(URL)")
echo $BASE_URL
The base URL is used by the local application to access the retrieval service.
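How an application might consume BASE_URL can be sketched as below. This is an assumption about typical usage, not the demo application's actual code: the helper name `retrieval_endpoint` and the path `/hypothetical/search` are invented for illustration, and the environment is passed in as a mapping so the example runs without changing real environment variables.

```python
import os
from collections.abc import Mapping
from urllib.parse import urljoin


def retrieval_endpoint(path: str, env: Mapping[str, str] = os.environ) -> str:
    """Join the configured BASE_URL with an endpoint path."""
    base_url = env.get("BASE_URL")
    if not base_url:
        raise RuntimeError("BASE_URL is not set")
    # Normalize slashes so exactly one separates the base URL from the path.
    return urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))


# Placeholder environment; in the lab, BASE_URL holds the Cloud Run service URL.
example_env = {"BASE_URL": "https://retrieval-service.example.com"}
url = retrieval_endpoint("/hypothetical/search", example_env)
print(url)
```

Failing fast when the variable is missing gives a clearer error than a request to a malformed URL.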
To run the application, run the following command:
python run_app.py
Your response should look similar to this:
(.venv) student-03-b2f40c6c89d6@app-vm:~/genai-databases-retrieval-app/llm_demo$ python run_app.py
INFO: Started server process [32894]
INFO: Waiting for application startup.
Loading application...
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8081 (Press CTRL+C to quit)
The application is now running.
Connect to the VM
You have several ways to connect to the application running on the VM. For example, you can open port 8081 on the VM using firewall rules in the VPC, or create a load balancer with a public IP. Here you use an SSH tunnel to the VM, forwarding the Cloud Shell port 8080 to the VM port 8081.
In Cloud Shell, to open a new Cloud Shell tab, click Open a new tab (+).
To create an SSH tunnel to the VM port, in the new tab, run the following command:
The gcloud command connects port 8080 in Cloud Shell with port 8081 on the VM. You can ignore the error "Cannot assign requested address."
To run the application in the web browser, click Web Preview, and then select Preview on port 8080.
A new tab is opened in the browser, and the application is running. The Cymbal Air application prompts "Welcome to Cymbal Air! How may I assist you?"
Enter the following query:
When is the next flight to Dallas?
The application responds with the next flight from SFO to Dallas/Fort Worth.
Enter the following query:
Which restaurants are near the departure gate?
The application understands the context, and responds with restaurants near the departure gate in SFO.
Task 10. Log in to the application (optional)
In this task, you log in to the application to book the flight.
Click Sign in.
A pop-up window opens.
In the pop-up window, select the student account.
The student account is logged in.
If you are asked to confirm that you want to sign in as the student, click Confirm.
Enter the following query:
Please book that flight.
The application presents the flight that was being discussed.
Click Looks good to me. Book it.
The flight is booked.
Enter the following query:
Which flights have I booked?
The flight you just booked is shown.
The chat app can help answer user questions like:
When is the next flight to Miami?
Are there any luxury shops around gate D50?
Where can I get coffee near gate A6?
The application uses the latest Google foundation models to generate responses and augment them with information about flights and amenities from the operational AlloyDB database. You can read more about this demo application on the GitHub page of the project.
Congratulations!
You've successfully built a chat application that leverages large language models (LLMs) and retrieval augmented generation (RAG) to create engaging and informative conversations.
When you have completed your lab, click End Lab. Google Cloud Skills Boost removes the resources you’ve used and cleans the account for you.
You will be given an opportunity to rate the lab experience. Select the applicable number of stars, type a comment, and then click Submit.
The number of stars indicates the following:
1 star = Very dissatisfied
2 stars = Dissatisfied
3 stars = Neutral
4 stars = Satisfied
5 stars = Very satisfied
You can close the dialog box if you don't want to provide feedback.
For feedback, suggestions, or corrections, please use the Support tab.
Copyright 2024 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.
In this lab, you create a chat application that uses Retrieval Augmented Generation, or RAG, to augment prompts with data retrieved from AlloyDB.