
Before you begin
- Labs create a Google Cloud project and resources for a fixed time
- Labs have a time limit and no pause feature. If you end the lab, you'll have to restart from the beginning.
- On the top left of your screen, click Start lab to begin
This lab was developed with our partner, MongoDB. Your personal information may be shared with MongoDB, the lab sponsor, if you have opted in to receive product updates, announcements, and offers in your Account Profile.
MongoDB Atlas is a fully managed multi-cloud database service built by MongoDB. It allows you to deploy, scale, and monitor your MongoDB database in the cloud. Additionally, Atlas comes with built-in services for workload isolation, analytics, search, and more.
MongoDB Atlas Vector Search allows you to search vector data stored in your MongoDB database. By creating an Atlas Vector Search index on your collection, you can perform vector search queries on the indexed fields. This integration enables you to store vector data alongside your other MongoDB data within the same database or even the same collection, eliminating the need to manage separate storage systems for your vector and operational data.
This hands-on lab guides you through the process of creating a chat assistant using Gemini 2.0 Flash, Langchain, Node.js, and Angular. You'll explore the limitations of out-of-context prompts and how to overcome them by implementing Retrieval Augmented Generation (RAG) with MongoDB Atlas Vector Search.
In this lab, you perform the following tasks:
Read these instructions. Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources will be made available to you.
This Qwiklabs hands-on lab lets you do the lab activities yourself in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials that you use to sign in and access Google Cloud for the duration of the lab.
To complete this lab, you need:
Note: If you already have your own personal Google Cloud account or project, do not use it for this lab.
Note: If you are using a Pixelbook, open an Incognito window to run this lab.
Click the Start Lab button. If you need to pay for the lab, a pop-up opens for you to select your payment method. On the left is a panel populated with the temporary credentials that you must use for this lab.
Copy the username, and then click Open Google Console. The lab spins up resources, and then opens another tab that shows the Sign in page.
Tip: Open the tabs in separate windows, side-by-side.
In the Sign in page, paste the username that you copied from the Connection Details panel. Then copy and paste the password.
Important: You must use the credentials from the Connection Details panel. Do not use your Qwiklabs credentials. If you have your own Google Cloud account, do not use it for this lab to avoid incurring charges to your account.
Click through the subsequent pages:
After a few moments, the Cloud Console opens in this tab.
Cloud Shell is a virtual machine that is loaded with development tools. It offers a persistent 5GB home directory and runs on Google Cloud. Cloud Shell provides command-line access to your Google Cloud resources.
In the Cloud Console, in the top right toolbar, click the Activate Cloud Shell button.
Click Continue.
It takes a few moments to provision and connect to the environment. When you are connected, you are already authenticated, and the project is set to your PROJECT_ID. For example:
gcloud is the command-line tool for Google Cloud. It comes pre-installed on Cloud Shell and supports tab-completion.
You can list the active account name with this command:
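```bash
gcloud auth list
```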
You can list the project ID with this command:
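```bash
gcloud config list project
```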
In the Google Cloud console, type Cloud Shell Editor in the search bar on the top and open the service from the search results. Wait for the Editor to load — it may take a minute.
Open the terminal in the Cloud Shell Editor with Ctrl + Shift + ` (backtick). Alternatively, you can click on the Open Terminal button.
Clone the GitHub repository for the lab by running the following command:
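The repository URL below is a placeholder; substitute the actual URL for this lab:

```bash
git clone https://github.com/<repository-owner>/rag-chatbot.git
cd rag-chatbot
```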
Start the application by running the following command:
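The exact command depends on how the repository's package.json is set up; a typical form would be:

```bash
npm start
```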
Wait for the application to start. The command installs the necessary Node.js and Angular packages (dependencies), builds both the server and the client applications, and runs them in development mode. This means that when you make changes, the applications will rebuild and restart automatically. Once the client is running, you'll see the following message:
Open the Angular client in the browser. Click on the web preview button on the top right in Cloud Shell. Select Change port, type 4200 in the input field, and click Change and preview.
The chat assistant interface will open in a new tab. Type a message in the chat box and press Enter to send the message. You should get the following response from the chat assistant:
This is because the chat assistant is not yet connected to Langchain or the Gemini 2.0 Flash API. Don't close the chatbot interface. You will use it throughout the lab.
Go to the previous tab with the Cloud Shell Editor. Click on Open Editor if you don't see the code.
From the file tree on the left, navigate to the rag-chatbot/server/src/ directory and open the server.ts file.
If you inspect the code, you will notice that the /messages endpoint is not implemented yet. You will implement this endpoint to send chat prompt requests to Gemini 2.0 Flash using Langchain.
Implement the /messages endpoint

First, you need to initialize the conversational AI model. To do this, you'll use the ChatVertexAI class imported from the @langchain/google-vertexai package. Along with this, you'll also set the parameters for the model, such as the model name, maximum number of tokens to generate in the response, temperature, topP, and topK values. These parameters control the randomness and diversity of the output generated by the model.
After initializing the model, you'll define a history variable with a system message describing the chatbot role and ground rules.
To do this, add the following code to the server.ts file, right above the router.post("/messages", ...) line:
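A minimal sketch of what this code could look like; the parameter values and the system message wording are illustrative assumptions rather than the lab's exact code:

```typescript
import { ChatVertexAI } from "@langchain/google-vertexai";
import { BaseMessage, SystemMessage } from "@langchain/core/messages";

// Initialize Gemini 2.0 Flash through Langchain's Vertex AI integration.
// The generation parameters below are illustrative; tune them as needed.
const model = new ChatVertexAI({
  model: "gemini-2.0-flash",
  maxOutputTokens: 1024,
  temperature: 0.2,
  topP: 0.9,
  topK: 20,
});

// Conversation history, seeded with a system message that describes the
// chatbot's role and ground rules.
const history: BaseMessage[] = [
  new SystemMessage(
    "You are an assistant for an insurance company. Answer questions about insurance policies, coverage, and claims, and politely decline unrelated requests."
  ),
];
```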
Next, replace the /messages endpoint with the following implementation:
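A hedged sketch of that implementation; the text field name in the request and response bodies is an assumption about the client's API contract:

```typescript
// (Message classes are typically imported at the top of server.ts.)
import { HumanMessage, AIMessage } from "@langchain/core/messages";

router.post("/messages", async (req, res) => {
  try {
    // The "text" field name is an assumption about the request body shape.
    let prompt = req.body?.text;
    if (!prompt) {
      return res.status(400).send({ error: "A message is required." });
    }

    // Add the user's message to the conversation history.
    history.push(new HumanMessage(prompt));

    // Ask Gemini 2.0 Flash for a response via Langchain.
    const response = await model.invoke(history);
    const answer = response.content.toString();

    // Keep the answer in the history so follow-up questions have context.
    history.push(new AIMessage(answer));

    res.send({ text: answer });
  } catch (error) {
    console.error(error);
    res.status(500).send({ error: "Failed to generate a response." });
  }
});
```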
Navigate to the chat assistant interface tab and type a message in the chat box, for example "What are insurance policies?". Press Enter to send the message.
Go back to the Cloud Shell Editor tab and check the logs. You might be prompted to allow the Cloud Shell Editor to make authorized requests. Click Allow to proceed.
If you see an error message, you have exceeded the time limit for allowing the Cloud Shell Editor to make authorized requests. In this case, try sending another message in the chat assistant interface.
The chat assistant will respond with a message similar to the following:
Let's imagine you're a user asking about the specific coverage of their insurance policy. Type a message in the chat assistant interface, for example "What does my insurance policy cover?". Press Enter to send the message.
You are likely to get a generic response or a response asking for more details. This is because the chat assistant does not have context about your insurance policy. However, if you are already a user of the insurance company, the chat assistant should be able to provide a more accurate response based on the company's records.
Currently, the chat assistant does not have access to the user's insurance policy details and can only provide generic responses. You will address this limitation by implementing Retrieval Augmented Generation (RAG) with MongoDB Atlas Vector Search.
RAG is a grounding technique that improves the responses coming from an LLM by augmenting the prompt with additional context. The additional context is retrieved from an external source, such as a vector database containing proprietary information. In our system, the additional context will be extracted from the user's insurance policy.
RAG consists of three main phases:
- Data ingestion: In this case, the proprietary data consists of PDF documents containing insurance policies. Ingestion involves three steps:
  - Chunking: The PDFs will be cleaned and split into smaller, overlapping text chunks. This step is essential for retrieving relevant information accurately.
  - Embedding: Each chunk will be converted to a vector embedding using Vertex AI's Text embeddings API.
  - Storing and indexing: The resulting vector embeddings will be stored and indexed in a MongoDB Atlas database.
- Information retrieval: When a user submits a query, it will also be converted into a vector embedding. Then, vector search will be performed to find the most semantically similar document chunks in the database. Additional pre-filtering may be applied to speed up the vector search execution.
- Generation: The retrieved information will be added to the original user query to create an augmented prompt. This prompt will then be sent to the LLM, which will generate the final response. The main goal of this component is to generate a context-aware response to the user's query.
In the following tasks, you will chunk and convert the PDF documents with the insurance policy data to vector embeddings, store and index them in MongoDB Atlas, and implement RAG to enable context-aware chatbot responses.
The insurance company has provided you with a set of PDF documents containing user insurance policy data. The documents have information about different types of insurance policies, coverage details, policy terms, and claim procedures. Based on the user query, the application will retrieve the most relevant context by performing vector search in MongoDB Atlas. Then, this context will be appended to the chat prompt requests to Gemini 2.0 Flash, enabling the chat assistant to generate context-aware responses.
In this task, you will:
- create a MongoDB Atlas database deployment,
- split PDF documents into text chunks,
- convert the chunks to vector embeddings using Langchain and the Vertex AI Text embeddings API,
- store the vector embeddings in a MongoDB Atlas database,
- create a vector search index on the ingested embeddings.
Log in to your MongoDB Atlas account.
You can deploy only one free tier cluster per project. If you already have a free cluster, you will need to create a new project to deploy an additional free cluster. To do this, open the dropdown menu located in the top left corner, just below the Atlas logo, and click New Project. Then, enter a project name, click Next and then Create Project.
Click Create in the Create a cluster section.
Name your cluster GeminiRAG.
You will be prompted to complete the security setup of your deployment.
Open the tab with the Cloud Shell Editor and navigate to the rag-chatbot/server/ directory from the file tree on the left. Create a new file called .env with the following content:
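The variable name below is an assumption; use whatever name the server code expects:

```
ATLAS_URI=<your_mongodb_uri>
```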
Replace <your_mongodb_uri> with the connection string you copied from the MongoDB Atlas UI. Make sure to replace the <password> placeholder with the password you set for the database user.
Let's create the script that chunks the PDF documents and converts them to vector embeddings.
Navigate to the rag-chatbot/server/src/ directory from the file tree on the left and open the embed-documents.ts file. You will see that the file is empty.
You should start by loading all PDFs from the pdf_documents directory. Langchain provides a helper PDFLoader class that you can use to load PDFs from a directory.
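One way this could look, pairing PDFLoader with Langchain's DirectoryLoader; the pdf_documents path relative to the working directory is an assumption:

```typescript
import { DirectoryLoader } from "langchain/document_loaders/fs/directory";
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";

// Load every PDF in the pdf_documents directory into Langchain documents.
const loader = new DirectoryLoader("pdf_documents", {
  ".pdf": (path: string) => new PDFLoader(path),
});
const docs = await loader.load();
console.log(`Loaded ${docs.length} PDF pages.`);
```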
Next, you will use the RecursiveCharacterTextSplitter helper from Langchain to split the text content of the PDFs into chunks of text.
By splitting text in a way that preserves the context within each chunk, the splitter ensures that each piece of text remains meaningful on its own. This is essential for retrieval tasks where maintaining the integrity of information is crucial. Additionally, smaller, well-defined chunks can be indexed more efficiently. When a query is made, the system can quickly match it with the most relevant chunks rather than processing an entire document.
Add the following code right after the loading of the PDFs:
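A sketch of the splitting step; the chunk size and overlap are illustrative values, and the import path can vary between Langchain versions:

```typescript
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

// Split the loaded documents into smaller, overlapping chunks so that each
// piece stays meaningful on its own and can be retrieved independently.
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 150,
});
const chunks = await splitter.splitDocuments(docs);
console.log(`Split the PDFs into ${chunks.length} text chunks.`);
```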
Finally, you'll instantiate a MongoDB Atlas Vector database instance. When you create the instance, you'll specify the embedding model that you want to use. Behind the scenes, Langchain will invoke the embeddings API to convert the text chunks into vector embeddings, which will then be stored in the MongoDB Atlas database. As you can see, Langchain abstracts the complexity of the embeddings API and the MongoDB Atlas Vector Search API, allowing you to focus on the core functionality of your application.
Add the following code right after splitting the PDF documents into chunks:
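A sketch of this step, assuming the connection string is read from an ATLAS_URI variable in .env and that the embedding model is Vertex AI's text-embedding-004 (which produces 768-dimensional vectors); the chat-rag database and context collection names match what you'll see later in the Atlas UI:

```typescript
import { MongoClient } from "mongodb";
import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
import { VertexAIEmbeddings } from "@langchain/google-vertexai";

// Connect to the Atlas cluster using the connection string from .env.
const client = new MongoClient(process.env.ATLAS_URI!);
const collection = client.db("chat-rag").collection("context");

// Embed each chunk with the Vertex AI Text embeddings API and store the
// text, the vector embedding, and metadata in the context collection.
await MongoDBAtlasVectorSearch.fromDocuments(
  chunks,
  new VertexAIEmbeddings({ model: "text-embedding-004" }),
  {
    collection,
    indexName: "vector_index",
    textKey: "text",
    embeddingKey: "embedding",
  }
);

console.log("Imported the text chunks into the MongoDB Atlas vector store.");
await client.close();
```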
Stop the running process by pressing Ctrl + C or Command + C in the terminal emulator tab.
Run the following command to execute the script that converts PDF documents to vector embeddings:
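The exact command depends on the scripts defined in the server's package.json; it is likely something along these lines:

```bash
npm run embed-documents
```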
You should see the following output in the terminal:
This output indicates that the PDF documents were successfully loaded, split into text chunks, and imported into the MongoDB Atlas vector store.
In this task, you will create an Atlas Vector Search index in MongoDB Atlas. The index will be used to perform vector search queries on the embedded chunks that you just stored in the Atlas database.
Open the MongoDB Atlas UI and refresh the page to see the newly imported data.
Click Browse collections under the GeminiRAG deployment.
Explore the documents in the context collection. You'll see the text chunks and their corresponding vector embeddings, along with other metadata.
Click Search Indexes or Atlas Search to be taken to the Atlas Search page. Once you're there, click Create Search Index.
Scroll up to the Search Type section and select Vector Search.
You should see the Index Name change to vector_index.
Then, select the collection that you want to index. Expand the collections in the chat-rag database and then select the context collection.
In the Configuration Method section, select JSON editor.
Click Next.
Paste the following configuration in the JSON editor:
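```json
{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 768,
      "similarity": "euclidean"
    }
  ]
}
```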
- The index is defined on the embedding field of the context collection.
- The numDimensions parameter sets the length of the vector embeddings, which varies depending on the embedding model used. Since you're using Vertex AI Text Embeddings, which generates vectors with 768 dimensions, this value is specified in the configuration.
- The similarity parameter determines the vector similarity function for vector search queries. In this case, it is the Euclidean distance similarity function, which measures the distance between the ends of vectors.
- The type of the index is set to vector.

Click Next and then Create Vector Search Index.
Wait for the status to change to READY. Once the index is ready, you can start performing vector search queries on the indexed field.
Next, you will implement the retriever component of your RAG system. As mentioned earlier, the retriever will use the MongoDB Atlas vector store as its source. It will:
- Convert the user's question into a vector embedding.
- Perform a vector search against the indexed data to find the most relevant text chunks.
The retrieved chunks will serve as context for the chatbot.
Navigate to the rag-chatbot/server/src/ directory in the Cloud Shell Editor and open the server.ts file.
First, you need to initialize the MongoDB Atlas vector store. Add the following code right above the /messages endpoint:
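A sketch mirroring the setup in embed-documents.ts, with the same assumptions about the ATLAS_URI variable and the embedding model:

```typescript
import { MongoClient } from "mongodb";
import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
import { VertexAIEmbeddings } from "@langchain/google-vertexai";

// Connect to the collection that holds the embedded policy chunks.
const client = new MongoClient(process.env.ATLAS_URI!);
const collection = client.db("chat-rag").collection("context");

// Wrap the existing collection in a Langchain vector store and expose it
// as a retriever that returns the most relevant chunks for a query.
const vectorStore = new MongoDBAtlasVectorSearch(
  new VertexAIEmbeddings({ model: "text-embedding-004" }),
  {
    collection,
    indexName: "vector_index",
    textKey: "text",
    embeddingKey: "embedding",
  }
);
const retriever = vectorStore.asRetriever();
```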
The code is similar to the one you used to create the MongoDB Atlas vector store in the embed-documents.ts script. The only difference is that you're using the asRetriever method to create a retriever wrapper around the vector store. The retriever wrapper provides a convenient interface for performing vector search queries on the indexed fields.
Finally, you will implement the generation component of your RAG system. The generation component will prompt the LLM (Gemini 2.0 Flash) to generate context-aware responses based on the user's question and the retrieved context.
The prompt will be constructed by combining the original user query and the retrieved context chunks. Then, the system will prompt Gemini 2.0 Flash to generate the final response and return it to the user.
Let's change the generation logic we implemented earlier in the /messages endpoint. Right under the let prompt = ... line, add the following code:
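A sketch of that addition; the exact wording of the augmented prompt is an assumption:

```typescript
// If the client enabled the RAG toggle, retrieve relevant context from the
// Atlas vector store and prepend it to the user's question.
if (req.body?.rag === true) {
  const contextDocs = await retriever.invoke(prompt);
  const context = contextDocs.map((doc) => doc.pageContent).join("\n\n");

  prompt = `Answer the question based only on the following context:\n${context}\n\nQuestion: ${prompt}`;
}
```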
This code checks if the rag property is set to true in the request body. If it is, the code retrieves context from the MongoDB Atlas vector store using the invoke method of the retriever. The context is then appended to the prompt message.
The rest of the implementation remains the same.
When constructing a prompt, you need to consider the LLM’s context window—the amount of information, including the question and additional context, that the model can process at once.
An AI model’s context window is measured in tokens, which are the fundamental units used for processing information. Tokens can represent entire words, parts of words, images, videos, audio, or code. The larger the context window, the more data the model can analyze in a single prompt, leading to more consistent, relevant, and useful responses.
Gemini 2.0 Flash has an exceptionally large context window of up to 1 million tokens. This allows it to process vast amounts of information in one go, such as an hour of video, 11 hours of audio, codebases with over 30,000 lines, or more than 700,000 words.
Another consideration is how many chunks to retrieve as context through vector search and include in your prompt. While more chunks can provide extra context, too many may include irrelevant information and lower the response accuracy. There’s no set number—it depends on your data. You can experiment with different context sizes and see how it affects the relevance of the responses.
Start the application again by running:
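As before, the exact command depends on the repository's package.json; typically:

```bash
npm start
```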
Wait for the application to start. Once it's running, click on the web preview button on the top right in Cloud Shell to open the web app again. Select Preview on port 4200.
In the chat assistant interface, switch on the RAG toggle next to the message input. This will instruct the app to perform vector search queries on the MongoDB Atlas vector store and include the retrieved context in the chat prompt requests to Gemini 2.0 Flash.
Type a message such as "Does my car insurance cover mechanical failure?" in the chat box and press Enter to send the message.
You should get a response similar to the following:
Let's try another question such as "What does my insurance policy cover?".
Notice that the answer is more specific, takes into account the chat history, and provides context-aware responses based on the user's questions.
Great job! You have successfully implemented Retrieval Augmented Generation (RAG) with MongoDB Atlas Vector Search to create a context-aware chat assistant.
You learned how to create a chat assistant using Gemini 2.0 Flash, Langchain, Node.js, and Angular. You explored the limitations of out-of-context prompts and managed to overcome them by implementing Retrieval Augmented Generation (RAG) with MongoDB Atlas Vector Search.
Validate your expertise in Retrieval Augmented Generation by earning the official MongoDB Skill Badge!
This digital credential validates your knowledge in building Retrieval-Augmented Generation (RAG) applications with MongoDB. It recognizes your understanding of integrating vector search, optimizing retrieval workflows, and enhancing LLM-powered apps.
To keep learning MongoDB try these labs:
Be sure to check out MongoDB on the Google Cloud Marketplace!
Get free $500 credits for MongoDB on Google Cloud Marketplace - Applicable only for new customers.
Google Cloud training and certification helps you make the most of Google Cloud technologies. Our classes include technical skills and best practices to help you get up to speed quickly and continue your learning journey. We offer fundamental to advanced level training, with on-demand, live, and virtual options to suit your busy schedule. Certifications help you validate and prove your skill and expertise in Google Cloud technologies.
Manual Last Updated: April 07, 2025
Lab Last Tested: April 07, 2025
Copyright 2024 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.