
Importing Data to a Firestore Database

45 minutes 1 Credit

GSP642


Overview

For the labs in the Google Cloud Serverless Workshop: Pet Theory Quest, you will read through a fictitious business scenario and assist the characters with their serverless migration plan.

Twelve years ago, Lily started the Pet Theory chain of veterinary clinics. The Pet Theory chain has expanded rapidly over the last few years. However, their old appointment scheduling system is not able to handle the increased load, so Lily is asking you to build a cloud-based system that scales better than the legacy solution.

Pet Theory's Ops team is a single person, Patrick, so they need a solution that doesn't require lots of ongoing maintenance. The team has decided to go with serverless technology.

Ruby has been hired as a consultant to help Pet Theory make the transition to serverless. After comparing serverless database options, the team decides to go with Cloud Firestore. Since Firestore is serverless, capacity doesn't have to be provisioned ahead of time which means that there is no risk of running into storage or operations limits. Firestore keeps your data in sync across client apps through real-time listeners and offers offline support for mobile and web, so a responsive app can be built that works regardless of network latency or Internet connectivity.

In this lab you will help Patrick upload Pet Theory's existing data to a Cloud Firestore database. He will work closely with Ruby to accomplish this.

Architecture

This diagram gives you an overview of the services you will be using and how they connect to one another:

arch.png

What you will learn

In this lab, you will learn how to:

  • Set up Firestore in Google Cloud.
  • Write database import code.
  • Generate a collection of customer data for testing.
  • Import the test customer data into Firestore.
  • Manipulate data in Firestore through the Console.
  • Add a developer to a Google Cloud project without giving them Firestore access.

Prerequisites

This is a fundamental-level lab that assumes familiarity with the Cloud Console and shell environments. Experience with Firebase is helpful, but not required.

You should also be comfortable editing files. You can use your favorite text editor (like nano, vi, etc.) or you can launch the code editor from Cloud Shell, which can be found in the top ribbon:

OpenEditor11.png

Once you're ready, scroll down and follow the steps below to set up your lab environment.

Setup

This lab provisions two usernames: one to log in to Google Cloud, and another for learning how to add a developer to a Google Cloud project without giving them Firestore access. While following the setup instructions, use Username 1 as your login.

Before you click the Start Lab button

Read these instructions. Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources will be made available to you.

This hands-on lab lets you do the lab activities yourself in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials that you use to sign in and access Google Cloud for the duration of the lab.

To complete this lab, you need:

  • Access to a standard internet browser (Chrome browser recommended).
Note: Use an Incognito or private browser window to run this lab. This prevents conflicts between your personal account and the Student account, which could cause extra charges to be incurred on your personal account.
  • Time to complete the lab---remember, once you start, you cannot pause a lab.
Note: If you already have your own personal Google Cloud account or project, do not use it for this lab to avoid extra charges to your account.

How to start your lab and sign in to the Google Cloud Console

  1. Click the Start Lab button. If you need to pay for the lab, a pop-up opens for you to select your payment method. On the left is the Lab Details panel with the following:

    • The Open Google Console button
    • Time remaining
    • The temporary credentials that you must use for this lab
    • Other information, if needed, to step through this lab
  2. Click Open Google Console. The lab spins up resources, and then opens another tab that shows the Sign in page.

    Tip: Arrange the tabs in separate windows, side-by-side.

    Note: If you see the Choose an account dialog, click Use Another Account.
  3. If necessary, copy the Username from the Lab Details panel and paste it into the Sign in dialog. Click Next.

  4. Copy the Password from the Lab Details panel and paste it into the Welcome dialog. Click Next.

    Important: You must use the credentials from the left panel. Do not use your Google Cloud Skills Boost credentials. Note: Using your own Google Cloud account for this lab may incur extra charges.
  5. Click through the subsequent pages:

    • Accept the terms and conditions.
    • Do not add recovery options or two-factor authentication (because this is a temporary account).
    • Do not sign up for free trials.

After a few moments, the Cloud Console opens in this tab.

Note: You can view the menu with a list of Google Cloud Products and Services by clicking the Navigation menu at the top-left. Navigation menu icon

Activate Cloud Shell

Cloud Shell is a virtual machine that is loaded with development tools. It offers a persistent 5GB home directory and runs on Google Cloud. Cloud Shell provides command-line access to your Google Cloud resources.

  1. Click Activate Cloud Shell Activate Cloud Shell icon at the top of the Google Cloud console.

  2. Click Continue.

It takes a few moments to provision and connect to the environment. When you are connected, you are already authenticated, and the project is set to your PROJECT_ID. The output contains a line that declares the PROJECT_ID for this session:

Your Cloud Platform project in this session is set to YOUR_PROJECT_ID

gcloud is the command-line tool for Google Cloud. It comes pre-installed on Cloud Shell and supports tab-completion.

  1. (Optional) You can list the active account name with this command:

gcloud auth list

Output:

ACTIVE: *
ACCOUNT: student-01-xxxxxxxxxxxx@qwiklabs.net

To set the active account, run:
    $ gcloud config set account `ACCOUNT`
  2. (Optional) You can list the project ID with this command:

gcloud config list project

Output:

[core]
project = <project_ID>

Example output:

[core]
project = qwiklabs-gcp-44776a13dea667a6

Note: For full documentation of gcloud in Google Cloud, refer to the gcloud CLI overview guide.

Set up Firestore in Google Cloud

Patrick's task is to upload Pet Theory's existing data to a Cloud Firestore database. He will work closely with Ruby to accomplish this goal. Ruby receives a message from Patrick in IT...

image

Patrick, IT Administrator

Hi Ruby,

Our first step in going serverless is creating a Firestore database with Google Cloud. Can you help with this task? I am not very familiar with setting this up.

Patrick

image

Ruby, Software Consultant

Hey Patrick,

Sure, I would be happy to help with that. I'll send you some resources to get started, let's get in touch once you're done creating the database.

Ruby

Help Patrick set up a Firestore database through the Cloud Console.

  1. In the Cloud Console, go to the Navigation menu and select Firestore.

32a62c62fa2a9ac5.png

  2. Click the Select Native Mode button.

9214aff5ce713dcd.png

Both modes are high performing with strong consistency, but they look different and are optimized for different use cases.
  • Native Mode is good for letting lots of users access the same data at the same time (plus, it has features like real-time updates and a direct connection between your database and a web/mobile client).
  • Datastore Mode puts an emphasis on high throughput (lots of reads and writes).
  3. In the Select a location dropdown, choose a database region closest to your location and then click Create Database.

On completion of the task, Ruby emails Patrick...

ruby-image

Ruby, Software Consultant

Hey Patrick,

Great work setting up the Firestore database! To manage database access, we will use a Service Account that has been automatically created with the necessary privileges.

We are now ready to migrate from the old database to Firestore.

Ruby

image

Patrick, IT Administrator

Hey Ruby,

Thanks for the help, setting up the Firestore database was straightforward.

I hope the database import process will be easier than it is with the legacy database, which is quite complex and requires a lot of steps.

Patrick

Write database import code

The new Cloud Firestore database is in place, but it's empty. The customer data for Pet Theory still only exists in the old database.

Patrick sends a message to Ruby...

image

Patrick, IT Administrator

Hi Ruby,

My manager would like to begin migrating the customer data to the new Firestore database.

I have exported a CSV file from our legacy database, but it's not clear to me how to import this data into Firestore.

Any chance you can lend me a hand?

Patrick

ruby-image

Ruby, Software Consultant

Hey Patrick,

Sure, let's set up a meeting to discuss what needs to be done.

Ruby

As Patrick said, the customer data will be available in a CSV file. Help Patrick create an app that reads customer records from a CSV file and writes them to Firestore. Since Patrick is familiar with JavaScript, build this application with the Node.js JavaScript runtime.

  1. In Cloud Shell, run the following command to clone the Pet Theory repository:

git clone https://github.com/rosera/pet-theory
  2. Use the Cloud Shell Code Editor (or your preferred editor) to edit your files. From the top ribbon of your Cloud Shell session, click Open Editor; it will open in a new tab. If prompted, click Open in a new window to launch the code editor:

OpenEditor11.png

  3. Then change your current working directory to lab01:

cd pet-theory/lab01

In the directory you can see Patrick's package.json. This file lists the packages that your Node.js project depends on and makes your build reproducible, and therefore easier to share with others.

An example package.json is shown below:

{
  "name": "lab01",
  "version": "1.0.0",
  "description": "This is lab01 of the Pet Theory labs",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "Patrick - IT",
  "license": "MIT",
  "dependencies": {
    "csv-parse": "^4.4.5"
  }
}

Now that Patrick has his source code imported, he gets in touch with Ruby to see what packages he needs to make the migration work.

patrick-image

Patrick, IT Administrator

Hi Ruby,

The code I use for the legacy database is pretty basic, it just creates a CSV ready for the import process. Anything I need to download before I get started?

Patrick

ruby-image

Ruby, Software Consultant

Hi Patrick,

I would suggest using one of the many @google-cloud Node packages to interact with Firestore.

We should then only need to make small changes to the existing code since the heavy lifting has been taken care of.

Ruby

To allow Patrick's code to write to the Firestore database, you need to install some additional peer dependencies.

  1. Run the following command to do so:

npm install @google-cloud/firestore
  2. To enable the app to write logs to Cloud Logging, install an additional module:

npm install @google-cloud/logging

After successful completion of the command, the package.json will be automatically updated to include the new peer dependencies, and will look like this.

...
"dependencies": {
  "@google-cloud/firestore": "^2.4.0",
  "@google-cloud/logging": "^5.4.1",
  "csv-parse": "^4.4.5"
}

Now it's time to take a look at the script that reads the CSV file of customers and writes one record in Firestore for each line in the CSV file. Patrick's original application is shown below:

const {promisify} = require('util');
const parse = promisify(require('csv-parse'));
const {readFile} = require('fs').promises;

if (process.argv.length < 3) {
  console.error('Please include a path to a csv file');
  process.exit(1);
}

function writeToDatabase(records) {
  records.forEach((record, i) => {
    console.log(`ID: ${record.id} Email: ${record.email} Name: ${record.name} Phone: ${record.phone}`);
  });
  return;
}

async function importCsv(csvFileName) {
  const fileContents = await readFile(csvFileName, 'utf8');
  const records = await parse(fileContents, { columns: true });
  try {
    await writeToDatabase(records);
  } catch (e) {
    console.error(e);
    process.exit(1);
  }
  console.log(`Wrote ${records.length} records`);
}

importCsv(process.argv[2]).catch(e => console.error(e));

This script reads records from the input CSV file and imports them into the legacy database. Next, update this code to write to Firestore.

  1. Open the file pet-theory/lab01/importTestData.js.

To reference the Firestore API via the application, you need to add the peer dependency to the existing codebase.

  2. Add the following Firestore dependency on line 4 of the file:

const {Firestore} = require('@google-cloud/firestore');

Ensure that your code looks like the following:

const {promisify} = require('util');
const parse = promisify(require('csv-parse'));
const {readFile} = require('fs').promises;
const {Firestore} = require('@google-cloud/firestore'); // add this

Integrating with the Firestore database can be achieved with a couple of lines of code. Ruby has shared some template code with you and Patrick for exactly that purpose.

  3. Add the following code underneath line 9, i.e. just below the if (process.argv.length < 3) conditional:

const db = new Firestore();

function writeToFirestore(records) {
  const batchCommits = [];
  let batch = db.batch();
  records.forEach((record, i) => {
    var docRef = db.collection('customers').doc(record.email);
    batch.set(docRef, record);
    if ((i + 1) % 500 === 0) {
      console.log(`Writing record ${i + 1}`);
      batchCommits.push(batch.commit());
      batch = db.batch();
    }
  });
  batchCommits.push(batch.commit());
  return Promise.all(batchCommits);
}

The code snippet above declares a new Firestore client object, which references the database created earlier in the lab. The function processes the records as a batch: each record is given a document reference keyed by its email address, and every 500 records the current batch is committed and a new batch is started, since Firestore limits a single batched write to 500 operations. At the end of the function, the remaining batch content is written to the database.
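The batching arithmetic can be sketched without touching Firestore at all. The helper below (a hypothetical planBatches, not part of the lab code) simply computes how many non-empty commits a given record count needs under the 500-operation limit:

```javascript
// Minimal sketch of the batching arithmetic used by writeToFirestore:
// Firestore caps a single batched write at 500 operations, so importing
// N records needs ceil(N / 500) non-empty commits.
function planBatches(recordCount, batchLimit = 500) {
  return Math.max(1, Math.ceil(recordCount / batchLimit));
}

console.log(planBatches(1000)); // 2 non-empty commits of 500 records each
console.log(planBatches(1001)); // 3 commits: 500 + 500 + 1
```

Note that writeToFirestore itself also pushes one final commit after the loop, which may be empty when the record count is an exact multiple of 500; committing an empty batch is harmless.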

  4. Now add a call to the new function: update the importCsv function to call writeToFirestore and remove the call to writeToDatabase. It should look like this:

async function importCsv(csvFileName) {
  const fileContents = await readFile(csvFileName, 'utf8');
  const records = await parse(fileContents, { columns: true });
  try {
    await writeToFirestore(records);
    // await writeToDatabase(records);
  } catch (e) {
    console.error(e);
    process.exit(1);
  }
  console.log(`Wrote ${records.length} records`);
}
  5. Next, add logging for the application. To reference the Logging API via the application, add the peer dependency to the existing codebase. Add the line const {Logging} = require('@google-cloud/logging'); just below the other require statements at the top of the file:

const {promisify} = require('util');
const parse = promisify(require('csv-parse'));
const {readFile} = require('fs').promises;
const {Firestore} = require('@google-cloud/firestore');
const {Logging} = require('@google-cloud/logging'); // add this
  6. Add a few constant variables and initialize the Logging client. Add them just below the lines above (~line 5), like this:

const logName = 'pet-theory-logs-importTestData';

// Creates a Logging client
const logging = new Logging();
const log = logging.log(logName);
const resource = {
  type: 'global',
};
  7. Add code to write the logs in the importCsv function just below the line console.log(`Wrote ${records.length} records`);, which should look like this:

// A text log entry
const success_message = `Success: importTestData - Wrote ${records.length} records`;
const entry = log.entry({resource: resource}, {message: `${success_message}`});
log.write([entry]);

After these updates, your importCsv function code block should look like the following:

async function importCsv(csvFileName) {
  const fileContents = await readFile(csvFileName, 'utf8');
  const records = await parse(fileContents, { columns: true });
  try {
    await writeToFirestore(records);
    // await writeToDatabase(records);
  } catch (e) {
    console.error(e);
    process.exit(1);
  }
  console.log(`Wrote ${records.length} records`);
  // A text log entry
  const success_message = `Success: importTestData - Wrote ${records.length} records`;
  const entry = log.entry({resource: resource}, {message: `${success_message}`});
  log.write([entry]);
}

Now when the application code runs, the Firestore database will be updated with the contents of the CSV file. The importCsv function takes a filename and parses the content line by line. Each parsed record is now sent to the Firestore function writeToFirestore, where each new record is written to the "customers" collection.

Note: In a production environment, you will write your own version of the import script.
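To see the shape of the records that parse(fileContents, { columns: true }) hands to writeToFirestore, here is a dependency-free sketch (a hypothetical parseCsv, for illustration only — unlike the real csv-parse package, it assumes no quoted commas):

```javascript
// Illustrative stand-in for csv-parse with { columns: true }: returns an
// array of objects keyed by the header row, one object per CSV line.
function parseCsv(text) {
  const [header, ...rows] = text.trim().split('\n');
  const columns = header.split(',');
  return rows.map(row => {
    const values = row.split(',');
    return Object.fromEntries(columns.map((col, i) => [col, values[i]]));
  });
}

const sample = 'id,name,email,phone\n1,Ada Lovelace,ada@example.com,555-0100';
console.log(parseCsv(sample));
// → [{ id: '1', name: 'Ada Lovelace', email: 'ada@example.com', phone: '555-0100' }]
```

Each of these objects becomes one Firestore document, keyed by its email field.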

Create test data

Time to import some data! Patrick contacts Ruby about a concern he has about running a test with real customer data...

image

Patrick, IT Administrator

Hi Ruby,

I think it would be better if we don't use customer data for testing. We need to maintain customer privacy, but also need to have some confidence that our data import script works correctly.

Can you think of an alternative way to test?

Patrick

ruby-image

Ruby, Software Consultant

Hey Patrick,

Fair point. This is a tricky area, as customer data may include personally identifiable information, also called PII.

I'll share some starter code with you to create pseudo customer data. We can then use this data to test the import script.

Ruby

Help Patrick get this pseudo-random data generator up and running.

  1. First, install the "faker" library, which will be used by the script that generates the fake customer data. Run the following command to update the dependency in package.json:

npm install faker@5.5.3
  2. Now open the file named createTestData.js with the code editor and inspect the code. Ensure it looks like the following:

const fs = require('fs');
const faker = require('faker');

function getRandomCustomerEmail(firstName, lastName) {
  const provider = faker.internet.domainName();
  const email = faker.internet.email(firstName, lastName, provider);
  return email.toLowerCase();
}

async function createTestData(recordCount) {
  const fileName = `customers_${recordCount}.csv`;
  var f = fs.createWriteStream(fileName);
  f.write('id,name,email,phone\n');
  for (let i = 0; i < recordCount; i++) {
    const id = faker.datatype.number();
    const firstName = faker.name.firstName();
    const lastName = faker.name.lastName();
    const name = `${firstName} ${lastName}`;
    const email = getRandomCustomerEmail(firstName, lastName);
    const phone = faker.phone.phoneNumber();
    f.write(`${id},${name},${email},${phone}\n`);
  }
  console.log(`Created file ${fileName} containing ${recordCount} records.`);
}

recordCount = parseInt(process.argv[2]);
if (process.argv.length != 3 || recordCount < 1 || isNaN(recordCount)) {
  console.error('Include the number of test data records to create. Example:');
  console.error('  node createTestData.js 100');
  process.exit(1);
}

createTestData(recordCount);
  3. Add Logging for the codebase. Reference the Logging API module from the application code with the following:

const fs = require('fs');
const faker = require('faker');
const {Logging} = require('@google-cloud/logging'); // add this
  4. Now add a few constant variables and initialize the Logging client. Add those just below the const statements:

const logName = 'pet-theory-logs-createTestData';

// Creates a Logging client
const logging = new Logging();
const log = logging.log(logName);
const resource = {
  // This example targets the "global" resource for simplicity
  type: 'global',
};
  5. Add code to write the logs in the createTestData function just below the line console.log(`Created file ${fileName} containing ${recordCount} records.`);, which will look like this:

// A text log entry
const success_message = `Success: createTestData - Created file ${fileName} containing ${recordCount} records.`;
const entry = log.entry({resource: resource}, {name: `${fileName}`, recordCount: `${recordCount}`, message: `${success_message}`});
log.write([entry]);
  6. After updating, the createTestData function code block should look like this:

async function createTestData(recordCount) {
  const fileName = `customers_${recordCount}.csv`;
  var f = fs.createWriteStream(fileName);
  f.write('id,name,email,phone\n');
  for (let i = 0; i < recordCount; i++) {
    const id = faker.datatype.number();
    const firstName = faker.name.firstName();
    const lastName = faker.name.lastName();
    const name = `${firstName} ${lastName}`;
    const email = getRandomCustomerEmail(firstName, lastName);
    const phone = faker.phone.phoneNumber();
    f.write(`${id},${name},${email},${phone}\n`);
  }
  console.log(`Created file ${fileName} containing ${recordCount} records.`);
  // A text log entry
  const success_message = `Success: createTestData - Created file ${fileName} containing ${recordCount} records.`;
  const entry = log.entry({resource: resource}, {name: `${fileName}`, recordCount: `${recordCount}`, message: `${success_message}`});
  log.write([entry]);
}
  7. Run the following command to configure your Project ID in Cloud Shell, replacing PROJECT_ID with your Qwiklabs Project ID:

gcloud config set project PROJECT_ID
  8. Now set the project ID as an environment variable:

PROJECT_ID=$(gcloud config get-value project)
  9. Run the following command in Cloud Shell to create the file customers_1000.csv, which will contain 1000 records of test data:

node createTestData 1000

You should receive a similar output:

Created file customers_1000.csv containing 1000 records.
  10. Open the file customers_1000.csv and verify that the test data has been created.
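The faker library is just one way to produce this data. As a dependency-free illustration of the same idea, the sketch below (with hypothetical names and helpers, not part of the lab repository) builds CSV rows from a tiny hard-coded name pool:

```javascript
// Dependency-free sketch of the idea behind createTestData.js: build
// pseudo-random customer rows. The name pool here is hard-coded purely
// for illustration, where the lab uses faker for realistic variety.
const FIRST = ['Ada', 'Grace', 'Alan', 'Edsger'];
const LAST = ['Lovelace', 'Hopper', 'Turing', 'Dijkstra'];
const pick = arr => arr[Math.floor(Math.random() * arr.length)];

function makeRows(recordCount) {
  const rows = ['id,name,email,phone'];
  for (let i = 0; i < recordCount; i++) {
    const first = pick(FIRST);
    const last = pick(LAST);
    const email = `${first}.${last}@example.com`.toLowerCase();
    const phone = `555-${String(1000 + i).slice(-4)}`;
    rows.push(`${i + 1},${first} ${last},${email},${phone}`);
  }
  return rows.join('\n');
}

console.log(makeRows(3).split('\n').length); // 4: header plus 3 records
```

Swapping faker in, as the lab does, gives far more varied names, emails, and phone numbers without changing the surrounding file-writing logic.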

Test Completed Task

Click Check my progress to verify your performed task. If you have successfully created sample test data for the Firestore database, you will see an assessment score.

Create test data for the Firestore Database

Import the test customer data

  1. To test the import capability, use both the import script and the test data created earlier:

node importTestData customers_1000.csv
  2. If you get an error that resembles the following:

Error: Cannot find module 'csv-parse'

Run the following command to add the csv-parse package to your environment:

npm install csv-parse
  3. Then run the command again. You should receive the following output:

Writing record 500
Writing record 1000
Wrote 1000 records
  4. At this point, if you are feeling adventurous, feel free to create a larger test data file and import it into the Firestore database!

node createTestData 20000
node importTestData customers_20000.csv

Over the past couple of sections you have seen how Patrick and Ruby have created test data and a script to import data into Firestore. Patrick now feels more confident about loading customer data into the Firestore database.

Test Completed Task

Click Check my progress to verify your performed task. If you have successfully imported sample test data into the Firestore Database, you will see an assessment score.

Import test data into the Firestore Database

Inspect the data in Firestore

With a little help from you and Ruby, Patrick has now successfully migrated the test data to the Firestore database. Open up Firestore and see the results!

  1. Return to your Cloud Console tab. In the Navigation menu click on Firestore. Once there, click on the pencil icon:

firestore.png

  1. Type in /customers and press Enter.

  3. Refresh your browser tab and you should see the following list of customers successfully migrated:

customers.png

Edit and delete data from Firestore

  1. Select a customer from the customers collection, then click the customer's phone number.

  2. An Edit Field popup appears. Make a change to the customer's phone number then click Update. That record has now been updated in the database:

phone-num.png

  3. Now hover over the customer's email and select Delete field. Confirm by clicking Delete.

When testing migration scripts, you often need to delete all records once your test is complete.

  4. Click the three vertical dots next to customers.

  5. Select Delete collection in the popup menu. Then click Cancel:

cancel.png

The records in the database are test data. Once real data is loaded, access can be locked down so only a few (or no) team members can edit and delete production data through this user interface.

Add a developer to the project without giving them Firestore access

Now that customer data has been pulled into Firestore, Patrick gets in touch with Ruby to plan the final phase of the database migration...

patrick-image

Patrick, IT Administrator

Hi Ruby,

I had a meeting with Lily today; she is really pleased with the database migration to Firestore!

As part of that meeting, I was tasked with ensuring security permissions are set correctly.

Our developers should only be able to read the system log and to check-in source code. They should not be able to read or modify the data in Firestore. Can you help me set that up?

Patrick

ruby-image

Ruby, Software Consultant

Hi Patrick,

I'd be happy to!

I'll forward you some resources that will help you get set up.

Ruby

Help Patrick search a list of pre-defined roles for the roles that need to be allocated to Pet Theory developers.

  1. Open the understanding roles page and search for "view logs" on the page. You'll find a role called roles/logging.viewer that lets a member read the logs.

  2. Patrick also wants developers to be able to check code into source control. Search the page for the word "repository". There is a role called roles/source.writer that lets a member read and write to the source control repository.

  3. Now add these two roles to the developer, replacing [EMAIL] with the 2nd User ID you have for this lab:

gcloud projects add-iam-policy-binding $PROJECT_ID \
--member=user:[EMAIL] --role=roles/logging.viewer

Test Completed Task

Click Check my progress to verify your performed task. If you have successfully added a logging.viewer role to the second user, you will see an assessment score.

Add a developer to the project without giving them Firestore access (role: logging.viewer)

  4. Now grant the source repository role with the following command:

gcloud projects add-iam-policy-binding $PROJECT_ID \
--member=user:[EMAIL] --role roles/source.writer

Test Completed Task

Click Check my progress to verify your performed task. If you have successfully added a source.writer role to the second user, you will see an assessment score.

Add a developer to the project without giving them Firestore access (role: source.writer)

Note: For this lab you're using the email you were provided, but in a production environment you would use the user's own email to assign the role.
  1. Copy your Console URL and open it in a new tab.

  2. In the new console, log out then log back in with the 2nd Qwiklabs Google Cloud Google account you have been provided.

You're now logged in as the Developer. The Developer can log in to the Cloud Console and open this project, but will only be able to view the logs and read/write to the source code repository. They will not be able to read or modify the Firestore database.

ruby-image

Ruby, Software Consultant

Hi Patrick,

It was great working with you today. We all made a lot of good progress!

Ruby

image

Patrick, IT Administrator

Hi Ruby,

Thanks for all the pointers! I'm really impressed with Firestore's ease of set up and flexibility.

Pretty amazing that I was able to take my legacy JavaScript code and repurpose it to populate the Firestore database with a couple of changes.

Thanks for taking the time to walk me through all of this, it is very much appreciated!

Patrick

Congratulations!

Throughout the course of this lab, you received hands-on practice with Firestore. After generating a collection of customer data for testing, you ran a script that imported the data into Firestore. You then learned how to manipulate data in Firestore through the Cloud Console. Finally, you added a developer to a Google Cloud project without giving them Firestore access.

Finish Your Quest

Pet_Theory_125.png Firebase_125x125

This self-paced lab is part of the Qwiklabs Google Cloud Run Serverless Workshop and Build Apps & Websites with Firebase Quests. A Quest is a series of related labs that form a learning path. Completing this Quest earns you the badge above, to recognize your achievement. You can make your badge (or badges) public and link to them in your online resume or social media account. Enroll in a Quest and get immediate completion credit if you've taken this lab. See other available Qwiklabs Quests.

Take Your Next Lab

Continue your quest with the next lab in the series, Build a Serverless Web App with Firebase.

End your lab

When you have completed your lab, click End Lab. Your account and the resources you've used are removed from the lab platform.

You will be given an opportunity to rate the lab experience. Select the applicable number of stars, type a comment, and then click Submit.

The number of stars indicates the following:

  • 1 star = Very dissatisfied
  • 2 stars = Dissatisfied
  • 3 stars = Neutral
  • 4 stars = Satisfied
  • 5 stars = Very satisfied

You can close the dialog box if you don't want to provide feedback.

For feedback, suggestions, or corrections, please use the Support tab.

Google Cloud training and certification

...helps you make the most of Google Cloud technologies. Our classes include technical skills and best practices to help you get up to speed quickly and continue your learning journey. We offer fundamental to advanced level training, with on-demand, live, and virtual options to suit your busy schedule. Certifications help you validate and prove your skill and expertise in Google Cloud technologies.

Manual Last Updated April 4, 2022
Lab Last Tested April 4, 2022

Copyright 2022 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.