De-identifying DICOM Data with the Healthcare API


1 hour 15 minutes · 5 Credits




In this lab you will discover and use the de-identification functionality of the Cloud Healthcare API with the Digital Imaging and Communications in Medicine (DICOM) data model.

In this lab, you will:

  • Gain a general understanding of Cloud Healthcare API and its role in managing healthcare data.
  • Learn how to create Cloud Healthcare API datasets and stores.
  • Import and Export DICOM data using the Cloud Healthcare API.

Setup and requirements

Before you click the Start Lab button

Read these instructions. Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources will be made available to you.

This hands-on lab lets you do the lab activities yourself in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials that you use to sign in and access Google Cloud for the duration of the lab.

To complete this lab, you need:

  • Access to a standard internet browser (Chrome browser recommended).
Note: Use an Incognito or private browser window to run this lab. This prevents any conflicts between your personal account and the Student account, which may cause extra charges incurred to your personal account.
  • Time to complete the lab---remember, once you start, you cannot pause a lab.
Note: If you already have your own personal Google Cloud account or project, do not use it for this lab to avoid extra charges to your account.

How to start your lab and sign in to the Google Cloud console

  1. Click the Start Lab button. If you need to pay for the lab, a pop-up opens for you to select your payment method. On the left is the Lab Details panel with the following:

    • The Open Google Cloud console button
    • Time remaining
    • The temporary credentials that you must use for this lab
    • Other information, if needed, to step through this lab
  2. Click Open Google Cloud console (or right-click and select Open Link in Incognito Window if you are running the Chrome browser).

    The lab spins up resources, and then opens another tab that shows the Sign in page.

    Tip: Arrange the tabs in separate windows, side-by-side.

    Note: If you see the Choose an account dialog, click Use Another Account.
  3. If necessary, copy the Username below and paste it into the Sign in dialog.

    {{{user_0.username | "Username"}}}

    You can also find the Username in the Lab Details panel.

  4. Click Next.

  5. Copy the Password below and paste it into the Welcome dialog.

    {{{user_0.password | "Password"}}}

    You can also find the Password in the Lab Details panel.

  6. Click Next.

    Important: You must use the credentials the lab provides you. Do not use your Google Cloud account credentials. Note: Using your own Google Cloud account for this lab may incur extra charges.
  7. Click through the subsequent pages:

    • Accept the terms and conditions.
    • Do not add recovery options or two-factor authentication (because this is a temporary account).
    • Do not sign up for free trials.

After a few moments, the Google Cloud console opens in this tab.

Note: To view a menu with a list of Google Cloud products and services, click the Navigation menu at the top-left. Navigation menu icon

Activate Cloud Shell

Cloud Shell is a virtual machine that is loaded with development tools. It offers a persistent 5GB home directory and runs on the Google Cloud. Cloud Shell provides command-line access to your Google Cloud resources.

  1. Click Activate Cloud Shell Activate Cloud Shell icon at the top of the Google Cloud console.

When you are connected, you are already authenticated, and the project is set to your Project ID. The output contains a line that declares the Project ID for this session:

Your Cloud Platform project in this session is set to {{{project_0.project_id | "PROJECT_ID"}}}

gcloud is the command-line tool for Google Cloud. It comes pre-installed on Cloud Shell and supports tab-completion.

  2. (Optional) You can list the active account name with this command:

    gcloud auth list

  3. Click Authorize.

    Output:

    ACTIVE: *
    ACCOUNT: {{{user_0.username | "ACCOUNT"}}}
    To set the active account, run:
    $ gcloud config set account `ACCOUNT`

  4. (Optional) You can list the project ID with this command:

    gcloud config list project

    Output:

    [core]
    project = {{{project_0.project_id | "PROJECT_ID"}}}

Note: For full documentation of gcloud, in Google Cloud, refer to the gcloud CLI overview guide.

Task 1. Create Healthcare dataset

In this exercise you will use the UI to create a Cloud Healthcare API dataset.

  1. Under the Navigation Menu (Navigation menu icon), select Healthcare > Browser and then Enable the API.

The Healthcare option selected on the expanded Navigation menu

  2. Once the API is enabled, in the Healthcare browser select Create Dataset.

  3. Name the dataset dataset1 within the region provided, and click Create.

The Create dataset page displaying the populated dataset name and region fields

Click Check my progress to verify the objective. Create Healthcare Dataset

Task 2. Set up IAM permissions

  1. From the Navigation menu (Navigation menu icon), go to IAM & admin > IAM.

  2. In the IAM page, select the Include Google-provided role grants checkbox.

  3. Edit the permissions for your Healthcare Service Agent by locating the service agent under the IAM list and selecting the pencil icon. The service account will have the domain gcp-sa-healthcare.iam.gserviceaccount.com.

  4. Click Add another role to add additional roles to the Healthcare Service Agent account.

  5. Click inside the Select a role box and choose the following roles:

  • Cloud Storage > Storage Object Admin
  • Cloud Healthcare > Healthcare Dataset Administrator
  • Cloud Healthcare > Healthcare DICOM Editor
  6. After all of the roles are added, select Save to commit your updates.

Task 3. Enable data access logs on Cloud Healthcare

  1. From the IAM & Admin menu, navigate to Audit Logs.

  2. Scroll or use the filter box to locate Cloud Healthcare, then check the box next to it to select.

  3. If the info panel isn't already open on the right side of the interface, click the Show Info Panel link.

The Show Info Panel link in the UI

  4. Select Data Read and Data Write, then click Save.

The Log Type tabbed page displaying the selected Data Read and Data Write checkboxes

Click Check my progress to verify the objective. Set up IAM Permissions

Task 4. Define variables needed

  • In Cloud Shell, export the variables needed for the lab:
export PROJECT_ID=`gcloud config get-value project`
export REGION={{{ project_0.default_region | "REGION" }}}
export DATASET_ID=dataset1
export DICOM_STORE_ID=dicomstore1
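As a quick sanity check (a sketch, not part of the official lab steps), you can confirm each variable is non-empty before moving on. The example values below are hypothetical; in the lab, the exports above define the real values:

```shell
# Hypothetical values for illustration only; in the lab the exports
# above set these from your project and region.
export PROJECT_ID=my-project
export REGION=us-central1
export DATASET_ID=dataset1
export DICOM_STORE_ID=dicomstore1

# Warn about any lab variable that is empty or unset.
for v in PROJECT_ID REGION DATASET_ID DICOM_STORE_ID; do
  # eval expands to the value of the variable whose name is in $v.
  val=$(eval "echo \"\${$v}\"")
  [ -n "$val" ] || echo "Variable $v is not set"
done
```

If everything is exported correctly, the loop prints nothing.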

Task 5. Create data stores

Data in Cloud Healthcare API datasets and stores can be accessed and managed using a REST API that identifies each store using its project, location, dataset, store type and store name. This API implements modality-specific standards for access that are consistent with industry standards for that modality. For example, the Cloud Healthcare DICOM API natively provides operations for reading DICOM studies and series that are consistent with the DICOMweb standard, and supports the DICOM DIMSE C-STORE protocol via an open-source adapter.

  1. Call the API to create a DICOM store:
gcloud beta healthcare dicom-stores create $DICOM_STORE_ID --dataset=$DATASET_ID --location=$REGION

The server returns a path to the newly created store.

Users can also use the curl utility to issue Cloud Healthcare API calls. curl is pre-installed in your Cloud Shell machine. By default, curl does not show HTTP status codes or session-related information; if you would like to see this information please add the -v option to all commands in this tutorial.

  2. Try creating a second DICOM store by using the below command:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://healthcare.googleapis.com/v1/projects/$PROJECT_ID/locations/$REGION/datasets/$DATASET_ID/dicomStores?dicomStoreId=dicomstore2"

Operations that access a modality-specific store use a request path that is comprised of two pieces: a base path, and a modality-specific request path.

Administrative operations—which generally operate only on locations, datasets and stores—may only use the base path. Data modality-specific retrieval operations use both the base path (for identifying the store to be accessed) and request path (for identifying the actual data to be retrieved).
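As a concrete illustration of how the two pieces compose (a sketch using hypothetical project and region values, with a DICOMweb studies search as the modality-specific part):

```shell
# Hypothetical identifiers for illustration; in the lab these come from
# the variables exported earlier.
PROJECT_ID=my-project
REGION=us-central1
DATASET_ID=dataset1
DICOM_STORE_ID=dicomstore1

# Base path: identifies the store itself (sufficient for administrative
# operations such as create, describe, or delete).
BASE="https://healthcare.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/datasets/${DATASET_ID}/dicomStores/${DICOM_STORE_ID}"

# Modality-specific request path: identifies the data within the store.
REQUEST="${BASE}/dicomWeb/studies"

echo "$REQUEST"
```

A GET against the composed URL (with an Authorization header) performs a DICOMweb studies search against that store.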

Click Check my progress to verify the objective. Create data stores

Note: If this check fails, wait a minute and try again. It often takes a minute or two for the import operation to be logged.

Task 6. Import to DICOM datasets

In this section you will be importing data from the NIH Chest x-ray data set to a DICOM store. For more information on the public dataset, visit the NIH Chest X-ray dataset documentation.

  • Call the API to use the import functionality:
gcloud beta healthcare dicom-stores import gcs $DICOM_STORE_ID --dataset=$DATASET_ID --location=$REGION --gcs-uri=gs://spls/gsp626/LungCT-Diagnosis/R_004/*

Click Check my progress to verify the objective. Import to DICOM Datasets

Task 7. Configure OHIF Viewer

The Open Health Imaging Foundation (OHIF) Viewer is an open source, web-based, medical imaging viewer. You will use OHIF Viewer in this lab to view your DICOM dataset.

The following steps will walk through setting up OHIF Viewer to view your dataset:

  1. First, select APIs & Services > OAuth Consent Screen from the Navigation menu to create an OAuth Consent screen:

The highlighted navigation path to the OAuth Consent Screen option

  2. At the OAuth Consent Screen, select Internal and click Create:

The User Type section displaying the selected Internal option and highlighted Create button

  3. Fill out the following on the Edit app registration window:
  • App name: QL-de-identify
  • User support email: YOUR STUDENT EMAIL (this is provided by the lab)
  • Developer contact information: YOUR STUDENT EMAIL (same value as user support email)

The populated Edit app registration window

  4. Click Save and Continue.

  5. At the Scopes tab, click the Add or Remove Scopes button.

  6. Scroll to the bottom of the pop-up window to the Manually add scopes section.

  7. Add the following scopes:

The Manually add scopes section displaying the aforementioned scopes

  8. Click Add to table and then click Update.

  9. Scroll to the bottom of the Scopes tab and click Save and Continue.

Next, you'll need an OAuth Client ID to connect OHIF Viewer to your Cloud Healthcare resources.

  1. Select Credentials from the APIs & Services menu:

The Credentials option selected on the APIS and Services menu

  2. In the Credentials page, click + Create Credentials > OAuth Client ID:

The expanded Create Credentials menu displaying the selected OAuth Client ID option

  3. For your Application Type, choose Web application.

You will need to return to your client ID and fill out the domains once your OHIF Viewer application has been launched.

  4. For now, leave everything as default and click Create.

You'll now see your Client ID and Client Secret in the next window.

  5. Click OK to close the window.

Now, deploy the OHIF Viewer container to Cloud Run and connect it with your OAuth Client ID.

To simplify the setup, the OHIF Viewer docker image already exists in container registry in a project you have access to, so you can directly deploy the container to Cloud Run.

  1. In Cloud Shell, deploy the OHIF Viewer container to Cloud Run with this command, substituting PASTE-CLIENT-ID-HERE with the Client ID of the OAuth Client you just created:

gcloud run deploy ohif-viewer \
  --platform=managed \
  --region={{{ project_0.default_region | "REGION" }}} \
  --allow-unauthenticated \
  --set-env-vars=CLIENT_ID=[PASTE-CLIENT-ID-HERE] \
  --max-instances=3

Note: You can view and copy your Client ID in the Credentials tab:

The copy paste icon alongside the client ID on the OAuth 2.0 Client IDs page
  2. If asked to enable the Cloud Run API, enter y and continue.

Once your Cloud Run deployment completes, you will be given a unique service URL that looks similar to this:

Service URL:
  3. You can now return to your OAuth Client ID and update the domains with this Service URL.

  4. If you're not still on the Credentials page, select APIs & Services > Credentials from the Navigation Menu in your Cloud Console.

  5. Edit your Client ID by clicking the pencil icon.

The pencil icon highlighted alongside the client ID on the OAuth 2.0 Client IDs page

  6. Add your unique service URL to Authorized JavaScript origins.

  7. Add your unique service URL + /callback to Authorized redirect URIs.

  8. Click Save.

Task 8. Using De-identification

De-identification (redacting or transformation) of sensitive data elements is often an important step in pre-processing healthcare data so that it can be made available for analysis, machine learning models, and other use cases. Cloud Healthcare API has the capability to de-identify data stored in the service, facilitating analysis by researchers or machine learning analysis for advanced anomaly scans.

  1. First, navigate to the service URL of your ohif-viewer Cloud Run app and sign in using your lab credentials. If you've lost track of your service URL, you can find it again with this command:
gcloud run services list --platform managed
  2. Once on the OHIF-Viewer page, select your Project ID for the Project.

The Google Cloud Healthcare API window displaying the list of Projects and their IDs

  3. Select the location.

  4. Select dataset1 for your dataset.

  5. Select dicomstore1 in the DICOM Store window.

You'll see one entry, R_004 with info for its ID number, Study Date, and Description:

  6. Click on the entry to inspect it further and view the associated images.

  7. This dataset contains pre-surgery images of a chest. You can scroll through them to view them all.

  8. When you're done looking at it, press the Back button on your browser to return to the OHIF-Viewer main menu.

Next, you will de-identify this dataset.

  1. Navigate back to Cloud Shell and issue the following request to de-identify the dataset:
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  --data "{
    'destinationDataset': 'projects/$PROJECT_ID/locations/$REGION/datasets/de-id',
    'config': {
      'dicom': { 'filterProfile': 'ATTRIBUTE_CONFIDENTIALITY_BASIC_PROFILE' },
      'image': { 'textRedactionMode': 'REDACT_NO_TEXT' }
    }
  }" "https://healthcare.googleapis.com/v1/projects/$PROJECT_ID/locations/$REGION/datasets/$DATASET_ID:deidentify"

With our small dataset, this operation will be done quickly, but on a larger dataset this operation can take a few minutes.

  2. You can issue a REST request to check the status of a long-running operation, replacing <operation-id> with the operation ID issued in the previous output:

curl -X GET \
  "https://healthcare.googleapis.com/v1/projects/$PROJECT_ID/locations/$REGION/datasets/$DATASET_ID/operations/<operation-id>" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H 'Content-Type: application/json; charset=utf-8'

If you see "done": true in the output of the previous command, you can be sure that your operation is complete.

Once the operation is complete, a new de-id dataset will appear on the Healthcare UI page in the console.

  1. Confirm the identifiable information has been redacted by returning to your OHIF-Viewer browser tab and selecting the Change DICOM Store button.

  2. In the window that pops up, select your Qwiklabs Project ID as the Project.

  3. Select the location.

  4. Select de-id as the dataset.

  5. Select dicomstore1 for the DICOM Store.

You'll now see one entry in the DICOM Store, but the outward facing information/tags have been removed:

  6. Select the entry to confirm it's the same images copied from the previous dataset but with most of its information removed.

Click Check my progress to verify the objective. Using De-identification

Task 9. Converting DICOM Images

  1. From the Navigation menu, navigate to Cloud Storage > Buckets.

  2. Click Create bucket.

  3. Fill out the first box with a unique name and click Continue.

  4. Set the Location type to Region and select the region.

  5. Click Create.

  6. Using Cloud Shell, export the variable for your newly created bucket, replacing <name of bucket> with your bucket's name:

export BUCKET_ID=<name of bucket>

Now you can export the DICOM images into JPEG or PNG using a gcloud command.

  1. Export the DICOM images into JPEG:

gcloud beta healthcare dicom-stores export gcs $DICOM_STORE_ID --dataset=$DATASET_ID --gcs-uri-prefix=gs://$BUCKET_ID/ --mime-type="image/jpeg; transfer-syntax=1.2.840.10008.1.2.4.50" --location=$REGION

  2. Export the DICOM images into PNG:

gcloud beta healthcare dicom-stores export gcs $DICOM_STORE_ID --dataset=$DATASET_ID --gcs-uri-prefix=gs://$BUCKET_ID/ --mime-type="image/png" --location=$REGION
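After the exports finish, you can spot-check the results from the shell. The helper below is a sketch of our own, not part of the official lab; the gsutil listing is commented out because the object names depend on your bucket:

```shell
# has_ext FILE EXT -> succeeds when FILE ends in .EXT
has_ext() {
  case "$1" in
    *."$2") return 0 ;;
    *)      return 1 ;;
  esac
}

# List what the export wrote (uncomment in Cloud Shell):
# gsutil ls "gs://${BUCKET_ID}/**"

# Example spot checks on hypothetical exported object names:
has_ext "study1/series2/instance3.jpg" jpg && echo "looks like a JPEG export"
has_ext "study1/series2/instance3.png" png && echo "looks like a PNG export"
```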
  3. In the Console, from the Navigation menu, navigate to Cloud Storage and click on your bucket.

  4. Select a folder, click on an image, then click on the Link URL. This will download the image.

  5. You can check the file extension to verify your file is correct, or click the image to view it.

Click Check my progress to verify the objective. Converting DICOM Images

Lab review

Cloud Healthcare API provides a comprehensive facility for ingesting, storing, managing, and securely exposing healthcare data in FHIR, DICOM, and HL7 v2 formats. Using Cloud Healthcare API, you can ingest and store data from electronic health records systems (EHRs), radiological information systems (RISs), and custom healthcare applications. You can then immediately make that data available to applications for analysis, machine learning prediction and inference, and consumer access.

Cloud Healthcare API enables application access to healthcare data via widely-accepted, standards-based interfaces such as FHIR STU3 and DICOMweb. These APIs allow data ingestion into modality-specific data stores, which support data retrieval, update, search and other functions using familiar standards-based interfaces.

Further, the API integrates with other capabilities in Google Cloud through two primary mechanisms:

  • Cloud Pub/Sub, which provides near-real-time updates when data is ingested into a Cloud Healthcare API data store, and
  • Import/export APIs, which allow you to integrate Cloud Healthcare API into both Google Cloud Storage and Google BigQuery.

Using Cloud Pub/Sub with Google Cloud Functions enables you to invoke machine learning models on healthcare data, storing the resulting predictions back in Cloud Healthcare API data store. A similar integration with Cloud Dataflow supports transformation and cleansing of healthcare data prior to use by applications.

To support healthcare research, Cloud Healthcare API offers de-identification capabilities for FHIR and DICOM. This feature allows customers to share data with researchers working on new cutting-edge diagnostics and medicines.


In this lab you:

  • Gained a general understanding of Cloud Healthcare API and its role in managing healthcare data.
  • Learned how to create Cloud Healthcare API datasets and DICOM stores.
  • Imported and exported DICOM data using the Cloud Healthcare API.

Finish your quest

This self-paced lab is part of the Cloud Healthcare API quest. A quest is a series of related labs that form a learning path. Completing this quest earns you a badge to recognize your achievement. You can make your badge or badges public and link to them in your online resume or social media account. Enroll in any quest that contains this lab and get immediate completion credit. Refer to the Google Cloud Skills Boost catalog to see all available quests.

Take your next lab

Continue your quest with Ingesting FHIR Data with the Healthcare API.

Manual Last Updated October 26, 2023

Lab Last Tested October 27, 2023

Copyright 2024 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.