Getting Started with Splunk Cloud GDI on Google Cloud


1 hour 30 minutes 5 Credits

This lab was developed with our partner, Splunk. Your personal information may be shared with Splunk, the lab sponsor, if you have opted in to receive product updates, announcements, and offers in your Account Profile.

Note: this lab requires a partner trial account. Please follow the lab instructions to create your trial account before starting the lab.




In this hands-on lab you'll learn how to configure Google Cloud to send logging and other infrastructure data to Splunk Cloud via Dataflow, the Splunk Add-on for Google Cloud Platform, and Splunk Connect for Kubernetes (SC4K).

Although you can easily copy and paste commands from the lab to the appropriate place, students should type the commands themselves to reinforce their understanding of the core concepts.


In this lab, you will:

  • Create a Splunk Cloud trial

  • Install the Splunk Add-on for Google Cloud Platform (GCP-TA)

  • Create Splunk indexes

  • Create Splunk HTTP Event Collectors (HECs)

  • Create log sinks

  • Create Cloud Storage buckets

  • Create Pub/Sub topics and subscriptions

  • Launch a Dataflow template deployment

  • Configure GCP-TA inputs

  • Perform sample Splunk searches across ingested data

  • Monitor and troubleshoot Dataflow pipelines

  • Deploy a demo "Online Boutique" microservice in GKE (optional)

  • Install Splunk Connect for Kubernetes (SC4K) (optional)


Prerequisites

  • Familiarity with Splunk is beneficial

Architecture you'll configure



Splunk Cloud trial

Prior to starting the timer on this lab, please create and configure a Splunk Cloud trial.

Sign up for a free trial account

Visit the Splunk website to sign up for a free trial account.


Access your trial account

Once you have logged in, click "Free Splunk." Then click "Access Free 14-day Trial" and then click "Start trial."

Log in to your Splunk Cloud environment

Credentials for your trial Splunk Cloud environment will be sent to your email. Please check your spam folder if you do not see the email in your inbox.


Once you’ve logged into your Splunk Cloud instance, you will be asked to change the password and accept the terms of service.


Install the Splunk Add-on for Google Cloud

Next, install the Splunk Add-on for Google Cloud. On the left pane click on "+ Find More Apps"


Search for "Google Cloud" in the search box. To narrow search results, ensure "IT Operations" and "Business Analytics" are checked under Category. Choose "Add-on" under App Type, "Splunk" under Support Type, "Inputs" under App Content, and "Yes" under Fedramp.


Look for the "Splunk Add-on for Google Cloud Platform" search result under Best Match. Click on the Install button.


You will be required to enter your credentials. This should be the same account you used to initiate the Splunk Cloud trial. Ensure you have placed a check in the box indicating you have reviewed the Splunk software terms and conditions. Click the "Login and Install" button to proceed.


Once you have logged in successfully, you will see a screen indicating that the add-on is downloading and installing.

If you encounter an error with your username/password ensure that you have verified your e-mail address (a confirmation is sent via e-mail to confirm the address).


Finally, you will see a screen indicating the installation is complete. Click "Done."


You may see a warning in the "Messages" dropdown section of the top navigation bar indicating that Splunk must be restarted. For the purposes of this lab, you can safely ignore this message.


Create Indexes

An index is a repository for Splunk data. Splunk Cloud transforms incoming data into events which it then persists to an index.

You will need to create the following event indexes for this lab:

  • gcp_data - For data from Splunk Dataflow template
  • gcp_ta - For data from the Splunk Add-on for Google Cloud
  • gcp_connect - For data from Splunk Connect for Kubernetes (SC4K)

You will need to create the following metric indexes for this lab:

  • gcp_metrics - For metric data from Splunk Connect for Kubernetes (SC4K)

To create an index, start by selecting the "Search and Reporting App" in the top left navigation bar.


Next, click on "Settings." Once this menu is expanded, click "Indexes."


Click "New Index."


Use an "Index name" of "gcp_data" and leave "Index Data Type" set as "Events." Set "Max raw data size" to "0" and "Searchable retention (days)" to "15." Click "Save" to create the index.


Following the same steps you used for gcp_data, create the gcp_ta and gcp_connect indexes. Use the same index data type, maximum raw data size, and searchable retention.

You will also need to create a metrics index for gcp_metrics. The steps are the same as previous indexes, with the exception of selecting a "Metrics" index rather than an "Events" index during creation.


Create HECs

The HTTP Event Collector (HEC) is a fast and efficient way to send data over HTTP (or HTTPS) to Splunk Cloud from a logging source such as Splunk Connect for Kubernetes (SC4K) or the Splunk Dataflow template. In this section, you will create HEC endpoints along with corresponding authentication tokens.


Click "Settings > Data Inputs" in the Splunk Cloud top navigation.


Click on "Add New" next to HTTP Event Collector.


Name the HEC "gcp-sc4k" and leave other fields as default. Click Next.


In the "Selected Allow Indexes" chooser, select the gcp_connect and gcp_metrics indexes created in the previous step. Select the gcp_connect index as the default index.


Review and submit the HEC configuration.


Copy the token value to a temporary scratch file. You will be using this token later in the lab.


Dataflow HEC

Click "Settings > Data Inputs" in the Splunk Cloud top navigation.


Click on "Add New" next to HTTP Event Collector.


Name the HEC "gcp-dataflow" and leave other fields as default. Click Next.


In the "Source type" section, click "Select" and specify google:gcp:pubsub:message as the source type.

In the "Selected Allow Indexes" chooser, select the gcp_data index. Select the gcp_data index as the default index.


Review and submit the HEC configuration.


Copy the token value to a temporary scratch file. You will be using this token later in the lab.
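Once a token exists, one way to check that the collector accepts events is to post a test event to the HEC event endpoint (/services/collector/event) with the token in an "Authorization: Splunk <token>" header. The sketch below only builds the JSON body and prints the curl command rather than running it. The hostname and token shown are placeholders, and the helper names hec_event_json and hec_curl_command are made up for this illustration. The -k flag skips certificate validation, which this lab also does later for Dataflow.

```shell
# Build the JSON body for a single HEC event.
# Arguments: message text, index name, source type.
# Assumes the arguments contain no double quotes or backslashes.
hec_event_json() {
  printf '{"event": "%s", "index": "%s", "sourcetype": "%s"}' "$1" "$2" "$3"
}

# Print (do not run) the curl command that would post the event.
# Arguments: Splunk hostname, HEC token, JSON body.
hec_curl_command() {
  printf "curl -k 'https://%s:8088/services/collector/event' -H 'Authorization: Splunk %s' -d '%s'\n" "$1" "$2" "$3"
}

# Placeholder values: replace with your own hostname and token.
body="$(hec_event_json "hello from the lab" gcp_data google:gcp:pubsub:message)"
hec_curl_command "example.splunkcloud.com" "00000000-0000-0000-0000-000000000000" "$body"
```

If you run the printed command with your real hostname and token, a response of {"text":"Success","code":0} indicates that the event was accepted.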


Qwiklabs setup

Before you click the Start Lab button

Read these instructions. Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources will be made available to you.

This hands-on lab lets you do the lab activities yourself in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials that you use to sign in and access Google Cloud for the duration of the lab.

To complete this lab, you need:

  • Access to a standard internet browser (Chrome browser recommended).
Note: Use an Incognito or private browser window to run this lab. This prevents conflicts between your personal account and the Student account, which could otherwise cause extra charges to be incurred on your personal account.
  • Time to complete the lab. Remember, once you start, you cannot pause a lab.
Note: If you already have your own personal Google Cloud account or project, do not use it for this lab to avoid extra charges to your account.

How to start your lab and sign in to the Google Cloud Console

  1. Click the Start Lab button. If you need to pay for the lab, a pop-up opens for you to select your payment method. On the left is the Lab Details panel with the following:

    • The Open Google Console button
    • Time remaining
    • The temporary credentials that you must use for this lab
    • Other information, if needed, to step through this lab
  2. Click Open Google Console. The lab spins up resources, and then opens another tab that shows the Sign in page.

    Tip: Arrange the tabs in separate windows, side-by-side.

    Note: If you see the Choose an account dialog, click Use Another Account.
  3. If necessary, copy the Username from the Lab Details panel and paste it into the Sign in dialog. Click Next.

  4. Copy the Password from the Lab Details panel and paste it into the Welcome dialog. Click Next.

    Important: You must use the credentials from the left panel. Do not use your Google Cloud Skills Boost credentials. Note: Using your own Google Cloud account for this lab may incur extra charges.
  5. Click through the subsequent pages:

    • Accept the terms and conditions.
    • Do not add recovery options or two-factor authentication (because this is a temporary account).
    • Do not sign up for free trials.

After a few moments, the Cloud Console opens in this tab.

Note: You can view the menu with a list of Google Cloud Products and Services by clicking the Navigation menu at the top-left.

Activate Cloud Shell

Cloud Shell is a virtual machine that is loaded with development tools. It offers a persistent 5GB home directory and runs on Google Cloud. Cloud Shell provides command-line access to your Google Cloud resources.

  1. In the Cloud Console, in the top right toolbar, click the Activate Cloud Shell button.

  2. Click Continue.

It takes a few moments to provision and connect to the environment. When you are connected, you are already authenticated, and the project is set to your PROJECT_ID. The output contains a line that declares the PROJECT_ID for this session:

Your Cloud Platform project in this session is set to YOUR_PROJECT_ID

gcloud is the command-line tool for Google Cloud. It comes pre-installed on Cloud Shell and supports tab-completion.

  3. (Optional) You can list the active account name with this command:

gcloud auth list


ACTIVE: *
ACCOUNT:
To set the active account, run:
    $ gcloud config set account `ACCOUNT`

  4. (Optional) You can list the project ID with this command:

gcloud config list project


[core]
project = <project_ID>

(Example output)

[core]
project = qwiklabs-gcp-44776a13dea667a6

For full documentation of gcloud, in Google Cloud, Cloud SDK documentation, see the gcloud command-line tool overview.

Setting Environment Variables


This section is only required if you're performing lab steps via the CLI

  1. Launch Cloud Shell.

  2. Ensure the Project ID variable is set:


If the environment variable is not set, follow the steps above under "Activate Cloud Shell."

  3. Set the environment variables.

First, assign several Splunk Cloud-specific environment variables. You will need to supply the hostname of your Splunk Cloud instance along with the HEC tokens you created in the previous HTTP Event Collector setup steps.


For example:

export SPLUNK_HOSTNAME=<your-splunk-cloud-hostname>
export SC4K_HEC_TOKEN=bc77efcf-fc60-494f-b80c-52701d7901d4
export DATAFLOW_HEC_TOKEN=bf4bae6f-f9c8-4f5c-b349-3cf77c9baa16

Additionally, please set the following environment variables:

# Common
export SINK_NAME=splunk-dataflow-sink-cli
export SINK_TOPIC=splunk-dataflow-sink
export DISABLE_CERT_VALIDATION=true

# Dataflow
export DEADLETTER_TOPIC=splunk-dataflow-deadletter
export DATAFLOW_SUB=dataflow-sub
export DEADLETTER_SUB=deadletter-sub
export MAX_WORKERS=4
export MACHINE=n1-standard-1
export HEC_URL=https://${SPLUNK_HOSTNAME}:8088
export BATCH_COUNT=10
export PARALLELISM=4
export DATAFLOW_FORMAT_LIKE_PUBSUB=true

# GCP-TA
export SPLUNK_SERVICE_ACCOUNT=splunk-ta
export TA_SUBSCRIPTION=ta-subscription
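Because later commands silently produce wrong resource names when a variable is empty, it can help to confirm that everything is set before continuing. The following is a minimal sketch, not part of the original lab. The helper name missing_vars is made up for this illustration, and the list of names checked is only a sample of the variables defined above.

```shell
# Print the name of every variable in the argument list that is unset or empty.
missing_vars() {
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

# Check a sample of the variables this lab relies on.
missing="$(missing_vars GOOGLE_CLOUD_PROJECT SPLUNK_HOSTNAME SC4K_HEC_TOKEN DATAFLOW_HEC_TOKEN SINK_NAME SINK_TOPIC HEC_URL)"
if [ -n "$missing" ]; then
  echo "Not set:"
  echo "$missing"
else
  echo "All checked variables are set."
fi
```

Any name printed under "Not set:" needs to be exported again in the current Cloud Shell session, since variables do not survive a new session.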

Creating a Log Sink

The first step in getting data from Operations Logging (Stackdriver) to Splunk is to create a log sink. All logging data for Google Cloud is sent to Operations Logging; the sink exports that data in real time to another location (Pub/Sub, BigQuery, or Cloud Storage). You will forward the logs to Pub/Sub for processing.

Be careful not to create an infinite logging loop. Without an exclusion on the log sink, forwarding an event generates a new log event, which the sink then tries to forward, which generates yet another log event, and so on.

This process also creates the destination Pub/Sub topic (automatically in the UI, manually via the CLI).

Cloud Console

  1. In the Cloud Console go to Analytics > Pub/Sub > Topics.


  2. Click on Create Topic.
  3. Name the topic splunk-dataflow-sink, leave the default values, and click Create Topic.


  4. Next, in the Cloud Console go to Operations > Logging > Logs Router.
  5. Click on Create Sink.
  6. Name the sink splunk-dataflow-sink and click Next.


  7. Specify Splunk as the sink service, select the topic that you created above, and then click Next.


  8. Leave the inclusion filter blank in order to send all logs to Splunk unless excluded. Click Next.


  9. Click Add Exclusion to specify an exclusion filter to omit Dataflow logs.


  10. Once complete, click on Create Sink.


Note: You don't need to do this section if you've performed the previous steps via the Console UI.
  1. In Cloud Shell (with the environment variables set), create the Pub/Sub topic:

gcloud pubsub topics create ${SINK_TOPIC}
  1. Create the Log Sink

gcloud logging sinks create ${SINK_NAME} \
  pubsub.googleapis.com/projects/${GOOGLE_CLOUD_PROJECT}/topics/${SINK_TOPIC} \
  --log-filter="resource.type!=\"dataflow_step\""
  1. Set the environment variable for the correct service account

export SERVICE_ACCOUNT=`gcloud logging sinks describe ${SINK_NAME} --format="value(writerIdentity)"`
  1. Modify the IAM permissions of Pub/Sub topic to allow the log sink to publish

gcloud pubsub topics add-iam-policy-binding ${SINK_TOPIC} \
  --member="${SERVICE_ACCOUNT}" --role="roles/pubsub.publisher"

Deploying the Pub/Sub to Splunk Dataflow Template

The next step in sending logs to Splunk is to deploy the Pub/Sub to Splunk Dataflow template. This deploys a Dataflow pipeline that streams the events from a Pub/Sub subscription, batches them up, and sends them to Splunk HEC. Optionally (although not done in this lab), you can add an inline user-defined function (UDF) that manipulates the log messages, for example to remove sensitive information or to augment the message with additional data from another source.
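To make the idea of batching concrete, the sketch below groups a stream of events into batches of a fixed size, which is the role the batch count setting plays later in this lab. This is only an illustration of the concept and not how the Dataflow template is implemented. The helper name batch_lines and the sample events e1 through e5 are made up.

```shell
# Group lines from standard input into batches of at most $1 lines.
# Each batch is printed on one output line, members separated by a space.
# Assumes no input line is empty.
batch_lines() {
  awk -v n="$1" '
    { buf = (buf == "" ? $0 : buf " " $0); count++ }
    count == n { print buf; buf = ""; count = 0 }
    END { if (count > 0) print buf }
  '
}

# Five sample events in batches of two: the last batch is partial.
printf '%s\n' e1 e2 e3 e4 e5 | batch_lines 2
```

With a batch size of 2, the five sample events produce three batches, so only three requests would be needed instead of five. Larger batches mean fewer requests to HEC at the cost of slightly delayed delivery.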

The Pub/Sub to Splunk Dataflow template is just one way of sending data to Splunk. In a later section you will also explore using the Splunk Add-on for Google Cloud.

Enable Dataflow Service Account

  1. Go to Navigation menu > IAM & Admin > IAM.

  2. Click the pencil icon on the service account.


  1. Add the Dataflow Admin role and click Save.


Cloud Console

Now you will create a bucket for the Dataflow template used during deployment.

  1. Go to Navigation menu > Cloud Storage > Browser.

  2. Click on Create Bucket.

  • Give your bucket a globally unique name (<project-id>-dataflow would be unique).
  • Select Region for Location type and choose 'us-central1' for the Location.
  • Click on Create (leaving the rest of the options as default).

Now you will create a Pub/Sub Topic for the dead letter queue.

Note: The dead letter queue is used to store events that are not processed successfully by the Dataflow pipeline. This will allow us to reprocess these events at a later time and will also be useful for troubleshooting the pipeline if any issues are encountered.
  1. Go to Navigation Menu > Analytics > Pub/Sub > Topics.
  • You'll see the one topic created for the logs router. Click Create Topic to create another.


  • Type splunk-dataflow-deadletter for Topic ID.
  • Leave the default values and click Create Topic.
  • You'll be forwarded to the topic page for the dead letter queue. Scroll down to subscriptions. Click on Create Subscription > Create Subscription to create a subscription to store items forwarded to the dead letter queue. If a topic doesn't have a subscription anything sent to it is discarded.


  • Type deadletter for Subscription ID and leave all else as default. Click Create.


Note: The "enable dead lettering" option on the subscription creation page is used when the subscription (usually a push) fails to send the message to its target. This is not to be confused with the dead letter Pub/Sub topic you created for Dataflow, which is used in the scenario where the Dataflow template is unable to send the data to Splunk HEC. Therefore the "enable dead lettering" option here can be ignored. The Dataflow pipeline to Splunk has built-in capability to handle message retries.
  1. Create the Pub/Sub subscription for Dataflow to process logs. The log sink step created the topic that receives the logs, but you must also create a subscription to that topic so that the messages in it are delivered to Dataflow.

  • Back under topics click on the initial Pub/Sub topic created (splunk-dataflow-sink) by the Log Router.


  • Scroll down on the topic page
  • Click on Create Subscription as you did for the dead letter topic
  • Type dataflow for Subscription ID and create the subscription for Dataflow (leave all other defaults). Click Create.


  • Click on the newly created subscription and note down the subscription name, as you will need it in a later step (the format should be projects/<your-project-id>/subscriptions/dataflow).

  1. Next, deploy the Dataflow Template. Go to Navigation menu > Analytics > Dataflow > Jobs.
  • Click on Create Job from Template.

  • Type splunk-dataflow for the Job name and select the Pub/Sub to Splunk template.


  • Enter the main required parameters:

Input Cloud Pub/Sub subscription




Output deadletter Pub/Sub topic


Temporary location



  • Click on Show Optional Parameters.

  • Set the parameters as shown:

    HEC Authentication token


    Batch size


    Maximum number of parallel requests


    Disable SSL certification validation


    Include full Pub/Sub message in the payload


    Max workers


  • Click on Run Job.
  • The job should take a few minutes to deploy. Once deployed, you can monitor the job throughput and other metrics using the Job Graph and Job Metrics tabs. For example, you can monitor the throughput metric to track the number of events processed over time.
Note: If you get a failure for insufficient quota ensure that you set the max-workers and machine-type to the values above. The default values are too large for the smaller lab environment.


Note: You don't need to do this section if you've performed the previous steps via the Console UI.
  1. Enable the Dataflow API

gcloud services enable dataflow.googleapis.com
  1. Create a bucket for the Dataflow template to use during deployment

gsutil mb -l us-central1 gs://${GOOGLE_CLOUD_PROJECT}-dataflow
  1. Create a Pub/Sub topic for the deadletter queue and a subscription

gcloud pubsub topics create ${DEADLETTER_TOPIC}
gcloud pubsub subscriptions create ${DEADLETTER_SUB} \
  --topic ${DEADLETTER_TOPIC}
  1. Create the Pub/Sub subscription for Dataflow to process logs

gcloud pubsub subscriptions create ${DATAFLOW_SUB} \
  --topic ${SINK_TOPIC}
  1. Deploy the Dataflow Template

gcloud dataflow jobs run splunk-dataflow-`date +%s` \
  --region us-central1 \
  --gcs-location=gs://dataflow-templates/latest/Cloud_PubSub_to_Splunk \
  --staging-location=gs://${GOOGLE_CLOUD_PROJECT}-dataflow/tmp \
  --max-workers=${MAX_WORKERS} \
  --worker-machine-type=${MACHINE} \
  --parameters="\
inputSubscription=projects/${GOOGLE_CLOUD_PROJECT}/subscriptions/${DATAFLOW_SUB},\
token=${DATAFLOW_HEC_TOKEN},\
url=${HEC_URL},\
outputDeadletterTopic=projects/${GOOGLE_CLOUD_PROJECT}/topics/${DEADLETTER_TOPIC},\
batchCount=${BATCH_COUNT},\
parallelism=${PARALLELISM},\
includePubsubMessage=${DATAFLOW_FORMAT_LIKE_PUBSUB},\
disableCertificateValidation=${DISABLE_CERT_VALIDATION}"
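The --parameters value in the command above is a single comma-separated list of key=value pairs, and the subscription and topic are given as full resource names. If you want to inspect that string before launching a job, it can be assembled with a few small helpers. This is a sketch only: the function names join_params, subscription_path, and topic_path are made up for this illustration, and my-project is a placeholder project ID.

```shell
# Join all arguments with commas, the shape the --parameters flag expects.
join_params() {
  out=""
  for kv in "$@"; do
    if [ -z "$out" ]; then out="$kv"; else out="$out,$kv"; fi
  done
  printf '%s\n' "$out"
}

# Full Pub/Sub resource names. Arguments: project ID, resource ID.
subscription_path() { printf 'projects/%s/subscriptions/%s\n' "$1" "$2"; }
topic_path() { printf 'projects/%s/topics/%s\n' "$1" "$2"; }

# Example with placeholder values.
join_params \
  "inputSubscription=$(subscription_path my-project dataflow-sub)" \
  "outputDeadletterTopic=$(topic_path my-project splunk-dataflow-deadletter)" \
  "batchCount=10" \
  "parallelism=4"
```

Printing the string first makes a missing or empty variable easy to spot, because it shows up as an empty segment such as projects//subscriptions/.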

Configure the Splunk TA for Google Cloud

Create a Service Account for Splunk TA

In order to connect the Splunk Add-on for Google Cloud (TA) to Google Cloud to pull data, create a service account with appropriate permissions. In this lab you are providing the service account an exhaustive list of permissions as you are connecting all input methods. If you're using the TA only for a few of the inputs, a reduced list of permissions can be set.

Cloud Console

  1. Go to Navigation menu > IAM & Admin > Service Accounts.
  1. Click on Create Service Account.
  1. Type splunk-ta for the account name and click on Create & Continue.


  1. Add the permissions needed (you can search to make it easier to find them):

  • Compute Admin

  • Logs Configuration Writer

  • Logs Viewer

  • Monitoring Viewer

  • Storage Admin

  • Storage Object Viewer

  • Viewer

  • Pub/Sub Viewer

  1. Click Continue.

  2. Click Done.

  3. Click on your newly provisioned service account. Click the Keys tab on the top bar.

  4. Click Add Key > Create new key.

  5. Leave the default JSON type selected and click Create. This will download the JSON key file to your system - store this for later.



NOTE: You don't need to do this section if you've performed the previous steps via the Console UI.

  1. Create the service account.

gcloud iam service-accounts create ${SPLUNK_SERVICE_ACCOUNT} --description "Splunk account for TA"
  1. Give the correct permissions.

# Grant each role the TA needs to the splunk-ta service account.
for role in compute.admin storage.admin storage.objectViewer logging.configWriter monitoring.viewer logging.viewer viewer pubsub.viewer; do
  gcloud projects add-iam-policy-binding ${GOOGLE_CLOUD_PROJECT} \
    --member=serviceAccount:${SPLUNK_SERVICE_ACCOUNT}@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com \
    --role=roles/${role}
done
  1. Create and download the service key JSON.

gcloud iam service-accounts keys create \
  ${SPLUNK_SERVICE_ACCOUNT}.json \
  --iam-account=${SPLUNK_SERVICE_ACCOUNT}@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com
cat ${SPLUNK_SERVICE_ACCOUNT}.json

Add JSON credential to GCP-TA

  1. Open the Splunk Cloud console in a new browser tab.
  2. Under Apps, select Splunk Add-on for Google Cloud Platform.
  3. Click on Configuration.
  4. Under the Google Credentials tab, click on Add.
  5. Type gcp_creds for the name and paste the credentials from the JSON file that was downloaded when you created the service account, then click Add.


If you encounter an invalid JSON error, verify that you haven't pasted any additional carriage returns (line endings) into the private key. Copying the text from a terminal can introduce extra line breaks. The private_key value should be a single line, which will wrap in the UI.
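Before pasting, you can check for this particular problem from a shell. The sketch below is a heuristic for the single-line private_key issue only, not a full JSON validation. The function name private_key_on_one_line is made up for this illustration, and the two demonstration files contain a fake, truncated key.

```shell
# Return 0 if the private_key value sits on one physical line, 1 otherwise.
# Argument: path to the JSON key file.
private_key_on_one_line() {
  grep -q '"private_key": *"-----BEGIN PRIVATE KEY-----.*-----END PRIVATE KEY-----' "$1"
}

# Demonstration with two throwaway files (fake key material).
good="$(mktemp)"
bad="$(mktemp)"
printf '%s\n' '{' '  "private_key": "-----BEGIN PRIVATE KEY-----\nFAKE\n-----END PRIVATE KEY-----\n"' '}' > "$good"
printf '%s\n' '{' '  "private_key": "-----BEGIN PRIVATE KEY-----\nFA' 'KE\n-----END PRIVATE KEY-----\n"' '}' > "$bad"

private_key_on_one_line "$good" && echo "good: key is on one line"
private_key_on_one_line "$bad" || echo "bad: key is split across lines"
rm -f "$good" "$bad"
```

To check your own file, call the function with the path to the key you downloaded or created. If it reports a split key, copy the contents from the file itself rather than from wrapped terminal output.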

Explore GCP-TA Inputs

  1. Click on Inputs.

  2. Click on Create New Input.


You should see the different options, including Cloud Pub/Sub, Cloud Monitoring, Google Cloud BigQuery Billing, Cloud Storage Bucket, and Resource Metadata. Their purposes are as follows:

  • Cloud Pub/Sub: Logging events and other Pub/Sub generated events.
  • Cloud Monitoring: Metrics such as CPU and disk usage of instances.
  • Google Cloud BigQuery Billing: Pulls billing information from a Cloud Storage bucket. Note that this doesn't currently work for everyone due to the decommissioning of File Export for Billing (only billing accounts with a previous CSV configuration will work).
  • Cloud Storage Bucket: Pulls data from a Cloud Storage bucket, such as application logs, but the data could be any CSV, JSON, or raw text.
  • Resource Metadata: Information on the resources in an organization/project.

As you can see the Splunk Add-on for Google Cloud supports ingesting a variety of data sources from Google Cloud. You've already set up Dataflow to send Google Cloud logs to Splunk, so next you'll configure the Splunk TA to pull:

  1. Resource Metadata

  2. Cloud Monitoring

Resource Metadata

The Resource Metadata input can be configured to pull metadata from various Compute Engine resources and enable Splunk users to monitor and set up analytics for their Compute Engine deployments.

  1. Click Create New Input > Resource Metadata.

  2. Configure a Resource Metadata input with the following (again, project will be unique):




select the gcp_creds credentials that you created earlier




us-central1-a, b, c, f


leave all checked




keep default

Your configuration should resemble the following:


  1. Click Add.

Cloud Monitoring

Cloud Monitoring collects metrics from a wide range of services on Google Cloud, as well as a variety of third-party software. A complete list of all predefined metrics can be found here. If you need something that isn't already defined, you can create your own custom metrics.

  1. Click Create New Input > Cloud Monitoring.

  2. Configure Cloud Monitoring Input with the following parameters




select the gcp_creds credentials that you created earlier



Cloud Monitor Metrics


keep default

Start Date Time

keep default



Your configuration should resemble the following:


  1. Click Add.

Bonus - Pub/Sub Input

This section requires usage of the CLI. Ensure the environment variables provided in the Setup section of this lab are present in your shell.

You can also ingest logs via the Pub/Sub input using the TA.

Note: See "Comparison of Methods" section below for a detailed comparison of ingestion methods
  1. To do this, run the following command in Cloud Shell to create a second subscription to the original Pub/Sub topic previously created. In this case, it's called ta-subscription.

gcloud pubsub subscriptions create ${TA_SUBSCRIPTION} \
  --topic ${SINK_TOPIC}
  1. Then you'll have to give the TA service account explicit subscriber access to the subscription you created.

gcloud pubsub subscriptions add-iam-policy-binding ${TA_SUBSCRIPTION} \
  --member=serviceAccount:${SPLUNK_SERVICE_ACCOUNT}@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com \
  --role=roles/pubsub.subscriber
  1. Finally, navigate back to the Splunk Cloud console.

  2. Click Create New Input > Cloud Pub/Sub.

  3. Fill out the configuration as follows to add a Cloud Pub/Sub input pointing to the subscription you created.




select the gcp_creds credentials that you created earlier



Pub/Sub Subscriptions




Your configuration should resemble the following:


Sample Searches

Now that you have various types of Google Cloud data hooked up to Splunk, take a look at some common Splunk searches you can use to get value out of this data.

  1. First, from the top navigation bar navigate to Apps > Search & Reporting.


Search queries in Splunk are composed using the Search Processing Language, more commonly referred to as SPL. SPL is a very powerful feature whose details aren't covered in this lab, but for a quick primer on SPL, see here.

Who is exporting JSON credential keys?

  1. Copy/paste the following in the search box:

index="gcp_data" data.resource.type="service_account" data.protoPayload.methodName="google.iam.admin.v1.CreateServiceAccountKey"
| rename data.protoPayload.authenticationInfo.principalEmail as "Principal Email"
| rename data.protoPayload.requestMetadata.callerIp as "Source IP"
| rename data.protoPayload.requestMetadata.callerSuppliedUserAgent as "User Agent"
| rename as "Key Name"
| rename data.protoPayload.response.valid_after_time.seconds as "Valid After"
| rename data.protoPayload.response.valid_before_time.seconds as "Valid Before"
| eval "Valid After"=strftime('Valid After', "%F %T")
| eval "Valid Before"=strftime('Valid Before', "%F %T")
| eval "Private Key Type" = case('data.protoPayload.request.private_key_type' == 0, "Unspecified", 'data.protoPayload.request.private_key_type' == 1, "PKCS12", 'data.protoPayload.request.private_key_type' == 2, "Google JSON credential file")
| table _time, "Principal Email", "Source IP", "User Agent", "Key Name", "Private Key Type", "Valid After", "Valid Before"

What service accounts have been created and by whom?

This SPL will generate a table of those events.

  1. Copy/paste the following in the search box:

index="gcp_data" data.resource.type="service_account" data.protoPayload.methodName="google.iam.admin.v1.CreateServiceAccount"
| rename data.protoPayload.authenticationInfo.principalEmail as "Principal Email"
| rename data.protoPayload.requestMetadata.callerIp as "Source IP"
| rename data.protoPayload.requestMetadata.callerSuppliedUserAgent as "User Agent"
| rename as "Service Account Email"
| rename data.protoPayload.response.project_id as Project
| table _time, "Principal Email", "Source IP", "User Agent", Project, "Service Account Email"

Display Instances in the Project

  1. Copy/paste the following in the search box:

index="gcp_ta" sourcetype="google:gcp:resource:metadata" | search(kind="compute#instance")

Monitor and Troubleshoot Pipelines

While Splunk can be used for most troubleshooting scenarios, there is one situation where that cannot be done: when Dataflow has failed to send data to Splunk. To troubleshoot and monitor in these situations you'll have to rely on the built in monitoring and logging in the Google Cloud console.

Monitoring Dataflow via Operations Monitor Dashboard

This section requires usage of the CLI. Ensure the environment variables provided in the Setup section of this lab are present in your shell.

  1. Navigate to Navigation menu > Monitoring > Dashboards in the Cloud Console.


  1. The first time the monitoring console is opened, it may prompt you either to create a new workspace (selected by default) or to add the project to an existing workspace. Accept the default selection and click Add.

  2. Once the workspace has been prepared open Cloud Shell and run the following to deploy a custom dashboard. You're free to edit and tweak the dashboards.

gsutil cp \
  gs://${GOOGLE_CLOUD_PROJECT}-dashboard/SplunkExportDashboard.json .
gcloud alpha monitoring dashboards create \
  --config-from-file=SplunkExportDashboard.json \
  --project=${GOOGLE_CLOUD_PROJECT}
  1. Go back to dashboards. You should see a dashboard in the list called Splunk Dataflow Export Monitor. If you don't, refresh the webpage.

  2. After some time you should see the metrics for the Dataflow and Pub/Sub jobs.

NOTE: Some graphs may show errors if opened too early or if there are no errors.

Finding Errors in Dataflow Jobs

  1. To see errors in Dataflow you can open up the Dataflow job by going to Navigation menu > Analytics > Dataflow > Jobs and click on the running job.


  1. At the bottom of the page you can click on either Job Logs or Worker Logs to see logging relating to either the deployment and running of the stream itself (Job Logs) or logging relating to the function of the workers individually (Worker Logs)


Comparison of Methods

In this lab you've seen two different ways to ingest logging data into Splunk. The following table compares each method.




Splunk Add-on for Google Cloud (TA)

  • No additional infrastructure needed in Google Cloud
  • Google Cloud cost minimized
  • Supported by Splunk
  • Supports all data types (assets, logging, metrics, etc.)
  • TA infrastructure must be scaled to handle ingestion volume
  • Data is not pushed from the sources but rather periodically pulled


Pub/Sub to Splunk Dataflow template

  • Supported by Google
  • Supports batching of messages to ease impact on HEC
  • Exponential backoff support to ease load on HEC
  • Data is pushed (fresher data in Splunk)
  • Only supports logging and asset data (today)
  • Operational management of Dataflow
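To make the exponential backoff idea concrete, here is a minimal shell sketch of the general pattern: retry a failing command and double the wait between attempts. It only illustrates the concept and is not how the Dataflow template implements it; the function name retry_with_backoff and its arguments are hypothetical.

```shell
# Run a command, retrying on failure with a delay that doubles each time.
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
# retry_with_backoff is a hypothetical helper used only for illustration.
retry_with_backoff() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    # Stop as soon as the command succeeds.
    "$@" && return 0
    # Give up once the attempt budget is spent.
    if [ "$attempt" -ge "$max" ]; then
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

With a budget of 5 attempts and an initial delay of 1 second, the waits between attempts are 1, 2, 4, and 8 seconds, which gives a busy HEC endpoint progressively more time to recover.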

Optional - Online Boutique Demo

This section requires the CLI. Ensure that the environment variables provided in the Setup section of this lab are set in your shell.
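A quick way to check is sketched below. The helper name check_lab_vars is hypothetical; the variable names in the usage example are the ones referenced by the commands later in this section.

```shell
# Print the name of every listed variable that is unset or empty.
# Returns 0 when all are set, 1 otherwise.
# check_lab_vars is a hypothetical helper, not part of the lab.
check_lab_vars() {
  missing=0
  for name in "$@"; do
    # Look the variable up by name; eval keeps this portable to POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

For example: `check_lab_vars GOOGLE_CLOUD_PROJECT SPLUNK_HOSTNAME SC4K_HEC_TOKEN DISABLE_CERT_VALIDATION`. If anything is reported missing, set it again as described in the Setup section before continuing.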

The Online Boutique Demo deploys numerous microservices and a simulated workload that generates realistic log entries you'll be able to inspect in Splunk. See the GitHub repo for full details of what is deployed.

Build GKE cluster

  1. Enable required services

gcloud services enable
gcloud services enable
gcloud services enable
  2. Create a GKE cluster and verify node creation. This process can take about 2 minutes.

gcloud container clusters create demo \
  --enable-autoupgrade --enable-autoscaling \
  --min-nodes=2 --max-nodes=4 --num-nodes=3 \
  --machine-type=n1-standard-4 --zone=us-central1-a
kubectl get nodes
  3. Configure gcloud for Docker authentication.

gcloud auth configure-docker -q

Install boutique

  1. Clone the Online Boutique shop demo repository.

git clone
cd microservices-demo

  2. Deploy using pre-built container images.

kubectl apply -f ./release/kubernetes-manifests.yaml
  3. Get the external IP of the Online Boutique once deployed. The deployment can take a few minutes to fully spin up and show the external IP.

kubectl get service/frontend-external
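Because the external IP can take a few minutes to appear, you may prefer to poll for it rather than rerun the command by hand. The following is a sketch only: wait_for_frontend_ip is a hypothetical helper, and it assumes the load balancer reports an IP address (not a hostname) in the Service status.

```shell
# Poll until the frontend-external Service reports a load balancer IP.
# Arguments: max attempts (default 30), delay between attempts in seconds (default 10).
# wait_for_frontend_ip is a hypothetical helper, not part of the lab.
wait_for_frontend_ip() {
  attempts=${1:-30}
  delay=${2:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # jsonpath prints nothing until the load balancer has been assigned.
    ip=$(kubectl get service frontend-external \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for external IP" >&2
  return 1
}
```

Calling `wait_for_frontend_ip` prints the IP once it is assigned; open that address in a browser to view the Online Boutique.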


Install Splunk Connect for Kubernetes (SC4K)

In the next few steps you will configure Splunk Connect for Kubernetes to send data to HEC. While much of this logging information is available via the standard Google Cloud logging, SC4K can provide deeper insight and allow Kubernetes visibility outside of Google Cloud.

  1. First, create a namespace for the SC4K pods.

kubectl create namespace splunk
  2. Add the SC4K repo to Helm.

helm repo add splunk
  3. Create a YAML file for the SC4K configuration.

cat << EOF > values.yaml
global:
  splunk:
    hec:
      host: ${SPLUNK_HOSTNAME}
      port: 8088
      token: ${SC4K_HEC_TOKEN}
      protocol: https
      indexName: gcp_connect
      insecureSSL: ${DISABLE_CERT_VALIDATION}
  kubernetes:
    clusterName: "demo"
  prometheus_enabled: true
splunk-kubernetes-logging:
  containers:
    logFormatType: cri
    logFormat: "%Y-%m-%dT%H:%M:%S.%NZ"
splunk-kubernetes-metrics:
  splunk:
    hec:
      indexName: gcp_metrics
EOF

  4. Install Splunk Connect for Kubernetes via the Helm chart.

helm install splunk-connect \
  --namespace splunk \
  -f values.yaml \
  splunk/splunk-connect-for-kubernetes
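To confirm the SC4K pods came up, you can list any pods in the splunk namespace that are not yet running. This is a sketch rather than part of the original lab: sc4k_not_running is a hypothetical helper, and it assumes the default kubectl column layout in which STATUS is the third column.

```shell
# Print the name and status of every pod in the splunk namespace
# whose status is neither Running nor Completed.
# sc4k_not_running is a hypothetical helper, not part of the lab.
sc4k_not_running() {
  kubectl get pods --namespace splunk --no-headers 2>/dev/null \
    | awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}
```

No output means every pod kubectl listed is Running or Completed. If a pod stays in a state such as CrashLoopBackOff, `kubectl logs --namespace splunk POD_NAME` is a reasonable first place to look, for example for HEC host or token problems in values.yaml.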

Explore boutique logs

Data should now be streaming into Splunk. Here are a couple of sample searches:

  1. Average response time by request path:

index="gcp_connect" http.req.method="GET" earliest=-15m | search http.resp.status=200 | timechart avg(http.resp.took_ms) by http.req.path
  2. See the types of metrics that are being reported:

| mcatalog values(_dims) WHERE "index"="gcp_metrics" GROUPBY metric_name index | rename values(_dims) AS dimensions | table metric_name dimensions


In this lab you installed the Splunk Add-on for Google Cloud Platform and created Splunk indexes, HTTP Event Collectors (HECs), log sinks, Cloud Storage buckets, and Pub/Sub topics and subscriptions. You then launched a Dataflow template deployment, configured GCP-TA inputs, performed sample Splunk searches across ingested data, and monitored and troubleshot Dataflow pipelines.

Next Steps / Learn More

Check out the following for more information on Splunk with Google Cloud:

Google Cloud Training & Certification

...helps you make the most of Google Cloud technologies. Our classes include technical skills and best practices to help you get up to speed quickly and continue your learning journey. We offer fundamental to advanced level training, with on-demand, live, and virtual options to suit your busy schedule. Certifications help you validate and prove your skill and expertise in Google Cloud technologies.

Manual Last Updated June 17, 2022
Lab Last Tested June 17, 2022

Creator Content available herein, is owned by Splunk Inc. and is provided "AS IS" without warranty of any kind.

Splunk, Splunk>, Turn Data Into Doing, Data-to-Everything and D2E are trademarks or registered trademarks of Splunk Inc. in the United States and other countries. All other brand names, product names, or trademarks belong to their respective owners. © 2021 Splunk Inc. All rights reserved.

Copyright 2022 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.