Checkpoints
- Create a new VM instance (20 points)
- Create HTTP load balancer (30 points)
- Create TCP load balancer (30 points)
- Create a Storage Bucket (20 points)
Improving Network Performance II
GSP046
Overview
In this hands-on lab you'll work through some real-world scenarios, re-create the environments, and work on improving network performance using load balancers and other Google Cloud products. At the end, you'll go through an exercise on changing the TCP window size to see how it affects bandwidth.
This lab was adapted from blog posts by Colt McAnlis: Profiling GCP's Load Balancers; Removing the Need for Caching Servers with GCP's Load Balancers; and The Bandwidth Delay Problem. Colt blogs about Google Cloud network performance on Medium.
Some of the resources you'll need for this lab have been created for you to save time, and some you will create.
What you'll learn
- What load balancers are offered through Google Cloud
- How load balancers can improve network performance
- How to resize your TCP window
Prerequisites
- Basic knowledge of Google Cloud services (best obtained by having previously taken the labs in the Google Cloud Essentials Quest)
- Basic Google Cloud networking and TCP/IP knowledge (best obtained by having taken the earlier labs in the Networking in the Google Cloud Quest)
- Basic Unix/Linux command-line knowledge
Setup and Requirements
Before you click the Start Lab button
Read these instructions. Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources will be made available to you.
This hands-on lab lets you do the lab activities yourself in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials that you use to sign in and access Google Cloud for the duration of the lab.
To complete this lab, you need:
- Access to a standard internet browser (Chrome browser recommended).
- Time to complete the lab---remember, once you start, you cannot pause a lab.
How to start your lab and sign in to the Google Cloud Console
- Click the Start Lab button. If you need to pay for the lab, a pop-up opens for you to select your payment method. On the left is a panel populated with the temporary credentials that you must use for this lab.
- Copy the username, and then click Open Google Console. The lab spins up resources, and then opens another tab that shows the Sign in page.
Tip: Open the tabs in separate windows, side-by-side.
- In the Sign in page, paste the username that you copied from the left panel. Then copy and paste the password.
Important: You must use the credentials from the left panel. Do not use your Google Cloud Training credentials. If you have your own Google Cloud account, do not use it for this lab, to avoid incurring charges to your account.
- Click through the subsequent pages:
  - Accept the terms and conditions.
  - Do not add recovery options or two-factor authentication (because this is a temporary account).
  - Do not sign up for free trials.
After a few moments, the Cloud Console opens in this tab.
Use case 1: Performance overhead using Google Cloud load balancer
With an estimated 4 billion IoT devices in the world by 2020, scaling your services to handle the sudden load of a million new devices is an important capability.
This concern was explicitly highlighted by the IoT Roulette group. Their company offers a scalable way for IoT devices to connect with cloud backends through a common and simple transport layer and SDK. As you can imagine, scale is really important to IoT Roulette.
Profiling Google Cloud's load balancers
If the term load balancer is new to you, the gist is this: load balancers are intermediary systems which scale up new backends based upon frontend workload. When configured the right way, load balancers make it possible to do cool stuff, like 1,000,000 queries per second.
Google Cloud provides two primary load balancers: TCP/UDP (aka L4) and HTTP (aka L7). The TCP/UDP load balancer acts much like you'd expect - a request comes in from the client to the load balancer, and is forwarded along to a backend directly.
The HTTP load balancer, on the other hand, has some serious magic going on. For the HTTP load balancer, traffic is proxied through Google Front End (GFE) Servers which are typically located close to the edge of Google's global network. The GFE terminates the TCP session and connects to a backend in a region which has the capacity to serve the traffic.
So, with this in mind, let's set up some load balancers and test them, to determine which type of Google Cloud load balancer will provide the best performance.
Establishing a baseline with no load balancer
Before talking about what types of overhead the load balancers add, we first need a baseline estimate of connectivity to a cloud instance from a machine outside of the cloud cluster. IoT Roulette was more concerned with bandwidth performance than simple ping times, so we tested 500 iterations of a curl command from an instance in Europe fetching from an instance in us-central1.
Now you try. Create a baseline for the instances provided in the lab.
Create a baseline for network speed without load balancers
Create a new instance with the following configuration:
Configuration | Value |
---|---|
Name | instance-3 |
Region | europe-west1 |
Zone | europe-west1-d |
Boot Disk | CentOS 7 |
- Click the "Management Disks Network SSH Keys" link
- Click Networking tab
- In Network interfaces, click on the default and select:
- Network nw-testing from the drop-down
- Subnetwork: europe-west1
Click Create.
Click Check my progress to verify the objective.
Now SSH into instance-3 and run the following shell script at the command prompt. Substitute the external IP address of any instance from the instance group (i.e. instances with the prefix myinstance-grp-pri) where it says <ip>.
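The lab's original script isn't reproduced in this copy; a minimal sketch that matches the description (ten timed fetches against the instance's web server, reported in milliseconds) could look like this, where the loop count, URL path, and output format are assumptions:

```bash
# Time ten curl fetches against the target instance's web server.
# Replace <ip> with the instance's external IP before running.
# curl reports time_total in seconds; awk converts it to milliseconds.
for i in $(seq 1 10); do
  curl -o /dev/null -s -w "%{time_total}\n" http://<ip>/
done | awk '{ printf "%.0f ms\n", $1 * 1000 }'
```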
The above command outputs ten runs of curl in ms (i.e. milliseconds) against a simple web server, testing between Europe and the US, to get baseline performance between regions.
Example Output (yours will differ):
Copy the curl command output. You're going to use it in the next step.
Comparing load balancer types
Now you'll determine what the performance overhead is when using a load balancer. You'll create an HTTP load balancer and a TCP load balancer, test them, and compare the differences.
HTTP Load Balancer
Navigate to Navigation menu > Network Services > Load balancing.
Click on Create load balancer, then click Start Configuration on the HTTP(S) Load Balancing tile.
Name the load balancer my-http-lb, then click Backend configuration.
Backend configuration
From the Backend services and backend buckets drop-down, select Backend services > Create the backend service. Use the following configuration:
- Name: my-http-lb-backend
- Instance group: Select myinstance-grp-pri-igm from the drop-down.
- Health Check: Select Create a health check.
Configure the health check with the following details:
- Name: instance-group-hc
- Protocol: TCP
- Port: 80
- Proxy Protocol: None
- Request: /
- Check Interval: 10 seconds
- Unhealthy Threshold: 3 consecutive failures
- Click Save and continue.
- Cloud CDN: check Enable Cloud CDN
Click Create.
Create Backend Bucket
Again from the Backend services & backend buckets menu, choose Backend buckets > Create a backend bucket.
Create a bucket with the following configuration:
- Name: my-http-lb-backend-bucket
- Cloud Storage Bucket: Click Browse, then click the New Bucket icon.
Storage bucket details:
- Name: Enter a unique name for your bucket
- Storage class: Multi-Regional
- Location: United States
- Click Create
- Then click Select.
One more setting:
- Cloud CDN: check Enable Cloud CDN
Click Create.
Configure Host and Path Rules
Now click on Host and path rules.
In the my-http-lb-backend-bucket row, configure:
- Hosts: *
- Paths: /static/*
Create frontend
Click on Frontend Configuration. You don't have to do anything here. Keep the default values.
Review and Finalize
Click Review and Finalize to see all of the settings for your HTTP Load Balancer, then click the Create button when you are done.
Click Check my progress to verify the objective.
Create TCP load balancer
Navigate to Navigation menu > Network Services > Load balancing.
Click on Create load balancer, then click Start Configuration on the TCP Load Balancing tile:
Use all of the default settings for your load balancer. Click Continue.
Name the load balancer my-tcp-lb, then click Backend configuration.
Backend configuration
From the Backend services and backend buckets drop-down, select Backend services > Create the backend service. Use the following configuration:
- Region: us-central1
- Instance Group: select myinstance-grp-pri-igm
- Health Check: Select Create a health check from the drop-down.
Configure the health check with the following details:
- Name: my-tcp-lb-hc
- Click Save and Continue.
Create Frontend
Click on Frontend Configuration. Add port number 80, and leave the other values as they are. Then click Done.
Review and Finalize
Click Review and Finalize to see all of the settings for your TCP load balancer, then click Create.
Test HTTP load balancer
Navigate to Navigation menu > Network Services > Load balancing.
Click on my-http-lb load balancer and copy the IP:Port under the Details tab.
SSH into instance-3 and execute the following command (make sure to replace <ip>:<port> with the IP:Port copied from the HTTP load balancer):
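Again, the original command isn't shown here; under the same assumptions as the baseline script, a sketch pointed at the load balancer's frontend would be:

```bash
# Time ten curl fetches through the HTTP load balancer.
# Replace <ip>:<port> with the values copied from the load balancer details.
for i in $(seq 1 10); do
  curl -o /dev/null -s -w "%{time_total}\n" http://<ip>:<port>/
done | awk '{ printf "%.0f ms\n", $1 * 1000 }'
```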
The above command outputs the curl timings in ms (i.e. milliseconds).
Example Output (yours will differ):
Copy the curl command output. You're going to use it in the next step.
Test TCP load balancer
Navigate to Navigation menu > Network Services > Load balancing. Click on the my-tcp-lb load balancer and copy the IP:Port under the Details tab.
SSH into instance-3 and execute the following command (make sure to replace <ip>:<port> with the IP:Port copied from the TCP load balancer):
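As before, a sketch under the same assumptions, now aimed at the TCP load balancer's frontend:

```bash
# Time ten curl fetches through the TCP load balancer.
for i in $(seq 1 10); do
  curl -o /dev/null -s -w "%{time_total}\n" http://<ip>:<port>/
done | awk '{ printf "%.0f ms\n", $1 * 1000 }'
```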
The above command outputs the curl timings in ms (i.e. milliseconds).
Example Output (yours will differ):
Look back at the baseline data you collected earlier and compare it to these two graphs with load balancers. You can see that the HTTP load balancer is faster than the TCP load balancer.
Why HTTP load balancing can be faster
When you look at how the HTTP load balancer is working under the hood, you can see why there's a difference in performance.
When a request hits the HTTP load balancer, the TCP session stops there. The GFEs then move the request on to Google's private network and all further interactions happen between the GFE and your specific backend.
Now here's the important bit: after a GFE connects to a backend to handle a request, it keeps that connection open. This means that future requests from this GFE to this specific backend will not need the overhead of creating a connection. Instead, it can just start sending data ASAP.
The result of this setup is that the first query, which causes the GFE to open a connection to the backend, will see higher response times (those are the large spikes). However, subsequent packets routed to the same backend will see a lower minimum latency.
Google Cloud load balancer conclusion
So, for IoT Roulette's case, the decision was made to use an HTTP load balancer. The more popular your service gets in a region, the better your HTTP load balancer will perform. The first ~100 clients will result in GFE connections being made to the backends, while the next ~100 clients will have faster fetches since those connections have already been established.
While the HTTP load balancer was ideal for IoT Roulette, there's a whole set of reasons why a TCP balancer might be better for your use case. To figure out which one is best for the scenario you're running into, please watch the NEXT 2016 talk or read the official docs.
Use case 2: Load balancing and caching / CDN
Tax Lemming is a startup out of Vancouver, BC, which focuses on helping you make sense of the purchases, taxes, and basic bookkeeping for your small business. Tax Lemming has too many instances spinning up, and needs to cut that number down before going for another round of VC funding.
Like most web-based applications, Tax Lemming's architecture looks something like this:
At first glance, it's easy to see that this can become expensive quickly. All the static content is being sent through the server instance, so they end up paying for compute hours, and each request requires the Apache server to re-hit the relational DB.
The common answer
For most developers of web-based applications, a solution to this problem looks something like this:
Generally, add Nginx as a reverse caching proxy to the Apache server and modify the source assets so the client fetches the big files from the CDN, so they don't end up eating server time (and are sent out faster).
Although a very tried-and-true solution, there are a few issues:
- The reverse proxy (aka Nginx, Varnish, Squid, etc.) needs a whole new instance to be set up per region. Technically, it's still cheaper than the load on the server instance itself, but that's still a lot of overhead in terms of the cost to cache and send content.
- Static assets use a de facto CDN. This typically requires a whole new URL scheme (e.g. instead of url="./abc/tacos.jpg" you get url="cdn.cloudr.com/1721617282.jpg"). This isn't so much a performance issue as an aesthetic one.
Given these two nuances, there are some Google Cloud features that Tax Lemming can use to improve performance.
Additional load balancer features
Google Cloud Load Balancer already allows you to split traffic between instance groups, regions, etc. It also has two nice features that could help Tax Lemming reduce their instance count further and cut some upkeep costs as well.
CDN the cacheable requests
Google Cloud's load balancer can cache common requests and move them to Google Cloud CDN. This will reduce latency as well as reduce the number of requests needing to be served by the instance. As long as the request is cached, it will be served directly at Google's network edge.
You enabled CDN when you created the load balancer earlier!
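If you'd like to verify the caching yourself (this check is not part of the original lab steps), a response served from Cloud CDN typically carries an Age header once it has been cached; the URL below is a placeholder:

```bash
# Fetch an object through the load balancer and inspect the response headers.
# Run this twice: the first request warms the cache, and on a cache hit
# Cloud CDN normally adds an "Age" header to the response.
curl -s -o /dev/null -D - http://<lb-ip>/static/<object> | grep -i '^age'
```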
Here's what the performance graph looks like: fetching the request through the load balancer to the instance directly vs. fetching the request through the load balancer with the "enable cloud CDN" turned on.
You can see it: once the asset gets cached, the response time drops significantly.
What's even better about this is that there are no extra instances needed for this process. While Nginx, Varnish, and Squid require dedicated hosting on a VM, Google's load balancer + CDN is serverless.
Cloud Storage for static assets
If your content is static you can reduce the load on the web servers by serving content directly from Cloud Storage. Typically, your compute URL is separated from your CDN URL (e.g. www.gcpa.com/mypage/17266282.jpg vs cdni.cloudcdn.com/17266282.jpg). With Google Cloud Load Balancer, you're able to create a host routing path, so that gcpa.com/mypage will route over to fetch assets from a public Cloud Storage bucket, which is cached to Google Cloud CDN.
The setup for adding a Cloud Storage bucket is straightforward. Earlier in the lab you created a storage bucket. You can have a backend service, which points to an instance group, or a backend bucket, which points to a Cloud Storage bucket.
Performance of your backend bucket is improved even more by enabling Cloud CDN when you create it in the load balancer UI.
Here is a quick exercise for creating a storage bucket with a static folder:
Create a Storage Bucket
Navigate to Navigation menu > Storage, then click on "Create Bucket".
- Name: Enter a unique name for your bucket
- Storage class: Multi-Regional
- Location: United States
- Click Create
Create a static Folder
While in the bucket, click "Create folder" in the top toolbar.
- Name: static
Click Create.
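If you prefer the command line, a rough equivalent using gsutil is sketched below; the bucket name is a placeholder, and the file upload is only there because Cloud Storage "folders" are really just object-name prefixes:

```bash
# Create a multi-regional bucket in the US (bucket names must be globally unique).
gsutil mb -c multi_regional -l US gs://<your-unique-bucket-name>/
# The static/ "folder" is created implicitly when an object is uploaded under it.
gsutil cp some-local-file.txt gs://<your-unique-bucket-name>/static/
```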
When you create a bucket this way, you'll be able to select it when building the backend bucket for your load balancer.
Load balancing and caching conclusion
Armed with the information about the power of Google's Load Balancer, Tax Lemming updated their architecture:
This change resulted in lots of new caching for dynamic requests (with the proper headers), which helped reduce the number of requests to the backend, spinning up fewer instances to service the same load.
Use case 3: Google Cloud networking and bandwidth delay problem
Tutorama is a company built to create a crowd-sourced solution to instructional videos. Users all over the world can upload screencasts, recordings, and other videos to help teach people how to do everything from properly walking a dog to changing the oil in their car. Tutorama recently moved from on-premises to Google Cloud; but despite having big pipes to their Google Cloud instances, they still get poor performance.
We've looked at some networking issues already, and they were all ruled out with simple tests:
- Core count - Their 8 vCPU machine should have a max of 16 Gbits/sec, so that's not the problem.
- Internal / external IP - This doesn't impact the throughput. Something else is keeping it arbitrarily low.
- Region - Obviously we're crossing regions here, but that's kinda the point. We can't solve this by putting the box closer to the client.
So what's going on?
Bandwidth delay product
Like most modern operating systems, Linux does a good job of auto-tuning the TCP buffers. In some cases, though, the default maximum Linux TCP buffer sizes are still too small. When this is the case, you can observe throughput being limited by an effect governed by the Bandwidth Delay Product.
The TCP window is the maximum number of bytes that can be sent before the ACK must be received. If either the sender or receiver are frequently forced to stop and wait for ACKs for previously sent packets, gaps in the data flow are created, which limits the maximum throughput of the connection.
Finding the right window size
The optimal window size is twice the bandwidth delay product. You can compute the optimal window size if you know the RTT and the available bandwidth on both ends. There are lots of great resources that explain how to compute window sizes. Rather than covering that here, check out this bandwidth calculator: https://www.switch.ch/network/tools/tcp_throughput/.
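As a quick illustration (these numbers are made up, not Tutorama's): a path with 100 Mbits/sec of available bandwidth and a 100 ms round-trip time has a bandwidth delay product of 100,000,000 bits/sec × 0.1 sec = 10,000,000 bits ≈ 1.25 MB, so the optimal window would be roughly 2.5 MB.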
For Tutorama, we were able to determine their maximum available bandwidth and the maximum anticipated latency, which we threw into one of the available calculators. Their tcp_rmem value was set to 125k and their tcp_wmem to 64kb; then the test was re-run:
2.10 - 2.20 Mbits/sec is much better than what they were getting, but not as good as the default value (90 Mbits/sec). To see why, we looked at the default values for a new instance:
As such, it's generally a good idea to leave net.ipv4.tcp_mem alone, as the defaults are fine. A number of performance experts say to also increase net.core.optmem_max to match net.core.rmem_max and net.core.wmem_max, but we have not found that it makes any difference. Using the default window size usually provides the best bandwidth.
Change the window size
Your turn to try this out. In your lab, you have two instances that you'll compare. First you'll find the current value of your bandwidth delay product, then you'll change the size of your window and re-run the test to see what happens.
In the Cloud Console, navigate to Navigation menu > Compute Engine and review instance-1 and instance-2.
Default TCP window and bandwidth
SSH into instance-1, then run this command:
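The exact command isn't preserved in this copy of the lab; a sketch that matches the description (reading the receiver's TCP buffer settings) would be:

```bash
# Show the TCP receive-buffer settings: min, default, and max, in bytes.
sudo sysctl net.ipv4.tcp_rmem
```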
This is the receiver.
Example Output
This is the current value of your bandwidth delay product for instance-1.
SSH into instance-2, then run this command:
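Likewise for the sender, a sketch reading the send-buffer settings:

```bash
# Show the TCP send-buffer settings: min, default, and max, in bytes.
sudo sysctl net.ipv4.tcp_wmem
```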
This is the sender.
Example Output
This is the current value of your bandwidth delay product for instance-2.
Now find out the default bandwidth between these instances. Note the TCP window size reported when you run iperf.
In the receiver's SSH window, run:
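The original command was stripped from this copy; presumably it starts the iperf server, along these lines:

```bash
# Start iperf in server mode on the receiver; it prints the TCP window size it uses.
iperf -s
```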
Example Output:
In the sender's SSH window, run:
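And on the sender, a client run pointed at the receiver (the placeholder address is an assumption):

```bash
# Run the bandwidth test from the sender to the receiver.
# Replace <instance-1-ip> with instance-1's IP address.
iperf -c <instance-1-ip>
```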
Example Output
In your lab, note what your bandwidth is.
Adjusted window and bandwidth
Now change the window size, and see what happens to your bandwidth.
In the receiver's SSH window, run the following to adjust the tcp_rmem value:
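The exact buffer values weren't preserved here; an illustrative command in line with the 64kb window used below (the specific numbers are assumptions) would be:

```bash
# Set the TCP receive-buffer min/default/max; 65536 bytes = 64kb max.
sudo sysctl -w net.ipv4.tcp_rmem="4096 65536 65536"
```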
Example Output
In the sender's SSH window, run the following to adjust the tcp_wmem value:
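And the matching send-buffer change on the sender (values again illustrative):

```bash
# Set the TCP send-buffer min/default/max; 65536 bytes = 64kb max.
sudo sysctl -w net.ipv4.tcp_wmem="4096 16384 65536"
```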
Example Output
In the receiver's SSH window, change the window size to 64kb:
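The stripped command here presumably restarts the iperf server with an explicit window; a sketch:

```bash
# Restart the iperf server, requesting a 64kb TCP window.
iperf -s -w 64K
```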
Example Output
Run the following in the sender's SSH window:
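And the matching client run (placeholder IP; the -w flag mirroring the server's window request is an assumption):

```bash
# Re-run the bandwidth test with the reduced window.
iperf -c <instance-1-ip> -w 64K
```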
Example Output
After changing the TCP window size down to 64kb, the new transfer rate and bandwidth are much slower.
If You Have More Time
In the Qwiklabs lab interface, under Student Resources on the left-hand side, you'll see links to videos related to this lab. They are worth watching!
Congratulations!
Finish Your Quest
This self-paced lab is part of the Qwiklabs Quest Network Performance and Optimization. A Quest is a series of related labs that form a learning path. Completing this Quest earns you the badge above, to recognize your achievement. You can make your badge (or badges) public and link to them in your online resume or social media account. Enroll in this Quest and get immediate completion credit if you've taken this lab. See other available Qwiklabs Quests.
Take Your Next Lab
Continue your Quest with Building High-throughput VPN, or check out these suggestions:
Next Steps / Learn More
- Work through the Google Cloud post "Using Google's cloud networking products: a guide to all the guides".
- Read the Compute Engine Networking Documentation.
- Learn about Subnetworks.
- Post questions and find answers on Stackoverflow under the google-compute-engine or google-cloud-platform tags.
Google Cloud Training & Certification
...helps you make the most of Google Cloud technologies. Our classes include technical skills and best practices to help you get up to speed quickly and continue your learning journey. We offer fundamental to advanced level training, with on-demand, live, and virtual options to suit your busy schedule. Certifications help you validate and prove your skill and expertise in Google Cloud technologies.