Google Cloud Autoscaling Capabilities

Google Cloud Platform (GCP) offers managed instance groups with built-in autoscaling capabilities.


These features allow you to automatically add or remove instances from a managed instance group based on increases or decreases in load.

Autoscaling with GCP helps your applications gracefully handle increases in traffic and reduces cost when the need for resources is lower.

Autoscaling is governed by an autoscaling policy that you define. The autoscaler then scales the group automatically based on the measured load.

There are many ways to configure an autoscaling policy. For example, you can scale based on CPU utilization, load balancing capacity, monitoring metrics, or a queue-based workload such as Cloud Pub/Sub.
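As a rough sketch, a policy that combines several of these signals could be written down as plain data. The field names below are modeled on the Compute Engine autoscaler resource as best understood, and every value (replica counts, targets, the choice of Pub/Sub metric) is an assumption made for illustration, not a recommended setting.

```python
# Illustrative autoscaling policy expressed as a plain dictionary.
# Field names are modeled on the Compute Engine autoscaler resource;
# all values are made-up examples, not recommendations.
autoscaling_policy = {
    "minNumReplicas": 2,
    "maxNumReplicas": 10,
    "coolDownPeriodSec": 60,
    # Scale on average CPU utilization across the group.
    "cpuUtilization": {"utilizationTarget": 0.75},
    # Scale on the fraction of load balancing serving capacity in use.
    "loadBalancingUtilization": {"utilizationTarget": 0.8},
    # Scale on a monitoring metric, here a Pub/Sub backlog (assumed choice).
    "customMetricUtilizations": [
        {
            "metric": "pubsub.googleapis.com/subscription/num_undelivered_messages",
            "utilizationTarget": 100,
            "utilizationTargetType": "GAUGE",
        }
    ],
}


def configured_signals(policy):
    """Return the names of the scaling signals present in a policy."""
    signal_keys = (
        "cpuUtilization",
        "loadBalancingUtilization",
        "customMetricUtilizations",
    )
    return [key for key in signal_keys if policy.get(key)]


signals = configured_signals(autoscaling_policy)
print(signals)
```

The helper `configured_signals` is hypothetical and only lists which signals a policy uses; the actual policy would be applied through the console, the command line, or the API.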

Let’s assume you have two instances that are currently running at 100 percent and 85 percent CPU utilization.

If you set the target CPU utilization to 75 percent, the autoscaler automatically adds another instance to spread out the CPU load so that utilization stays below the 75 percent target.
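The arithmetic behind this example can be sketched in a few lines. This is a simplified model that assumes the total load stays constant and spreads evenly across instances; the real autoscaler also accounts for factors such as stabilization periods, and the function name here is hypothetical.

```python
import math


def recommended_instance_count(cpu_utilizations, target):
    """Estimate how many instances keep average CPU at or below the target.

    Simplified model: the total load is the sum of the per-instance
    utilizations, and it is assumed to spread evenly across the group.
    """
    total_load = sum(cpu_utilizations)
    return max(1, math.ceil(total_load / target))


# Two instances at 100 percent and 85 percent, with a 75 percent target.
current = [1.00, 0.85]
needed = recommended_instance_count(current, target=0.75)
average_after = sum(current) / needed

print(needed)  # 3
print(average_after < 0.75)  # True
```

With three instances, the same load averages roughly 62 percent per instance, which is why a single additional instance is enough here.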


Similarly, if the overall load is much lower than the target, the autoscaler removes instances as long as doing so keeps the overall utilization below the target.
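The scale-in rule can be sketched the same way: keep removing one instance at a time while the projected average utilization of the smaller group would still be below the target. Again this is a simplified model with a hypothetical function name, and it assumes an evenly distributed, constant load.

```python
def count_after_scale_in(total_load, current_count, target, min_count=1):
    """Remove instances while the projected average stays below the target."""
    count = current_count
    while count > min_count and total_load / (count - 1) < target:
        count -= 1
    return count


# Four instances sharing a total load of 1.2 (30 percent each on average).
lightly_loaded = count_after_scale_in(total_load=1.2, current_count=4, target=0.75)
print(lightly_loaded)  # 2

# Three instances sharing a total load of 1.85: removing one would push
# the average to about 92 percent, so the group stays at three.
busy = count_after_scale_in(total_load=1.85, current_count=3, target=0.75)
print(busy)  # 3
```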

The Google Cloud Platform console provides a graphical view of this information for each instance. By default you see the CPU utilization over the past hour, but you can also change the timeframe and visualize other metrics such as disk and network usage.

This graphical view is very useful for monitoring the utilization of your instances and for determining how best to configure your autoscaling policy to meet changing demands.

Using Stackdriver Monitoring (since renamed Cloud Monitoring), you can set up alerts through several notification channels.

An important configuration for a managed instance group and load balancer is a health check. A health check is very similar to an uptime check in Stackdriver.

With a health check, you define a protocol, a port, and health criteria. Based on this configuration, GCP computes a health state for each instance.

The health criteria define how often to check whether an instance is healthy. They also answer questions such as:

How long to wait for a response?
How many successful attempts are decisive?
How many failed attempts are decisive?

The health check can even define how many times it has to fail over what total time period before an instance is considered unhealthy.
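To make these criteria concrete, here is a minimal simulation of how consecutive probe results could be turned into a health state. The parameter names mirror the kind of settings described above (check interval, timeout, healthy and unhealthy thresholds), but the class, the function, and the default values are assumptions made for illustration, not GCP's implementation.

```python
from dataclasses import dataclass


@dataclass
class HealthCheck:
    check_interval_sec: int = 5   # how often to probe the instance
    timeout_sec: int = 5          # how long to wait for a response
    healthy_threshold: int = 2    # consecutive successes to become healthy
    unhealthy_threshold: int = 3  # consecutive failures to become unhealthy


def health_states(check, probe_results, initially_healthy=True):
    """Return the health state after each probe (True means healthy)."""
    healthy = initially_healthy
    successes = 0
    failures = 0
    states = []
    for ok in probe_results:
        if ok:
            successes += 1
            failures = 0
            if not healthy and successes >= check.healthy_threshold:
                healthy = True
        else:
            failures += 1
            successes = 0
            if healthy and failures >= check.unhealthy_threshold:
                healthy = False
        states.append(healthy)
    return states


check = HealthCheck()
probes = [True, False, False, False, True, True]
states = health_states(check, probes)
print(states)  # [True, True, True, False, False, True]
```

Under these assumed defaults, an instance that stops responding is marked unhealthy after three consecutive failed probes, which is roughly 15 seconds at a 5-second interval.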

