Google Cloud Platform (GCP) offers managed instance groups that provide autoscaling capabilities.
These capabilities let you automatically add or remove instances from a managed instance group as load increases or decreases.
Autoscaling helps your applications gracefully handle increases in traffic and reduces cost when the need for resources is lower.
You control autoscaling by defining an autoscaling policy, and the autoscaler then scales the group based on the measured load.
There are several ways to configure an autoscaling policy. For example, you can scale based on CPU utilization, load balancing capacity, or monitoring metrics, or on a queue-based workload such as Cloud Pub/Sub.
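As a rough illustration of what a CPU-based policy might contain, the sketch below builds a Python dictionary shaped like the `autoscalingPolicy` body of a Compute Engine autoscaler resource. The specific values (2 to 10 instances, a 60-second cool-down) are assumptions chosen for the example, and `validate_policy` is a hypothetical helper, not part of any Google library.

```python
# Illustrative autoscaling policy, shaped like the `autoscalingPolicy` body of a
# Compute Engine autoscaler resource. All values are assumptions for this example.
autoscaling_policy = {
    "minNumReplicas": 2,       # never scale in below two instances
    "maxNumReplicas": 10,      # never scale out beyond ten instances
    "coolDownPeriodSec": 60,   # wait before trusting metrics from a new instance
    "cpuUtilization": {"utilizationTarget": 0.75},  # 75 percent CPU target
}


def validate_policy(policy):
    """Hypothetical helper: return a list of problems found in the policy."""
    problems = []
    if policy["minNumReplicas"] > policy["maxNumReplicas"]:
        problems.append("minNumReplicas exceeds maxNumReplicas")
    target = policy["cpuUtilization"]["utilizationTarget"]
    if not 0.0 < target <= 1.0:
        problems.append("utilizationTarget must be in (0, 1]")
    return problems


print(validate_policy(autoscaling_policy))  # prints []
```

Checking the policy locally like this only catches obvious mistakes; the service itself remains the authority on which values it accepts.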
Let’s assume you have two instances that are currently running at 100 percent and 85 percent CPU utilization.
If you set the target CPU utilization to 75 percent, the autoscaler adds another instance, because the current average of 92.5 percent is above the target. Spreading the same load across three instances brings the average to roughly 62 percent, below the 75 percent target.
Similarly, if the overall load is much lower than the target, the autoscaler removes instances as long as doing so keeps the overall utilization below the target.
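The arithmetic behind this example can be sketched in a few lines of Python. This is a simplified proportional model that assumes the total load stays constant and spreads evenly across instances; it is not Google's actual autoscaling algorithm, and `recommended_size` is a hypothetical name used only here.

```python
import math


def recommended_size(utilizations, target):
    """Smallest instance count that keeps average utilization at or below
    `target`, assuming total load is constant and spreads evenly."""
    if not utilizations:
        raise ValueError("need at least one instance")
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    total_load = sum(utilizations)  # load measured in "instances' worth" of CPU
    return max(1, math.ceil(total_load / target))


# Scale out: two instances at 100 and 85 percent with a 75 percent target.
current = [1.00, 0.85]
size = recommended_size(current, 0.75)
print(size)                            # prints 3
print(round(sum(current) / size, 3))   # prints 0.617

# Scale in: four lightly loaded instances can be served by two.
print(recommended_size([0.2, 0.2, 0.2, 0.2], 0.75))  # prints 2
```

In the scale-in case, dropping to two instances leaves the average at 40 percent, while dropping to one would push it to 80 percent and exceed the target, which is why the model stops at two.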
Google Cloud Platform has a graphical user interface that provides all this information per instance. You can see the CPU utilization over the past hour, and you can also change the timeframe and visualize other metrics such as disk and network usage.
This interface is very useful for monitoring the utilization of your instances and for determining how best to configure your autoscaling policy to meet changing demands.