Date: 2023-12-12

Auto scaling is used to ensure that applications have the resources they need to maintain consistent availability and hit performance goals, as well as to promote the efficient use of cloud resources and minimize cloud costs.

According to a 2023 white paper from Infosys, organizations that migrate to the cloud waste about 32% of their cloud spend.¹ Because of its focus on efficient resource utilization, auto scaling is a useful component in a successful FinOps practice.

When organizations configure cloud infrastructure, they provision resources according to a “baseline” of compute, storage and network resource needs. But demand fluctuates, for example with spikes or drops in network traffic or application use. Auto scaling features let resources scale to match real-time demand according to specific metrics, such as CPU utilization or bandwidth availability, without human intervention.
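As a rough illustration of scaling on a metric, the sketch below sizes a group of instances so that average CPU utilization moves back toward a target. The function name `desired_capacity`, the 60% target and the instance limits are assumptions made for this example, not values from any particular cloud provider.

```python
import math


def desired_capacity(current_instances, cpu_utilization,
                     target_utilization=60.0,
                     min_instances=1, max_instances=10):
    """Return an instance count that moves average CPU toward the target.

    Hypothetical helper: a simple proportional rule, clamped to the
    configured minimum and maximum group size.
    """
    if current_instances < 1:
        raise ValueError("current_instances must be at least 1")
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")

    # Scale the group in proportion to how far the metric is from target.
    raw = math.ceil(current_instances * cpu_utilization / target_utilization)

    # Never go below the baseline or above the configured ceiling.
    return max(min_instances, min(max_instances, raw))


if __name__ == "__main__":
    print(desired_capacity(4, 90.0))  # demand spike: scale out
    print(desired_capacity(4, 30.0))  # demand drop: scale in
```

The clamp reflects the “baseline” idea above: the group never shrinks below its minimum, and the maximum caps cost during extreme spikes.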

Auto scaling can be used to optimize the allocation of resources through a variety of means. For example, predictive scaling uses historical data to predict future demand. Dynamic scaling, by contrast, reacts to resource needs in real time, as determined by an organization’s auto scaling policies.
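The difference between the two approaches can be sketched in a few lines. This is a simplified illustration, assuming demand is measured in requests per hour and that each instance handles a fixed number of requests; the function names and the averaging forecast are hypothetical, and real predictive scaling typically relies on more sophisticated forecasting.

```python
import math
from statistics import mean


def predictive_capacity(history, hour, requests_per_instance):
    """Size the group ahead of time from historical data.

    history: list of past days, each a list of 24 hourly request counts.
    The forecast here is simply the average load seen at the same hour
    on previous days (an assumption made for illustration).
    """
    forecast = mean(day[hour] for day in history)
    return max(1, math.ceil(forecast / requests_per_instance))


def dynamic_capacity(observed_requests, requests_per_instance):
    """Size the group from the load being observed right now."""
    return max(1, math.ceil(observed_requests / requests_per_instance))


if __name__ == "__main__":
    # Two hypothetical days of traffic with a morning peak at 09:00.
    day_one = [100] * 24
    day_two = [100] * 24
    day_one[9], day_two[9] = 900, 1100

    # Predictive: provision for 09:00 before the traffic arrives.
    print(predictive_capacity([day_one, day_two], 9, 250))
    # Dynamic: react to an observed load of 1300 requests per hour.
    print(dynamic_capacity(1300, 250))
```

In practice the two are often complementary: a forecast sets capacity ahead of a known peak, and a reactive rule corrects for whatever the forecast missed.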

Auto scaling policies automate the lifecycles of cloud computing instances, launching and terminating virtual machines as needed to meet resource demand. Auto scaling is often used in tandem with elastic load balancing to make full use of available cloud resources.
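To make the lifecycle idea concrete, here is a minimal in-memory simulation of a policy that launches and terminates instances. It does not call any real cloud API; `ScalingPolicy`, `InstanceGroup`, the thresholds and the `vm-N` identifiers are all hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ScalingPolicy:
    scale_out_above: float = 75.0  # launch when average CPU exceeds this
    scale_in_below: float = 25.0   # terminate when average CPU drops below this
    min_instances: int = 1
    max_instances: int = 5


@dataclass
class InstanceGroup:
    policy: ScalingPolicy
    instances: list = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def __post_init__(self):
        # Start at the baseline size required by the policy.
        while len(self.instances) < self.policy.min_instances:
            self._launch()

    def _launch(self):
        self.instances.append(f"vm-{next(self._ids)}")

    def _terminate(self):
        # Remove the most recently launched instance.
        return self.instances.pop()

    def evaluate(self, avg_cpu):
        """Apply the policy once for the given average CPU percentage."""
        size = len(self.instances)
        if avg_cpu > self.policy.scale_out_above and size < self.policy.max_instances:
            self._launch()
            return "launched"
        if avg_cpu < self.policy.scale_in_below and size > self.policy.min_instances:
            self._terminate()
            return "terminated"
        return "no change"


if __name__ == "__main__":
    group = InstanceGroup(ScalingPolicy())
    for cpu in (90.0, 50.0, 10.0):
        print(cpu, group.evaluate(cpu), group.instances)
```

The gap between the two thresholds is a deliberate design choice: it keeps the group from repeatedly launching and terminating instances when utilization hovers around a single cutoff.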