How to use the autoscaling feature to adjust your application in response to dynamic workload changes.
We previously announced new autoscaling capabilities for IBM Cloud Foundry Public that are now Generally Available. This blog post focuses on using the autoscaling feature to adjust your application capacity in response to dynamic workload changes.
Cloud Foundry Autoscaling helps you scale your application horizontally by adding or removing application instances. This ensures that your application can run across multiple instances for high availability.
You can consider your scalability design through the following aspects as defined in the “twelve-factor app” methodology:
- Stateless Processes
- Scaling based on process model
- Fast startup and graceful shutdown
Best practices when setting up autoscaling policies
Cloud Foundry Autoscaling allows you to customize when and how to scale your application using a pre-defined autoscaling policy. An autoscaling policy defines the min/max instance limits for your application, the metric type used as the scaling criterion, the thresholds that trigger scale out/in, and the number of instances to be added/removed per step.
Here are some tips to help you create a better autoscaling policy:
- Set up performance baselines to understand how your application behaves under load.
- Choose a proper metric to scale your application according to performance test results.
- For CPU-intensive applications, be aware that the CPU is actually weighted and shared with other apps on the same host VMs. So, the autoscaling decision may be affected by other apps.
- For memory-intensive applications, the memory usage of an application depends on the process memory allocation algorithm of the runtime type. If the runtime doesn’t free up allocated memory in time, the scale-in based on memory-related metrics will be slow.
- Scaling rules with multiple metric types may conflict with one another and can introduce unexpected fluctuations on application instance counts.
- It is recommended to use the same metric type for scale-out and scale-in rules for consistency.
- Set proper threshold values to trigger scale out/in.
- Scale out aggressively, but scale in conservatively if the workload varies frequently.
- Don't set the upper threshold too high; otherwise, the system may crash before autoscaling takes effect.
- Use scheduled scaling to prepare enough resources in advance if the burst workload is predictable.
- Allow enough quota for your organization for scaling out.
- An autoscaling policy needs to be applied to each target application separately. If the target application is rolled over with the blue-green approach, in which a new application is pushed, make sure the same policy is added to the newly created application as well.
Examples of the best practices in action
Example 1: Create a dynamic autoscaling policy with throughput
User scenario
- A web application is designed to serve about 1,200 requests/second in total with at least three application instances. Automatic scale-out with a throughput metric is required to expand capacity, when necessary, to support up to 4,000 requests/second with more instances.
- The application memory is set to 128MB for each instance.
- The application is hosted in Cloud Foundry with an organization memory quota of 4 GB. Also, this application is the only consumer of the org memory.
Solution
- Define min/max instance counts:
  - According to the use case, you need to set `instance_min_count` to 3 to fulfill the minimum instance requirement.
  - For `instance_max_count`: given the org/space memory quota of 4 GB and 128 MB per instance, the maximum instance number could be up to 32. But since massive instance counts add more cost, you can also set `instance_max_count` from the estimated maximum capacity of the application. In this case, 10+ instances should be enough to support the required throughput of 4,000 requests/second, so `instance_max_count` can safely be set to 12.
- Define dynamic scaling rules:
- Define dynamic scaling rules:
  - For dynamic scaling rules, `throughput` is selected as the `metric_type`. Since the maximum capacity of each instance is 400 requests/second, you can set 300 requests/second as the upper threshold to scale out and 100 requests/second as the lower threshold to scale in.
  - To add more instances quickly on scale-out, you can use a percentage `adjustment` when scaling out and step down one instance at a time when scaling in.
  - `breach_duration_secs` controls how long a threshold must be breached before a scaling action happens. `cool_down_secs` controls how long the next scaling action must wait to make sure the system is stable after a scaling action is done. You can define these explicitly or omit them to use the default values.
Sample policy JSON
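Putting the numbers above together, a policy might look like the following sketch. Field names follow the open source App Autoscaler policy schema (e.g., `breach_duration_secs`, `cool_down_secs`); the percentage `adjustment` (`+50%`) is supported by recent releases, so check your environment's documentation and use an absolute step such as `+2` if it is not.

```json
{
  "instance_min_count": 3,
  "instance_max_count": 12,
  "scaling_rules": [
    {
      "metric_type": "throughput",
      "operator": ">=",
      "threshold": 300,
      "adjustment": "+50%",
      "breach_duration_secs": 60,
      "cool_down_secs": 60
    },
    {
      "metric_type": "throughput",
      "operator": "<",
      "threshold": 100,
      "adjustment": "-1",
      "breach_duration_secs": 60,
      "cool_down_secs": 60
    }
  ]
}
```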
Example 2: Create a specific-date schedule to handle massive access during a special event
User scenario
- A web application normally runs with 3–10 instances but is expected to handle more requests during a marketing event scheduled for New Year's Eve.
Solution
If the usage of an application increases extremely quickly during a special event (e.g., marketing promotion events), dynamic scaling may not respond to the usage changes quickly enough. In this case, it is recommended to prepare more instances before the event starts.
- Create a schedule for the specific event:
  - You can use a `specific_date` schedule to override the default instance limits so that autoscaling can adjust instance numbers within a larger range.
  - In this case, the default `instance_min_count` is 3 and the default `instance_max_count` is 10; for the pre-defined period, you can set `instance_min_count` to 10 and `instance_max_count` to 30. Additionally, you can set `initial_min_instance_count` to 15 if more instances are required at the beginning.
  - During the defined schedule period, the dynamic rules still take effect to adjust instance capacity, but within the larger range of 10–30 instead of the default 3–10.
  - Once the schedule period ends, the instance limits fall back to the defaults.
Sample Policy JSON
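A sketch of the policy for Example 2 follows, again using the open source App Autoscaler schema. The event dates and the `timezone` value are illustrative placeholders for the New Year's Eve window; adjust them to your event.

```json
{
  "instance_min_count": 3,
  "instance_max_count": 10,
  "scaling_rules": [
    {
      "metric_type": "throughput",
      "operator": ">=",
      "threshold": 300,
      "adjustment": "+2"
    },
    {
      "metric_type": "throughput",
      "operator": "<",
      "threshold": 100,
      "adjustment": "-1"
    }
  ],
  "schedules": {
    "timezone": "America/New_York",
    "specific_date": [
      {
        "start_date_time": "2021-12-31T20:00",
        "end_date_time": "2022-01-01T23:59",
        "instance_min_count": 10,
        "instance_max_count": 30,
        "initial_min_instance_count": 15
      }
    ]
  }
}
```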
How to change the specific schedule quickly with the command line
If you need to change the schedule frequently, it is recommended to use a command line tool such as `jq` to edit the policy JSON file.
For example, if you would like to replace the `start_date_time` and `end_date_time` of the original schedule, you can use the following script snippet:
Feel free to use other tools to edit the policy JSON file by scripting.
Example 3: Apply autoscaling policy for blue-green deployments
Blue-green deployment is a common practice in Cloud Foundry for updating an application with zero downtime. You push a NEW "green" application during the update and then re-route the traffic to the NEW application.
An autoscaling policy is applied to a specific Cloud Foundry application, so once you push a NEW "green" application, you need to apply the same autoscaling policy to the new application.
You can achieve this easily by using the autoscaler CLI tool:
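The commands below are a sketch using the open source App Autoscaler CLI plugin; the app names are placeholders, and the exact command names may differ in your environment, so verify them with `cf plugins` and your provider's documentation.

```shell
# Save the policy currently attached to the "blue" application
cf autoscaling-policy my-app-blue --output policy.json

# Push the new "green" application (zero-downtime rollover)
cf push my-app-green

# Attach the same policy to the "green" application
cf attach-autoscaling-policy my-app-green policy.json
```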
About IBM Cloud Foundry
IBM Cloud Foundry is a Cloud Foundry-certified development platform for cloud-native applications on IBM Cloud. IBM Cloud Foundry is the fastest and the cheapest way to build and host a cloud-native application on IBM Cloud.