Date: 2024-10-24

As the ALB pods are predominantly CPU-heavy processes, using the CPU average utilization as the basis of autoscaling configuration is a good choice for most use cases.

To enable HPA based on average CPU utilization, use the following CLI command:

$ ibmcloud ks ingress alb autoscale set -c <clusterID> --alb <albID> --max-replicas <desired maximum replicas> --min-replicas <desired minimum replicas> --cpu-average-utilization <desired CPU average utilization>

For guidance on choosing the target CPU average utilization, see our documentation. Setting the minimum replica count below two is not recommended, because a single replica provides no high availability. Setting the maximum replica count above the number of workers in your cluster is also not recommended, because anti-affinity rules prevent the excess ALB pods from being scheduled.
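
The two replica-count recommendations above can be checked before running the command. The following is a minimal sketch, not part of the IBM Cloud CLI: `check_replica_bounds` is a hypothetical helper, and the worker count is passed in by the caller.

```shell
#!/bin/sh
# Hypothetical helper (not an ibmcloud command): checks replica bounds
# against the two recommendations described above.
# Usage: check_replica_bounds <min-replicas> <max-replicas> <worker-count>
check_replica_bounds() {
  min="$1"
  max="$2"
  workers="$3"
  status=0
  if [ "$min" -lt 2 ]; then
    echo "warning: min replicas ($min) is below 2, no high availability"
    status=1
  fi
  if [ "$max" -gt "$workers" ]; then
    echo "warning: max replicas ($max) exceeds worker count ($workers), excess ALB pods will not be scheduled"
    status=1
  fi
  if [ "$status" -eq 0 ]; then
    echo "ok"
  fi
  return "$status"
}
```

One possible way to obtain the worker count is `kubectl get nodes --no-headers | wc -l | tr -d ' '`, assuming every node in the cluster can run ALB pods. With 12 workers, `check_replica_bounds 2 12 12` prints `ok`.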

For example, to configure a minimum of 2 and a maximum of 12 replicas with a target CPU average utilization of 600%, you can use the following command:

$ ibmcloud ks ingress alb autoscale set -c <clusterID> --alb public-cr<clusterID>-alb1 --max-replicas 12 --min-replicas 2 --cpu-average-utilization 600

Keep in mind that it may take up to 10 minutes for the HPA resource to be deployed.
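
Rather than checking by hand during this window, you can poll for the HPA resource. The sketch below is only an illustration: `wait_for_hpa` is a hypothetical helper, the 15-second polling interval is an arbitrary choice, and it assumes that the HPA resource is named after the ALB ID and created in the kube-system namespace.

```shell
#!/bin/sh
# Hypothetical helper: polls until the named HPA exists in kube-system
# or the timeout expires.
# Usage: wait_for_hpa <hpa-name> [timeout-seconds] [interval-seconds]
wait_for_hpa() {
  name="$1"
  timeout="${2:-600}"   # default of 600 s mirrors the "up to 10 minutes" above
  interval="${3:-15}"   # arbitrary polling interval (assumption)
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # kubectl exits non-zero while the resource does not exist yet.
    if kubectl get horizontalpodautoscaler "$name" -n kube-system >/dev/null 2>&1; then
      echo "HPA $name found after ${elapsed}s"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "HPA $name not found within ${timeout}s" >&2
  return 1
}
```

For example, `wait_for_hpa "public-cr<clusterID>-alb1"` waits up to 10 minutes and returns a non-zero exit status if the resource never appears.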

To verify that your HPA resource is configured as intended, you can use the kubectl get horizontalpodautoscaler command:

$ kubectl get horizontalpodautoscaler -n kube-system
NAME                        REFERENCE                              TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
public-cr<clusterID>-alb1   Deployment/public-cr<clusterID>-alb1   11%/600%   2         12        2          131m
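
The TARGETS column in the output above shows the current average utilization followed by the target. To see how those numbers relate to the replica count, the sketch below applies the scaling rule from the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric), and then clamps the result to the configured bounds. `desired_replicas` is a hypothetical helper for illustration only. It ignores the HPA tolerance and stabilization behavior, so the real controller may decide differently.

```shell
#!/bin/sh
# Hypothetical helper: approximates the replica count the HPA aims for.
# Simplification: ignores the HPA tolerance and stabilization window.
# Usage: desired_replicas <current-replicas> <current-util-%> <target-util-%> <min> <max>
desired_replicas() {
  current="$1"
  util="$2"
  target="$3"
  min="$4"
  max="$5"
  # Integer ceiling of current * util / target.
  desired=$(( (current * util + target - 1) / target ))
  # Clamp to the configured minimum and maximum replica counts.
  if [ "$desired" -lt "$min" ]; then
    desired="$min"
  fi
  if [ "$desired" -gt "$max" ]; then
    desired="$max"
  fi
  echo "$desired"
}

# Values from the output above: 2 replicas at 11% against a 600% target.
desired_replicas 2 11 600 2 12     # prints 2 (raw result 1, raised to the minimum)
# Made-up load figure for illustration: 2 replicas at 1500%.
desired_replicas 2 1500 600 2 12   # prints 5
```

With utilization far below the target, the computed count falls under the minimum, which is why the output above shows the HPA holding at 2 replicas.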

You can use the same command to check whether the HPA is working as intended. The following command was executed on the same cluster after utilization had increased and the ALB was serving more requests. As you can see, the HPA automatically increased the number of ALB replicas as the load increased: