A look at how you can use the cluster autoscaler to scale your worker nodes.

If you want to scale the pods in a Kubernetes cluster, this can be done easily through a ReplicaSet within Kubernetes; but what if you want to scale your worker nodes? In this situation, the autoscaler can help you avoid having pods stuck in a pending state due to a lack of computational resources. It automatically increases or decreases the number of worker nodes in your cluster based on resource demand.

Scale-up and scale-down

To start, we need to understand how scale-up and scale-down work and the criteria they use. The autoscaler works based on the resource request values defined for your deployments/pods, not on the resources actually being consumed by the application.

  • Scale-up: This situation occurs when you have pending pods because there are insufficient computing resources. 
  • Scale-down: This occurs when worker nodes are considered underutilized: the total resource requests of the pods running on a node fall below the scale-down utilization threshold, which defaults to utilization below 50%.
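To see how much of a node's allocatable capacity is currently requested (the value the autoscaler evaluates), you can inspect the node directly. This is a generic kubectl check rather than an autoscaler feature, and the exact output layout may vary by Kubernetes version:

kubectl describe node <node_name> | grep -A 8 "Allocated resources"

The "Allocated resources" section lists the CPU and memory requests on that node both as absolute values and as a percentage of its allocatable capacity, which you can compare against the 50% threshold.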

Step-by-step instructions

In this step-by-step guide, we will show you how to install and configure the autoscaler on your IBM Cloud Kubernetes Service cluster and run a quick test to see how it works in practice.

Before you start, you'll need to install the required CLIs on your computer: ibmcloud, kubectl and Helm version 3 (the correct version is important due to the differences in commands between Helm 2 and Helm 3).

1. Confirm that your credentials are stored in your Kubernetes cluster

kubectl get secrets -n kube-system | grep storage-secret-store

If you do not have credentials stored, you need to create them.
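One way to recreate them, based on the IBM Cloud documentation, is to reset the API key for the region your cluster is in; treat this as a pointer rather than a definitive procedure and confirm the exact steps for your cluster type in the IBM Cloud docs:

ibmcloud ks api-key reset --region <region>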

2. Check if your worker pool has the required label

ibmcloud ks worker-pool get --cluster <cluster_name_or_ID> --worker-pool <worker_pool_name_or_ID> | grep Labels

If you don’t have the required label, you have to add a new worker pool. If you don’t know the <worker_pool_name_or_ID>, you can get it with this command:

ibmcloud ks worker-pool ls --cluster <cluster_name_or_ID>
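If you do need a new worker pool, the command looks roughly like the following sketch for classic infrastructure; <new_pool_name> and <flavor> are placeholders, the size-per-zone value of 2 is just an example, and the available flags differ between classic and VPC clusters, so check ibmcloud ks worker-pool create --help first:

ibmcloud ks worker-pool create classic --cluster <cluster_name_or_ID> --name <new_pool_name> --flavor <flavor> --size-per-zone 2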

3. Add and update the Helm repo on your computer

helm repo add iks-charts https://icr.io/helm/iks-charts
helm repo update

Note: If you try to add a repo and receive an error message that says “Error: Couldn’t load repositories file…”, you are likely running Helm 2, which requires initialization; at the prompt, type helm init. Helm 3 removed the init command and creates the repositories file automatically when you add a repo.

4. Install the cluster autoscaler helm chart in the kube-system namespace:

helm install ibm-iks-cluster-autoscaler iks-charts/ibm-iks-cluster-autoscaler --namespace kube-system --set workerpools[0].<pool_name>.max=<number_of_workers>,workerpools[0].<pool_name>.min=<number_of_workers>,workerpools[0].<pool_name>.enabled=(true|false)
  • workerpools[0]: The first worker pool to enable autoscaling.
  • <pool_name>: The name or ID of the worker pool.
  • max=<number_of_workers>: Specify the maximum number of worker nodes.
  • min=<number_of_workers>: Specify the minimum number of worker nodes.
  • enabled=(true|false): Set to true to enable autoscaling for this worker pool.

It is necessary to set the autoscaler's min value to at least the current number of worker nodes in your pool, because the min size does not automatically trigger a scale-up.

If you set up the autoscaler with a min size below the current number of worker nodes, the autoscaler does not initiate and needs to be set to the correct value before it works properly.

Note: In this option, we are using all the default values of the autoscaler and just specifying the minimum and maximum number of worker nodes. However, there are several options that you can change by using the --set option.
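For example, assuming a worker pool named default that currently has two worker nodes (hypothetical values for illustration), the command would be:

helm install ibm-iks-cluster-autoscaler iks-charts/ibm-iks-cluster-autoscaler --namespace kube-system --set workerpools[0].default.max=5,workerpools[0].default.min=2,workerpools[0].default.enabled=true

This lets the pool grow from two up to five worker nodes per zone as pending pods accumulate.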

5. Confirm that the pod is running and the service has been created

kubectl get pods --namespace=kube-system | grep ibm-iks-cluster-autoscaler
kubectl get service --namespace=kube-system | grep ibm-iks-cluster-autoscaler

6. Verify that the ConfigMap status is in the SUCCESS state

kubectl get cm iks-ca-configmap -n kube-system -o yaml
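If you only want the per-pool status lines instead of the whole YAML, a simple grep is enough (the exact key names in the ConfigMap depend on the chart version, so treat this as a convenience sketch):

kubectl get cm iks-ca-configmap -n kube-system -o yaml | grep -i success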

How to check if the autoscaler is working

In this example, we will perform a basic test by increasing the number of pods to check the scale-up and then decreasing it to check the scale-down.

Scale-up

Please remember that the autoscaler acts based on the pods’ request values, so we do not need to perform a stress test; we just have to increase the number of pods until we hit the worker nodes’ capacity limit.

If your deployment does not have the resource requests set up accordingly, the autoscaler won’t work as you expect. In our example, we will use a simple ReplicaSet that deploys four nginx pods.

nginx-test-autoscaler.yaml

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: web-autoscaler-replicaset
spec:
  selector:
    matchLabels:
      app: webserver
      tier: webserver
  replicas: 4
  template:
    metadata:
      labels:
        app: webserver
        tier: webserver
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        resources:
          requests:
            cpu: 100m
            memory: 500Mi
        ports:
        - containerPort: 80

Note: Take a look at the requests values: we are requesting 100m of CPU and 500Mi of memory. This is the request for each pod, so four replicas request a total of 400m of CPU and 2000Mi of memory.

Create the ReplicaSet using the YAML file:

kubectl apply -f nginx-test-autoscaler.yaml

Check the pod status; one of them is in the “Pending” state:

kubectl get pods
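The output will look something like the following (illustrative only; the pod name suffixes and ages will differ in your cluster):

NAME                              READY   STATUS    RESTARTS   AGE
web-autoscaler-replicaset-9hxk2   1/1     Running   0          45s
web-autoscaler-replicaset-kq7np   1/1     Running   0          45s
web-autoscaler-replicaset-s2v8d   1/1     Running   0          45s
web-autoscaler-replicaset-zl4wt   0/1     Pending   0          45s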

Let’s take a look at the pod’s events to verify the reason:

kubectl describe pod <podname> | grep Warning

In this case, the pod is in the pending state because there is insufficient memory on worker nodes.
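The warning event typically reads something like this (an illustrative example; the node counts reflect the cluster's current size):

Warning  FailedScheduling  default-scheduler  0/2 nodes are available: 2 Insufficient memory.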

Looking at the Kubernetes cluster inside the IBM Cloud portal, we can verify that a new node is being provisioned. The autoscaler identified that there are not enough computational resources to start the pending pod, so it is automatically scaling up to bring the pod to the running state.

After the nodes are provisioned, we can check the pod’s status and the number of nodes:

kubectl get nodes
kubectl get pods | grep web

Scale-down

In our test, we do not have a workload in our pods, so we just need to decrease the number of pods, which will decrease the total requests. After the autoscaler identifies that the pods on a node request less than 50% of its capacity, it will start the scale-down process.

Let’s change the ReplicaSet from four to two pods and confirm that only two pods are running:

kubectl scale rs web-autoscaler-replicaset --replicas=2
kubectl get rs web-autoscaler-replicaset
kubectl get pods

Looking at the Kubernetes cluster inside the IBM Cloud portal, we can verify that one node is being deleted.

Customizing configuration values

You can change the cluster autoscaler configuration values using the helm upgrade command with the --set option. If you want to learn more about the available parameters, see the IBM Cloud documentation: Customizing configuration values (--set)

Here are a few examples:

  1. Change the scan interval to 5m and enable autoscaling for the default worker pool, with a maximum of five and minimum of three worker nodes per zone:
    helm upgrade --set scanInterval=5m --set workerpools[0].default.max=5,workerpools[0].default.min=3,workerpools[0].default.enabled=true ibm-iks-cluster-autoscaler iks-charts/ibm-iks-cluster-autoscaler -i --recreate-pods --namespace kube-system
  2. Change the scale-down utilization threshold to 0.7 and set the maximum inactivity time before the pod is automatically restarted to five minutes:
    helm upgrade --set scaleDownUtilizationThreshold=0.7 --set max-inactivity=5min ibm-iks-cluster-autoscaler iks-charts/ibm-iks-cluster-autoscaler -i --recreate-pods --namespace kube-system
  3. We saw in the examples above that there are options to customize the values, but what if you want to return to the default values? There is a command to reset the settings:
    helm upgrade --reset-values ibm-iks-cluster-autoscaler iks-charts/ibm-iks-cluster-autoscaler --recreate-pods

Conclusion

The autoscaler can help you avoid having pods in a pending state in your environment due to a lack of computational resources by increasing the number of worker nodes, and it decreases the number of worker nodes when they are underutilized.

In this article, we gave just one example of how it can be used and explored the customization options to adapt it to what works best for your environment. It’s also important to take into consideration the time that IBM Cloud takes to provision a new worker node so that you can tune the thresholds appropriately for your environment; validating this before going into production is recommended.

Learn more

Want to get some free, hands-on experience with Kubernetes? Take advantage of interactive, no-cost Kubernetes tutorials by checking out IBM CloudLabs.
