Managing Worker Nodes with Terraform
1 March 2023
4 min read
Learn how to migrate your worker pools to a new operating system like Ubuntu 20.

In the following example scenarios, you will learn how to use Terraform to migrate your worker nodes to a new Ubuntu version (e.g., from Ubuntu 18 to Ubuntu 20) and change your default worker pool to use different worker nodes.
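All of the examples assume that the IBM Cloud provider plug-in for Terraform is already configured. That setup is not part of the scenarios below, but a minimal sketch could look like the following (the variable names used for the API key and region are assumptions):

    terraform {
      required_providers {
        ibm = {
          source = "IBM-Cloud/ibm"
        }
      }
    }

    # The API key and region variables are assumptions; configure them for your account.
    provider "ibm" {
      ibmcloud_api_key = var.ibmcloud_api_key
      region           = var.region
    }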

 
Migrating to a new Ubuntu version with Terraform

To migrate your worker nodes to a new Ubuntu version, you must first provision a worker pool that uses a newer Ubuntu version. Then, you can add worker nodes to the new pool and finally remove the original worker pool.

  1. We begin with the following example cluster configuration. This cluster contains an Ubuntu 18 worker pool called oldpool:

     

    resource "ibm_container_vpc_cluster" "cluster" {
        ...
        }
    
        resource "ibm_container_vpc_worker_pool" "oldpool" {
        cluster          = ibm_container_vpc_cluster.cluster.id
        worker_pool_name = "ubuntu18pool"
        flavor           = var.flavor
        vpc_id           = data.ibm_is_vpc.vpc.id
        worker_count     = var.worker_count
        ...
        operating_system = "UBUNTU_18_64"
        }
    

     

  2. Next, add a worker pool resource for your Ubuntu 20 workers and update the Ubuntu 18 pool so that both pool sizes are driven by a temporary new_worker_count variable, which controls the migration (a sketch of the assumed variable declarations follows the example):

     

        resource "ibm_container_vpc_worker_pool" "oldpool" {
        count = var.worker_count - var.new_worker_count == 0 ? 0 : 1
    
        cluster          = ibm_container_vpc_cluster.cluster.id
        worker_pool_name = "ubuntu18pool"
        flavor           = var.flavor
        vpc_id           = data.ibm_is_vpc.vpc.id
        worker_count     = var.worker_count - var.new_worker_count
        ...
        operating_system = "UBUNTU_18_64"
        }
    
    
        resource "ibm_container_vpc_worker_pool" "newpool" {
        count = var.new_worker_count == 0 ? 0 : 1
    
        cluster          = ibm_container_vpc_cluster.cluster.id
        worker_pool_name = "ubuntu20pool"
        flavor           = var.flavor
        vpc_id           = data.ibm_is_vpc.vpc.id
        worker_count     = var.new_worker_count
        ...
        operating_system = "UBUNTU_20_64"
        }
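
     The example references var.worker_count and var.new_worker_count, but the variable declarations are not shown. A minimal sketch (the types and defaults are assumptions) could look like the following:

    # Total number of workers the pool should have.
    variable "worker_count" {
      type    = number
      default = 3
    }

    # Temporary variable: how many workers have been migrated to the Ubuntu 20 pool so far.
    variable "new_worker_count" {
      type    = number
      default = 0
    }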
    

     

  3. Start the migration by gradually increasing the new_worker_count variable. In the following example, new_worker_count is set to 1:

     

    terraform plan -var new_worker_count=1
    
    terraform apply -var new_worker_count=1
    

     

  4. Review the following actions that are performed when you change the worker count:

     

        # ibm_container_vpc_worker_pool.newpool[0] will be created
        + resource "ibm_container_vpc_worker_pool" "newpool" {
            + cluster                 = "<clusterid>"
            + flavor                  = "bx2.4x16"
            + id                      = (known after apply)
            + labels                  = (known after apply)
            + operating_system        = "UBUNTU_20_64"
            + resource_controller_url = (known after apply)
            + resource_group_id       = (known after apply)
            + secondary_storage       = (known after apply)
            + vpc_id                  = "<vpcid>"
            + worker_count            = 1
            + worker_pool_id          = (known after apply)
            + worker_pool_name        = "ubuntu20pool"
    
            + zones {
                + name      = "<zone_name>"
                + subnet_id = "<subnet_id>"
            }
        }
    
        # ibm_container_vpc_worker_pool.oldpool[0] will be updated in-place
        ~ resource "ibm_container_vpc_worker_pool" "oldpool" {
            id                      = "<oldpoolid>"
            ~ worker_count            = 3 -> 2
            # (9 unchanged attributes hidden)
    
            # (1 unchanged block hidden)
        }
    
        Plan: 1 to add, 1 to change, 0 to destroy.
    

     

  5. Verify that the new worker pool and its workers have been created and that the old worker pool has been scaled down, for example with the CLI commands shown below.
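
     One way to check is the IBM Cloud CLI (the cluster ID is a placeholder):

    # List the worker pools and their current sizes
    ibmcloud ks worker-pool ls -c <cluster_id>

    # List the individual worker nodes and their states
    ibmcloud ks worker ls -c <cluster_id>
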
  6. Finish the migration by setting new_worker_count to the value that the old pool had before the migration (3 in this example). This scales the new pool up to its full size and, because var.worker_count - var.new_worker_count becomes 0, sets the old pool's count to 0 so that it is destroyed. As a best practice, always review your changes using the terraform plan command:

     

    terraform plan -var new_worker_count=3
    
    terraform apply -var new_worker_count=3
    
    ...
    
        Terraform will perform the following actions:
    
        # ibm_container_vpc_worker_pool.newpool[0] will be updated in-place
        ~ resource "ibm_container_vpc_worker_pool" "newpool" {
                id                      = "<newpoolid>"
            ~ worker_count            = 2 -> 3
                # (9 unchanged attributes hidden)
    
                # (1 unchanged block hidden)
            }
    
        # ibm_container_vpc_worker_pool.oldpool[0] will be destroyed
        - resource "ibm_container_vpc_worker_pool" "oldpool" {
            - cluster                 = "<clusterid>" -> null
            ...
            }
    
        Plan: 0 to add, 1 to change, 1 to destroy.
    

     

  7. Verify that the old worker pool has been deleted.
  8. Remove the old worker pool resource and the temporary new_worker_count changes from your Terraform configuration:

     

    resource "ibm_container_vpc_cluster" "cluster" {
        ...
        }
    
        resource "ibm_container_vpc_worker_pool" "newpool" {
        cluster          = ibm_container_vpc_cluster.cluster.id
        worker_pool_name = "ubuntu20pool"
        flavor           = var.flavor
        vpc_id           = data.ibm_is_vpc.vpc.id
        worker_count     = var.worker_count
        ...
        operating_system = "UBUNTU_20_64"
        }
    
Changing the default worker pool

Begin by defining the worker pool as its own resource.

While you are changing the default worker pool, a backup worker pool is required if the change includes a `ForceNew` operation. If you update the default worker pool without having a separate worker pool with existing workers already added, your cluster will stop working until the worker replacement is finished.
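
If you do not already have such a pool, you can add a temporary one before changing the default pool. The following is only a sketch; the pool name, size, operating system and the zone placeholders are assumptions:

    # Temporary backup pool that keeps the cluster serving workloads
    # while the default pool's workers are being replaced.
    resource "ibm_container_vpc_worker_pool" "backup" {
      cluster          = ibm_container_vpc_cluster.cluster.id
      worker_pool_name = "backuppool"
      flavor           = var.flavor
      vpc_id           = data.ibm_is_vpc.vpc.id
      worker_count     = 1
      operating_system = "UBUNTU_20_64"

      zones {
        name      = "<zone_name>"
        subnet_id = "<subnet_id>"
      }
    }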

  1. Create the resource similar to the following example:
    resource "ibm_container_vpc_cluster" "cluster" {
            ...
        }
    
    
        resource "ibm_container_vpc_worker_pool" "default" {
            cluster           = ibm_container_vpc_cluster.cluster.id
            flavor            = <flavor>
            vpc_id            = <vpc_id>
            worker_count      = 1
            worker_pool_name  = "default"
            operating_system  = "UBUNTU_18_64"
            ...
        }
    

     

     

  2. Import the worker pool:
    terraform import ibm_container_vpc_worker_pool.default <cluster_id/workerpool_id>
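
     If you need to look up the cluster and worker pool IDs for the import, the IBM Cloud CLI can list them. For example:

    # Find the cluster ID
    ibmcloud ks cluster ls

    # Find the ID of the default worker pool
    ibmcloud ks worker-pool ls -c <cluster_id>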
    

     

     

  3. Add the following lifecycle options to ibm_container_vpc_cluster.cluster so changes made through ibm_container_vpc_worker_pool.default won't trigger new updates and won't trigger ForceNew. Note that the events that trigger ForceNew might change. Always run terraform plan and review the changes before applying them:
      resource "ibm_container_vpc_cluster" "cluster" {
            ...
            lifecycle {
                ignore_changes = [
                    flavor, operating_system, host_pool_id, secondary_storage, worker_count
                ]
            }
        }
    

     

  4. In this example, we modify the operating system of the default worker pool and set the worker count to two. Note that updating the worker count would normally resize the worker pool, but since we changed the operating system, a new worker pool is created. Making this change on a cluster resource would trigger the ForceNew option on the cluster itself and would result in a new cluster being created. However, since we defined the worker pool resource separately, new workers are created instead:
    resource "ibm_container_vpc_worker_pool" "default" {
        cluster           = ibm_container_vpc_cluster.cluster.id
        flavor            = <flavor>
        vpc_id            = <vpc_id>
        worker_count      = 2
        worker_pool_name  = "default"
        operating_system  = "UBUNTU_20_64"
        ...
    }
    

     

  5. Run terraform plan to review your changes:

    terraform plan

     

  6. Apply your changes to replace your Ubuntu 18 worker nodes with Ubuntu 20 worker nodes:

    terraform apply

     

  7. Verify your changes by listing your worker nodes:
    ibmcloud ks worker ls -c <cluster_id>
    

     

  8. After updating the default worker pool, clean up: pull your changes into the current state, then remove the worker pool from the Terraform state and remove the lifecycle options you added earlier (steps 9 and 10). Start by refreshing the state:

     

    terraform apply -refresh-only
    

     

  9. Then, remove the ibm_container_vpc_worker_pool.default resource from the Terraform state so it is no longer managed by Terraform, and delete its resource block from your configuration so that Terraform does not try to recreate it:

     

    terraform state rm ibm_container_vpc_worker_pool.default
    

     

  10. Remove the lifecycle options that you added earlier from the cluster resource, and run a final terraform plan to confirm that no unexpected changes are pending (see the sketch below).
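
      The final shape of the cluster resource is not shown in the original example. A minimal sketch of what it might look like after the cleanup, with the default pool arguments updated to the new values so that a final terraform plan no longer reports drift, could be:

    resource "ibm_container_vpc_cluster" "cluster" {
      ...
      # Default worker pool arguments updated to match the migrated workers;
      # the temporary lifecycle block has been removed.
      flavor           = <flavor>
      worker_count     = 2
      operating_system = "UBUNTU_20_64"
    }

      Review the plan output carefully; if it still proposes changes to the cluster resource, adjust the configuration before applying anything.
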
Conclusion

In the previous examples, you learned how to do the following:

  • Migrate your worker pools to a new operating system, such as Ubuntu 20.
  • Make changes to the default worker pool while using a backup pool to prevent downtime.

For more information about the IBM Cloud provider plug-in for Terraform, see the Terraform registry documentation.

For more information about IBM Cloud Kubernetes Service, see the docs.

For more information about Red Hat OpenShift on IBM Cloud, see the docs.

Author
Attila László Tábori Software Developer
Zoltán Illés Software Developer