Upgrade from previous versions of the IBM® Cloud Infrastructure Center

How to upgrade an IBM® Cloud Infrastructure Center 1.2.0 multi-node cluster deployment environment to a 1.2.1 multi-node cluster deployment environment.

The steps are as follows:

  1. Check the 1.2.0 multi-node cluster.
    Run the clustercheck and pcs resource commands on the DC node (you can find the DC node by running the pcs cluster status command) to make sure that the 1.2.0 multi-node cluster deployment environment is working normally. All three management nodes must be online. If clustercheck reports that the Galera cluster node is synced, the environment is ready for the backup.

    [root@doc-ha-mgmt-1 ~]# pcs cluster status
    Cluster Status:
    Cluster Summary:
       * Stack: corosync
       * Current DC: node3-ip-address (version 2.1.2-4.el8_6.3-ada5c3b36e2) - partition with quorum
       * Last updated: Mon Nov 27 20:22:33 2023
       * Last change:  Fri Nov 17 01:59:56 2023 by root via cibadmin on node3-ip-address
       * 3 nodes configured
       * 110 resource instances configured (8 DISABLED)
    Node List:
       * Online: [ node1-ip-address node2-ip-address node3-ip-address ]
    
    PCSD Status:
    node1-ip-address: Online
    node2-ip-address: Online
    node3-ip-address: Online
    
    
    [root@doc-ha-mgmt-1 ~]# clustercheck
    HTTP/1.1 200 OK
    Content-Type: text/plain
    Connection: close
    Content-Length: 32
    
    Galera cluster node is synced.
    
  2. Apply the 1.2.0.0 interim fix 03.
    Before the upgrade, if the multi-node cluster deployment environment has KVM compute nodes, you need to download and apply 1.2.0.0 interim fix 03.

  3. Back up the 1.2.0 multi-node cluster data.

    • Before the backup, make sure there is enough space on the local disk. Alternatively, you can point the backup folder to a mount from an NFS server or IBM® Spectrum Storage.
      Note: The icic-opsmgr backup command checks the disk space. If there is not enough disk space for the backup, the backup fails with a related error message. You can estimate the required space roughly by comparing the available disk space with the size of the Swift object storage in the multi-node cluster. Run the df -h /srv/node/partition1 command to see the size of the Swift object storage.

    • Run icic-opsmgr backup -c <clustername> -p <backup_folder> to back up the 1.2.0 multi-node cluster. Currently, you need to manually run tar -xvf icic_backup.tar.gz to verify that the backup file is not incomplete, as shown in the sketch below. Refer to Backing up multi-node cluster for the details.
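
      For example, assuming a cluster named icic-cluster and a backup folder /backup/icic120 (both placeholder values), and assuming the backup archive is written into that folder, the backup and the manual verification might look like this:

      # check the free local disk space against the size of the Swift object storage
      df -h /srv/node/partition1
      # back up the 1.2.0 multi-node cluster into the backup folder
      icic-opsmgr backup -c icic-cluster -p /backup/icic120
      # extract the archive to confirm that the backup file is not incomplete
      cd /backup/icic120 && tar -xvf icic_backup.tar.gz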

  4. Run the pre-upgrade validation tool.
    Refer to the Upgrade Validator Overview and run the pre-upgrade validation tool to check whether the environment is ready for the upgrade. Any problems reported by the pre-upgrade validation tool must be fixed before the upgrade.

  5. Keep the backup files safe.
    Make sure that all the backup files in the above <backup_folder> are stored in a secure place. Copy them to a different remote server or another NFS server for emergency use, as shown in the sketch below.

    If the backup files are corrupted or deleted by mistake, the 1.2.0 multi-node cluster deployment environment cannot be recovered either. That is a serious disaster.
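
    For example, assuming the backup folder is /backup/icic120 and a remote backup server backup-server.example.com is reachable over SSH (both placeholder values), you could copy the backup files off the management node like this:

    # copy the 1.2.0 backup files to a different remote server for safekeeping
    scp -r /backup/icic120 root@backup-server.example.com:/backup/icic120-copy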

  6. Save the 1.2.0 multi-node cluster definition.
    Before uninstalling the 1.2.0 multi-node cluster, run the icic-opsmgr inventory -l -j command, save the output to a text file, and copy the file to a secure place, as shown in the sketch below. You need this content when creating the 1.2.1 multi-node cluster definition later.
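
    For example, assuming /root/icic_120_inventory.json as the file name and the same placeholder backup server as in step 5, you could save and copy the cluster definition like this:

    # save the 1.2.0 multi-node cluster definition to a text file
    icic-opsmgr inventory -l -j > /root/icic_120_inventory.json
    # copy the file to a secure remote location
    scp /root/icic_120_inventory.json root@backup-server.example.com:/backup/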

  7. Uninstall the 1.2.0 multi-node cluster.

    • Run icic-opsmgr uninstall -c <clustername> to uninstall the 1.2.0 multi-node cluster deployment environment.
      Note: You do not need to uninstall the compute nodes in the 1.2.0 multi-node cluster.

    • If the above command cannot uninstall the environment successfully, go to each management node of the multi-node cluster and run the /opt/ibm/icic-opsmgr/scripts/icic-failed-node-uninstall -y -f command to uninstall the environment forcibly and completely, as shown in the sketch below. Refer to the Uninstallation of multi-node cluster for the details.
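
      For example, with the placeholder cluster name icic-cluster, the uninstallation and, if needed, the forced per-node cleanup might look like this:

      # uninstall the 1.2.0 multi-node cluster (compute nodes are not uninstalled)
      icic-opsmgr uninstall -c icic-cluster
      # only if the command above fails: run the forced uninstallation on each management node
      /opt/ibm/icic-opsmgr/scripts/icic-failed-node-uninstall -y -f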

  8. Uninstall the 1.2.0 opsmgr utility.
    Run the /tmp/uninstall-opsmgr.sh shell script to uninstall the 1.2.0 opsmgr utility.

  9. Prepare the installation of the 1.2.1 multi-node cluster. Download the IBM® Cloud Infrastructure Center 1.2.1 build and extract the tarball, then go to the icic-opsmgr-1.2.1.0 folder and run the setup_opsmgr.sh shell script to set up the 1.2.1 opsmgr utility, as shown in the sketch below.
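
    For example, assuming the downloaded 1.2.1 build tarball is named icic-opsmgr-1.2.1.0.tgz and is located in /tmp (the actual file name depends on the build you download), the preparation might look like this:

    # extract the 1.2.1 build tarball
    cd /tmp && tar -zxvf icic-opsmgr-1.2.1.0.tgz
    # go to the extracted folder and set up the 1.2.1 opsmgr utility
    cd icic-opsmgr-1.2.1.0
    ./setup_opsmgr.sh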

  10. Create the 1.2.1 multi-node cluster definition.
    Run icic-opsmgr inventory -c <clustername> to create the 1.2.1 multi-node cluster definition. Note: During the upgrade, the 1.2.1 cluster definition must be consistent with the 1.2.0 multi-node cluster definition. You can refer to the 1.2.0 multi-node cluster definition that you saved in step 6 above.
    Note: You need to make sure that the following settings in the 1.2.1 multi-node cluster definition are the same as those in the 1.2.0 multi-node cluster definition.

    • The cluster name

    • The choice of whether to apply Swift Object Storage

    • The virtual IP

    • The three IPs of the management nodes

  11. Install the 1.2.1 multi-node cluster.
    Refer to Deploy multi-node cluster for the details.

  12. Check the 1.2.1 multi-node cluster.

    • Run the icic-services and pcs resource commands on one of the management nodes in the 1.2.1 multi-node cluster to make sure that the services have started and the multi-node cluster status is okay, as shown in the sketch after this list.

    • Log in to the 1.2.1 IBM® Cloud Infrastructure Center UI to check whether the 1.2.1 multi-node cluster is working correctly.
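
    For example, on one of the management nodes, the service and cluster checks might look like this (the status argument for icic-services is an assumption; adjust it to the options supported in your environment):

    # check that the IBM Cloud Infrastructure Center services are started
    icic-services status
    # check that all Pacemaker resources are started on the cluster nodes
    pcs resource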

  13. Back up the 1.2.1 multi-node cluster data.
    You need to take a backup of the 1.2.1 multi-node cluster data and copy it to a secure place in case of emergency. Refer to Backing up multi-node cluster for the details.
    Note: Do not mix it with the 1.2.0 backup file.
    Note: If unexpected errors happen on the DC management node during the upgrade (for example, the DC management node goes down suddenly), the multi-node cluster data can end up in an inconsistent state. You can use some files (icic-keystone.conf, icic-db.conf, and so on) in the 1.2.1 backup file to recover the environment and continue the upgrade. In summary, the 1.2.1 backup file can help you recover the environment after some disasters. Without the 1.2.1 backup file, in some disasters, you need to reinstall the 1.2.1 multi-node cluster deployment environment.

  14. Upgrade on the DC management node.
    Run the icic-opsmgr restore -c <clustername> -b <120_backup_folder> command on the DC management node to start the upgrade, as shown in the sketch below. For the restore command, refer to Recover data for multi-node cluster deployment for the details.
    Note: <120_backup_folder> is the backup folder, which contains the 1.2.0 backup files.
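
    For example, with the placeholder cluster name icic-cluster and the 1.2.0 backup folder /backup/icic120 from the earlier steps, the upgrade is started like this:

    # run the restore against the 1.2.0 backup on the DC management node to start the upgrade
    icic-opsmgr restore -c icic-cluster -b /backup/icic120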

  15. Verify the upgrade process from 1.2.0 to 1.2.1.

    • If the restore playbook fails during the upgrade, you need to keep the 1.2.1 multi-node cluster environment unchanged and contact the IBM® Cloud Infrastructure Center service support team for help immediately.

    • If the restore playbook finishes successfully, you need to follow the Upgrade Validator Overview and run the post-upgrade validation tool to check whether there are errors or exceptions in the logs from the upgrade. Even if the restore playbook executes well, that does not guarantee that all compute nodes have been upgraded successfully. For example, if the yum .repo configuration on a compute node was misconfigured or deleted before the upgrade, the upgrade fails on that compute node.

  16. Rescue the compute nodes that failed to upgrade (optional).

    • When the post-upgrade validation tool detects that some compute nodes failed to upgrade, run icic-opsmgr restore -c <clustername> -r [host1,...,hostn] --upgrade -b <backup_folder> to upgrade those compute nodes again, as shown in the sketch after this list. This command retries the upgrade on the specified compute nodes based on the backup files.
      Note: The <backup_folder> needs to contain the backup files of the compute nodes.

    • Run icic-services restart to restart the 1.2.1 multi-node cluster after rescuing the compute nodes that failed to upgrade.
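
    For example, assuming compute nodes named kvm-host1 and kvm-host2 failed to upgrade (placeholder host names), the cluster is named icic-cluster, and the compute node backups are in /backup/icic120, the rescue might look like this (the host-list syntax follows the -r [host1,...,hostn] form shown above):

    # retry the upgrade on the failed compute nodes based on the backup files
    icic-opsmgr restore -c icic-cluster -r [kvm-host1,kvm-host2] --upgrade -b /backup/icic120
    # restart the 1.2.1 multi-node cluster after the rescue
    icic-services restart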

  17. Re-add users to the icic-filter group (optional).
    When you use the local operating system as your identity repository, you need to re-add these users to the icic-filter group to make them visible to the IBM Cloud Infrastructure Center after the upgrade. Refer to Configuring security for more details.

  18. Verify the upgraded 1.2.1 multi-node cluster.

    • Log in to the 1.2.1 IBM Cloud Infrastructure Center UI

    • Run the Environment Checker

    • Check the health status of resources

    • Deploy virtual machines, create networks, create volumes, and so on, to ensure that the features in the 1.2.1 multi-node cluster work as expected.