Performing an online upgrade from an x86 Utility Node management server

Remember: Throughout this procedure, after each reboot or before each restart of a management server (Power9) or a management server VM (IBM® Storage Scale System Utility Node), you must use the following command to start Chrony:
systemctl start chronyd
If you do not start Chrony after a container reboot or before a container restart, the following error is displayed:
ERROR: Neither NTPD or CHRONYD are up. One time server service is required to run on EMS.
Please run this command prior to start the container "systemctl start chronyd"
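The Chrony requirement can be turned into a small guard that you run before each container start. The following sketch shows only the decision logic; svc_state is a hypothetical stub that stands in for systemctl is-active, so the flow runs without a live systemd.

```shell
# Hypothetical guard for the Chrony requirement. svc_state stubs
# `systemctl is-active <unit>`; on a real management server, query
# systemd directly instead of using these placeholder variables.
CHRONYD_STATE=${CHRONYD_STATE:-inactive}
NTPD_STATE=${NTPD_STATE:-inactive}

svc_state() {
  case "$1" in
    chronyd) echo "$CHRONYD_STATE" ;;
    ntpd)    echo "$NTPD_STATE" ;;
    *)       echo "unknown" ;;
  esac
}

ensure_time_service() {
  if [ "$(svc_state chronyd)" = "active" ] || [ "$(svc_state ntpd)" = "active" ]; then
    echo "time service already running"
  else
    echo "starting chronyd"   # real command: systemctl start chronyd
  fi
}

ensure_time_service
```

Either chronyd or ntpd satisfies the check, matching the error text above; only if neither is active does the guard fall through to starting chronyd.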
  1. Copy the IBM Storage Scale release image that you previously downloaded from IBM Fix Central to the x86 Utility Node management server. Preferably, copy the release image to the /home/deploy directory in the Utility Node MSVM.
  2. If a protocol node is part of the environment, copy the IBM Storage Scale release image that you previously downloaded from IBM Fix Central to the Utility Node MSVM. Preferably, copy the release image to the /home/deploy directory in the Utility Node MSVM.
  3. Set up the target release container in the Utility Node MSVM by following the next substeps.
    1. Decompress the package.
      cd /home/deploy ; xz --decompress <xz package>
    2. Decompress the generated tarball.
      tar -xvf <uncompressed tarball>
    3. Start the container.
      sh <untarred binary file>
      1. When the following prompt is displayed, type y for yes and n for no.
        Is the current EMS FQDN <hostname.domain> correct (y/n):
        Remember: This hostname must be the management hostname.
      2. When the following prompt is displayed, type the desired name for the new container.
        Please type the desired container short hostname [cems]:

        This hostname should not be registered in the /etc/hosts file or resolvable by any DNS server.

      3. When the following prompt is displayed, choose one of the options.
        Inventory file exists, do you want to wipe it out? (y/n):
        Type y for yes. This option deletes the hardware inventory from the container. This is the best option for a clean start.
        Note: If this is not the first time that you run the container or if there is no hardware change on the system, it is not necessary to clean up the inventory. In a later step, the inventory file is populated with the latest hardware information of the system by running the essrun config load command.
        Type n for no. This option does not delete the hardware inventory from the container.
        Note: It is highly recommended to run the essrun config load command in a later step, even if the hardware inventory from the container is not deleted.
    4. When the container prompt is displayed, run the essrun -N utilityBareMetal config load command, as shown in the next example.
      Container prompt:
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ #
      Note:
      • In the essrun tool, the reserved word utilityBareMetal is used to refer to an x86 Utility Node management server.
      • In this procedure, emsvm is used to refer to the virtual machine of an x86 Utility Node management server.
      If necessary, populate the hardware inventory file for all the nodes by running the following commands.
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # essrun -N utilityBareMetal config load
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # essrun -N emsvm,ems,ems2,essio1,essio2,prt01,prt02 config load
    5. In preparation for the next step, run the update --precheck command to verify that the system is healthy.

      If a failure occurs during the run of the update --precheck command, correct the issues that are flagged in the output before you run the command again.

      If no issues are flagged, proceed with the update.

      The following example shows how to run this command.
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # essrun -N utilityBareMetal,emsvm,ems2,essio1,essio2,prt01,prt02 update --precheck
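The decompress-and-extract sequence from the beginning of step 3 can be rehearsed end to end with a dummy package. The file names below are placeholders, not real release image names (a real image is named like ess_6.2.x.x_...):

```shell
# Build a dummy package, then unpack it with the same commands as step 3.
# installer.sh and package.tar are placeholder names for illustration.
demo=$(mktemp -d)
cd "$demo"
echo "dummy installer" > installer.sh
tar -cf package.tar installer.sh
xz package.tar                      # compress: produces package.tar.xz
xz --decompress package.tar.xz      # substep 3.a: restores package.tar
tar -xf package.tar                 # substep 3.b: extracts installer.sh
cat installer.sh
```

With a real release image, the final step is `sh` on the untarred binary file rather than `cat`, which starts the container setup.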
  4. Update the x86 Utility Node management server.
    1. Run the initial update.
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # essrun -N utilityBareMetal update
    2. In the prompt that is displayed, choose one of the options.
      Please enter 'accept' indicating that you want to update the following list of nodes: utilityBareMetal or enter 'no' if you don't want to proceed.

      Type accept if you agree to update listed nodes.

      After you accept, the update begins.

      If no error due to a kernel or OFED change is reported, skip to step 5.

      If the update reports an error related to a kernel or OFED change, follow substep 4.c.

    3. If there is a kernel or OFED change, you are requested to exit the container, reboot, and restart the container.
      The prompt is similar to the following one:
      msg:
          - "Seems that kernel has changed. This will require a reboot "
          - "Please exit container, shutdown VM and reboot utilityBareMetal"
          - "Restart VM (./emsvm --start-EMS ) and container (./essmkyml --restart) once utilityBareMetal is back and run update again."
    4. Reboot the x86 Utility Node management server.
      Warning: This step stops the management server VM, which means that IBM Storage Scale on that node becomes inactive. Before you perform this step, use the mmgetstate -s command to verify that quorum is maintained if this node goes inactive.
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # ssh utilityBareMetal
      root@<utilitynode_hostname>:/ # systemctl reboot
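The quorum warning above can be reduced to quick arithmetic. The counts below are illustrative stand-ins for the values that mmgetstate -s reports; the simple-majority rule sketched here does not cover tiebreaker-disk configurations, which behave differently.

```shell
# Illustrative quorum check before taking one quorum node down. Replace
# these sample counts with the values reported by `mmgetstate -s`.
quorum_defined=3
quorum_active=3
needed=$(( quorum_defined / 2 + 1 ))    # simple node majority

if [ $(( quorum_active - 1 )) -ge "$needed" ]; then
  echo "safe to reboot: $(( quorum_active - 1 )) of $quorum_defined quorum nodes remain"
else
  echo "do NOT reboot: quorum would be lost"
fi
```

With 3 defined and 3 active quorum nodes, one node can go down and the remaining 2 still meet the majority of 2.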
    5. When the x86 Utility Node management server is back from the reboot, restart the management server VM (if it was not automatically activated).
      1. Check whether the management server VM is automatically active after the x86 Utility Node is back from the update.
        virsh list --state-running
        If the management server VM is running, the output of the previous command is similar to the next example, where the state is reported as running.
        [root@utilityBareMetal ~]# virsh list --state-running
         Id   Name        State
        ---------------------------
         1    EMSVM-23E   running
        
      2. If the management server VM is not running, start it.
        cd /serv/EMSVM-main/ ; ./emsvm --start-EMS
      3. When the following prompt is displayed, type y.
        Do you want to continue with the start EMS VM? (y/n):
        Then, the next message is displayed.
        EMS VM is started now ...
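The running-state check in substep a can also be scripted. The following sketch parses virsh list --state-running output, captured here as a sample string so that it runs without libvirt, and decides whether a manual start is needed:

```shell
# Decide whether the management server VM needs a manual start by parsing
# `virsh list --state-running` output. The sample text below stands in for
# the real command's output.
vm_running() {
  # $1 = VM name prefix; stdin = virsh list output
  awk -v name="$1" '$2 ~ "^"name && $3 == "running" { found = 1 } END { exit !found }'
}

sample_output=' Id   Name        State
---------------------------
 1    EMSVM-23E   running'

if printf '%s\n' "$sample_output" | vm_running EMSVM; then
  echo "VM already running"
else
  echo "manual start needed: cd /serv/EMSVM-main/ && ./emsvm --start-EMS"
fi
```

On a real x86 Utility Node, replace the sample text with the live command: virsh list --state-running | vm_running EMSVM.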
    6. When the x86 Utility Node management server and the management server VM are back from the reboot, establish an SSH connection to the management server VM again, and then restart the container in the management server VM by running the essmkyml script, as described in the next substeps.

      The essmkyml script is in the extracted directory of the IBM Storage Scale System image that was used in step 3.c. Usually, the path is /home/deploy/ess_6.2.x.x_...dir/.

      1. Run the --restart command.
        cd /home/deploy/ess_6.2.x.x_...dir/ ; ./essmkyml --restart
      2. When the following prompt is displayed, choose one of the options.
        Is the current EMS FQDN <hostname.domain> correct (y/n):

        Type y for yes.

        Type n for no.

        Remember: This hostname must be the management hostname.
      3. If the following prompt is displayed, press Enter.
        Please type the desired container short hostname [containerHostname] :
      4. When the following prompt is displayed, type n.
        Inventory file exists, do you want to wipe it out? (y/n):
        This action takes you back to the container prompt:
        ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ #
    7. Still in the container prompt, run a second update.
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # essrun -N utilityBareMetal update
      After the second update finishes successfully, you see a message similar to the next example.
      msg:
          - "Upgrade of ESS Utility Node (Bare metal) Complete."

    If this second update of the x86 Utility Node management server was successful, proceed with the management server VM update.

  5. Update the x86 Utility Node management server VMs.

    Depending on the number of x86 Utility Node management server VMs in the environment, use one of the next options.

    Option 1. For one x86 Utility Node management server VM in the environment, complete these substeps.
    1. Run the initial update.
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # essrun -N emsvm update
      In the prompt that is displayed, choose one of the options.
      Please enter 'accept' indicating that you want to update the following list of nodes: emsvm or enter 'no' if you don't want to proceed.

      Type accept if you agree to update listed nodes.

      After you accept, the update begins.

      If there is a kernel or OFED change, you are requested to exit the container, reboot, and restart the container.

      The prompt is similar to the following one:
      msg:
       - 'Seems that kernel has changed. This will require a reboot '
       - Please exit container and reboot <MScontainer_hostname>
       - Restart container (./essmkyml --restart) once ems is back and run update again.
      Restart the performance monitoring collector and sensors, and then continue with step 6. Use the following commands:
      systemctl start pmcollector
      systemctl start pmsensors

      If the previous message is not displayed, restart the performance monitoring sensors and the GUI service. Then, continue with step 6.

    2. Reboot the x86 Utility Node management server VM.
      ESS UNIFIED v6.2.x.x CONTAINER root@<MScontainer_hostname>:/ # exit
      root@emsHostname:/ # reboot
    3. When the x86 Utility Node management server is back from the reboot, run the essmkyml script by completing the next substeps.

      The essmkyml script is in the extracted directory of the IBM Storage Scale System image that was used in step 3.c. Usually, the path is /home/deploy/ess_6.2.x.x_...dir/.

      1. Run the --restart command.
        cd /home/deploy/ess_6.2.x.x_...dir/ ; ./essmkyml --restart
      2. When the following prompt is displayed, choose one of the options.
        Is the current EMS FQDN <hostname.domain> correct (y/n):

        Type y for yes.

        Type n for no.

        Remember: This hostname must be the management hostname.
      3. If the following prompt is displayed, press Enter.
        Please type the desired container short hostname [containerHostname] :
      4. When the following prompt is displayed, type n.
        Inventory file exists, do you want to wipe it out? (y/n):
        This action takes you back to the container prompt:
        ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ #
    4. Still in the container prompt, run a second update.
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # essrun -N emsvm update

      Carefully read the process output.

      If you encounter a message similar to the next one, follow its instructions: exit the container, reboot the x86 Utility Node management server, and restart the container.
      msg:
        - Please shutdown <MScontainer_hostname> and reboot the Utility Host(Bare Metal) since OFED was updated or reinstalled
        - "YOU MUST RUN THE FOLLOWING SCRIPT IN THE EMS VM AFTER REBOOT TO SET NETWORKING PARAMETERS AND ACTIVATE GPFS"
        - "Run this script: '/opt/ibm/ess/tools/samples/reloadEms.sh' when <MScontainer_hostname> is back."
    5. Reboot the management server VM and the x86 Utility Node management server.
      1. Reboot the x86 Utility Node management server.
        ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # ssh utilityBareMetal 
        root@utilityBareMetal:/ # reboot
      2. When the management server VM is back from reboot, run the reloadEms.sh script.
        /opt/ibm/ess/tools/samples/reloadEms.sh
    6. Restart the performance monitoring services and the IBM Storage Scale GUI.
      systemctl restart pmsensors
      systemctl restart gpfsgui
    7. Restart the container by using the essmkyml script, as described in the next substeps.

      The essmkyml script is in the extracted directory of the IBM Storage Scale System image that was used in step 3.c. Usually, the path is /home/deploy/ess_6.2.x.x_...dir/.

      1. Run the --restart command.
        cd /home/deploy/ess_6.2.x.x_...dir/ ; ./essmkyml --restart
      2. When the following prompt is displayed, choose one of the options.
        Is the current EMS FQDN <hostname.domain> correct (y/n):

        Type y for yes.

        Type n for no.

        Remember: This hostname must be the management hostname.
      3. If the following prompt is displayed, press Enter.
        Please type the desired container short hostname [containerHostname] :
      4. When the following prompt is displayed, type n.
        Inventory file exists, do you want to wipe it out? (y/n):
        This action takes you back to the container prompt:
        ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ #
    Option 2. For a coexistent environment (an x86 Utility Node server and a Power management server) or dual x86 Utility Node management servers in the environment, complete these substeps.
    1. Run the first update in the secondary management server.
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # essrun -N ems2 update
      In the prompt that is displayed, choose one of the options.
      Please enter 'accept' indicating that you want to update the following list of nodes: ems2 or enter 'no' if you don't want to proceed.

      Type accept if you agree to update the listed nodes.

      After you accept, the update begins.

      The secondary x86 Utility Node management server automatically reboots and then reactivates both IBM Storage Scale and the I/O nodes.

    2. Run the second update, now in the primary management server.
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # essrun -N emsvm update
      In the prompt that is displayed, choose one of the options.
      Please enter 'accept' indicating that you want to update the following list of nodes: emsvm or enter 'no' if you don't want to proceed.

      Type accept if you agree to update the listed nodes.

      After you accept, the update begins.

      For the primary management server (where the container is running), if there is a kernel or OFED change, you are requested to exit the container, reboot, and restart the container.

      The prompt is similar to the following one:
      msg:
       - 'Seems that kernel has changed. This will require a reboot '
       - Please exit container and reboot emsHostname
       - Restart container (./essmkyml --restart) once ems is back and run update again.
    3. Reboot the primary x86 Utility Node management server.
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # exit
      root@<MScontainer_hostname>:/ # reboot
    4. When the primary x86 Utility Node management server is back from the reboot, run the essmkyml script by completing the next substeps.

      The essmkyml script is in the extracted directory of the IBM Storage Scale System image that was used in step 3.c. Usually, the path is /home/deploy/ess_6.2.x.x_...dir/.

      1. Run the --restart command.
        cd /home/deploy/ess_6.2.x.x_...dir/ ; ./essmkyml --restart
      2. When the following prompt is displayed, choose one of the options.
        Is the current EMS FQDN <hostname.domain> correct (y/n):

        Type y for yes.

        Type n for no.

        Remember: This hostname must be the management hostname.
      3. If the following prompt is displayed, press Enter.
        Please type the desired container short hostname [containerHostname] :
      4. When the following prompt is displayed, type n.
        Inventory file exists, do you want to wipe it out? (y/n):
        This action takes you back to the container prompt:
        ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ #
    5. Still in the container prompt, run a second update on the primary x86 Utility Node management server.
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # essrun -N emsvm update

      Carefully read the process output.

      If you encounter a message similar to the next one, follow its instructions: exit the container, reboot the x86 Utility Node management server, and restart the container.
      msg:
      - Please reboot the EMS node since OFED was updated or reinstalled.
      - YOU MUST RUN THE FOLLOWING SCRIPT AFTER REBOOT TO SET NETWORKING PARAMETERS AND ACTIVATE GPFS
      - 'Run this script: ''/opt/ibm/ess/tools/samples/reloadEms.sh'' when node emsHostname is back.'
    6. Reboot the management server VM and the x86 Utility Node management server.
      1. Reboot the primary x86 Utility Node management server.
        ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # ssh utilityBareMetal 
        root@utilityBareMetal:/ # reboot
      2. When the management server VM is back from reboot, run the reloadEms.sh script.
        /opt/ibm/ess/tools/samples/reloadEms.sh
    7. Restart the performance monitoring services and the IBM Storage Scale GUI in the management server VM. If you have a dual management server configuration, restart these services on both servers.
      systemctl restart pmsensors
      systemctl restart gpfsgui
    8. In a coexistent environment (an x86 Utility Node server and a Power management server), update the system firmware in Power management server.
      Warning: Do not revert the system firmware to an earlier level because doing so can break the system. Instead, upgrade the firmware to the latest available level.
      Remember: The system reboots automatically and the firmware is upgraded. This process can take up to 1.5 hours per node. Consider this when planning the maintenance window.
      1. Go to the firmware directory and look for the 01VL950_131_045.img file.
        cd /install/ess/otherpkgs/rhels8/ppc64le/firmware/
      2. Verify the package.
        update_flash -v -f 01VL950_131_045.img

        If the result of the verification does not flag any issues, proceed with the update.

      3. Start the update.
        update_flash -f 01VL950_131_045.img

        Flashing the firmware sets the new image as temporary. After you are comfortable with the new firmware, you must commit it to permanent.

      4. Commit the new firmware image to permanent.
        update_flash -c
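The verify, flash, commit sequence in substeps b through d can be wrapped so that flashing never happens on a failed verification. Here uf is a hypothetical stub that echoes the update_flash invocations instead of executing them, so the control flow can be shown off the node:

```shell
# Sketch of the firmware flow: verify, then flash (temporary), then commit.
# uf stubs `update_flash`; on the node, run the real command instead.
uf() { echo "update_flash $*"; }

img=01VL950_131_045.img
if uf -v -f "$img"; then          # substep b: verify the package
  uf -f "$img"                    # substep c: flash; image becomes temporary
  # ...validate the system on the new level before committing...
  uf -c                           # substep d: commit the image to permanent
else
  echo "verification failed: do not flash"
fi
```

Keeping the commit (-c) as a separate, deliberate step preserves the fallback to the previous permanent image until you are satisfied with the new level.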
    9. Restart the container in the primary x86 Utility Node management server VM.

      The essmkyml script is in the extracted directory of the IBM Storage Scale System image that was used in step 3.c. Usually, the path is /home/deploy/ess_6.2.x.x_...dir/.

      1. Run the --restart command.
        cd /home/deploy/ess_6.2.x.x_...dir/ ; ./essmkyml --restart
      2. When the following prompt is displayed, choose one of the options.
        Is the current EMS FQDN <hostname.domain> correct (y/n):

        Type y for yes.

        Type n for no.

        Remember: This hostname must be the management hostname.
      3. If the following prompt is displayed, press Enter.
        Please type the desired container short hostname [containerHostname] :
      4. When the following prompt is displayed, type n.
        Inventory file exists, do you want to wipe it out? (y/n):
        This action takes you back to the container prompt:
        ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ #
  6. Update the I/O nodes.
    1. Choose one of the following update options and follow its substeps.

      Option 1. Update all the I/O nodes

      Start the update on all the I/O nodes.
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # essrun -N essio1,essio2,essioN… update
      In the prompt that is displayed, choose one of the options.
      Please enter 'accept' indicating that you want to update the following list of nodes: essio1,essio2,essioN… or enter 'no' if you don't want to proceed.

      Type accept if you agree to update the listed nodes.

      After you accept, the update begins.

      Option 2. Update a subset of the I/O nodes

      Start the update, using the --serial option to define how many nodes are updated at a time.
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # essrun -N essio1,essio2,essioN… update --serial N
      In the prompt that is displayed, choose one of the options.
      Please enter 'accept' indicating that you want to update the following list of nodes: essio1,essio2,essioN… or enter 'no' if you don't want to proceed.

      Type accept if you agree to update the listed nodes.

      After you accept, the update begins with the number of nodes (N) that is defined in the --serial argument.

      For more information, see What does the --serial option of the essrun command support for an online upgrade?

      The I/O nodes should keep the following active: the IBM Storage Scale daemon, the recovery groups, and the file system.
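Conceptually, --serial N walks the node list in batches of N, so at most N I/O nodes are down at a time. The following sketch only prints the batches over an example node list; essrun performs the real update:

```shell
# Illustrate `--serial N` batching over an example node list. This prints
# the batches; it does not run any update.
print_batches() {
  nodes="essio1 essio2 essio3 essio4 essio5"
  serial=2
  batch=""
  count=0
  for n in $nodes; do
    batch="$batch $n"
    count=$(( count + 1 ))
    if [ "$count" -eq "$serial" ]; then
      echo "update batch:$batch"
      batch=""
      count=0
    fi
  done
  # any remaining nodes form a final, smaller batch
  [ -n "$batch" ] && echo "update batch:$batch"
  return 0
}
print_batches
```

With five nodes and --serial 2, this yields three batches: two pairs and a final single node, which is why quorum and recovery-group failover can be maintained during the rolling update.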

    2. Restart the performance monitoring service on each upgraded I/O node.
      systemctl restart pmsensors
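With many I/O nodes, the restart can be looped from the management server VM. run_on is a hypothetical stand-in that echoes the ssh invocation instead of executing it; replace it with a real ssh call in your environment:

```shell
# Loop the pmsensors restart over the upgraded I/O nodes. run_on stubs the
# real `ssh <node> systemctl restart pmsensors` call.
run_on() { echo "ssh $1 -- systemctl restart pmsensors"; }

for node in essio1 essio2; do
  run_on "$node"
done
```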
    3. Update the firmware of the I/O nodes that were upgraded in the previous substeps.
      For each of the I/O nodes, complete these substeps.
      1. Access the node through SSH.
      2. Update the enclosure firmware.
        mmchfirmware --type storage-enclosure

        For x86 systems, this command upgrades the BIOS/BMC/FPGA of the canisters.

      3. Update the drive firmware.
        mmchfirmware --type drive
      4. Update the system firmware.
        Warning: Do not revert the system firmware to an earlier level because doing so can break the system. Instead, upgrade the firmware to the latest available level.
        Remember: The system reboots automatically and the firmware is upgraded. This process can take up to 1.5 hours per node. Consider this when planning the maintenance window.
        1. Go to the firmware directory and look for the 01VL950_131_045.img file.
          cd /install/ess/otherpkgs/rhels8/ppc64le/firmware/
        2. Verify the package.
          update_flash -v -f 01VL950_131_045.img

          If the result of the verification does not flag any issues, proceed with the update.

        3. Start the update.
          update_flash -f 01VL950_131_045.img

          Flashing the firmware sets the new image as temporary. After you are comfortable with the new firmware, you must commit it to permanent.

        4. Commit the new firmware image to permanent.
          update_flash -c
      5. Run a health check on each of the I/O nodes that were upgraded.
        essinstallcheck

        If the I/O node upgrade finished successfully, no issues are reported in the output of the previous command.

        If an error is flagged in the health check, see Troubleshooting for more information.

  7. Update the x86 Utility Node protocol nodes.
    1. Run a first update on the protocol nodes.
      Replace <utilitynode_protocol1>,<utilitynode_protocol2>... with your hostnames, as shown in the next example.
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # essrun -N <utilitynode_protocol1>,<utilitynode_protocol2>... update
      In the prompt that is displayed, choose one of the options.
      Please enter 'accept' indicating that you want to update the following list of nodes: <utilitynode_protocol1>,<utilitynode_protocol2>... or enter 'no' if you don't want to proceed.

      Type accept if you agree to update listed nodes.

      After you accept, the update begins.

      If there is a kernel or OFED change, you are requested to exit the container, reboot, and restart the container.

      The prompt is similar to the following one:
      msg:
       - 'Seems that kernel has changed. This will require a reboot '
       - Please exit container and reboot <MScontainer_hostname>
       - Restart container (./essmkyml --restart) once ems is back and run update again.
    2. Reboot the x86 Utility Node protocol server (host).
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # ssh <utilitynode_protocol1>
      root@<utilitynode_protocol_hostname>:/ # reboot
      Warning: This step stops any protocol VM that runs on the host, which means that IBM Storage Scale on that node becomes inactive. Before you perform this step, verify that quorum is maintained if this node goes inactive.
    3. When the x86 Utility Node protocol server is back from the reboot, run a second update.
      ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # essrun -N <utilitynode_protocol1>,<utilitynode_protocol2>... update
      After the second update finishes successfully, you see a message similar to the next example.
      msg:
          - "Upgrade of ESS Utility Node (Bare metal) Complete."
  8. Update the protocol nodes (virtual machines and Power protocol nodes).
    1. Upgrade IBM Storage Scale in the protocol nodes.

      For more information about how to upgrade IBM Storage Scale in the protocol nodes, see Performing online upgrade by using the installation toolkit in the IBM Storage Scale documentation.

    2. Upgrade the operating system and other components by using the IBM Storage Scale System container.
      1. Start the upgrade.
        ESS UNIFIED v6.2.x.x CONTAINER root@containerHostname:/ # essrun -N prt01,prt02 update
      2. When the following prompt is displayed, choose one of the options.
        Please enter 'accept' indicating that you want to update the following list of nodes: prt01,prt02 or enter 'no' if you don't want to proceed.

        Type accept if you agree to update the listed nodes.

        After you accept, the update begins.

        Remember: After the upgrade is complete, the IBM Storage Scale daemon is kept down on these nodes.
    3. Update the system firmware of the Power protocol nodes.
      Warning: Do not revert the system firmware to an earlier level because doing so can break the system. Instead, upgrade the firmware to the latest available level.
      Remember: The system reboots automatically and the firmware is upgraded. This process can take up to 1.5 hours per node. Consider this when planning the maintenance window.
      1. Go to the firmware directory and look for the 01VL950_131_045.img file.
        cd /install/ess/otherpkgs/rhels8/ppc64le/firmware/
      2. Verify the package.
        update_flash -v -f 01VL950_131_045.img

        If the result of the verification does not flag any issues, proceed with the update.

      3. Start the update.
        update_flash -f 01VL950_131_045.img

        Flashing the firmware sets the new image as temporary. After you are comfortable with the new firmware, you must commit it to permanent.

      4. Commit the new firmware image to permanent.
        update_flash -c