Version 1.0.8.5 release notes

Cloud Pak for Data System 1.0.8.5 is the next version after 1.0.8.4. It includes bug fixes and fixes for customer-critical upgrade issues.

Upgrading

The upgrade procedure is performed by IBM Support.

Do not upgrade to 1.0.8.5 if you are using Cloud Pak for Data. The upgrade path for systems with Cloud Pak for Data is 1.0.7.3 > 2.0.x. For more information, see https://www.ibm.com/docs/en/cloud-paks/cloudpak-data-system/2.0?topic=system-advanced-upgrade-from-versions-10x.

Note: It is advised that you upgrade to Cloud Pak for Data System 1.0.8.5 if you are on 1.0.7.6 or 1.0.7.8.
Note: It is not necessary to upgrade to Cloud Pak for Data System 1.0.8.5 if you are on 1.0.8.3 or 1.0.8.4.

Software components

The recommended NPS version is 11.2.1.9 because it includes base node expansion support, which requires a 1.0.8.x release. The minimum supported NPS version is 11.2.1.5.

Resolved issues

Cloud Pak for Data System version 1.0.8.5 provides bug fixes and critical fixes for customer upgrade issues. These fixes are related only to the apupgrade component and do not affect or change any existing functions. The following upgrade issues were fixed in 1.0.8.5:
  • IPA certificate renewal is now part of the upgrade.
  • The upgrade precheck includes four new checks for issues that customers encountered earlier.
  • The upgrade now handles intermittent issues that previously occurred during upgrade.
  • Upgrade logging is improved.
  • Parallel switch firmware upgrade is included.

Known issues

  1. Upgrade might fail because disks are down and the filesystem cannot be mounted
    Upgrade to version 1.0.8.5 might fail because disks are down and the filesystem cannot be mounted.
    Workaround
    Run the following commands, and then restart the upgrade.
    1. mmchdisk platform start -a
    2. mmchdisk ips start -a
    3. mmlsdisk platform
    4. mmlsdisk ips
    5. mmlsnsd -X
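    After the mmchdisk commands complete, you can confirm that no disk is still reported as down before you restart the upgrade. The check below is only an illustrative sketch; it assumes that the availability column of the mmlsdisk output reports either up or down.
      # flag any disk of the platform or ips filesystem that is still down
      mmlsdisk platform | grep -iw down
      mmlsdisk ips | grep -iw down
      # no output from either grep command means that all disks report an availability of "up"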
  2. Upgrade might fail when the GPFS component upgrades
    Upgrade to version 1.0.8.5 from 1.0.7.8 on systems with connector nodes might fail when the GPFS component upgrades.
    Example of tracelog:
    2023-11-29 18:25:04 TRACE: run_shell_cmd_in_parallel(): running cmd systemctl restart mmsdrserv on nodes ['e5n1', 'e1n3', 'e4n1', 'e1n2', 'e1n1']
                               LOGGING FROM: yosemite_bundleupgradechecker.py:ensure_shared_storage_is_up:446
    2023-11-29 18:35:09 TRACE:
                               ['e5n1', 'e4n1']
                               RC: 1
                               STDOUT: []
                               STDERR: [A dependency job for mmsdrserv.service failed. See 'journalctl -xe' for details.
                               ]
    
                               ['e1n3', 'e1n2']
                               RC: 0
                               STDOUT: []
                               STDERR: []
    
                               ['e1n1']
                               RC: 1
                               STDOUT: []
                               STDERR: [Job for mmsdrserv.service failed because the control process exited with error code. See "systemctl status mmsdrserv.service" and "journalctl -xe" for details.
                               ]
    
                               LOGGING FROM: yosemite_bundleupgradechecker.py:ensure_shared_storage_is_up:446
    2023-11-29 18:35:09 TRACE: run_shell_cmd_in_parallel_or_raise(): running cmd service gpfs start on nodes ['e5n1', 'e1n3', 'e4n1', 'e1n2', 'e1n1']
                               LOGGING FROM: yosemite_bundleupgradechecker.py:ensure_shared_storage_is_up:446
    2023-11-29 18:45:14 TRACE:
                               ['e5n1', 'e4n1']
                               RC: 1
                               STDOUT: []
                               STDERR: [Redirecting to /bin/systemctl start gpfs.service
                               A dependency job for gpfs.service failed. See 'journalctl -xe' for details.
                               ]
    
                               ['e1n3', 'e1n2', 'e1n1']
                               RC: 0
                               STDOUT: []
                               STDERR: [Redirecting to /bin/systemctl start gpfs.service
                               ]
    
                               LOGGING FROM: yosemite_bundleupgradechecker.py:ensure_shared_storage_is_up:446
    2023-11-29 18:45:14 ERROR: Error running command [service gpfs start] on ['e5n1', 'e4n1']
                               LOGGING FROM: yosemite_bundleupgradechecker.py:ensure_shared_storage_is_up:446
    2023-11-29 18:45:14 INFO: You can view mmfs.log.latest log file at /var/adm/ras/ for details
                               LOGGING FROM: yosemite_bundleupgradechecker.py:ensure_shared_storage_is_up:446
    2023-11-29 18:45:14 FATAL ERROR: Prerequisite system checks failed
                               LOGGING FROM: bundleupgradechecker.py:perform_bundle_level_checks:223
    2023-11-29 18:45:14 FATAL ERROR: More Info: See trace messages at /var/log/appliance/apupgrade/20231129/apupgrade20231129175951.log.tracelog for additional troubleshooting information.
    Workaround
    1. Run mmgetstate -aLv to verify that the GPFS status is active.
      mmgetstate -aLv
      Expected output:
      [root@gt08-node1 ~]# mmgetstate -aLv
      
       Node number  Node name  Quorum  Nodes up  Total nodes  GPFS state    Remarks
      ---------------------------------------------------------------------------------
                 1  e1n1          2         3          5      active        quorum node
                 2  e1n2          2         3          5      active        quorum node
                 3  e1n3          2         3          5      active        quorum node
                 4  e4n1          2         3          5      active
                 5  e5n1          2         3          5      active
    2. Restart the upgrade.
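    To restart the upgrade, rerun the apupgrade command that was used to start it. The invocation below is only an illustrative sketch; the upgrade directory, version directory, and bundle arguments must match the values used on your system.
      # example restart from node e1n1 (adjust the paths and version to your environment)
      apupgrade --upgrade --upgrade-directory /localrepo --use-version 1.0.8.5_release --bundle auto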
  3. Upgrade might fail when the HPI component upgrades
    Upgrade to version 1.0.8.5 might fail with the following error when the HPI component upgrades.
    Example of tracelog:
    "/localrepo/1.0.8.5_release/EXTRACT/system/upgrade/bundle_upgraders/../hpi/hpi_postinstaller.py", line 87, in run_hpi_status_verify
        except JSONDecodeError as err:
    NameError: name 'JSONDecodeError' is not defined
    Traceback (most recent call last):
      File "/localrepo/1.0.8.5_release/EXTRACT/system/upgrade/bundle_upgrade", line 56, in <module>
        sys.exit(bundle_upgrade.main(sys.argv[1:]))
      File "/localrepo/1.0.8.5_release/EXTRACT/system/upgrade/bundle_upgraders/bundle_upgrade.py", line 1509, in main
        should_continue = self.run_actions_based_on_args(self.target_node)
      File "/localrepo/1.0.8.5_release/EXTRACT/system/upgrade/bundle_upgraders/bundle_upgrade.py", line 1456, in run_actions_based_on_args
        self.do_upgrade_phases(self.components_to_upgrade, node)
      File "/localrepo/1.0.8.5_release/EXTRACT/system/upgrade/bundle_upgraders/bundle_upgrade.py", line 1623, in do_upgrade_phases
        self.logger.log_raise("The following components failed to upgrade: {}".format(failed_component_list), Exception)
      File "/localrepo/1.0.8.5_release/EXTRACT/system/upgrade/modules/ibm/ca/util/logger.py", line 378, in log_raise
        raise excClass(error_msg)
    Exception: ERROR: The following components failed to upgrade: ['hpi']
    Workaround
    1. On node e1n1, run the following command. Replace <bundle_dir> in the command with the actual upgrade directory name that you used on your system.
      sed -i -e 's|except JSONDecodeError as err:|except json.decoder.JSONDecodeError as err:|g' /localrepo/<bundle_dir>/EXTRACT/system/upgrade/hpi/hpi_postinstaller.py
    2. Run the following command to disable the bundle integrity and authenticity checks:
      sed -i -e 's,self.verify_bundle,#self.verify_bundle,g' /opt/ibm/appliance/apupgrade/bin/apupgrade
    3. Restart the upgrade.
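    You can optionally confirm that the change from step 1 took effect before you restart the upgrade. The check below is only an illustrative verification and is not part of the documented workaround.
      # should print the patched line that now references json.decoder.JSONDecodeError
      grep -n 'json.decoder.JSONDecodeError' /localrepo/<bundle_dir>/EXTRACT/system/upgrade/hpi/hpi_postinstaller.py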