Version 1.0.8.4 release notes
Cloud Pak for Data System 1.0.8.4 is the next version after 1.0.8.3 and is intended for Netezza Performance Server customers only. Version 1.0.8.4 includes upgrade time improvements and bug fixes.
Upgrading
The upgrade procedure is performed by IBM Support.
Do not upgrade to 1.0.8.4 if you are using Cloud Pak for Data. The upgrade path for systems with Cloud Pak for Data is 1.0.7.3 > 2.0.x. For more information, see https://www.ibm.com/docs/en/cloud-paks/cloudpak-data-system/2.0?topic=system-advanced-upgrade-from-versions-10x.
Software components
The recommended NPS version is 11.2.1.9 because it supports base node expansion, which requires a 1.0.8.x release. The minimum supported NPS version is 11.2.1.5.
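The version guidance above can be expressed as a small check. This is only an illustrative sketch: the function names and the three category labels are hypothetical, and the only values taken from these notes are the minimum (11.2.1.5) and recommended (11.2.1.9) NPS versions. It assumes that version strings are purely numeric and dot-separated.

```python
def parse_version(text):
    """Turn a dotted version such as '11.2.1.9' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


MINIMUM_NPS = parse_version("11.2.1.5")      # minimum supported, from these notes
RECOMMENDED_NPS = parse_version("11.2.1.9")  # recommended, from these notes


def classify_nps_version(text):
    """Return a hypothetical label for an NPS version string."""
    version = parse_version(text)
    if version < MINIMUM_NPS:
        return "unsupported"
    if version < RECOMMENDED_NPS:
        return "supported"
    return "recommended_or_later"


print(classify_nps_version("11.2.1.5"))
print(classify_nps_version("11.2.1.9"))
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating 11.2.1.10 as older than 11.2.1.9.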
Enhancements
- Added the respective log path for each component in the upgrade tracelog to make
long-running stages of the upgrade easier to follow and to shorten the turnaround time if a
failure occurs.
- Added a new pre-check to the upgrade process to verify whether NPS is up and running. The upgrade
fails if NPS is down. To restart the upgrade, NPS must be online.
- The SCJ verification script is integrated into the upgrade and runs during the pre-checks.
- Both the upgrade console and the log now show the time that is required to complete the upgrade for
each component.
Example:
  2023-10-26 04:00:59 Approx Estimated time required to upgrade psklm 00:05:00
  2023-10-26 04:01:15 Approx Estimated time required to upgrade nodeos 01:00:00
  2023-10-26 04:44:01 Approx Estimated time required to upgrade aposcomms 00:15:00
  2023-10-26 05:05:57 Approx Estimated time required to upgrade docker_upgrade 00:5:00
  2023-10-26 05:08:00 Approx Estimated time required to upgrade supporttools 00:05:00
  2023-10-26 05:08:13 Approx Estimated time required to upgrade platformbackups 00:01:00
  2023-10-26 05:08:24 Approx Estimated time required to upgrade magneto 00:20:00
  2023-10-26 05:08:51 Approx Estimated time required to upgrade containerapi 00:01:00
  2023-10-26 05:09:03 Approx Estimated time required to upgrade gpfsconfig 00:20:00
  2023-10-26 05:10:35 Approx Estimated time required to upgrade callhome 00:15:00
  2023-10-26 05:14:31 Approx Estimated time required to upgrade cyclops 00:15:00
  2023-10-26 05:16:23 Approx Estimated time required to upgrade appliancesoftwareversion 00:01:00
  2023-10-26 05:16:44 Approx Estimated time required to upgrade hpi 00:15:00
  2023-10-26 05:50:18 Approx Estimated time required to upgrade hpifirmware 03:50:00
- Combined the SPU and non-SPU phases of the firmware upgrade process to improve upgrade
speed.
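The per-component estimates in the example log above can be totalled with a short script. The following is a minimal sketch, not part of the product: the function names are hypothetical, and the pattern assumes that every estimate line has exactly the wording shown in the example. The sample text is a three-line excerpt of that example.

```python
import re
from datetime import timedelta

# Matches lines such as:
# 2023-10-26 04:00:59 Approx Estimated time required to upgrade psklm 00:05:00
ESTIMATE_RE = re.compile(
    r"Approx Estimated time required to upgrade (\S+) (\d+):(\d+):(\d+)"
)


def parse_estimates(log_text):
    """Return {component: timedelta} for every estimate line in log_text."""
    estimates = {}
    for match in ESTIMATE_RE.finditer(log_text):
        component, hours, minutes, seconds = match.groups()
        estimates[component] = timedelta(
            hours=int(hours), minutes=int(minutes), seconds=int(seconds)
        )
    return estimates


def total_estimate(log_text):
    """Sum all component estimates found in log_text."""
    return sum(parse_estimates(log_text).values(), timedelta())


sample = """\
2023-10-26 04:00:59 Approx Estimated time required to upgrade psklm 00:05:00
2023-10-26 04:01:15 Approx Estimated time required to upgrade nodeos 01:00:00
2023-10-26 05:05:57 Approx Estimated time required to upgrade docker_upgrade 00:5:00
"""
print(total_estimate(sample))  # prints 1:10:00
```

The pattern accepts one or more digits in each time field, so the single-digit minutes value in the docker_upgrade line is parsed as well.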
Resolved issues
- Resolved the issue that caused the alert "Failed to collect status from resource
manager" after upgrade. The issue was caused by an incorrect installation path for Python 3
packages. Installed Python 3 packages must be present under /usr/lib. If
you try to install any packages that are already present under /usr/local/lib,
the system uninstalls and reinstalls them in the right location.
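To illustrate the path rule described above, the following sketch classifies a package directory by its location. It is a hypothetical helper, not the logic that the upgrade itself uses; only the two directory roots, /usr/lib and /usr/local/lib, come from these notes, and the example paths are illustrative.

```python
from pathlib import PurePosixPath

EXPECTED_ROOT = PurePosixPath("/usr/lib")         # location named in these notes
MISPLACED_ROOT = PurePosixPath("/usr/local/lib")  # location that triggers a reinstall


def classify_package_path(path):
    """Return 'expected', 'misplaced', or 'other' for a package directory."""
    parts = PurePosixPath(path).parts
    if parts[: len(MISPLACED_ROOT.parts)] == MISPLACED_ROOT.parts:
        return "misplaced"
    if parts[: len(EXPECTED_ROOT.parts)] == EXPECTED_ROOT.parts:
        return "expected"
    return "other"


# Illustrative paths only; the second one is hypothetical.
print(classify_package_path("/usr/lib/python3.6/site-packages/pkg_resources"))
print(classify_package_path("/usr/local/lib/python3.6/site-packages/example"))
```

Comparing path components instead of string prefixes keeps a directory such as /usr/lib64 from being mistaken for /usr/lib.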
Known issues
- Exception while running --preliminary-check
- There might be an exception while running --preliminary-check during the
upgrade from 1.0.7.x versions to 1.0.8.4.
Example:
  Please review the release notes for this version at https://ibm.biz/icpds_rn_1084 prior to running the upgrade.
  Upgrade command: apupgrade --upgrade --use-version 1.0.8.4_release --upgrade-directory /localrepo --bundle system
  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 466, in get_distribution
      dist = get_provider(dist)
    File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 342, in get_provider
      return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
    File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 886, in require
      needed = self.resolve(parse_requirements(requirements))
    File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 772, in resolve
      raise DistributionNotFound(req, requirers)
  pkg_resources.DistributionNotFound: The 'pipdeptree' distribution was not found and is required by the application
- Workaround
- The workaround is to proceed with running the upgrade command:
  apupgrade --upgrade --upgrade-directory /localrepo --use-version 1.0.8.4_release --bundle system
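When capturing the output of the preliminary check in a script, it can help to tell this known exception apart from other failures. The sketch below is a hypothetical helper that matches only the two message fragments shown in the traceback above. Any other traceback should still be investigated.

```python
def is_known_pipdeptree_exception(output):
    """True if output contains the pipdeptree DistributionNotFound signature."""
    return (
        "pkg_resources.DistributionNotFound" in output
        and "'pipdeptree' distribution was not found" in output
    )


# Last line of the traceback shown in this known issue.
known = (
    "pkg_resources.DistributionNotFound: The 'pipdeptree' distribution "
    "was not found and is required by the application"
)
print(is_known_pipdeptree_exception(known))
```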
- Upgrade fails when GPFS disks are detected as down
- Upgrade might fail when the GPFS component upgrades
- Upgrade to version 1.0.8.4 from 1.0.7.8 on systems with connector nodes might fail when the GPFS
component upgrades.
Example of tracelog:
  2023-11-29 18:25:04 TRACE: run_shell_cmd_in_parallel(): running cmd systemctl restart mmsdrserv on nodes ['e5n1', 'e1n3', 'e4n1', 'e1n2', 'e1n1']
      LOGGING FROM: yosemite_bundleupgradechecker.py:ensure_shared_storage_is_up:446
  2023-11-29 18:35:09 TRACE: ['e5n1', 'e4n1']
      RC: 1
      STDOUT: []
      STDERR: [A dependency job for mmsdrserv.service failed. See 'journalctl -xe' for details. ]
      ['e1n3', 'e1n2']
      RC: 0
      STDOUT: []
      STDERR: []
      ['e1n1']
      RC: 1
      STDOUT: []
      STDERR: [Job for mmsdrserv.service failed because the control process exited with error code. See "systemctl status mmsdrserv.service" and "journalctl -xe" for details. ]
      LOGGING FROM: yosemite_bundleupgradechecker.py:ensure_shared_storage_is_up:446
  2023-11-29 18:35:09 TRACE: run_shell_cmd_in_parallel_or_raise(): running cmd service gpfs start on nodes ['e5n1', 'e1n3', 'e4n1', 'e1n2', 'e1n1']
      LOGGING FROM: yosemite_bundleupgradechecker.py:ensure_shared_storage_is_up:446
  2023-11-29 18:45:14 TRACE: ['e5n1', 'e4n1']
      RC: 1
      STDOUT: []
      STDERR: [Redirecting to /bin/systemctl start gpfs.service A dependency job for gpfs.service failed. See 'journalctl -xe' for details. ]
      ['e1n3', 'e1n2', 'e1n1']
      RC: 0
      STDOUT: []
      STDERR: [Redirecting to /bin/systemctl start gpfs.service ]
      LOGGING FROM: yosemite_bundleupgradechecker.py:ensure_shared_storage_is_up:446
  2023-11-29 18:45:14 ERROR: Error running command [service gpfs start] on ['e5n1', 'e4n1']
      LOGGING FROM: yosemite_bundleupgradechecker.py:ensure_shared_storage_is_up:446
  2023-11-29 18:45:14 INFO: You can view mmfs.log.latest log file at /var/adm/ras/ for details
      LOGGING FROM: yosemite_bundleupgradechecker.py:ensure_shared_storage_is_up:446
  2023-11-29 18:45:14 FATAL ERROR: Prerequisite system checks failed
      LOGGING FROM: bundleupgradechecker.py:perform_bundle_level_checks:223
  2023-11-29 18:45:14 FATAL ERROR: More Info: See trace messages at /var/log/appliance/apupgrade/20231129/apupgrade20231129175951.log.tracelog for additional troubleshooting information.
- Workaround
- Run mmgetstate -aLv to verify that the GPFS status is
active.
Expected output:
  [root@gt08-node1 ~]# mmgetstate -aLv

   Node number  Node name  Quorum  Nodes up  Total nodes  GPFS state  Remarks
  ---------------------------------------------------------------------------------
        1       e1n1          2       3          5        active      quorum node
        2       e1n2          2       3          5        active      quorum node
        3       e1n3          2       3          5        active      quorum node
        4       e4n1          2       3          5        active
        5       e5n1          2       3          5        active
- Restart the upgrade.
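The check in the first workaround step can be scripted against captured mmgetstate -aLv output. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and it assumes the column layout shown in the expected output above, where the node name is the second field and the GPFS state is the sixth. The "down" row in the sample is invented for illustration and is not taken from a real system.

```python
def inactive_nodes(mmgetstate_output):
    """Return names of nodes whose GPFS state is not 'active'."""
    not_active = []
    for line in mmgetstate_output.splitlines():
        fields = line.split()
        # Data rows start with the numeric node number; this skips the
        # shell prompt, the header row, and the dashed separator.
        if len(fields) < 6 or not fields[0].isdigit():
            continue
        node_name, gpfs_state = fields[1], fields[5]
        if gpfs_state != "active":
            not_active.append(node_name)
    return not_active


sample = """\
 Node number  Node name  Quorum  Nodes up  Total nodes  GPFS state  Remarks
---------------------------------------------------------------------------
      1       e1n1          2       3          5        active      quorum node
      4       e4n1          2       3          5        active
      5       e5n1          2       3          5        down
"""
print(inactive_nodes(sample))  # prints ['e5n1']
```

An empty result means that every listed node reports the active state, which is the condition the workaround asks for before the upgrade is restarted.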
- Upgrade might fail due to a missing wheel (Python 3) package
- On systems with security patch 7.9.23.02.SP20 applied, upgrading to version 1.0.8 or later
might fail because the wheel (Python 3) package is missing.
Example of tracelog:
  2024-02-10 09:58:54 INFO: nodeos:Starting Node OS component post-install steps...
      LOGGING FROM: NodeosUpgrader.py:postinstall:166
  2024-02-10 09:58:54 INFO: NodeosUpgrader.postinstall:Running nodeOS postinstall script for post upgrade configuration
      LOGGING FROM: node_os_yosemite_postinstaller.py:run_nodeos_postinstall_script:30
  2024-02-10 09:58:54 TRACE: Running command [/opt/ibm/appliance/platform/xcat/scripts/xcat/nodeos_post_actions.py].
      LOGGING FROM: node_os_yosemite_postinstaller.py:run_nodeos_postinstall_script:31
  2024-02-10 10:01:31 TRACE: RC: 1.
      STDOUT: []
      STDERR: [ERROR: Command ['pip3', 'install', '/tmp/python3_packages/wheel-*.whl', '-f', '/tmp/python3_packages/', '--no-index', '--prefix', '/usr'] failed with error:
      WARNING: Requirement '/tmp/python3_packages/wheel-*.whl' looks like a filename, but the file does not exist
      ERROR: wheel-*.whl is not a valid wheel filename.
      More Info: See /var/log/appliance/platform/xcat/nodeos_post_action.log for details.
      ]
      LOGGING FROM: node_os_yosemite_postinstaller.py:run_nodeos_postinstall_script:31
  2024-02-10 10:01:31 ERROR: NodeosUpgrader.postinstall:Issue encountered while running nodeOS postinstall script for configuration.
      LOGGING FROM: node_os_yosemite_postinstaller.py:run_nodeos_postinstall_script:33
  2024-02-10 10:01:31 ERROR:
      LOGGING FROM: node_os_yosemite_postinstaller.py:run_nodeos_postinstall_script:33
  2024-02-10 10:01:31 ERROR: ERROR: Command ['pip3', 'install', '/tmp/python3_packages/wheel-*.whl', '-f', '/tmp/python3_packages/', '--no-index', '--prefix', '/usr'] failed with error:
      LOGGING FROM: node_os_yosemite_postinstaller.py:run_nodeos_postinstall_script:33
  2024-02-10 10:01:31 ERROR: WARNING: Requirement '/tmp/python3_packages/wheel-*.whl' looks like a filename, but the file does not exist
      LOGGING FROM: node_os_yosemite_postinstaller.py:run_nodeos_postinstall_script:33
  2024-02-10 10:01:31 ERROR: ERROR: wheel-*.whl is not a valid wheel filename.
      LOGGING FROM: node_os_yosemite_postinstaller.py:run_nodeos_postinstall_script:33
  2024-02-10 10:01:31 ERROR: More Info: See /var/log/appliance/platform/xcat/nodeos_post_action.log for details.
      LOGGING FROM: node_os_yosemite_postinstaller.py:run_nodeos_postinstall_script:33
  2024-02-10 10:01:31 ERROR:
      LOGGING FROM: node_os_yosemite_postinstaller.py:run_nodeos_postinstall_script:33
  2024-02-10 10:01:31 TRACE: In method logger.py:log_error:142 from parent method node_os_yosemite_postinstaller.py:run_nodeos_postinstall_script:33 with args
      msg = NodeosUpgrader.postinstall:Issue encountered while running nodeOS postinstall script for configuration.
      ERROR: Command ['pip3', 'install', '/tmp/python3_packages/wheel-*.whl', '-f', '/tmp/python3_packages/', '--no-index', '--prefix', '/usr'] failed with error:
      WARNING: Requirement '/tmp/python3_packages/wheel-*.whl' looks like a filename, but the file does not exist
      ERROR: wheel-*.whl is not a valid wheel filename.
      More Info: See /var/log/appliance/platform/xcat/nodeos_post_action.log for details.
- Workaround
- Download the release bundle.
- Copy the Python 3 dependencies from the bundle to /install/app_img/ on node
e1n1:
  cp -r /localrepo/1.0.8.x_release/EXTRACT/system/bundle/app_img/python3_dependencies /install/app_img/
- Restart the upgrade.
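The tracelog for this issue shows pip3 receiving the literal pattern wheel-*.whl and reporting that the file does not exist, which is what happens when no file matches the pattern. Before restarting the upgrade, a quick check along the following lines can confirm that a matching wheel file is present in a given directory. This is a hedged sketch: the function name is hypothetical, the file name created in the demonstration is invented, and which directory the upgrade actually reads from is not stated in these notes.

```python
import glob
import os
import tempfile


def find_wheel_files(package_dir, pattern="wheel-*.whl"):
    """Return sorted paths in package_dir that match pattern."""
    return sorted(glob.glob(os.path.join(package_dir, pattern)))


# Demonstration in a temporary directory with an invented file name.
with tempfile.TemporaryDirectory() as package_dir:
    print(len(find_wheel_files(package_dir)))  # prints 0
    open(os.path.join(package_dir, "wheel-0.0.0-py3-none-any.whl"), "w").close()
    print(len(find_wheel_files(package_dir)))  # prints 1
```

An empty list corresponds to the failure condition in the tracelog, where the pattern reaches pip3 without matching any file.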