Version 2.0.1.1 release notes

Cloud Pak for Data System version 2.0.1.1 improves the upgrade process.

The upgrade process

  • For version 2.x: If you are already on version 2.0.0 or 2.0.1, there is no need to upgrade.
  • For versions 1.x: Cloud Pak for Data System 2.0 is based on Red Hat OpenShift Container Platform 4.6.32. During the upgrade, the operating system is upgraded and Portworx is replaced by Red Hat OpenShift Data Foundation (formerly known as Red Hat OpenShift Container Storage), which provides storage on the system. Cloud Pak for Data and Netezza Performance Server are also upgraded. Because the upgrade process is complex, it must be supervised by the IBM Support team, who can run and monitor the process remotely. System downtime is required; the length of the maintenance window is determined by Support. Open a support ticket if you are planning to upgrade your 1.x system. For more information on the process, see Advanced upgrade from versions 1.0.x.

What's new

Read the What's new section in the 2.0.1 release notes to get familiar with the new features introduced in the 2.0.x release.

Software components

Fixed issues

  • Improved the way resources are used on the system: the Storage Suite for Cloud Paks license that comes with Cloud Pak for Data System includes 12 TB of free usable space in ODF. The more space in ODF, the more vCPUs ODF uses. In 2.0.1.0, all worker node drives were added to ODF even if the license did not cover that space, at a cost of 6 vCPUs per 3.84 TB of usable space. Starting in 2.0.1.1, by default the ODF cluster uses only 12 drives, all coming from the first three worker nodes. For more information, see Platform storage. A quick check of the resulting layout is sketched after this list.
  • Fixed the issue with the web console being in the WARNING state due to unhealthy deployments of the zen-watchdog pod after provisioning or upgrade.
  • Red Hat OpenShift Data Foundation 4.6.8 fixes the issue that occurred during machineconfig updates, which caused the ODF cluster to go down temporarily and required manual intervention.
  • Fixed the issue where installing DB2 OLTP, DB2 Event Store, or MongoDB on a dedicated node caused ODF to become degraded.
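
To confirm how many drives (OSDs) ODF is using and which worker nodes they come from, one generic ODF check is to list the Rook Ceph OSD pods. This is a sketch using standard OpenShift tooling, not a command documented specifically for Cloud Pak for Data System, and it assumes oc is logged in with sufficient privileges:

    # List the ODF object storage daemon (OSD) pods and the nodes they run on;
    # starting in 2.0.1.1, the default is 12 OSDs spread across the first three worker nodes
    oc get pods -n openshift-storage -l app=rook-ceph-osd -o wide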

Known issues

Red Hat OpenShift Console launch from the system web console requires extra configuration
When you click the Red Hat OpenShift Console launch icon, you are redirected to the login page, but when you try to log in, the URL does not open.

Workaround:

To be able to use the console, you need to replace part of the URL so that the console is displayed correctly:

In the website URL, replace localcluster.fbond with the customer FQDN. For example, in the following URL:
https://oauth-openshift.apps.localcluster.fbond/oauth/authorize?client_id=console&redirect_uri=https%3A%2F%2Fopenshift-console.gt21-app.your.server.abc.com%2Fauth%2Fcallback&response_type=code&scope=user%3Afull&state=514ff224
localcluster.fbond must be replaced with gt21-app.your.server.abc.com
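After the replacement, the example URL looks like this:
https://oauth-openshift.apps.gt21-app.your.server.abc.com/oauth/authorize?client_id=console&redirect_uri=https%3A%2F%2Fopenshift-console.gt21-app.your.server.abc.com%2Fauth%2Fcallback&response_type=code&scope=user%3Afull&state=514ff224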
FIPS must not be enabled when upgrading from 1.x to 2.x if NPS is installed on the system
On Cloud Pak for Data System 2.x with NPS, enabling FIPS is not supported.
Reboot induced by Machine Config takes a very long time because Ceph components try to use the network after it has been shut down
No workaround is available at the time of the release.
HTTPD service certificate expired
Upgrade might fail during NodeOS installation with ERROR: Problem: Unable to start httpd on e1n2. The error message says that the certificate has expired and needs to be regenerated.

Workaround:

  1. Add the following line to /etc/httpd/conf.d/nss.conf, so that httpd ignores certificate validation failures:
    NSSEnforceValidCerts off
  2. Restart the httpd service (see the sketch after these steps).
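
On the affected node, the two steps might look like the following. This is a minimal sketch that assumes httpd is managed by systemd and that the directive is not already present in nss.conf:

    # Tell mod_nss to ignore certificate validation failures
    echo "NSSEnforceValidCerts off" >> /etc/httpd/conf.d/nss.conf
    # Restart httpd so the change takes effect, then confirm that it is running
    systemctl restart httpd
    systemctl status httpd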

You can also create a new certificate and add it to the NSS database.

admin user is no longer supported for system console login
You can no longer log in to the system web console using the admin credentials. Use apadmin or any other system user instead.
Firmware upgrade fails if SMM queries return invalid information
Run the sys_hw_check node command after firmware updates. If the sys_hw_config run showed no issues, but the subsequent sys_hw_check node run lists uEFI as BELOW, you have hit this issue. It might be possible to recover the node by shutting it down, physically reseating it, powering it back up, and then trying the firmware update again.
After upgrade, you can't create Cloud Pak for Data System users with same username as Cloud Pak for Data users
This happens if LDAP was configured on the system. In Cloud Pak for Data System 2.0, system users and Cloud Pak for Data users are separated. Any pre-existing Cloud Pak for Data users (from version 1.x) are added as system users without the Admin group privileges in version 2.0; they can log in to the system web console, but no operations are allowed.

Workaround:

You must delete the LDAP user and then re-add that user to the Admin group:
    ap_external_ldap userdel -u <username>
    ap_external_ldap useradd -u <username> -d <username> -g 2001 -e abc@ibm.com
The operation does not affect the Cloud Pak for Data user IDs.
Pods crash after setting system time
  • When setting the system time to the correct time or connecting to an upstream time source in a situation where the system time was ahead of the current real time, for example:
    • System time: 13:00 local time
    • Real time: 12:00 local time
    the Cloud Pak for Data pods crash. Open a support ticket to get help with resolving the situation.
  • When a small forward time jump is required (for example, 5 minutes), monitor the system; no issues should occur.
  • When a forward time jump by a large time delta is required, for example:
    • System time: 13:00 local time, Wednesday
    • Real time: 02:00 local time, Thursday (the next day)
    CSRs need to be regenerated and approved with the following command (a generic verification check is sketched after this list):
    python3 /opt/ibm/appliance/platform/xcat/scripts/xcat/installCluster.py approve-csrs --restart-kubelet --rolling
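
To confirm that the node CSRs were approved after running the script, standard OpenShift commands can be used. This is a generic check that assumes oc is logged in with cluster-admin privileges; the installCluster.py script above already performs the approvals:

    # List certificate signing requests; regenerated node CSRs show Pending until they are approved
    oc get csr
    # Approve an individual pending CSR by name if any remain
    oc adm certificate approve <csr_name>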
Control node marked FAILED after restarting with apstop/apstart

A control node is marked FAILED and issue 439: Openshift node is not ready is raised. Kubelet stops posting node status. The issue is with the nodeip-configuration.service.

Workaround:
  1. To recognize this issue, log in to the affected node and check the logs of the nodeip-configuration.service with systemctl or journalctl (see the command sketch after these steps).
  2. If the issue is confirmed, try restarting the service.
  3. If that does not work, run sudo rm -rf /var/lib/containers/* on the affected node. Then, monitor the service to ensure that it completes.
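
A minimal sketch of these steps, run on the affected node and assuming standard systemd tooling:

    # Step 1: check the status and recent logs of the service
    systemctl status nodeip-configuration.service
    journalctl -u nodeip-configuration.service --no-pager -n 100
    # Step 2: if the issue is confirmed, restart the service
    sudo systemctl restart nodeip-configuration.service
    # Step 3: if that does not help, clear the local container storage, then watch the service complete
    sudo rm -rf /var/lib/containers/*
    journalctl -u nodeip-configuration.service -f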