Upgrading to version 2.0.2
Follow this procedure to upgrade Cloud Pak for Data System to version 2.0.2. The end-to-end upgrade time from 2.0.1.1 to 2.0.2 ranges from 24 to 30 hours. This includes the system, firmware, and OCP/OCS component upgrades.
- Platform upgrade (extended outage required)
- OCP upgrade
Extended outage upgrade applies to both Cloud Pak for Data System and Netezza® Performance Server.
Before you begin
Upgrade prerequisites:
- Your system must be on version 2.0.x and Cloud Pak for Data 4.0.2 to upgrade with the following instructions. For more information, see Verifying software components version.
- If your version of Cloud Pak for Data System is 2.0.0, before starting the upgrade, you must run the following command to avoid an upgrade failure caused by an incorrect rpm user group combination in the 2.0.0 upgrade script:
  sed -i -e '/apadmin/s/ibmapadmins/ibmapsysadmins/g' /opt/ibm/appliance/apupgrade/modules/ibm/ca/repository/yosemite_repository.py
- If FIPS is enabled on your system, you must disable it before starting the upgrade. For more information, see Configuring FIPS on pre-2.0.2 Cloud Pak for Data System. If you do not disable FIPS, apupgrade will fail.
- If the system already has a custom network configuration, it must be configured by using the playbooks in /opt/ibm/appliance/platform/apos-comms/customer_network_config/ansible with a System_Name.yml file.
  Before you upgrade, ensure that the /opt/ibm/appliance/platform/apos-comms/customer_network_config/ansible directory contains a System_Name.yml file that specifies the house network configuration.
  To locate the file, run the following command from /opt/ibm/appliance/platform/apos-comms/customer_network_config/ansible:
  ls -t *yml | grep -v template | head -1
  If the file does not exist, you must create it. Otherwise, your network configuration might break during the upgrade. For more information on the file and how to create it, see the Node side network configuration section, specifically Editing the network configuration YAML file.
- To connect to the node1 management interface, you must use the custom_hostname value under the node1 section or the ip value under the network1 section from your System_Name.yml file. For example:
  all:
    children:
      control_nodes:
        hosts:
          node1:
            custom_hostname: sbpoc04a.svl.ibm.com
            management_network:
              network1:
                ip: 9.30.106.111
- If apupgrade detects a custom network configuration and no YAML file, it fails at the pre-check step.
- If you are upgrading a new system with no network configuration, apupgrade does not stop at the check for System_Name.yml, but continues the upgrade process.
- Before you start the upgrade, from the /opt/ibm/appliance/platform/apos-comms/customer_network_config/ansible directory, you must run:
  ANSIBLE_HASH_BEHAVIOUR=merge ansible-playbook -i ./System_Name.yml playbooks/house_config.yml --check -v
  If any changes are listed in the --check -v output, ensure that they are expected. If they are unexpected, you must edit the YAML file so that it contains only the expected changes. Rerun this command as necessary until you see no errors.
- Before you start the 2.0.2 upgrade, you must stop NPS by completing the following steps. A consolidated sketch of these commands appears after this list.
  - Stop NPS monitoring. Run:
    oc get magnetomonitor -o=custom-columns='NAME:metadata.name,NUM_OF_MONITORS:spec.number_of_monitors,TYPE:spec.monitor_type' -n ap-magneto
    and save the number of monitors for each monitor of type nps.
  - For each monitor of type nps, run:
    oc patch magnetomonitor <magneto_monitor_name_of_nps_type> --type json -p '[{"op": "replace", "path": "/spec/number_of_monitors", "value": 0}]' -n ap-magneto
    For example:
    oc patch magnetomonitor magneto-monitor-nps-nps-1 --type json -p '[{"op": "replace", "path": "/spec/number_of_monitors", "value": 0}]' -n ap-magneto
    where the monitor of type nps is magneto-monitor-nps-nps-1 and has num_of_monitors=1 in the following output:
    oc get magnetomonitor -o=custom-columns='NAME:metadata.name,NUM_OF_MONITORS:spec.number_of_monitors,TYPE:spec.monitor_type' -n ap-magneto
    NAME                        NUM_OF_MONITORS   TYPE
    magneto-gateway             1                 gateway
    magneto-monitor-node        3                 node
    magneto-monitor-node2       1                 node2
    magneto-monitor-nps-nps-1   1                 nps
    magneto-monitor-ocs         1                 ocs
  - Stop NPS. For each NPS <nps_namespace_name> instance, complete the following steps:
    - Stop NPS by using nzstop. Run:
      oc --namespace=<nps_namespace_name> exec -t ipshost-0 -- su - nz -c "nzstop"
    - Scale down the ipshost and SPU statefulsets. Run:
      oc scale sts --all -n <nps_namespace_name> --replicas=0
    - To confirm that all NPS statefulsets scaled down, run:
      oc get sts -n <nps_namespace_name>
      The expected value in the READY column is 0/0 for each statefulset. For example:
      NAME      READY   AGE
      ipshost   0/0     41d
    - To determine the machineconfigpool name associated with the NPS <nps_namespace_name> instance, run:
      oc get mcp
instance. - Un-pause the associated
machineconfigpool
. Run:oc patch --type=merge --patch='{"spec":{"paused":false}}' machineconfigpool/nps-shared
    - To confirm that the associated machineconfigpool updates are unpaused, run:
      oc get machineconfigpool/nps-shared -o yaml | grep paused
      The expected result is as follows:
      f:paused: {}
      paused: false
During the extended upgrade, after the firmware update phase, SPU nodes might power down because the PXE boot source (NPS host) is unavailable as it was shut down during the procedure. This is an expected upgrade behavior. SPUs are powered on as part of the Netezza Performance Server post-upgrade steps.
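The NPS stop steps above can also be run as a single pass per NPS instance. The following is a minimal sketch only, assuming a single NPS namespace passed in place of the <nps_namespace_name> placeholder and the nps-shared machineconfigpool from the examples above; it does not replace the documented steps, and the names must be adjusted to match your system.
  #!/bin/bash
  # Hedged sketch: stop one NPS instance before the 2.0.2 upgrade.
  set -euo pipefail
  NPS_NS="${1:?Usage: $0 <nps_namespace_name>}"
  # Stop NPS with nzstop inside the ipshost pod.
  oc --namespace="${NPS_NS}" exec -t ipshost-0 -- su - nz -c "nzstop"
  # Scale down the ipshost and SPU statefulsets.
  oc scale sts --all -n "${NPS_NS}" --replicas=0
  # Confirm that every statefulset reports READY 0/0.
  oc get sts -n "${NPS_NS}"
  # Un-pause the machineconfigpool associated with this instance (nps-shared in the documented example).
  oc patch --type=merge --patch='{"spec":{"paused":false}}' machineconfigpool/nps-shared
  # Confirm that the pool is no longer paused.
  oc get machineconfigpool/nps-shared -o yaml | grep paused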
Procedure
In 2.0.2, the firmware upgrade is enabled by default in the upgrade process. Unlike in Cloud Pak for Data System 1.0, the --skip-firmware option is not allowed in 2.0.x. During the extended outage upgrade, the firmware on all the nodes is upgraded at once. For Netezza Performance Server systems, there is downtime during the extended outage upgrade.
You now need to upgrade OCP and its components. In the Cloud Pak for Data System 2.0.2 upgrade, OCP cannot be upgraded directly from version 4.6 to 4.8. During the first hop, OCP and OCS are upgraded from 4.6 to 4.7, and in the second upgrade step to 4.8. OCP and OCS are marked install_complete and postinstall_complete after the first hop, and started again before the second hop. Also, the CLO upgrade involves applying a machine configuration because Red Hat changed the repository names between versions 4.6 and 5.3.4.14, which requires an update to the imagecontentsource policies on the old cluster.
You can locate the upgrade directory in /opt/ibm/appliance/storage/platform/localrepo. You need the icpds_ocp-2.0.2.0_*.tar.gz bundle to perform the OCP upgrade.
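Before you start the OCP hops, you can confirm the current cluster version and that the bundle is available in the local repository. These are standard checks rather than commands specific to this upgrade, and the exact location of the tar.gz under localrepo is assumed here; adjust the path if your bundle sits in a subdirectory.
  # Show the current OpenShift version and whether an upgrade is in progress.
  oc get clusterversion
  # Confirm that the OCP upgrade bundle is present in the local repository (path assumed).
  ls -lh /opt/ibm/appliance/storage/platform/localrepo/icpds_ocp-2.0.2.0_*.tar.gz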
Netezza Performance Server post-upgrade steps
Perform the following actions to start NPS.
Procedure
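One of the start actions follows directly from the stop procedure in the prerequisites: the NPS monitor count that you saved before the upgrade must be restored. The following is a hedged sketch only, using the magnetomonitor resource and the saved value of 1 from the earlier example; it is not the complete start procedure.
  # Restore the saved number of monitors for the nps-type magnetomonitor (1 in the earlier example).
  oc patch magnetomonitor magneto-monitor-nps-nps-1 --type json -p '[{"op": "replace", "path": "/spec/number_of_monitors", "value": 1}]' -n ap-magneto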
FIPS post-upgrade steps
After you upgrade to version 2.0.2 and try to re-enable FIPS, the command fails with the following error:
dracut: installkernel failed in module kernel-modules-extra
This happens because an old kernel-modules-extra RPM remains on your system along with the new kernel RPM after the upgrade. For more information, see Version 2.0.2 release notes.
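To see the conflicting packages before you re-enable FIPS, you can list the installed kernel-modules-extra packages and compare them with the running kernel. This is a diagnostic sketch only; follow the Version 2.0.2 release notes for the actual remediation.
  # List the installed kernel-modules-extra packages; more than one entry indicates a stale RPM left from before the upgrade.
  rpm -qa | grep kernel-modules-extra
  # Show the running kernel version for comparison.
  uname -r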
Cloud Pak for Data post-upgrade steps
If you have Cloud Pak for Data installed on your system, after you upgrade your Cloud Pak for Data System to version 2.0.2, you must ensure that an unexpected Cloud Pak for Data upgrade is not triggered. To prevent that, existing 2.x Cloud Pak for Data System customers must pin the zen operand version to 4.2.0 (the Cloud Pak for Data 4.0.2 level) in the ap-console namespace. An upgrade of Cloud Pak for Data to a higher version should be done only in the zen namespace. Perform the following actions after your Cloud Pak for Data System 2.0.2 upgrade completes.
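The following is a hedged sketch of how the pin might be applied, assuming the console's Zen service is represented by a zenservice custom resource in the ap-console namespace that exposes a spec.version field; confirm the actual resource name and field on your system, and follow the documented post-upgrade actions if they differ.
  # List the zenservice resources in the ap-console namespace to find the resource name (name not taken from this document).
  oc get zenservice -n ap-console
  # Pin the operand to the 4.2.0 level (Cloud Pak for Data 4.0.2); replace <zenservice_name> with the name returned above.
  oc patch zenservice <zenservice_name> -n ap-console --type merge -p '{"spec":{"version":"4.2.0"}}'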