Upgrading to version 1.0.7.1

Follow this procedure to upgrade Cloud Pak for Data System to version 1.0.7.1.

Before you begin

Upgrade prerequisites:

If you are upgrading to version 1.0.7.1 from a version prior to 1.0.7.0, you must follow the procedure described in Upgrading to version 1.0.7.0 to upgrade the system bundle and the switch firmware. You can skip upgrading the following bundles:

  • icpds_vm
  • icpds_services
  • icpds_services_addon_cyclops
If your system was already on 1.0.7.0, follow the rest of the procedure. If your switches failed to upgrade in 1.0.7.0, use one of the following solutions before proceeding further:
  • Run the 1.0.7.0 apupgrade command with the option --update-switches:
    apupgrade --upgrade --upgrade-directory /localrepo --use-version 1.0.7.0_release --bundle system --update-switches
  • Run the following commands to update the switches directly:
    /opt/ibm/appliance/platform/hpi/sys_hw_config --plain-text mgtsw --os-upgrade
    /opt/ibm/appliance/platform/hpi/sys_hw_config --plain-text fabsw --os-upgrade
  • Contact IBM Support to solve any switch upgrade issues.

To avoid an Ansible performance issue, ensure that ulimit -n is set to 1024 on all OpenShift nodes (that is, on all control and worker VMs).
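For example, you can check the limit from e1n1-1-control with a loop like the following (a sketch only; it assumes an active oc login and passwordless SSH from the control VM to the nodes, the same assumptions the firewall step later in this procedure makes):

    # Report the open-file limit on every OpenShift node (sketch)
    for node in $(oc get nodes --no-headers | awk '{print $1}'); do
        echo -n "$node: "
        ssh "$node" "ulimit -n"
    done

If a node reports a value other than 1024, correct it before you start the upgrade.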

Network setup prerequisites:
  • If the system already has a custom network configuration, it must be described by a System_Name.yml file in /opt/ibm/appliance/platform/apos-comms/customer_network_config/ansible:

    Before you upgrade, ensure that /opt/ibm/appliance/platform/apos-comms/customer_network_config/ansible contains a System_Name.yml file that specifies the house network configuration.

    To locate the file, run the following command from /opt/ibm/appliance/platform/apos-comms/customer_network_config/ansible:
    ls -t *yml | grep -v template | head -1

    If the file does not exist, you must create it; otherwise, your network configuration might break during the upgrade. For more information about the file and how to create it on versions older than 1.0.3.6, see the Node side network configuration section, specifically Editing the network configuration YAML file. A quick pre-check is shown after this list.

  • If apupgrade detects a custom network configuration but no .yml file, it fails at the precheck step.
  • If you are upgrading a new system with no network configuration, apupgrade does not stop at the System_Name.yml check and continues the upgrade process.
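For example (a sketch; System_Name.yml stands for your actual system name), you can verify that a non-template configuration file exists before starting the upgrade:

    # Look for the newest non-template .yml file in the customer network configuration directory (sketch)
    cd /opt/ibm/appliance/platform/apos-comms/customer_network_config/ansible
    CONFIG_FILE=$(ls -t *yml | grep -v template | head -1)
    if [ -n "$CONFIG_FILE" ]; then
        echo "Found network configuration file: $CONFIG_FILE"
    else
        echo "No System_Name.yml file found; create one before upgrading a system with a custom network configuration"
    fi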

Netezza prerequisites:

If Netezza® Performance Server is installed on your system, run nzhostbackup before you upgrade. For more information, see Create a host backup.
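One way to take the backup from e1n1, consistent with the NPS commands later in this procedure, is shown below (a sketch; the archive path is an example only, see Create a host backup for the supported options):

    # Back up the NPS host catalog before the upgrade (sketch; archive path is illustrative)
    docker exec -it ipshost1 bash -c "su - nz -c 'nzhostbackup /nz/hostbackup_pre_1.0.7.1.tar.gz'"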

All upgrade commands must be run as the root user.

Procedure

  1. Connect to node e1n1 via the management address and not the application address or floating address.
  2. Download the system and rhos_repo bundles from Fix Central and copy them to /localrepo on e1n1.
    Note: The upgrade bundles require a significant amount of free space. Make sure you delete all bundle files from previous releases.
  3. From the /localrepo directory on e1n1, run:
    mkdir 1.0.7.1_release
    and move the bundle files into that directory.
    The directory that is used here must have a unique name; that is, no previous upgrade on the system can have been run out of a directory with the same name. An example of this preparation is shown after step 6.
  4. Optional: View details about the specific upgrade version:
    apupgrade --upgrade-details --upgrade-directory /localrepo --use-version 1.0.7.1_release --bundle system
  5. Run preliminary checks before you start the upgrade process:
    apupgrade --preliminary-check --upgrade-directory /localrepo --use-version 1.0.7.1_release --bundle system
  6. Upgrade the apupgrade command to get the new command options:
    apupgrade --upgrade-apupgrade --upgrade-directory /localrepo --use-version 1.0.7.1_release --bundle system

    The value for the --use-version parameter is the same as the name of the directory you created in step 3.
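Steps 2 and 3 can be performed with a short sequence like the following (a sketch; the bundle file name pattern is a placeholder, use the actual file names from Fix Central):

    # Prepare /localrepo on e1n1 (sketch)
    df -h /localrepo                                     # confirm there is enough free space for the bundles
    mkdir /localrepo/1.0.7.1_release                     # the directory name must not have been used by a previous upgrade
    mv /localrepo/*.tar.gz /localrepo/1.0.7.1_release/   # move the downloaded system and rhos_repo bundles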

You now need to upgrade the VM cluster on the system.

  1. Download the following additional ICPDS bundles and copy them to the directory you created in step 3.
    • icpds_vm
    • icpds_rhos_repo
    • icpds_services
    • icpds_services_addon_cyclops
  2. Run:
    rm -rf /localrepo/1.0.7.1_release/EXTRACT/*
    This command makes it possible to rerun apupgrade if you want to, or if the VM redeployment fails for any reason.
    Note: The root partition on the system nodes is only 200 GB. Make sure that the partition is clean (aside from the bundles that you copied in) before you start.
  3. To avoid issues during the OpenShift upgrade and route reconfiguration, add the following parameters to the [OSEv3:vars] section of the inventory file e1n1-1-control:/opt/ibm/appmgt/config/hosts before you upgrade the VM (a scripted example follows step 4):
    skip_logging_health_sanity_check=True
    openshift_certificate_expiry_fail_on_warn=false
    For example:
    [OSEv3:children]
    masters
    etcd
    nodes
    
    [OSEv3:vars]
    ...
    ...
    skip_logging_health_sanity_check=True
    openshift_certificate_expiry_fail_on_warn=false
  4. Initiate the VM upgrade by running:
    apupgrade --upgrade --upgrade-directory /localrepo --use-version 1.0.7.1_release --bundle vm
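If you prefer to add the parameters from step 3 non-interactively, a sequence like the following can be run on e1n1-1-control (a sketch; it assumes the [OSEv3:vars] section header is present in the inventory file):

    # Append the two parameters directly under the [OSEv3:vars] section header, if not already present (sketch)
    HOSTS_FILE=/opt/ibm/appmgt/config/hosts
    grep -q '^skip_logging_health_sanity_check=True' "$HOSTS_FILE" || \
        sed -i '/^\[OSEv3:vars\]/a skip_logging_health_sanity_check=True' "$HOSTS_FILE"
    grep -q '^openshift_certificate_expiry_fail_on_warn=false' "$HOSTS_FILE" || \
        sed -i '/^\[OSEv3:vars\]/a openshift_certificate_expiry_fail_on_warn=false' "$HOSTS_FILE"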
    

You now need to upgrade the application services (Portworx, Cloud Pak for Data, Netezza Performance Server console, Cloud Pak for Data System console).

  1. Ensure the following bundles are located in the directory you created in step 3. Run the command:
    ls /localrepo/1.0.7.1_release/icpds*.tar.gz
    and look for the following:
    • icpds_vm
    • icpds_rhos_repo
    • icpds_services
    • icpds_services_addon_cyclops
  2. Remove the contents of the EXTRACT directory:
    rm -rf /localrepo/1.0.7.1_release/EXTRACT/*
  3. Initiate the services upgrade by running:
    apupgrade --upgrade --upgrade-directory /localrepo --use-version 1.0.7.1_release --bundle services --application all
    
    Note: If you run ap sw after this part of the upgrade completes, it might show that the web console is on version 1.0.7. This is a known behavior; the web console UI should show the correct version.

    If ap sw shows Cloud Pak for Data on 2.5 and ap version shows 1.0.7.0, your system is on 1.0.7.0.

    If ap sw shows Cloud Pak for Data on 3.0.1 and ap version shows 1.0.7.0, your system is on 1.0.7.1.

  4. Apply the following patch for Cloud Pak for Data:
    1. Run the following command from e1n1:
      curl -k https://localhost:5001/apupgrade/progress -u a:$(cat /run/magneto.token) -X PUT -d '{"upgrade_in_progress": "True"}'
    2. Log in to e1n1-1-control:
      ssh e1n1-1-control
    3. Run the following three commands:
      systemctl start appmgnt-rest
      appcli patchfix --app=icp4d
      systemctl stop appmgnt-rest
    4. When complete, run the following command from e1n1:
      curl -k https://localhost:5001/apupgrade/progress -u a:$(cat /run/magneto.token) -X PUT -d '{"upgrade_in_progress": "False"}'
  5. Run the following command from e1n1-1-control:
    for node in $(oc get nodes --no-headers |awk '{print $1}'); do echo $node; ssh $node "firewall-cmd --zone=public --permanent --add-port=9024/udp; firewall-cmd --reload"; done
  6. Apply the following steps to avoid the user management issue where the admin user cannot add services in the web console:
    1. Find the control nodes by running the command as root from e1n1:
      /opt/ibm/appliance/platform/xcat/scripts/xcat/display_nodes.py --control
    2. SSH to each of the control nodes as the root user and run the following commands:
      sed -i 's/data\["uid"\] = "999"/data["uid"] = "1000330999"/g' /opt/ibm/appliance/platform/userauth/globalusermgmt/usermgmtserver.py
      systemctl stop apglobalusrmgmt;systemctl start apglobalusrmgmt
    3. Make sure to log out of any admin sessions after applying these changes and before creating any service instances. The preceding steps modify the admin account and make changes that are required by the service instances. An optional verification of steps 5 and 6 is shown after this procedure.
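As an optional verification of steps 5 and 6 (a sketch that reuses the node loop from step 5), confirm that port 9024/udp is open on every OpenShift node and that the UID substitution is in place on each control node:

    # From e1n1-1-control: list any nodes where 9024/udp is not yet open (sketch)
    for node in $(oc get nodes --no-headers | awk '{print $1}'); do
        ssh "$node" "firewall-cmd --zone=public --list-ports | grep -q 9024/udp" || echo "$node: 9024/udp missing"
    done

    # On each control node: a non-zero count confirms the UID change from step 6
    grep -c '1000330999' /opt/ibm/appliance/platform/userauth/globalusermgmt/usermgmtserver.py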

Netezza Performance Server

If NPS is installed on Cloud Pak for Data System, perform the steps described in this section.

Procedure

  1. Stop the platform manager:
    apstop -p
  2. Stop NPS:
    docker exec -it ipshost1 bash -c "su - nz -c 'nzstop'"
  3. Remove the GPFS token file to avoid accidental nzstart:
    docker exec ipshost1 rm /nz/.gpfstoken
  4. Log in to the e1n1 control node and update the rp_filter settings in e1n1's ipshost1 container:
    sysctl --system
    
    docker exec ipshost1 bash -c "sed -i -e '/net.ipv4.conf.all.rp_filter/d' /etc/sysctl.conf"
    docker exec ipshost1 bash -c "echo 'net.ipv4.conf.all.rp_filter = 0' >> /etc/sysctl.conf"
    docker exec ipshost1 bash -c "echo 'net.ipv4.conf.default.rp_filter = 0' >> /etc/sysctl.conf"
    docker exec ipshost1 bash -c "echo 'net.ipv4.conf.mgt1.rp_filter = 0' >> /etc/sysctl.conf"
    docker stop ipshost1 
  5. Perform the following steps on the other two control nodes (e1n2 and e1n3 on Lenovo systems, e2n1 and e3n1 on Dell systems):
    docker start ipshost1
    docker exec ipshost1 bash -c "sed -i -e '/net.ipv4.conf.all.rp_filter/d' /etc/sysctl.conf"
    docker exec ipshost1 bash -c "echo 'net.ipv4.conf.all.rp_filter = 0' >> /etc/sysctl.conf"
    docker exec ipshost1 bash -c "echo 'net.ipv4.conf.default.rp_filter = 0' >> /etc/sysctl.conf"
    docker exec ipshost1 bash -c "echo 'net.ipv4.conf.mgt1.rp_filter = 0' >> /etc/sysctl.conf"
    docker exec ipshost1 bash -c "sysctl --system"
    docker stop ipshost1
    
    sysctl --system 
  6. Start NPS on the e1n1 control node:
    docker start ipshost1
    docker exec -it ipshost1 bash -c "su - nz -c 'nzstart'"
  7. Recreate the GPFS token file:
    docker exec ipshost1 touch /nz/.gpfstoken
  8. Start platform manager:
    apstart -p
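After step 8, a quick health check can be run from e1n1 (a sketch; the exact command output varies by release):

    # Confirm that NPS is back online and check for open platform alerts (sketch)
    docker exec -it ipshost1 bash -c "su - nz -c 'nzstate'"
    ap state
    ap issues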