1.0.7.8 Interim Fix 2 release notes

1.0.7.8 Interim Fix 2 applies a number of fixes on systems with Connector Node (CN) installed, along with fixes for Platform Manager and Hardware Platform Interface (HPI). For large systems (with more than 16 enclosures), it:

  • addresses the memory leak associated with BOM caching.
  • fixes node sorting when running ap node and ap node -d.
  • ensures that the NPS container migrates correctly and that NPS starts automatically during VDB HOST migration from CN.
  • ensures that Docker sockets are disposed of correctly, to avoid data piling up and the "too many open files" error when using the NPS web console.
  • upgrades the aposcomms component to address the regressions from policy-based routing (PBR).

1.0.7.8 Interim Fix 2 (IF2) replaces the previously released 1.0.7.8 Interim Fix 1 (IF1). If you applied 1.0.7.8 IF1, you must apply 1.0.7.8 IF2 to fix the PBR issue.

Before you begin

  • Download the 1.0.7.8.IF2-WS-ICPDS-fpXXX package, where XXX stands for the latest package number, from Fix Central.
  • For Dell systems, estimated upgrade time is six hours. Downtime of around five hours is required.
  • For Lenovo systems, estimated upgrade time is two hours. Downtime of around 45 minutes is required.
  • The system must be on version 1.0.7.8 before applying the fix.
  • If NPS has non-default admin account credentials, the following actions must be completed before you can upgrade:
    1. Ensure that you have the NPS database admin user password.
    2. In the /export/home/nz/.bashrc file inside the container, set NZ_USER=admin and NZ_PASSWORD=<customer_password>.
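
As a sketch, the two settings could look like this in /export/home/nz/.bashrc inside the container. The export keyword and the single quotes are assumptions, not taken from these release notes: export makes the variables visible to child processes, and the quotes keep the shell from interpreting special characters in the password. Replace <customer_password> with the actual NPS database admin password.

```shell
# Assumed form of the entries in /export/home/nz/.bashrc (inside the NPS container).
# "export" and the quoting are illustrative choices, not mandated by the release notes.
export NZ_USER=admin
export NZ_PASSWORD='<customer_password>'
```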

Procedure

  1. Connect to node e1n1 via the management address and not the application address or floating address.
  2. Verify that e1n1 is the hub:
    1. Check for the hub node by verifying that the dhcpd service is running:
      systemctl is-active dhcpd
    2. If the dhcpd service is running on a node other than e1n1, bring the service down on that other node:
      systemctl stop dhcpd
    3. On e1n1, run:
      systemctl start dhcpd
  3. Download the icpds-release-1.0.7.8.IF2.tar.gz bundle and copy it to /localrepo on e1n1.
    Note: Make sure you delete all bundle files from previous releases.
  4. From the /localrepo directory on e1n1, run:
    mkdir /localrepo/1.0.7.8.IF2_release

    and move the system bundle into that directory. The directory name must be unique: no previous upgrade on the system can have been run from a directory with the same name.

  5. Verify the status of your appliance by running:
    • ap issues
    • ap version -s
    • ap sw
  6. Optional: View details about the specific upgrade version by running:
    apupgrade --upgrade-details --upgrade-directory /localrepo --use-version 1.0.7.8.IF2_release --bundle system
  7. Run preliminary checks before you start the upgrade process. The preliminary check option checks for possible issues and attempts to automatically fix any known issues during pre-checks.
    apupgrade --preliminary-check-with-fixes --upgrade-directory /localrepo --use-version 1.0.7.8.IF2_release --bundle system
  8. Optional: If you have custom certificates, copy them to the /opt/ibm/appliance/storage/platform/cyclops/ directory before you start the upgrade process:
    1. Copy cert.crt to /opt/ibm/appliance/storage/platform/cyclops/cert.crt
    2. Copy cert.key to /opt/ibm/appliance/storage/platform/cyclops/cert.key
  9. Start the upgrade process:
    apupgrade --upgrade --upgrade-directory /localrepo --use-version 1.0.7.8.IF2_release --bundle system
  10. Wait for the upgrade to complete successfully.
  11. Run:
    ap version -s

    and verify that the IF is listed in Interim Fixes.
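
The hub check, the directory preparation, and the apupgrade calls from the procedure can be wrapped in small shell helpers, as in the sketch below. The function and variable names (hub_check, prepare_release_dir, run_apupgrade, UPGRADE_DIR, VERSION) are hypothetical and exist only for illustration. The existence test in prepare_release_dir is a partial safeguard: it cannot detect a directory name that was used for an earlier upgrade and later deleted. The --quiet option of systemctl is-active suppresses output and reports the state through the exit status only.

```shell
# Hypothetical helpers; only the commands and flags inside them come from the procedure.
UPGRADE_DIR=/localrepo
VERSION=1.0.7.8.IF2_release

hub_check() {
    # Step 2: succeeds when the dhcpd service is active on the current node.
    systemctl is-active --quiet dhcpd
}

prepare_release_dir() {
    # Step 4. $1: upgrade directory, $2: new directory name, $3: bundle file.
    if [ -e "$1/$2" ]; then
        echo "ERROR: $1/$2 already exists; choose a unique name" >&2
        return 1
    fi
    mkdir "$1/$2" && mv "$3" "$1/$2/"
}

run_apupgrade() {
    # Steps 6, 7, and 9. $1: the apupgrade action flag, for example --upgrade.
    apupgrade "$1" --upgrade-directory "$UPGRADE_DIR" --use-version "$VERSION" --bundle system
}

# Possible usage on e1n1 (commented out so that sourcing this file changes nothing):
# hub_check || echo "dhcpd is not active on this node"
# prepare_release_dir "$UPGRADE_DIR" "$VERSION" "$UPGRADE_DIR/icpds-release-1.0.7.8.IF2.tar.gz"
# run_apupgrade --preliminary-check-with-fixes
# run_apupgrade --upgrade
```

Review the output of the preliminary check before you start the upgrade itself; the sketch does not assume anything about the exit status of apupgrade.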

Known issue: apupgrade fails at stopping Magneto Service

apupgrade fails to stop the Magneto service because some containers are still running. The command attempts to stop these containers and fails with the following error:
INFO: Attempt number 3 to stop Magneto service
ERROR: Exception: Traceback (most recent call last):
ERROR:   File "/localrepo/1.0.7.8.IF2/EXTRACT/upgrade/modules/ibm/ca/util/magneto_manager.py", line 264, in stop_magneto_service
ERROR:     raise Exception("Failed to stop Magneto Service, within {} tries.".format(str(count-1)))
ERROR: Exception: Failed to stop Magneto Service, within 3 tries.
ERROR:
Workaround:
  1. Run the docker ps command on the control nodes to see the running containers:
    ssh e1n1
    docker ps
    CONTAINER ID        IMAGE                           COMMAND                  CREATED             STATUS              PORTS                    NAMES
    2a19811bed99        callhome_repo:callhome.x86_64   "/usr/bin/entrypoi..."   30 hours ago        Up 29 hours                        callhome
    
  2. Run docker stop to stop the running container on the respective node:
    docker stop 2a19811bed99
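
The workaround can also be scripted per node. The sketch below uses a hypothetical helper, stop_running_containers, which lists the IDs of the running containers with docker ps -q (the -q option prints only container IDs) and stops each one. It assumes that every container that docker ps reports on the node should be stopped, so check the docker ps output manually first.

```shell
# Hypothetical helper: stop every running container on one control node.
stop_running_containers() {
    # $1: node host name, for example e1n1
    ids=$(ssh "$1" docker ps -q)
    for id in $ids; do
        ssh "$1" docker stop "$id"
    done
}

# Possible usage (commented out so that sourcing this file changes nothing):
# stop_running_containers e1n1
```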