1.0.7.8 Interim Fix 3 release notes

1.0.7.8 Interim Fix 3 upgrades firmware through XCC and UEFI to support new drives, because the old drives are nearing end of life.

The upgraded firmware components are:
  • OneCLI: 3.4.0
  • SMM: 1.25
  • LXPM: PDL138G - 2.06
  • UEFI: TEE180H - 3.41
  • XCC: TEO3D2Q - 5.41
  • NIC_ETH_EXT: 16.27.2008
  • SSD_NVME_STORAGE_INTEL_SSDPE2KX040T8O: VDV10184
  • SSD_NVME_STORAGE_INTEL_SSDPF2KX038T1O: 9CV10320
  • SSD_NVME_STORAGE_SAMSUNG_MMZWLR3T8HCLS-00A07: MPPA5B5Q
Note: The upgrade might fail while copying files to the /tmp directory with scp, with output similar to the following:
2023-11-07 04:26:22 ERROR: Error running command [scp -r e1n1:/localrepo/1.0.7.8.IF3_release/EXTRACT/system/upgrade/hpi/*.rpm /tmp/APUPGRADE/hpi.20231107042312/] on [u'e1n1']
2023-11-07 04:26:22 INFO: Cleaning up the files(and the dir) that were copied to all the nodes...
2023-11-07 04:26:23 INFO: Done
2023-11-07 04:26:23 INFO: hpi:Failed. Failed to complete HPI component upgrade.
2023-11-07 04:26:23 FATAL ERROR: Errors encountered
2023-11-07 04:26:23 FATAL ERROR:
2023-11-07 04:26:23 FATAL ERROR: HpiUpgrader.install : Fatal Problem: Could not copy files to all nodes.
2023-11-07 04:26:23 FATAL ERROR: This error requires manual intervention to resolve. Please contact IBM Support.
2023-11-07 04:26:23 FATAL ERROR:
2023-11-07 04:26:23 FATAL ERROR: More Info: See trace messages at /var/log/appliance/apupgrade/20231107/apupgrade20231107040119.log.tracelog for additional troubleshooting information.
2023-11-07 04:26:23 INFO: File /var/log/appliance/apupgrade/rest_server_pid.txt storing REST Server process ID does not exist
2023-11-07 04:26:23 INFO: REST Server - Instance is not running.
2023-11-07 04:26:23 ERROR: The following components failed to upgrade: ['hpi']
2023-11-07 04:26:23 FATAL ERROR: Unhandled error when attempting upgrade. Stack trace of failed command logged to /var/log/appliance/apupgrade/20231107/apupgrade20231107040119.log.tracelog
2023-11-07 04:26:23 FATAL ERROR: More Info: See trace messages at /var/log/appliance/apupgrade/20231107/apupgrade20231107040119.log.tracelog for additional troubleshooting information.
2023-11-07 04:26:23 INFO: File /var/log/appliance/apupgrade/rest_server_pid.txt storing REST Server process ID does not exist
2023-11-07 04:26:23 INFO: REST Server - Instance is not running.
Check whether there is enough space under /tmp. If the directory is full, clean it up by running:
rm -rf /tmp/APUPGRADE/*
Then restart the upgrade by running:
apupgrade --upgrade --upgrade-directory /localrepo --use-version 1.0.7.8.IF3_release --bundle system
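To see how much space is free and what is consuming it, standard df and du commands can be used; the following is only an illustrative check and is not part of the upgrade tooling.
# Illustrative: check free space and the largest consumers under /tmp
df -h /tmp
du -sh /tmp/* 2>/dev/null | sort -rh | head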

Before you begin

  • Download the 1.0.7.8.IF3-WS-ICPDS-fpXXX package, where XXX stands for the latest package number, from Fix Central.
  • Estimated upgrade time:
    • For Dell systems, the estimated upgrade time is 2 hours. Downtime of around 2 hours is required.
    • For Lenovo systems, the estimated upgrade time is 4 hours 30 minutes. Downtime of around 4 hours and 30 minutes is required.
  • The system must be on version 1.0.7.8 to apply the fix.
  • If NPS has non-default admin account credentials, the following actions must be completed before you can upgrade:
    1. Ensure that you have the NPS database admin user password.
    2. In the /export/home/nz/.bashrc file inside the container, set NZ_USER=admin and NZ_PASSWORD=<customer_password>, as shown in the sketch below.
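    A minimal sketch of those .bashrc entries, assuming standard shell export syntax, looks like this; replace <customer_password> with the actual NPS admin password:
      # Hypothetical entries in /export/home/nz/.bashrc inside the container
      export NZ_USER=admin
      export NZ_PASSWORD=<customer_password>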

Procedure

  1. Connect to node e1n1 by using the management address, not the application address or the floating address.
  2. Verify that e1n1 is the hub:
    1. Check for the hub node by verifying that the dhcpd service is running:
      systemctl is-active dhcpd
    2. If the dhcpd service is running on a node other than e1n1, bring the service down on that other node:
      systemctl stop dhcpd
    3. On e1n1, run:
      systemctl start dhcpd
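    To confirm the hub in a single pass, you can loop over the control nodes as in the following sketch; node names other than e1n1 (for example, e2n1) are placeholders for your topology, not values from the release notes.
      # Illustrative check of the dhcpd service across control nodes
      for node in e1n1 e2n1; do
          echo -n "$node: "
          ssh $node systemctl is-active dhcpd
      done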
  3. Download the icpds-release-1.0.7.8.IF3.tar.gz bundle and copy it to /localrepo on e1n1.
    Note: Make sure that you delete all bundle files from previous releases.
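    If the bundle was downloaded to a separate workstation, it can be copied over with scp as in the following sketch; the root user and <e1n1-management-address> are placeholders, not values from the release notes.
    # Copy the bundle to /localrepo on e1n1; the address below is a placeholder
    scp icpds-release-1.0.7.8.IF3.tar.gz root@<e1n1-management-address>:/localrepo/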
  4. From the /localrepo directory on e1n1, run:
    mkdir /localrepo/1.0.7.8.IF3_release

    Then move the system bundle into that directory. The directory that is used here must have a unique name; that is, no previous upgrade on the system can have been run from a directory with the same name.
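
    For example, assuming the bundle file name from step 3, the move can look like this (illustrative only):
    # Move the downloaded bundle into the uniquely named release directory
    mv /localrepo/icpds-release-1.0.7.8.IF3.tar.gz /localrepo/1.0.7.8.IF3_release/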

  5. Verify the status of your appliance by running:
    • ap issues
    • ap version -s
    • ap sw
  6. Optional: Run upgrade details to view details about the specific upgrade version:
    apupgrade --upgrade-details --upgrade-directory /localrepo --use-version 1.0.7.8.IF3_release --bundle system
  7. Run preliminary checks before you start the upgrade process. The preliminary check option scans for possible issues and attempts to fix any known issues automatically.
    apupgrade --preliminary-check-with-fixes --upgrade-directory /localrepo --use-version 1.0.7.8.IF3_release --bundle system
  8. Optional: If you have custom certificates, copy them to the /opt/ibm/appliance/storage/platform/cyclops/ directory before you start the upgrade process.
    1. Copy cert.crt to /opt/ibm/appliance/storage/platform/cyclops/cert.crt
    2. Copy cert.key to /opt/ibm/appliance/storage/platform/cyclops/cert.key
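    Assuming the certificate files are staged in a hypothetical /root/certs directory on e1n1, the copy can look like the following sketch; the source path is an assumption, not part of the release notes.
    # Copy custom certificates into place; /root/certs is a hypothetical staging path
    cp /root/certs/cert.crt /opt/ibm/appliance/storage/platform/cyclops/cert.crt
    cp /root/certs/cert.key /opt/ibm/appliance/storage/platform/cyclops/cert.key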
  9. Start the upgrade process:
    apupgrade --upgrade --upgrade-directory /localrepo --use-version 1.0.7.8.IF3_release --bundle system
  10. Wait for the upgrade to complete successfully.
  11. Run:
    ap version -s

    Then verify that the interim fix is listed under Interim Fixes.
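
    To narrow the output to the interim fix entries, you can filter it as in the following sketch; the exact labels in the ap version output may differ, so treat this as an illustration only.
    # Illustrative filter of the version output
    ap version -s | grep -i interim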

Known issue: apupgrade fails when stopping the Magneto service

apupgrade fails to stop the Magneto service because one or more containers are still running. The command attempts to stop these containers and fails with an error similar to the following:
INFO: Attempt number 3 to stop Magneto service
ERROR: Exception: Traceback (most recent call last):
ERROR:   File "/localrepo/1.0.7.8.IF3/EXTRACT/upgrade/modules/ibm/ca/util/magneto_manager.py", line 264, in stop_magneto_service
ERROR:     raise Exception("Failed to stop Magneto Service, within {} tries.".format(str(count-1)))
ERROR: Exception: Failed to stop Magneto Service, within 3 tries.
ERROR:
Workaround:
  1. Run the docker ps command on the control nodes to see the running containers:
    ssh e1n1
    docker ps
    CONTAINER ID        IMAGE                           COMMAND                  CREATED             STATUS              PORTS                    NAMES
    2a19811bed99        callhome_repo:callhome.x86_64   "/usr/bin/entrypoi..."   30 hours ago        Up 29 hours                        callhome
    
  2. Run docker stop to stop the running container on the respective node:
    docker stop 2a19811bed99
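    If several containers are still running, they can all be stopped in one command; verify first that stopping every listed container is safe on your system, because this is an illustration and not a documented step.
    # Illustrative: stop all containers that are still running on this node
    docker stop $(docker ps -q)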