IBM Support

How to recover from 564 node errors after installing the SAS Host Interface Card

How To


Steps

Abstract

Installing the optional SAS Host Interface Card (HIC) in a Storwize V3700 for Lenovo node canister running V7.1.0.0 or V7.1.0.1 may result in the node reporting a 564 error and ceasing system operations.

Solution

Recovery if there are no offline volumes in the I/O group:

Check for offline volumes before removing a node. Removing a node while volumes in its I/O group are offline may result in the loss of hardened cache data.

  1. Confirm that there are no offline volumes in the I/O group.
  2. Remove the node canister from the cluster using the management GUI, or by running the following CLI command:
      rmnodecanister <nodeid>
  3. Use the service assistant GUI on the node to force the node to leave the cluster.
  4. Use the service assistant GUI to reboot the node.
  5. The node should return in the Candidate state and automatically re-join the cluster. If the node does not re-join and instead reports node error 690, run the following CLI command to force the node to re-join:
      satask stopservice <nodepanelname>
  6. Use the management GUI to confirm that the new SAS HIC has been accepted.

Recovery if there are offline volumes in the I/O group:

  1. Confirm that offline volumes exist in the I/O group.
  2. Power down the node canister by running the following CLI command:
      satask stopnode -poweroff <nodepanelname>
  3. Remove the newly added SAS HIC.
  4. Allow the node to restart and join the cluster.
  5. Confirm that there are no longer any offline volumes.
  6. Remove the node canister from the cluster using the management GUI, or by running the following CLI command:
      rmnodecanister <nodeid>
  7. Use the service assistant GUI on the node to force the node to leave the cluster.
  8. Use the service assistant GUI to power off the node.
  9. Remove the node from its enclosure and install the SAS HIC.
  10. Restore the node to the enclosure.
  11. The node should return in the Candidate state and automatically re-join the cluster. If the node does not re-join and instead reports node error 690, run the following CLI command to force the node to re-join:
      satask stopservice <nodepanelname>
  12. Use the management GUI to confirm that the new SAS HIC has been accepted.
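
Both procedures start by checking whether the I/O group has offline volumes. The following is a minimal sketch of that check, not part of the official procedure. It assumes the output of the lsvdisk CLI command has been captured with a colon delimiter (lsvdisk -delim :) and that the header row includes the name, IO_group_name, and status columns. The sample output, the volume names, and the function names offline_volumes and choose_procedure are hypothetical and used for illustration only.

```python
import csv
import io

# Hypothetical, shortened sample of `lsvdisk -delim :` output.
# Real output contains more columns; only the ones used below are shown.
SAMPLE_LSVDISK = """\
id:name:IO_group_id:IO_group_name:status
0:vol_app:0:io_grp0:online
1:vol_db:0:io_grp0:offline
"""


def offline_volumes(lsvdisk_output, io_group_name):
    """Return the names of volumes in the given I/O group with status 'offline'."""
    # Columns are looked up by header name, so extra columns are ignored.
    reader = csv.DictReader(io.StringIO(lsvdisk_output), delimiter=":")
    return [
        row["name"]
        for row in reader
        if row["IO_group_name"] == io_group_name and row["status"] == "offline"
    ]


def choose_procedure(lsvdisk_output, io_group_name):
    """Say which of the two recovery procedures applies to the I/O group."""
    if offline_volumes(lsvdisk_output, io_group_name):
        return "offline volumes present: remove the SAS HIC before removing the node"
    return "no offline volumes: remove the node canister directly"


if __name__ == "__main__":
    print(choose_procedure(SAMPLE_LSVDISK, "io_grp0"))
```

Treat the result as a pointer to the applicable procedure only. Confirm the volume status in the management GUI or CLI before removing a node.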

Document Location

Worldwide

Operating System

Lenovo RackSwitches and Storage devices: Operating system independent / None

Document Information

Modified date:
25 March 2023

UID

ibm1MIGR-5097103