Administering the node
Administering the nodes in the system includes day-to-day operations such as monitoring node health, putting a node into maintenance mode, performing power operations, and applying firmware updates whenever they are available.
- Configured and Discovered node tabs
- Node details
- Node power operations
- Enable and disable node maintenance
- Firmware upgrade
- Add disks
- Add racks
- Upsize nodes
For any node drain issues during maintenance, see Issues related to IBM Storage Fusion HCI System node drains.
Configured and Discovered node tabs
From the IBM Storage Fusion menu, click . The Node page includes Configured nodes and Discovered nodes tabs. By default, the Node page opens the Configured nodes tab.
Go to the Discovered nodes tab to add discovered nodes in a rack that were not added to the OpenShift cluster earlier.
Node details
Field | Description |
---|---|
Name | The name of the node. |
Hardware status | The node Status column also displays the health of the node. The node Status column that is shown on the node inventory page is the node hardware monitoring state on the IBM Storage Fusion user interface. A node is monitored by connecting to its remote management module. Hence, the node state is a reflection of its connectivity to the monitoring module and its ability to fetch the monitoring data. Important: To see the OpenShift state, go to Dashboard > OpenShift section. |
Firmware | The firmware version of the node. |
Type | The options are Compute only, Compute storage, AFM, or GPU. |
S/N | The hardware serial number. |
Rack | The name of the rack. |
Rack unit | The rack unit position of the node in the rack. |
CPU cores | The number of CPU cores. |
Memory (GB) | The amount of memory in GB. |
Use the Search text box to filter and find a specific node.
Node power operations
- Enable maintenance or Disable maintenance: After the node moves to maintenance successfully, the ellipsis overflow menu shows the Disable maintenance option. The Enable maintenance option is available when the node is not already in maintenance.
- Power operations:
Alternatively, you can click Manage resources in the node details > Inventory tab page and then do power operations.
- Power on and power off operations on a node:
-
Note: Do all power operations only from the IBM Storage Fusion HCI System user interface.
- Move the node to maintenance.
- In the ellipsis menu of the node, click Power off node to power off the node.
- In the confirmation window, click Power off.
- In the ellipsis menu of the node, click Power on node to power on the node.
- In the confirmation window, click Power on.
- After you complete the maintenance operations, from the ellipsis overflow menu of the node, click Power on node to power on the node.
- Restart and shutdown operations on a node
- Move the node to maintenance.
- In the ellipsis menu of the node, click Restart node or Shutdown node. Alternatively, you can click the Manage resources in the node details > Inventory tab page and then do power operations.
- In the confirmation window for restart or shutdown actions, click Restart node or Shutdown node accordingly. Note: If you shut down a node, the node goes offline. While offline, all data collection for this node stops and no changes or upgrades can be made.
Enable and disable node maintenance
When you put a node into maintenance mode, the node is marked as Scheduling disabled and its workload is drained. You can place a node in maintenance from the user interface or through any operation that needs a server reboot.
To run power operations, place a node in maintenance mode.
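As a rough cross-check outside the user interface, the Scheduling disabled marking can be read from the node object itself. The following is a minimal sketch, assuming a logged-in `oc` session and assuming that the Scheduling disabled state corresponds to the standard Kubernetes `spec.unschedulable` field on the node. The helper names `is_scheduling_disabled` and `fetch_node` are hypothetical and are not part of IBM Storage Fusion.

```python
import json
import subprocess


def is_scheduling_disabled(node):
    """Return True when a parsed Node object is marked unschedulable.

    Assumption: "Scheduling disabled" maps to spec.unschedulable == true,
    which is the field that a standard cordon operation sets.
    """
    return node.get("spec", {}).get("unschedulable", False) is True


def fetch_node(node_name):
    """Fetch one Node object as a dict. Requires a logged-in `oc` session."""
    result = subprocess.run(
        ["oc", "get", "node", node_name, "-o", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)
```

For example, `is_scheduling_disabled(fetch_node("<node name>"))` reports whether the node is currently excluded from scheduling. This check is read-only; it does not replace the maintenance procedure in the user interface.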
- Procedure
-
- In the Nodes page, click the ellipsis menu of the node you want to move to maintenance and click Enable maintenance mode. Alternatively, you can click the Manage resources in the node details > Inventory tab page.
- In the Enable maintenance mode confirmation window, click Enable.
- Wait for the node to enter maintenance mode.
- After all the required maintenance operations are completed, move the node out of maintenance. From the ellipsis overflow menu of a node that is in maintenance, select the Disable maintenance mode option.
- You cannot move more than one node to maintenance at a time. Also, if the GPFS cluster health is degraded, node maintenance does not succeed. If you ignore these restrictions, the IBM Storage Fusion HCI System user interface shows a failure message:
"Detected problem in previous maintenance operation"
Check the events or the CR status for the root cause of the issue and fix it:
oc describe cmt <instance node name> -n <fusion namespace>
- If you face an issue retrieving the compute nodes and network components after maintenance mode operation, log out and log in from the user interface.
- Moving a node to maintenance mode can take from 4 to 30 minutes, depending on the workload on the node and the Scale PodDisruptionBudget. If it takes more than 30 minutes, the operation times out. For more information about this issue, see Issues related to IBM Storage Fusion HCI System node drains.
- When you put a node into maintenance mode from the IBM Storage Fusion HCI System user interface, the node is marked as Scheduling disabled and its workload is drained.
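The 30-minute limit and the `oc describe cmt` command from the notes above can be combined into a small waiting helper. This is only a sketch under stated assumptions: the `check` callable is caller-supplied and hypothetical (this page does not say which field signals that maintenance succeeded), and the function names are illustrative. Only the `oc describe cmt <instance node name> -n <fusion namespace>` command comes from this page.

```python
import subprocess
import time

# Upper bound taken from the note above: the operation times out after 30 minutes.
MAINTENANCE_TIMEOUT_S = 30 * 60


def describe_cmt_command(instance_node_name, fusion_namespace):
    """Build the `oc describe cmt` command quoted in the note above."""
    return ["oc", "describe", "cmt", instance_node_name, "-n", fusion_namespace]


def wait_until(check, timeout_s=MAINTENANCE_TIMEOUT_S, interval_s=30,
               clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or timeout_s elapses.

    Returns True if check() succeeded and False on timeout.
    `check` is a hypothetical, caller-supplied function.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def show_failure_details(instance_node_name, fusion_namespace):
    """Run the describe command and return its output. Requires `oc` login."""
    cmd = describe_cmt_command(instance_node_name, fusion_namespace)
    return subprocess.run(cmd, capture_output=True, text=True, check=False).stdout
```

If `wait_until` returns False, calling `show_failure_details` with the instance node name and Fusion namespace prints the events and CR status to inspect. The clock and sleep parameters are injectable only so that the loop can be exercised without real waiting.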
Firmware upgrade
If a firmware upgrade is available for a node, click the ellipsis overflow menu of the node record and click Upgrade firmware. You can also select multiple nodes at a time and click the Upgrade button.
For more information about node firmware upgrade, see Upgrading node firmware.
- Failure on a node
-
If upgrade fails on a node, then click the ellipsis overflow menu of the node record and click Cancel upgrade. The Retry upgrade option is enabled.
Note: Upon clicking Cancel upgrade, the firmware upgrade failed state changes to upgrade available.
If a node is queued up for upgrade or scheduled for upgrade, then the Cancel upgrade option is available in the menu.
- If you click Cancel upgrade on a node that is queued up for upgrade, then the node is removed from the upgrade queue.
- You cannot cancel an ongoing firmware upgrade.
- Failure when multiple nodes are selected
- When you choose multiple nodes for upgrade and a node fails in between, the rest of the nodes in the sequence change to the Scheduled for upgrade state. Fix the issue in the failed node and click Retry upgrade to complete the upgrade on the following nodes:
- Problematic node
- The rest of the nodes queued for upgrade, which are in the Scheduled state.
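The multi-node failure behavior described above can be pictured with a toy model. This sketch is not product code: it only mirrors the rule that nodes are upgraded in sequence, that the first failure leaves the remaining nodes in the Scheduled state, and that Retry upgrade resumes with the problematic node followed by the rest. The state names "Upgraded" and "Failed", the function names, and the `upgrade_one` callback are assumptions made for illustration.

```python
def run_upgrade(nodes, upgrade_one):
    """Upgrade nodes in order and stop at the first failure.

    upgrade_one(node) is a hypothetical callback that returns True on success.
    Returns a dict that maps each node to its resulting state.
    """
    states = {}
    failed = False
    for node in nodes:
        if failed:
            # Nodes after the failed one are not attempted; they stay queued.
            states[node] = "Scheduled"
        elif upgrade_one(node):
            states[node] = "Upgraded"
        else:
            states[node] = "Failed"
            failed = True
    return states


def retry_upgrade(states, nodes, upgrade_one):
    """Retry the failed node, then continue with the Scheduled nodes in order."""
    pending = [n for n in nodes if states[n] in ("Failed", "Scheduled")]
    new_states = dict(states)
    new_states.update(run_upgrade(pending, upgrade_one))
    return new_states
```

In this model, a three-node run where the second node fails leaves the first node upgraded, the second failed, and the third scheduled. After the underlying issue is fixed, `retry_upgrade` processes the second and third nodes in the original order.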
Add disks
Click Add disks to upsize disks. For the procedure to add disks, see Adding additional storage nodes.
Add racks
Click Actions and select Add racks to expand your IBM Storage Fusion HCI System with an additional rack. For the procedure to add racks, see Adding expansion racks.
Upsize nodes
Go to the Discovered nodes tab to add discovered nodes in a rack that were not added to the OpenShift cluster earlier. For the procedure to add nodes, see Configuring nodes for management.