Monitoring nodes and switches

Procedure to monitor nodes and switches from the Overview dashboard page.

Procedure

  1. In the IBM Storage Fusion HCI System user interface, go to Infrastructure > Overview.
    The Overview dashboard page is displayed. For details about the overview page, see Monitoring hardware from IBM Storage Fusion HCI System user interface.
  2. Go through the Resource summary section to monitor the health of the nodes and switches.
  3. Hover over the graphical view of the hardware to identify units in the rack. For a node, the hover details show the name of the hardware, the health status, the type of node, and the rack unit. For a switch, they show the name of the hardware, the health status, the type of switch, and the rack unit.
  4. If one or more nodes or switches are in an error, degraded, or disabled state, take appropriate action on the hardware component.
    In the rack picture, the color of the node or switch indicates the health and status.
    Option Description
    Node or switch in green: The status is healthy or normal, and no action is required.
    Node in red or yellow: The component is in a failed (critical) or degraded state. Do the following steps to resolve the issue:
    1. Click the component that is shown in red or yellow.

      A slide-out pane is displayed with the hardware status, type, firmware, S/N, rack, and rack unit details.

    2. Go through the details in the slide-out pane. If you want more information about the node, click View full details.

      The Nodes page opens. It includes the front and rear graphical views, recent events, and other details of the node and its internal components.

    3. Select Front in the Components section to see the front view of the node, and hover over the graphical view to check the internal components and their status.

      To debug further, click the internal component that is shown in red. A new slide-out pane opens for drives with more details such as slot, type, status, total capacity (GB), and serial number.

    4. Select Back in the Components section to see the back view of the node, and hover over the graphical view to check the internal components and their status.

      To debug further, click the internal component that is shown in red. A new slide-out pane opens for adapters with more details such as slot, port, type, speed, status, network address, adapter, bond, and connected to.

    5. Click View table to check all internal components and their status in table format.

      The Components table page opens with five tabs: Storage Drives, OS Drives, Ports, CPUs, and DIMMs. For more information about node details, see Node details.

    6. Go through the Recent events section to understand the error.
      Note: The recent events pane includes details of the last five events that occurred for the node.
    7. Click View all to go to the events page and view all recent events on the hardware component.

      The BMYxxx code and the error message inform you about the error and possible corrective actions.

    8. Go through the Details section to get more details of the node such as type, model, S/N, IPv6 address, rack, rack unit, firmware, architecture, CPU cores, frequency, memory, energy consumption, and temperature.
    9. To diagnose and take corrective action, try the following options:
    Switch in red or yellow: The component is in a failed (critical) or degraded state. Do the following steps to resolve the issue:
    1. Click the component that is shown in red or yellow.

      A slide-out pane is displayed with the hardware status, type, firmware, S/N, rack, and rack unit details.

    2. Go through the details in the slide-out pane. If you want more information about the switch, click View full details. Alternatively, go to the Network page > Switches tab and click the switch name link.

      The Network page opens. It includes the graphical view, recent events, and other details of the switch and its internal components.

    3. Hover over the graphical view of the switch to check its internal components and their status. To debug further, click the internal component that is shown in red. A new slide-out pane opens for ports with more details such as port, speed, and status.
    4. Click View table to check all internal components and their status in table format.

      The Components table page opens with three tabs: Ports, Fans, and Power supply. For network details, see Viewing network details.

    5. Go through the Recent events section to understand the error.
    6. Click View all to go to the events page and view all recent events on the hardware component.

      The BMYxxx code and the error message inform you about the error.

    7. Go through the Details section to get more details of the switch such as type, model, IBM S/N, manufacture S/N, rack unit, rack, and firmware.
    8. To diagnose and take corrective action, try the following options:
    Node in gray: The hardware is in a disabled or powered off state.

    To power on or power off the hardware, see the Node power operations section in Administering nodes and racks.

    Node in blue with diagonal stripes: An action is in progress on the hardware, such as power on, power off, or a firmware upgrade.
    To learn more about the actions on node hardware, see the following links:
    Switch in blue with diagonal stripes: A firmware upgrade is in progress. See Upgrading switch firmware.
    Note: Gray is not used for network switches because their only available states are failed, degraded, and normal.
  5. After all the errors and failures are fixed, return to the Overview dashboard page and confirm the health status of the nodes and switches that reported problems.
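
As a quick reference, the color legend from step 4 can be summarized as a lookup table. The following Python sketch is illustrative only: it does not call any IBM Storage Fusion API, and the names LegendEntry, COLOR_LEGEND, and describe are hypothetical. The "next step" text paraphrases the pointers given in this procedure.

```python
from typing import NamedTuple


class LegendEntry(NamedTuple):
    """One row of the rack-view color legend (hypothetical helper type)."""
    meaning: str
    next_step: str


_HEALTHY = LegendEntry("healthy or normal", "no action required")
_PROBLEM = LegendEntry(
    "component reports a problem",
    "click the component, then review its details and recent events",
)

# Keys are (hardware kind, color). Gray is intentionally absent for switches,
# because the procedure notes that gray is not used for network switches.
COLOR_LEGEND = {
    ("node", "green"): _HEALTHY,
    ("switch", "green"): _HEALTHY,
    ("node", "red"): _PROBLEM,
    ("node", "yellow"): _PROBLEM,
    ("switch", "red"): _PROBLEM,
    ("switch", "yellow"): _PROBLEM,
    ("node", "gray"): LegendEntry(
        "disabled or powered off",
        "see Node power operations in Administering nodes and racks",
    ),
    ("node", "blue-striped"): LegendEntry(
        "action in progress (power operation or firmware upgrade)",
        "see the topics on node actions",
    ),
    ("switch", "blue-striped"): LegendEntry(
        "firmware upgrade in progress",
        "see Upgrading switch firmware",
    ),
}


def describe(kind: str, color: str) -> LegendEntry:
    """Return the legend entry for a hardware kind and color, ignoring case."""
    key = (kind.lower(), color.lower())
    if key not in COLOR_LEGEND:
        raise ValueError(f"no legend entry for {kind!r} in {color!r}")
    return COLOR_LEGEND[key]
```

For example, describe("node", "gray") returns the entry for a disabled or powered off node, while describe("switch", "gray") raises ValueError because that combination does not appear in the legend.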