Monitoring nodes and switches

Use this procedure to monitor nodes and switches from the Overview dashboard page.

Procedure

  1. In the IBM Storage Fusion HCI System user interface, go to Infrastructure > Overview.
    The Overview dashboard page is displayed. For details about the overview page, see Monitoring hardware from IBM Storage Fusion HCI System user interface.
  2. Go through the Resource summary section to monitor the health of the nodes and switches.
  3. Hover over the graphical view of the hardware to identify units in the rack. For a node, it shows the name of the hardware, the health status, the type of node, and the rack unit. For a switch, it shows the name of the hardware, the health status, the type of switch, and the rack unit.
  4. If one or more nodes or switches are in an error, degraded, or disabled state, take appropriate action on the hardware component.
    In the rack picture, the color of the node or switch indicates the health and status.
    Option Description
    Node or switch in Green color It indicates that the status is healthy or normal, and no action is required.
    Node in Red or Yellow color It indicates that the component failed and is in a critical state. Do the following steps to resolve the issue:
    1. Click the component that is shown in red.

      A slide-out window displays the hardware status, type, firmware, S/N, rack, rack unit, and recent events. For example, Recent events can indicate Drive offline.

    2. Go through the details in the slide-out window. If you want more information about the node, click View full details.

      The Nodes page opens with the node and its internal component details. It displays Drives, Ports, CPUs, and DIMMs in four tabs, along with any recent events generated for the node. For more information about node details, see Node details.

    3. Go through the Recent events section to understand the error.
      Note: The recent events pane shows events for the node from the last 24 hours only.
    4. Click View all to go to the events page and view all recent events on the hardware component.

      The BMYxxx code and the error message inform you about the error and possible corrective actions.

    5. To diagnose and take corrective action, try the following options:
    Switch in Red or Yellow color It indicates that the component failed and is in a critical state. Do the following steps to resolve the issue:
    1. Click the component that is shown in red.

      The slide-out pane displays the hardware status, type, firmware, S/N, rack, rack unit, and recent events. For example, Recent events can indicate Port failure, Drive offline, and so on.

    2. Go through the details in the slide-out pane. If you want more information about the switch, click View full details. Alternatively, go to the Network page > Switches tab and click the switch name link.

      It displays the Status overview and Ports of the switch along with any recent events generated for the switch.

      Note: The recent events pane shows events for the switch from the last 24 hours only.

      For network details, see Viewing network details.

    3. Go through the Recent events section to understand the error.
    4. Click View all to go to the events page and view all recent events on the hardware component.

      The BMYxxx code and the error message inform you about the error.

    5. To diagnose and take corrective action, try the following options:
    Node in Gray color It indicates that the hardware is in a disabled or powered off state.

    To power on or power off the hardware, see the Node power operations section in Administering nodes and racks.

    Node in blue with diagonal stripes It indicates that an action is in progress on the hardware, such as power on, power down, or a firmware upgrade.
    To learn more about the actions on node hardware, see the following links:
    Switch in blue with diagonal stripes It indicates that a firmware upgrade is in progress. See Upgrading switch firmware.
    Note: Gray is not used for network switches because the only available switch states are failed, degraded, and normal.
  5. After all the errors and failures are fixed, return to the Overview dashboard page and verify the health status of the affected nodes and switches.
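
The color legend in step 4 can be summarized as a small lookup table. The following Python sketch is illustrative only: it paraphrases the Option/Description table, and every name in it (LegendEntry, LEGEND, lookup, needs_attention) is hypothetical rather than part of any IBM Storage Fusion HCI System interface or API.

```python
from typing import NamedTuple, Optional


class LegendEntry(NamedTuple):
    meaning: str
    next_step: Optional[str]  # None: the table names no follow-up step


# Paraphrase of the Option/Description table in step 4 (illustrative only).
_FAILED = LegendEntry(
    "component failed, critical state",
    "click the component, review Recent events, then View all",
)
LEGEND = {
    "green": LegendEntry("healthy or normal", None),
    "red": _FAILED,
    "yellow": _FAILED,
    "gray": LegendEntry("disabled or powered off", "see Node power operations"),
    "blue-striped": LegendEntry("action in progress", None),
}


def lookup(kind: str, color: str) -> LegendEntry:
    """Return the legend entry for a hardware kind and rack-picture color."""
    if kind not in ("node", "switch"):
        raise ValueError(f"unknown hardware kind: {kind!r}")
    if color not in LEGEND:
        raise ValueError(f"unknown color: {color!r}")
    if kind == "switch" and color == "gray":
        # Per the note in step 4, gray is not a switch state.
        raise ValueError("gray does not apply to switches")
    return LEGEND[color]


def needs_attention(kind: str, color: str) -> bool:
    """True when the legend names a follow-up step for this color."""
    return lookup(kind, color).next_step is not None
```

With this sketch, needs_attention("node", "red") is True, needs_attention("switch", "green") is False, and lookup("switch", "gray") raises ValueError, which mirrors the note that gray is not used for switches.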