Receiving and responding to TS7620 Appliance Express ProtecTIER V3.4.1 hardware alerts

Upon receiving an alert, you must identify which component generated the alert (if it is not readily apparent). Analyze the cause and severity of the fault, and decide on a course of corrective action. Methods for doing so are described in the sections that follow.

ProtecTIER Manager Hardware Resources window

About this task

The Hardware resources window provides information such as component status, FRU ID (part number), and resource fault details for all of the CRU components and many of the FRU components.

When ProtecTIER Manager communicates with the server on which hardware faults occurred, the information that is displayed in the Hardware resources window helps identify, diagnose, and resolve the problem.

If ProtecTIER Manager is unavailable, and your system is configured for email alerts or SNMP traps, refer to the hardware fault alert and resolution information that is provided in those resources. For more information, see Email alerts and Using SNMP traps for the TS7650G (Gateway), ProtecTIER V3.4.1.

Use this procedure to access the Hardware resources window.

Procedure

  1. If ProtecTIER Manager is not running, use one of the following options to start it:
    • On a PC with a Windows operating system, click: Start > All Programs > IBM > ProtecTIER Manager > IBM ProtecTIER Manager.
    • On a PC with a Linux operating system, double-click the ProtecTIER Manager icon on the Linux Desktop.

    The ProtecTIER Manager window opens.

  2. Log in to the system that includes the TS7620 Appliance Express ProtecTIER server (node) with the faulty component.
    Note: If you are unsure which system contains the faulty node, log in to each system in turn. When a system name appears in red text in the Systems list, that is the faulty system.
    1. In the left-side navigation pane of the ProtecTIER Manager, click the applicable system.

      The Login to system dialog box opens.

    2. Click Login.
    3. In the Username field, type: ptadmin
    4. In the Password field, type: ptadmin
    5. Select the Save password checkbox, click Ok, and then wait while ProtecTIER Manager saves your information and logs you in to the system.
      The following message appears:
      Figure 1. Configuration wizard reminder
    6. Click Ignore to close the message.
  3. In the Nodes section of the Systems Management pane, click the TS7620 Appliance Express server on which a fault occurred. Nodes with faults appear in red in the list, as shown:
    Figure 2. Example of faulty nodes displayed in red on a ProtecTIER server

    The ProtecTIER Manager window refreshes and changes to Nodes view, with information for the selected server displayed.

  4. In the Hardware resources window, in the middle pane of ProtecTIER Manager, click the tab for the component about which you need information.

    You might see a degraded icon or a failed icon if the selected component is degraded, is in a failed state, or has another type of fault.

    In the next example, the disk drive component was selected and disk drive 12 has a small blue information icon. To see details of the information warning, scroll to the pane on the right.

    ProtecTIER Manager shows disk drive 12 is rebuilding.
    Figure 3. Hardware resources window, disk drive 12 rebuilding

    The following example shows the disk drive component where all the disk drives are working.

    Figure 4. Hardware resources window
  5. In the Resource pane, on the right side of the window:
    1. Note the name of the component that generated the alert.
    2. Read the problem description to determine what caused the alert.
    3. Under Properties, note the FRU ID.
      Note: The part number for both CRU and FRU components is expressed as a FRU ID.
    4. Under Resource fault, read the information in the Associated Messages area, and follow any instructions.
  6. When you are finished reviewing fault and component information, use one of the following methods to exit the ProtecTIER Manager:
    • Click File > Exit.

      OR

    • Click the X in the upper-right corner of the window.
  7. Verify that the alert received was not caused by an easily resolved condition, such as a loose power cord or a defective cable.
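
The details gathered in step 5 can be kept in a small structured record so that they are at hand when you order a replacement part or contact support. The following Python sketch is illustrative only and is not part of ProtecTIER Manager. The FaultNote class, its field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FaultNote:
    """Hypothetical record of the details that step 5 asks you to note."""
    component: str    # name of the component that generated the alert
    description: str  # problem description shown in the Resource pane
    fru_id: str       # FRU ID (part number) shown under Properties
    messages: List[str] = field(default_factory=list)  # Associated Messages text

    def summary(self) -> str:
        """Return a one-line summary suitable for a ticket or log entry."""
        return f"{self.component} (FRU ID {self.fru_id}): {self.description}"


# Sample values are placeholders, not real part numbers or messages.
note = FaultNote(
    component="Disk drive 12",
    description="Rebuilding",
    fru_id="EXAMPLE-FRU-ID",
    messages=["Example associated message"],
)
print(note.summary())
```

Keeping the FRU ID as a plain string avoids any assumption about the part number format.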

ProtecTIER Manager Hardware Faults window

About this task

The Hardware faults window provides information on all of the hardware faults currently in effect for the specified server. When ProtecTIER Manager is able to communicate with the server on which a hardware fault occurred, the information in the Hardware faults window might help you to identify, diagnose, and resolve the problem. If ProtecTIER Manager is unavailable, and your system is configured for email alerts or SNMP traps, refer to the hardware fault alert and resolution information that is provided in those resources.

Use the following procedure to access the Hardware faults window.

Procedure

  1. If it is not already running, start ProtecTIER Manager as described in step 1 of ProtecTIER Manager Hardware Resources window.
  2. Log in to the system that includes the TS7620 Appliance Express server (node) with the faulty component, as described in step 2 of ProtecTIER Manager Hardware Resources window.
  3. Select the server (node) on which the fault occurred, as described in step 3 of ProtecTIER Manager Hardware Resources window.
  4. Use one of the following methods to show the hardware faults:
    • Click Node > Hardware > Show Hardware Faults.
    • If there is a current hardware fault, click Recheck faults at the lower right of the ProtecTIER Manager window.
    The Recheck Faults window opens, with one or more fault messages on display in the Associated messages column:
    Figure 5. Recheck faults window
  5. Review the displayed information, and make note of the defective component's name (or type) and FRU ID.
    Note: The part number for both CRU and FRU components is expressed as a FRU ID.
  6. Verify that the alert received was not caused by an easily resolved condition, such as a loose power cord or a defective cable.
  7. Refer to 4

Email alerts

About this task

If a hardware degradation or failure occurs, systems that are configured to use email alerts send a problem report message to one or more designated recipients. Email alerts notify you of hardware faults even if you do not have access to ProtecTIER Manager. The problem report is similar to the one shown in Figure 6:
Figure 6. Email alert

Procedure

  1. Upon receiving an email alert:
    1. Open the message and review the information in the report.
    2. Make note of the defective component's name (or type) and FRU ID.
      Note: The part number for both CRU and FRU components is expressed as a FRU ID.
  2. Verify that the alert you received was not caused by an easily resolved condition, such as a loose power cord or a defective cable.
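
If alert messages are saved as plain text, noting the FRU ID can be partly automated. The following Python sketch is an illustration under stated assumptions, not a description of the actual report format: it assumes that the message body contains a line of the form "FRU ID: value", which might not match the layout shown in Figure 6. The extract_fru_id function, the pattern, and the sample message are hypothetical.

```python
import re
from email import message_from_string
from typing import Optional

# Assumed layout: a line such as "FRU ID: <value>". The real report may differ.
FRU_PATTERN = re.compile(r"FRU ID\s*[:=]\s*(\S+)", re.IGNORECASE)


def extract_fru_id(raw_message: str) -> Optional[str]:
    """Return the first FRU ID found in a plain-text alert message, or None."""
    msg = message_from_string(raw_message)
    body = msg.get_payload()
    if not isinstance(body, str):  # multipart messages are not handled here
        return None
    match = FRU_PATTERN.search(body)
    return match.group(1) if match else None


# Made-up sample message; addresses and values are placeholders.
sample = (
    "From: alerts@example.com\n"
    "Subject: Hardware problem report (sample)\n"
    "\n"
    "Component: Disk drive 12\n"
    "FRU ID: EXAMPLE-FRU-ID\n"
)
print(extract_fru_id(sample))
```

Adjust the pattern to the labels that appear in the problem reports your system actually sends, and always confirm the extracted value against the original message.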