SIP3295

Use this procedure to resolve the following problem: Adapter exceeded the maximum operating temperature (SRC xxxx4080).

Procedure

  1. Is the adapter a Non-Volatile Memory Express (NVMe) device?
    • Yes: Continue with the next step.
    • No: Continue with step 3.
  2. The NVMe device has exceeded its maximum normal operating temperature. The device continues to run unless the temperature rises further, to the point where errors or hardware failures occur. The NVMe device itself is not likely to be the cause of the over-temperature condition.
    To view the temperature information, complete the following steps:
    1. If the system has logical partitions, complete this procedure from the logical partition that reported the problem.
    2. Sign on to an IBM® i session with the security officer (QSECOFR) user profile.
    3. To create an NVMe device report in a spool file, type the following command at the command line of the IBM i operating system and press Enter.
      CALL PGM(QSMGSSTD) PARM('NVMEGAUGE' X'00000009' 'SSTD0100' X'00000000')
    4. To display the contents of the spool file, type wrksplf at the command line of the IBM i operating system and press Enter. The spool file contains a report for the NVMe devices.
    5. Continue with step 4 to determine the possible cause and to take necessary action to prevent the NVMe device from exceeding the maximum operating temperature.
  3. The storage controller chip has exceeded its maximum normal operating temperature. The adapter continues to run unless the temperature rises further, to the point where errors or hardware failures occur. The adapter itself is not likely to be the cause of the over-temperature condition.
    To view the temperature information, complete the following steps:
    1. Access SST/DST by doing one of the following:
    2. Access the product activity log and display the SRC that sent you here. Press the F4 key to view the temperature information in Additional Information. The Detail Data section contains the Current Temperature and the Maximum Operating Temperature at the time the error was logged. Both values are in degrees Celsius and in decimal notation.
    3. Continue with the next step to determine the possible cause and to take necessary action to prevent the adapter from exceeding the maximum operating temperature.
  4. Determine which of the following conditions is causing the adapter to exceed the maximum operating temperature, and take the appropriate action for that condition. If the action does not correct the error, contact your next level of support for assistance.

    The possible causes are:

    • The adapter is installed in an unsupported system. For information about which systems support the adapter, see Adapter information by feature code.
    • The adapter is installed in an unsupported slot location within the system unit or I/O enclosure. For information about supported slot locations, see PCI adapter placement information for the machine type model (MTM) where the adapter is located.
    • The cooling of the adapter is impaired. Ensure that no issues, such as fan failures or airflow obstructions, are affecting proper cooling of the adapter.
    Note: The adapter that is logging this error continues to log it while the adapter remains above the maximum operating temperature, and logs it again each time the adapter exceeds the maximum operating temperature.
    This ends the procedure.
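
The comparison described in step 3 (Current Temperature against Maximum Operating Temperature, both in decimal degrees Celsius) can be illustrated with a short sketch. This is not part of the IBM procedure or of any IBM tool. The class name, the field names, and the sample values are hypothetical, and the readings are assumed to be copied by hand from the Detail Data section of the product activity log entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemperatureReading:
    """Two values read manually from the Detail Data section (hypothetical names)."""

    current_c: int          # Current Temperature, decimal degrees Celsius
    max_operating_c: int    # Maximum Operating Temperature, decimal degrees Celsius

    @property
    def excess_c(self) -> int:
        """Degrees Celsius above the limit; zero or negative means within the limit."""
        return self.current_c - self.max_operating_c

    @property
    def over_limit(self) -> bool:
        """True only when the current temperature is strictly above the maximum."""
        return self.excess_c > 0


def summarize(reading: TemperatureReading) -> str:
    """Return a one-line summary of a logged reading."""
    if reading.over_limit:
        return (
            f"Over limit by {reading.excess_c} C "
            f"({reading.current_c} C logged, {reading.max_operating_c} C maximum)"
        )
    return (
        f"Within limit with {-reading.excess_c} C of margin "
        f"({reading.current_c} C logged, {reading.max_operating_c} C maximum)"
    )


if __name__ == "__main__":
    # Sample values are invented for illustration only.
    print(summarize(TemperatureReading(current_c=97, max_operating_c=95)))
```

A large excess may suggest checking cooling first (fans, airflow obstructions), while a small excess in a correctly cooled system may point toward slot placement or system support; either way, the conditions listed in step 4 remain the authoritative checklist.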