IBM Support

ServeRAID M5100, M1200, and M5200 Series: "Controller encountered a fatal error and was reset" Message When Running in iMR Mode

Troubleshooting


Problem

When running a ServeRAID M1200 Series SAS/SATA Controller, or when running a ServeRAID M5100 or M5200 Series SAS/SATA Controller with no Transportable Memory Module (TMM) attached, the controller may reset under heavy workload. After the reset, the MegaRAID Storage Manager event log shows the event "Controller encountered a fatal error and was reset".

Resolving The Problem

Source

RETAIN tip: H001154

Symptom

When running a ServeRAID M1200 Series SAS/SATA Controller, or when running a ServeRAID M5100 or M5200 Series SAS/SATA Controller with no Transportable Memory Module (TMM) attached, the controller may reset under heavy workload.

In the MegaRAID Storage Manager event log, the following event will appear after the reset.

  "Controller encountered a fatal error and was reset"

In the ServeRAID controller's firmware log, you will see the following event at the time of the controller reset.

  MonTask: line 280 in file ../../raid/1078int.c.

[4]: fp=c112e4b0, lr=c12d29a0 - oobCmdInfo+60
[5]: fp=c112e4c8, lr=c191913c - OOBExecCmdIssue+1ac
[6]: fp=c112e4f8, lr=c191a35c - OOBHandleCmdPacket+200
[7]: fp=c112e5a0, lr=c191b008 - OOB_RecvCallback+238

ServeRAID controller firmware logs can be pulled with the following StorCLI command. StorCLI is available from IBM Fix Central.

storcli /cX show termlog > termlog.txt

(where X = controller #)
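
Once the firmware log has been saved to a file, it can be searched for the reset signature quoted above. The following is a minimal sketch, not an IBM-supplied tool: the function name has_reset_signature is hypothetical, and the check assumes the log contains the MonTask line and at least one of the OOB stack frames exactly as they appear in this tip.

```shell
#!/bin/sh
# Sketch: check a saved termlog for the reset signature quoted in this tip.
# has_reset_signature is a hypothetical helper name, not part of StorCLI.

has_reset_signature() {
    # $1 = path to a termlog file saved with "storcli /cX show termlog".
    # Fixed-string match (-F) on the MonTask line logged at reset time.
    grep -qF 'MonTask: line 280 in file ../../raid/1078int.c' "$1" &&
    # The stack frames in this tip sit in out-of-band (OOB) command handlers.
    grep -qE 'OOBExecCmdIssue|OOBHandleCmdPacket|OOB_RecvCallback' "$1"
}

# Example usage (requires StorCLI and a controller at index 0):
#   storcli /c0 show termlog > termlog.txt
#   if has_reset_signature termlog.txt; then
#       echo "Reset signature from RETAIN tip H001154 found"
#   fi
```

The function only reads the saved file, so it can be run on a log collected from another system.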

Affected configurations

The system is configured with one or more of the following IBM Option part numbers:

  • IBM Flex System Storage Expansion Node, Option part number 68Y8588, any replacement part number
  • ServeRAID M1210e, any model
  • ServeRAID M1215 SAS/SATA Controller for IBM System x, Option part number 46C9114, any model
  • ServeRAID M5110 SAS/SATA Controller Card, Option part number 81Y4481, any replacement part number
  • ServeRAID M5110 SAS/SATA Controller for IBM System x (CTO), any replacement part number
  • ServeRAID M5110e SAS/SATA Controller for IBM System x, onboard, any embedded
  • ServeRAID M5115 SAS/SATA Controller, Option part number 90Y4390, any replacement part number
  • ServeRAID M5210 SAS/SATA Controller for IBM System x, Option part number 46C9110, any any
  • ServeRAID M5210e SAS/SATA Controller for IBM System x, Option part number 46C9117CTO, any any

This tip is not system specific.

This tip is not software specific.

Solution

This behavior has been corrected in M5100 Series SAS/SATA Controller firmware version 23.33.0-0033, and M5200/M1200 Series SAS/SATA Controller firmware version 24.9.0-0026.

The firmware update is or will be available by selecting the appropriate Product Group, type of System, Product name, Product machine type, and operating system on IBM Support's Fix Central web page at the following URL:

     http://www.ibm.com/support/fixcentral/
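
To decide whether a controller already carries the fix, its installed firmware package version can be compared with the levels listed above. The following is a minimal sketch under stated assumptions: the function names version_ge and firmware_has_fix are hypothetical, and the comparison assumes that firmware package versions within a series order numerically field by field (for example, 24.12.0-0019 is later than 24.9.0-0026).

```shell
#!/bin/sh
# Sketch: compare an installed firmware package version with the fixed levels
# named in this tip. Function names are hypothetical, not part of StorCLI.

# version_ge A B: succeed when version A is greater than or equal to B.
# Fields separated by "." or "-" are compared as decimal numbers, left to right.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, /[.-]/)
        nb = split(b, y, /[.-]/)
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            # Missing fields count as 0; adding 0 forces a numeric comparison.
            xi = x[i] + 0; yi = y[i] + 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# firmware_has_fix SERIES VERSION: SERIES is M5100, M5200, or M1200.
firmware_has_fix() {
    case "$1" in
        M5100)       version_ge "$2" 23.33.0-0033 ;;
        M5200|M1200) version_ge "$2" 24.9.0-0026 ;;
        *)           echo "unknown series: $1" >&2; return 2 ;;
    esac
}

# Example usage, with the version string copied by hand from the controller:
#   if firmware_has_fix M5200 24.7.0-0026; then
#       echo "fix present"
#   else
#       echo "update required"
#   fi
```

The installed version is passed in as a plain string rather than parsed from StorCLI output, because the exact output layout can differ between StorCLI releases.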

Workaround

Do not replace any hardware. Update the ServeRAID controller to the latest firmware level.

Additional information

When the ServeRAID SAS/SATA controllers are running in Integrated MegaRAID (iMR) mode, with no Transportable Memory Module (TMM), a certain amount of system memory is allocated to the controller so that out-of-band commands can be performed.

Previous firmware levels had no check in place to verify that buffer overruns did not occur within this memory pool.

Document Location

Worldwide

Operating System

System x Hardware Options:Operating system independent / None


Document Information

Modified date:
30 January 2019

UID

ibm1MIGR-5098682