IBM Support

NMI error occurs after commands run in Linux operating system - Servers

Troubleshooting


Problem

NMI error occurs after commands run in Linux operating system - Servers

Resolving The Problem

Source

RETAIN tip: H204966

Symptom

After running a "cat" command to read the contents of the ServeRAID controller driver, copying files from one folder to another, or both, the server may hang and the following messages are found in the IBM server Integrated Management Module (IMM) log:

  A software NMI has occurred on system "SN# XXXXXXX"
  Fault in slot "All PCI Error" on system "SN# XXXXXXX"
  Fault in slot "PCI 5" on system "SN# XXXXXXX"
  A Uncorrectable Bus Error has occurred on system "SN# XXXXXXX"

A hard boot of the server is required to recover.
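
When many servers need to be checked, the log messages above can be searched for in saved log text. The following is a minimal sketch, assuming the IMM log has already been exported to plain text by some other means; the tip does not describe an export format, and the names SIGNATURE, matching_lines, and has_full_signature are hypothetical helpers introduced only for illustration.

```python
# Phrases taken from the IMM log messages quoted in this tip.
# The serial number and slot name vary, so only the fixed text is matched.
SIGNATURE = (
    "A software NMI has occurred",
    "Fault in slot",
    "Uncorrectable Bus Error has occurred",
)


def matching_lines(log_text):
    """Return the log lines that contain any of the signature phrases."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(phrase in line for phrase in SIGNATURE)
    ]


def has_full_signature(log_text):
    """Return True only if every signature phrase appears somewhere in the log."""
    return all(phrase in log_text for phrase in SIGNATURE)
```

A match only indicates that the log resembles the symptom described here; the same messages may be logged for unrelated PCI faults.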

For the ServeRAID M5014 and M5015 SAS/SATA Controllers, this issue is found with firmware version 12.12.0-0056 or earlier.

For the ServeRAID M1015 SAS/SATA Controller, this issue is found with firmware version 20.10.1-0052 or earlier.
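
The version thresholds above can be expressed as a simple comparison. The following is a minimal sketch, assuming that firmware version strings such as 12.12.0-0056 compare field by field as integers and that the installed version has already been read from the controller by some other means; the names LAST_AFFECTED, parse_version, and is_affected are hypothetical and not part of any IBM or LSI tool.

```python
import re

# Highest firmware level this tip lists as affected, per controller model.
LAST_AFFECTED = {
    "M5014": "12.12.0-0056",
    "M5015": "12.12.0-0056",
    "M1015": "20.10.1-0052",
}


def parse_version(version):
    """Split a string like '12.12.0-0056' into (12, 12, 0, 56)."""
    return tuple(int(part) for part in re.split(r"[.-]", version.strip()))


def is_affected(model, installed_version):
    """Return True if the installed version is at or below the last affected level.

    Assumption: numeric, field-by-field ordering matches the release order.
    """
    return parse_version(installed_version) <= parse_version(LAST_AFFECTED[model])


print(is_affected("M5015", "12.12.0-0056"))  # True
print(is_affected("M5015", "12.12.0-0057"))  # False
```

Comparing parsed integer fields rather than raw strings avoids ordering mistakes such as treating "20.5" as newer than "20.10".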

Affected configurations

The system may be any of the following IBM servers:

  • System x3400 M3, type 7378, any model
  • System x3400 M3, type 7379, any model
  • System x3500 M3, type 7380, any model
  • System x3500 M4, type 7383, any model
  • System x3550 M3, type 4254, any model
  • System x3550 M4, type 7914, any model
  • System x3650 M3, type 7945, any model
  • System x3650 M4, type 7915, any model

The system is configured with at least one of the following:

  • Red Hat Enterprise Linux 4, any Update
  • Red Hat Enterprise Linux 5, any Update
  • Red Hat Enterprise Linux 6, any Update
  • SUSE Linux Enterprise Server 10, any Service Pack
  • SUSE Linux Enterprise Server 11, any Service Pack

The system is configured with one or more of the following IBM Options:

  • ServeRAID M1015 SAS/SATA Controller, Option part number 46M0831, replacement part number (CRU) 46M0861
  • ServeRAID M5014 SAS/SATA Controller, Option part number 46M0916, replacement part number (CRU) 46M0918
  • ServeRAID M5015 SAS/SATA Controller, Option part number 46M0829, replacement part number (CRU) 46M0851

This tip is not system specific.

The LSI firmware for the ServeRAID M1015 and M5000 series Controllers is affected.

Note: This does not imply that the network operating system will work under all combinations of hardware and software.

Please see the compatibility page for more information: http://www.ibm.com/systems/info/x86servers/serverproven/compat/us/

Solution

This behavior is corrected in firmware releases later than 12.12.0-0056 for the ServeRAID M5014 and M5015 Controllers.

This behavior is corrected in firmware releases later than 20.10.1-0052 for the ServeRAID M1015 Controller.

The files are available at the following URL:

Workaround

Avoid running the commands that trigger this issue.

Additional information

A firmware defect causes the operating system to kernel panic and the system to crash.

The issue occurs because the firmware takes too long to finish processing the commands it receives.

New firmware has been released to correct this behavior.

Document Location

Worldwide

Operating System

System x:Red Hat Enterprise Linux 3

System x:Red Hat Enterprise Linux 4

System x:Red Hat Enterprise Linux 5

System x:SUSE Linux Enterprise Server 10

System x:SUSE Linux Enterprise Server 11

System x:Red Hat Enterprise Linux 6

Lenovo x86 servers:Red Hat Enterprise Linux 6

Lenovo x86 servers:SUSE Linux Enterprise Server 10

Lenovo x86 servers:SUSE Linux Enterprise Server 11

Lenovo x86 servers:Red Hat Enterprise Linux 5


Document Information

Modified date:
30 January 2019

UID

ibm1MIGR-5089254