Troubleshooting
Problem
A 'Fatal Error Handling' message is reported during Power-On Self-Test (POST), and the system restarts continuously, when the system has this configuration: a Mellanox 10 Gb Ethernet or Mellanox 40 Gb Ethernet adapter is installed; the adapter is in a Peripheral Component Interconnect Express (PCIe) slot driven by the second Central Processing Unit (CPU) (Slot 2 on the system board); and 64-bit Peripheral Component Interconnect (PCI) Resource Allocation is enabled.
Resolving The Problem
Source
RETAIN tip: H21975
Symptom
A 'Fatal Error Handling' message is reported during Power-On Self-Test (POST), and the system restarts continuously, if the system has this configuration:
- A Mellanox 10 Gb Ethernet or Mellanox 40 Gb Ethernet adapter is installed.
- The adapter is installed in a Peripheral Component Interconnect Express (PCIe) slot driven by the second Central Processing Unit (CPU) (Slot 2 on the system board).
- 64-bit Peripheral Component Interconnect (PCI) Resource Allocation is enabled.
Affected configurations
The system can be any of the following IBM servers:
- iDataPlex DWC dx360 M4 2U chassis, type 7919, any model
- iDataPlex DWC dx360 M4 server, type 7918, any model
- iDataPlex dx360 M4 2U chassis, type 7913, any model
- iDataPlex dx360 M4 server, type 7912, any model
The system is configured with one or more of the following IBM options:
- Mellanox ConnectX-3 10 Gigabit Ethernet Adapter for IBM System x, option part number 00D9690, any replacement part number
This tip is not software specific.
The following system BIOS/uEFI level is affected: Build ID: TDE134E
The system is configured with two processors.
The system has the symptom described above.
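The conditions above can be expressed as a small check. This is only a sketch: the function name `is_affected_config` and its three arguments are hypothetical, the caller is assumed to already know the machine type, UEFI build ID, and processor count, and the script does not verify that a Mellanox adapter is installed or that 64-bit PCI Resource Allocation is enabled.

```shell
#!/bin/sh
# Sketch: test whether a system matches the machine types, UEFI build,
# and processor count listed under "Affected configurations".
# is_affected_config is a hypothetical helper, not part of any IBM tool.

# Usage: is_affected_config MACHINE_TYPE UEFI_BUILD_ID CPU_COUNT
# Returns 0 when every listed condition matches, 1 otherwise.
is_affected_config() {
    machine_type=$1
    build_id=$2
    cpu_count=$3

    # Machine types named in this tip.
    case "$machine_type" in
        7912|7913|7918|7919) ;;
        *) return 1 ;;
    esac

    # Only UEFI build TDE134E is listed as affected.
    [ "$build_id" = "TDE134E" ] || return 1

    # The tip applies to systems configured with two processors.
    [ "$cpu_count" -eq 2 ] || return 1

    return 0
}

# Example call with illustrative values.
if is_affected_config 7912 TDE134E 2; then
    echo "matches the affected configuration"
else
    echo "does not match the affected configuration"
fi
```

A match means only that the hardware and firmware level are in scope; the symptom still depends on the adapter placement and the 64-bit PCI Resource Allocation setting described above.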
Solution
This is fixed in Unified Extensible Firmware Interface (UEFI) build 35O and Integrated Management Module (IMM) build 45Z.
The file is or will be available by selecting the appropriate Product Group, type of System, Product name, Product machine type, and operating system on IBM Support's Fix Central web page, at the following URL:
    http://www.ibm.com/support/fixcentral/
Workaround
Disable UEFI Option ROM execution on the Mellanox Ethernet adapters by using one of these two methods:
- Using the UEFI F1 Setup Utility
  - Choose: System Settings -> Devices and I/O Ports -> Enable / Disable Adapter Option ROM Support.
  - Select the UEFI Option ROM setting for each slot that holds an adapter and set it to Disable.
- Using the Advanced Settings Utility (ASU) application
  - Issue this command:
    ./asu64 set DevicesandIOPorts.SLOTXUEFIOPROM "Disable" --host <IMM IP> --user <USERNAME> --password <PASSW0RD>
  Where X is the slot number of the adapter.
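When more than one slot holds a Mellanox adapter, the ASU command has to be repeated with a different X each time. The sketch below generates one command line per slot from the template in this tip. It prints the commands instead of running them so they can be reviewed first. The variable names `IMM_HOST`, `IMM_USER`, `IMM_PASS`, and `SLOTS`, the helper `build_asu_cmd`, and the default values are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: print the asu64 command from this tip for each listed slot.
# All variable names and default values here are illustrative placeholders.

: "${IMM_HOST:=192.0.2.10}"   # IMM address (documentation-range example)
: "${IMM_USER:=username}"     # IMM user name placeholder
: "${IMM_PASS:=password}"     # IMM password placeholder
: "${SLOTS:=2}"               # space-separated slot numbers to process

# build_asu_cmd SLOT
# Writes the command line for one slot to standard output.
build_asu_cmd() {
    slot=$1
    printf './asu64 set DevicesandIOPorts.SLOT%sUEFIOPROM "Disable" --host %s --user %s --password %s\n' \
        "$slot" "$IMM_HOST" "$IMM_USER" "$IMM_PASS"
}

# Print one command per slot; nothing is executed against the IMM.
for slot in $SLOTS; do
    build_asu_cmd "$slot"
done
```

For example, running the script with `SLOTS="2 3"` prints two command lines, one for `SLOT2UEFIOPROM` and one for `SLOT3UEFIOPROM`. Because the password appears in the printed output, treat that output as sensitive.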
Additional information
64-bit PCI Resource Allocation is incorrectly handled for slots managed by the second processor. This results in the UEFI driver for the Mellanox 10 Gb Ethernet or Mellanox 40 Gb Ethernet adapters failing to properly address memory for the adapter. Disabling the UEFI Option ROM execution for Mellanox Ethernet adapters in those slots prevents the UEFI driver from executing, and avoids the allocation error.
64-bit PCI Resource Allocation is disabled by default and is needed only if the system is configured with adapters such as NVIDIA graphics processing units (GPUs) or Intel Xeon Phi coprocessors.
Document Location
Worldwide
Document Information
Modified date:
30 January 2019
UID
ibm1MIGR-5093827