Troubleshooting
Problem
When the setup menu option 'Live Error Recovery' (LER) is changed from the default to disabled and an uncorrectable Peripheral Component Interconnect (PCI) error occurs, the operating system freezes (locks up, hangs) for a period of time and eventually restarts, and the system does not report or log an issue. This is not the intended behavior for this type of incident when the setting is disabled.
Resolving The Problem
Source
RETAIN tip: H212079
Symptom
When the setup menu option 'Live Error Recovery' (LER) is changed from the default to disabled and an uncorrectable Peripheral Component Interconnect (PCI) error occurs:
- The operating system freezes (locks up, hangs) for a period of time.
- The operating system eventually restarts.
- The system does not report or log an issue.
This is not the intended behavior for this type of incident when the setting is disabled.
Affected configurations
The system can be any of the following IBM servers:
- System x3850 X6, type 3837 (4-socket, 3 year warranty), any model
- System x3850 X6, type 3839 (4-socket, 4 year warranty), any model
This tip is not software specific.
This tip is not option specific.
The following system BIOS or UEFI levels are affected:
- Build ID: A8E103SUS
The system has the symptom described above.
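As a rough aid for checking a server against the affected level above, the following is a minimal Python sketch. It assumes a Linux host where the firmware version string is exposed at /sys/class/dmi/id/bios_version, and it assumes that this string contains the Build ID somewhere within it; neither assumption comes from this tip. The function names are hypothetical and used for illustration only.

```python
# Build ID listed as affected in this tip.
AFFECTED_BUILD_IDS = {"A8E103SUS"}


def is_affected_build(firmware_version: str) -> bool:
    """Return True if the version string contains an affected Build ID.

    Assumption: the Build ID appears as a substring of the reported
    firmware version string (the exact format is not stated in this tip).
    """
    normalized = firmware_version.upper()
    return any(build_id in normalized for build_id in AFFECTED_BUILD_IDS)


def read_firmware_version(path: str = "/sys/class/dmi/id/bios_version") -> str:
    """Read the firmware version string from Linux sysfs.

    Assumption: the host runs Linux and exposes DMI data at this path.
    """
    with open(path, encoding="ascii", errors="replace") as handle:
        return handle.read().strip()
```

On a Linux host, calling is_affected_build(read_firmware_version()) would indicate whether the reported level matches the Build ID listed above. The UEFI F1 Setup menu remains the authoritative place to confirm the installed level.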
Solution
This behavior has been corrected in a current UEFI firmware release, ibm_fw_uefi_a8e108m-1.00_anyos_32-64.
The file is available by selecting the appropriate Product Group, type of System, Product name, Product machine type, and Operating system on IBM Support's Fix Central web page, at the following URL:
http://www.ibm.com/support/fixcentral/
Workaround
IBM strongly advises against disabling the LER setting. To re-enable the setting, perform the following steps:
- Press F1 during Power On Self-Test (POST) to enter the UEFI F1 Setup menu.
- Select System Settings -> Recovery and RAS -> Live Error Recovery -> Enable.
- Select Save settings.
- Exit the UEFI F1 Setup menu.
- Use the system with this option enabled.
Additional information
There is no reason to disable the LER setting.
This error occurs as a result of the combination of a disabled LER setting and a code defect. When LER is disabled, the card error crashes the system before an interrupt can be generated. This is not the intended behavior when this type of fault occurs with LER disabled.
Live Error Recovery is enabled by default. When enabled, it
automatically disables the faulty Peripheral Component Interconnect
Express (PCIe) card and sends an interrupt to the UEFI error
handler. The handler will cause a 'blue screen' (Microsoft Windows
critical error) and a graceful restart, which is the expected
behavior.
Document Location
Worldwide
Document Information
Modified date:
30 January 2019
UID
ibm1MIGR-5094752