Resolving a system firmware boot failure
Learn how to identify the service action that is needed to resolve a failure while booting your system firmware.
- After you pressed the power button, did the system turn on but fail to display the Petitboot menu?
  Yes: Continue with the next step.
  No: Continue with step 5.
- Does the baseboard management controller (BMC) respond to commands?
  Note: To determine whether the BMC responds to commands, run the following ipmitool command:
  ipmitool -I lanplus -U <username> -P <password> -H <bmc ip or bmc hostname> chassis status
  Yes: Continue with the next step.
  No: Continue with step 4.
- Complete the following actions:
  - Use the BMC to update the system firmware. For instructions, see Updating the system firmware by using the BMC.
  - Check the system event logs. For instructions, see Identifying a service action by using system event logs. Then, continue with step 5.
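The BMC responsiveness check described above (the chassis status command) can be wrapped in a small helper. This is a minimal sketch, not part of the documented procedure: the function names and the IPMITOOL override variable are hypothetical, and it assumes that ipmitool exits with a non-zero status when it cannot reach or authenticate with the BMC.

```shell
# Hypothetical override so the helper can be exercised without hardware;
# defaults to the real ipmitool binary.
IPMITOOL="${IPMITOOL:-ipmitool}"

# bmc_responds <host> <username> <password>
# Succeeds (exit 0) when the chassis status command succeeds.
# Assumption: ipmitool exits non-zero when the BMC does not respond.
bmc_responds() {
  local host="$1" user="$2" pass="$3"
  "$IPMITOOL" -I lanplus -U "$user" -P "$pass" -H "$host" chassis status >/dev/null 2>&1
}

# next_step_after_bmc_check <host> <username> <password>
# Prints which branch of the procedure applies.
next_step_after_bmc_check() {
  if bmc_responds "$@"; then
    echo "Continue with the next step"
  else
    echo "Continue with step 4"
  fi
}
```

For example, `next_step_after_bmc_check <bmc ip or bmc hostname> <username> <password>` prints the branch to follow.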
- Complete the following actions, one at a time, until the problem is resolved:
  - Reset the BMC remotely by entering the following command:
    ipmitool -I lanplus -U <username> -P <password> -H <bmc ip or bmc hostname> mc reset cold
  - Disconnect the power cords from the system for 30 seconds. Reconnect the power cords, wait 5 minutes, and then go to step 2.
  - Use the IPMI tool to update the system firmware. For instructions, see Updating the system firmware by using the IPMI tool.
  - Complete the service action that is indicated for your system:
    - If your system is an 8335-GCA or 8335-GTA, replace the system backplane. Go to 8335-GCA and 8335-GTA locations to identify the physical location and the removal and replacement procedure.
    - If your system is an 8335-GTB, replace the BMC card. Go to 8335-GTB locations to identify the physical location and the removal and replacement procedure.
    - If your system is an 8348-21C, replace the system backplane. Go to 8348-21C locations to identify the physical location and the removal and replacement procedure.
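After a BMC reset or a power cycle, the BMC needs time before it answers again, which is why the procedure returns to the BMC check. The following sketch polls the chassis status command until the BMC responds or a retry budget runs out. The function name, the IPMITOOL override variable, and the default of 30 attempts at 10-second intervals (about 5 minutes) are assumptions for illustration, not values taken from the procedure.

```shell
# Hypothetical override so the helper can be exercised without hardware.
IPMITOOL="${IPMITOOL:-ipmitool}"

# wait_for_bmc <host> <username> <password> [attempts] [delay_seconds]
# Polls "chassis status" until it succeeds or the attempts are used up.
# Assumption: ipmitool exits non-zero while the BMC is unreachable.
wait_for_bmc() {
  local host="$1" user="$2" pass="$3"
  local attempts="${4:-30}" delay="${5:-10}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$IPMITOOL" -I lanplus -U "$user" -P "$pass" -H "$host" chassis status >/dev/null 2>&1; then
      echo "BMC responded on attempt $i"
      return 0
    fi
    # Pause between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "BMC did not respond after $attempts attempts"
  return 1
}
```

If the function reports that the BMC did not respond, continue with the next action in the list rather than retrying indefinitely.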
- Are you here because of a system event log (SEL) event with the value OEM record c0 and OEM c0 specific log information 3a1503xxxxxx?
  Yes: Continue with step 8.
  No: Continue with the next step.
- Are you here because of a SEL event with the value OEM record c0 and OEM c0 specific log information 3a1504xxxxxx?
  Yes: Continue with step 12.
  No: Continue with the next step.
- Power off the system and disconnect all AC power cords for 30 seconds. Then, reconnect the AC power cords and power on the system. Does the system boot successfully?
  Yes: This ends the procedure.
  No: Go to Resolving a hardware problem. This ends the procedure.
- Did the system complete the boot process successfully?
  Yes: Continue with the next step.
  No: Continue with step 12.
- Determine whether the system is booted from the user-updated level of the system firmware image (primary side) or the manufacturing level of the system firmware image (golden side).
  - For in-band networks, enter the following command:
    ipmitool sensor list | grep -i golden
  - To run the command remotely over the LAN, enter the following command:
    ipmitool -I lanplus -U <username> -P <password> -H <BMC IP address or BMC hostname> sensor list | grep -i golden
  Do both of the returned records show 0x0080 in the data fields?
  Yes: The error was temporary. No service action is required. This ends the procedure.
  No: One or both of the returned records have 0x0180 in the data fields. The system was booted from the golden side. Continue with the next step.
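The comparison of the returned records against 0x0080 and 0x0180 can be automated. The sketch below reads the filtered sensor output on standard input and reports which side the system booted from. The function name is hypothetical, and the check only looks for the two values named in this procedure anywhere on each line; it makes no assumption about the exact column layout of the sensor list output.

```shell
# classify_boot_side
# Reads the output of "ipmitool sensor list | grep -i golden" on stdin.
# Prints "primary" if every record contains 0x0080, "golden" if any
# record contains 0x0180, and "unknown" otherwise (including no input).
classify_boot_side() {
  local records
  records="$(cat)"
  if [ -z "$records" ]; then
    echo "unknown"
  elif printf '%s\n' "$records" | grep -q '0x0180'; then
    echo "golden"
  elif printf '%s\n' "$records" | grep -qv '0x0080'; then
    # At least one record shows neither of the documented values.
    echo "unknown"
  else
    echo "primary"
  fi
}
```

For in-band use, the call would look like `ipmitool sensor list | grep -i golden | classify_boot_side`. A result of `primary` corresponds to the Yes branch above, and `golden` corresponds to the No branch.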
- Search for processor deconfiguration SEL events that have a time stamp in close proximity to the time stamp of the event with value OEM record c0 that sent you here. Processor deconfiguration SEL events are displayed in the following form:
  - Processor CPU Func x | Transition to Non-recoverable | Asserted
  Are processor deconfiguration events present?
  Yes: Complete the service actions for the processor deconfiguration events.
    - If your system is an 8335-GCA or 8335-GTA, go to Identifying a service action by using sensor and event information for the 8335-GCA and 8335-GTA. This ends the procedure.
    - If your system is an 8335-GTB, go to Identifying a service action by using sensor and event information for the 8335-GTB. This ends the procedure.
    - If your system is an 8348-21C, go to Identifying a service action by using sensor and event information for the 8348-21C. This ends the procedure.
  No: Continue with the next step.
- Are there other types of SEL events that require a service action and have a time stamp in close proximity to the time stamp of the event with value OEM record c0 that sent you here?
  Yes: Complete the service actions for the SEL events that require service actions.
    - If your system is an 8335-GCA or 8335-GTA, go to Identifying a service action by using sensor and event information for the 8335-GCA and 8335-GTA. This ends the procedure.
    - If your system is an 8335-GTB, go to Identifying a service action by using sensor and event information for the 8335-GTB. This ends the procedure.
    - If your system is an 8348-21C, go to Identifying a service action by using sensor and event information for the 8348-21C. This ends the procedure.
  No: If the boot problem persists, reload or update the system firmware image. Go to Getting fixes and reload the system firmware with the same level of firmware or update the system firmware with a more recent level of firmware. Then, reboot the system. This ends the procedure.
- Search for processor deconfiguration SEL events that have a time stamp in close proximity to the time stamp of the event with value OEM record c0 that sent you here. Processor deconfiguration SEL events are displayed in the following form:
  - Processor CPU Func x | Transition to Non-recoverable | Asserted
  Are processor deconfiguration events present?
  Yes: Complete the service actions for the processor deconfiguration events.
    - If your system is an 8335-GCA or 8335-GTA, go to Identifying a service action by using sensor and event information for the 8335-GCA and 8335-GTA. This ends the procedure.
    - If your system is an 8335-GTB, go to Identifying a service action by using sensor and event information for the 8335-GTB. This ends the procedure.
    - If your system is an 8348-21C, go to Identifying a service action by using sensor and event information for the 8348-21C. This ends the procedure.
  No: Continue with the next step.
- Are there other types of SEL events that require a service action and have a time stamp in close proximity to the time stamp of the event with value OEM record c0 that sent you here?
  Yes: Complete the service actions for the SEL events that require service actions.
    - If your system is an 8335-GCA or 8335-GTA, go to Identifying a service action by using sensor and event information for the 8335-GCA and 8335-GTA. This ends the procedure.
    - If your system is an 8335-GTB, go to Identifying a service action by using sensor and event information for the 8335-GTB. This ends the procedure.
    - If your system is an 8348-21C, go to Identifying a service action by using sensor and event information for the 8348-21C. This ends the procedure.
  No: Continue with the next step.
- Power off the system and disconnect all AC power cords for 30 seconds. Then, reconnect the AC power cords and power on the system. Does the system boot successfully?
  Yes: This ends the procedure.
  No: Continue with the next step.
- Is the system an 8348-21C, and are all 32 of the DIMM locations populated with 32 GB DIMMs?
  Yes: Continue with the next step.
  No: Go to step 18.
- Use the baseboard management controller (BMC) to update the system firmware. For instructions, see Updating the system firmware by using the BMC. Does the problem persist?
  Yes: Continue with the next step.
  No: This ends the procedure.
- Is your system an 8335-GTB?
  Yes: Replace the baseboard management controller (BMC) card. Go to 8335-GTB locations to identify the physical location and the removal and replacement procedure. If the problem persists, continue with the next step. Otherwise, this ends the procedure.
  No: Continue with the next step.
- Replace the system backplane.
  - If your system is an 8335-GCA or 8335-GTA, go to 8335-GCA and 8335-GTA locations to identify the physical location and the removal and replacement procedure. Then, continue with the next step.
  - If your system is an 8335-GTB, go to 8335-GTB locations to identify the physical location and the removal and replacement procedure. Then, continue with the next step.
  - If your system is an 8348-21C, go to 8348-21C locations to identify the physical location and the removal and replacement procedure. Then, continue with the next step.
- Does the problem persist?
  Yes: Go to Collecting diagnostic data. Then, go to Contacting IBM service and support. This ends the procedure.
  No: This ends the procedure.
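The SEL checks in this procedure (the 3a1503xxxxxx and 3a1504xxxxxx questions and the search for processor deconfiguration events) can be sketched as two small text helpers. Both function names are hypothetical. The first maps an OEM c0 specific log information value to the step that this procedure names. The second filters SEL text on standard input for lines in the processor deconfiguration form shown earlier. Comparing time stamps for close proximity is left to the reader, because the time stamp layout of the SEL output is not specified here.

```shell
# sel_entry_step <oem_c0_specific_log_information>
# Prints the step that the procedure names for the given value.
sel_entry_step() {
  local code
  # Normalize hexadecimal letters to lowercase before matching.
  code="$(printf '%s' "$1" | tr 'A-F' 'a-f')"
  case "$code" in
    3a1503*) echo "step 8" ;;
    3a1504*) echo "step 12" ;;
    *)       echo "next step" ;;
  esac
}

# processor_deconfig_events
# Reads SEL text on stdin and prints only the lines of the form
# "Processor CPU Func x | Transition to Non-recoverable | Asserted".
# Exits non-zero when no such line is found (grep behavior).
processor_deconfig_events() {
  grep -E 'Processor CPU Func [^|]*\|[[:space:]]*Transition to Non-recoverable[[:space:]]*\|[[:space:]]*Asserted'
}
```

For example, `sel_entry_step 3a1504xxxxxx` prints `step 12`, and piping saved SEL text through `processor_deconfig_events` leaves only the processor deconfiguration lines to compare against the time stamp of the OEM record c0 event.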