IBM Support

Avoiding Power9 and Power10 code update failures due to out of memory condition on the FSPs.

How To


Summary

Power9 systems and Power10 9080-HEX systems have a small memory leak in the FSP code that can result in an out-of-memory (OOM) condition. If the OOM condition occurs during a code update, the update fails. For Power10, this issue is fixed in firmware level FW1010.40 and higher, and it does not exist in base FW1030. For Power9, this issue is fixed in firmware level FW950.60 and higher.
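
As a rough illustration of the fix levels listed above, the following Python sketch classifies a firmware level string. The `FWnnn.nn` string format, the function name `leak_resolved`, and the reading of "and higher" as later service packs within the same release are assumptions of this sketch, not statements from IBM. Releases that the Summary does not mention return `None`.

```python
import re
from typing import Optional

# Fix levels taken from the Summary: release -> first service pack with the fix.
FIXED_IN = {950: 60, 1010: 40}
# Releases the Summary says never had the leak.
NOT_AFFECTED = {1030}


def leak_resolved(level: str) -> Optional[bool]:
    """Return True if the level has the fix or never had the leak,
    False if it is an affected level, and None if this document
    gives no information about the release."""
    # Assumed format: "FW" + release number + "." + service pack number.
    match = re.fullmatch(r"FW(\d+)\.(\d+)", level.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized firmware level: {level!r}")
    release = int(match.group(1))
    service_pack = int(match.group(2))
    if release in NOT_AFFECTED:
        return True
    if release in FIXED_IN:
        return service_pack >= FIXED_IN[release]
    return None  # release not covered by the Summary
```

For example, `leak_resolved("FW950.50")` returns `False`, which indicates that the FSP reboot procedure in this document applies before the update.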

Objective

This document shows how to reboot the Flexible Service Processors (FSPs) before a firmware update to ensure they do not encounter an out of memory condition during the firmware update.  All steps in this document can be done concurrently, without impact to the running system or partitions. 

Environment

The issue described in this document applies to 9080-HEX systems and all Power9 systems that have been active for at least 60 days.
A second issue that can cause firmware update failures on 9080-HEX systems is described in the following document:
https://www.ibm.com/support/pages/node/6619359
For 9080-HEX systems only, review the "Diagnosing the problem" and "Resolving the Problem" portions of the linked document.  Once complete, continue with the steps in this document.

Steps

A previous version of this document stated that the Welcome screen of ASMI could be used to determine whether the FSP had been up for more than 60 days. We now know that the uptime shown on the ASMI Welcome screen cannot be relied upon for that purpose. Because the following steps are all concurrent, it is recommended to run them regardless of the uptime of the FSPs.
Initiating a "Soft Reset Service Processor" on a primary FSP results in an FSP failover.  For the purposes of this document, we want to run the reset on an FSP that is in the secondary role only. 
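
The ordering that the steps in this section follow can be sketched as a small simulation: reset the secondary, swap roles with an administrative failover, then reset the new secondary. This is a hypothetical model for illustration only. The `Fsp` class, the function names, and the labels `FSP-A` and `FSP-B` are invented here and do not correspond to any IBM tool or API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fsp:
    label: str            # hypothetical label, stands in for a location code
    role: str             # "primary" or "secondary"
    reset_done: bool = False


def secondary(fsps: List[Fsp]) -> Fsp:
    """Return the FSP that currently holds the secondary role."""
    return next(f for f in fsps if f.role == "secondary")


def soft_reset_secondary(fsps: List[Fsp]) -> str:
    """Reset only the current secondary, so no unplanned failover occurs."""
    target = secondary(fsps)
    target.reset_done = True
    return target.label


def administrative_failover(fsps: List[Fsp]) -> None:
    """Swap the primary and secondary roles."""
    for f in fsps:
        f.role = "secondary" if f.role == "primary" else "primary"


def run_procedure(fsps: List[Fsp]) -> List[str]:
    """Reset both FSPs, each one while it is in the secondary role."""
    order = [soft_reset_secondary(fsps)]      # Steps 1 and 2
    administrative_failover(fsps)             # Step 3
    order.append(soft_reset_secondary(fsps))  # Steps 4 and 5
    return order
```

Starting from `FSP-A` as primary and `FSP-B` as secondary, `run_procedure` resets `FSP-B` first and `FSP-A` second, and the roles end up swapped relative to the start.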
Step 1 Launch ASMI on the system and log in to ASMI on the secondary.
Take note of the location code of the secondary. In this example, we see:
Service Processor: Secondary (Location: U78D6.SC1.KIC3264-P1-C3)

Step 2 Initiate a “Soft Reset Service Processor”:
The Soft Reset reboots the secondary FSP. It takes approximately 20 minutes for the secondary to reconnect to the HMC. Once the secondary reconnects, we can proceed with Step 3.
Step 3 Initiate an Administrative Failover (AFO):

Select the system, expand serviceability and FSP failover, and select “Initiate.”
Click “OK” to start the failover. Take note of the current IP address to ensure the roles swap.
Click “OK” to start the AFO. This process takes approximately 5 minutes.
Once the failover is complete, the FSP that was previously the primary is now the secondary.  We can now initiate a reset on the new secondary.
Step 4 Launch ASMI on the system and log in to ASMI on the secondary:
Note: The location code for the secondary has changed.
Step 5 Initiate a “Soft Reset Service Processor”:

Step 6 Check service processor status:
Once the secondary reconnects, after approximately 20 minutes, check “Service Processor Status” to ensure that both FSPs are connected and that the state is Ready.
The firmware update can now be initiated by using the normal procedure.
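
Steps 2 and 6 both involve waiting about 20 minutes for the secondary to reconnect. A generic polling helper along the following lines could be used to script that wait. It is a sketch under assumptions: `check` is a hypothetical callable that you would supply (this document does not describe a command for querying FSP state), and the timeout and interval values are illustrative.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout_s: float = 30 * 60,      # illustrative: a margin over ~20 minutes
    interval_s: float = 60.0,        # illustrative polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll check() until it returns True or the timeout expires.

    Returns True if check() succeeded, False if the timeout was reached.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A call such as `wait_until(secondary_is_ready)` would block until the hypothetical `secondary_is_ready` function reports success or 30 minutes pass. The `sleep` and `clock` parameters exist only so that the helper can be tested without real delays.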

Document Location

Worldwide


Document Information

Modified date:
04 August 2023

UID

ibm16857341