IBM Support

IMM failover and unexpected failure with IMM 1.30 - 1.32 - IBM Servers

Troubleshooting


Problem

When the system is running with Integrated Management Module (IMM) firmware v1.30 Build ID: YUOOC7E, v1.31 Build ID: YUOOC7F, or v1.32 Build ID: YUOOD4G, in rare situations the IMM resets unexpectedly and fails to start up after several attempts. The system then automatically fails over to the alternate bank, which usually holds an older code level. In some cases the IMM code in the alternate bank initializes successfully, and in other cases it does not. If the IMM code in the alternate bank fails to initialize, the IMM stops functioning: on a rack server the IMM is not accessible, and a blade server is listed in Advanced Management Module (AMM) as "Comm Error" or "Init Failed". Reseating or AC cycling the system does not recover the IMM.

Resolving The Problem

Source

RETAIN tip: H206034

Symptom

When the system is running with Integrated Management Module (IMM) firmware v1.30 Build ID: YUOOC7E, v1.31 Build ID: YUOOC7F, or v1.32 Build ID: YUOOD4G, in rare situations, IMM is reset unexpectedly and fails to start up after several attempts.

The system then automatically fails over to the alternate bank, which usually holds an older code level. In some cases the IMM code in the alternate bank initializes successfully, and in other cases it does not.

If the IMM code in the alternate bank fails to initialize:

  • IMM stops functioning.
  • For a rack server, IMM is not accessible.
  • For a blade server, the blade is listed in Advanced Management Module (AMM) as "Comm Error" or "Init Failed".
  • Reseating or AC cycling the system does not help to recover IMM.

If the IMM code in the alternate bank initializes successfully and the code level is IMM v1.29 Build ID: YUOOB7F or earlier:

  • No indication of failover is provided in logs.
  • Users will observe the IMM code level change.
  • IMM will change to default settings.
  • On a multi-node system a firmware version mismatch will occur.
  • In this situation, if users perform a firmware update to IMM v1.30, v1.31, or v1.32 to recover the corrupted bank, IMM will fail to initialize and stop functioning.

Affected Configurations

The system can be any of the following IBM servers:

  • BladeCenter HX5, type 1909, any model
  • BladeCenter HX5, type 7872, any model
  • BladeCenter HX5, type 7873, any model
  • System x3690 X5, type 7147, any model
  • System x3690 X5, type 7148, any model
  • System x3690 X5, type 7149, any model
  • System x3690 X5, type 7192, any model
  • System x3850 X5, type 7143, any model
  • System x3850 X5, type 7145, any model
  • System x3850 X5, type 7146, any model
  • System x3850 X5, type 7191, any model
  • System x3950 X5, type 7143, any model
  • System x3950 X5, type 7145, any model

This tip is not software specific.

This tip is not option specific.

The following system firmware level(s) are affected:

  • IMM versions 1.30, 1.31, and 1.32

The system has the symptom described above.
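
As a rough illustration, the affected-configuration check can be expressed as a lookup against the machine types and firmware levels listed above. The function name `is_affected` and the data layout are assumptions made for this sketch; they are not part of any IBM tool.

```python
# Machine types and IMM levels copied from the lists in this tip.
# The dictionary layout and function name are hypothetical.
AFFECTED_MACHINE_TYPES = {
    "BladeCenter HX5": {"1909", "7872", "7873"},
    "System x3690 X5": {"7147", "7148", "7149", "7192"},
    "System x3850 X5": {"7143", "7145", "7146", "7191"},
    "System x3950 X5": {"7143", "7145"},
}
AFFECTED_IMM_VERSIONS = {"1.30", "1.31", "1.32"}


def is_affected(product: str, machine_type: str, imm_version: str) -> bool:
    """Return True if product, machine type, and IMM level all match this tip."""
    types = AFFECTED_MACHINE_TYPES.get(product, set())
    # Accept both "1.31" and "v1.31" spellings.
    version = imm_version.lower().lstrip("v")
    return machine_type in types and version in AFFECTED_IMM_VERSIONS
```

A match only means the configuration is exposed to the problem; the symptom itself still has to be observed on the system.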

Solution

This behavior was corrected in IMM firmware 1.33 Build ID: YUOOE3C and later.

The file is available by selecting the appropriate Product Group, type of System, Product name, Product machine type, and Operating system on IBM Support's Fix Central web page.

Workaround

Take the appropriate action depending on which case applies:

Case 1

If the system was running with IMM v1.30, v1.31, or v1.32 and unexpectedly failed over to an IMM release earlier than v1.30, follow these steps to recover IMM:

  1. Flash IMM firmware to level v1.28 Build ID: YUOOB7C or v1.29 Build ID: YUOOB7F and then reset IMM to activate IMM v1.28 or v1.29, making sure that the active IMM firmware level is v1.28 or v1.29 after resetting.
  2. Flash IMM firmware to level v1.32 Build ID: YUOOD4G and then reset IMM to activate IMM v1.32, making sure that the active IMM firmware level is v1.32 after resetting.
  3. Flash IMM firmware to level v1.32 Build ID: YUOOD4G again, reset IMM to activate, then verify that the active IMM firmware level is v1.32.
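
The Case 1 sequence amounts to a checklist in which every flash is followed by a reset and a check of the active level. The sketch below models only that checklist; `CASE1_STEPS` and `verify_sequence` are hypothetical names, and the observed levels are assumed to be read by the administrator from the IMM after each reset.

```python
from typing import List

# Acceptable active IMM levels after each flash + reset, taken from Case 1.
CASE1_STEPS = [
    {"1.28", "1.29"},  # step 1: flash v1.28 (YUOOB7C) or v1.29 (YUOOB7F)
    {"1.32"},          # step 2: flash v1.32 (YUOOD4G)
    {"1.32"},          # step 3: flash v1.32 (YUOOD4G) again
]


def verify_sequence(observed_levels: List[str]) -> int:
    """Return the 1-based number of the first step whose active level
    is not acceptable, or 0 if every step matches."""
    if len(observed_levels) != len(CASE1_STEPS):
        raise ValueError("expected one observed level per step")
    for number, (allowed, observed) in enumerate(
        zip(CASE1_STEPS, observed_levels), start=1
    ):
        if observed.lower().lstrip("v") not in allowed:
            return number
    return 0
```

A nonzero result points at the step to repeat before moving on, since each later step assumes the previous level is active.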

Case 2

If the system was running with IMM v1.30, v1.31, or v1.32 and either of the following is true:

  • IMM failed after the automatic failover.
  • IMM failed after a firmware update to v1.30, v1.31, or v1.32 that was performed once the automatic failover had been observed.

Case 2a: For IBM System x3850 X5 or IBM System x3950 X5, contact IBM service to have the I/O Shuttle replaced.

Case 2b: For IBM System x3690 X5 or IBM BladeCenter HX5, use the following steps to attempt recovery:

Note: A jumper will be needed to attempt this recovery method. If no jumper is available, service should be called.

  1. Power off the system and unplug the power cord (for a blade, unseat the blade from the chassis).
  2. Remove the cover and place a jumper on the IMM recovery jumper:

    For IBM BladeCenter HX5: Place jumper on J229 on the Central Processing Unit (CPU) board.

    (Figure: jumper on the IBM BladeCenter HX5 board)

    For IBM System x3690 X5: Place jumper on J76 pins 2-3 on the system board (pin 3 is closest to front of server).

    (Figure: jumper on the IBM System x3690 X5 board)

  3. Plug in the power cord to start the IMM from the backup start code (for a blade, insert the blade back into the chassis).

    Note: This recovery method is not guaranteed to recover the system. If the failure continues to occur, the system board should be replaced.

  4. If the IMM starts up successfully, follow the remaining steps to finish the recovery procedure. Otherwise, replace the system board.
  5. Update IMM firmware to level v1.32 Build ID: YUOOD4G and then reset IMM to activate IMM v1.32, making sure the active IMM firmware level is v1.32.
  6. Update IMM firmware to level v1.32 Build ID: YUOOD4G again, reset IMM to activate, then verify that the active IMM firmware level is v1.32.
  7. Disconnect the power cord from the server (for a blade, unseat the blade from the chassis). Remove the cover and remove the IMM recovery jumper (the backup IMM start image jumper) placed in step 2.
  8. Connect the power cord to the server (for a blade, insert the blade back into the chassis), and check whether the IMM has recovered.
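
The choice between the workaround cases can be sketched as a small decision function. This is only an illustration under stated assumptions: the name `choose_workaround` and the `imm_responding` flag are hypothetical, and the function assumes the system was running IMM v1.30, v1.31, or v1.32 and that a failover was observed.

```python
def choose_workaround(product: str, imm_responding: bool) -> str:
    """Map the observed state to the workaround case described in this tip.

    Assumes the system was running IMM v1.30-v1.32 and a failover occurred.
    `imm_responding` is True when the IMM came up from the alternate bank
    on a level earlier than v1.30 (the Case 1 situation).
    """
    if imm_responding:
        return "Case 1: flash v1.28 or v1.29, then flash v1.32 twice"
    if product in ("System x3850 X5", "System x3950 X5"):
        return "Case 2a: contact IBM service to have the I/O Shuttle replaced"
    if product in ("System x3690 X5", "BladeCenter HX5"):
        return "Case 2b: attempt jumper recovery; otherwise replace the system board"
    raise ValueError(f"product not covered by this tip: {product}")


# Example: a BladeCenter HX5 whose IMM no longer responds after failover.
print(choose_workaround("BladeCenter HX5", imm_responding=False))
```

The IMM state is checked first because Case 1 applies to every affected product, while the split between Case 2a and Case 2b depends on the product.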

Additional Information

A change in IMM firmware architecture was released beginning in third quarter 2011. After the newer firmware was released, undesirable results could occur if the two IMM bank images were architecturally different.

In Workaround Case 1, the action makes both IMM banks the same architecturally, thus eliminating the symptom caused by running a version released before third quarter 2011 alongside a corrupted alternate bank that holds a third quarter 2011 or newer version.

In Workaround Case 2a, the two bank images are the same architecturally. Failure of both IMM banks has occurred, and requires system board replacement.

In Workaround Case 2b, the action uses a capability designed into the hardware that might allow the IMM to be recovered if corruption was limited to a portion of the IMM firmware. It lets users manually select the temporary use of a pre-set piece of the alternate bank firmware.

Document Location

Worldwide

Operating System

BladeCenter: Operating system independent / None

System x: Operating system independent / None


Document Information

Modified date:
30 January 2019

UID

ibm1MIGR-5090827