Troubleshooting
Problem
The symptom applies to the BladeCenter H, BladeCenter T, and BladeCenter HT chassis. With the Power Management Policy setting of "Redundant With Performance Impact", the total amount of power allocated for use in each power domain of the chassis, and allowed to be utilized by the blade servers, can be greater than the capacity of a single power supply module. Upon a loss of power module redundancy (due to an internal module fault, a system fault, or loss of input AC voltage), if the total power in use is above the single power module capacity, the blade servers will enter a reduced CPU performance state (processors will be "throttled") to lower the power consumption of the domain to at or below the single module limit. Throttling status of blade servers can be found in the system status, the system log, or under CPU cycles in the power management interface screens of the Management Module.
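The throttling condition described above can be expressed as a simple comparison. The following Python sketch is illustrative only: the function names and the wattage figures are hypothetical and are not taken from the Management Module firmware or from any BladeCenter specification.

```python
def needs_throttling(power_in_use_w: float,
                     single_module_capacity_w: float,
                     redundancy_lost: bool) -> bool:
    """Return True when blades are expected to throttle.

    Throttling is expected only after redundancy is lost AND the domain
    is drawing more than one power supply module can deliver.
    """
    return redundancy_lost and power_in_use_w > single_module_capacity_w


def required_reduction_w(power_in_use_w: float,
                         single_module_capacity_w: float) -> float:
    """Watts the domain must shed to reach the single module limit."""
    return max(0.0, power_in_use_w - single_module_capacity_w)


# Hypothetical example values, not real module ratings.
CAPACITY_W = 2900.0
IN_USE_W = 3200.0

print(needs_throttling(IN_USE_W, CAPACITY_W, redundancy_lost=True))
print(required_reduction_w(IN_USE_W, CAPACITY_W))
```

With redundancy intact, the same domain would not be expected to throttle, because both modules share the load.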
Resolving The Problem
Source
RETAIN tip: H191701
Issue
Blades may not enter their low-power (throttled) state when a system power fault is simulated by undocking or removing an operating power supply module. Possible outcomes of undocking an operating power supply module are as follows:
- None of the blades indicate that they are throttling in the system log, system status, or CPU cycle fields. The remaining power is reported as a negative value, with no sign of throttling.
- Blades may indicate that they are throttling in the system status and the error log, but not under CPU cycles in power management. In this case, the power in use is higher than the maximum power limit, so the remaining power is a negative number.
- Throttling is indicated in all three places, power consumption is reduced, and throttling works correctly.
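The three outcomes above can be told apart by checking the three throttling indicators together with the remaining power. The Python sketch below is a hypothetical aid for recording test observations. It assumes that remaining power equals the maximum power limit minus the power in use, which is inferred from the description above; the class, field, and function names are invented for illustration and do not correspond to any Management Module interface.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Values read manually from the Management Module screens (hypothetical fields)."""
    status_shows_throttling: bool
    log_shows_throttling: bool
    cpu_cycles_show_throttling: bool
    power_in_use_w: float
    max_power_limit_w: float

    @property
    def remaining_power_w(self) -> float:
        # Assumption: remaining power = maximum power limit - power in use.
        return self.max_power_limit_w - self.power_in_use_w


def classify(obs: Observation) -> str:
    """Map an observation to one of the three outcomes listed above."""
    over_limit = obs.remaining_power_w < 0
    indicators = (obs.status_shows_throttling,
                  obs.log_shows_throttling,
                  obs.cpu_cycles_show_throttling)
    if not any(indicators) and over_limit:
        return "no throttling reported"
    if (obs.status_shows_throttling and obs.log_shows_throttling
            and not obs.cpu_cycles_show_throttling and over_limit):
        return "throttling reported but not applied"
    if all(indicators) and not over_limit:
        return "throttling works correctly"
    return "unclassified"


# Hypothetical readings: blades report throttling, yet power stays over the limit.
obs = Observation(True, True, False, power_in_use_w=3200.0, max_power_limit_w=2900.0)
print(classify(obs), obs.remaining_power_w)
```

A negative remaining power in either of the first two outcomes indicates that the domain is still drawing more than the single module limit.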
Affected configurations
The system may be any of the following IBM servers:
- BladeCenter H, type 7989, any model
- BladeCenter H, type 8852, any model
- BladeCenter HT, type 8740, any model
- BladeCenter HT, type 8750, any model
- BladeCenter LS21, type 7971, any model
- BladeCenter LS22, type 7901, any model
- BladeCenter LS41, type 7972, any model
- BladeCenter LS42, type 7902, any model
- BladeCenter QS20, type 0200, any model
- BladeCenter QS21, type 0792, any model
- BladeCenter T, type 8720, any model
- BladeCenter T, type 8730, any model
- BladeCenter HS20, type 8843, any model
- BladeCenter HS21 XM, type 1915, any model
- BladeCenter HS21 XM, type 7995, any model
- BladeCenter HS21, type 1885, any model
- BladeCenter HS21, type 8853, any model
This tip is not option specific.
This tip is not software specific.
Workaround
If a power supply module is undocked to simulate a power module fault when testing power management policy settings, the blades may not receive the correct power status from the system.
Do not undock an operating power supply module to simulate a power supply module fault in a BladeCenter chassis.
The recommended procedure to simulate a power module fault is to remove the AC input power by disconnecting the line cord.
The power supply module can then communicate its fault status to the system, and the blades will be signaled to reduce power, if required, when the Power Management Policy setting is "Redundant With Performance Impact".
Additional information
The signals required to indicate the power supply status are disconnected when an operating power supply module is removed from the chassis, so the blades may not receive the correct power status. When AC input power is disconnected instead, the power supply module communicates a fault condition to the system (Advanced Management Module and blades), and the blades react to these fault status signals.
Document Location
Worldwide
Document Information
Modified date:
18 April 2023
UID
ibm1MIGR-5072413