Troubleshooting
Problem
During production runtime, one of the following symptoms may occur:
- Server crashes and the ServeRAID controllers are not found during POST (i.e. no ServeRAID POST banner).
- Server crashes and the ServeRAID controller displays a Controller Failure message and/or a Status: "Failed" message on the POST banner.
- When booting to the IBM ServeRAID support CD, the application does not recognize a ServeRAID controller in the server.
- During the ServeRAID POST process, a battery failure message is displayed.
- The ServeRAID Manager utility may report a bad battery.
- Server crashes due to the battery being exposed to high temperature above 50 degrees Celsius.
Resolving The Problem
Source
RETAIN tip: H191476
Symptom
During production runtime, one of the following symptoms may occur:
- Server crashes and the ServeRAID controllers are not found during Power On Self Test (POST) (i.e. no ServeRAID POST banner).
- Server crashes and the ServeRAID controller displays a Controller Failure message and/or a Status: "Failed" message on the POST banner.
- When booting to the IBM ServeRAID support CD, the application does not recognize a ServeRAID controller in the server.
- During the ServeRAID POST process, a battery failure message is displayed.
- The ServeRAID Manager utility may report a bad battery.
- Server crashes due to the battery being exposed to high temperature above 50 degrees Celsius.
Affected configurations
The system is configured with one or more of the following IBM Options:
- ServeRAID-4H Ultra160 SCSI Controller - Cache Daughter Card FRU 37L6902
- ServeRAID-4H Ultra160 SCSI Controller, Option 37L6889
- ServeRAID-4M Ultra160 SCSI Controller (Japan), Option 19K0565
- ServeRAID-4M Ultra160 SCSI Controller, Option 37L6080
- ServeRAID-4Mx Ultra160 SCSI Controller, Option 06P5736
- ServeRAID-5i Controller, Option 25P3492
- ServeRAID-6i Controller, Option 71P8595
- ServeRAID-6i+ Controller, Option 13N2190
- ServeRAID-8i Controller, Option 13N2227
- ServeRAID-8k SAS Controller, Option 25R8064
- ServeRAID-8s SAS PCIe Controller, Option 39R8765
This tip is not hardware specific.
This tip is not software specific.
The system has the symptom described above.
Solution
During a scheduled maintenance period:
- Power down the server.
- Remove the top or side cover of the server.
- Check whether the battery on the ServeRAID controller shows signs of swelling.
- If the battery is swollen, remove it and install a new battery to avoid the symptoms listed above.
Use one of the following methods to purchase a replacement cache battery (for a list of Field Replaceable Unit (FRU) numbers, refer to the "Additional Information" section, below):
- Call 1-800-388-7080 and select Option 2.
- Go to the following URL: http://www-132.ibm.com/content/home/store_IBMPublicUSA/en_US/parts/parts_main.html
Workaround
Until the battery is replaced, either disconnect the battery from the ServeRAID controller or use the IBM ServeRAID Manager application to change the write cache policy on each logical drive to write-through mode. After the battery is replaced, change the write cache policy on each logical drive back to write-back mode.
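The policy choice in this workaround can be sketched as a small decision helper. This is only an illustration of the rule stated above, not part of the ServeRAID Manager application: the names BatteryState, recommended_write_policy, and plan_policy_changes are hypothetical, and the sketch does not talk to a controller.

```python
from enum import Enum


class BatteryState(Enum):
    """Hypothetical battery states used only for this illustration."""
    OK = "ok"                      # healthy, replaced battery
    FAILED = "failed"              # bad or swollen battery
    DISCONNECTED = "disconnected"  # battery unplugged as a workaround


def recommended_write_policy(state):
    """Return the write cache policy the workaround calls for."""
    # Write-back is only recommended once a good battery is in place.
    if state is BatteryState.OK:
        return "write-back"
    return "write-through"


def plan_policy_changes(logical_drives, state):
    """Return the drives whose policy differs from the recommended one.

    logical_drives maps a drive label (hypothetical) to its current policy.
    """
    target = recommended_write_policy(state)
    return {
        drive: target
        for drive, current in logical_drives.items()
        if current != target
    }


if __name__ == "__main__":
    drives = {"logical drive 1": "write-back", "logical drive 2": "write-through"}
    print(plan_policy_changes(drives, BatteryState.FAILED))
```

The helper only reports which logical drives would need a change; the change itself is still made in the IBM ServeRAID Manager application.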
Additional Information
IBM ServeRAID batteries are consumable items. The following list gives the ServeRAID controller battery part numbers, which can be purchased using your machine type and serial number:
- ServeRAID 4x Controller Battery, Option 37L6903
- ServeRAID 5i Controller Battery, Option 25P3482
- ServeRAID 6i Controller Battery, Option 39R8799
- ServeRAID 7k Controller Battery, Option 39R8804
- ServeRAID 8k Controller Battery, Option 25R8088
- ServeRAID 8i/8s Controller Battery, Option 25R8118
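For scripted inventories, the list above can be held as a simple lookup table. The following is a minimal sketch: the function name battery_option is hypothetical, and treating the 4H, 4M, and 4Mx controllers as covered by the "4x" battery is an assumption based on the naming, not something this tip states. Models not in the list (for example 6i+) return None rather than a guess.

```python
# Battery option numbers copied from the list above, keyed by controller family.
BATTERY_OPTIONS = {
    "4x": "37L6903",
    "5i": "25P3482",
    "6i": "39R8799",
    "7k": "39R8804",
    "8k": "25R8088",
    "8i": "25R8118",
    "8s": "25R8118",
}

# Assumption: "4x" stands for the ServeRAID-4 family (4H, 4M, 4Mx).
ALIASES = {"4h": "4x", "4m": "4x", "4mx": "4x"}


def battery_option(model):
    """Return the battery option number for a controller model, or None."""
    key = model.strip().lower()
    # Accept "ServeRAID-8k", "ServeRAID 8k", or just "8k".
    if key.startswith("serveraid"):
        key = key[len("serveraid"):]
    key = key.lstrip("- ").strip()
    key = ALIASES.get(key, key)
    return BATTERY_OPTIONS.get(key)


if __name__ == "__main__":
    for name in ("ServeRAID-8k", "ServeRAID 4Mx", "6i+"):
        print(name, "->", battery_option(name))
```

Confirm the part number against your machine type and serial number before ordering.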
The ServeRAID battery can swell under unacceptable conditions, such as overuse in systems or operation in extreme temperatures above 50 degrees Celsius, which is not a supported environment for running IBM servers.
If any of the above issues is present, an unplanned outage may be required to physically check the battery condition.
The IBM ServeRAID Manager application shows the battery status under the Controller Properties Status tab. In the event of a power outage or failure, the battery-backup cache protects the data stored in the ServeRAID cache memory when using the write-back setting of the write-cache mode.
Note: The replacement battery will arrive fully discharged.
Once the server is powered, the battery begins a full charge cycle. The controller can be used during this time; however, the battery is unable to meet the specified holdover time until it is fully charged. The battery is still able to handle brief power losses during the initial charge cycle.
There are different ways in which systems behave when the battery is approaching the end of its usable life. First, in many cases, the battery no longer charges to its specified voltage as it reaches the end of its usable life. This is detected by the ServeRAID controller and an alert is generated by the ServeRAID Manager to indicate that the FRU battery needs to be replaced.
Second, as the battery fails, the charging circuit on the ServeRAID controller draws a short current surge from the PCI/PCI-X slot. The PCI/PCI-X current-limiting circuit on some servers (for example, xSeries 360 and xSeries 365) turns off power to the PCI slot, and the controller is no longer seen at POST. This behavior can be confirmed by watching the controller in the slot at power on.
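The two end-of-life behaviors described above can be summarized as a short triage sketch. It is illustrative only: the function name end_of_life_indicator and its two boolean inputs are hypothetical and would have to be filled in by an operator from the ServeRAID Manager alert and from observation at POST.

```python
def end_of_life_indicator(manager_battery_alert, controller_seen_at_post):
    """Map the two observations to the behavior described in this tip."""
    if manager_battery_alert:
        # First behavior: the battery no longer charges to its specified
        # voltage and ServeRAID Manager raises an alert.
        return "battery alert: replace the FRU battery"
    if not controller_seen_at_post:
        # Second behavior: a current surge may have caused the server to
        # cut power to the PCI slot, so the controller is missing at POST.
        return "controller missing at POST: inspect the battery"
    return "no end-of-life indicator observed"


if __name__ == "__main__":
    print(end_of_life_indicator(False, False))
```

A missing controller can have other causes, so the second result is a prompt to inspect the battery, not a diagnosis.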
Temporarily changing the write-cache policy on each logical drive to write-through eliminates the risk of data loss due to a power loss, at the expense of decreased write performance.
For additional information on IBM ServeRAID batteries, refer to RETAIN tip H001648.
Document Location
Worldwide
Document Information
Modified date:
29 January 2019
UID
ibm1MIGR-5079049