IBM Support

RAID configuration and data scrubbing to prevent RAID rebuild failure - Servers

Troubleshooting


Problem

Before installing software or data for the first time on an IBM PC Server RAID system, the following steps in this document must be performed.

Resolving The Problem

Read and understand this document before applying any of its steps or procedures.

Before installing software or data for the first time on an IBM PC Server RAID system, the following must be performed:

1. UPDATE THE RAID ADAPTER FIRMWARE TO THE FOLLOWING MINIMUM FIRMWARE LEVEL OR HIGHER.

   ADAPTER                                                        FIRMWARE   BIOS
a. Micro Channel RAID Adapter FRU p/n06H3059 (Opt. p/n70G9263)    2.21
b. PCI RAID Adapter FRU p/n06H5078 (Opt. p/n94G2764)              2.43
c. ServeRAID Adapter FRU p/n06H9334 (Opt. p/n70G8489)             2.23s.6*   2.30.04*
d. ServeRAID Adapter FRU p/n76H6875 (Opt. p/n70G8489)             2.23s.6*   2.30.04*
e. ServeRAID II Adapter FRU p/n76H3587 (Opt. p/n76H3584)          2.30.04*   2.30.04*
f. ServeRAID Onboard Controller                                   97239*     2.30.04*

* The Firmware/BIOS diskette 2.30 contains the BIOS flash 2.30.04 as well as the Firmware flashes.

NOTE: These Firmware and BIOS versions were current at the time this document was released. Firmware and BIOS levels are subject to change over time. Always check for the latest BIOS and Firmware utility.


NOTE: Be sure that the latest version of the corresponding RAID utility diskette is used to ensure compatibility with the latest Firmware and BIOS on the corresponding adapter.


2. Initialize RAID level 0, 1, and 5 logical drives (all RAID adapters).
3. Synchronize all RAID 5 logical drives after initialization (prior to installing software and data) on the ServeRAID adapter, ServeRAID II adapter, or ServeRAID onboard controller, or data loss may occur.


NOTE: Synchronization is done automatically when initializing RAID 5 logical drives on the following adapters:
  • Micro Channel RAID Adapter FRU p/n92F0335 (Opt. p/n none)
  • Micro Channel RAID Adapter FRU p/n06H3059 (Opt. p/n70G9263)
  • PCI RAID Adapter FRU p/n06H5078 (Opt. p/n94G2764)

4. Data scrub all RAID 5 logical drives using the Synchronize Utility weekly (after software and data are installed) to provide a high level of protection against data loss.


NOTE: "Data Scrubbing" of the drives may be accomplished in one of two ways on the following adapters:

  • Micro Channel RAID Adapter FRU p/n06H3059 (Opt. p/n70G9263)
  • PCI RAID Adapter FRU p/n06H5078 (Opt. p/n94G2764)
  • ServeRAID Adapter FRU p/n06H9334 (Opt. p/n70G8489)
  • ServeRAID Adapter FRU p/n76H6875 (Opt. p/n70G8489)

The RAID Utility Diskette may be used to apply "Data Scrubbing" to RAID level 1 and 5 logical drives using the "Synchronize" utility. This method requires that you shut down ("down") the server.

Netfinity Manager 5.0 or higher may be used to run "Data Scrubbing" via Synchronization in the background while the server is up. Users can continue to access data on the logical drive while it runs.


NOTE: See the matrix of utilities vs. adapters vs. Network Operating Systems in the White Paper "Using IBM RAID Adapters to Avoid Data Loss".
The web URL to search for this White Paper is:

www.ibm.com/systems/support

Click Search at the top of the page and use "White Paper" as the keywords.

NOTE: "Data Scrubbing" runs automatically in the background on the ServeRAID II Adapter. The firmware of the adapter must be at 2.30.04 or higher to include this feature.

Details:
When a hard drive fails and is replaced in a RAID-1 or RAID-5 array, data loss may occur if a sector on one of the remaining working drives cannot be read.

RAID-5 logical drives must be synchronized immediately after they are created to ensure that the parity stripe units accurately reflect the data.

The IBM ServeRAID Adapter, the IBM ServeRAID II Adapter, and the ServeRAID Onboard Controller require the user to synchronize the RAID 5 logical drives after initialization, before any data is stored on the drives.
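The requirement that parity "accurately reflect the data" can be illustrated with a small sketch. It assumes the usual RAID 5 scheme in which the parity stripe unit is the bytewise XOR of the data stripe units. The stripe layout, the byte values, and the function names are hypothetical and do not come from any IBM utility.

```python
from functools import reduce

def xor_parity(units):
    """Return the bytewise XOR of equally sized stripe units."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*units))

def is_consistent(stripe):
    """True when the stored parity matches the XOR of the data units."""
    return stripe["parity"] == xor_parity(stripe["data"])

def synchronize(stripe):
    """Recompute parity from the data units, as a synchronization pass would."""
    stripe["parity"] = xor_parity(stripe["data"])

# Hypothetical stripe on a newly defined logical drive: the parity unit
# still holds arbitrary leftover bytes, so it does not match the data.
stripe = {"data": [b"\x0f\x10", b"\xf0\x01", b"\x33\x44"], "parity": b"\xaa\xbb"}

before = is_consistent(stripe)   # parity does not reflect the data yet
synchronize(stripe)
after = is_consistent(stripe)    # parity now reflects the data
print(before, after)
```

Until the synchronization pass has run, the parity unit cannot be trusted for any later reconstruction.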

"Data Scrubbing" is recommended as a preventative maintenance procedure to reduce the risk of an array rebuild failure, or possible data loss if using the ServeRAID adapter. IBM recommends that "Data Scrubbing" be run weekly to provide a high level of protection. The level of protection increases as more frequent "Data Scrubbing" is performed. To reduce the frequency of "Data Scrubbing" to once or twice a month and still maintain a high level of protection, schedule "Data Scrubbing" along with other preventative maintenance procedures like regular tape backups.

Over time a hard disk may accumulate grown defects. This is normal. Defects are corrected on accessed files by the hardfile ECC or RAID subsystem. If a grown defect is encountered when a file is accessed, the data is reconstructed using either the ECC on the hardfile or the RAID redundant information. However, if a grown defect appears in an area that is not accessed (because the area is free space, or because the file is served from cache), then "Data Scrubbing" is required to detect it. Once the defect is detected, the hardfile reallocates the sector. When all drives are online, the ECC on the hardfile or the RAID redundant information is used to reconstruct the lost stripe unit. However, if one drive has a grown defect and another drive has failed completely, there is not enough information to reconstruct the data, and data loss may occur after the rebuild.
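The difference between these two cases can be sketched as follows, again assuming XOR parity. The stripe contents and the helper names are hypothetical. None marks a stripe unit that cannot be read, whether because of a grown defect or because the whole drive has failed.

```python
from functools import reduce

def xor_units(units):
    """Bytewise XOR of equally sized stripe units."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*units))

def reconstruct(stripe, missing):
    """Rebuild the unit at index `missing` from every other unit in the stripe.

    Raises ValueError when a second unit in the same stripe is unreadable,
    because XOR parity can recover only one lost unit per stripe.
    """
    survivors = [u for i, u in enumerate(stripe) if i != missing]
    if any(u is None for u in survivors):
        raise ValueError("two units unreadable in one stripe")
    return xor_units(survivors)

data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
stripe = data + [xor_units(data)]          # last unit is parity

# Case 1: scrubbing finds a grown defect on drive 1 while all drives are online.
scrubbed = list(stripe)
scrubbed[1] = None
repaired = reconstruct(scrubbed, 1)        # recoverable from the other units

# Case 2: drive 0 has failed and the undetected defect on drive 1 is hit
# during the rebuild.
rebuilding = list(stripe)
rebuilding[0] = None
rebuilding[1] = None
try:
    reconstruct(rebuilding, 0)
    rebuild_ok = True
except ValueError:
    rebuild_ok = False                     # not enough information remains
```

In the first case the defect is repaired while redundancy is still intact, which is what regular scrubbing is meant to achieve.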

Predictive Failure Analysis (PFA) has been developed to monitor performance of drives, analyze data from periodic internal measurements, and recommend replacement when specific thresholds are exceeded. The data from periodic internal measurements is collected when actual accesses of the data sectors occur. "Data Scrubbing", which forces all data sectors to be read, provides more data to improve the accuracy of PFA.

IBM recommends that customers read the following White Papers to ensure a thorough understanding of RAID and hardfile technologies:

Document
- Using IBM RAID Adapters to Avoid Data Loss
- Understanding Hard Disk Drive Media Defects
- Ensuring High Availability of Your RAID Subsystem with:
> IBM SCSI-2 Fast/Wide PCI-Bus RAID Adapter
> IBM Fast/Wide Streaming RAID Adapter
- Ensuring High Availability Using the PC ServeRAID Adapter

- The IBM Website at URL: http://www.ibm.com/systems/support. Choose Servers, then select Hints and Tips.

NOTE: WITH THE SERVERAID ADAPTER, SERVERAID ONBOARD CONTROLLER AND SERVERAID II ADAPTER, SYNCHRONIZATION IS REQUIRED TO ENSURE THE PARITY ACCURATELY REFLECTS THE DATA. IF SYNCHRONIZATION OR DATA SCRUBBING IS PERFORMED ON AN ARRAY THAT WAS NEVER PREVIOUSLY SYNCHRONIZED, THEN ANY MEDIA DEFECTS FOUND THAT REQUIRE RAID RECONSTRUCTION MAY BE REBUILT USING INCORRECT PARITY WHICH MAY RESULT IN DATA LOSS.
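The risk described in this note can be shown with a minimal sketch, assuming XOR parity. The data values, the stale parity bytes, and the function names are hypothetical. It rebuilds the same lost stripe unit twice, once from parity that was never synchronized and once from correct parity.

```python
from functools import reduce

def xor_units(units):
    """Bytewise XOR of equally sized stripe units."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*units))

def rebuild_unit(data, parity, lost):
    """Reconstruct data unit `lost` from the remaining data units and parity."""
    others = [u for i, u in enumerate(data) if i != lost]
    return xor_units(others + [parity])

data = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]
stale_parity = b"\x00\x00"       # assumed leftover contents, never synchronized
good_parity = xor_units(data)    # what a synchronization pass would have written

from_stale = rebuild_unit(data, stale_parity, 1)
from_good = rebuild_unit(data, good_parity, 1)

print(from_good == data[1])      # correct parity returns the original unit
print(from_stale == data[1])     # stale parity returns different bytes
```

The reconstruction from stale parity completes without any error, so the incorrect data would be written back silently.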

NOTE: Use the "IBM PC ServeRAID Synch Verify Update Diskette" ver 1.10 or higher to determine the status of any RAID arrays on the ServeRAID adapter ONLY. Be sure to read the README file before executing any programs on the diskette. The diskette can be downloaded from the IBM Website at URL:
http://www.ibm.com/systems/support

Document Location

Worldwide

Operating System

Older System x:Operating system independent / None

Document Information

Modified date:
28 January 2019

UID

ibm1MCGN-3HKK6P