IBM Support

Potential Loss of Access to Data After Multiple Drive Failures In Distributed RAID Arrays

Flashes (Alerts)


Abstract

If multiple drives in a Distributed RAID array fail concurrently, and at least one of those drives is then replaced, the system may experience multiple node warmstarts and a temporary loss of access to data. This issue is tracked as APAR HU01792.

This issue only occurs on systems running 7.8.1.5, 8.1.1.1 or 8.1.2.0 software. V9000 systems without SAS enclosures are not affected.

Content

Recovery from a failed drive in a Distributed RAID array consists of two phases: rebuild (where data is automatically rewritten to rebuild areas on other drives in the array), and copyback (where that data is copied to a new drive, after the failed drive is physically replaced).

If more drives have failed than there are rebuild areas in the array, the copyback operates in a degraded mode. In affected software versions, this degraded copyback will fail, leading to multiple node warmstarts and temporary loss of access to data.

Fix

Systems that run an affected software version and use Distributed RAID should be upgraded to 7.8.1.6, 8.1.1.2 or 8.1.2.1 to prevent this issue.
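As an illustration only, the affected and fixed levels named in this flash can be expressed as a small lookup. This is a minimal sketch, not an IBM tool: the function name `upgrade_target` and its parameters are hypothetical, and pairing each affected level with the fixed level in the same release stream is an assumption based on the order in which the levels are listed. The exemption for V9000 systems without SAS enclosures is not modelled.

```python
# Affected level -> fixed level.
# Assumption: each affected level maps to the fix in the same release stream.
FIXED_LEVEL = {
    "7.8.1.5": "7.8.1.6",
    "8.1.1.1": "8.1.1.2",
    "8.1.2.0": "8.1.2.1",
}


def upgrade_target(version, uses_distributed_raid=True):
    """Return the fixed level to upgrade to, or None if this flash does not apply.

    `version` is the installed software level as a dotted string.
    Hypothetical helper for illustration; it only encodes the levels
    named in this flash.
    """
    if not uses_distributed_raid:
        # Only systems using Distributed RAID need the upgrade for this issue.
        return None
    return FIXED_LEVEL.get(version.strip())
```

For example, `upgrade_target("8.1.1.1")` returns `"8.1.1.2"`, while a level that is not listed as affected returns `None`.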

Workaround

Until the system is upgraded, take care when replacing failed drives in a Distributed RAID array.

The GUI shows a "Rebuild Areas total" value for each array. Compare the number of failed drives in the array with this value:

  • If the number of failed drives in the array is less than the rebuild areas total, the drive can be replaced as normal.
  • If the number of failed drives in the array is equal to, or greater than, the rebuild areas total, then urgently upgrade the software to a fixed version before replacing the drive. When the upgrade has completed, the drive can be replaced without risk of triggering this issue.
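The two cases above can be sketched as a simple check. This is a hedged illustration rather than an official procedure: the function name `replacement_advice` and its return strings are hypothetical, and both numbers are assumed to be read manually from the GUI for the array in question.

```python
def replacement_advice(failed_drives, rebuild_areas_total):
    """Apply the workaround rule for a system still on an affected level.

    failed_drives:       number of failed drives in the array
    rebuild_areas_total: the "Rebuild Areas total" value shown for the array
    Returns a short advice string (hypothetical labels).
    """
    if failed_drives < 0 or rebuild_areas_total < 0:
        raise ValueError("counts must be non-negative")
    if failed_drives == 0:
        return "nothing to replace"
    if failed_drives < rebuild_areas_total:
        # Fewer failures than rebuild areas: replace as normal.
        return "replace drive as normal"
    # Failures equal to or greater than the rebuild areas total:
    # upgrade to a fixed level before replacing the drive.
    return "upgrade software before replacing drive"
```

For an array with a rebuild areas total of 2, one failed drive gives "replace drive as normal", while two or more failed drives give "upgrade software before replacing drive".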

Document Information

More support for:
IBM Storwize V7000

Software version:
7.7, 7.8, 8.1

Operating system(s):
IBM Storwize V7000

Document number:
651077

Modified date:
28 March 2023

UID

ssg1S1012392
