Drives

Correlated drive failures (CDFs) occur either when a single event causes many drives in physical proximity to fail or when drives of the same class, manufacturer, and type share a manufacturing defect.

CDFs can occur due to any of the following conditions:
  • Drive controller failure
  • System downtime or restart
  • Manufacturing defect in a batch of drives
  • Strong vibration in a rack or site
  • Power surge or lightning strike
  • Faulty memory or software on a server
  • Logical error in operating system or file system

These failures can cause a group of drives either to become temporarily unavailable or to fail unrecoverably. Storage systems that house all drives within the same server, rack, or site are susceptible to correlated drive failures, because a single event might trigger enough drive failures to exceed the system's failure tolerance.

Within a system, related slices of the same data are always kept on different drives and on different servers. Therefore, a correlated failure that affects all the drives within one server cannot cause data unavailability or loss.
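The placement rule can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the product: the function names, the six-slice layout, the server and drive labels, and a failure tolerance of two lost slices. It models a placement as a list of (server, drive) pairs for the slices of one piece of data and checks whether losing whole servers stays within the tolerance.

```python
# Illustrative model only: names, layout, and tolerance are hypothetical.

def slices_lost(placement, failed_servers):
    """Count the slices stored on servers that have failed."""
    return sum(1 for server, _drive in placement if server in failed_servers)

def survives(placement, failed_servers, tolerance):
    """Data stays available while the lost slices do not exceed the tolerance."""
    return slices_lost(placement, failed_servers) <= tolerance

def one_slice_per_server(placement):
    """True when no two slices of the same data share a server."""
    servers = [server for server, _drive in placement]
    return len(servers) == len(set(servers))

# Six slices, each on its own server and drive (assumed layout).
spread = [(f"server-{i}", f"drive-{i}") for i in range(1, 7)]

# Counter-example: six slices on six drives inside a single server.
colocated = [("server-1", f"drive-{i}") for i in range(1, 7)]

# Losing every drive in server-1 costs the spread layout one slice,
# but costs the colocated layout all six.
print(survives(spread, {"server-1"}, tolerance=2))
print(survives(colocated, {"server-1"}, tolerance=2))
```

Under these assumptions, a whole-server failure removes at most one slice from the spread layout, which stays within the tolerance. The colocated layout loses every slice in the same event, which is the exposure described for systems that keep all drives in one server.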