So here we are in January, named after the two-faced Roman god Janus, who in their mythology was the god of gates and doors, and beginnings and endings.
-- Roger von Oech [Our "Janus-Like" Powers]
Well, it's 2008, which could mark the end of RAID5 and the beginning of a new disk storage architecture. IBM starts the year with exciting news: it is acquiring new disk technology from a small start-up called XIV, led by former EMCer Moshe Yanai. Moshe was publicly ousted in 2001 from his position as EMC's VP of engineering, and formed his own company. It didn't take long for EMC bloggers to poke fun at this. Mark Twomey, in his StorageZilla blog, had mentioned XIV back in August in [XIV], and again today in [IBM Buys XIV].
The following is an excerpt from the [IBM Press Release]:
To address the new requirements associated with next generation digital content, IBM chose XIV and its NEXTRA™ architecture for its ability to scale dynamically, heal itself in the event of failure, and self-tune for optimum performance, all while eliminating the significant management burden typically associated with rapid growth environments. The architecture also is designed to automatically optimize resource utilization of all the components within the system, which can allow for easier management and configuration and improved performance and data availability.
"We are pleased to become a significant part of the IBM family, allowing for our unique storage architecture, our engineers and our storage industry experience to be part of IBM's overall storage business," said Moshe Yanai, chairman, XIV. "We believe the level of technological innovation achieved by our development team is unparalleled in the storage industry. Combining our storage architectural advancements with IBM's world-wide research, sales, service, manufacturing, and distribution capabilities will provide us with the ability to have these technologies tackle the emerging Web 2.0 technology needs and reach every corner of the world."
The NEXTRA architecture has been in production for more than two years, with more than four petabytes of capacity being used by customers today.
Current disk arrays were designed for online transaction processing (OLTP) databases. The focus was on using the fastest, most expensive 10K and 15K RPM Fibre Channel drives, with clever caching algorithms for quick, small updates of large relational databases. However, the world is changing, and people are now looking for storage designed for digital media, archives, and other Web 2.0 applications.
One problem that the NEXTRA architecture addresses is RAID rebuild. In a standard RAID5 6+P+S configuration of 146GB 10K RPM drives, the loss of one disk drive module (DDM) is recovered by reconstructing the data from the parity of the other drives onto the spare drive. The process takes 46 minutes or longer, depending on how busy the system is doing other things. During this time, if a second drive in the same rank fails, all 876GB of data are lost. Double-drive failures are rare, but unpleasant when they happen, and hopefully you have a backup on tape to recover the data from.
Moving to slower, less expensive SATA drives made this situation worse. The drives have higher capacity, but run at slower speeds. When a SATA drive fails in a RAID5 array, it could take several hours to rebuild, and that is more time exposure for a second drive failure. A rebuild for a 750GB SATA drive would take five hours or more, with 4.5 TB of data at risk during the process if a second drive failure occurs.
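For those who like to check the arithmetic, here is a small Python sketch of the numbers above. The function names are my own for illustration, and the SATA case assumes the same 6+P+S rank layout as the Fibre Channel case. The "implied rate" is simply drive capacity divided by rebuild time, using decimal units (1 GB = 1000 MB); it is a rough average, not a measured figure.

```python
def data_at_risk_gb(data_drives, drive_gb):
    """User data exposed during a rebuild: one drive's worth per data drive in the rank."""
    return data_drives * drive_gb


def implied_rebuild_rate_mb_s(drive_gb, rebuild_minutes):
    """Average rate needed to refill one drive in the given time (1 GB = 1000 MB)."""
    return drive_gb * 1000 / (rebuild_minutes * 60)


# A 6+P+S rank has 6 data drives, 1 parity drive, and 1 spare.
fc_risk = data_at_risk_gb(6, 146)      # 876 GB for 146GB 10K RPM drives
sata_risk = data_at_risk_gb(6, 750)    # 4500 GB, i.e. 4.5 TB, for 750GB SATA drives

fc_rate = implied_rebuild_rate_mb_s(146, 46)        # 46-minute rebuild
sata_rate = implied_rebuild_rate_mb_s(750, 5 * 60)  # five-hour rebuild

print(f"FC:   {fc_risk} GB at risk, about {fc_rate:.1f} MB/s average rebuild rate")
print(f"SATA: {sata_risk} GB at risk, about {sata_rate:.1f} MB/s average rebuild rate")
```

The point of the comparison: the SATA drive holds roughly five times the data but rebuilds at a similar or lower average rate, so the window for a second failure grows with capacity.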
The XIV architecture doesn't use traditional RAID ranks or spare DDMs. Instead, data is carved up into 1MB objects, and each object is stored on two physically separate drives. In the event of a DDM loss, all the data is readable from the second copies that are spread across hundreds of drives. New copies are made on the empty disk space of the remaining system. This process can be done for a lost 1TB drive in under 30 minutes. A double-drive failure is highly unlikely in this case, but if it were ever to happen, most of the time you would not lose any data. In the worst cases, you would lose only a few GB of data, which could easily be identified, and the files could be recovered in less time than a traditional full-volume recovery from a double-drive failure on RAID5 (see update 2 below for more details).
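To make the idea concrete, here is a toy Python sketch of two-copy placement and rebuild. To be clear, this is not XIV's actual placement algorithm, which the press release does not describe. I am assuming simple pseudo-random placement, and the function names, the drive count of 200, and the chunk count are all hypothetical values chosen for illustration.

```python
import random
from collections import Counter


def place_chunks(num_chunks, num_drives, seed=0):
    """Assign each 1MB chunk to two distinct drives (assumed random placement)."""
    rng = random.Random(seed)
    return [tuple(rng.sample(range(num_drives), 2)) for _ in range(num_chunks)]


def rebuild_plan(placement, failed_drive, num_drives, seed=1):
    """For every chunk that lost a copy, pick a surviving source and a new target drive."""
    rng = random.Random(seed)
    plan = []
    for chunk_id, (a, b) in enumerate(placement):
        if failed_drive in (a, b):
            survivor = b if a == failed_drive else a
            # The new copy must not land on the failed drive or next to the surviving copy.
            candidates = [d for d in range(num_drives)
                          if d not in (failed_drive, survivor)]
            plan.append((chunk_id, survivor, rng.choice(candidates)))
    return plan


def chunks_lost_on_double_failure(placement, first, second):
    """Count chunks whose two copies sat on exactly the two failed drives."""
    return sum(1 for pair in placement if set(pair) == {first, second})


NUM_DRIVES = 200       # hypothetical system size
NUM_CHUNKS = 100_000   # hypothetical number of 1MB chunks

placement = place_chunks(NUM_CHUNKS, NUM_DRIVES)
plan = rebuild_plan(placement, failed_drive=0, num_drives=NUM_DRIVES)

sources = Counter(src for _, src, _ in plan)
targets = Counter(dst for _, _, dst in plan)
print(f"chunks to re-copy: {len(plan)}")
print(f"drives read from: {len(sources)}, drives written to: {len(targets)}")
print(f"chunks lost if drive 1 also failed: "
      f"{chunks_lost_on_double_failure(placement, 0, 1)}")
```

The design point the sketch illustrates is that the re-copy work is shared by many source and target drives in parallel, rather than funneled onto a single spare, and that a second failure only affects the chunks that happened to have both copies on those two particular drives.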
The XIV storage system was designed and optimized for random access to databases, online transaction processing (OLTP), and email repositories, as well as unstructured content like medical images, music, videos, Web pages, documents, and other discrete files (see update 1 below for more details).
IBM will continue to offer disk arrays like the IBM System Storage DS8000 and DS4800 for mixed random and sequential workloads, and offer the XIV system for this new surge in random-access workloads, databases, email repositories, and other digital content of unstructured data. Recognizing this trend, disk drive module manufacturers will phase out 10K RPM drives, and focus on 15K RPM for high sequential throughput, and wide-striped SATA for everything else.
Update 1: This blog post was originally written based on the version of the XIV box available as of January 2008, which was built by XIV prior to the IBM acquisition. IBM has since made a major revision, available August 2008, that addresses a variety of workloads, including database, OLTP, and email, as well as digital content and unstructured files. Contact your IBM representative or IBM Business Partner for the latest details!
Update 2: We have learned a lot in the last two years. Check out my update: [Double Drive Failure Debunked: XIV Two Years Later] to see that XIV is far more reliable than people originally believed!
Bottom line, IBM continues to celebrate the new year, while the EMC folks in Hopkinton, MA will continue to nurse their hangovers. Now that's a good way to start the new year!
technorati tags: Janus, two-faced, Roman god, Roger Von Oech, IBM, RAID5, XIV, EMC, Moshe Yanai, Mark Twomey, StorageZilla, NEXTRA, double-drive failure, rebuild, HDD, DDM, digital content, unstructured data