Tomorrow, I will be presenting at the STU Orlando for Storage and Cognitive Systems (formerly POWER servers). This conference will be held in New Orleans, Louisiana, October 16-20, 2017.
Here is my speaking schedule:
- The Seven Tiers of Business Continuity and Disaster Recovery (BC/DR)
- IBM's Cloud Storage Options
- Introduction of IBM Cloud Object Storage System and its Applications (powered by Cleversafe)
- The Pendulum Swings Back -- Understanding Converged and Hyperconverged Integrated Systems
- New generation of storage tiering: Simpler management, lower costs, and increased performance
- Introduction of IBM Cloud Object Storage System and its Applications (powered by Cleversafe) **repeat**
- IBM Spectrum Scale for File and Object storage
If these topics seem familiar, I have presented them at prior events earlier this year, including the STU Orlando in Orlando Florida, and the one in Melbourne Australia. However, I have made updates! New products have been announced!
If you are planning to attend, here are some of my past blog posts to help you get up to speed:
- STU Orlando - Orlando, Florida
This event was a large 5-day event to replace the technical portion of IBM's previous "Edge" conference.
- STU Orlando - Melbourne, Australia
This event was a smaller 3-day event to bring STU to other countries. We used to call these "Edge Comes to You" events, but now we call them "IBM Systems Technical University" just like the ones in the USA.
The STU at New Orleans will be a 5-day event. Instead of a "Meet the Experts" session, they are having a "Poster Session" in its place. Many of the posters will have QR codes, so make sure you have a "QR Scanner" application installed on your smartphone so you can scan them quickly!
Everyone, speakers and attendees alike, should consider making a QR code for themselves for this event. Go to [any number of websites] that generate a QR code. This could be a VCF file with all of your contact information, a link to your blog or website, or a pointer to your presentations on Slideshare or IBM@Box.
The next time someone at the event asks for this information, display the QR code on your smartphone, and let them scan it. Alternatively, you can send the image via MMS text message.
(My QR Code is fully functional, so go ahead and practice scanning it with your smartphone!)
I arrive in New Orleans Sunday afternoon, so if you are in town, give me a shout! Or tweet me at @az990tony
technorati tags: IBM, #IBMTechU, New Orleans, Orlando Florida, Melbourne Australia, Poster Session, QR Code
I have been blogging for more than 10 years now, so I am no stranger to commenting on competitive comparisons. In some cases, I am setting the record straight, and other times, poking fun at competitor results, claims or conclusions. This comparison from Brian Carmody was too juicy to ignore.
(FCC Disclosure: I work for IBM. I have no financial interest in Infinidat, Dell EMC, nor Pure Storage, mentioned in this post. I do have friends and former co-workers who now work for Infinidat. This blog post can be considered a "paid celebrity endorsement" for IBM FlashSystem products.)
Fellow blogger Brian Carmody, formerly with IBM but now Chief Technology Officer at a startup called Infinidat, wrote [Flash is not Fast, and the Sky is Falling].
Here is an excerpt, I have added (Infinidat) wherever Brian says "we" just so there is no confusion:
"... So last week we (Infinidat) finally got around to running the same profiles against an INFINIDAT F6230 in our Waltham Solution Center, configured with 1.1TB of DDR-4 DRAM, 200TB TLC NAND, and 480 3TB Nearline HDDs.
In summary, we (Infinidat) wrecked the Pure and EMC systems. Here are the results side by side with EMC's data:
||EMC Unity 600F
|16K IOPS (80% Read)
||9x Pure, 5x Unity
|256K BW MBps
||10.6x Pure, 3x Unity
||4.5x Pure, 1.6x Unity
|Steady-state latency (ms)
||1/7 Pure, 1/2 Unity
By the way, we (Infinidat) took the liberty of running the test with a 200TB data set instead of Pure and EMC's 50TB because modern workloads require performance at scale, and we ran it with in-line compression enabled because our compression algorithm doesn't hurt performance.
This was an interesting test to run, and we (Infinidat) hope it helps the storage industry move away from media type wars and benchmarks (you will lose every time on performance if INFINIDAT is in the mix) ..."
Notice anything wrong here? Anything missing?
The Tortoise beat "Hare 1" and "Hare 2", but did not invite the Cheetah to the race?
Brian was smart enough not to compare their product to anything from IBM. IBM has a wide variety of All-Flash Arrays, including the DS8880F models, the Storwize V7000F and V5030F models, and Elastic Storage Server models. However, for this workload, IBM would probably recommend the FlashSystem V9000, A9000 or A9000R.
Any All-Flash Array with a steady-state latency of 2 milliseconds or greater is embarrassing, but then Infinibox is not really an All-Flash Array.
The architecture of their Infinibox appears much like the original XIV. It has a mix of DRAM memory and SSD cache, combined with spinning drives. It offers only compression, not data deduplication. Unlike the IBM XIV powered by six to 15 servers, the Infinibox appears under-powered with just three servers.
The Infinibox uses software-based in-line compression, which must put a huge tax on the few CPUs they have in those three servers. Infinidat chose not to compress the data in their cache, probably to reduce the additional overhead on their over-taxed CPUs.
The IBM FlashSystem V9000 has an innovative design, based on IBM Spectrum Virtualize, the mature software that you also find in the IBM SAN Volume Controller and Storwize family of products.
The FlashSystem V9000 offers hardware-accelerated compression. IBM takes advantage of the integrated Intel QuickAssist co-processor, which runs the compression algorithm 20 times faster than a standard Intel Broadwell CPU.
IBM compresses its cache, using a two-tier approach. The "upper cache" receives the data uncompressed, so that it can then tell the application to continue, for fastest turn-around time. Then the data is compressed, and stored in the "lower cache", optimizing the value and benefits of DRAM memory. Many databases get up to 80 percent savings, resulting in a 5-to-1 benefit in DRAM cache memory.
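The 5-to-1 figure follows directly from the savings percentage. Here is a minimal sketch of that arithmetic in Python. The function name `capacity_multiplier` and the 50 and 70 percent sample values are made up for illustration; this is not an IBM tool.

```python
def capacity_multiplier(savings_fraction):
    """Effective capacity gained when data shrinks by `savings_fraction`.

    A savings of 0.80 means the data occupies 20 percent of its original
    size, so the same DRAM holds 1 / 0.20 = 5 times as much.
    """
    if not 0 <= savings_fraction < 1:
        raise ValueError("savings_fraction must be in [0, 1)")
    return 1 / (1 - savings_fraction)


# 0.80 is the figure quoted above; the other values are only for comparison.
for savings in (0.50, 0.70, 0.80):
    print(f"{savings:.0%} savings -> {capacity_multiplier(savings):.1f}-to-1")
```

Note how quickly the benefit grows: going from 70 to 80 percent savings raises the multiplier from about 3.3 to 5.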
The IBM FlashSystem A9000 and A9000R also have an innovative design, based on IBM Spectrum Accelerate, the code originally developed for the IBM XIV storage system.
(Fun fact: Infinidat's founder, [Moshe Yanai], was formerly the founder and designer of XIV, and it appears that Infinidat is just a re-design of old XIV technology architecture, re-packaged with a few differences. Since Moshe left, IBM has drastically enhanced the IBM XIV.)
Like the IBM Spectrum Virtualize family, the IBM FlashSystem A9000 and A9000R have hardware-accelerated in-line compression, and a two-tier approach to cache. The "upper cache" receives the data uncompressed, then the data is compressed and deduplicated, and stored in the "lower cache", optimizing the value and benefits of DRAM memory.
The IBM FlashSystem A9000 and A9000R also offer in-line data deduplication. Modern workloads are virtualized, and Virtual Machine (VM) and Virtual Desktop Infrastructure (VDI) get significant benefits from data deduplication. Infinidat does not play here. For the FlashSystem A9000, most of the metadata related to data deduplication is in cache, minimizing the overhead.
IBM FlashSystem A9000 and A9000R have full performance that blows these published Infinibox results away WITH compression and deduplication turned on.
Brian ran a workload that used the DRAM and SSD cache exclusively, eliminating the reality that any REAL WORLD workload would have to tap into those much slower spinning drives. This is not really a side-by-side benchmark. He is comparing his live run on Infinibox to published numbers from a previous comparison run on a completely different set of data.
This raises the question, why pay for all those spinning drives at all, if you plan to only use the DRAM and Flash storage for your workloads?
A week later, Brian followed up with another post [The INFINIDAT Challenge], acknowledging his comparison was bogus. Here's an excerpt. Again, I have added (Infinidat) wherever Brian is referring to his employer just so there is no confusion:
"... It's not likely that a room full of storage engineers will ever agree on parameters for a synthetic benchmark since storage evaluations are competitive and control of test parameters will invariably predetermine the 'winner'. However, I hope we can all agree that synthetic benchmarks are a waste of time, and that real world performance is what matters in the data center.
So, what can we (Infinidat) do about it?
We (Infinidat) cordially invite every enterprise storage customer who wants lower latency and lower storage cost to visit [FasterThanAllFlash.com] and sign up for The INFINIDAT Challenge.
- We (Infinidat) will Give you an Infinibox system to test
- We (Infinidat) will Help you clone and test your environment with Infinibox
- We (Infinidat) Guarantee your applications will run faster on Infinibox than your All-Flash Array.
- If we (Infinidat) fail, we'll take the system back and Donate $10,000 to the charity of your choice.
- If our technology delivers, you can keep the system, and we'll (Infinidat) Donate $10,000 in your name to the charity of our choice (The American Cancer Society).
Thanks again to all who participated in the dialog over the past week. I know the post generated some controversy. Traditional storage companies are fighting for their lives trying to keep enterprise storage expensive; indeed their business models are predicated upon maintaining price levels from a bygone era...."
As a consolidation play offering a full range of data services, I do not see this Infinibox working out. Clients who have the Infinibox tell me that performance deteriorates in REAL WORLD workloads as you add more data to the unit.
The Infinibox seems fine for workloads that do not demand high performance, so I was surprised Brian compared it to All-Flash arrays. The Infinibox is out of its league!
(To be fair, Pure Storage and EMC XtremIO aren't really in the same league as IBM FlashSystem, either, given that both of those products are based on commodity SSD. IBM FlashSystem models consistently deliver 4 to 10 times lower latency than these commodity SSD-based competitors.)
The Infinibox also lacks features many people expect in an Enterprise-class storage array, like Call-Home capability to identify problems quickly, and Synchronous remote mirroring for disaster recovery. It is common for startups like Infinidat to deliver a [Minimum Viable Product] as their first offering.
To paraphrase Brian himself, your applications will lose every time on performance if INFINIDAT is in your datacenter.
technorati tags: IBM, FlashSystem, A9000, A9000R, Brian Carmody, Infinidat, Infinibox, Pure Storage, EMC, EMC Unity, Infinidat F6230, Infinibox F6230, IBM XIV, Moshe Yanai, SSD, VDI, All-Flash Array, AFA, Call-Home, Synchronous Mirror, Disaster Recovery, Minimum Viable Product, Spectrum Virtualize, Intel QuickAssist, American Cancer Society
This week, I was reminded that back in 2011, Watson beat two human players, Ken Jennings and Brad Rutter on the TV game show "Jeopardy!" On his last response, Ken wrote "I for one welcome our new computer overlords." With IBM investing heavily in Cognitive Solutions, should people be worried, or welcome the new technology?
Back in 1950, Isaac Asimov proposed the "Three Laws of Robotics":
- A robot may not injure a human being or, through inaction, allow a human being to come to harm.
- A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
- A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
Let's take a look at how Artificial Intelligence has been represented in the movies over the past few decades. I have put these in chronological order when they were initially released in the United States.
(FCC Disclosure and Spoiler Alert: I work for IBM. This blog post can be considered a "paid celebrity endorsement" for cognitive solutions made by IBM. While IBM may have been involved or featured in some of these movies, I have no financial interest in them. I have seen them all and highly recommend them. I am hoping that you have all seen these, or are at least familiar enough with their plot lines that I am not spoiling them for you.)
- 2001: A Space Odyssey
Back in 1968, Stanley Kubrick and Arthur C. Clarke made a masterpiece movie about a mysterious monolith floating near Jupiter. To investigate, a crew of human beings takes a space ship managed by a sentient computer named [HAL-9000].
(Many people thought HAL was a subtle reference to IBM. Stanley Kubrick clarifies:
"By the way, just to show you how interpretation can sometimes be bewildering: A cryptographer went to see the film, and he said, 'Oh. I get it. Each letter of HAL's name is one letter ahead of IBM. The H is one letter in front of I, the A is one letter in front of B, and the L is one letter in front of M.'
Now this is a pure coincidence, because HAL's name is an acronym of heuristic and algorithmic, the two methods of computer programming...an almost inconceivable coincidence. It would have taken a cryptographer to have noticed that."
Source: The Making of 2001: A Space Odyssey, Eye Magazine Interview, Modern Library, pp. 249)
The problem arises when HAL-9000 refuses commands from the astronauts. The astronauts are not in control: HAL-9000 was given separate orders from ground control back on Earth, and it has determined it would be more successful without the crew.
- Westworld
In 1973, Michael Crichton wrote and directed this movie about an amusement park with three uniquely themed areas: Medieval World, Roman World, and Westworld. Robots are used to staff the parks to make them more realistic, interacting with the guests in character appropriate for each time period.
A malfunction spreads like a computer virus among the robots, causing them to harm or kill the park's guests. Yul Brynner played a robot called simply "the Gunslinger". Equipped with fast reflexes and infrared vision, the Gunslinger proves especially deadly!
(Michael Crichton also wrote "Jurassic Park", which had a similar story line involving dinosaurs with catastrophic results!)
Last year, HBO launched a TV series called "Westworld", based on the same themes covered in this movie. The first season of 10 episodes just finished, and the next season is scheduled for 2018.
- Blade Runner
Directed by Ridley Scott, this 1982 movie stars Harrison Ford as Rick Deckard, a law enforcement officer. Rick is tasked to hunt down and "retire" four cognitive androids called "replicants" that have killed some humans and are now in search of their creator, Dr. Eldon Tyrell.
(I enjoy the euphemisms used in these movies. Terms like kill, murder or assassinate apply to humans but not machines. The word "retire" in this movie refers to destruction of the robots. As we say in IBM, "retirement is not something you do, it is something done to you!")
Destroying machines does not carry the same emotional toll as killing humans, but this movie explores that empathy. A sequel called "Blade Runner 2049" will be released later this year.
- WarGames
In this 1983 movie, Matthew Broderick plays David, a young high school student who hacks into the U.S. Military's War Operation Plan Response (WOPR) computer. The WOPR was designed to run various strategic games, including war game simulations, learning as it goes. David decides to initiate the game "Global Thermonuclear War", and the military responds as if the threats were real.
Can the computer learn that the only way to win a war is not to wage it in the first place? And if a computer can learn this, can our human leaders learn this too?
- Terminator
In this series of movies, a franchise spanning from 1984 to 2009, the US Military builds a defense grid computer called [Skynet]. After cognitive learning at an alarming rate, Skynet becomes self-aware, and decides to launch missiles, starting a nuclear war that kills over 3 billion people.
Arnold Schwarzenegger plays the Terminator model T-800, a cognitive solution in human form designed by Skynet to finish the job and kill the remainder of humanity.
- I, Robot
In this 2004 movie, Will Smith plays Del Spooner, a technophobic cop who investigates a crime committed by a cognitive robot.
(Many people associate the title with author Isaac Asimov. A short story called "I, Robot" written by Earl and Otto Binder was published in the January 1939 issue of 'Amazing Stories', well before the unrelated and more well-known book 'I, Robot' (1950), a collection of short stories, by Asimov.
Asimov admitted to being heavily influenced by the Binder short story. The title of Asimov's collection was changed to "I, Robot" by the publisher, against Asimov's wishes. Source: IMDB)
Del Spooner uncovers a bigger threat to humanity, not just a single malfunctioning robot, but rather the Virtual Interactive Kinesthetic Interface, or simply VIKI for short, a cognitive solution that controls all robots. VIKI interprets Asimov's three laws in a manner not originally intended.
- Ex Machina
In this 2015 movie, Domhnall Gleeson plays Caleb, a 26-year-old programmer at the world's largest internet company. Caleb wins a competition to spend a week at a private mountain retreat. However, when Caleb arrives he discovers that he must interact with Ava, the world's first true artificial intelligence, a beautiful robot played by Alicia Vikander.
(The title derives from the Latin phrase "Deus ex machina," meaning "a god from the machine," a phrase that originated in Greek tragedies. Source: IMDB)
Nathan, the reclusive CEO of this company, relishes this opportunity to have Caleb participate in this experiment, explaining how Artificial Intelligence (AI) will transform the world.
(The three main characters all have appropriate biblical names. Ava is a form of Eve, the first woman; Nathan was a prophet in the court of David; and Caleb was a spy sent by Moses to evaluate the Promised Land. Source: IMDB)
The premise is based in part on the famous [Turing Test], developed by Alan Turing. This is designed to test a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human.
Movies that depict the bad guys as a particular nationality, ethnicity or religion may be offensive to some movie audiences. Instead, having dinosaurs, monsters, aliens or robots provides a villain that all people can fear equally. This helps movie makers reach a more global audience!
Of course, if robots, androids and other forms of Artificial Intelligence did exactly what humans expect them to, we would not have the tense, thrilling action movies to watch on the big screen.
This is not a complete list of movies. Enter in the comments below your favorite movie that features Artificial Intelligence and why it is your favorite!
technorati tags: IBM, Watson, Jeopardy, Ken Jennings, Brad Rutter, computer overlords, cognitive solutions, Isaac Asimov, three laws of robots, Artificial Intelligence, Stanley Kubrick, Arthur C. Clarke, HAL 9000, Space Odyssey, Westworld, Michael Crichton, Yul Brynner, Jurassic Park, HBO, Blade Runner, Ridley Scott, Harrison Ford, WarGames, Matthew Broderick, WOPR, Terminator, Skynet, Arnold Schwarzenegger, I, Robot, Will Smith, VIKI, Ex Machina, Domhnall Gleeson, Alicia Vikander, Turing Test, Alan Turing
This week, I am in Las Vegas for [Edge 2016], IBM's Premiere IT Infrastructure conference of the year. Here is my recap of breakout sessions for Monday, Sep 19, 2016:
- How do you store a Zettabyte? IBM and Microsoft Know...
A [Zettabyte] is a million Petabytes, or a billion Terabytes, of data. Most clients I deal with have less than 10 PB of centralized storage in their data center, but there are a few that have much larger data repositories.
Ed Childers, IBM STSM and manager for Tape and LTFS development, and Aaron Ogus, Microsoft Architect, discussed different solutions developed by IBM and Microsoft. IBM's solution has been productized, and is available as IBM Spectrum Scale and IBM Spectrum Archive. Microsoft's solution is not productized, but is being "operationalized" to be used within Microsoft's Azure Cloud.
Not surprisingly, to be able to store a Zettabyte of data, you have to be creative and cost-effective with storage media. The current winner is magnetic tape, which continues to be 20 times less expensive than disk. IBM developed the Linear Tape File System (LTFS) and then shared it with other leading IT vendors. Ed also covered some future storage media developments, from using Macro-molecular strands of DNA, to Phase Change Memory (PCM).
- All Flash is not Created Equal - Contrasting IBM FlashSystem with Solid State Drives (SSD)
Many IBM FlashSystem presentations focus on the product, but don't explain the underlying technology, specifically what differentiates IBM FlashSystem from substantially slower competitive alternatives like EMC XtremIO and Pure Storage that are based instead on fallible commodity Solid State Drives (SSD).
By working closely with our chip vendor, Micron, IBM was able to improve the write endurance of these Multi-level cell (MLC) chips by 9.4x, and reduce write amplification by 45 percent.
I explained IBM's clever asymmetrical wear-level balancing, heat segregation, read disturb mitigation, voltage level shifting, and health binning, all of which contribute to the performance and reliability of this solution. IBM's innovative Error Correcting Code provides LDPC-like correction strength but at much faster BCH-like latency speed.
This was a popular session. Despite being moved to a much larger room, they still had to turn people away, so I will be repeating this session on Wednesday, 11:00am.
- Real-time Compression: Bendigo and Adelaide Bank's Perspective
James Harris, Senior Storage Systems Specialist for [Bendigo and Adelaide Bank], presented his success story with the use of Real-time Compression. Oracle RAC databases got 60-70 percent savings. SQL databases got 70-80 percent savings. VMware VMFS datastores average 50 percent savings. For IBM i, he is getting 60-70 percent savings for SYSBAS, and over 70 percent savings for the rest of his IBM i production data.
As a result, the bank has not had to make any Capital Expenditures (CAPEX) for disk for 2-3 years since they started compressing in 2014.
- Storage Options for Big Data and Analytics: IBM FlashSystem or Traditional Disk Systems?
Eric Sperley, IBM Software Defined Storage Architect, presented the basics of Hadoop and the Hadoop File System (HDFS), then explained how IBM Spectrum Scale, when combined with the right tiers of flash and disk technology, could be used to optimize an environment for big data analytics.
- Solutions EXPO
The Solutions EXPO is open all day, for people to visit the booths in between sessions. I stopped in for the evening reception. This is a great way to catch up on the latest products, re-connect with some clients or colleagues that I haven't seen in person for a while, and meet new friends.
Shown here is Angie Welchert, who just started working for IBM a few years ago! I took her around to introduce her to some IBM executives at the Solutions EXPO.
It was a long and productive day.
technorati tags: IBM, #IBMedge, #IBMstorage, Ed Childers, IBM tape, LTFS, Aaron Ogus, Microsoft, Flash, FlashSystem, FlashSystem 900, FlashSystem V9000, FlashSystem A9000, FlashSystem A9000R, Solid State Drive, SSD, Micron, MLC, LDPC, BCH, James Harris, Real-time Compression, Eric Sperley, Hadoop, HDFS, Solutions EXPO, Angie Welchert
As we get to larger and larger flash and spinning disk drives, a common question I get is whether to use RAID-5 versus RAID-6. Here is my take on the matter.
- A quick review of basic probability statistics
Failure rates are based on probabilities. Take for example a traditional six-sided die, with numbers one through six represented as dots on each face. What are the chances that we can roll the die several times in a row and never roll a six? You might think that if there is a 1/6 (16.7 percent) chance to roll a six, then you would be guaranteed to hit a six after six rolls. That is not the case.
|# of Rolls
||Probability of no sixes (percent)
So, even after 24 rolls, there is more than 1 percent chance of not rolling a six at all. The formula is (1-1/6) to the 24th power.
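The formula is easy to try for yourself. Here is a minimal Python sketch; the function name `prob_no_six` and the particular roll counts are my own picks for illustration.

```python
def prob_no_six(rolls):
    """Probability that `rolls` independent rolls of a fair die show no six."""
    return (1 - 1 / 6) ** rolls


# Roll counts chosen for illustration; 6 and 24 are the ones discussed above.
for rolls in (1, 6, 12, 24):
    print(f"{rolls:2d} rolls: {prob_no_six(rolls):.2%} chance of no sixes")
```

After six rolls there is still roughly a one-in-three chance of having seen no six, and after 24 rolls about 1.26 percent.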
Let's say that rolling one to five is success, and rolling a six is a failure. Being successful requires that no sixes appear in a sequence of events. This is the concept I will use for the rest of this post. If you don't care for the math, jump down to the "Summary of Results" section below.
- Error Correcting Codes (ECC) and Unrecoverable Read Errors (URE)
When I speak to my travel agent, I have to provide my six-character [Record Locator] code. Pronouncing individual letters can be error prone, so we use a "spelling alphabet".
The International Radiotelephony Spelling Alphabet, sometimes known as the [NATO phonetic alphabet], has 26 code words assigned to the 26 letters of the English alphabet in alphabetical order as follows: Alfa, Bravo, Charlie, Delta, Echo, Foxtrot, Golf, Hotel, India, Juliett, Kilo, Lima, Mike, November, Oscar, Papa, Quebec, Romeo, Sierra, Tango, Uniform, Victor, Whiskey, X-ray, Yankee, Zulu.
||Foxtrot Golf Mike Oscar Victor Whiskey
|Foxtrot Gold Mine Oscar Vector Whisker
|Boxcart Golf Miko Boxcart Victor Whiskey
Having five or so characters to represent a single character may seem excessive, but you can see that this can be helpful when the communications link has static, or background noise is loud, as is often the case at the airport!
If spelling words are misheard, either (a) they are close enough, like "Gold" for "Golf" or "Whisker" for "Whiskey", that the correct word is known, or (b) they are not close enough, like "Boxcart", which could refer to either "Foxtrot" or "Oscar", but we can at least detect that a failure occurred.
For data transfers, or data that is written, and later read back, the functional equivalent is an Error Correcting Code [ECC], used in transmission and storage of data. Some basic ECC can correct a single bit error, and detect double bit errors as failures. More sophisticated ECC can correct multiple bit errors up to a certain number of bits, and detect most anything worse.
When reading a block, sector or page of data from a storage device, if the ECC detects an error, but is unable to correct the bits involved, we call this an "Unrecoverable Read Error", or URE for short.
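To make "correct a single bit error, detect a double bit error" concrete, here is a small Python sketch. It uses an extended Hamming code (four data bits protected by four check bits), which is my choice for illustration only. Real storage devices use far stronger codes over much larger blocks, and the function names `encode` and `decode` are hypothetical.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit codeword (index 0 = overall parity)."""
    code = [0] * 8
    # Data bits go in positions 3, 5, 6, 7; parity bits in positions 1, 2, 4.
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the total number of 1-bits even
    return code


def decode(code):
    """Return (status, nibble); nibble is None when the error is uncorrectable."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of the positions holding a 1-bit
    parity_broken = sum(code) % 2 == 1
    if syndrome == 0 and not parity_broken:
        status = "ok"
    elif parity_broken:
        code[syndrome] ^= 1  # single flipped bit: syndrome is its position
        status = "corrected"
    else:
        return "uncorrectable", None  # two flipped bits: detected, not fixable
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


word = encode(0b1011)
word[5] ^= 1          # static on the line flips one bit
print(decode(word))   # ('corrected', 11)
word[2] ^= 1          # a second flipped bit
print(decode(word))   # ('uncorrectable', None)
```

The "uncorrectable" outcome is the small-scale analogue of a URE: the code knows something is wrong, but cannot tell what the data should have been.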
- Bit Error Rate (BER)
Different storage devices have different block, sector or page sizes. Some use 512 bytes, 4096 bytes or 8192 bytes, for example. To normalize likelihood of errors, the industry has simplified this to a single bit error rate or BER, represented often as a power of 10.
||Bit Error Rate per read (BER)
|Consumer HDD (PC/Laptops)
|Enterprise 15k/10k/7200 rpm
|Solid-State and Flash
|IBM TS1150 tape
In other words, the chance that a bit is unreadable on optical media is 1 in 10 trillion (1E13), on enterprise 15k drives is 1 in 10 quadrillion, and on LTO-7 tape is 1 in 10 quintillion.
There are eight bits per byte, so reading 1 GB of data is like rolling the die eight billion times. The chance of successfully reading 1GB on DVD, then would be (1 - 1/1E13) to the 8 billionth power, or 99.92 percent, or conversely a 0.08 percent chance of failure.
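The same calculation works for any device and any amount of data. Here is a minimal sketch in Python. The function name `read_failure_probability` is made up for illustration, and I use `math.log1p` and `math.expm1` because bit error rates this small get lost to rounding if you compute `1 - rate` directly in floating point.

```python
import math


def read_failure_probability(num_bytes, bit_error_rate):
    """Chance of at least one unreadable bit when reading `num_bytes` bytes."""
    bits = 8 * num_bytes
    # Equivalent to 1 - (1 - bit_error_rate) ** bits, but numerically stable.
    return -math.expm1(bits * math.log1p(-bit_error_rate))


# 1 GB read from optical media at a bit error rate of 1 in 1E13.
p_fail = read_failure_probability(1e9, 1e-13)
print(f"failure: {p_fail:.2%}, success: {1 - p_fail:.2%}")
```

Plug in your own capacity and the bit error rate from the drive's data sheet to see how the risk scales with the amount of data read.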
- Probability of drive failure
An often cited resource for the probability of drive failure is the [Failure Trends in a Large Disk Drive Population (13-page PDF)] by Eduardo Pinheiro, Wolf-Dietrich Weber and Luiz Andre Barroso of Google Inc.
In this paper, Google had studied drive failure using an "Annual Failure Rate" or AFR. Here are two graphs from this paper:
This first graph shows AFR by age. Some drives fail in their first 3-6 months, often called "infant mortality". Then they are fairly reliable for a few years, down to 1.7 percent, then as they get older, they start to fail more often, up to 8.3 percent.
This second graph factors in how busy the drives are. Dividing the drive set into quartiles, "Low" represents the least busy drives (the bottom quartile), "Medium" represents the median two quartiles, and "High" represents the busiest drives, the top quartile. Not surprisingly, the busiest drives tend to fail more often than medium-busy drives.
Given an AFR, what are the chances a drive will fail in the next hour? There are 8,766 hours per year, so the success of a drive over the course of a year is like rolling the die 8,766 times. This allows us to calculate a "Drive Error Rate" or DER:
||Drive Error Rate per hour (DER)
For example, an AFR=3 drive has a 1 in 287,800 chance of failing in a particular hour. The probability this drive will fail in the next 24 hours would be like rolling the die 24 times. The formula is (1-1/287,800) to the 24th power, resulting in a failure rate of roughly 0.008 percent.
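Converting an AFR into an hourly rate takes only a few lines. Here is a minimal Python sketch, assuming failures are independent from hour to hour; the function names are hypothetical.

```python
import math

HOURS_PER_YEAR = 8766  # 365.25 days x 24 hours, as used above


def hourly_failure_probability(afr_percent):
    """Per-hour failure probability implied by an Annual Failure Rate."""
    # Solve (1 - p_hour) ** HOURS_PER_YEAR == 1 - AFR for p_hour.
    return -math.expm1(math.log1p(-afr_percent / 100) / HOURS_PER_YEAR)


def failure_within(hours, afr_percent):
    """Chance that a drive fails at some point in the next `hours` hours."""
    p_hour = hourly_failure_probability(afr_percent)
    return -math.expm1(hours * math.log1p(-p_hour))


p_hour = hourly_failure_probability(3)
print(f"AFR=3: about 1 in {1 / p_hour:,.0f} per hour")
print(f"Next 24 hours: {failure_within(24, 3):.4%}")
```

As a sanity check, `failure_within(8766, 3)` gives back the 3 percent annual rate we started with.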
- RAID-5 considerations
Let's take a typical RAID-5 rank with 600GB drives at 15K rpm, in a 7+P RAID-5 configuration.
During normal processing, if a URE occurs on an individual drive, RAID comes to the rescue. The system can rebuild the data from parity, and correct the broken block of data.
When a drive fails, however, we don't have this rescue, so a URE that occurs during the rebuild process is catastrophic. How likely is this? Data is read from the other seven drives, and written to a spare empty drive. At 8 bits per byte, reading 4200 GB of data is rolling the die 33.6 trillion times. The formula is then (1-1/1E16) to the 33.6 trillionth power, or approximately a 0.372 percent chance of a URE during the rebuild process.
The time to perform the rebuild depends heavily on the speed of the drive, and how busy the RAID rank is doing other work. Under heavy load, the rebuild might only run at 25 MB/sec, and under no workload perhaps 90 MB/sec. If we take a 60 MB/sec moderate rebuild rate, then it would take 10,000 seconds or nearly 3 hours. The chance that any of the seven drives fails during these three hours, at AFR=10, rolling the DER die (7 x 3) 21 times, is 0.025 percent.
It is nearly 15 times more likely to get a URE failure than a second drive failure. A rebuild failure would happen with either of these, with a probability of 0.397 percent.
The situation gets worse with higher capacity Nearline drives. Let's do a RAID-5 rank with 6TB Nearline drives at 7200 rpm, in a 7+P configuration. The likelihood of URE reading 42 TB of data, is rolling the die 336 trillion times, or approximately 3.66 percent chance of URE failure. Yikes!
The time to rebuild is also going to take longer. A moderate rebuild rate might only be 30 MB/sec, so that rebuilding a 6TB drive would take 55 hours. The chance that one of the other seven drives fails during these 55 hours, assuming again AFR=10, is 0.462 percent.
This time, a URE failure is nearly eight times more likely than a double drive failure. The chance of a rebuild failure is 4.12 percent. Good thing you backed up to tape or object storage!
The math can be done easily using modern spreadsheet software. The URE failure rate is based on the quantity of data read from the remaining drives, so a 4+P with 600GB drives is the same as 8+P with 300GB drives. Both read 2.4 TB of data to recalculate from parity. The Double Drive failure rate is based on the number of drives being read times the number of hours during the rebuild. Slower, higher capacity drives take longer to rebuild. However, in both the 15K and 7200rpm examples, the chance of a URE failure was 8 to 15 times more likely than double drive failure.
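If you prefer code to a spreadsheet, the RAID-5 calculation can be sketched in Python as follows. The function names are hypothetical, and the inputs (7 surviving drives, 3-hour and 55-hour rebuilds, AFR=10, bit error rate of 1 in 1E16) are the ones used in this section. One caveat: a rate as small as 1E-16 is below the rounding precision of double-precision arithmetic near 1.0, so a direct `(1 - 1e-16) ** n` overstates the URE probability by about 11 percent. This sketch uses `math.log1p` and `math.expm1` instead, so its URE figures can come out somewhat lower than spreadsheet results.

```python
import math

HOURS_PER_YEAR = 8766


def at_least_one(p_event, trials):
    """P(at least one event in `trials` independent trials), computed stably."""
    return -math.expm1(trials * math.log1p(-p_event))


def hourly_failure(afr_percent):
    """Per-hour drive failure probability implied by an annual failure rate."""
    return -math.expm1(math.log1p(-afr_percent / 100) / HOURS_PER_YEAR)


def raid5_rebuild_risk(drive_bytes, surviving_drives, rebuild_hours,
                       ber=1e-16, afr_percent=10):
    """Return (P(URE), P(second drive failure), P(either)) for one rebuild."""
    # Every bit on every surviving drive must be read to rebuild from parity.
    bits_read = 8 * drive_bytes * surviving_drives
    p_ure = at_least_one(ber, bits_read)
    # Each surviving drive "rolls the die" once per hour of rebuild time.
    p_drive = at_least_one(hourly_failure(afr_percent),
                           surviving_drives * rebuild_hours)
    p_either = 1 - (1 - p_ure) * (1 - p_drive)
    return p_ure, p_drive, p_either


for label, size, hours in (("600GB 15K", 600e9, 3), ("6TB 7200rpm", 6e12, 55)):
    p_ure, p_drive, p_either = raid5_rebuild_risk(size, 7, hours)
    print(f"{label}: URE {p_ure:.3%}, drive {p_drive:.3%}, either {p_either:.3%}")
```

Whatever the exact figures, the ordering is the same: a URE during rebuild is far more likely than a second drive failure, and both risks grow with drive capacity.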
- RAID-6 considerations
Many of the problems associated with RAID-5 above can be mitigated with RAID-6.
After a single drive fails, any URE during rebuild can be corrected from parity. However, if a second drive fails during the rebuild process, then a URE on the remaining drives would be a problem.
Let's start with the 600GB 15k drives in a 6+P+Q RAID-6 configuration. The chance of a second drive failing is 0.0252 percent, as we calculated above. The likelihood of a URE is then based on the remaining six drives, 3600 GB of data. Doing the math, that is a 0.319 percent chance. So, the chance of a URE failure during a RAID-6 rebuild is the probability of both occurring, roughly 0.0000806 percent. Far more reliable than RAID-5!
Likewise, we can calculate the probability of a triple drive failure. After the second drive fails, the likelihood of a third drive at AFR=10, results in 0.00000546 percent.
Combining these, the chance of rebuild failure is 0.0000861 percent.
Switching to 6 TB Nearline drives, in a 6+P+Q RAID-6 configuration, we can do the math in the same manner. The likelihood of URE and two drives failing is 0.0145 percent, and for triple drive failure is 0.00183 percent. Chance of rebuild failure is 0.0163 percent.
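The RAID-6 case multiplies probabilities instead of adding them, which is where the extra zeros come from. Here is a minimal Python sketch under the same assumptions as this section (AFR=10, bit error rate of 1 in 1E16, 3-hour and 55-hour rebuilds). The function names are hypothetical, and because the URE term is computed with `math.log1p` and `math.expm1` for numerical stability, it can come out somewhat lower than spreadsheet results.

```python
import math

HOURS_PER_YEAR = 8766


def at_least_one(p_event, trials):
    """P(at least one event in `trials` independent trials), computed stably."""
    return -math.expm1(trials * math.log1p(-p_event))


def hourly_failure(afr_percent):
    """Per-hour drive failure probability implied by an annual failure rate."""
    return -math.expm1(math.log1p(-afr_percent / 100) / HOURS_PER_YEAR)


def raid6_rebuild_risk(drive_bytes, drives_after_two_failures, rebuild_hours,
                       ber=1e-16, afr_percent=10):
    """Return (P(2nd failure and URE), P(triple failure), sum of the two)."""
    p_hour = hourly_failure(afr_percent)
    # After the first failure, one more drive is present than after the second.
    p_second = at_least_one(
        p_hour, (drives_after_two_failures + 1) * rebuild_hours)
    # With two drives gone, a URE on the remaining drives cannot be corrected.
    p_ure = at_least_one(ber, 8 * drive_bytes * drives_after_two_failures)
    # A third drive failure during the rebuild is also fatal.
    p_third = at_least_one(p_hour, drives_after_two_failures * rebuild_hours)
    p_ure_path = p_second * p_ure
    p_triple = p_second * p_third
    # Summed as in the text above; a slight overestimate since both can occur.
    return p_ure_path, p_triple, p_ure_path + p_triple


for label, size, hours in (("600GB 15K", 600e9, 3), ("6TB 7200rpm", 6e12, 55)):
    ure_path, triple, total = raid6_rebuild_risk(size, 6, hours)
    print(f"{label}: URE path {ure_path:.7%}, triple {triple:.7%}, "
          f"total {total:.7%}")
```

Change the drive size, rebuild hours, or AFR to match your own environment and compare against the RAID-5 results.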
- Summary of Results
Putting all the results in a table, we have the following:
| Drive type | RAID-5 rebuild failure (percent) | RAID-6 rebuild failure (percent) |
|---|---|---|
| 600GB 15K rpm | 0.397 | 0.0000861 |
| 6 TB 7200rpm | 4.12 | 0.0163 |
Hopefully, I have shown you how to calculate these yourself, so that you can plug in your own drive sizes, rebuild rates, and other parameters to convince yourself of this.
In all cases, RAID-6 drastically reduced the probability of rebuild failure. With modern cache-based systems, the write-penalty associated with additional parity generally does not impact application performance. As clients transition from faster 15K drives to slower, higher capacity 10K and 7200 rpm drives, I highly recommend using RAID-6 instead of RAID-5 in all cases.
technorati tags: IBM, RAID-5, RAID-6, ECC, URE, Spelling Alphabet, BER, Google, AFR