Inside System Storage -- by Tony Pearson

Tony Pearson is a Master Inventor and Senior IT Specialist for the IBM System Storage product line at the IBM Executive Briefing Center in Tucson, Arizona, and a featured contributor to IBM's developerWorks. In 2011, Tony celebrated his 25th anniversary with IBM Storage on the same day as IBM's Centennial. He is the author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services. You can also follow him on Twitter @az990tony.
(Short URL for this blog: ibm.co/Pearson )

Comments (5)

1 localhost commented Permalink

Tony, I am not so sure that not being "vertically integrated" helps with disk reliability.

When it comes to reliability, perhaps it helps to be in control of the product ‘end to end’, disk and controller firmware/hardware included.

Coca Cola is not in the data business... they don’t have to worry about reliability.

As for the 'super secret' recipe... It has not changed for a long time, but they own it.

2 localhost commented Trackback

Richard, I was not trying to imply that not being vertically integrated made the situation better, just that nobody is vertically integrated anymore. The reliability of the drives we get from our qualified suppliers is sufficient for our needs to be competitive in the marketplace. If it weren't, we'd have to build our own.

Depending on the aspect of the business, some Lines of Business are more immune to IT outages than others. There are probably some parts of Coca-Cola that rely heavily on their IT systems.

3 localhost commented Permalink

Ok, I know I'm not the brightest bulb, so I must be missing something. I don't see ANY response to the issues raised in the response from IBM above. As I read it you're saying "hey, go talk to the disk drive manufacturers". The problem with that is that it's the DISK ARRAY that determines when a drive has failed and starts the rebuild process. That IS under the control of IBM, specifically the controller. But more importantly, it affects my risk of data loss.

As I see it, my risk of data loss with RAID-5 is influenced by two main factors: 1 - the drive replacement rate, and 2 - the rebuild time (which to a great extent is a function of the drive size), both of which IBM has some control over.

So, I think that the question in my mind is, what's the tipping point? Where does the risk of using RAID-5 protection exceed what I'm willing to accept, so that I need to move to some other protection mechanism like RAID-6? Is it when the rebuild times exceed 12 hours? 24 hours? 48 hours?

Also, I wonder why IBM isn't publishing some information to help me make these kinds of decisions?

--joerg

4 localhost commented Trackback

Please answer the Joerg Hallbauer questions... thanks!

5 localhost commented Trackback

Michael, I address these in a series of posts here:

http://www.ibm.com/developerworks/blogs/page/InsideSystemStorage?entry=ibm_acquires_xiv
http://www.ibm.com/developerworks/blogs/page/InsideSystemStorage?entry=emc_electrocutes_the_elephant
http://www.ibm.com/developerworks/blogs/page/InsideSystemStorage?entry=spreading_out_the_re_replication
http://www.ibm.com/developerworks/blogs/page/InsideSystemStorage?entry=more_questions_about_ibm_xiv
http://www.ibm.com/developerworks/blogs/page/InsideSystemStorage?entry=cleaning_up_the_circus_gold

(1) The original post was based on research by Google and Carnegie Mellon University showing that companies replace their drives more often than the rates suggested by manufacturers. As it turns out, these rates are often estimated by disk manufacturers by applying heat, cold or humidity extremes and projecting the results mathematically. Secondly, some companies, at their option, choose to replace drives when temporary errors occur, rather than wait for a permanent error to occur. Most disk array controllers detect and report temporary errors, but do not perform a rebuild until permanent errors are detected.

(2) To address the RAID-5 issue, IBM now offers RAID-6 on some of our storage devices, and the new RAID-X that will be made available on our XIV Nextra units.

The tipping point has happened already with today's large-capacity SATA disks, and to a lesser extent with high-capacity FC disks. Using the metric of "GB-hours" that I mentioned in one of the above posts, one can decide how much data to put at risk under RAID-5, or decide to switch to a RAID-6 or RAID-X scheme instead.

I have reviewed the various technical papers on RAID-6, RAID-X and other technologies, but am afraid that they are often too technical and academic in language to be understood by the average person. Instead, I have made an attempt to simplify the key points in this blog, especially in the posts above.

I hope this helps. Please feel free to comment again if you have any further questions.
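
As a rough illustration of the two factors raised in this thread (drive replacement rate and rebuild time), the Python sketch below estimates the chance that a second drive in a RAID-5 group fails while a rebuild is in progress. It is only a back-of-the-envelope model: it assumes drives fail independently at a constant rate, and it treats "GB-hours" as gigabytes exposed multiplied by hours of rebuild, which is an assumed reading of the metric rather than a definition taken from the posts above. The function names and the example figures are hypothetical.

```python
import math

HOURS_PER_YEAR = 8760


def second_failure_probability(surviving_drives, annual_replacement_rate, rebuild_hours):
    """Chance that at least one surviving drive fails before the rebuild ends.

    Assumption: independent drives with a constant (exponential) failure rate
    derived from the annual replacement rate.
    """
    hourly_rate = -math.log(1.0 - annual_replacement_rate) / HOURS_PER_YEAR
    return 1.0 - math.exp(-surviving_drives * hourly_rate * rebuild_hours)


def gb_hours_at_risk(exposed_gb, rebuild_hours):
    """Assumed definition: data exposed (GB) times the length of the exposure (hours)."""
    return exposed_gb * rebuild_hours


if __name__ == "__main__":
    # Hypothetical 7+1 RAID-5 group, 3% annual replacement rate, 750 GB drives.
    for hours in (12, 24, 48):
        risk = second_failure_probability(7, 0.03, hours)
        exposure = gb_hours_at_risk(750, hours)
        print(f"rebuild {hours:>2} h: second-failure risk {risk:.5f}, {exposure} GB-hours")
```

Under these assumptions the risk grows almost linearly with rebuild time, so doubling the rebuild window roughly doubles the exposure. That is one way to frame the 12, 24 or 48 hour question above, although real drives do not fail independently and the actual numbers depend on the array.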
