Inside System Storage -- by Tony Pearson

Tony Pearson is a Master Inventor and Senior IT Specialist for the IBM System Storage product line at the IBM Executive Briefing Center in Tucson, Arizona, and a featured contributor to IBM's developerWorks. In 2011, Tony celebrated his 25th anniversary with IBM Storage on the same day as IBM's Centennial. He is the author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services. You can also follow him on Twitter @az990tony.
(Short URL for this blog: ibm.co/Pearson )

Comments (6)

1 localhost commented Trackback

Interesting observation: your A-SIS analysis indicated it could save you between 10 and 20% on your laptop.

I just used NTFS compression on my laptop's hard drive, and I reduced my used capacity by 30%.

For free, and done while I was actively using my data (A-SIS de-dups must be done off-line - at a rate of about 10 hours per TB).

And NTFS compression comes without the risk that corruption of the A-SIS mapping tree renders my entire hard drive unusable.
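To put the "about 10 hours per TB" figure in perspective, here is a rough back-of-the-envelope sketch. The function names are made up for illustration, the rate is the commenter's estimate rather than a measured or published number, and a decimal terabyte (10^12 bytes) is assumed.

```python
# Back-of-the-envelope arithmetic for the quoted de-dup rate.
# Assumption: "10 hours per TB" is the commenter's estimate, and 1 TB = 10**12 bytes.

def dedup_hours(capacity_tb, hours_per_tb=10.0):
    """Estimated off-line processing time for a volume of the given size."""
    return capacity_tb * hours_per_tb


def implied_throughput_mb_s(hours_per_tb=10.0):
    """Sustained scan rate (MB/s) implied by an hours-per-TB figure."""
    bytes_per_tb = 10**12
    seconds = hours_per_tb * 3600
    return bytes_per_tb / seconds / 10**6


if __name__ == "__main__":
    print(f"2 TB volume: about {dedup_hours(2):.0f} hours")
    print(f"Implied rate: about {implied_throughput_mb_s():.1f} MB/s")
```

Under those assumptions the quoted rate works out to roughly 28 MB/s of sustained processing.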

2 localhost commented Trackback

BarryB, my results were 10-20% on top of my existing NTFS compression. Sadly, compression is not a POSIX standard for file systems, and as such is not readily available on most of the file systems people use for business (JFS, EXT3, etc.).

The perception that file system compression is free is also mistaken. It comes at no additional charge with the Microsoft Windows operating system, but it consumes cycles on the application server to handle the compression/decompression process. This decompression occurs not just when an end user or application reads data, but also during backup, archive, and anti-virus scanning.

Like A-SIS, some data gets great benefit from compression, while other data does not compress well at all. However, with A-SIS, the application server is not diverting cycles from its primary mission; instead, A-SIS processing is all done out-board, and not during the write process. That is indeed a benefit over real-time byte-for-byte comparison techniques.
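The point that some data benefits and other data does not can be illustrated with a small sketch that measures both kinds of savings on the same bytes. This is a toy model under stated assumptions: zlib stands in for file system compression, and fixed 4 KB blocks hashed with SHA-256 stand in for de-duplication. Neither is claimed to be how NTFS or A-SIS actually works, and the function names are hypothetical.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def compression_savings(data):
    """Fraction of space saved by zlib compression (0.0 if it does not help)."""
    if not data:
        return 0.0
    compressed = zlib.compress(data)
    return max(0.0, 1 - len(compressed) / len(data))


def dedup_savings(data, block_size=BLOCK_SIZE):
    """Fraction of blocks eliminated when each distinct block is stored once."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if not blocks:
        return 0.0
    unique = {hashlib.sha256(block).digest() for block in blocks}
    return 1 - len(unique) / len(blocks)


if __name__ == "__main__":
    # Ten identical blocks: de-duplication removes nine of them.
    repeated = bytes(range(256)) * 16 * 10
    # Ten different blocks, each internally repetitive: compression helps,
    # block-level de-duplication finds nothing to share.
    distinct = b"".join(bytes([i]) * BLOCK_SIZE for i in range(10))
    for label, data in (("repeated", repeated), ("distinct", distinct)):
        print(label, round(compression_savings(data), 2), round(dedup_savings(data), 2))
```

The two techniques look for different kinds of redundancy, which is why savings from one can stack on top of savings from the other.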

3 localhost commented Trackback

Tony, NTFS isn't the only compressing file system - just the one that the vast majority of us use on our laptops and desktops.

And so long as you realize that the A-SIS definition of "out of band" is "when the data is taken off-line and unavailable to applications," I'll let you get away with your assertion.

But A-SIS is no more "free" in terms of CPU cycles - you're just using cycles on another system. Most people don't continuously use all the CPU power in their laptops or desktops, and the overhead of compression is so minimal as to be entirely unnoticeable. It's sort of a practical application of "grid" computing - millions of little CPUs compressing on the edges: this not only reduces storage requirements, but also reduces network traffic. Win-Win!

On the other hand, many (most) of the NetApp filers' CPUs are overworked 24x7x365 - TCP/IP+NFS+CIFS are "heavy" protocols, WAFL is not inexpensive, nor is maintaining all the pointers for snaps and thin devices. That's why A-SIS compression is done off-line - if you were to try it while running normal operations, you'd face significant delays, if not timeouts.

But my point is that A-SIS is interesting, but not necessarily unique in its benefits. And a single data corruption can wipe out EVERYTHING in any de-duped world, while the relatively lightweight NTFS compression is far safer (a corruption affects only a single file).
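The corruption argument can be made concrete with a toy single-instance store. This is only a sketch of the general idea that a shared block is referenced by several files. The names `build_dedup_store` and `files_affected` are hypothetical, and nothing here reflects how A-SIS or any real product lays out its data or protects it.

```python
import hashlib


def build_dedup_store(files):
    """Store each distinct block once; return (block_store, file_maps).

    `files` maps a file name to its list of blocks (bytes).
    `file_maps` maps a file name to the list of block keys it references.
    """
    block_store = {}
    file_maps = {}
    for name, blocks in files.items():
        keys = []
        for block in blocks:
            key = hashlib.sha256(block).hexdigest()
            block_store.setdefault(key, block)  # keep only the first copy
            keys.append(key)
        file_maps[name] = keys
    return block_store, file_maps


def files_affected(file_maps, bad_key):
    """Names of every file that references the damaged block."""
    return sorted(name for name, keys in file_maps.items() if bad_key in keys)


if __name__ == "__main__":
    files = {
        "report.doc": [b"header", b"body-1"],
        "report-v2.doc": [b"header", b"body-2"],
        "notes.txt": [b"notes"],
    }
    store, maps = build_dedup_store(files)
    shared = hashlib.sha256(b"header").hexdigest()
    print(files_affected(maps, shared))
```

In this model, damage to a shared block touches every file that points at it, while damage to an unshared block touches one file. How far that carries over to a real system depends on its integrity checks and redundancy, which this sketch does not model.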

4 localhost commented Trackback

StorageZilla corrects me that the MD5 flaw in their Centera product was related to data integrity of the archive records, not single-instance storage, and that the malicious hacking was meant to tamper with existing data, not delete unique data. I stand corrected and will update.

5 localhost commented Trackback

BarryB, yes, in an ideal world you could mix and match any file system with any operating system. Unfortunately, not every operating system supports NTFS, and sometimes compression is not available on the subset of file systems that are supported for a given OS.

I chose my laptop C: and D: drives as an example only to provide a basis for my discussion. A-SIS is not intended for local laptop or personal workstation file systems, but rather for external disk shared in an SMB or large enterprise data center environment.

The debate over where CPU cycles should be spent is as old as computers themselves. Some are willing to use their application server cycles for these activities, and others look to offload them to external devices.
