This week is the Thanksgiving holiday in the USA, so I thought a good theme would be things I am thankful for.
I'll start by saying that I am thankful EMC has finally announced Atmos last week. This was the "Maui" part of the Hulk/Maui rumors we heard over a year ago. To quickly recap, Atmos is EMC's latest storage offering for global-scale storage, intended for Web 2.0 and Digital Archive workloads. Atmos can be sold as just software, or combined with Infiniflex, EMC's bulk, high-density commodity disk storage systems. Atmos supports traditional NFS/CIFS file-level access, as well as SOAP/REST object protocols.
I'm thankful for various reasons; here's a quick list:
- It's hard to compete against "vaporware"
Back in the 1990s, IBM was trying to sell its actual disk systems against StorageTek's rumored "Iceberg" project. It took StorageTek some four years to get this project out, but in the meantime, we were comparing actual product against mere possibility. The main feature is what we now call "Thin Provisioning". Ironically, StorageTek's offering was not commercially successful until IBM agreed to resell it as the IBM RAMAC Virtual Array (RVA).
Until last week, nobody knew the full extent of what EMC was going to deliver on the many Hulk/Maui theories. Several hinted at what it could have been, and I am glad to see that Atmos falls short of those rumored possibilities. This is not to say that Atmos can't reach its potential, and certainly some of the design is clever, such as offering native SOAP/REST access.
Instead, IBM can now compare Atmos/Infiniflex directly to the features and capabilities of IBM's Scale Out File Services [SoFS], which offers a global-scale multi-site namespace with policy-based data movement; IBM System Storage Multilevel Grid Access Manager [GAM], which manages geographically distributed information; and the IBM [XIV Storage System], which offers high-density bulk storage.
- Web 2.0 and Digital Archive workloads justify new storage architectures
When I presented SoFS and XIV earlier this year, I mentioned they were designed for the fast-growing Web 2.0 and Digital Archive workloads that were unique enough to justify their own storage architectures. One criticism was that SoFS appeared to duplicate what could be achieved with dozens of IBM N series NAS boxes connected with Virtual File Manager (VFM). Why invent a new offering with a new architecture?
With the Atmos announcement, EMC now agrees with IBM that the Web 2.0 and Digital Archive workloads represent a unique enough "use case" to justify a new approach.
- New offerings for new workloads will not impact existing offerings for existing workloads
I find it amusing that EMC is quickly defending that Atmos will not eat into its DMX business, which is exactly the FUD they threw out about IBM XIV versus DS8000 earlier this year. In reality, neither the DS8000 nor the DMX were used much for Web 2.0 and Digital Archive workloads in the past. Companies like Google, Amazon and others had to either build their own from piece parts, or use low-cost midrange disk systems.
Rather, the DS8000 and DMX can now focus on the workloads they were designed for, such as database applications on mainframe servers.
- Cloud-Oriented Storage (COS)
Just when you thought we had enough terminology already, EMC introduces yet another three-letter acronym [TLA]. Kudos to EMC for coining phrases to help move new concepts forward.
Now, when an RFP asks for Cloud-oriented storage, I am thankful this phrase will help serve as a trigger for IBM to lead with SoFS and XIV storage offerings.
- Digital archives are different from Compliance Archives
EMC was also quick to point out that the object-storage Atmos is different from their object-storage EMC Centera: the former is for "digital archives" and the latter for "compliance archives". Different workloads, different use cases, different offerings.
Ever since IBM introduced its [IBM System Storage DR550] several years ago, EMC Centera has been playing catch-up to match IBM's many features and capabilities. I am thankful the Centera team was probably too busy to incorporate Atmos capabilities, so it was easier to make Atmos a separate offering altogether. This allows the IBM DR550 to continue to compete against Centera's existing feature set.
- Micro-RAID arrays, logical file and object-level replication
I am thankful that one of the Atmos policy-based features is replicating individual objects, rather than LUN-based replication and protection. SoFS supports this for logical files regardless of their LUN placement, GAM supports replication of files and medical images across geographical sites in the grid, and the XIV supports this for 1MB chunks regardless of their hard disk drive placement. The 1MB chunk size was based on the average object size from established Web 2.0 and Digital Archive workloads.
I tried to explain the RAID-X capability of the XIV back in January, under much criticism that replication should only be done at the LUN level. I am thankful that Marc Farley on StorageRap coined the phrase [Micro-RAID array] to help move this new concept further. Now, file-level, object-level and chunk-level replication can be considered mainstream.
- Much larger minimum capacity increments
The original XIV in January was 51TB capacity per rack, and this went up to 79TB per rack for the most recent IBM XIV Release 2 model. Several complained that nobody would purchase disk systems at such increments. Certainly, small and medium size businesses may not consider XIV for that reason.
I am thankful Atmos offers 120TB, 240TB and 360TB sizes. The companies that purchase disk for Web 2.0 and Digital Archive workloads do purchase disk capacity in these large sizes. Service providers add capacity to the "Cloud" to support many of their end-clients, and so purchasing disk capacity to rent back out represents a revenue-generating opportunity.
- Renewed attention on SOAP and REST protocols
IBM and Microsoft have been pushing SOA and Web Services for quite some time now. REST, which stands for [Representational State Transfer], allows static and dynamic HTML message passing over standard HTTP. SOAP, which was originally [Simple Object Access Protocol], and was later renamed "Service Oriented Architecture Protocol", takes this one step further, allowing different applications to send "envelopes" containing messages and data between applications using HTTP, RPC, SMTP and a variety of other underlying protocols. Typically, these messages are simple text surrounded by XML tags, easily stored as files, or rows in databases, and served up by SOAP nodes as needed.
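To make the "envelope" idea concrete, here is a minimal Python sketch that wraps a plain-text message in SOAP XML tags. The `StoreObject` element name is purely illustrative and not part of any real Atmos or IBM API:

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

def build_envelope(body_tag, payload):
    """Build a minimal SOAP envelope: simple text surrounded by XML tags."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    message = ET.SubElement(body, body_tag)   # body_tag is a hypothetical operation name
    message.text = payload
    return ET.tostring(envelope, encoding="unicode")

xml_text = build_envelope("StoreObject", "hello, cloud storage")
print(xml_text)
```

Because the envelope is just text, it can travel over HTTP, SMTP or anything else that carries text, which is what makes the protocol so flexible.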
- It's hard to show leadership until there are followers
IBM's leadership sometimes goes unnoticed until followers create "me, too!" offerings or establish similar business strategies. IBM's leadership in Cloud and Grid computing is no exception. Atmos is the latest me-too product offering in this space, trying pretty much to address the same challenges that SoFS and XIV were designed for.
So, perhaps EMC is thankful that IBM has already paved the way, breaking through the ice on their behalf. I am thankful that perhaps I won't have to deal with as much FUD about SoFS, GAM and XIV anymore.
technorati tags: IBM, SoFS, XIV, GAM, DS8000, EMC, Atmos, Hulk, Maui, Infiniflex, STK, StorageTek, Iceberg, RVA, thin provisioning, VFM, SOAP, REST, DMX, RAID-X, Micro-RAID
In his post on Rough Type titled ["McKinsey surveys the new software landscape"], Nick Carr discusses the growing acceptance in the marketplace for Software-as-a-Service, or SaaS. He summarizes the results of McKinsey's recent [Enterprise Software Customer Survey 2008]. IBM is already well established as part of the Web 2.0 Big "5" (the other four are Google, Yahoo, Amazon and Microsoft), so it may not be much of a surprise that it introduced some new offerings focused on this emerging market.
Whether you are looking to contract out for SaaS, or to provide a service to others over the cloud, IBM can help!
- Managed Hosting
For managed hosting, [IBM Managed Storage Services] has been extended to support archive data through its entire lifecycle: supporting access, migration, non-erasable non-rewriteable (NENR) protection, and expiration/destruction. This offering supports locating the storage on the customer premises, a hosting center, or an IBM Service Delivery Center. IBM's blended disk and tape approach provides a better alignment between information value and storage costs.
- Application-Led Service
Last December IBM acquired Arsenal Digital, which offers a remote "Enterprise Email Archive" service, supporting retention policies that can apply per user, per group, or even per message, as needed. This service provides fast user access to email archives, as well as e-discovery search. The search is not just for the email body text, but supports over 370 different attachment types as well. Deduplication technology is used to reduce the actual amount of storage needed by 80 percent. All of this comes with the security and comfort of knowing that these email archives are encrypted and protected in a disaster-recovery-class datacenter managed by IBM. Blocks and Files presents their thoughts on this in the article ["IBM storing data and mail in the cloud"].
The Radicati Group has published some interesting statistics about email archiving in [Volume 4, Issue 3]. Here's an excerpt:
- "In 2007, a typical corporate email account receives about 18 MB of data per day. This number is expected to grow to over 28 MB by 2011. Today, there is no way to effectively manage these messages, but with the help of an archiving solution.
- Today, the worldwide percentage of corporate mailboxes protected by archiving solutions is estimated to be around 14%, however it is growing at a fast pace, and is expected to reach over 70% by 2011.
- A survey of 102 corporate organizations worldwide, showed that 68% of large businesses view compliance as their top security concern in 2007."
- Cloud Computing
For those who are actually providing these services to others over the cloud, you might want to use the new [IBM System x iDataPlex]. Compared to traditional server environments, the iDataPlex provides five times the computing power by doubling the number of servers per rack, but with 40 percent less energy consumption. Thanks to clever cooling technology, the system can run in standard office "room temperature" environments. You can customize with a mix of compute, network and storage nodes to meet your application requirements. In addition to Web 2.0 and SaaS workloads, the iDataPlex can be useful for financial risk analysis, high performance computing, and even batch processing.
technorati tags: Rough Type, Nick Carr, McKinsey, SaaS, Google, Yahoo, Amazon, Microsoft, managed hosting, storage services, NENR, archive, IBM, Service Delivery Center, Arsenal Digital, deduplication, Radicati Group, iDataPlex, Web 2.0
Well, today is April 1, and I just love [April Fools' Day]. This day has a rich history of practical jokes. Those not familiar can review this list of [Top 100 pranks and hoaxes].
Tim Ferriss started the festivities with [The Grand Illusion: The Real Tim Ferriss speaks]. He claimed that for the past year, he outsourced the writing of his blog to a writer from India, and an editor from the Philippines. Given that his post was dated March 31, and he writes frequently about the benefits of outsourcing, it appeared to be a legitimate post. However, Tim fessed up the following day, claiming that it was April 1 in Japan where he wrote it.
Guy Kawasaki wrote [April Fools' Stories You Shouldn't Believe], including my favorite, #12: "Ruby on Rails cited Twitter as the centerpiece of its new 'Rails Can Scale' marketing program." Speaking of Twitter, fellow IBM blogger Alan Lepofsky from our Lotus Notes team wrote [Great, now there is Twitter Spam]. It looked like a real post, but then I realized, ... everything on Twitter is spam!
Topics like energy consumption and global warming were fodder for posts and pranks. The post [Was Earth Hour a joke again?] argued that the preparation for "Earth Hour" last week in effect used up more energy than the hour of this annual "lights-off" event actually saved. This reminded me of John Tierney's piece in the New York Times ["How virtuous is Ed Begley, Jr.?"], where a scientist explains that it is more "green" for the environment to drive a car short distances than to walk:
If you walk 1.5 miles, Mr. Goodall calculates, and replace those calories by drinking about a cup of milk, the greenhouse emissions connected with that milk (like methane from the dairy farm and carbon dioxide from the delivery truck) are just about equal to the emissions from a typical car making the same trip. And if there were two of you making the trip, then the car would definitely be the more planet-friendly way to go.
Wayan Vota, my buddy over at OLPCnews, writes in his post [Windows XO Child Centric Development] that the "Sugar" operating environment on the innovative Linux-based XO laptops will soon be re-named the "Windows XO Operating System", with their new motto "Windows XO: A Child-Centric Operating Platform for Learning, Expression and Exploration." The mocked-up photo of an XO laptop with the Windows XO logo was excellent!
Gretchen Rubin reminds us that this is a great day to play tricks on your kids in [How April Fool's day can be a source of happiness], and last week, Kai Ryssdal on NPR Radio investigated whether [Mind Habits] was [a video game that's good for you?] The game claims that playing just five minutes per day can reduce stress. I haven't been able to stop playing after five minutes; Mind Habits is like the proverbial potato chip: you can't eat just one!
The economists from Freakonomics explain in [And While You're at it, Toss the Nickel] that it costs the US Government 1.7 cents to produce each penny. The US government loses $50 million each year making pennies. Each nickel costs 10 cents to produce. This one was dated March 31, so it could actually be true. Sad, but true.
My favorite, however, was EMC blogger Barry Burke's post ["5773 > c"], explaining how their scientists were able to reduce latency on the EMC SRDF disk replication capability:
What the de-dupe team found is that there is a hidden feature within recent generations of this chip that allow a single bit, under certain circumstances, to represent TWO bits of information.
Still, almost 34% of the total bits transferred were in fact aligned double-zeros, far more than all other bit combinations - and most importantly, these were quite frequently byte-aligned, as required by this new-found capability. Makes sense, if you think about it - most of those 32- and 64-bit integers are used to store numbers that are relatively small (years, months, days, credit charges, account balances, etc.). So that's why the team decided to use this new two-fer bit to represent "00".
Mathematically, if you can transmit 34% of the data using half as many bits, you reduce the number of bits you have to transfer in total by 17%. Which, while not necessarily earth-shattering, is nothing to be ashamed of. On top of the SRDF performance enhancements delivered in 5772 (30% reduction in latency or 2x the distance), this new enhancement adds another 17% latency improvement (or ~1.4x more distance at the same latency). Combined with 5772, SRDF/S customers could see a 50% reduction in latency. And 5773 allows SRDF/A cycle times to be set below 5 seconds (with RPQ) - this new feature adds a little headroom to maximize bandwidth efficiency for the shortest possible RPO.
Again, this looked real, until I did the math. Start with the speed of light in a vacuum of space ("c" in BarryB's title) which is roughly 300,000 kilometers per second, or put into more understandable units, 300 kilometers per millisecond. However, light travels slower through all other materials, and for fiber optic glass it is only 200 kilometers per millisecond. Sending a block of data across 100km, and then getting a response back that it arrived safely, is a total round-trip distance of 200km, so roughly 1 millisecond. However, EMC SRDF often takes two or three round-trips per write, versus IBM Metro Mirror on the IBM System Storage DS8000 which has got this down to a single round-trip. The number of round-trips has a much bigger effect on latency than EMC's double-bit data compression technique. With IBM, you only experience about 1 millisecond latency per write for every 100km distance between locations, the shortest latency in the industry.
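The arithmetic above is easy to restate as a sketch. The helper below simply encodes the paragraph's assumptions, namely that light travels roughly 200 kilometers per millisecond in fiber and that each round trip covers the distance twice:

```python
def replication_latency_ms(distance_km, round_trips, km_per_ms=200):
    """Synchronous replication latency: light in fiber covers ~200 km/ms,
    and every round trip traverses the link distance twice."""
    return round_trips * (2 * distance_km) / km_per_ms

# 100 km link, single round trip per write (as described for Metro Mirror)
single = replication_latency_ms(100, 1)
# same 100 km link, three round trips per write
triple = replication_latency_ms(100, 3)
print(single, triple)
```

The single-round-trip case works out to 1 millisecond per 100 km, matching the paragraph, and three round trips triple it, which is why protocol chattiness dominates any per-bit compression trick.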
It is good that, once a year, we are reminded to be skeptical of what we read in the blogosphere, and to check the facts!
technorati tags: April Fools Day, Tim Ferris, 4HWW, outsourcing, Guy Kawasaki, Ruby on Rails, Twitter, Alan Lepofsky, Lotus, Notes, Earth Hour, spam, John Tierney, Ed Begley Jr., milk, carbon dioxide, Wayan Vota, OLPCnews, Windows XO, Gretchen Rubin, Kai Ryssdal, Freakonomics, NPR, Mind Habits, penny, nickel, EMC, BarryB, SRDF, IBM, DS8000, Metro Mirror, latency, fiber optic, speed of light
I got some interesting queries about IBM's Scale-Out File Services [SoFS] that I mentioned in my post yesterday [Area rugs versus Wall-to-Wall carpeting]. I thought I would provide some additional details of the product.
SoFS combines three key features: a global namespace, a clustered file system, and Information Lifecycle Management (ILM). Let's tackle each one.
- Global Name Space
A long time ago, IBM acquired a company called Transarc that developed Andrew File System (AFS) and Distributed File System (DFS). These both provided global namespace capability, meaning that all of your files could be accessible from a single URL file tree. Imagine if you have data centers in Tucson, Austin, Raleigh and Chicago. Normally, to access files from each city, you would have to mount a unique IP address for that location, and then to get to files in a different city, you'd have to mount a second, and so on. But with a global namespace, you could mount a single drive letter Z: and access files simply by using Z:/Tucson/abc or Z:/Austin/xyz. IBM uses its DFS to make this happen.
Just because you have access to a global namespace doesn't mean you have read/write authority to every file. IBM SoFS has full NTFS Access Control List (ACL) support, so that only those authorized to read or write data can access the files. A "hide unreadable" feature provides what I like to call "parental controls": you don't even get to see in your directory listing any file or subdirectory that you don't have access to. For example, if there is a directory with 50 projects, but you only have authority to three projects, then you only see the three subdirectories related to those projects, and nothing else.
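As a rough illustration of the "hide unreadable" idea, the sketch below filters a directory listing down to the entries the current user can actually read. It uses POSIX permission checks for simplicity; the real SoFS feature works against NTFS ACLs:

```python
import os

def visible_entries(directory):
    """List only entries the current user can read ("hide unreadable"):
    anything without read permission is silently omitted from the listing."""
    visible = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.access(path, os.R_OK):   # permission check stands in for an ACL lookup
            visible.append(name)
    return visible
```

With 50 project subdirectories and read access to only three, this listing would show exactly those three names and nothing else.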
There are other ways to get a global namespace. IBM also offers the IBM System Storage N series Virtual File Manager, Brocade offers Storage/X, and F5 acquired Acopia. These all work by putting a box in front of a set of independent NAS storage units, and giving you a single mount point to represent all of the file systems managed behind the scenes. This, however, can sometimes be a bottleneck for performance.
- Clustered File System
Often, when you have a lot of data in one place, you are also expected to deliver that data to lots of clients with relatively good performance. Otherwise, end users revolt and get their own internal direct-attach storage. To solve this, you need a clustered architecture that provides access in parallel to the data.
First, we start with a node that is optimized for CIFS and NFS access. We have clocked our node to run CIFS at 577 MB/sec, and NFS at 880 MB/sec, through a 10GbE pipe between a single client and a single SoFS node. Compare that to the 400 MB/sec you get today with 4Gbps FCP, or the 800 MB/sec you will get if you upgrade to 8Gbps FCP, and quickly you recognize that this is comparable performance for demanding workloads.
Then, you combine multiple nodes together, and have them all be able to read/write any file in the file system, and front-end that with a load-balancing Virtual IP address (VIPA) that spreads the requests around, and you've got yourself a lean and mean machine for accessing data.
In 2005, IBM delivered [ASC Purple] with the world's fastest file system. 1536 nodes were able to access billions of files in the 2 Petabytes of data. The record of 126 GB/sec access to a single file was set, and has yet to be beaten by any other vendor since. This same file system is used in SoFS, as well as in a variety of other IBM storage offerings.
The back-end storage can be SAS or FC-attached, from the DS3200 to our mighty DS8300 Turbo, as well as our IBM System Storage DCS9550 and SAN Volume Controller (SVC), and a variety of tape libraries.
- Information Lifecycle Management
Lastly, we get to ILM. With SoFS, you can have different tiers of storage: high-speed SAS or FC disk, low-speed FATA and SATA disk, and even tape. Policy-based automation allows you to place any file onto any disk tier when created, and other policies can migrate or delete the data, triggered by certain threshold, age, or other criteria. The advantage is that this works on a file-by-file basis, so Z:/Tucson/Project could have a bunch of files, some of them on FC disk, some of them on SATA, and some on tape. The file path doesn't change when they move, and different files in the same directory can be on different tiers.
Data movement is bi-directional. If you know you will be using a set of files for an upcoming job, say perhaps quarter-end or year-end processing, you can pre-fetch those files from tape and move them to your fastest disk pool.
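A hypothetical tiering policy might look like the sketch below. The tier names and thresholds are made up for illustration and are not actual SoFS policy syntax:

```python
def choose_tier(age_days, last_access_days):
    """Illustrative placement policy: recently used files stay on fast FC disk,
    cooler files migrate to cheaper SATA, and cold files go to tape."""
    if last_access_days <= 30:
        return "FC"
    if age_days <= 365:
        return "SATA"
    return "tape"

# One directory can mix tiers file by file; the path never changes when files move.
files = {"report.ppt": (400, 5), "draft.doc": (90, 60), "old.log": (900, 400)}
placement = {name: choose_tier(age, acc) for name, (age, acc) in files.items()}
print(placement)
```

Pre-fetching for quarter-end processing is just the policy run in reverse: temporarily pin the needed files to the "FC" tier before the job starts.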
There is also integrated backup support. Typically, a large NAS environment is difficult to back up. Traditional methods take days to scan the directory tree looking for files in need of backup. A single SoFS node can scan a billion files in 95 minutes, and 8 nodes in a cluster can scan a billion files in under 15 minutes.
Recovery is even more impressive. When you recover, SoFS brings back the entire directory structure first, with all the file names in place. This would make it appear that all the data is restored, but actually it is still on tape. When you access individual files, it will then drive the recovery of that file, so your applications and end users basically determine the priority of the recovery. Traditional methods would wait until every file was restored before letting anyone access the system.
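This recall-on-access behavior can be sketched as follows; it is an illustration of the concept, not the SoFS implementation:

```python
class StubFile:
    """After a restore, directory entries appear immediately as stubs;
    the file body is recalled from tape only on first access."""
    def __init__(self, name, tape_store):
        self.name = name
        self._tape = tape_store
        self._data = None          # not yet recalled from tape

    def read(self):
        if self._data is None:     # first access drives the actual recovery
            self._data = self._tape[self.name]
        return self._data

tape = {"q4-results.xls": b"...spreadsheet bytes..."}
f = StubFile("q4-results.xls", tape)
print(f.read())
```

Since each `read()` triggers its own recall, the order in which users open files is exactly the order in which data comes back from tape, which is what makes the recovery "demand-driven".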
SoFS is part of IBM's [Blue Cloud] initiative that was launched in November 2007. Of course, IBM isn't the only one competing in this space. HDS has partnered with BlueArc, HP has acquired PolyServe, and Sun acquired CFS for their Lustre file system. Isilon and Exanet are start-up companies with some offerings. EMC acquired Rainfinity, and has hinted at a Hulk/Maui project that they might deliver later this year or perhaps in 2009, but by then it might be a day late and a dollar short.
But why wait? IBM SoFS is available today and is orders of magnitude more scalable!
technorati tags: IBM, SoFS, Acopia, VFM, Brocade, ILM, global namespace, clustered, file system, disk, tape, storage, system, CIFS, NFS, NAS, NTFS, ACL, DFS, AFS, Transarc, ASC Purple, DS3200, SAS, FC, FCP, DS8300, Turbo, DCS9550, SVC, FATA, SATA, nodes, backup, restore, recovery, Blue Cloud, cloud computing, PolyServe, HDS, BlueArc, HP, Sun, CFS, Lustre, Isilon, Exanet, EMC, Rainfinity, Hulk, Maui
Last week, I covered backup issues in [Deduplication versus Best Practice for Backups]. This week, I thought I would cover issues with email.
At IBM, our standard is to have a limit of 200MB per user mailbox. A few of us get exceptions and have up to a 500MB limit because of the work we do. By comparison, my personal Gmail account is now up to 6500MB. When this limit is exceeded, you are unable to send out any mail until it is brought down below the limit, and a request to be "re-enabled for send" is approved, a situation we call "mail jail".
The biggest culprit is attachments. Only 10 percent of emails have attachments, but those that do take up 90 percent of the total space! People attach a 15MB presentation or document, and copy the world on a distribution list. Everyone saves their notes with these attachments, and soon, the limits are blown. Not surprisingly, deduplication has been cited as a "killer app" to address email storage, exactly for this reason. If all the users have their mailboxes stored on the same deduplication storage device, it might find these duplicate blocks, and manage to reduce the space consumed.
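The reason deduplication works so well here is easy to sketch: a content-addressed store keyed by a hash keeps one copy of an attachment no matter how many mailboxes reference it. This is an illustration of the idea, not any particular product's design:

```python
import hashlib

def dedup_store(mailboxes):
    """Content-addressed store: identical attachments hash to the same key,
    so 100 copies of a 15MB deck consume the space of one."""
    store = {}    # digest -> attachment bytes (one copy per unique content)
    refs = {}     # (user, filename) -> digest (cheap per-mailbox pointer)
    for user, attachments in mailboxes.items():
        for filename, data in attachments:
            digest = hashlib.sha256(data).hexdigest()
            store.setdefault(digest, data)
            refs[(user, filename)] = digest
    return store, refs

deck = b"presentation bytes" * 1000
mailboxes = {"alice": [("deck.ppt", deck)], "bob": [("deck.ppt", deck)]}
store, refs = dedup_store(mailboxes)
print(len(store))   # one stored copy serves both recipients
```

Both mailboxes end up pointing at the same digest, so the storage cost is one copy plus two tiny references.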
A better practice would be to avoid this in the first place. Here are the techniques I use instead:
- Point to the document in a database
We are heavy users of Lotus Notes databases. These can be encrypted and controlled with Access Control Lists (ACL) that determine who can create or read documents in each database. Annually, all the database ACLs are validated so that people can confirm that they continue to have a need-to-know for the documents in each database. Sending a confidential document as a "document link" to a database entry takes only a few bytes, and all the recipients that are already on the ACL have access to that document.
- Point to the document on a web page
If the document is available on an internal or external website, just send the URL instead of attaching the file. Again, this takes only a few bytes. We have websites accessible to all internal employees, websites that can be accessed only by a subset of employees with special permissions and credentials based on their job role, and websites that are accessible to our IBM Business Partners.
In my case, if I happen to have a blog posting that answers a question or helps illustrate an idea, I will send the "permalink" URL of that blog post in my email.
- Point to the document on a shared NAS file system
Internally, IBM uses a "Global Storage Architecture" (GSA) based on IBM's Scale-Out File Services [SoFS], with everyone initially getting 10GB of disk space to store files, with the option to request more if needed. The system has policy-based support for placing and migrating older data to tape to reduce actual disk usage, and combines a clustered file system with a global name space.
My SoFS space is now up to 25GB, and I store a lot of presentations and whitepapers that are useful to others. A URL with "ftp://" or "http://" is all you need to point to a file in this manner, and greatly reduces the need for attachments. I can map my space as "Drive X:" on my Windows system, or as an NFS mount point on my Linux system, which allows me to easily drag files back and forth.
Departments that don't need to offer "worldwide access" use NAS boxes instead, such as the IBM System Storage N series.
Pointing to files in a shared space, rather than as attachments in email, may take some getting used to. I've had a few recipients send me requests such as "can you send that as an attachment (not a URL)" because they plan to read it on the airplane or train, where they won't have online connectivity.
This all relates to new ways for employees to collaborate. Shawn from Anecdote writes in the post [Fostering a Collaboration Culture]:
"Have you invested in the latest and greatest in collaboration technology but still feel people are still not collaborating? How many Microsoft Sharepoint servers and IBM Quickplaces remain relatively untouched or only used by the organization's technorati? I think it's a big problem because this narrow view of collaboration starts to get the concept a bad name: "yeah, we did collaboration but no one used it." And then there the issue of the vast amount of money wasted and opportunities lost. We can't afford to loose faith in collaboration because the external environment is moving in a direction that mandates we collaborate. The problems we face now and into the future will only increase in complexity and it will require teams of people within and across organizations to solve them."
Well, sending pointers instead of attachments works for me, and has kept me out of "mail jail" for quite some time now.
technorati tags: IBM, deduplication, email, mailbox, Gmail, attachment, Lotus, Notes, database, URL, Permalink, GSA, NAS, SoFS, disk, Anecdote
In addition to creating the Dilbert cartoon, Scott Adams has a blog, which is sometimes quite serious, and other times quite funny. The anticipated 30x cost of "Flash Drives" for Enterprise disk systems reminded me of one of Scott's articles from November 2007 titled [Urge to Simplify]. Here's an excerpt:
Now the casinos have people trained, like chickens hoping for pellets, to take money from one machine (the ATM), carry it across a room and deposit in another machine (the slot machine). I believe B.F. Skinner would agree with me that there is room for even more efficiency: The ATM and the slot machine need to be the same machine.
The casinos lose a lot of money waiting for the portly gamblers with respiratory issues to waddle from the ATM to the slot machines. A better solution would be for the losers, euphemistically called “players,” to stand at the ATM and watch their funds be transferred to the hotel, while hoping to somehow “win.” The ATM could be redesigned to blink and make exciting sounds, so it seems less like robbery.
I’m sure this is in the five-year plan. Longer term, people will be trained to set up automatic transfers from their banks to the casinos. People will just fly to Vegas, wander around on the tarmac while the casino drains their bank accounts, then board the plane and fly home. The airlines are already in on this concept, and stopped feeding you sandwiches a while ago.
Perhaps EMC can redesign its DMX-4 to "blink and make exciting sounds" as well. The Flash Drives were designed for the financial services industry, so those disk systems could be directly connected to make transfers between the appropriate bank accounts.
technorati tags: Scott Adams, Dilbert, B.F. Skinner, ATM, casinos, EMC, DMX-4
I'm here at the Los Angeles airport on my way to Canada.
On my post last week, [My Blook is Now Available], Cheryl Hagedorn comments:
I've just posted about your blook at Blooking Central http://blooking.blogspot.com/2007/11/inside-system-storage.html
I'll love to hear from you (I post letters from authors!) about how you put the blook together. Many folks have used cut and paste from blog page into word processor. Others have simply backed up their blogs, then cut and pasted. Some folks had the foresight to compose their posts in a word processor before posting!
Anyway, I'd like to know whatever ins and outs you'd like to share. Thanks.
Well Cheryl, I couldn't find any email address to send you a response, so I decided to post here instead and post a trackback on your blog.
After learning about the Blooker Prize, I had asked our IBM Developerworks team if anyone else within IBM had published a blook, but nobody had heard of anything, so I had to look elsewhere. I got a lot of guidance from Lulu's [Book Publishing FAQs], Don Campbell's [Five Steps to Publishing Your Paperback Book at Lulu], and how-to articles over at [bookcatcher.com].
- Decision 1: Defining the Container
Before you can cut-and-paste anything, you need a container file to put it in. Here were my key decisions:
- Page Size: Novel 6"x9" (15cm x 23cm) to support both perfect-bound paperback and dust-jacket hardcopy editions
- Colors: Full-color covers with black-and-white interior
- Fonts: 10pt Book Antiqua for the text, Courier for the monospaced computer examples, 8pt for the "copyright" fine print
- Format: *.doc Microsoft Word file, using [Lulu's ready-to-use templates]
- Software: Office 2003 version of Microsoft Word on Windows XP system
- Front matter: Title, Copyright, Dedication, Table of Contents, Foreword, Introduction
- Back matter: Blog Roll, Blogging Guidelines, Glossary, Reference table, What people have written about me and my blog
According to Lulu, you could use OpenOffice instead, with RTF files. I didn't try that. I did try using CutePDF to upload ready-made PDFs; that didn't work. I also tried saving text in PDF format on my Mac Mini running OS X 10.4 Tiger, but Lulu didn't like that either. IBM now offers a free download of [Lotus Symphony] that might be an alternative for my next book.
For my blook, the "Blog Roll" serves instead of a more formal [Bibliography]. I could have also included online magazines and other web resources.
- Decision 2: Chapter Configuration
I reviewed other blooks to see how they were organized. I thought I might organize the blog posts by topic or category, but all the blooks I looked at were strictly chronological, oldest post first. This of course is exactly the opposite of how they appear in the web browser. I decided to keep things simple, with just 12 chapters, one for each calendar month.
Each chapter was separated by a section break with unique footers, starting on an odd page number. The footers have the page numbers on the outside edges, so that even pages had numbers on the left, and odd pages on the right. I also added the name of the chapter and the book, like so:
40 ................December 2006| |Inside System Storage.... 41
This was a lot of work, but makes the book look more "professional".
- Decision 3: Cut-and-Paste
People have asked me why it took three months to put my blook together, and I explained that the cut-and-paste process was manually intensive. My posts are either HTML entered directly into Roller webLogger, or typed in HTML on Windows Notepad and cut-and-pasted over to Roller later. I have access to the HTML source of each post, as well as how it appears on the webpage, and tried cut-and-paste both ways. Copying the HTML source meant having to edit out all the HTML tags. I hadn't even looked into the idea of "backing up" all the entries through Roller, but they would probably have been HTML source as well.
It turned out that copying the webpage directly from the browser was better; it retains more of the formatting, and automatically eliminates all of the pesky HTML tags. I wanted the printed version to resemble the web page version.
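For anyone who would rather script the tag-stripping than edit it out by hand, here is a minimal Python sketch using the standard library (purely illustrative; I did all of mine manually):

```python
from html.parser import HTMLParser

class TagStripper(HTMLParser):
    # Collect only the text between tags, dropping the tags themselves.
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def strip_tags(html):
    stripper = TagStripper()
    stripper.feed(html)
    return "".join(stripper.chunks)

print(strip_tags('<p>See my <a href="http://example.com">earlier post</a>.</p>'))
# -> See my earlier post.
```

Of course, this loses the formatting too, which is why copying from the rendered page worked better for me.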
Microsoft Word renders all hyperlinks as bright blue underlined text, which I didn't like, so I removed all hyperlinks to avoid having to pay extra for "colored pages". This can be done manually, one by one, or by pasting with the "text only" option, but that removes all the other formatting as well. (Specifying a black-and-white interior on Lulu might have converted all of these to greyscale automatically, so I might have been safe leaving them in, which would have been handy if I wanted an online e-book version with the links active ... oh well.)
To indicate where the hyperlinks would have been, I wrapped all the linked text in [square brackets]. I have now gotten in the habit of doing this for future blog posts, so if I ever make another book, it will cut down the cut-and-paste work.
Some of the items I linked to posed a problem. I had to convert YouTube videos to flat images of the first frame to include them in the book. Older links were broken, and I had to find the original graphics. I also sent a note to Scott Adams about the use of one of his Dilbert cartoons.
I decided to also cut-and-paste my technorati tags and comments. For comments I made myself, I labeled them "Addition" or "Response". A few people did not realize that I was "az990tony" making the comments as the blog author, so I changed them all to say "az990tony (Tony Pearson)" to make this more clear, and now do this on all future blog posts to minimize the work for my next book.
Because I used a lot of technical terms and acronyms, Microsoft Word actually gave me an error message saying there were so many grammatical and spelling errors that it was unable to track them all, and would no longer put wavy green or red lines underneath.
I did all the cut-and-paste work myself, but since the website is publicly accessible, I could have gotten someone else to do this for me. Had I read Timothy Ferriss' book The Four Hour Work Week sooner, I might have taken his advice on [Outsourcing the project to someone in India]. I might consider doing this for my next book.
- Decision 4: Numbering the Posts
I decided I wanted to standardize the title of each post. The date was not unique enough, as there were days that I made multiple posts. So, I decided to assign each a unique number, from 001 to 165, like so:
2006 Dec 12 - The Dilemma over future storage formats (033)
Posts that referred back to one of my earlier posts within the book had (#nnn) added so that readers could jump back to them if they were interested. This eliminated trying to keep track of page numbers.
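The numbering scheme is simple enough to script. A tiny Python sketch (a hypothetical helper, just to show the format) produces titles in the style above:

```python
def format_title(date, title, number):
    # Zero-pad the post number to three digits, e.g. 33 -> "033".
    return f"{date} - {title} ({number:03d})"

print(format_title("2006 Dec 12", "The Dilemma over future storage formats", 33))
# -> 2006 Dec 12 - The Dilemma over future storage formats (033)
```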
- Decision 5: Adding behind-the-scenes commentary
One of the reasons I rent or buy DVDs is for the director's audio commentary and deleted scenes. These extras provide added value over what I saw in the movie theatre. Likewise, 80 percent of a blook is already out in public for reading, so I felt I needed to provide some added value. At the beginning of each month, I describe what was going on behind the scenes, and then in front of specific posts, I provided additional context. This could be context of what was going on in the blogosphere at the time, announcements or acquisitions that happened, what country I was blogging from, or what unannounced products or projects were being developed that I can now talk about since they are announced and available.
To distinguish these side comments from the rest of the blog posts, I decorated them with graphics. Searching for copyright-free/royalty-free clip-art, graphics, and photos that represented each concept was time-consuming. I shrunk each down to about 1 inch square in size, and changed them from color to greyscale. (Lulu's conversion to PDF probably would have automatically converted the color graphics to greyscale for me, in which case leaving them in full color might have been nice for an e-book edition ... oh well.)
I completed each chapter one at a time. So, for each month, I cut-and-pasted all the blog posts, tags and comments, then fixed up and numbered all the post titles, then added all the behind-the-scenes commentary, and cleaned up all the font styles and sizes. I recommend you do this at least for the first chapter, so you can get a good feel for what the finished version will look like.
- Decision 6: Adding a Glossary
I sent early copies of the book to five of my coworkers knowledgeable about storage, and five local friends who know nothing about storage.
Some of my early reviewers suggested having an index, so that people can find a specific post on a particular topic. Others suggested I spell out all the acronyms that appear everywhere and put that into the Reference section, rather than on each and every occurrence in the book itself. Both were good ideas, and my IBM colleague Mike Stanek suggested calling it a GOAT (Glossary of Acronyms and Terms). Acronyms are spelled out, and terms or phrases that need additional explanation have a glossary definition. For each item, I list the post or posts that use that term. Some terms are covered in dozens of posts, so I tried to pick the five or fewer most pertinent.
The glossary was far more time-consuming than I first imagined, with over 50 pages containing over 900 entries. I struggled deciding which terms and acronyms needed explanation, and which were obvious enough. On the good side, it forced me to read and re-read the entire book cover to cover, and I caught a lot of other mistakes, misspellings, and formatting errors that way. Also, I have a large international readership on my blog, so the glossary will help those for whom English is not their native language, as well as readers who are not necessarily experts in the storage industry.
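If I were doing it again, the "which posts use this term" lookup for the GOAT could be automated. A rough Python sketch (illustrative only; I built mine by hand, and the data here is made up) indexes posts by term and keeps the first five references:

```python
def build_goat(posts, terms, max_refs=5):
    # posts: {post_number: post_text}; terms: acronyms/terms to index.
    # Maps each term to at most max_refs post numbers, lowest first.
    goat = {}
    for term in terms:
        refs = [n for n in sorted(posts) if term.lower() in posts[n].lower()]
        if refs:
            goat[term] = refs[:max_refs]
    return goat

posts = {1: "SAN and NAS basics", 2: "More on NAS gateways", 3: "Tape myths"}
print(build_goat(posts, ["NAS", "Tape"]))
# -> {'NAS': [1, 2], 'Tape': [3]}
```

Picking the five *most pertinent* posts, of course, still takes a human.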
- Decision 7: Designing the Covers
Up to this point, I had been printing early drafts with simple solid-color covers. Lulu has three choices for covers:
- Just type in the text, upload an "author's photo", and choose a background color or pattern
- Upload PNG files, one for the front cover and one for the back cover, and choose the text and color of the spine.
- Upload a single one-piece PDF file that wraps around the entire book.
I had no software to generate the PDF for the third option, so I decided to try the second option. My first attempt was to format the front title page in Word, capture the screen, convert it to PNG, and upload it as the front cover. I did the same for the back cover, with a small picture of me and some paragraphs about the book.
I chose a simple, straightforward title on purpose. Thousands of IBM and other IT marketing and technical people will be ordering this book and submitting their expenses for reimbursement as work-related, and I didn't want to cause problems with a cute title like "An Engineer in Marketing La-La Land".
The next step was to use [the GIMP] GNU image manipulation program, similar to Photoshop, to add a cream-colored background, a slanted green spine, and some graphics that had been developed professionally for some of our IBM presentations. I learned how to use the GIMP when making tee-shirts and coffee mugs for our [Second Life] events, so I was already familiar with it. For new blook authors, I suggest learning how to use this for their covers, or finding someone who can do it for them.
I did the paperback version first, and once that was done, it was easy to use the same PNG files for the dust jacket of the hardcover edition, adding some extra words for the front and back flaps.
The adage "Don't judge a book by its cover" seems to apply to everything except books themselves. The book cover is the first impression online, and in a bookstore. I have seen people pick books up off the shelf at my local Barnes & Noble, read the front and back covers, peruse the front and back flaps, and make a purchase decision without ever flipping a single page of the contents inside. From an article on Book Catcher, [SELF-PUBLISHING BOOK PRODUCTION & MARKETING MISTAKES TO AVOID]:
According to selfpublishingresources website, three-fourths of 300 booksellers surveyed (half from independent bookstores and half from chains) identified the look and design of the book cover as the most important component of the entire book. All agreed that the jacket is the prime real estate for promoting a book.
While many struggle to find the right title and cover art, I think it is interesting that Lulu lets you post the same book with slightly different titles and covers, each as a separate project, and let market forces decide which one people like best. This is a common practice among market research firms.
- Decision 8: Finding someone to write the Foreword
With the book nearly done, I thought it would be a nice touch to have an IBM executive write a Foreword at the front of the book. Several turned me down, so I am glad I found a prominent worldwide IBM executive to do it. I should have started this process sooner, as she wanted to read my book in its entirety before putting pen to paper. I had not planned for this. I was hoping to be done by the end of October, but waiting for her to finish writing the Foreword added some extra weeks. Next time, I will start this process sooner.
- Decision 9: Printing Early Drafts
You need to have Lulu print at least one copy to review before making it available to the public, and it doesn't hurt to order a few intermediate draft copies to make sure everything looks right. However, from the time I order it on Lulu to the time it is in my hands is over two weeks with standard shipping, so I needed a way to print drafts to look at in between.
To avoid wear-and-tear on my color ink-jet printer, I went and bought a large black-and-white [Brother HL-5250DN] laser printer. Rather than buying specialty 6x9 paper, I used standard 8.5x11 paper with the following 2-up duplex method:
- Upload the DOC file to Lulu, and get it converted to PDF
- Download the resulting PDF from Lulu back to your computer
- View the PDF in Adobe Reader, and print it using 2-up "Booklet" mode.
For example, if you print 60 pages in booklet mode, it prints two mini-pages on the front side, and two more mini-pages on the back side, of each sheet of paper, resulting in 15 standard 8.5" x 11" pages that can be folded, stapled, and read like a mini-booklet. My entire blook could be printed on seven of these mini-booklets, saving paper and giving me a close approximation of what the final book would look like. Each mini-page is 5.5"x8.5", so just slightly smaller than the final 6"x9" form factor. I found that 60 pages/15 sheets was about the maximum before it became hard to fold in half.
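The booklet imposition that Adobe Reader does can be sketched in a few lines of Python, if you ever want to check which mini-pages land on which sheet (a hypothetical helper, just to illustrate the arithmetic):

```python
def booklet_sheets(n_pages):
    # Pad the page count up to a multiple of 4 (0 marks a blank page),
    # then pair pages so the folded, nested sheets read in order.
    n = -(-n_pages // 4) * 4
    pages = list(range(1, n_pages + 1)) + [0] * (n - n_pages)
    sheets = []
    lo, hi = 0, n - 1
    while lo < hi:
        front = (pages[hi], pages[lo])         # outer side of the sheet
        back = (pages[lo + 1], pages[hi - 1])  # inner side of the sheet
        sheets.append((front, back))
        lo += 2
        hi -= 2
    return sheets

print(len(booklet_sheets(60)))  # 60 pages -> 15 sheets of paper
```

For an 8-page example, the first sheet carries pages (8, 1) on the front and (2, 7) on the back, which is why the stack folds into reading order.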
So, if I had to do it all over again, I might have chosen 11pt Garamond (the default), or changed the default to 11pt Book Antiqua up front, so as not to have to spend so much time converting the fonts. I might have left out the glossary. I might have left in all the hyperlinks and graphics in full color for a separate e-book edition. And I definitely would have looked for an author for my Foreword much earlier in the process.
I didn't plan to write a blook when I started blogging. I have started putting [square brackets] around all my links. I have started putting "az990tony (Tony Pearson)" on all my comments. I had assumed that people were jumping to all the links I provided in context, but I learned that a blog post has to stand on its own, so now I make sure that I either paraphrase the important parts, or actually quote the text that I feel is important, so that the blog post makes sense on its own. This is perhaps good advice in general, but even more important if you plan to write a blook later.
Lastly, I decided up front to write blog posts that were 500-700 words long, about the average length of magazine or newspaper articles. In my blook, the average is 639 words per post, so I hit that goal. I have seen some blogs where each post is just a few sentences. Maybe they are posting from their cell phone, or don't have time to think out a full thought, but who wants to read a year's worth of [twitter] entries?
Well Cheryl, I hope that helps. If you need anything more, click on the "email" box on the right panel.
technorati tags: Cheryl Hagedorn, Blooking Central, Lulu, Don Campbell, IBM, Developerworks, Book Antiqua, Courier, Garamond, Microsoft, Word, OpenOffice, Lotus, Symphony, PDF, CutePDF, OS X, HTML, Hyperlinks, blook, reference, glossary, Twitter, Timothy Ferriss, fourhourworkweek, outsourcing, India
Continuing my week's theme on Innovations that matter, I thought I would tackle energy efficiency and the recent excitement over the Smart car.
USA Today had an article, [America crazy about breadbox on wheels called Smart car]. This car weighs only 2400 pounds, gets a respectable 33 MPG city and 40 MPG highway, and has a list price of $11,590 US dollars. These have been in Europe for some time now. The "Smart" name comes from combining the S from Swatch, the M from Mercedes, and ART. The car was designed by Nicolas Hayek, founder of the SWATCH wristwatch line, and manufactured by Daimler, which also makes Mercedes cars.
We have many communities here in Tucson where people drive street-legal golf carts. People don't realize it, but both electric and electric/gas hybrid golf carts have been around for a long time. Some of the nicer golf carts run for about $7,000 US dollars, with a shelf on the back that can hold two sets of golf clubs, or groceries. Of course, you would never take a golf cart on the highway, so that is where the Smart car comes in; with a 10-gallon tank, it could easily get you from one major city to another.
Like golf carts, the Smart-for-Two model being sold in the US will hold only two people, which is perfect for many American families. The standard 4-person or 5-person sedan is too big for most DINKS (Dual Income, No Kids), and other families with kids often opt for the 7-person SUV instead.
It is good to see that energy consumption is finally getting the attention it deserves. IBM recently announced some exciting offerings to help data centers manage their energy consumption:
- IBM Systems Director Active Energy Manager V3.1 [AEM]:
A new, key component of IBM's [Cool Blue portfolio] offering, AEM helps clients manage and even potentially lower energy costs. According to Gartner, insufficient power and excessive heat remain the greatest challenges in the data center. With AEM, IT managers can understand exact power/cooling costs, manage the efficiency of the current environment and reduce energy costs. AEM is the only energy management software tool that can provide clients with a single view of the actual power usage across multiple IBM platforms, including x86, blades, Power and storage systems, with plans to extend support to the mainframe.
- IBM Usage and Accounting Manager Virtualization Edition V7.1 [UAV] for System p and System x:
UAV gives IT managers more information to manage data center costs. These powerful usage management tools are designed to accurately measure, analyze, and report resource utilization of virtualized/consolidated/shared resources. With UAV, IT managers can better manage costs and justify new systems by determining who is using how much of which resource; assessing the cost of an IT service or application; and accurately charging each user or department. Working with AEM capabilities, it will also allow tracking of energy consumption costs by server and by user. This level of reporting eliminates a key inhibitor to the adoption of virtualization and consolidation and further differentiates IBM systems.
- IBM Tivoli Usage and Accounting Manager [UAM]:
This solution -- ideal for heterogeneous IT shops -- serves as an accurate measurement tool underlying billing processes and SLA compliance. UAM provides usage-based accounting and charging for virtually any IT resource across the enterprise -- ranging from mainframes to virtualized servers to storage networks and more. The Usage and Accounting Manager Virtualization offerings integrate seamlessly into it.
Whether you are trying to reduce energy consumption in your data center, or in your transportation around town, these innovations can help you stay "green".
technorati tags: Smart Car, USAToday, golf cart, street legal, hybrid, MPG, green, energy, IBM Systems, Director, AEM, UAV, UAM, TUAM, SLA, management, virtualization, DINKS, SUV
A few weeks ago, my Tivo(R) digital video recorder (DVR) died. All of the digital clocks in my house were flashing 12:00, so I suspect it was a power surge while I was at the office. The only other item to die was the surge protector, which did what it was supposed to do: give up its own life to protect the rest of my equipment. Somehow, though, it did not protect my Tivo.
I opened a problem ticket with Sony, and they sent me instructions on how to send it over to another state to get it repaired. Amusingly, the instructions included "Please make a backup of the drive contents before sending the unit in for repair." Excuse me? How am I supposed to do that, exactly?
My model has only a single 80GB drive, so my friend and I removed the drive and attached it to one of our other systems to see if anything was salvageable. It failed every diagnostic test. There was just not enough readable data for it to be usable elsewhere.
This is typical of many home systems. They are not designed for robust usage, high availability, or any form of backup/recovery process. Some of the newer models have two drives in a RAID-1 configuration, but most have many single points of failure.
And certainly, it is not mission-critical data. Life goes on without the last few episodes of Jack Bauer on "24", or the various Food Network shows that I recorded for items I plan to bake some day. For the past few weeks, I have spent more time listening to the radio and reading books. Somehow, even though my television runs fine without my Tivo, watching TV in "real time" just isn't the same.
I suspect that if you gave someone a method to do the backup, most would not bother to use it. People are relying more and more heavily on their home-based information storage systems: digital music, video, and cherished photographs. Perhaps experiencing a "loss" will help them appreciate backup/recovery systems much more than they do today.
technorati tags: Tivo, Digital Video Recorder, DVR, RAID, backup, recovery, loss, information, storage, systems
A recent blog post by Chris Mellor advances the outlandish conspiracy theory that IBM and HDS copied virtualisation technology from small start-up company DataCore.
(Chris doesn't actually name the source making such a claim, or say whether that someone was employed by any of the parties involved at the time the events occurred, or is currently employed by a competitor like EMC, bitterly jealous of the success IBM and HDS currently enjoy with their offerings.)
As I have already posted about IBM's long history of storage virtualization, SAN Volume Controller was really part of a sequence of major products in this area, following the successful 3850 MSS and 3494 VTS block virtualization products.
In the late 1990s, our research teams in Almaden, California and Hursley, UK were exploring storage technologies that could take advantage of commodity hardware parts and the industry-leading Linux operating system.
As is often the case, while IBM was working on "the perfect product", small start-ups announced "not-yet-perfect" products into the marketplace. Partnering with DataCore was a smart tactical move, for the following reasons:
- Helps identify market segments. Identify which subset of customers would most benefit from disk virtualization. While our 3850 MSS and 3494 VTS were focused on mainframe customers, this new technology was focused on distributed Unix, Windows and Linux servers.
- Helps prioritize market requirements. What are the most appealing features? What drives clients to buy disk virtualization for distributed systems platforms?
- Helps evaluate packaging options. Should we deliver pure software and expect customers to purchase their own servers? Should we offer this as a "service offering" with installation and deployment services included? Should we offer this as hardware with software pre-installed?
The partnership proved worthwhile, not just to prove to IBM that this was a worthwhile market to enter, but also to show how "NOT" to package a solution. Specifically, DataCore SANsymphony was software that you had to install on your own Windows-based server. The client was left with the task of ordering a suitable Intel-based server with the right amount of CPU cycles, RAM and host bus adapter ports, and configuring the Windows operating system and DataCore software.
It didn't go well. Basically, customers were expected to be their own "hardware engineers", having to know way too much about storage hardware and software to design a combination that worked for their workloads. Most clients were disappointed with the amount of effort involved, and the resulting poor performance.
To fix this, IBM delivered the SAN Volume Controller, with an optimized Linux operating system and internally-written software that runs on IBM System x(tm) server hardware optimized for performance.
I can't speak for HDS, but I suspect they came to similar conclusions that resulted in a similar decision to build their product in-house. I welcome Hu Yoshida to correct me if I am wrong on this.
technorati tags: Chris Mellor, DataCore, SANsymphony, IBM, SVC, HDS, EMC, Invista, disk, storage, virtualization, Hu Yoshida, Windows, Linux
What a great way to wrap up another excellent week!
While I was away on vacation last week, IBM Storage and Software Offerings won Brand Impact 2007 Awards from leading brand marketing organization Liquid Agency at the Brand Summit Awards Dinner. Other awards went to Cisco, Google and Sony, which I also highly admire.
For those in the USA, next Monday is Memorial Day. I'll be in Australia, which has a similar holiday, ANZAC Day, which happened last month (April 25).
Have a safe weekend!
technorati tags: IBM, storage, software, awards, brand, impact, Liquid Agency, dinner, Cisco, Google, Sony, Memorial Day, ANZAC
In Storage Technology News, Marc Staimer makes his Seven network storage predictions for 2007. Let's take a closer look at each one.
- Federal Rules of Civil Procedure (FRCP) will increase adoption of unstructured data classification, email archive systems and CAS.
CAS continues to flounder, but the rest I can agree with. Regulations are being adopted worldwide; Japan has its own Sarbanes-Oxley (SOX) style legislation going into effect in 2008. IBM TotalStorage Productivity Center for Data is a great tool to help classify unstructured file systems. IBM CommonStore for email supports both Microsoft Exchange and Lotus Domino, and can be connected to IBM System Storage DR550 for compliance storage.
- Unified storage systems (combined file and block storage target systems) will become increasingly attractive in 2007, because of their ease of use and simplicity.
I agree with this one also. Our sales of IBM N series in 2006 were great, and look to continue strong growth in 2007. The IBM N series brings together FCP, iSCSI and NAS protocols in one disk system. With the SnapLock(tm) feature, N series can store both re-writable data and non-erasable, non-rewriteable data on the same box. Combine the N series gateway on the front-end with SAN Volume Controller on the back-end, and you have an even more powerful combination.
- Distributed ROBO backup to disk will emerge as the fastest growing data protection solution in 2007.
IDC had a similar prediction for 2006. ROBO refers to "Remote Office/Branch Office", and so ROBO backup deals with how to back up data that is out in the various remote locations. Do you back it up locally, or send it to a central location? Fortunately, IBM Tivoli Storage Manager (TSM) supports both ways, and IBM has introduced small disk and tape drives and auto-loaders that can be used in smaller environments like this. I don't know whether "backup to disk" will be the fastest growing, but I certainly agree that a variety of ROBO-related issues will be of interest this year.
- 2007 will be remembered as the year iSCSI SAN took off because of the much reduced pricing for 10 Gbit iSCSI and the continued deployment of 10 Gbit iSCSI targets.
While I agree that iSCSI is important, I can't say 2007 will be remembered for anything in particular. We have terrible memories for these things. Ask someone what year Personal Computers (PCs) took off, and they will tell you about Apple's famous 1984 commercial. Ask someone when the Internet took off, or cell phones took off, and I suspect most will provide widely different answers, most likely based on their own experience.
For the longest time, I resisted getting a cell phone. I had a roll of quarters in my car, and when I needed to make a call, I stopped at the nearest pay phone and made the call. In 1998, pay phones disappeared; you can't find them anymore. That was the year cell phones took off, at least for me.
Back to iSCSI, now that you can intermix iSCSI and SAN on the same infrastructure, either through intelligent multi-protocol switches available from your local IBM rep, or through an N series gateway, you can bring iSCSI technology in slowly and gradually. Low-cost copper wiring for 10 Gbps Ethernet makes all this very practical.
Another up-and-coming technology is AoE, or ATA-over-Ethernet. Same idea as iSCSI, but taken down to the ATA level.
- CDP will emerge as an important feature on comprehensive data protection products instead of a separate managed product.
Here, CDP stands for Continuous Data Protection. While normal backups work like a point-and-shoot camera, taking a picture of the data once every midnight for example, CDP records all the little changes like a video camera, with the option to rewind or fast-forward to a specific point in the day. IBM Tivoli CDP for Files, for example, is an excellent complement to IBM Tivoli Storage Manager.
The technology is not really new, as it has been implemented as "logs" or "journals" on databases like DB2 and Oracle, as well as business applications like SAP R/3.
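The "video camera" idea boils down to an append-only journal of changes. A toy Python sketch (my own illustration, not any actual product's design) shows how replaying the journal reconstructs the data as of any point in the day:

```python
class ChangeJournal:
    # Append-only log of (timestamp, key, value) writes; replaying the
    # log up to a chosen time reconstructs the data as of that moment.
    def __init__(self):
        self.log = []

    def write(self, ts, key, value):
        self.log.append((ts, key, value))

    def state_at(self, ts):
        state = {}
        for t, key, value in self.log:
            if t > ts:
                break
            state[key] = value
        return state

journal = ChangeJournal()
journal.write(9, "report.doc", "draft")
journal.write(14, "report.doc", "final")
print(journal.state_at(10))  # "rewind" to 10 o'clock -> draft version
```

A midnight backup, by contrast, would only ever see whichever version happened to exist at midnight.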
The prediction here, however, relates to packaging. Will vendors "package" CDP into existing backup products, possibly as a separately priced feature, or will they leave it as a separate product that, as in IBM's case, is already well integrated?
- The VTL market growth will continue at a much reduced rate as backup products provide equivalent features directly to disk. Deduplication will extend the VTL market temporarily in 2007.
VTL here refers to Virtual Tape Library, such as the IBM TS7700 or TS7510 Virtualization Engine. IBM introduced the first one in 1997, the IBM 3494 Virtual Tape Server, and we have remained number one in market share for virtual tape ever since. I find it amusing that people are only now looking at VTL technology to help with their Disk-to-Disk-to-Tape (D2D2T) efforts, when IBM Tivoli Storage Manager has had the capability to back up to disk, then move to tape, since 1993.
As for deduplication, if you need the end-target box to deduplicate your backups, then perhaps you should investigate why you are doing this in the first place. People take full-volume backups and keep too many copies of them, when more sophisticated backup software like Tivoli Storage Manager can implement backup policies to avoid this with a progressive backup scheme. Or maybe you need to investigate why you store multiple copies of the same data on disk; perhaps NAS or a clustered file system like IBM General Parallel File System (GPFS) could provide you a single copy accessible to many servers instead.
The reason you don't see deduplication on the mainframe is that DFSMS for z/OS already allows multiple servers to share a single instance of data, and has been doing so since the early 1980s. I often joke with clients at the Tucson Executive Briefing Center that you can run a business with a million data sets on the mainframe, but that there were probably a million files on just the laptops in the room, and few would attempt to run their business that way.
- Optical storage that looks, feels and acts like NAS and puts archive data online, will make dramatic inroads in 2007.
Marc says he's going out on a limb here, and it's good to make at least one risky prediction. IBM used to have an optical library that emulated disk, the IBM 3995. Lack of interest and advancement in the technology encouraged IBM to withdraw it. A small backlash ensued, so IBM now offers the IBM 3996 for the System p and System i clients that really, really want optical.
As for optical making data available "online", it takes about 20 seconds to load an optical cartridge, so I would consider this more "nearline" than online. Tape is still in the 40-60 second range to load and position to data, so optical still has an advantage there.
Optical eliminates the "hassles of tape"? Tape data is good for 20 years, and optical for 100 years, but nobody keeps drives around that long anyways. In general, our clients change drives every 6-8 years, and migrate the data from old to new. This is only a hassle if you didn't plan for this inevitable movement. IBM Tivoli Storage Manager, IBM System Storage Archive Manager, and the IBM System Storage DR550 all make this migration very simple and easy, and can do it with either optical or tape.
The Blu-ray vs. DVD debate will continue through 2007 in the consumer world. I don't see this being a major player in more conservative data centers, where a big investment in the wrong choice could be costly, even if the price-per-TB is temporarily in line with current tape technologies. IBM and others are investing a lot of Research and Development funding to continue the downward price curve for tape, and I'm not sure that optical can keep up that pace.
Well, that's my take. It is a sunny day here in China, and I have more meetings to attend.
technorati tags: IBM, FRCP, SOX, TotalStorage, Productivity Center, Microsoft, Exchange, Lotus, Domino, DR550, SnapLock, unified storage, NAS, iSCSI, FCP, ROBO, Tivoli, Storage Manager, TSM, Ethernet, AoE, CDP, DB2, Oracle, SAP, VTL, TS7700, TS7510, GPFS, DFSMS, Optical, 3995, 3996, Blue-Ray, D2D2T, DVD
Continuing my week's theme on travel, conferences, and Japan, I saw two items in the news that seem to follow a common theme.
- According to "The Daily Yomiuri", a local Japanese paper, "double happy weddings" are becoming more and more popular in Japan. These would be called "shotgun" weddings in the US, but in Japan, couples pay extra to have a wedding between the fifth and seventh month of pregnancy. As Dave Barry would say, I am not making this up. Some 27% of couples in Japan got married while or after pregnant. The logic is that they can celebrate both events with one ceremony. Many couples believe that the primary purpose of marriage is to have children, and some that fail to have children suffer terrible anguish or divorce. Waiting until being pregnant helps ensure the couple will be "successful" in this regard.
- IBM acquires Softek, a software company that develops a product called Transparent Data Migration Facility (TDMF) to move mainframe data from one disk system to another, while applications are running. This can be used, for example, to move data from outdated disk systems to IBM disk systems. This is not to be confused with IBM's archive and retention software partner, Princeton Softech.
Softek is the software spin-off of Fujitsu (a Japanese computer hardware manufacturer). For a while, Fujitsu made IBM-compatible mainframe servers, but was not successful at developing its own system software, relying heavily on IBM for this. Unable to compete against IBM, it stopped making mainframe servers, but continues making other kinds of hardware equipment.
With TDMF, the process of moving data is simple. The software runs on z/OS and intercepts all writes intended for source volumes on the old array, and re-directs a copy to destination volumes on the new device. Systems can run with old and new equipment side by side for a few weeks, with the new device staying in-sync with the old. When the client is ready to cross over, the systems are pointed to the new disk, and the old disk systems are detached and removed from the sysplex.
Afraid that installing TDMF will mess with your applications? IBM Global Technology Services (GTS) can roll in a separate mainframe, move the data, then disconnect it along with the old storage.
(For customers running Linux, UNIX or Windows on other platforms, IBM offers the SAN Volume Controller (SVC). While SVC is not marketed as a "data migration device" per se, it does have this capability. Many clients were able to cost-justify the purchase of an SVC to move data from old storage to new, in similar fashion to how TDMF works on the mainframe.)
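The TDMF write-interception flow described above can be sketched as a toy model (hypothetical Python, not the actual z/OS product): while migration is active, every write is applied to both the source and destination volumes, so the new device stays in sync until cutover.

```python
class MigrationMirror:
    """Toy model of TDMF-style migration: while syncing, every write to
    the old volume is also applied to the new volume, keeping them in step."""

    def __init__(self, old_volume, new_volume):
        self.old = old_volume
        self.new = new_volume
        self.cutover_done = False

    def write(self, block, data):
        if self.cutover_done:
            self.new[block] = data      # after cutover, only the new disk
        else:
            self.old[block] = data      # intercept: apply to source...
            self.new[block] = data      # ...and mirror to destination

    def initial_copy(self):
        # Bulk-copy existing blocks. setdefault() keeps any block already
        # mirrored by write(), so fresher data is never overwritten.
        for block, data in self.old.items():
            self.new.setdefault(block, data)

    def cutover(self):
        self.initial_copy()
        self.cutover_done = True


old = {0: b"boot", 1: b"data"}
new = {}
m = MigrationMirror(old, new)
m.write(1, b"updated")    # mirrored to both while migration is active
m.cutover()
m.write(2, b"fresh")      # lands only on the new volume
print(new)                # new volume now holds everything; old is detachable
```

This is of course a drastic simplification: the real product handles in-flight I/O ordering, checkpointing, and restart, but the shape of the technique is the same.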
What do these stories have to do with one another, other than both relating to Japan? IBM has been using TDMF for years as part of a service offering to move data from one disk system to another. Since Sam Palmisano took over in 2002, IBM has acquired 51 companies, 31 of them software companies. Often, these acquisitions have been "successful", turning profitable quickly, because IBM was already well familiar with the companies it acquired, in much the same way that husbands are well familiar with their brides-to-be at a "double happy wedding".
So, welcome Softek! It looks like it's time to celebrate again!
technorati tags: IBM, Japan, Daily Yomiuri, double happy, wedding, shotgun, Dave Barry, Softek, TDMF, z/OS, Fujitsu
Continuing on my theme of storage area networking, today I thought I would cover the concept of convergence. This is the notion of disparate things coming together.
Convergence plays a big role in Apple's new iPhone. ExpatJane has a nice collection of news articles. Gizmodo has a two-part hands-on experience of the iPhone here and here. Seth Godin opines that the iPhone is not for everyone.
I would fall into the "not for me" category, at least at this time. The iPhone is a GSM-capable phone with the ability to store 4GB or 8GB of music, photos and video, and it incorporates a 2 megapixel camera. Currently, I have separate components:
- A cell phone that is GSM plus CDMA, with features like "speakerphone" which I use quite a lot, but NO camera.
- A 7 megapixel camera, also very small, with removable memory cards.
- A 60GB iPod, with music and photos. My model is older and doesn't handle videos.
Since I visit government agencies, research and development labs, and other places that don't allow cameras, I have to either choose a cell phone that does not have camera capability, or have a camera phone that I leave behind in the car or at the front desk. I have chosen to get cell phones with NO camera. So, NOT having a camera is a primary feature I look for, but this is getting harder and harder these days. I don't know if Apple plans to offer a non-camera version of the iPhone, but without one, the built-in camera is a deal-breaker for me.
I do carry a separate camera, and where it is permissible, use it separately. This is especially useful if you do a lot of whiteboard or flipchart presentations, and want to capture what you have written for later. (For a great example of how effectively whiteboards can be used, check out these videos from UPS.) A picture is worth a thousand words, and it is easier to convey an idea with pictures, especially in countries where English is not widely spoken. Last month, I got a 7 megapixel camera to replace my 5 megapixel. For my work, the 2 megapixel camera found in the iPhone is not detailed enough.
As for my iPod, I enjoy that I can carry 60GB of music and photos. When I go on vacations, I can bring my camera and iPod, and connect the two, transferring and viewing the pictures that I take. I can easily free up 5-10 GB of space on my iPod for photos in preparation for a trip, then replace that with music when I am back at home. I also use my iPod as a remote disk drive for my laptop on business trips. Again, the 4GB and 8GB may not be enough for what I need.
Printers were never converged into Personal Computers, but they did have their own convergence. I have a multi-function printer/scanner/fax machine. I used to have separate printer, scanner and fax machines, but now the technology is so inexpensive that it got all combined into one solution.
The same is happening for Storage Area Networking gear.
- Thanks to Fibre Channel, switches and directors can handle both SCSI commands (FCP) and CCW commands (FICON). This allows the mainframe and distributed systems to converge their traffic onto a single network, and is less expensive than trying to maintain one network for the mainframes, and another for the distributed platforms.
- On the SCSI side, there are now switches that let you have pluggable ports of different flavors. For example, you can have some ports be Fibre Channel to receive FCP, and other ports be Ethernet to carry iSCSI. iSCSI is a protocol co-developed by IBM and Cisco to carry SCSI commands over Ethernet. Since most computers already have Ethernet "network interface cards" and most buildings are already wired with an Ethernet infrastructure, this provides a less expensive alternative to Fibre Channel.
- Routers, and combination Router/Switches, can send all the FCP/FICON/iSCSI traffic over various long distances to remote data centers, using either iFCP or FCIP protocols. This is a less expensive alternative to dropping your own private "dark fiber" between the two locations, which often involves negotiating access rights to dig trenches through other people's property.
Which brings me back to Apple's iPhone. One device can make calls, play videos, and download webpages, all because the networks have converged on sending all data in "packets". The network just routes packets from one place to another; it doesn't care whether a packet carries voice, video, or a webpage.
This convergence then lets the convenience of a handheld device serve as the conduit for doing business, potentially replacing the credit card. IBM helped Visa and Nokia join forces to use cell phones as wallets. According to the article...
"Users can pay for groceries and other purchases by swiping a phone over a reader that electronically communicates with a microchip on the phone. Phone owners confirm the purchase with the push of a button and the deal is complete.
The platform is the result of many years of trials around the world and will enable mobile contactless payments, remote payments, person-to-person payments, and mobile coupons."
Now that's convergence I can get excited about!
technorati tags: IBM, SAN, Apple, iPhone, GSM, CDMA, iPod, UPS, whiteboard, FCP, FICON, SCSI, iSCSI, Ethernet, iFCP, FCIP, dark fiber, Visa, Nokia, Cisco , convergence
For those of us in the northern hemisphere, yesterday was this year's Winter Solstice, representing the shortest amount of daylight between sunrise and sunset. So today, I thought I would blog my thoughts on managing scarcity.
Earlier in my career, I had the pleasure to serve as "administrative assistant" to Nora Denzel for a week at a storage conference. My job was to make her look good at the conference, which, if you know Nora, doesn't take much. Later, she left IBM to work at HP, and I got to hear her speak at a conference. The one thing that I remember most was her statement that the whole point of "management" was to manage scarcity, as in not enough money in the budget, not enough people to implement change, or not enough resources to accomplish a task. (Nora, I have no idea where you are today, so if you are reading this, send me a note.)
Of course, the flip-side to this is that resources that are in abundance are generally taken for granted. Priorities are focused on what is most scarce. Let's examine some of the resources involved in an IT storage environment:
- Capacity - while everyone complains that they are "running out of space", the truth is that most external disk attached to Linux, UNIX, or Windows systems is only 20-40% full of data. Many years ago, I visited an insurance company to talk about a new product called IBM Tivoli Storage Manager. This company had 7TB of disk on their mainframe, and another 7TB of disk scattered on various UNIX and Windows machines. In the room were TWO storage admins for the mainframe, and 45 storage admins for the distributed systems. My first question was "why so many people for the mainframe, certainly one of you could manage all of it yourself, perhaps on Wednesday afternoons?" Their response was that they acted as each other's backup, in case one goes on vacation for two weeks. My follow-up question to the rest of the audience was: "When was the last time you took two weeks vacation?" Mainframes comfortably fill their disk and tape storage at over 80-90% full of data, primarily because they have a more mature, robust set of management software, like DFSMS.
- Labor - by this I mean skilled labor able to manage storage for a corporation. Some companies I have visited keep their new-hires off production systems for the first two years, working only on test or development systems until then. Of course, labor is more expensive in some countries than others. Last year, I was doing a whiteboard session on-site for a client in China, and the last dry-erase pen ran out of ink. I asked for another pen, and they instead sent someone to go refill it. I asked wouldn't it be cheaper just to buy another pen, and they said "No, labor is cheap, but ink is expensive." Despite this, China does complain that there is a shortage of skilled IT labor, so if you are looking for a job, start learning Mandarin.
- Power and Cooling - Most data centers are located on raised floors, with large trunks of electrical power and huge air conditioning systems to deal with all the heat generated from each machine. I have visited the data centers of clients that are now forced to make decisions on storage based on power and cooling consumption, because the costs to upgrade their aging buildings are too high. Leading the charge is IBM, with technology advancements in chips, cards, and complete systems that use less power and generate less heat. While energy is still fairly cheap in the grand scheme of things, fears of Global Warming and declining oil supplies have put the costs of power and cooling in the news lately. In 1956, Hubbert predicted the US would reach peak oil supplies by 1965-1970 (it happened in 1971), and this year Simmons estimated that world-wide oil production already began its decline in 2005. Smart companies like Google have moved their server farms to places like Oregon in the Pacific Northwest for cheaper hydroelectric power.
- Bandwidth - Last year IBM introduced 4Gbps Fibre Channel and FICON SAN networking gear, along with the servers and storage needed to complete the solution. 4Gbps equates to about 400 MB/sec in data throughput. By comparison, iSCSI is typically run on 1Gbps Ethernet, but has so much overhead that you only get about 80 MB/sec. Next year, we may see both 8 Gbps SAN and 10GbE iSCSI, to provide 800 MB/sec throughput. My experience is that the SAN is not the bottleneck; instead, people run out of bandwidth at the server or storage end first. They may not have a million dollars to buy the fastest IBM System p5 servers, or may not have enough host adapters at the storage system end.
- Floorspace - I end with floorspace because it reminds me that many "shortages" are temporary or artificially created. Floorspace is only in short supply because you don't want to knock down a wall, or build a new building, to handle your additional storage requirements. In 1997, Tihamer Toth-Fejel wrote an article for the National Space Society newsletter that estimated: "Everybody on Earth could live comfortably in the USA on only 15% of our land area, with a population density between that of Chicago and San Francisco. Using agricultural yields attained widely now, the rest of the U.S. would be sufficient to grow enough food for everyone. The rest of the planet, 93.7% of it, would be completely empty." Of course, back in 1997 the world population was only 5.9 billion, and this year it is over 6.5 billion.
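Going back to the bandwidth item above, the throughput figures follow from a back-of-envelope rule: Fibre Channel at these speeds uses 8b/10b encoding, so every 10 line bits carry only 8 data bits, and the MB/sec figure works out to roughly the line rate divided by 10. A quick sketch:

```python
def fc_throughput_mb_s(line_rate_gbps):
    """Approximate Fibre Channel payload throughput in MB/sec.
    1/2/4/8 Gbps FC use 8b/10b encoding: 10 line bits carry 8 data bits."""
    data_bits_per_sec = line_rate_gbps * 1e9 * 8 / 10  # strip encoding overhead
    return data_bits_per_sec / 8 / 1e6                 # bits -> bytes -> MB

for gbps in (1, 2, 4, 8):
    print(f"{gbps} Gbps FC ~ {fc_throughput_mb_s(gbps):.0f} MB/sec")
```

This gives the familiar 400 MB/sec for 4Gbps and 800 MB/sec for 8Gbps; real-world numbers are a bit lower once frame headers and protocol overhead are counted.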
This last point brings me back to the concept of food, and I am not talking about doughnuts in the conference room, or pizza while making year-end storage upgrades. I'm talking about the food you work so hard to provide for yourself and your family. The folks at Oxfam came up with a simple analogy. If 20 people sit down at your table, representing the world's population:
- 3 would be served a gourmet, multi-course meal, while sitting at a decorated table in a cushioned chair.
- 5 would eat rice and beans with a fork and sit on a simple cushion.
- 12 would wait in line to receive a small portion of rice that they would eat with their hands while sitting on the floor.
So for those of you planning a special meal next Monday, be thankful you are one of the lucky three, and hopeful that IBM will continue to lead the IT industry to help out the other seventeen.
Happy Winter Solstice!
technorati tags: IBM, Northern, Hemisphere, Winter, Solstice, Nora+Denzel, Oxfam, scarcity, Linux, UNIX, Windows, TSM, Tivoli+Storage+Manager, storage, admins, global+warming, climate+change, peak+oil, National+Space+Society, special, meal
On SearchStorage.com, my buddy Tony Asaro recaps the latest Storage Acquisition Frenzy.
It has always been the case in fast-paced technology areas that you can't tell the players without a program card, and this is especially true for storage.
When analyzing each acquisition move, you need to think about what is driving it. What are the motives? Having been in the storage business for 20 years now, and seen my share of acquisitions, both from within IBM as well as the competition, I have come up with the following list of motives.
Although slavery was abolished in the US back in the 1800s, and centuries earlier everywhere else, many acquisitions seem to be focused on acquiring the people themselves, rather than the products or client list. I have seen statistics such as "We retained 98% of the people!" In reality, these retentions usually involve costly incentives, sign-on bonuses, stock options, and the like. Despite this, people leave after a few years, often because of personality or "corporate culture" clashes. For example, many former STK employees seem to be leaving after their company was acquired by Sun Microsystems.
If you can't beat them, join them. Acquisitions can often be used by one company to raise its ranking in marketshare, eliminating smaller competitors. And now that you have acquired their client list, perhaps you can sell them more of your original set of products!
Symantec had acquired Veritas, which in turn had acquired a variety of other smaller players, and the end result is that they are now the #1 backup software provider, even though none of their products holds a candle to IBM's Tivoli Storage Manager. Meanwhile, EMC acquired Avamar to try to get more into the backup/recovery game, but most analysts still place EMC down in the #4 or #5 spot in this category.
Next month, Brocade's acquisition of McData should take effect, furthering its marketshare in SAN switch equipment.
Prior to my current role as "brand market strategist" for System Storage, I was a "portfolio manager", where we tried to make sure that our storage product line investments were balanced. This was a tough job, as we had to balance development investments across different technologies, including patent portfolios. Despite IBM's huge research budget, I am not surprised that some clever inventions of new technologies come from smaller companies that then get acquired once their results appear viable.
- Value Shift
The last motive is value shift. This is where companies try to re-invent themselves, or find that they are stuck in a commodity market rut, and wish to expand into more profitable areas.
LSI Logic's acquisition of StoreAge is a good example of this. Most of the major storage vendors have already shifted to software and services to provide customer value, as predicted in the 1990s by Clayton Christensen in his book "The Innovator's Dilemma". The rest are still struggling to develop the right strategy, but are leaning in this general direction.
I hope that provides some insight.[Read More]
I've only had this blog since Sep. 1, and already it is listed in the Data Storage Bloggers wiki list.
In last week's System Storage Portfolio Top Gun class in Dallas, some of the students were not familiar with Really Simple Syndication (RSS). For the uninitiated, this can be intimidating. I thought a quick overview of what I've done might help:
- Chose a "feed reader". I chose Bloglines but there are many others.
- Use Technorati to search other blogs for keywords or phrases I am looking for.
- When I find a blog that I like and want to continue tracking, I "add" it to my subscription list on Bloglines. Just hit "add" and copy in the URL of the blog you want to track; Bloglines will figure out the RSS details required. I track eight blogs at the moment, but some people with lots of time on their hands track 20 or more. It is easy to unsubscribe, so don't be afraid to try some out for a few days.
- Since I was actually going to run a blog of my own, I read a few books on the topic. One I recommend is "Naked Conversations" by Robert Scoble and Shel Israel, both experienced bloggers.
- Finally, I am not big on spell checking, but most places have the option to preview your post or comment before it actually gets posted, which is not a bad idea if you use any HTML tags.
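Under the covers, what a feed reader like Bloglines consumes is just an XML document. Here is a minimal sketch using Python's standard library to parse an RSS 2.0 feed (the feed content below is made up for illustration):

```python
import xml.etree.ElementTree as ET

# A minimal RSS 2.0 document, inlined so the example needs no network access.
rss = """<rss version="2.0"><channel>
  <title>Inside System Storage</title>
  <item><title>IBM Announcements</title><link>https://example.com/post1</link></item>
  <item><title>Managing Scarcity</title><link>https://example.com/post2</link></item>
</channel></rss>"""

root = ET.fromstring(rss)
print("Feed:", root.findtext("channel/title"))
for item in root.iter("item"):
    # Each <item> is one blog post; a reader shows new items since last poll.
    print("-", item.findtext("title"), "->", item.findtext("link"))
```

A real reader simply fetches this XML from the blog's feed URL on a schedule and shows you whichever items are new.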
For a quick taste of blogging, consider using Data Storage Blogger Feed Reader. This has a lot of blogs on the topic of storage, already added and categorized for your convenience, ready for your perusal.
I am sure there are many other ways to enjoy the Blogosphere, but this works for me.[Read More]
IBM Master Inventor, Senior IT Architect, and Event Content Manager
Well, it's Tuesday again, and you know what that means? IBM Announcements!
It has been 71 days since my last post, and people were beginning to worry. Did Tony win the lottery? Was Tony hit by a bus? Did Tony get abducted by aliens from another planet? No, none of these things happened.
I got a new job! I am still working with IBM Systems, but now am one of the Event Content Managers for the [IBM Systems Technical University] events, or TechU for short. For those who have attended these events, I have taken over for Glenn Anderson, who retired December 31, 2018.
Last year, the IBM TechU team asked me to present eight topics at an event in Johannesburg, South Africa, as a last-minute replacement for other speakers. While there, Glenn Anderson mentioned to me that he was planning to retire. "Do you know anyone interested in taking over for me?" he asked.
I jumped at the chance to apply for the job! There was stiff competition, but after three rounds of interviews and background checks, I was offered the position! This all happened after the zTechU event in Hollywood, Florida last October, so if this is the first you have heard of it, you are not alone.
For those wondering "What about the IBM Tucson Client Experience Center?" you have good reason to ask. I had worked at the center for the past 12 years, the last six as the chief subject matter expert (SME) for all things IBM System Storage. Who is going to replace me? The job posting is still open, and the new manager, John Zupetz, has been reviewing resumes.
As time permits, I will continue to do storage briefings to help out during this transition, both here in Tucson, Arizona, as well as outbound to various client and IBM locations. I have also offered to help train whomever gets hired for the job.
In my new role, I will be managing TechU events, selecting topics, accepting speakers, scheduling sessions, and even presenting sessions at the events themselves. I will be focused on IBM Z and LinuxONE mainframe servers and related Storage solutions, but will also manage sessions on soft skills for IT Leadership and Professional Development.
We plan to have about 18 events this year, spanning countries across six continents! I just finished smaller 3-day events in Istanbul, Turkey and Cairo, Egypt, and am now working on larger events to be held in Dubai, Atlanta, and Berlin!
Shameless plug! Registration is open for these TechU events. I plan to be at all three, if you want to meet me in person!
In the meantime, I have decided to take on two extra co-authors for this blog. Let me introduce them to you:
- Christopher Vollmar
Christopher Vollmar, or Chris for short, is a Storage Architect supporting customers in Canada and the Caribbean. He is a Certified Consulting IT Specialist and a Design Thinking practitioner.
Chris and I have worked together for many years. Most recently, we were in Zurich, Switzerland, as contributing authors of two IBM Redpapers: [IBM Private, Public, and Hybrid Cloud Storage Solutions] and the [IBM Software-Defined Storage Guide].
You can follow Chris on Twitter @vollmar_chris and LinkedIn Chris Vollmar.
- Lloyd Dean
Lloyd Dean is an IBM Senior Certified Executive IT Architect in Infrastructure Architecture. Lloyd has held numerous senior technical roles during his 19-plus years at IBM. Most recently, he has been leading efforts across the Communication/CSI market as a senior Storage Solution Architect/CTS covering the Kansas City territory.
I also have known Lloyd for years. In prior roles, Lloyd supported the industry accounts as a Storage Solution architect and prior to that as a Storage Software Solutions specialist during his time in the Advanced Technical Skills (ATS) organization.
Lloyd currently supports North America storage sales teams in his Storage Software Solution Architecture SME role on the Washington Systems Center team. His current focus is IBM Cloud Private, and he will be delivering and supporting sessions at Think 2019 and Storage Technical University on the value of IBM storage in this high-value solution, part of the IBM Cloud strategy. Lloyd maintains Subject Matter Expert status across the IBM Spectrum Storage software solutions. You can follow Lloyd on Twitter @ldean0558 and LinkedIn Lloyd Dean.
I have spent the last 10 weeks working with the IBM developerWorks team converting from a single-author blog to a multi-author blog. I had no idea it would be so complicated to re-work the HTML templates, acquire all the legal and managerial approvals, and then authorize additional contributors!
I look forward to working with my new co-authors!
Well, it's Tuesday again, and you know what that means? IBM Announcements! This week IBM announced new and refreshed storage products.
On Feb 20, there will be a [Live Stream event] to watch the announcements online. The event is at Half Moon Bay in California, starting at 9:30am Pacific Standard Time (PST).
IBM decided to do things a bit differently for this launch. Instead of dozens of stodgy press releases, IBM has opted to complement with a series of blog posts, with [Storage innovation drives 21st century business] providing an overall recap.
(FTC Disclosure: I work for IBM. This blog post can be considered a "paid celebrity endorsement")
- IBM Spectrum NAS
IBM Spectrum NAS is a new software-defined storage offering to address three specific market segments:
- General purpose file serving and home directories
- Native SMB protocol NAS for Microsoft Windows Applications
- File serving for Virtualization Environments, such as VMware and Hyper-V
IBM Spectrum NAS is software that you can run on your x86 servers, either bare metal or as Virtual Machines. You start with four nodes, and can scale out to tens of machines as you grow.
IBM Spectrum NAS was written from scratch, not based on the open source SAMBA software. It was deployed internally within IBM last year, and is now being productized. It is highly compatible with the SMB2 and SMB3.1 protocol specifications, and supports the NFS3, NFS4 and NFS4.1 protocols as well.
As a scale-out solution, it is both more robust and scalable than a single Windows server, and less expensive to run than traditional dual-controller NAS filers.
To learn more, see the [IBM Spectrum NAS V1.7] press release, and [Easy to manage IBM Spectrum NAS] blog post.
- IBM Spectrum Protect and Protect Plus
IBM Spectrum Protect has been enhanced to detect ransomware attacks, and offers improved auditing to meet the European Union's General Data Protection Regulation [GDPR] privacy legislation.
(If you are not in Europe, and feel this legislation does not apply, you may be sadly mistaken. This legislation may affect any company that shares information with EU companies, or has even a single client from the European Union. Think of it as this year's [Y2K crisis]. It hits globally on May 25, 2018.)
IBM Spectrum Protect Plus offers snapshot support for both VMware and Hyper-V virtualization environments. The vSnap repository can now be replicated to a remote facility for Business Continuity and Disaster Recovery (BC/DR). IBM Spectrum Protect Plus is now also available as a Software-as-a-Service (SaaS) offering on IBM Cloud.
To learn more, see [Innovations just keep on coming with IBM Spectrum Protect and Protect Plus].
- NVMe over Fabric (NVMe-oF)
IBM FlashSystem 900 storage system will be enabled for NVMe-oF, initially over InfiniBand. In my January 2018 blog post [NVMe and the future of NVMe-OF and FC-NVMe], I cover these technologies in detail.
IBM also made an official Statement of Direction (SoD) to offer NVMe-oF support on many of its other storage systems as well.
To learn more, see [Driving new level performance with NVMe storage solutions].
- IBM Spectrum Virtualize
IBM Spectrum Virtualize is the software in SAN Volume Controller, FlashSystem V9000, and the Storwize V7000 and V5000 series. It is also available as software you can deploy on your own x86 servers, or in the IBM Cloud. Fellow IBM master inventor and blogger Barry Whyte has a great post on the details of Spectrum Virtualize v8.1.2 latest release, including [Data Reduction Pools].
To learn more, see [Spectrum Virtualize v8.1.2], [IBM SAN Volume Controller, IBM FlashSystem V9000, and IBM Storwize family], and [New architecture for IBM Spectrum Virtualize].
- IBM Cloud Object Storage System
Cohasset Associates has reviewed the IBM Cloud Object Storage (IBM COS) Compliance Enabled Vaults (CEV) capability and determined that this feature meets the U.S. Securities and Exchange Commission (SEC) Rule 17a-4 requirement for non-erasable, non-rewriteable (NENR) tamperproof enforcement.
Some clients also refer to this as Immutability, Content Addressable Storage, or Write-Once Read-Many (WORM). Rather than invent new terminology, IBM opts to use Non-erasable, Non-rewriteable to match the standard language in the SEC 17a-4 regulation.
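What NENR enforcement means in practice can be sketched as a toy model (hypothetical Python, not the IBM COS implementation): once an object is stored, overwrites are always refused, and deletes are refused until the retention period expires.

```python
import time

class WormVault:
    """Toy sketch of non-erasable, non-rewriteable (NENR) enforcement:
    stored objects can be read, but never overwritten, and never deleted
    before their retention period expires."""

    def __init__(self, retention_seconds):
        self.retention = retention_seconds
        self.objects = {}   # name -> (data, stored_at timestamp)

    def put(self, name, data):
        if name in self.objects:
            raise PermissionError(f"{name} is immutable (non-rewriteable)")
        self.objects[name] = (data, time.time())

    def get(self, name):
        return self.objects[name][0]

    def delete(self, name):
        _, stored_at = self.objects[name]
        if time.time() - stored_at < self.retention:
            raise PermissionError(f"{name} is under retention (non-erasable)")
        del self.objects[name]

vault = WormVault(retention_seconds=3600)
vault.put("trade-records.log", b"records")
try:
    vault.put("trade-records.log", b"tampered")   # rejected: rewrite attempt
except PermissionError as e:
    print(e)
try:
    vault.delete("trade-records.log")             # rejected: under retention
except PermissionError as e:
    print(e)
```

The real capability enforces this in the storage system itself, so even an administrator with full credentials cannot bypass it, which is the point of the 17a-4 assessment.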
IBM COS is now also eligible for "Storage Utility" pricing. See my blog post [IBM Announcements 2017 November] for details on how Storage Utility pricing is implemented.
To learn more, see [Regulatory Support for IBM COS], and [Storage Utility Pricing for IBM COS].
- IBM Spectrum Connect
More than 15 years ago, I was the chief architect for IBM Spectrum Control, which back then was called the IBM TotalStorage Productivity Center.
A subset of IBM Spectrum Control was needed for a variety of IBM storage products to support VMware in a consistent manner, so IBM made this available as the "Spectrum Control Base Edition", entitled at no additional charge. Last year, IBM also merged in storage enablement for containerized environments like Docker.
Since "IBM Spectrum Control Base Edition with Storage Enablement for containerized environments" is too long to say, IBM shortened this to "Spectrum Connect". In addition to VMware and Docker support, Spectrum Connect also supports Microsoft PowerShell and IBM Cloud Private.
To learn more, see [One stop integration for storage with IBM Spectrum Connect].
- IBM XIV Gen3 and the FlashSystem A9000/R
In 2016, IBM [announced the IBM FlashSystem A9000 and A9000R] as the official follow-on products for the XIV Gen3.
If you have 11.6.2a microcode on your XIV Gen3, you can now perform Online Volume Migration (OLVM) to FlashSystem A9000 and A9000R systems running 12.2.1 release. This will help clients in their migration efforts.
To learn more, see [A bridge across XIV generations].
Yes, this was a big launch! Send me your thoughts in the comments section below.
technorati tags: IBM, Half Moon Bay, Spectrum NAS, VMware, Hyper-V, open source, SAMBA, SMB, SMB2, SMB3, NFS3, NFS4, Spectrum Control, TotalStorage, Productivity Center, Spectrum Control Base Edition, SCBE, Docker, Microsoft PowerShell, IBM Cloud private, Spectrum Connect, XIV Gen3, FlashSystem A9000, FlashSystem A9000R, Spectrum Virtualize, SVC, FlashSystem V9000, Storwize, Storwize V5000, Storwize V7000
Well, it's Tuesday again, and you know what that means? IBM Announcements!
Today, IBM announces a complete refresh of its IBM FlashSystem® all-flash array product line.
(FTC Disclosure: I work for IBM. Compression, data footprint reduction, and performance results here are based on internal IBM tests, and vary widely by data and workload type. Your mileage may vary. This blog post can be considered a "paid celebrity endorsement".)
- New FlashSystem 900 model AE3
The new AE3 model introduces new Microlatency cards at larger capacities: 3.6, 8.5 and 18 TB. Compare that to the previous model AE2 at 1.2, 2.9 and 5.7 TB.
These capacities are achieved by combining a three-dimensional (3D) chip layout with Triple-Level Cell (TLC) transistors, often referred to as 3D-TLC. The previous technology was planar (2-dimensional) Multi-Level Cell (MLC).
Last week, at IBM Systems Technical University in New Orleans, Clod Barrera, IBM Distinguished Engineer and Chief Technical Strategist, explained this via an analogy. The 2-dimensional approach is like a bungalow: if you want to pack in more people, you need to make the rooms smaller, which is getting more difficult. Alternatively, you could build a multi-story skyscraper; adding more floors relieves the pressure to shrink the rooms.
Triple-level cell holds three bits per transistor. In the past, we had Single-level Cell (SLC) that stored one bit, and Multi-level Cell (MLC) that stored two bits. A future technology, Quad-level Cell (QLC) is not yet ready for production workloads in a datacenter.
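As a back-of-the-envelope sketch of why 3D-TLC boosts density (my own illustration, not actual die math from IBM or its NAND suppliers; the layer count is an assumption): capacity scales roughly with bits-per-cell times the number of stacked layers.

```python
# Rough sketch of why 3D-TLC packs more capacity than planar MLC.
# The 32-layer figure is an illustrative assumption, not a real die spec.
BITS_PER_CELL = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}

def relative_density(cell_type: str, layers: int = 1) -> int:
    """Density relative to a single planar layer of SLC."""
    return BITS_PER_CELL[cell_type] * layers

planar_mlc = relative_density("MLC", layers=1)   # AE2-era planar MLC: 2
tlc_3d = relative_density("TLC", layers=32)      # hypothetical 32-layer 3D-TLC: 96
```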
The new AE3 models also offer Embedded inline Compression (EiC), with "Always-On" compression done right on the Microlatency cards. With a fully-loaded 12-card 2U drawer, that is, a 10+P+S RAID-5 configuration, the effective capacity is drastically increased:
| FlashSystem 900 Model AE3 | 2U Drawer (Usable TB) | 2U Drawer (Effective TB) w/EiC |
|---|---|---|
| Small (3.6 TB cards) | 36 | 110 |
| Medium (8.5 TB cards) | 85 | 220 |
| Large (18 TB cards) | 180 | 220 |
The compression achieves 2x to 3.5x on typical data, but your mileage may vary. The small Microlatency cards are capped at 110 TB effective capacity, and the medium and large at 220 TB, to avoid overwhelming the on-board DRAM cache. For clients who need smaller amounts of flash, IBM will continue to sell the AE2 models with 1.2 TB MLC Microlatency cards.
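The effective-capacity cap described above boils down to a simple minimum: compressed capacity, bounded by the per-drawer limit. A minimal sketch, using the card sizes and caps quoted in this post (the function name is mine):

```python
def effective_capacity_tb(usable_tb: float, ratio: float, cap_tb: float) -> float:
    """Effective capacity after EiC: usable capacity times the compression
    ratio, bounded by the per-drawer cap that protects the DRAM cache."""
    return min(usable_tb * ratio, cap_tb)

# 2U drawer with 10 data cards (10+P+S RAID-5), figures from the post:
small = effective_capacity_tb(10 * 3.6, ratio=3.5, cap_tb=110)   # hits the 110 TB cap
large = effective_capacity_tb(10 * 18.0, ratio=3.5, cap_tb=220)  # hits the 220 TB cap
```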
After compression, the data is encrypted with AES 256-bit encryption. This is the same as the previous AE2 models, so nothing changes there.
The EiC compression and encryption do not impact performance. The new Microlatency cards achieve as low as 95 microsecond latency, about 10x faster than the traditional Solid-State Drives (SSD) found in competitive offerings from Dell EMC XtremIO and Pure Storage, and 40 percent faster than the new NVMe solid-state drives. A 2U drawer can deliver up to 1.2 million IOPS, slightly more than the AE2 models (1.1 million IOPS).
To learn more, see [IBM FlashSystem 900 accelerates applications] press release.
- New FlashSystem V9000 AE3 enclosure models
The new FlashSystem V9000 takes advantage of the new FlashSystem 900 AE3 models, effectively tripling the usable capacity.
The interesting part now is compression. The V9000 offers two hardware-accelerated methods: EiC, done on the Flash cards, and Real-time Compression (RtC), done by Intel QuickAssist chips in the controllers.
The EiC method works on 4KB blocks, so it only gets 2.5x to 3.5x on typical data. The RtC method works on larger 32KB blocks, and is therefore able to find more repeated sequences of characters, achieving up to a 5x ratio; it also keeps compressed data in the controller node cache for better cache hit ratios.
However, RtC is limited to 512 volumes, so admins would run the [Comprestimator tool] and select the cache-friendly workloads with the best compression, such as databases and CAD/CAM images.
With the new FlashSystem V9000, you get the benefits of both: continue to use RtC for data that is better served by 4x-5x compression, and let EiC compress everything else!
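The admin workflow just described can be sketched as a simple ranking exercise: given Comprestimator-style ratio estimates, assign the best-compressing volumes to RtC (up to the 512-volume limit) and let EiC handle the rest. This is a hypothetical helper of my own, not an IBM tool or API:

```python
def assign_compression(volume_ratios: dict, limit: int = 512) -> dict:
    """Map volume name -> 'RtC' or 'EiC'. The best-compressing volumes
    (by estimated ratio) get RtC until the volume limit is reached;
    everything else falls back to the always-on EiC on the Flash cards."""
    ranked = sorted(volume_ratios, key=volume_ratios.get, reverse=True)
    return {vol: ("RtC" if i < limit else "EiC")
            for i, vol in enumerate(ranked)}
```

For example, with estimated ratios of 5.0x for a database, 4.5x for CAD images, and 2.0x for logs, and only two RtC slots free, the database and CAD volumes get RtC and the logs fall to EiC.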
| FlashSystem V9000 model AE3 | Usable (1 drawer) TB | Usable (8 drawers) TB |
|---|---|---|
| Small (3.6 TB cards) | 36 | 288 |
| Medium (8.5 TB cards) | 85 | 680 |
| Large (18 TB cards) | 180 | 1,440 |
Running a typical 70/30 workload, representing 70 percent reads and 30 percent writes, each controller pair can deliver up to 600,000 IOPS. With four V9000 controller pairs clustered together, that is 2.4 Million IOPS. For more read-intensive, cache-friendlier workloads, IBM has clocked the system up to 1.3 million IOPS per controller node-pair, and 5.2 million for a four-pair cluster.
As with the previous model, the FlashSystem V9000 offers "Easy Tier" automatic sub-LUN tiering, and "storage virtualization" to manage both SAS-attached and SAN-attached storage. Over 400 different devices from major vendors are supported. This means that the busiest blocks will be moved up to low-latency Flash, and less active data will be moved to spinning disk.
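The sub-LUN tiering idea above can be illustrated in a few lines: rank extents by activity and keep only the hottest on flash. This is a toy model of my own making, not the actual Easy Tier algorithm, which weighs I/O statistics over time:

```python
import heapq

def tier_extents(extent_heat: dict, flash_slots: int) -> dict:
    """Toy model of sub-LUN tiering: place the busiest extents
    (highest heat counters) on flash, and the rest on spinning disk."""
    hot = set(heapq.nlargest(flash_slots, extent_heat, key=extent_heat.get))
    return {ext: ("flash" if ext in hot else "disk") for ext in extent_heat}

# Three extents, room for two on flash: the coldest one lands on disk.
placement = tier_extents({"e1": 900, "e2": 10, "e3": 500}, flash_slots=2)
```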
To learn more, see [IBM FlashSystem V9000 Control Enclosure Model AE3] press release.
- New FlashSystem A9000 and A9000R model 425
As with the FlashSystem V9000, the A9000/R models 425 use the new FlashSystem 900 AE3 drawers, increasing the effective capacity.
The A9000/R models will continue to do "Data Footprint Reduction" of pattern removal, data deduplication and RtC compression for data to achieve up to 5x compression ratio. However, to improve performance, internal metadata will not be compressed with RtC, allowing the underlying Flash cards to do EiC instead. This reduces CPU workload.
The FlashSystem A9000 model 425, aka "The Pod", has three grid controllers combined with the new FlashSystem 900 model AE3 for a compact 8U solution that can store nearly a Petabyte. For smaller deployments, IBM also offers a partially-filled 8-card drawer for a lower entry system size.
| A9000 Model 425 | Number of cards/drawer | Effective @5x TB |
|---|---|---|
| Partially-filled drawer | 8 | |
| Fully-loaded drawer (18 TB cards) | 12 | 900 |
The FlashSystem A9000R model 425, aka "The Rack", has two to four grid elements, each grid element has two grid controllers and one FlashSystem 900 AE3 drawer. The previous 415 model supported five and six grid elements, but for now, model 425 is limited to just two, three or four. The A9000R model 425 supports all three Microlatency sizes, whereas the previous 415 model only supported medium (2.9 TB) and large (5.7 TB) sizes.
| FlashSystem A9000R model 425 | Usable (2 elements) TB | Usable (3 elements) TB | Usable (4 elements) TB |
|---|---|---|---|
| Small (3.6 TB cards) | 72 | 108 | 144 |
| Medium (8.5 TB cards) | 170 | 255 | 340 |
| Large (18 TB cards) | 360 | 540 | 720 |
Performance of both the A9000 and A9000R is determined by the number of grid controllers. Each grid controller delivers about 300,000 IOPS. The A9000 pod with three controllers gets up to 900,000 IOPS. Each A9000R grid element has two controllers, so 600,000 IOPS per element, with 2.4 million IOPS for a maxed-out four-element A9000R rack.
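The scaling arithmetic above is simple enough to capture in a few lines, using the per-controller figure quoted in this post. A back-of-the-envelope sketch, not an official sizing tool:

```python
# Per-controller figure quoted in the post; real throughput varies by workload.
IOPS_PER_GRID_CONTROLLER = 300_000

def a9000r_iops(grid_elements: int) -> int:
    """Each A9000R grid element contains two grid controllers."""
    return grid_elements * 2 * IOPS_PER_GRID_CONTROLLER

a9000_pod = 3 * IOPS_PER_GRID_CONTROLLER  # 3-controller A9000 pod: 900,000 IOPS
full_rack = a9000r_iops(4)                # four-element A9000R rack: 2,400,000 IOPS
```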
Along with the hardware changes, IBM released version 12.2 of the Spectrum Accelerate software that runs in the FlashSystem A9000/R models.
This version supports Asynchronous mirroring between FlashSystem A9000/R systems and IBM XIV Gen3 storage. The replication can go in either direction, but the intent is to use FlashSystem for production, replicating to XIV Gen3 at a disaster recovery facility. Version 12.2 also increased the number of volumes, snapshots, and consistency groups supported.
- 24,000 volumes and snaps
- 1024 consistency groups, 512 volumes per consistency group
The new version applies to both the new model 425, as well as the previous 415 models!
To learn more, see [IBM FlashSystem A9000 leverages the latest 3D-TLC NAND flash] and [IBM FlashSystem A9000/R Software V12.2] press releases.
I realize this is a lot to absorb, feel free to read this blog post over and over until it sinks in!
technorati tags: IBM, #AllFlash, All-Flash Array, AFA, FlashSystem, FlashSystem 900, FlashSystem AE2, FlashSystem AE3, Microlatency, 3D-NAND, NAND Flash, 3D-TLC, SLC, MLC, QLC, Clod Barrera, bungalow, skyscraper, Embedded inline Compression, EiC, Always-On Compression, RAID-5, DRAM, FlashSystem V9000, Real-time Compression, RtC, Intel QuickAssist, Comprestimator Tool, Database, CAD/CAM, IOPS, Easy Tier, automatic tiering, FlashSystem 415, FlashSystem 425, FlashSystem A9000, FlashSystem Pod, FlashSystem A9000R, FlashSystem Rack, CPU, Spectrum Accelerate, XIV Gen3, Asynchronous Mirror