Continuing my drawn out coverage of IBM's big storage launch of February 9, today I'll cover the IBM System Storage TS7680 ProtecTIER data deduplication gateway for System z.
On the host side, TS7680 connects to mainframe systems running z/OS or z/VM over FICON attachment, emulating an automated tape library with 3592-J1A devices. The TS7680 includes two controllers that emulate the 3592 C06 model, with 4 FICON ports each. Each controller emulates up to 128 virtual 3592 tape drives, for a total of 256 virtual drives per TS7680 system. The mainframe sees up to 1 million virtual tape cartridges, up to 100GB raw capacity each, before compression. For z/OS, the automated library has full SMS Tape and Integrated Library Management capability that you would expect.
Inside, the two control units are both connected to a redundant pair cluster of ProtecTIER engines running the HyperFactor deduplication algorithm that is able to process the deduplication inline, as data is ingested, rather than post-process that other deduplication solutions use. These engines are similar to the TS7650 gateway machines for distributed systems.
On the back end, these ProtecTIER deduplication engines are then connected to external disk, up to 1PB. If you get 25x data deduplication ratio on your data, that would be 25PB of mainframe data stored on only 1PB of physical disk. The disk can be any disk supported by ProtecTIER over FCP protocol, not just the IBM System Storage DS8000, but also the IBM DS4000, DS5000 or IBM XIV storage system, various models of EMC and HDS, and of course the IBM SAN Volume Controller (SVC) with all of its supported disk systems.Here is the [Announcement Letter]
Comment (1) Visits (13966)
Five years ago, I sprayed coffee all over my screen from something I read on a blog post from fellow blogger Hu Yoshida from HDS. You can read what cased my reaction in my now infamous post [Hu Yoshida should know better]. Subsequently, over the years, I have disagreed with Hu on a variety of of topics, as documented in my 2010 blog post [Hu Yoshida Does It Again].
(Apparently, I am not alone, as the process of spraying one's coffee onto one's computer screen while reading other blog posts has been referred to as "Pulling a Tony" or "Doing a Tony" by other bloggers!)
Fortunately, my IBM colleague David Sacks doesn't drink coffee. Last month, David noticed that Hu had posted a graph in a recent blog entry titled [Additional Storage Performance Efficiencies for Mainframes], comparing the performance of HDS's Virtual Storage Platform (VSP) to IBM's DS8000.
For those not familiar with disk performance graphs, flatter is better, lower response time and larger IOPS are always desired. This graph implies that the HDS disk system is astonishingly faster than IBM's DS8000 series disk system. Certainly, the HDS VSP qualifies as a member of the elite [Super High-End club] with impressive SPC benchmark numbers, and is generally recognized as a device that works in IBM mainframe environments. But this new comparison graph is just ridiculous!
(Note: While SPC benchmarks are useful for making purchase decisions, different disk systems respond differently to different workloads. As the former lead architect of DFSMS for z/OS, I am often brought in to consult on mainframe performance issues in complex situations. Several times, we have fixed performance problems for our mainframe clients by replacing their HDS systems with IBM DS8000 series!)
Since Hu's blog entry contained very little information about the performance test used to generate the graph, David submitted a comment directly to Hu's blog asking a few simple questions to help IBM and Hu's readers determine whether the test was fair. Here is David's comment as submitted:
Unlike my blog on IBM, HDS bloggers like Hu are allowed to reject or deny comments before they appear on his blog post. We were disappointed that HDS never posted David's comment nor responded to it. That certainly raises questions about the quality of the comparison.
So, perhaps this is yet another case of [Hitachi Math], a phrase coined by fellow blogger Barry Burke from EMC back in 2007 in reference to outlandish HDS claims. My earliest mention was in my blog post [Not letting the Wookie Win].
By the way, since the test was about z/OS Extended Address Volumes (EAV), it is worth mentioning that IBM's DS8700 and DS8800 support 3390 volume capacities up to 1 TB each, while the HDS VSP is limited to only 223 GB per volume. Larger volume capacities help support ease-of-growth and help reduce the number of volumes storage administrators need to manage; that's just one example of how the DS8000 series continues to provide the best storage system support for z/OS environments.
Personally, I am all for running both IBM and HDS boxes side-by-side and publishing the methodology, the workload characteristics, the configuration details, and the results. Sunshine is always the best disinfectant!
Continuing my coverage of the 30th annual [Data Center Conference]. we had a Solution Showcase booth open Monday, Tuesday and part of Wednesday.
Here is the IBM System z114 mainframe with David Ayd in his white lab coat.
Dana Grove in the white lab coat shows off the "IBM Watson" simulator to Steve Sams.
Here is a side view, to see how thin the "IBM watson" simulator is.
Across the aisle was the ever-popular IBM Portable Modular Data Center (PMDC)
We were conveniently positioned between the wine and dessert areas. The Solution Showcase is a great opportunity to catch up with the latest technologies and vendors.
This week, IBM celebrates its Centennial, 100 years since its incorporation on June 16, 1911.
A few months ago, the Tucson Executive Briefing Center ordered its latest IBM System Storage [DS8800] to be on display for demos. This was manufactured in Vác, Hungary (about an hour north of Budapest), and was going to be shipped over to the United States.
However, Sam Palmisano, IBM Chairman and CEO, was in Hannover, Germany for the [CeBIT conference] and wanted this DS8800 to be re-directed to Germany first for this event. He was kind enough to sign it for us. Brian Truskowski, IBM General Manager for Storage, and Rod Adkins, IBM Senior Vice President for IBM Systems Technolgoy Group (and my fifth-line manager), also signed this as well!
I am pleased to say this "signed" DS8000 has arrived to Tucson. This is the latest model in a family of market-leading high-end enterprise-class disk systems designed to attach to all computers, including System z mainframes, POWER systems running AIX and IBM i, as well as servers running HP-UX, Solaris, Linux or Windows.
For more on IBM's other innovations over the past 100 years, check out the [Icons of Progress], which includes these storage innovations:
If you are planning a visit to Tucson, please ask for a tour to see this DS8800, a historic monument to disk innovation!
Continuing my saga for my [New Laptop], I have gotten all my programs operational, and now it is a good time to re-evaluate how I organize my data. You can read my previous posts on this series: [Day 1], [Day 2], [Day 3].
I started my career at IBM developing mainframe software. The naming convention was simple, you had 44 character dataset names (DSN), which can be divided into qualifiers separated by periods. Each qualifier could be up to 8 characters long. The first qualifier was called the "high level qualifier" (HLQ) and the last one was the "low level qualifier" (LLQ). Standard naming conventions helped with ownership and security (RACF), catalog management, policy-based management (DFSMS), and data format identification. For example:
In the first case, we see that the HLQ is "PROD" for production, the application is PAYROLL and this file holds job control language (JCL). The LLQ often identified the file type. The second can be a version for testing a newer version of this application. The third represents user data, in which case my userid PEARSON would have my own written TEST JCL. I have seen successful naming conventions with 3, 4, 5 and even 6 qualifiers. The full dataset name remains the same, even if it is moved from one disk to another, or migrated to tape.
(We had to help one client who had all their files with single qualifier names, no more than 8 characters long, all in the Master Catalog (root directory). They wanted to implement RACF and DFSMS, and needed help converting all of their file names and related JCL to a 4-qualifer naming convention. It took seven months to make this transformation, but the client was quite pleased with the end result.)
While the mainframe has a restrictive approach to naming files, the operating systems on personal computers provide practically unlimited choices. File systems like NTFS or EXT3 support filenames as long as 254 characters, and pathnames up to 32,000 characters. The problem is that when you move a file from one disk to another, or even from one directory structure to another, the pathname will change. If you rely on the pathname to provide critical information about the meaning or purpose of a file, that could get lost when moving the files around.
I found several websites that offered organization advice. On The Happiness Project blog, Gretchen Rubin [busts 11 myths] about organization. On Zenhabits blog, Leo Babauta offers [18 De-cluttering tips]. Peter Walsh's [Tip No. 185] suggests using nouns to describe each folder. Granted these are about physical objects in your home or office, but some of the concepts can apply to digital objects on your disk drive.
Other websites were specific to organizing digital files on your personal computer. On her Lifehacker blog, Gina Trapani shows her approach to [Organizing "My Documents"]. Chanel Wood offers her [How to organize your computer and still remember where you put everything], based on a simple alphabetic system. Microsoft offers [9 tips to organize files better]. Most of the advice was common sense, but this one, from Peter Walsh's [Tip No. 190], I found amusing:
"Use the computer’s sorting function. Put “AAA” (or a space) in front of the names of the most-used folders and “ZZZ” (or a bullet) in front of the least-used ones, so the former float to the top of an alphabetical list and the latter go to the bottom."
Personally, I hate spaces anywhere in directory and file names, and the thought of putting a space at the front of one to make it float to the top is even worse. Rather than resorting to naming folders with AAA or ZZZ, why not just limit the total number of files or directories so they are all visible on the screen. I often sort by date to access my most frequently-accessed or most
Of all the suggestions I found, Peter Walsh's "Use Nouns" seemed to be the most useful. Wikipedia has a fascinating article on [Biological Classification]. Certainly, if all living things can be put into classifications with only seven levels, we should not need more than seven levels of file system directory structure either! So, this is how I decided to organize my files on my new Thinkad T410:
I'll give this new re-organization a try. Since I have to take a fresh backup to Tivoli Storage Manager anyways, now is the best time to re-organize the directory structure and update my dsm.opt options file.
Continuing my coverage of last week's Data Center Conference 2009, I attended another "User Experience" that was very well received. This time, it was Henry Sienkiewicz of the Department Information Systems Agency (DISA) presenting a real-world example of the business model behind a private cloud implementation. DISA is the US government agency that develops and runs software for the Army, Navy and Air Force.
Being part of the military presents its own unique set of challenges:
Using Cloud Computing simplifies provisioning, encourages the use of standards, and provides self-service. DISA has several solutions.
In their traditional approach, a software project would take six months to procure the hardware, another 6-12 months code and test, and then another 6 months in certification, for a total of 18-24 months. With the new Cloud Computing approach that DISA adopted, procurement was down to 24-72 hours with RACE, code test took only 2-6 months with Forge.Mil, and certification could be done in days on RACE, resulting in a new total of only 3-6 months. Some challenges they found:
Some lessons learned from this two-year experience: