This blog is for the open exchange of ideas relating to IBM Systems, storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
Tony Pearson is a Master Inventor, Senior IT Architect and Event Content Manager for [IBM Systems for IBM Systems Technical University] events. With over 30 years with IBM Systems, Tony is frequent traveler, speaking to clients at events throughout the world.
Lloyd Dean is an IBM Senior Certified Executive IT Architect in Infrastructure Architecture. Lloyd has held numerous senior technical roles at IBM during his 19 plus years at IBM. Lloyd most recently has been leading efforts across the Communication/CSI Market as a senior Storage Solution Architect/CTS covering the Kansas City territory. In prior years Lloyd supported the industry accounts as a Storage Solution architect and prior to that as a Storage Software Solutions specialist during his time in the ATS organization.
Lloyd currently supports North America storage sales teams in his Storage Software Solution Architecture SME role in the Washington Systems Center team. His current focus is with IBM Cloud Private and he will be delivering and supporting sessions at Think2019, and Storage Technical University on the Value of IBM storage in this high value IBM solution a part of the IBM Cloud strategy. Lloyd maintains a Subject Matter Expert status across the IBM Spectrum Storage Software solutions. You can follow Lloyd on Twitter @ldean0558 and LinkedIn Lloyd Dean.
Tony Pearson's books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
The developerWorks Connections Platform is now in read-only mode and content is only available for viewing. No new wiki pages, posts, or messages may be added. Please see our FAQ for more information. The developerWorks Connections platform will officially shut down on March 31, 2020 and content will no longer be available. More details available on our FAQ. (Read in Japanese.)
Down the street, in Times Square, IBM made it on the big board.
Continuous Data Availability
Jeanine Cotter, IBM VP for Data Center Services, started out with a video about Sabre. IBM developed this revolutionary airline reservation system to handle the huge volume of transactions. Today, 18 percent of organizations consider downtime unacceptable for their tier-1 applications, and 53 percent would be seriously impacted by an outage lasting an hour or more.
Eventually, companies cross the "Continuous Availability" threshold, the point where they discover that the possibility of downtime is too costly to ignore. IBM has clients using 3-site Metro/Global Mirror that can fail-over an entire data center in just five mouse clicks.
Jeanine also mentioned Euronics, which is using SAN Volume Controller's Stretched Cluster capability, which allows them to easily vMotion virtual guest images from one data center to another. SVC has had this capability for a while, but now, with full VMcenter plug-in and VAAI support, the capability is fully integrated with VMware.
A final example was a mid-sized University, they are using IBM Storwize V7000 with Metro Mirror. The primary location's Storwize V7000 manages Solid-state drives with Easy Tier. The secondary location's Storwize V7000 has high-capacity SATA drives and FlashCopy.
Customer Testimonial - University of Rochester Medical Center
Rick Haverty, Director of IT infrastructure at University of Rochester Medical Center [URMC] provided the next client testimonial. The mission of the URMC is to use science, education and technology to improve health. URMC gets over $400 million USD in NIH grants, which puts them around 23rd largest University-based academic medical centers in the country. They have over 900 doctors, general practice and specialists.
URMC has an IBM BlueGene supercomputer, a Cisco network over 45,000 ports, and over 7.5 million square feet of Wi-Fi wireless internet coverage. They have three datacenters. The first is 7500 square feet, the second is 6000 square feet, and the third is just 800 square feet to hold their "off-site tapes".
URMC has digitized all of their records, including Electronic Medical Records (EMR) system, medical dosage history, imaging "priors", calibration of infusion pumps, RFID monitoring, and even provide IT support while the patient is on the operating table. RFID monitoring ensures all of the refrigerators are keeping medications at the right temperature. A single failed refrigerator can lose $20,000 dollars worth of medication.
When is a good time for downtime? At URMC, they handle 90,000 Emergency Room vists per year, so the answer is never. When is the ER busiest? Monday morning. (not what I expected!)
URMC's EMR software (Epic) runs on clustered POWER7 servers, with DS8700 disk systems using Metro Mirror to secondary location. They also keep a third "shadow" POWER7 for read-only purposes, and a separate system that provides web-based read-only access. Finally, they have 90 stand-alone Personal Computers (PCs) that contain information for all the patients that have reservations this week, just in case all the other systems fail.
The exploding volume of data comes from medical imaging. For radiology (X-rays), each image is called a "study" takes 20-30 MB each, and they have 650,000 studies per year. This represents about 16TB storage per year, with 3 second response time access. These must be kept for 7 years since last view, or until the patient reaches the age of 18 years old, which ever is later.
But radiology is just one discipline. Healthcare has a whole bunch of "ologies". Another is "Pathology" which looks at cells between glass slides in a microscope. Each study consumes 10-20GB, and URMC does about 100,000 pathology studies per year, representing 150TB per year.
URMC has identified that they have 42 mission-critical applications. The data for these are stored on DS8000, XIV, Storwize V7000 and DS5000, all managed behind SAN Volume Controller.
This is my final post on my coverage of the 30th annual [Data Center Conference]. IBM was a Platinum sponsor, and there were over 2,600 attendees, of which 27 percent were IT Directors or higher. Two thirds of the companies have 5000 employees or more. Here is a recap of the last few sessions I attended.
Best Practices for Data Center consolidation
As if the conference co-chairs aren't already super-busy, here they are presenting one of the breakout sessions. In the 1990s, consolidation was done purely to reduce total cost of ownership (TCO). Today, there are a variety of other reasons, including issues with power and cooling, service level agreements, and security.
Of these, 25 percent plan to have more data centers in three years, and 47 percent plan to consolidate to fewer. The benefits to consolidation include economies of scale, staff reduction, reduced hardware facilities costs, and application retirement. Challenges include dealing with politics, building new facilities to replace the old ones, and bandwidth. Here were some of the primary reasons why data center consolidation projects fail:
Human Resources (HR) issues
Resources not freed available
Lack of Project Management skills
No rationalization at consolidated site
Interactive Polling Results
The last keynote session was Thursday morning. The conference co-chairs present the highlights of the interactive polling that was done during the week at this conference.
The first topic was social media. There was a lot of Twitter activity with hashtag #GartnerDC that I followed throughout the week. Most of the tweets seem to be from people who were not actually at the conference.
Some 45 percent of the attendees have implemented social media initiatives at their companies. What tooling are they using to accomplish this? There are some provided by the major ITSM vendors, tools specific for corporate social media such as Yammer, collaboration tools like Microsoft SharePoint and IBM's Lotus Connections, and public sites like Facebook and Twitter. Here were the poll results:
The next topic was focused on Mobile devices and Cloud Computing. For example, do companies store data in public cloud, or plan to in the future, for mobile devices?
One third of the attendees allow employees to bring their own tablet to work with full IT support. Only 18 percent allow employees to bring their own PC or laptop. Over 40 percent felt that their IT department was not yet ready to support smartphones.
What are the main drivers to adopt private cloud? Some are deploying private clouds as a way to defend their IT jobs from going to the public cloud. Here were the poll results:
What problems are companies trying to solve with cloud computing? Here were the poll results:
A majority of attendees that use VMware are exploring LInux KVM, such as Red Hat Enterprise Virtualization (RHEV) or Microsfot Hyper-V. What storage protocol are attendees using for their server virtualization? Here were the poll results:
The next topic was the process for IT service management. The top three were ITIL, CMMI and DevOps, with the majority using ITIL or ITIL in combination with something else. These are needed for release management, change management, performance management, capacity management and incident management. How collaborative is the relationship between IT operations and application development? Here were the poll results:
How well does IT operations contribute to business innovation? This year 38 percent were satisfied, and 33 percent unsatisfied. This was a big improvement over last year, that found 19 percent satisfied, 64 percent unsatisfied.
Building a Private Storage Cloud: Is It a Science Experiment?
While everyone understands the benefits of private and public cloud computing, there seems to be hesitation about hosted cloud storage. Some people have already adopted some form of cloud storage, and other plan to within 12 months. Here were the poll results:
The top three reasons for considering public cloud storage was to adopt lower-cost storage tier, to benefit from off-site storage, and staff constraints. The top concerns were security and performance.
The IT department will need to start thinking like a cloud provider, and perhaps adopt a hybrid cloud approach. What IT equipment can be re-used? What will the new IT operations look like in a Cloud environment? What were the primary use cases for cloud storage? Here were the poll results:
In addition to the major cloud providers (IBM, Amazon, etc.) there are a variety of new cloud storage startups to address these business needs.
So that wraps up my coverage of this conference. In addition to attending great keynote and breakout sessions, I was able to have great one-on-one discussions with clients at the Solution Showcase booth, during breaks and at meals. IBM's focus on Big Data, Workload-optimized Systems, and Cloud seems to resonate well with the analysts and attendees. I want to give special thinks to Lynda, Dana, Peggy, Hugo, David, Rick, Cris, Richard, Denise, Chloe, and all my colleagues, friends and family from Arizona for their support!
Well, it's Tuesday, and you all know what that means... IBM announcements!
This week, IBM announced the IBM Tivoli Storage Productivity Center for Disk Midrange Edition, affectionately referred to as "MRE". This is basically TPC for Disk but with two key differences:
A special license that covers only DS3000, DS4000, DS5000 series, whether natively attached or virtualized behind SAN Volume Controller.
A new pricing model based on the number on controllers and drawers, rather than by TB managed. For example, if you have a DS5300 and two expansion drawers, then you pay for three units of MRE. As you upgrade from smaller capacity disks to larger capacity disks, your license costs won't increase. This eliminates the quarterly hassle to "true up" your software licenses to match actual capacity that is required on TB-based licensing.
Every January, we look back into the past as well as look into the future for trends to watch for the upcoming year. Ray Lucchesi of Silverton Consulting has a great post looking back at the [Top 10 storage technologies over the last decade]. I am glad to see that IBM has been involved with and instrumental in all ten technologies.
Looking into the future, Mark Cox of eChannel has an article [Storage Trends to Watch in 2011], based on his interviews with two fellow IBM executives: Steve Wojtowecz, VP of storage software development, and Clod Barrera, distinguished engineer and CTO for storage. Let's review the four key trends:
Cloud Storage and Cloud Computing
No question: Cloud Computing will be the battleground of the IT industry this decade. I am amused by the latest spate of Microsoft commercials where problems are solved with someone saying "...to the cloud". Riding on the coat tails of this is "Cloud Storage", the ability to store data across an Internet Protocol (IP) network, such as 10GbE Ethernet, in support of Cloud Computing applications. Cloud Storage protocols in the running include NFS, CIFS, iSCSI and FCoE.
Mark writes "..vendors who aren't investing in cloud storage solutions will fall behind the curve."
Economic Downturn forces Innovation
The old British adage applies: "Necessity is the mother of invention." The status quo won't do. In these difficult economic times, IT departments are running on constrained budgets and staff. This forces people to evaluate innovative technologies for storage efficiency like real-time compression and data deduplication to make better use of what they currently have. It also is forcing people to take a "good enough" attitude, instead of paying premium prices for best-of-breed they don't really need and can't really afford.
IT Service Management
Companies are getting away from managing individual pieces of IT kit, and are focusing instead on the delivery of information, from the magnetic surface of disk and tape media, to the eyes and ears of the end users. The deployment mix of private, hybrid and public clouds makes this even more important to measure and manage IT as a set of services that are delivered to the business. IT Service Management software can be the glue, helping companies implement ITIL v3 best practices and management disciplines.
Smarter Data Placement
A recent survey by "The Info Pro" analysts indicates that "managing storage growth" is considered more critical than "managing storage costs" or "managing storage complexity".
This tells me that companies are willing to spend a bit extra to deploy a tiered information infrastructure if it will help them manage storage growth, which typically ranges around 40 to 60 percent per year. While I have discussed the concept of "Information Lifecycle Management" (ILM), for the past four years on this blog, I am glad to see it has gone mainstream, helped in part with automated storage tiering features like IBM System Storage Easy Tier feature on the IBM DS8000, SAN Volume Controller and Storwize V7000 disk systems. Not all data is created equal, so the smart placement of data, based on the business value of the information contained, makes a lot of sense.
These trends are influencing what solutions the various different vendors will offer, and will influence what companies purchase and deploy.
I'm glad to be back home in Tucson for a few weeks. All of these conferences kept mefrom reading up with what was going on in the blogosphere.
A few of us at IBM found it odd that EMC would announce their new Geographically Dispersed Disaster Restart (GDDR) the weekBEFORE their "EMC World" conference. Why not announce all of the stuff all at once instead at the conference?Were they worried that the admission that "Maui" software is still many months awaythat much of a negative stigma? The decision probably went something like this:
EMCer #1: GDDR is finally ready, should we announce now, or wait ONE week to make it part of the thingswe announce at EMC World?
EMCer #2: We are not announcing much at EMC World and what people really want us to talk about, Maui, wearen't delivering for a while. Why can't people understand we are company of hardware engineers, not software programmers! So, better not be associated with that quagmire at all.
EMCer #1: Yes, boss, I see your point. We'll announce this week then.
My fellow blogger and intellectual sparring partner, Barry Burke, on his Storage Anarchist blog, posted [are you wasting money on your mainframe dr solution?"] to bringup the GDDR announcement. The key difference is that IBM GDPS works withIBM, EMC and HDS equipment, being the fair-and-balanced folks that IBM clientshave come to expect, but it appears EMC GDDR works only with EMC equipment.Because GDDR does less, it also costs less. I can accept that. You get whatyou pay for. Of course, IBM does have a variety of protection levels, one probably will meet your budget and your business continuity needs.
To correct Barry's misperception, companies that buy IBM mainframe servers do have a choice.They can purchase their operating system from IBM, get their Linux or OpenSolarisfrom someone else like Red Hat or Novell, or build their own OS distribution fromreadily available open source. And unlike other servers that might require at leastone OS partition from the vendor, IBM mainframes can run 100 percent Linux.GDPS supports a mix of OS data. z/OS and Linux data can all be managed by GDPS.Companies that own mainframes know this. I can forgive the misperception from Barry,as EMC is focused on distributed servers instead, and many in their company may not have muchexposure to mainframe technology, or have ever spoken to mainframe customers.
But what almost had me fall out of my chair was this little nugget from his post:
"If you're an IBM mainframe customer, you are - by definition - IBM's profit stream."
Honestly, is there anyone out there that does not realize that IBM is a for-profitcorporation? In contrast, Barry would like his readers to believe that EMC is selling GDDR at cost, andthat EMC is a non-profit organization. While IBM has been delivering actual solutions thatour clients want, EMC continues to rumor that someday they might get around to offering something worthwhile.In the last six months, the shareholders have interpreted both strategies for what they really are,and the stock prices reflect that:
(Disclosure: I own IBM stock. I do not own EMC stock. Stock price comparisonsby Yahoo were based on publicly reported information. The colors blue and red to represent IBM and EMC, respectively, were selected by Yahoo graph-making facility. The color red does not necessarily imply EMC is losing money or having financial troubles.)
Of course, I for one would love to help Barry's dream of EMC non-profitability come true. If anyone has any suggestions how we can help EMC approach this goal, please post a comment below.
While many are just becoming familiar with the end-user interfaces of Web 2.0, from blogs and wikis to FaceBook and FlickR, fewer may be familiar with the "information infrastructure" of servers and storagebehind the scenes.
Last year, I bought an XO laptop under the One Laptop Per Child [OLPC] foundation's Give-1-Get-1 program and posted my impressions on this blog. One in particular, my post[Printingon XO laptop with CUPS and LPR] showed how to print from the XO laptop over to a network-attached printer.This caught the attention of the OLPC development team, who asked me tohelp them with another project as a volunteer. Before accepting, I had to learn what skills they were really looking for, especially since I do notconsider myself an expert in neither printing nor networking.
(Unlike a regular 9-to-5 job where most people just try to look busy for eight hours a day, doingvolunteer work means being ready to ["roll up your sleeves"] and actuallyaccomplish something. This applies to any kind of volunteer work, from hammering nails for [Habitat for Humanity] to sorting cans at the [Community Food Bank].Best Buy uses the phrase "Results Oriented Work Environment" [ROWE] to describetheir latest program, modeled in part after the mobile workforce policies of Web2.0-enlightened companiesIBM and Sun, but that is perhaps a topic for another blog post!)
Apparently, to support a school full of students with XO laptops, it would be nice to have a few serversthat provide support to manage the class lesson plans, make reading materials and other content available,and keep track of results. What they need is an "information infrastructure"! They decided on two specific servers:
School Server -- this would run a popular class management system called [Moodle]
Library Server -- a server for a digital library collection, based on Fedora Commons[16-minute video]
In keeping with OLPC philosophy to use free and open source software[FOSS], both servers are based on the [LAMP] platform. LAMP is an acronym for thecombined software bundle of Linux, Apache, MySQL and a Programming language like PHP. The "XS" team working onthe school server wanted me to build a LAMP server and install Moodle to help test the configuration, determinewhat other software is required, and perhaps develop a backup/recovery scenario. Basically, they needed someone with Linux skills to put some hardware and software together.
(I am no stranger to Linux. Back in the 1990s, I was part of the Linux for S/390 team, led the effort to createthe infamous "compatible disk layout" (CDL) that allows z/OS to access ESCON and FICON-attached Linux volumes,took my LPI certification exam, and led a team to validate FCP drivers for our disk and tape storage systems. For an IBMer to volunteer foran Open Source community project, you have to take an "open source" class and get management approval to reviewfor any possible "conflicts of interest". I got this all taken care of, and accepted to help the XS team.)
Building a test environment is similar to baking a cake. You have a recipe, utensils, and ingredients. Here'sa bit of description of each of the ingredients:
Like Windows, the Linux operating system comes in different flavors to run on handhelds, desktops and servers. For servers, IBM tends to focus on Red Hat Enterprise Linux (RHEL) and SUSE Linux Eneterprise Server (SLES). However, the XS team decidedinstead to use [Fedora 7], a community-supported version from Red Hat. Earlier versions of Fedora were known as "Fedora Core", but apparently with version 7, the word "Core" has been dropped. Fedora 7 can be used in either desktop or server mode.
[Apache] is web server software, and half of all web servers on the internet use it. It competes head-on against Micorosofts Internet Information Services (IIS) serverprovided in Windows 2003. The Apache name is partly from thefact that its origins were "a patchy" variant of the NCSA HTTPd 1.3 codebase. Thepopular [IBM HTTP Server] is poweredby Apache, with added support to the rest of the IBM WebSphere software portfolio. The XS team chose Apache v2as the web server platform.
[MySQL] is a relational database management system (RDBMS) software, similar to commercial products like IBM DB2 Universal Database, Oracle DB, or Microsoft SQL Server. The SQL stands for Structured Query Language, developed by IBM in the early 1970s as a standard languageto update and query database tables. MySQL comes in two flavors, MySQL Enterprise for commercial use, and MySQLCommunity, which is community-supported. There are over 10 million instances of MySQL running websites on the internet, which helps explain why Sun Microsystems agreed to acquire MySQL AB company last month.The XS team decided on MySQL 5.0 as the database platform.
To make HTML pages dynamic, including the possibility to add or query database contents, requires programming.A variety of web scripting languages were developed, all starting with the letter "P" to claim to be the programming part of the LAMP platform, including [PHP], Perl, and Python. Later, new programming language frameworks have been developed that do not start with the letter "P", like [Ruby on Rails]. PHP is short for PHP: Hypertext Preprocessor which explains that it pre-processes HTML during web serving,looking for special tags indicating PHP code, allowing programming logic to insert HTML content, such as information extracted from a database.While Python is the language that runs the Sugar interface on the XO laptops, the XS team decided onPHP v5 as the programming language for the server.
As for utensils, you only need a few utilities
A simple text editor: I go old-school and use the classic "vi" (to learn this editor, see the["Cheat Sheet" method] on IBM Developerworks)
secure socket shell (SSH): this allows you to access one server from another
browser access to the internet: when you encounter problems, get error messages, or whatever, it pays to know how to search for things with Google
As for a recipe, the Moodle website spells out some unique details and parameters. For the base LAMP platform,I chose to follow the book [Fedora 7 Unleashed] that has specific chapters on setting up SSH, Apache, MySQL, PHP, Squid and so on. The resultingconfiguration looks like this:
Here were the sequence of events:
I took an old PC that I wasn't using anymore, backed up the Windows system, and installed Linux on top. Thebook above had a Fedora 7 DVD on the back jacket, but I used the [OLPC LiveCD] that had some values pre-configured.
Set the IP address static. I set mine to 192.168.0.77 which nobody sees except my other systems.
My school server is "headless" which means it does not have its own keyboard, video or mouse. It also runs only to Linux run level 3, command line interface only, no graphics.I was able toshare using a KVM switch], but this meant having to remember something on one screen while I was switching over to the other. My Windows XP system has mybrowser connection to the internet to follow instructions or read error messages, so I need that up all thetime. To get around this, on my Windows XP system,I generated SSH public and private keys, copied the public key over to my new Linux system, and used [OpenSSH for Windows] to connect over. Now, on one screen,I have my Windows XP Firefox browser, and a separate command line window that is accessing my Linux schoolserver.
With SSH up and running, I can now use "vi" to edit files, and issue commands to install or activatethe remaining software. First up, Apache. I got this working, and from Windows XP, verified that going to"http://192.168.0.77" showed the Apache test screen.
I installed PHP, and tested it with a simple short index.php file.
I installed MySQL, setup the base "installation databases", and created a test database. Here is whereyou might want to set a password for the MySQL root user, but I chose to do that later for now.
I installed Moodle. It was smart enough to check that Apache, PHP, and MySQL were operational, andapparently I missed a few special "PHP" modules that had to be linked in. I was able to find them, downloadthem, and get them installed.
I brought up Moodle, created a "class category" of SCIENCE and a new class "Chemistry 101", and it allworked.
I also activated Squid, which is a web proxy cache server that stores web pages for faster access.
Another idea was to activate Samba, to provide CIFS file and print sharing, but I decided to put this off.
I got all of this done last Saturday, start to finish. Now the fun begins. We are going to run throughsome tests, document the procedures, and try to get a system up and running in a remote school in Nepal. Fornow, I have only one XO laptop to simulate what the student sees, and one laptop that can represent eithera teacher's Windows-based laptop, or run QEMU and emulate a second XO laptop.For tuning, I might go through the procedures mentioned on IBM Developerworks "Tuning LAMP"[Part 1, Part 2,Part 3].
For those in the server or storage industry that need to understand Web 2.0 information infrastructure better,building a LAMP server like this can be quite helpful.
It takes me 20-30 minutes to complete a crossword or Sudoku puzzle. I am in no hurry, and I find the process relaxing. But what if you were paid to complete a puzzle? In that case, finishing the puzzle sooner, in fewer minutes, means more money in your paycheck per hour worked! However, getting paid would mean that doing these puzzles may no longer be fun or relaxing.
The idea of converting a hobby into a revenue-generating activity is not new. Who wouldn't want to earn money doing something you were planning to do already? The television is full of commercial advertisements for credit cards where you can earn Double Miles or Cash Rewards just for spending money on things you were going to spend on anyways.
But is "earn" the right word? The merchants pay a percentage fee every time a patron uses a credit card, and the bank is just providing a marketing incentive in the form of a portion of those fees back to the consumer, to encourage more usage of their card versus other forms of payment. Sort of like "profit sharing".
(FTC Disclosure: I am a full-time employee and shareholder of the IBM Corporation. This blog post should not be considered an endorsement for anything. My opinions and writings are based on publicly available information and my own experiences doing freelance work prior to my employment at IBM. I have no hands-on experience with Amazon Mechanical Turk, neither as a worker nor requester, have not participated in TopCoder contests, nor have I used the Viggle app. I do not have any financial interest in Amazon, TopCoder, Viggle or any other third-party company mentioned on this blog post, nor has anyone paid me to mention their company names, brands or offerings.)
Here's how it works. You get the app on your phone, and register each television show as you watch it. You can watch the show live, or much later recorded on your Tivo. You watch the shows you were going to watch anyways, and just provide your demographics, all in the name of market research. You get two points per minute of watching, and after 7,500 points, you get a $5 gift card from retailers such as from retailers such as Burger King, Starbucks, Best Buy, Sephora, Fandango, and CVS drugstores. For the typical American, it would take about three weeks to watch that much television!
Of course, this is not the only way to earn money working from home. A reader asked me for my opinions of [Amazon Mechanical Turk]. While the other examples above are done for marketing purposes, Mechanical Turk can be used for a variety of other things. Up to now, the IT industry has regarded the Cloud as the delivery of computing as a service, with the infrastructure, hardware and software existing on internationally networked servers, effectively invisible to the end user. This model is now to being applied broadly to people.
Basically, Mechanical Turk acts as a marketplace, where employers post Human Intelligent Tasks (HITs) that workers can do. Most can be completed in minutes and you are paid pennies to do so. Some examples might help illustrate what a HIT looks like:
Call a business and get the email address of the manager in charge.
Review a photograph and describe its style or content in three words or less
Select among multiple choices to categorize a job listing or company position
As a Mechanical Turk worker, you only work on the HITs you choose to work on, presumably those that interest you, and that you can do well and quickly. Workers can do this anytime, anywhere, such as 2:00am in the morning, at home, when you can't sleep or taking care of children. You can choose to work as much or as little as you like.
The employers--referred to as Mechanical Turk requesters--put money into their payroll accounts, load up their tasks, and hit publish. This gives them immediate access to a global, on-demand 24-by-7 workforce that can help complete thousands of HITs in minutes. These employers won't have to put an advertisement in the want ads and interview potential candidates, just to let them go later when the project is over.
Just like any other job, Mechanical Turk wages are reported to the IRS, and each person's work is evaluated for quality. In doing these tasks, you build up your "digital reputation" that will either prevent you or allow you to work on certain HITs. You can also take tests to reach Qualification levels to be eligible to work on HITs not available to everyone else.
Software engineers would have a hard time writing an Artificial Intelligence [AI] program to do these simple tasks, so being able to generate a HIT for something in the middle of a computer program might be the easiest way to get past a difficult part of an algorithm. Amusingly, Amazon describes this form of [crowdsourcing] as an artificial form of Artificial Intelligence!
While this approach may work for small, easily defined tasks, what about works that require a high amount of Human Intelligence, like storage software or hardware development?
When I was working for IBM as a software engineer in the 1980s and 1990s, it took us years to get a project done, using the traditional [Waterfall Model]. My job as a software architect was to estimate the thousands of lines of code (KLOC) a project would require, estimate the number of Person-Years (PY) it would take, and recommend the appropriate sized team. Back then, each engineer averaged only about 1,000 lines of software code per year, so KLOC and PY were often used interchangeably. Fellow IBM author Fred Brooks wrote an excellent book on the process called [The Mythical Man-Month].
The Waterfall model has the advantage that people only have to work a portion of the cycle on the project. In between, there was plenty of downtime to attend training, improve your skills, or take vacation. As our director Lynn Yates would often complain, "if they are only writing two lines of code in the morning, and two in the afternoon, why do they need time to rest?"
The Waterfall model was not perfect, and had its share of critics. One downside was that the clients didn't see anything until General Availability (GA), with a few getting a glimpse a few months earlier during our Early Support Program (ESP). By the time clients could tell us it was not what they wanted or expected, it was too late to change until the next release.
To address this concern, 17 software engineers wrote the now famous [Agile Manifesto]. The authors felt that collaboration, between the developers and with the clients, is critical to success. Business people and developers must work together daily throughout the project. The most efficient and effective method of conveying information to and within a development team is face-to-face conversation. The best architectures, requirements, and designs emerge from self-organizing teams. The result is an iterative approach that allows the client to see working prototypes early in the process, allowing last-minute changes to requirements to influence the final product.
Combining the Mechanical Turk concept with Agile programming methodology gives you what IBM calls an "Outcomes Model" approach. In the IBM research paper [Software Economies] (PDF, 5 pages), the authors argue that there are four fundamental principles needed for an "Outcomes Model" approach:
Autonomy. All of the actions necessary to bring jobs to completion should be driven by market forces; the process is
never gated by an entity outside of the market.
Inclusiveness. Everyone who provides information or performs work that leads to improvements should share in the
Transparency. The system should be transparent with respect to both the flow of money in the market and the tasks
performed by workers in the market.
Reliability. The system should be immune to manipulation, robust against attack (e.g., via insertion of untrusted code),
and prevent "shallow" work which would have to be re-done later.
I was surprised to see that [the TopCoder Community is 390,593 strong], nearly the size of the entire IBM company. TopCoder is focused on computer programming and digital creation using the Outcomes Model approach. Rather than paying everyone for their work, however, the platform is designed around challenges and competitions, and the top players or contributors are rewarded with cash prizes.
As an innovative company, IBM constantly explores a variety of means and approaches to offer value to its clients and customers. These new approaches may have some distinct advantages not just for IBM and its shareholders, but also for its clients and the freelancers hired to work on these projects. The global marketplace is getting flatter, smaller and smarter. It will be interesting how this plays out. If the discussion above encourages you to hone your technical skills, perhaps that is motivation enough to get off the couch and stop watching so much television!
IBM makes another breakthrough today with an announcement about tape data density. Unlike hard disk drive technologies that are hitting physical limits, IBM is proving that tape technology still has plenty of life in its future.
When I first started working for IBM in Tucson, back in 1986, a 3420 tape reel held only 180MB of data, and a 3480 tape cartridge improved this to 200MB of data. Today's enterprise tapes, like 3592 cartridges for the TS1130 drives, or LTO4 cartridges for the IBM TS1040 drives, are half-inch wide, half-mile long, and can store 1 TB or more of data per cartridge, depending on how well the data can compress. To increase cartridge capacity, designers can make changes in three dimensions:
Wider tape: The film industry tried this, going from 35mm to 70mm film, only to find that most cinemas did not want to upgrade their equipment. Keeping the media dimensions to half inch wide allows much of the engineering hardware to continue unchanged.
Longer tape: The problem with longer tape is that either the reel inside gets fatter, or you need to develop flatter media to fit within the existing cartridge dimensions. Wider reels means a bigger tape cartridge external dimensions, forcing changes to shelving units, cartridge trays, and carrying units. The media just can't get any flatter without risking getting more brittle.
Denser bit recording: once a convenient width and length were established, improving bit density turned out to be the best way to increase cartridge capacity.
Working with FujiFilm Corporation of Japan, my colleagues at IBM Research facility in Zurich were able to demonstrate an incredible 29.5 Gigabits per square inch, nearly 40 times more dense than today's commercial tape technology. In the near future, we will be able to hold a 35TB tape cartridge in our hand. There was actually a lot to make this happen, improved giant magentoresistive read/write heads, better servo patterns to stay on track, thinner tracks less than a micron thick, and better signal-to-noise processing to accomplish this. To learn more, you can read the [Press Release] or watch this quick [4-minute YouTube video].
This week I am in Orlando, Florida for the IBM Edge conference. Thursday evening after all the other sessions, we had a Free-for-All, a Q&A panel across all storage topics, moderated by Scott Drummond. The conference officially ends at noon tomorrow, but for many, this is the last session, as people fly out Friday morning. Here are the questions and the panel responses during the session.
When will IBM unify their storage management between Mainframe z/OS and the distributed systems platforms?
IBM offers a Change and Configuration Management Data Base (CCMDB) for this purpose with appropriate collectors from z/OS and distributed systems, but hasn't sold well.
When will IBM devices have RESTful interfaces?
Both IBM Systems Director and IBM Tivoli Storage Productivity Center (TPC) offer RESTful APIs. IBM Systems Director can manage z/VM and Linux on System z, as well as Power Systems and x86 based distributed systems. Since October 2008, IBM's Project Zero introduced RESTful interfaces to PHP and Groovy software running on WebSphere sMash environments. We have not heard much about this since 2008.
Will IBM TPC support NPIV on Power Systems?
TPC 5.1 has toleration support for this, showing the first port connection discovered, but not all connections, and we expect to retrofit this toleration to TPC 4.2.2 Fixpack 2. Hopefully, we will have full support in a future release.
We would like TPC for Replication to run on Linux for System z. We do not run z/OS at the disaster recovery site location.
Submit an IBM Request for Enhancement [RFE] for this. We have TPC for Replication on z/OS, as well as the distributed systems version that runs on Windows, Linux and AIX.
We have enhancements we would like to see for XIV and SONAS also, can we use the RFE process for this also?
Yes, submit the requirements for our review.
We heard the Statement of Direction that there would be storage integrated into the PureSystems. What exactly does that mean?
The PureSystems family of expert-integrated systems is based on a new chassis that has a front part, a midplane, and a back-part. All IBM System Storage products that support x86 and Power Systems can work with PureSystems. However, IBM does not yet offer storage that fits in the front part of the PureFlex chassis, but the Statement of Direction indicates that we intend to offer that option. Until then, the IBM Storwize V7000 is the storage of choice that can be put into the PureSystems rack, but outside the individual chasses.
We see some features like Real-Time Compression being put into the SAN Volume Controller (SVC), and other features put into the back-end devices. How are we supposed to make sense of this?
IBM's new pilot program, the SmartCloud Virtual Storage Center, to bring these all together. In general, we have design teams of system architects that determine which features go in which products, and prioritize accordingly.
We heard the IBM Executives during the opening session indicate that IBM's strategy involves supporting Big Data, but I haven't seen any storage that supports native Hadoop interfaces. Did I miss something?
First, I want to emphasize that Big Data is more than just MapReduce workloads. IBM offers Streams and BigInsights software to handle text, as well as Business Intelligence and Data Warehouse solutions for structured data. IBM's General Parallel File System (GPFS) has a Shared-Nothing-Cluster (SNC) mode with Hadoop interfaces that runs twice as fast as Hadoop's native HDFS file system. The storage products we recommend for Big Data are the SONAS and the DCS3700 disk systems, as both are optimized for the sequential workloads Big Data represents.
Everytime we upgrade our SVC, we review the list for SDDPCM multi-pathing and see that we need to upgrade our back-end DS8000 microcode up to recommended levels. Can we get a list of combinations that work from other customers?
The advantage of storage hypervisors like SVC is that we can separate the multi-pathing driver from the back-end managed disk systems. You only need the SDDPCM to support the SVC, not the back-end devices. For the most part, SVC has not dropped support for any level of previously supported OS or multi-pathing software.
On SVC, when we migrate volumes (vDisks) from one storage pool to another, we would like to throttle this process during FlashCopy.
Yes, we had several requests like this, which is why we now recommend using Volume Mirorring to perform migrations. In fact the GUI wizard uses Volume Mirroring by default when migrations are performed. As for throttling, IBM has implemented "I/O Priority Manager" that offers Quality of Service classes for DS8000 and XIV Gen3, and might consider porting this to other products in our portfolio.
Sizing systems is an art. I just need to know if the DS8000 is running hot. Can we have the equivalent of "red lines" for our disk systems similar to automobile engines?
Storage Optimizer was added to TPC 4.2 to help in this area, identifying heat-maps for IBM DS8000, DS6000, DS5000, DS4000, SVC and Storwize V7000. We recommend you look at the performance violation reports.
How can we evaluate the characteristics of our workloads?
Yes, TPC can do this.
When we are replacing non-IBM storage with IBM, we don't have good tools to evaluate the non-IBM equipment. What is IBM doing for this?
IBM's Disk Magic modeling tool can take inputs from a variety of sources, including iostat from the servers themselves. You can also install a 90-day trial of TPC to help with this.
We really like EMC's "Grab" program, does IBM have one also?
Updating the Host Attachment Kit (HAK) for AIX is quite painful for the SVC. We prefer the method employed for the XIV.
Thanks for the feedback.
For SVC, we need to correlate disk with VMware and VIOS. Can we get vSCSI information on VIOS?
TPC 5.1 has this support, and we believe it has been retrofitted to TPC 4.2.2 Fixpack 2, coming out this month.
Currently, with SVC, when volumes are part of a Global Mirror (GM) session, we need to cancel GM, expand the source volume, expand the target volume, then restart GM. We would like this to be fully automated and non-disruptive.
Sounds like a great requirement to submit for the RFE process.
Can we get an RSS Feed for the RFE community.
Yes, you can subscribe to it. You can also set up "Watch Lists".
Thanks to all of the IBM experts on the panel for their participation at this event!
I got some interesting queries about IBM's Scale-Out File Services [SoFS] that I mentioned in my post yesterday [Area rugs versus Wall-to-Wall carpeting]. I thought I would provide some additional details of the product.
SoFS combines three key features: a global namespace, a clustered file system, and Information LifecycleManagement (ILM). Let's tackle each one.
Global Name Space
A long time ago, IBM acquired a company called Transarc that developed Andrew File System (AFS) and DistributedFile System (DFS). These both provided global namespace capability, meaning that all of your files could beaccessible from a single URL file tree. Imagine if you have data centers in Tucson, Austin, Raleigh and Chicago.Normally, to access files from each city, you would have to mount a unique IP address for that location, and thento get to files in a different city, you'd have to mount a second, and so on. But with a global namespace, you could mount a single drive letter Z: and access files simply by using Z:/Tucson/abc or Z:/Austin/xyz. IBM uses its DFS to make this happen.
Just because you have access to a global namespace doesn't give you read/write authority to every file. IBM SoFS has full NTFS Access Control List (ACL) support, so that only those who can read or write data can access the files. A "hide unreadable" feature provideswhat I like to call "parental controls": you don't even get to see on your directly list any file or subdirectory that you don't have access to. For example, if there is a directory with 50 projects, but you only have authority tothree projects, then you only see the three subdirectories related to those projects, and nothing else.
There are other ways to get a global namespace. IBM also offers the IBM System Storage N series Virtual FileManager, Brocade offers Storage/X, and F5 acquired Acopia. These all work by putting a box in front of a set ofindependent NAS storage units, and giving you a single mount point to represent all of the file systems managedbehind the scenes. This however can sometimes be a bottleneck for performance.
Clustered File System
Often, when you have a lot of data in one place, you are also expected to deliver that data to lots of clientswith relatively good performance. Otherwise, end users revolt and get their own internal direct attach storage.To solve this, you need a clustered architecture that provides access in parallel to the data.
First, we start with a node that is optimized for CIFS and NFS access. We have clocked our node to run CIFS at577 MB/sec, and NFS at 880 MB/sec, through a 10GbE pipe between a single client and a single SoFS node. Comparethat to the 400 MB/sec you get today with 4Gbps FCP, or the 800 MB/sec you will get if you upgrade to 8 GbpsFCP, and quickly you recognize that this is comparable performance for demanding workloads.
Then, you combine multiple nodes together, and have them all be able to read/write any file in the file system, andfront-end that with a load-balancing Virtual IP address (VIPA) that spreads the requests around, and you've gotyourself a lean and mean machine for accessing data.
In 2005, IBM delivered[ASC Purple] with the world's fastest file system. 1536 nodeswere able to access billions of files in the 2 Petabyte of data. The record of 126 GB/sec access to a single filewas set, and has yet to be beaten by any other vendor since.This same file system is used in SoFS, as well as a variety of other IBM storage offerings.
The back-end storage can be SAS or FC-attached, from the DS3200 to our mighty DS8300 Turbo, as well as ourIBM System Storage DCS9550 and SAN Volume Controller (SVC), and a variety of tape libraries.
Information Lifecycle Management
Lastly, we get to ILM. With SoFS, you can have different tiers of storage, high-speed SAS or FC disk, low-speedFATA and SATA disk, and even tape. Policy-based automation allows you to place any file onto any disk tier whencreated, and other policies can migrate or delete the data trigged by certain threshold, age, or other criteria.The advantage is that this is on a file by file basis, so Z:/Tucson/Project could have a bunch of files, some ofthem on my FC disk, some of them on my SATA, and some on tape. The file path doesn't change when they move, anddifferent files in the same directory can be on different tiers.
Data movement is bi-directional. If you know you will be using a set of files for an upcoming job, say perhapsquarter-end or year-end processing, you can pre-fetch those files from tape and move them to your fastest disk pool.
There is also integrated backup support. Typically, a large NAS environment is difficult to backup. Traditionalmethods take days to scan the directory tree looking for files in need of backup. A single SoFS node can scana billion files in 95 minutes, and 8 nodes in a cluster can scan a billion files in under 15 minutes.
Recovery is even more impressive. When you recover, SoFS brings back the entire directory structure first, withall the file names in place. This would make it appear that all the data is restored, but actually it is still on tape.When you access individual files, it will then drive the recovery of that file, so your applications and end usersbasically determine the priority of the recovery. Traditional methods would wait until every file was restoredbefore letting anyone access the system.
SoFS is part of IBM's [Blue Cloud] initiativethat was launched last November 2007. Of course, IBM isn't the only one competing in this space. HDS has partneredwith BlueArc, HP has acquired PolyServe, and Sun acquired CFS for their Lustre file system. Isilon and Exanet arestart-up companies with some offerings. EMC acquired Rainfinity,and have hinted at a Hulk/Maui project that they might deliver later this year or perhaps in 2009, but by thenmight be a dollar-short and a day-late.
But why wait? IBM SoFS is available today and is orders of magnitude more scalable!
Clod Barrera is an IBM Distinguished Engineer and Chief Technical Strategist for IBM System Storage. He predicts that by 2015, 10 percent of the servers and storage purchases, as well as 25 percent of the network gear purchases, will be related to Cloud deployments. Cloud Storage is expected to grow at a compound annual growth rate (CAGR) of 32 percent through 2015, compared to only 3.8 percent growth for non-Cloud storage.
Cloud Computing is allowing companies to rethink their IT infrastructure, and reinvent their business. Clod presented an interesting chart on the "Taxonomy" of storage in Cloud environments. On the left he had examples of Storage that was part of a Cloud Compute application. On the right he had storage that was accessed directly through protocols or APIs. Under each he had several examples for transactional data, stream data, backups and archives.
Clod feels the only difference between Private and Public clouds is a matter of ownership. In private clouds, these are owned by the company that uses them via their private Intranet network. Public clouds are owned by Cloud Service providers and are accessed over the public Internet. Clod presented IBM's strategy to deliver Cloud at five levels:
Private Cloud: on-site equipment, behind company firewall, managed by IT staff
Managed Private Cloud: on-site equipment, behind company firewall, managed by IBM or other Cloud Service provider
Hosted Private Cloud: dedicated, off-premises equipment, located and managed by IBM or other Cloud Service Provider, and access through VPN
Shared Cloud Services: shared, off-premises equipment, located at IBM or other Cloud Service Provider, managed by IBM or Cloud Service provider, and access through VPN. The facility is intended for enterprises only, on a contractual basis, and will be auditable for compliance to government regulations, etc.
Public Cloud: shared, off-premises equipment, located and managed by IBM or other Cloud Service provider, targeted to offer cloud compute and storage resources, with standardized platforms of operating systems and middleware, for individuals, small and medium sized businesses.
As with storage in traditional data center deployments, storage in clouds will be tiered, with Tier 0 being the fastest tier, to Tier 4 for "deep and cheap" archive storage. IBM SONAS is an example of Cloud-ready storage that can help make these tiers accessible through standard Ethernet protocols. Cloud Service providers will use metering and Service Level Agreements (SLAs) to offer different rates for different tiers of storage in the cloud.
Clod wrapped up his session explaining IBM's Cloud Computing Reference Architecture (CCRA). This is an all-encompassing diagram that shows how all of IBM's hardware, software and services fit into Cloud deployments.
This post concludes my series of posts on Oracle OpenWorld 2011 conference. Here are some pictures from Wednesday and Thursday.
IBM as the yardstick by which everyone measures against
Our friends at Violin Memory systems mentioned our joint-venture success results with IBM GPFS, scanning 10 billion files in less than an hour. (Their booth must have been slow, because members of their team spent a lot of time in our IBM booth!)
In fact, it seemed every company compared themselves to IBM in one fashion or another. Larry said that "IBM is a great company" and mentioned the IBM systems several times in comparisons to Oracle's newly announced hardware offerings.
Larry's Sailing Vessel
When things slowed down, I took a walk to see the other parts of the exhibition area. In the Moscone West building was Larry's catamaran that won [last year's America's Cup].
I used to sail myself, and have been part of crews in sailing races in both Japan and Australia. A few years ago, I watched the America's Cup time trials in New Zealand.
On the Streets of San Francisco
On the streets, IBM had advertised some of its products in a manner that thousands of attendees would see every day. Here we have some factoids related to IBM Netezza and DB2 database on POWER servers. We were very careful not to mention either product in the IBM Booth itself, as we all understand that IBM is a guest in Oracle's house this week. We certainly don't want to do anything to upset Larry in any way to make him treat IBM like he treated HP last year, or Salesforce.com this year.
Rest in Peace, Steve Jobs, 1955-2011
On Wednesday evening at Oracle OpenWorld, we were tearing down the booth when we heard that Apple co-founder Steve Jobs had passed away. This is truly a loss for the entire IT industry. I never met Steve in person, nor have I been to any Apple conferences like MacWorld that he spoke at.
At various keynote sessions, Larry Ellison compared his Oracle products to those of Apple, Inc., suggesting that Oracle is the "Apple for the Enterprise".
On our way back to the Hilton hotel on O'Farrell, there was a candle vigil at the Apple Store near Union Square. People left sticky notes on the glass window.
There were a lot of tributes to Steve Jobs, but I liked this 15-minute video of his 2005 Commencement Speech at Stanford University titled [How to Live before you Die.
This will be one of those moments where years later, many people will remember exactly where they were, and what they were doing, when they heard the news. For many, that news came as tweets or text messages on the very iPhones and iPads he helped design.
Rock Concert - Wednesday night
On Wednesday evening, I joined thousands of other attendees on Treasure Island to hear and watch Sting, Tom Petty and the Heartbreakers, and the English Beat in concert. It was cold and dark, but we all had a good time. Needless to say, I didn't make it to Marc Benioff's 8:00am Thursday morning session!
A word of advice: If you go to an evening rock concert at Treasure Island, dress warmly!
Despite the sad news about Steve Jobs, I had a great time at this conference. I learned a lot about what other IT vendors are doing, talked to dozens of IBM clients at the booth, and got to make some new friends that work in other parts of IBM.
(FTC Disclosure: I work for IBM. IBM and Apple are technology partners. I proudly own an Apple iPod, several Mac Mini computers and shares of stock in both IBM and Apple, Inc.)
In his Backup Blog, fellow blogger Scott Waterhouse from EMC has yet another post about Tivoli Storage Manager (TSM) titled [TSM and the Elephant]. He argues that only the cost of new TSM servers should be considered in any comparison, on the assumption that if you have to deploy another server, you have to attach to it fresh new disk storage, a brand new tape library, and hire an independent group of backup administrators to manage. Of course, that is bull, people use much of existing infrastructure and existing skilled labor pool every time new servers are added, as I tried to point out in my post [TSM Economies of Scale].
However, Scott does suggest that we should look at all the costs, not just the cost of a new server, which we in the industry call Total Cost of Ownership (TCO). Here is an excerpt:
Final point: there is actually a really important secondary point here--what is the TCO of your backup infrastructure. In some ways, TSM is one of the most expensive (number of servers and tape drives, for example), relative to other backup applications. However, I think it would be a really interesting exercise to critically examine the TCO of the various backup applications at different scales to evaluate if there is any genuine cost differentiation between them.
Fortunately, I have a recent TCO/ROI analysis for a large customer in the Eastern United States that compares their existing EMC Legato deployment to a new proposed TSM deployment. The assessment was performed by our IBM Tivoli ROI Analyst team, using a tool developed by Alinean. The process compares the TCO of the currently deployed solution (in this case EMC Legato) with the TCO of the proposed replacement solution (in this case IBM TSM) for 55,000 client nodes at expected growth rates over a three year period, and determines the amount of investment, cost savings and other benefits, and return on investment (ROI).
Here are the results:
"A risk adjusted analysis of the proposed solution's impact was conducted and it was projected that implementing the proposed solutions resulted in $16,174,919 of 3 year cumulative benefits. Of these projected benefits, $8,015,692 are direct benefits and $8,159,227 are indirect benefits.
Top cumulative benefits for the project include:
Backup Coverage Risk Avoidance - $6,749,796
Reduction in Maintenance of Competitive Products - $1,576,000
Reduction in Existing Tivoli Maintenance (Storage and Monitoring) - $1,490,000
IT Operations Labor Savings - Storage Management - $982,919
Network Bandwidth Savings - $575,196
Standardization - $366,667
Future cost avoidance of addtional competitive licenses - $350,000
These benefits can be grouped regarding business impact as:
$6,456,025 in IT cost reductions
$1,559,667 in business operating efficiency improvements
$8,159,227 in business strategic advantage benefits
The proposed project is expected to help the company meet the following goals and drive the following benefits:
Reduce Business Risks $6,749,796
Consolidate and Standardize IT Infrastructure $4,975,667
Reduce IT Infrastructure Costs $2,057,107
Improve IT System Availability / Service Levels $1,409,431
Improve IT Staff Efficiency / Productivity $982,919
To implement the proposed project will require a 3 year cumulative investment of $5,760,094 including:
$0 in initial expenses
$4,650,000 in capital expenditures
$1,110,094 in operating expenditures
Comparing the costs and benefits of the proposed project using discounted cash flow analysis and factoring in a risk-adjusted discount rate of 9.5%, the proposed business case predicts:
Risk Adjusted Return on Investment (RA ROI) of 172%
Return on Investment (ROI) of 181%
Net Present Value (NPV) savings of $8,425,014
Payback period of 9.0 month(s)
Note: The project has been risk-adjusted for an overall deployment schedule of 5 months."
IBM Tivoli Storage Manager uses less bandwidth, fewer disk and tape storage resources than EMC Legato. For even a large deployment of this kind, payback period is only NINE MONTHS. Generally, if you can get a new proposed investment to have less than 24 month payback period you have enough to get both CFO and CIO excited, so this one is a no-brainer.
Perhaps this helps explain why TSM enjoys such a larger marketshare than EMC Legato in the backup software marketplace. No doubt Scott might be able to come up with a counter-example, a very small business with fewer than 10 employees where an EMC Legato deployment might be less expensive than a comparable TSM deployment. However, when it comes to scalability, TSM is king. The majority of the Fortune 1000 companies use Tivoli Storage Manager, and IBM uses TSM internally for its own IT, managed storage services, and cloud computing facilities.
"The murals in restaurants are on par with the food in museums." --- Peter De Vries
The quote above applies to blogs as well. Those about competitive products of which the blogger has little to no hands-on experience tend to be terribly misleading or technically inaccurate. We saw this last month as Sun Microsystems' Jeff Savit tried to discuss the IBM System z10 EC mainframe.
This time, it comes from EMC bloggers discussing NetApp equipment, and by association, IBM System Storage N series gear.I was going to comment on the ridiculous posts by fellow bloggers from EMC about SnapLock compliance feature on the NetApp, but my buddies at NetApp had already done this for me, saving me the trouble.
The hysterical nature of writing from EMC, and the calm responses from NetApp, speak volumes about the culturesof both companies.
The key point is that none of the "Non-erasable, Non-Rewriteable" (NENR) storage out there are certified as compliant by any government agency on the planet. Governments just aren't in the business of certifying such things. The best you can get is a third-party consultant, such as [Cohasset Associates], to help make decisions that are best for each particular situation.
In addition to SnapLock on N series, IBM offers the [IBM System Storage DR550], WORM tape and optical systems, all of which have been deemed compliant to the U.S. Securities and Exchange Commission [SEC 17a-4] federal regulations by Cohasset Associates. For medical patient records and images like X-rays, IBM offers the Grid Medical Archive Solution [GMAS]designed to meet the requirements of the U.S. Health Insurance Portability and Accountability Act[HIPAA].For other government or industry regulations, consult with your legal counsel.
Continuing my coverage of the Data Center Conference 2009, held Dec 1-4 in Las Vegas, the title of this session refers to the mess of "management standards" for Cloud Computing.
The analyst quickly reviewed the concepts of IaaS (Amazon EC2, for example), PaaS (Microsoft Azure, for example), and SaaS (IBM LotusLive, for example). The problem is that each provider has developed their own set of APIs.
(One exception was [Eucalyptus], which adopts the Amazon EC2, S3 and EBS style of interfaces. Eucalyptus is an open-source infrastrcture that stands for "Elastic Utility Computing Architecture Linking Your Programs To Useful Systems". You can build your own private cloud using the new Cloud APIs included Ubuntu Linux 9.10 Karmic Koala termed Ubuntu Enterprise Cloud (UEC). See these instructions in InformationWeek article [Roll Your Own Ubuntu Private Cloud].)
The analyst went into specific Virtual Infrastructure (VI) and public cloud providers.
Private Clouds can be managed by VMware tools. For remote management of public IaaS clouds, there is [vCloud Express], and for SaaS, a new service called [VMware Go].
Citrix is the Open Service Champion. For private clouds based on Xen Server, they have launched the [Xen Cloud Project] to help manage. For public clouds, they have [Citrix Cloud Center, C3], including an Amazon-based "Citrix C3 Labs" for developing and testing applications. For SaaS, they have [GoToMyPC and [GoToAssist].
Amazon offers a set of Cloud computing capabilities called Amazon Web Services [AWS]. For virtual private clouds, use the AWS Management Console. For IaaS (Amazon EC2), use [CloudWatch] which includes Elastic Load Balancing.
If you prefer a common management system independent of cloud provider, or perhaps across multiple cloud providers, you may want to consider one of the "Big 4" instead. These are the top four system management software vendors: IBM, HP, BMC Software, and Computer Associates (CA).
A survey of the audience found the number one challenge was "integration". How to integrate new cloud services into an existing traditional data center. Who will give you confidence to deliver not tools for remote management of external cloud services? Survey shows:
28 percent: VI Providers (VMware, Citrix, Microsoft)
19 percent: Big 4 System Management software vendors (IBM, HP, BMC, CA)
13 percent: Public cloud providers (Amazon, Google)
40 percent: Other/Don't Know
For internal private on-promise Clouds, the results were different:
40 percent: VI Providers (VMware, Citrix, Microsoft)
21 percent: Big 4 System Management software vendors (IBM, HP, BMC, CA)
13 percent: Emerging players (Eucalyptus)
26 percent: Other/Don't Know
Some final thoughts offered by the analyst. First, nearly a third of all IT vendors disappear after two years, and the cloud will probably have similar, if not worse, track record. Traditional server, storage and network administrators should not consider Cloud technologies as a death knell for in-house on-premises IT. Companies should probably explore a mix of private and public cloud options.
In less than a month, I will be presenting at the annual IBM Storage Technical University, July 18-22, at the Hilton in Orlando, Florida. This is one of my favorite conferences! You can sign up for this at their [Online Registration Page].
I will be covering a variety of topics:
IBM Storage Strategy in the Era of Smarter Computing - After IBM has led the IT industry through the "Centralized Computing" era, and then later the "Distributed Computing" era, we are now entering the third era, that of Smarter Computing. Come learn IBM's strategy for Storage to address today's big challenges, including Big Data, Integrated Workload-optimized systems, and Cloud service delivery models.
IBM Information Archive for Email, Files and eDiscovery - This session will cover the latest announcement for our non-erasable, non-rewriteable compliance storage, the Information Archive (IA), how this can be used to protect your emails and files, and provide indexed search to assist with eDiscovery.
IBM Tivoli Storage Productivity Center Overview and Update - I was one of the original lead architects for Productivity Center. Come learn what this software is all about, and how the latest features and functions can help you manager your IT environment.
IBM SONAS and the Smart Business Storage Cloud - Confused about Cloud Computing and Cloud Storage? I will explain everything you need to know, including how the integrated SONAS appliance operates, IBM's customized solutions for private cloud deployments, and IBM's public cloud offerings.
BOF on Social Media - BOF stands for "Birds of a Feather", and his normally an after-hours discussion on a single theme. This BOF will be a four-expert Q&A panel, including myself, John Sing, Rich Swain and Ian Wright. We will discuss how we got started in Social Media, and how it has boosted our careers and our ability to get work done.
Well, it's Tuesday again, and you know what that means? IBM Announcements! After much needed vacation in Cancun Mexico, Lake Havasu and Sedona, Arizona, I am glad to be back at work! This week, I was visiting clients in the Los Angeles area.
IBM FlashSystem 9100
IBM's latest addition to its lineup of All-Flash Arrays is the FlashSystem 9100.
There are actually two models: the 9110 (model AF7) has 8-core processors, and the 9150 (model AF8) has 14-core processors. Both models are 2U 19-inch shelves with 24 drives on the front, with two control node canisters in the back. The term "FlashSystem 9100" applies to both 9110 and 9150 models.
Each canister has two processors, 64GB to 768GB of cache memory, an on-board 1GbE port for management, four 10GbE ports for Ethernet, and three HIC slots for I/O adapters, which can be any mix of quad-port FC cards, dual-port 25GbE Ethernet cards, or 12Gb SAS cards for expansion drawers.
For drives, you can have any mix of FlashCore Modules (FCM) or Industry-Standard NVMe (ISN) drives. The FlashCore modules are similar to the FlashCore boards in the FlashSystem 900, including Variable-Striped RAID, advanced flash management, heat binning, health separation, hardware-embedded encryption and compression.
These FCM are packaged into standard NVMe SSD form-factor, with 4.8, 9.6 and 19.2 TB capacities. The Industry-Standard NVMe drives come in 1.92, 3.84, 7.68 and 15.36 TB capacities to offer additional price/capacity options to clients.
A fully maxed out twenty-four FCM module system at 19.2TB represents approximately 400TB usable capacity, combined with 5:1 data footprint reduction with deduplication and compression, can provide up to an effective 2PB in as little as 2U of rack space!
The NVMe and FlashCore technology truly accelerates performance. Latencies as low as 100 microseconds are 2.5x lower than competitive offerings. Each control enclosure can deliver up to 2.5 Million IOPS, and a four-way cluster up to 10 million IOPS in just 8U!
You can mix and match FCM and ISN drives in the same controller, but FCM and ISN have to be in their own separate RAID groups. To use Distributed RAID6 (DRAID6), you need at least six drives for this.
IBM has made a "Statement of Direction" that these models are NVMe-OF hardware ready and will support both FC-NVMe and NVMe-OF over Ethernet by year end. Part of this involves changes to server-side software, including various operating systems, device drivers, and multi-pathing drivers.
The FlashSystem 9100 support up to 40U of expansion drawers, over 12Gb SAS, in two sizes. A 2U drawer for 24 SFF drives, and 5U for 92 SFF/LFF drives. Each FlashSystem 9100 can support up to 760 drives. These expansion drawers are not NVMe, so the Solid-State Drives (SSD) inside them use standard SAS. Consider using Easy Tier sub-LUN automated tiering to move fast data up to the FCM/ISN drives, and slower data to these SAS-based SSD.
Even though it doesn't have a "V" in its name, the FlashSystem 9100 runs Spectrum Virtualize, so you can also virtualize other storage behind it. Over 400 different storage devices from leading storage vendors are supported. The FlashSystem 9100 can be virtualized behind SVC or FlashSystem V9000.
FlashSystem 9100 can also cluster with Gen2 and Gen2+ models of the Storwize V7000 and V7000F controllers. You can connect up to four of any of these into a single cluster, supporting up to 3,040 drives.
The FlashSystem 9100 offers all of the features you have come to love from the rest of the Spectrum Virtualize products: data deduplication and compression, encryption, high-availability guarantee, data footprint reduction guarantee, hardware refresh option after three years, storage utility pricing, and IBM Storage Insights support.
IBM has no plans to withdraw either the existing FlashSystem V9000 nor the Storwize V7000/F models anytime soon. They continue to be available for purchase.
To complement the hardware features of the FlashSystem 9100, IBM has come up with three Multi-cloud solutions.
Multi-Cloud Solution for Data Reuse, Protection and Efficiency - this combines Spectrum CDM with Spectrum Protect Plus to take snapshots of volumes on FlashSystem 9100. These snapshots are not just for data protection, but can also be "reused" for other purposes, like dev/test, DevOPS, or analytics.
Multi-Cloud Solution for Business Continuity and Data Reuse - combines Spectrum CDM with Spectrum Virtualize in the Public Cloud, allowing you to take snapshots to the IBM Cloud for disaster recovery. The snapshots can be used in the cloud, or copied back to the same or different data center.
Multi-Cloud Solution for Private Cloud Flexibility and Data Protection - combines IBM Cloud Private, Spectrum CDM, and Spectrum Connect to support client's efforts to re-factor their applications with Docker containers and Kubernetes. IBM FlashSystem 9100 can be used as persistent storage for containerized applications.
This release applies only to the Storwize V7000/F and the new FlashSystem 9100 models, and provides support for iSCSI Extensions over RDMA (iSER) on the 25GbE NIC cards. If you want to cluster existing Storwize V7000/F models to the new FlashSystem 9100 models, you need all of them to be at least v8.2.0 release.
Lower latencies and higher bandwidth requirements can be addressed by using RDMA to implement iSCSI. iSER is a new interconnect protocol that allows iSCSI to run on top of RDMA technology. RDMA can be implemented by using RoCE (RDMA over Converged Ethernet) or iWARP (Internet Wide-area RDMA Protocol). iSER enables iSCSI to run on top of it regardless of which of these technologies is used underneath.
The "Storage Utility" pricing available for many of IBM's other products has been extended to include the IBM FlashSystem 9100 and IBM Cloud Object Storage.
Basically, this is a variable-priced usage-based lease. Let's say you lease 500TB of capacity, but only use 150TB, the first few months you only pay for 150TB, a bit later, you use more, and now start paying more monthly, say 200TB. The price can go up or down. At the end of the lease, typically 36 or 60 months, you have a choice: give the equipment back, or pay the difference.
Maria Boonie is the IBM Director for IBM Worldwide Training and Technical Conferences. She indicated that there were 1500 attendees this week crossing both the System Storage and System x conferences at this hotel. There are 35 vendors that have sponsored this event, and they will be at the "Solutions Center" being held Monday through Wednesday this week.
She took this opportunity to plug IBM's latest education offerings, including Guaranteed-to-Run implementation classes, and Instructor-Led Online (ILO) technical classes.
Brian Truskowski is IBM General Manager for System Storage and Networking. I used to directly report to him in a previous role, and a few years ago he used to be the IBM CIO that helped with IBM's internal IT transformation.
Brian indicates that the previous approach to growth was to "Just Buy More", but this has some unintended consequences. He argued that companies need to adopt one or more of the following approaches to growth:
Stop storing so much - reduce data footprint using storage efficiency capabilities like data deduplication and compression
Store more with what is already on the floor - improve storage utilization with technologies like storage virtualization and thin provisioning
Move data to the right place - implement automated tiering, such as "Flash & Stash" between Solid-state drives and spinning disk, and/or Information Lifecycle Management (ILM) between disk and tape. Studies at some clients have found over 70 percent of data has not beed touched in the last 90 days
This time of dramatic change is the result of a "perfect storm" of influences, including the rising costs and risks associated with losing data, the increased need to index and search data, the desire for "Business Analytics", and the expectation for 100 percent up-time. This is driving IBM to offer hyper-efficient backup, Continuous Data Availability, and Smart Archive solutions.
The case study of SPRINT is a good example. SPRINT is a Telecommunications provider for cell phone users. They were challenged with 35 percent utilization, 165 storage arrays from six different vendors, and an expected 100 percent increase in their IT maintenance costs. After implementing IBM SAN Volume Controller (SVC) and Tivoli Storage Productivity Center (TPC) to manager 2.9 PB of data, SPRINT increased their utilization to 82 percent, reduced down to 70 storage arrays from only three vendors, and reduced their maintenance costs by 57 percent. Today, SPRINT now manages over 5 PB of data with SVC and TPC, have reduced their power and cooling by 3.5 million KWh, representing $320,000 USD in savings.
Roland Hagan is the IBM Vice President for the System x server platform. He talked about the "IT Conundrum" that represents a vicious cycle of "IT Sprawl", "Untrusted Data" and "Inflexible IT" that seem to feed each other. IBM is trying to change behavior, from thinking and dealing with physical boxes representing servers, storage and network gear, to a more holistic view focused on workloads, shared resource pools, independent scaling, and automated management.
IBM is leading the server marketplace, in part because of clever things IBM is doing, especially in developing the eX5 chipset that surrounds x86 commondity processors, and in part because of actions or decisions the competition have taken:
It doesn't break IBM's heart that Oracle decided to drop software support of their database on Itanium, which focued entirley against HP. Oracle runs on IBM servers better than Oracle/Sun or HP servers today, so it does not impact us, other than IBM has had a lot of people leaving HP to switch over to IBM.
HP has taken on a new CEO and reduced their R&D budget, causing them to be late-to-market on some of their offerings.
Dell continues to focus on the small and medium sized customer, and have not really broken into the "Enterprise".
Newcomer Cisco has some great technology that only seems to be adoptable in "Green Field" situations, as it does not integrate well with existing data center infrastructures.
The combination of ex5 chip-set architecture, Max5 memory expansion capabilities and Virtual Network Interface Cards (NICs), provide for a very VM-aware platform. For those who are not ready to fully adopt an integrated stack like IBM CloudBurst, IBM offers the Tivoli Service Automation software on its own, and a new [IBM BladeCenter Foundation for Cloud] as stepping stones to get there.
There are certainly more attendees here than last year, which reflects either the change in location (Orlando, Florida rather than Washington DC) as well as the economic recovery. I'm looking forward to an excellent week!
Continuing my coverage of last week's Data Center Conference 2009, I attended another "User Experience" that was very well received. This time, it was Henry Sienkiewicz of the Department Information Systems Agency (DISA) presenting a real-world example of the business model behind a private cloud implementation. DISA is the US government agency that develops and runs software for the Army, Navy and Air Force.
Being part of the military presents its own unique set of challenges:
Acquisition of hardware to develop and test software is difficult
Budgets fluctuate so an elastic pay-for-use was desirable
End user access had to be secure and meet government regulations
It had to meet the technical aspects of scalable, elastic, dynamic, multi-tenant using shared resources
Using Cloud Computing simplifies provisioning, encourages the use of standards, and provides self-service. DISA has several solutions.
Rapid Access Computing Environment (RACE)
RACE is an internal private cloud with 24-hour provisioning for development and test requests, and 72 hour provisioning for production requests. The amount used is billed on a month-to-month basis, and offers a self-service portal so that developers and administrators can just pick and choose what they need. The result is a hosted server, similar to what you get from 1and1.com or GoDaddy.
Global Content Delivery Service (GCDS)
This provides long-term storage of data. An internal version of "Cloud Storage" for archive and fixed content.
This provides a place to maintain source code, basically their internal version of "SourceForge" used by Open Source projects.
In their traditional approach, a software project would take six months to procure the hardware, another 6-12 months code and test, and then another 6 months in certification, for a total of 18-24 months. With the new Cloud Computing approach that DISA adopted, procurement was down to 24-72 hours with RACE, code test took only 2-6 months with Forge.Mil, and certification could be done in days on RACE, resulting in a new total of only 3-6 months. Some challenges they found:
Service Level management and continuing the use of ITIL best practices
Balancing Military-level Security with Self-service Usability
Internal Funding and Chargeback, they had even adopted a way for developers to pay with their credit card
Cultural inertia, developers don't like to change or do things in a different way
Some lessons learned from this two-year experience:
It's a journey. Most of the user experiences for cloud adoption took two or more years to complete
Infrastructure Fundamentals continue to matter
Know your "marketplace", in this case, software development for military applications
Engage in your end-users early. In this case, Henry had wished he had involved input from software developers that would be using RACE, GCDS and Forge.MIL earlier in the process.
Return on Value analysis, this is different than Return on Investment, as many of the benefits of cloud like higher morale are intangible at first
Avoid fixed costs in negotiations with vendors. For example, he cited they use a lot of IBM because of IBM's pay-for-use billing model. They pay for MIPS used on IBM mainframes, and their IBM Tivoli software pricing is usage-based.
Raj hails from Toronto, Canada and will be able to provide the Canadian perspective on all things Storage. I had the pleasure to meet Raj in person here in Tucson when him and dozens of his cohorts came down for a multi-customer briefing at the [IBM Executive Briefing Center] where I work.