Last week, I presented "An Introduction to Cloud Computing" for two hours to the local Institute of Management Accountants [IMA] for their Continuing Professional Education [CPE]. Since I present IBM's leadership in Cloud Storage offerings, I have had to become an expert in Cloud Computing overall. The audience was a mix of bookkeepers, accountants, auditors, comptrollers, CPAs, and accounting teachers.
I have posted my charts on Slideshare.Net:
Here is a sample of the questions I took during and after my presentation:
If I need to shut down host machine, I lose all my virtual machines as well?
No, it is possible to seemlessly move virtual machines from one host to another. If you need to shut down a host machine, move all the VMs to other hosts, then you can shut down the empty host without impacting business.
Does the SaaS provider have to build their own app, can they not buy an app and then rent it out?
Yes, but they won't have competitive differentiation, and the software development they buy from will want a big cut of the action. SaaS developers that build their own applications can keep more of the profits for themselves.
How do backups work in cloud computing? Do I have to contact someone at the cloud computing company to find the backup tape?
Large datacenters often keep the most recent backups on disk, and older versions on tape in automated tape libraries that can fetch your backup in less than 2 minutes. Because of this, there is no need to talk to anyone, you can schedule or invoke your own backups, and often perform the recovery yourself using self-service tools.
Last month, my sister tried to rent a car during the week the Tucson Gem Show, but they were out of cars she wanted to drive. Could this happen with Cloud Computing?
Not likely. With rental cars, the cars have to be physically in Tucson to rent them. Rental companies could have brought cars down from Phoenix to satisfy demand. With Cloud Computing, it is all accessible over the global network, you are not limited to the cloud providers nearest you.
Is there a reason why Amazon Web Services (AWS) charges more for a Windows image than a Linux image?
Yes, Amazon and Microsoft have a patent cross-licensing agreement where Amazon pays Microsoft for the priveledge of offering Windows-based images on their EC2 cloud infrastructure. It just makes business sense to pass those costs onto the consumer. Linux is a free open source operating system, and is often the better choice.
So if we rent a machine from Amazon, they send it to my accounting office? What exactly am I getting for 12 cents per hour?
No. The computer remains in their datacenter. You get a virtual machine that runs 1.2Ghz Intel processor, with 1700MB of RAM, and 160GB of hard disk space, with Windows operating system running on it, comparable to a machine you can get at the local BestBuy, but instead of it running in the next room, it is running in a datacenter somewhere else in the United States with electricity and air conditioning.
You access it remotely from your desktop or laptop PC.
Why would I ever rent more than one computer?
It depends on your workload. For example, Derek Gottfrid at the New York Times needed to convert 11 million articles from TIFF format to PDF format so that he could put them up on the web. This would have taken him months using a single computer, so he rented 100 computers and got the entire stack converted in 24 hours, for a cost of about $240. See the articles [Self-Service, Prorated, Super Computing] and [TimesMachine] for details.
What about throughput? Won't I need to run cables from my accounting office to this cloud computing data center?
You will need connectivity, most likely from connections provided by your local telephone or cable company, or through the Internet. Certainly, there can be cases where direct privately-owned fiber optic cables, known as "dark fiber", can directly connect consumers to local Cloud service providers, for added security.
What about medical records? Will Cloud Computing help the Healthcare industry?
Yes, hospitals are finding that digitizing their records greatly reduces costs. IBM offers the Grid Medical Archive Solution [GMAS] as a private cloud storage solution to store X-ray images and other electronic medical records on disk and tape, and these records can be accessed from multiple hospitals and clinics, wherever the doctor or patient happens to be.
The advantage of personal computers was individualization, I could put on my own choices of software, and customize my own settings, won't we lose this with Cloud Computing?
Yes, customized software and settings cost companies millions of dollars with help desk calls. Cloud Computing attempts to provide some standardization, reducing the amount of effort to support IT operations.
Won't putting all the computers into a big datacenter make them more vulnerable to hackers?
Security is a well-known concern, but this is being addressed with encryption, access control lists, multi-tenancy isolation, and VPN connections.
My daughter has a BlackBerry or iPod or something, and when we mentioned that someone in Phoenix wore a monkey suit to avoid photo-radar speed cameras, she was able to pull up a picture on her little hand-held thing, is this the future?
Yes, mobile phones and other hand-held devices now have internet access to take advantage of Cloud Computing services. People will be able to access the information they need from wherever they happen to be. (You can see the picture here: [Man Dons Mask for Speed-Camera Photos])
IBM offers a variety of Cloud Computing services, as well as customized solutions and integrated systems that can be deployed on-premises behind your corporate firewall. To learn more, go to [ibm.com/cloud].
The second speaker was local celebrity Dan Ryan presenting the financials for the upcoming [Rosemont Copper] mining operations. Copper is needed for emerging markets, such as hybrid vehicles and wind turbines. Copper is a major industry in Arizona.
technorati tags: , IBM, Cloud Computing, IMA, CPE, IaaS, PaaS, SaaS, Amazon, AWS, Linux, Eucalyptus, Ubuntu, Rosemont Copper
In my last blog post [Full Disk Encryption for Your Laptop] explained my decisions relating to Full-Disk Encryption (FDE) for my laptop. Wrapping up my week's theme of Full-Disk Encryption, I thought I would explain the steps involved to make it happen.
Last April, I switched from running Windows and Linux dual-boot, to one with Linux running as the primary operating system, and Windows running as a Linux KVM guest. I have Full Disk Encryption (FDE) implemented using Linux Unified Key Setup (LUKS).
Here were the steps involved for encrypting my Thinkpad T410:
- Step 0: Backup my System
Long-time readers know how I feel about taking backups. In my blog post [Separating Programs from Data], I emphasized this by calling it "Step 0". I backed up my system three ways:
- Backed up all of my documents and home user directory with IBM Tivoli Storage Manager.
- Backed up all of my files, including programs, bookmarks and operating settings, to an external disk drive (I used rsync for this). If you have a lot of bookmarks on your browser, there are ways to dump these out to a file to load them back in the later step.
- Backed up the entire hard drive using [Clonezilla].
Clonezilla allows me to do a "Bare Machine Recovery" of my laptop back to its original dual-boot state in less than an hour, in case I need to start all over again.
- Step 1: Re-Partition the Drive
"Full Disk Encryption" is a slight misnomer. For external drives, like the Maxtor BlackArmor from Seagate (Thank you Allen!), there is a small unencrypted portion that contains the encryption/decryption software to access the rest of the drive. Internal boot drives for laptops work the same way. I created two partitions:
- A small unencrypted partition (2 GB) to hold the Master Boot Record [MBR], Grand Unified Bootlloader [GRUB], and the /boot directory. Even though there is no sensitive information on this partition, it is still protected the "old way" with the hard-drive password in the BIOS.
- The rest of the drive (318GB) will be one big encrypted Logical Volume Manager [LVM] container, often referred to as a "Physical Volume" in LVM terminology.
Having one big encrypted partition means I only have to enter my ridiculously-long encryption password once during boot-up.
- Step 2: Create Logical Volumes in the LVM container
I create three logical volumes on the encrypted physical container: swap, slash (/) directory, and home (/home). Some might question the logic behind putting swap space on an encrypted container. In theory, swap could contain sensitive information after a system [hybernation]. I separated /home from slash(/) so that in the event I completely fill up my home directory, I can still boot up my system.
- Step 3: Install Linux
Ideally, I would have lifted my Linux partition "as is" for the primary OS, and a Physical-to-Virtual [P2V] conversion of my Windows image for the guest VM. Ha! To get the encryption, it was a lot simpler to just install Linux from scratch, so I did that.
- Step 4: Install Windows guest KVM image
The folks in our "Open Client for Linux" team made this step super-easy. Select Windows XP or Windows 7, and press the "Install" button. This is a fresh install of the Windows operating system onto a 30GB "raw" image file.
TuxRadar has a good [Howto: Linux and Windows virtualization with KVM and Qemu] on this if you are not in IBM. To my surprise, the Windows XP runs better as a KVM guest under Linux than it did running natively in my dual-boot configuration.
(Note: Since my Thinkpad T410 is Intel-based, I had to turn on the 'Intel (R) Virtualization Technology' option in the BIOS!)
There are only a few programs that I need to run on Windows, so I installed them here in this step.
- Step 5: Set up File Sharing between Linux and Windows
In my dual-boot set up, I had a separate "D:" drive that I could access from either Windows or Linux, so that I would only have to store each file once. For this new configuration, all of my files will be in my home directory on Linux, and then shared to the Windows guest via CIFS protocol using [samba].
In theory, I can share any of my Linux directories using this approach, but I decide to only share my home directory. This way, any Windows viruses will not be able to touch my Linux operating system kernels, programs or settings. This makes for a more secure platform.
- Step 6: Transfer all of my files back
Here I used the external drive from "Step 0" to bring my data back to my home directory. This was a good time to re-organize my directory folders and do some [Spring cleaning].
- Step 7: Re-establish my backup routine
Previously in my dual-boot configuration, I was using the TSM backup/archive client on the Windows partition to backup my C: and D: drives. Occasionally I would tar a few of my Linux directories and storage the tarball on D: so that it got included in the backup process. With my new Linux-based system, I switched over to the Linux version of TSM client. I had to re-work the include/exclude list, as the files are different on Linux than Windows.
One of my problems with the dual-boot configuration was that I had to manually boot up in Windows to do the TSM backup, which was disruptive if I was using Linux. With this new scheme, I am always running Linux, and so can run the TSM client any time, 24x7. I made this even better by automatically scheduling the backup every Monday and Thursday at lunch time.
There is no Linux support for my Maxtor BlackArmor external USB drive, but it is simple enough to LUKS-encrypt any regular external USB drive, and rsync files over. In fact, I have a fully running (and encrypted) version of my Linux system that I can boot directly from a 32GB USB memory stick. It has everyting I need except Windows (the "raw" image file didn't fit.)
I can still use Clonezilla to make a "Bare Machine Recovery" version to restore from. However, with the LVM container encrypted, this renders the compression capability worthless, and so takes a lot longer and consumes over 300GB of space on my external disk drive.
Backing up my Windows guest VM is just a matter of copying the "raw" image file to another file for safe keeping. I do this monthly, and keep two previous generations in case I get hit with viruses or "Patch Tuesday" destroys my working Windows image. Each is 30GB in size, so it was a trade-off between the number of versions and the amount of space on my hard drive. TSM backup puts these onto a system far away, for added protection.
- Step 8: Protect your Encryption setup
In addition to backing up your data, there are a few extra things to do for added protection:
- Add a second passphrase. The first one is the ridiculously-long one you memorize faithfully to boot the system every morning. The second one is a ridiculously-longer one that you give to your boss or admin assistant in case you get hit by a bus. In the event that your boss or admin assistant leaves the company, you can easily disable this second passprhase without affecting your original.
- Backup the crypt-header. This is the small section in front that contains your passphrases, so if it gets corrupted, you would not be able to access the rest of your data. Create a backup image file and store it on an encrypted USB memory stick or external drive.
If you are one of the lucky 70,000 IBM employees switching from Windows to Linux this year, Welcome!
technorati tags: IBM, Linux, LUKS, FDE, encryption, KVM, Windows, TSM, Clonezilla
Continuing my saga for my [New Laptop], I have gotten all my programs operational, and now it is a good time to re-evaluate how I organize my data. You can read my previous posts on this series: [Day 1], [Day 2], [Day 3].
I started my career at IBM developing mainframe software. The naming convention was simple, you had 44 character dataset names (DSN), which can be divided into qualifiers separated by periods. Each qualifier could be up to 8 characters long. The first qualifier was called the "high level qualifier" (HLQ) and the last one was the "low level qualifier" (LLQ). Standard naming conventions helped with ownership and security (RACF), catalog management, policy-based management (DFSMS), and data format identification. For example:
In the first case, we see that the HLQ is "PROD" for production, the application is PAYROLL and this file holds job control language (JCL). The LLQ often identified the file type. The second can be a version for testing a newer version of this application. The third represents user data, in which case my userid PEARSON would have my own written TEST JCL. I have seen successful naming conventions with 3, 4, 5 and even 6 qualifiers. The full dataset name remains the same, even if it is moved from one disk to another, or migrated to tape.
(We had to help one client who had all their files with single qualifier names, no more than 8 characters long, all in the Master Catalog (root directory). They wanted to implement RACF and DFSMS, and needed help converting all of their file names and related JCL to a 4-qualifer naming convention. It took seven months to make this transformation, but the client was quite pleased with the end result.)
While the mainframe has a restrictive approach to naming files, the operating systems on personal computers provide practically unlimited choices. File systems like NTFS or EXT3 support filenames as long as 254 characters, and pathnames up to 32,000 characters. The problem is that when you move a file from one disk to another, or even from one directory structure to another, the pathname will change. If you rely on the pathname to provide critical information about the meaning or purpose of a file, that could get lost when moving the files around.
I found several websites that offered organization advice. On The Happiness Project blog, Gretchen Rubin [busts 11 myths] about organization. On Zenhabits blog, Leo Babauta offers [18 De-cluttering tips].
Peter Walsh's [Tip No. 185] suggests using nouns to describe each folder. Granted these are about physical objects in your home or office, but some of the concepts can apply to digital objects on your disk drive.
Other websites were specific to organizing digital files on your personal computer. On her Lifehacker blog, Gina Trapani shows her approach to [Organizing "My Documents"]. Chanel Wood offers her [How to organize your computer and still remember where you put everything], based on a simple alphabetic system. Microsoft offers [9 tips to organize files better]. Most of the advice was common sense, but this one, from Peter Walsh's [Tip No. 190], I found amusing:
"Use the computer’s sorting function. Put “AAA” (or a space) in front of the names of the most-used folders and “ZZZ” (or a bullet) in front of the least-used ones, so the former float to the top of an alphabetical list and the latter go to the bottom."
Personally, I hate spaces anywhere in directory and file names, and the thought of putting a space at the front of one to make it float to the top is even worse. Rather than resorting to naming folders with AAA or ZZZ, why not just limit the total number of files or directories so they are all visible on the screen. I often sort by date to access my most frequently-accessed or most-recently-updated files.
Of all the suggestions I found, Peter Walsh's "Use Nouns" seemed to be the most useful. Wikipedia has a fascinating article on [Biological Classification]. Certainly, if all living things can be put into classifications with only seven levels, we should not need more than seven levels of file system directory structure either! So, this is how I decided to organize my files on my new Thinkad T410:
- C: Drive
Windows XP operating system programs and applications. I have structured this so that if I had to replace my hard disk entirely while traveling, I could get a new drive and restore just the operating system on this drive, and a few critical data files needed for the trip. I could then do a full recovery when I was back in the office. If I was hit with a virus that prevented Windows from booting up, I could re-install the Windows (or Linux) operating system without affecting any of my data.
- D: Drive
This will be for my most active data, files and databases. I have the Windows "My Documents" point to D:\Documents directory. Under Archives, I will keep files for events that have completed, projects that have finished, and presentations I used that year. If I ever run out of space on my disk drive, I would delete or move off these archives first. I have a single folder for all Downloads, which I can then move to a more appropriate folder after I decide where to put them. My Office folder holds administrative items, like org charts, procedures, and so on.
As a consultant, many of my files relate to Events, these could be Briefings, Conferences, Meetings or Workshops. These are usually one to five days in duration, so I can hold here background materials for the clients involved, agendas, my notes on what transpired, and so on. I keep my Presentations separately, organized by topic. I also am involved with Projects that might span several months or ongoing tasks and assignments. I also keep my Resources separately, these could be templates, training materials, marketing research, whitepapers, and analyst reports.
A few folders I keep outside of this structure on the D: drive. [Evernote] is an application that provides "folksonomy" tagging. This is great in that I can access it from my phone, my laptop, or my desktop at home. Install-files are all those ZIP and EXE files to install applications after a fresh Windows install. If I ever had to wipe clean my C: drive and re-install Windows, I would then have this folder on D: drive to upgrade my system. Finally, I keep my Lotus Notes database directory on my D: drive. Since these are databases (NSF) files accessed directly by Lotus Notes, I saw no reason to put them under the D:\Documents directory structure.
- E: Drive
This will be for my multimedia files. These don't change often, are mostly read-only, and could be restored quickly as needed.
I'll give this new re-organization a try. Since I have to take a fresh backup to Tivoli Storage Manager anyways, now is the best time to re-organize the directory structure and update my dsm.opt options file.
technorati tags: , mainframe, DFSMS, HLQ, LLQ, DSN, naming convention, RACF, JCL, file system, de-clutter, organization, Peter Walsh, Windows, Linux, TSM
I had an interesting query about my last blog post [Enterprise Systems are Security-Ready], basically asking me what I decided to do for Full-Disk Encryption (FDE) for my laptop.
Earlier this year, IBM mandated that every employee provided a laptop had to implement Full-Disk Encryption for their primary hard drive, and any other drive, internal or external, that contained sensitive information. An exception was granted to anyone who NEVER took their laptop out of the IBM building. At IBM Tucson, we have five buildings, so if you are in the habit of taking your laptop from one building to another, then encryption is required!
The need to secure the information on your laptop has existed ever since laptops were given to employees. In my blog post [Biggest Mistakes of 2006], I wrote the following:
"Laptops made the news this year in a variety of ways. #1 was exploding batteries, and #6 were the stolen laptops that exposed private personal information. Someone I know was listed in one of these stolen databases, so this last one hits close to home. Security is becoming a bigger issue now, and IBM was the first to deliver device-based encryption with the TS1120 enterprise tape drive."
Not surprisingly, IBM laptops are tracked and monitored. In my blog post [Using ILM to Save Trees], I wrote the following:
"Some assets might be declared a 'necessary evil' like laptops, but are tracked to the n'th degree to ensure they are not lost, stolen or taken out of the building. Other assets are declared "strategically important" but are readily discarded, or at least allowed to [walk out the door each evening]."
When it was [time for a new laptop] in 2010, I spent a week [re-partitioning the drive], [transfering files], [installing programs], [re-organizing my folders], and finally [testing my system]. It was dual-boot so that I could run either Windows or Linux, as needed, to demonstrate various software solutions at the IBM Tucson Executive Briefing Center.
Unfortunately, dual-boot environments won't cut it for Full-Disk Encryption. For Windows users, IBM has chosen Pretty Good Privacy [PGP]. For Linux users, IBM has chosen Linux Unified Key Setup [LUKS]. PGP doesn't work with Linux, and LUKS doesn't work with Windows.
For those of us who may need access to both Operating Systems, we have to choose. Select one as the primary OS, and run the other as a guest virtual machine. I opted for Red Hat Enterprise Linux 6 as my primary, with LUKS encryption, and Linux KVM to run Windows as the guest.
I am not alone. While I chose the Linux method voluntarily, IBM has decided that 70,000 employees must also set up their systems this way, switching them from Windows to Linux by year end, but allowing them to run Windows as a KVM guest image if needed.
Let's take a look at the pros and cons:
- LUKS allows for up to 8 passphrases, so you can give one to your boss, one to your admin assistant, and in the event they leave the company, you can disable their passphrase without impacting anyone else or having to memorize a new one. PGP on Windows supports only a single passphrase.
- Linux is a rock-solid operating system. I found that Windows as a KVM guest runs better than running it natively in a dual-boot configuration.
- Linux is more secure against viruses. Most viruses run only on Windows operating systems. The Windows guest is well isolated from the Linux operating system files. Recovering from an infected or corrupted Windows guest is merely re-cloning a new "raw" image file.
- Linux has a vibrant community of support. I am very impressed that anytime I need help, I can find answers or assistance quickly from other Linux users. Linux is also supported by our help desk, although in my experience, not as well as the community offers.
- Employees that work with multiple clients can have a separate Windows guest for each one, preventing any cross-contamination between systems.
- Linux is different from Windows, and some learning curve may be required. Not everyone is happy with this change.
(I often joke that the only people who are comfortable with change are babies with soiled diapers and prisoners on death row!)
- Implementation is a full re-install of Linux, followed by a fresh install of Windows.
- Not all software required for our jobs at IBM runs on Linux, so a Windows guest VM is a necessity. If you thought Windows ran slowly on a fully-encrypted disk, imagine how much slower it runs as a VM guest with limited memory resources.
In theory, I could have tried the Windows/PGP method for a few weeks, then gone through the entire process to switch over to Linux/LUKS, and then draw my comparisons that way. Instead, I just chose the Linux/LUKS method, and am happy with my decision.
technorati tags: IBM, Windows, PGP, Linux, LUKS, FDE, encryption
They say "Great Minds think alike" and that imitation is "the sincerest form of flattery." Both of these quotes came to mind when I read fellow blogger Chuck Hollis' (EMC) excellent April 7th blog post [The 10 Big Ideas That Are Shaping IT Infrastructure Today]. Not surprisingly, some of his thoughts are similar to those I had presented two weeks ago in my March 22nd post [Cloud Computing for Accountants]. Here are two charts that caught my eye:
- Telephone Operators
On page 13 of my deck, I had an old black and white photo of telephone operators, as part of a section on the history of selecting "cloud" as the iconic graphic to represent all networks. Chuck has this same graphic on his chart titled "#1 The Industrialization of IT Infrastructure".
Looks like Chuck and I use the same "stock photo" search facility!
- Arms Dealers
On page 45 on my deck, I had a list of major "arms dealers" that deliver the hardware and software components needed to build Cloud Computing. Chuck has a similar chart, titled "#2 The Consolidation of the IT Industry", but with some interesting differences.
Let's look at some of the key differences:
- The left-to-right order is slightly different. I chose a 1-2-4-2-1 symmetrical pattern purely on aesthetic reasons. My presentation was to a bunch of accountants, and so I was trying not to make it sound like an "Infomercial" for IBM products and offerings. My sequence is roughly chronological, in that Oracle announced its intention to acquire Sun, then Cisco, VMware and EMC announced their VCE coalition, followed closely by Cisco, VMware and NetApp announcing they work together well also, followed by [HP extended alliance with Microsoft] on Jan 13, 2010. As the IT marketplace is maturing, more and more customers are looking for an IBM-like one-stop shopping experience, and certainly various "mini-mall" alliances have formed to try to compete in this space.
- I had HP and Microsoft in the same column, referring only to the above-mentioned January announcement. HP is all about private cloud hardware infrastructures, but Microsoft is all about "three screens and the public cloud", so not sure how well this alliance will work out from a Cloud Computing perspective. This was not to imply that the other stacks don't work well with Microsoft software. They all do. Perhaps to avoid that controversy, Chuck chose to highlight HP's acquisition of EDS services instead.
- I used the vendor logos in their actual colors. Notice that the colors black, blue and red occur most often. These happen to be the three most popular ballpoint pen ink colors found on the very same paper documents these computer companies are trying to eliminate. Paper-less office, anyone? Chuck chose instead to colorize each stack with his own color scheme. While blue for IBM and orange for Sun Microsystems make some sense, it is not clear if he chose green for Cisco/VMware/EMC for any particular reason. Perhaps he was trying to subtly imply that the VCE stack is more energy efficient? Or maybe the green refers to money to indicate that the VCE stack is the most expensive? Either way, I would pit IBM's server/storage/software stack up against anything of comparable price from these other stacks in any energy efficiency bake-off.
- What about the Cisco/VMware/NetApp combination? All three got together to assure customers this was a viable combination. IBM is the number one reseller of VMware, and VMware runs great with IBM's N series NAS storage, so I do not dispute Cisco's motivation here. It makes sense for Cisco to two-time EMC in this manner. Why should Cisco limit itself to a single storage supplier? Et tu VMware? Having VMware chose NetApp over its parent company EMC was a bit of a shock. No surprise that Chuck left NetApp out of his chart.
- No love for Dell? I give Dell credit for their work with Virtual Desktop Images (VDI), and for embracing Ubuntu Linux for their servers. Dell's acquisitions of EqualLogic iSCSI-based disk systems and Perot Systems for services are also worth noting. Dell used to resell some of EMC's gear, but perhaps that relationship continues to fade away, as I [predicted back in 2007]. Chuck's decision to leave Dell off his chart speaks volumes to where this relationship stands, and where it is going.
Perhaps we are all in just one big ["echo chamber"], as we are all coming up with similar observations, talking to similar customers, and reviewing similar market analyst reports. I am glad, at least this time, that Chuck and I for the most part agree where the marketplace is going. We live in interesting times!
technorati tags: , Chuck Hollis, EMC, IBM, HP, Microsoft, telephone operators, IT infrastructure, cloud computing, Cisco, VMware, , , Oracle, Sun, NetApp, EqualLogic, iSCSI, NAS, Dell, VDI, Ubuntu, Linux