A recent post in Anecdote talks about the long list of cognitive biases which affect business decision making. This list is a good explanation of why so many people have a difficult time identifying appropriate recovery scenarios as the basis for Business Continuity planning. Their "cognitive biases" get in the way.
Again, using my IBM Thinkpad T60 laptop as an example, here are a variety of different scenarios:
- Corrupted File System
Some file systems are more fragile than others. If your NTFS file system gets corrupted, you might be able to run
CHKDSK C: /Fbut this just puts damaged blocks into dummy files, it doesn't really repair your files back to their pre-damage level.All kinds of things can damage the file system, including viruses, software defects, and user error.
I keep my programs and data in separate file systems. C: has my Windows operating system and applications, and D: holds my pure data. If one file system is corrupted, the other one might be in tact, mitigating the risk.
- Hard Disk Crash
Hopefully, you will have temporary read/write errors to provide warning prior to a complete failure. In theory, if I kept a spare hard disk in my laptop bag, I could swap out the bad drive with the good drive. I don't have that. The three times that I have had a disk failure all occurred while I was in Tucson.
Instead, I keep the few files I need for my trip on a separate USB key, and carry bootable Live CD, which allows you to boot entirely from CDrom drive, either to run applications, or perform rescue operations.
The latest one that I am trying out is Ubuntu Linux, which has OpenOffice 2.2 that can read/write PowerPoint, Word, and Excel spreadsheets; Firefox web browser; Gimp graphics software; and a variety of other applications, all in a 700MB CDrom image. I even have been able to get Wireless (Wi-Fi) working with it, and the process to create your own customized Live CD with the your own application packages is fairly straightforward. Combined with a writeable USB key, you can actually get work done this way. Special thanks to IBM blogger Bob Sutor for pointing me to this.
(If you have a DVD-RAM drive, there are bigger Live CDs from SUSE and RedHat Fedora that provide even more applications)
- Laptop Shell Failure
This might catch some people by surprise. I have had the keyboard, LCD screen, or some essential port/plug fail on my laptop. The disk drive and CDrom drive work fine, but unless you have another "laptop" to stick them into, they don't help you recover. This can also happen if the motherboard fails, or the battery is unable to hold a charge.
IBM provides a 24-hour turn around fix. Basically, IBM sends me a laptop shell, no drive, no CDrom, with instructions to move the disk drive and CDrom drive from your broken shell, to the new shell, then send the bad shell back in the same shipping box.
Here, again, I am thankful that I keep my key files on an USB key. Often I travel with other IBMers, and can borrow their laptop to make presentations, check my e-mail, or other work, until I can get my replacement shell. In you are travelling outside the US, you might be able to move your disk drive into a colleague's laptop, access the data, copy it to your USB key or burn a copy on CD or DVD.
In a data center, many outages are really "failures to access data", but the data is safe. For example, power outages, network outages, and so on, can prevent people from using their IT systems, but the data is safe when these are re-established.
- Temporary Separation
At times, I have been temporarily separated from my laptop. Three examples:
- A higher level executive had technical difficulties with his laptop, and usurped mine instead.
- A colleague forgot his power supply for his laptop, and borrowed my laptop instead. (I wish there were a standard for laptop power plug connectors)
- Customs agents confiscate your laptop, give you a receipt, and eventually you get it back.
In all cases, I was glad that no "recovery" was required, and that the few files I needed were on my USB key. A few times, I was able to get by on the machines available at the nearest Internet Cafe, in the meantime.
With some imagination, you can recognize that this scenario is similar to the previous one for laptop shell failure.Here is a good example that you can identify different scenarios, and then later discover they have similar properties in terms of recovery, and can be treated as one.
- Permanent Separation
Laptops are stolen every day. Luckily, I've only had this happen twice to me in my career at IBM, and I managed to get a replacement soon enough. The key lesson here is to keep your USB key and recovery media in separate luggage.I know it is more convenient to keep all computer-related stuff in one place, but a thief is going to take your whole laptop bag, to make sure that all cables and power supplies are included, and is not going to leave anything behind. That would just slow them down.
technorati tags: IBM, Business, Continuity, plan, plans, planning, Thinkpad, T60, laptop, NTFS, CHKDSK, hard disk crash, USB, key, Live, CD, LiveCD, DVD, Ubuntu, Linux, SUSE, RedHat, Fedora, shell, failure