Ruins from the 1906 San Francisco earthquake, remembered as one of the worst natural disasters in United States history (Wikipedia).
I am going to stray considerably from the topic of business rules today, but I'd like to share some information on personal backups for disaster recovery and hear your thoughts.
If you are anything like me, you've amassed a lot of digital information over the past twenty-odd years, some of which is "mission critical", such as scans of important documents or photos of landmark events in your life. I hope you are already backing this information up onto redundant storage, such as a dedicated external hard drive. If not, given the price of external hard drives these days, you really have very little excuse! The Mean Time To Failure for hard drives is difficult to estimate because consumer drives run under widely different operating conditions, but it probably lies in the 3-7 year range. So it is not a question of IF the drive will fail, just WHEN. Copying the data onto redundant hard disks, either in an ad-hoc fashion or using a RAID device, will clearly reduce (but not eliminate) the chance of losing data to a catastrophic device failure.
The illustration above is from the excellent paper "Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?" by Bianca Schroeder and Garth A. Gibson. Google has also published an interesting paper on hard drive failure rates that includes information on the predictive capabilities of the SMART technology available in many hard disks.
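To make the MTTF figures a little more concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a constant failure rate (an exponential lifetime model) and, for the mirrored case, that the two drives fail independently. Both are simplifying assumptions of mine, not findings from either paper, and the function names are purely illustrative.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8760 hours


def annual_failure_probability(mttf_hours: float) -> float:
    """Chance that one drive fails within a year.

    Assumes a constant failure rate (exponential lifetime), which is
    a simplification: real drives do not fail at a constant rate.
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mttf_hours)


def all_copies_fail_probability(p_single: float, copies: int = 2) -> float:
    """Chance that every copy fails in the same year.

    Assumes the failures are independent, which is optimistic: drives
    sharing a desk also share power surges, fires and burglars.
    """
    return p_single ** copies


if __name__ == "__main__":
    scenarios = [
        ("datasheet MTTF of 1,000,000 hours", 1_000_000),
        ("MTTF of 5 years", 5 * HOURS_PER_YEAR),
    ]
    for label, mttf in scenarios:
        p = annual_failure_probability(mttf)
        pair = all_copies_fail_probability(p)
        print(f"{label}: single drive {p:.2%}, mirrored pair {pair:.4%}")
```

Under these assumptions, a 5-year MTTF works out to roughly an 18% chance of losing a single drive in any given year, while a mirrored pair brings that down to a few percent. The independence assumption is the weak point, since anything that takes out both copies at once defeats the arithmetic.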
So given that your most valuable data is mirrored onto at least two hard disks, and perhaps also burnt onto CD/DVD, what is the next step? The sad story of the robbery of Francis Ford Coppola's house is a useful object lesson: he lost 15 years of backups because the thieves stole not only his computers but also his backup devices. Obviously a house fire, flood or lightning strike could have the same effect! So device redundancy is not enough: you also need off-site backup. You could rent a safe-deposit box at the bank and drop a hard disk or DVD off occasionally, but the danger is that the inconvenience will make your backups dangerously infrequent. If you have enough bandwidth to your house, moving your most precious data onto a remote server is a very good option, as backups can be automated or performed regularly without having to use physical media.
When you get into the realm of off-site backup a bunch of issues arise:
- Who is hosting my backup and will they provide a good service when I need to get my data in a hurry?
- Do I trust the host with my most sensitive information? What if their servers are compromised?
- What software can I use to upload and download my backups? What operating systems are supported?
- What is the cost of the service?
- Is it relatively easy to move to a new host in the future or am I locked in?
- Can I access my backups from different machines?
- Can I automate my backup process and upload data to the host?
- How efficient (incremental) are the backups, especially for individual large binary files?
- How future-proof is the solution?
Everyone's situation is different, but I will describe my reasoning and my solution. First, here are my requirements:
- I need to back up a relatively small amount of data (less than 4 GB) that changes infrequently (a few times per year).
- I want to encrypt the data locally on my machine. The host should have no access to the unencrypted files - ever.
- I need access to the backups from 2 Mac OS X machines (one is running 10.5, the other 10.4) as well as a machine running Windows XP and a machine running Ubuntu Linux. Any of the machines may need to download, modify and re-upload the backup.
It turns out that the requirement to access the backup from Mac, Windows and Linux eliminates almost all the services aimed at the home user. The encryption requirement also filters out a lot of hosts, as many of the services offer a web interface to interact with your backups, which requires that they be able to browse your files on their servers. The requirement for efficient upload and download of the backup eliminates the hosts that just provide online storage, allowing you to upload or download only whole individual files. With such a host, if you have very large individual files that have undergone small internal changes (such as media catalog files), you will have to upload the entire file each time, which is not practical when bandwidth is limited.
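To illustrate why block-level (incremental) transfer matters for large files with small internal changes, here is a minimal Python sketch. It is not the mechanism used by any particular backup service, just a simplified stand-in: it splits two versions of a file into fixed-size blocks, hashes each block, and reports which blocks differ. The function names and the 64 KiB block size are my own choices for the example.

```python
import hashlib
from typing import List


def block_digests(data: bytes, block_size: int) -> List[str]:
    """Return the SHA-256 digest of each fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = 64 * 1024) -> List[int]:
    """Return the indexes of blocks in `new` that are new or differ from `old`."""
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    return [
        index
        for index, digest in enumerate(new_digests)
        if index >= len(old_digests) or digest != old_digests[index]
    ]


if __name__ == "__main__":
    # A 1 MiB "catalog file" with a single byte modified in place.
    old_version = bytes(1024 * 1024)
    new_version = bytearray(old_version)
    new_version[300_000] = 0xFF

    changed = changed_blocks(old_version, bytes(new_version))
    total = len(block_digests(old_version, 64 * 1024))
    print(f"{len(changed)} of {total} blocks need uploading: {changed}")
```

In this toy case only one 64 KiB block out of sixteen would need to be re-uploaded instead of the whole megabyte. The sketch only catches changes made in place: inserting or deleting bytes shifts every later block, which is why tools such as rsync use rolling checksums rather than fixed offsets.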
The solution I selected is:
The biggest downside of this solution is that it is too complex to put in place for many people, which makes it hard to recommend to some friends and family.