Autonomic continuous data protection is no longer just about backups
An autonomic computing story that is starting to bear fruit centers on data protection that goes beyond simple backups. In the developerWorks article "Continuous Data Protection," author Chris Stakutis, a renowned data storage industry inventor, technologist, and author with more than 20 years of industry experience, explains the different flavors of CDP and their value propositions.
CDP's adoption rate
According to a study of end users by Peripheral Concepts, site implementations of Continuous Data Protection solutions rose from 34 percent in 2005 to 43 percent in 2006. One conclusion Peripheral Concepts draws from the survey is that for many organizations, CDP is transforming business continuity while greatly simplifying the data protection process, easing the administrator's burden, and ensuring compliance with service-level agreements and best practices. The survey also reports that 80 percent of CDP implementations are used to create copies of production data for reporting, audits, and other compliance-related requirements.
CDP is different from traditional backup and recovery techniques in its ability to provide near-instantaneous restore to virtually any point in time without having to physically move or copy data. Other data gathered from the survey:
- A majority of respondents rank data protection highest among their storage-management challenges.
- Data recovery stands out as the major related problem for 40 percent of the respondents.
- The average recovery objective (RTO and RPO) is less than 20 minutes.
- While many continue to use snapshots, one third of the users consider CDP better suited to their needs, with ERP being the most common CDP application.
- 61 percent of the respondents expect to see ROI in less than two years.
Peripheral Concepts estimates that 28 percent of the market population will either implement CDP for the first time or extend its use in 2007.
Types of CDP
According to Stakutis, the point of CDP is to provide nearly infinite recovery points so that any change can be recovered. This is important because, to be cost-effective, an organization doing a rollback needs to be able to go directly to the data point that matters most for the business task at hand. He details the three major types of CDP:
- Block-based CDP. It is great at capturing the data transparently and at presenting a "view" of some past point in time, but it requires work by the application or user to make effective use of the historical view. So if you're looking for high application transparency, no performance hit on the application, and, typically, independence from specific hardware and platforms, block-based CDP is a good tool.
- Application-based CDP. Specific applications (a database or similar application) are completely responsible for doing all of the journaling necessary to roll back to any time -- this means you can have a far richer set of recovery capabilities. However, you are stuck with the application and there is probably a price to pay in overhead and resource use on the application servers.
- File-based CDP. This runs on the application hosts (file servers or workstations) and is similar to application-based CDP in that file serving is essentially an application; however, it may offer broader value since many applications and users use file-based data naturally. Also, the file-based solution can have different sets of priorities for different files or file groups. It is a really good fit for end users and file servers because the asset being protected (files) matches the style of CDP, which provides better granularity and ease of recovery. It is also pretty lightweight.
CDP's not about disaster recovery
According to Geoff Nesnow ("A Close Look at Continuous Data Protection," Top Tech News), there is an important distinction to make -- CDP does not generally help companies with disaster recovery.
CDP systems typically have agent software that resides on the target storage platform or server and a dedicated application component on a different machine. This agent is charged with capturing and transmitting every filesystem transaction to the dedicated CDP application system, which then records the sequence, timing, and content of the transaction. Most systems keep a rolling transaction log (with a limited time window for rollback) and not a complete copy of the data (just a copy of each change). Because of the time limitations and the incompleteness of the stored data, CDP is excellent when you want to recover from data corruption but not so good for rescue from system or hardware failure.
Of course, CDP's very ability to track filesystem changes is what makes recovery so fast -- you simply reverse disk transactions.
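The rolling-log idea can be illustrated with a minimal Python sketch. It is not taken from any CDP product: the names UndoJournal, Change, record, and rollback are hypothetical, the "filesystem" is just an in-memory dictionary, and the sketch assumes the log stores the previous content of each file so that a change can be reversed. It also shows why such a log cannot help once the window has passed or the original data is gone: only changes are kept, not a full copy.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    timestamp: float          # when the write happened
    path: str                 # file that was written
    before: Optional[bytes]   # previous content (None if the file did not exist)


class UndoJournal:
    """Hypothetical rolling log; entries older than `window` seconds are dropped."""

    def __init__(self, window: float):
        self.window = window
        self.entries = deque()            # oldest change first
        self.horizon = float("-inf")      # timestamp of the newest dropped change

    def record(self, timestamp, path, files, new_content):
        # Remember the old content, apply the write, then trim the log.
        self.entries.append(Change(timestamp, path, files.get(path)))
        files[path] = new_content
        while self.entries and self.entries[0].timestamp < timestamp - self.window:
            self.horizon = self.entries.popleft().timestamp

    def rollback(self, files, target_time):
        """Undo every change made after `target_time`, newest first."""
        if target_time < self.horizon:
            raise ValueError("target time is outside the retention window")
        while self.entries and self.entries[-1].timestamp > target_time:
            change = self.entries.pop()
            if change.before is None:
                files.pop(change.path, None)      # the file was created; remove it
            else:
                files[change.path] = change.before


files = {}
journal = UndoJournal(window=60)
journal.record(0, "a.txt", files, b"v1")
journal.record(10, "a.txt", files, b"v2")    # imagine this write corrupted the file
journal.record(20, "b.txt", files, b"new")
journal.rollback(files, target_time=5)       # return to the state as of t=5
```

After the rollback, files holds only a.txt with its first version. Asking for a time older than the retained window raises an error instead of silently restoring an incomplete state.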
CDP in action from Tivoli
To demonstrate CDP in real-world action, the article goes on to discuss Tivoli Continuous Data Protection for Files, a real-time, continuous data protection solution for file servers and user endpoints. Instead of waiting for a scheduled interval, Tivoli CDP for Files transparently backs up the most important files the moment they are saved. You can specify as many as three target backup/replication areas for high-priority files: a local disk, a file server or Network Attached Storage appliance, and an IBM Tivoli Storage Manager server. These areas capture every save of a file when it occurs to help protect against corruption, file loss, and system loss.
Tivoli CDP for Files also versions the separate copies of the files to facilitate date-based restore; auto-manages the local backup target area as a pool with a configurable size, deleting old versions to make room for new versions; lets you specify a remote file server or Tivoli Storage Manager for off-machine protection when a user is connected so that you still get a real-time backup; and provides off-site copies of backup data for vaulting, an auditable disaster recovery plan, and tape and media management (via Tivoli Storage Manager).
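A size-capped version pool with date-based restore can be sketched in a few lines of Python. This is an illustration only, not Tivoli's implementation or API: the VersionPool class and its methods are hypothetical, the pool lives in memory, saves are assumed to arrive in time order, and "oldest version first" is an assumed eviction policy.

```python
from collections import deque


class VersionPool:
    """Hypothetical pool of timestamped file copies kept within a byte budget."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.used = 0
        self.versions = deque()   # (timestamp, path, content), oldest first

    def save(self, timestamp, path, content: bytes):
        # Every save becomes a new version of the file.
        self.versions.append((timestamp, path, content))
        self.used += len(content)
        # Delete the oldest versions until the pool fits, keeping the newest one.
        while self.used > self.max_bytes and len(self.versions) > 1:
            _, _, old = self.versions.popleft()
            self.used -= len(old)

    def restore(self, path, as_of):
        """Return the newest retained version of `path` saved at or before `as_of`."""
        for timestamp, saved_path, content in reversed(self.versions):
            if saved_path == path and timestamp <= as_of:
                return content
        return None


pool = VersionPool(max_bytes=10)
pool.save(1, "report.doc", b"aaaa")
pool.save(2, "report.doc", b"bbbb")
pool.save(3, "report.doc", b"cccc")       # exceeds the budget; the t=1 copy is evicted
latest = pool.restore("report.doc", as_of=99)
earlier = pool.restore("report.doc", as_of=2)
```

The trade-off is visible in the pool size: a larger budget keeps more restore points, while a smaller one shortens how far back a date-based restore can reach.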
For more on Continuous Data Protection, try the on-demand Webcast "Continuous Data Protection: When Once a Day Is Not Enough," which demonstrates how CDP can help by:
- Reducing or eliminating backup times, thereby maximizing backup resource utilization.
- Enabling fast recovery from disk, whether local or remote.
- Improving IT staff productivity by allowing users to easily recover data without IT involvement (ahh! an autonomic computing connection).