Backup and disaster recovery explained
By IBM Cloud Education
Learn the basics of backup and disaster recovery so you can formulate effective plans that minimize downtime.
- Recognize the difference between backup and disaster recovery, and understand key concepts that are critical for developing effective strategies
- Evaluate multiple cloud and on-premises deployment options to find the right fit for your organization
- Identify the best technologies for achieving your backup and disaster recovery goals
Understanding the essentials of backup and disaster recovery is critical for minimizing the impact of unplanned downtime on your business. Across industries, organizations recognize that downtime can quickly result in lost revenue. Unfortunately, natural disasters, human error, security breaches and ransomware attacks can all jeopardize the availability of IT resources. Any downtime can derail customer interactions, sap employee productivity, destroy data and halt business processes.
Differentiating backup from disaster recovery, defining key terms, and evaluating various deployment options and technologies can help you develop effective strategies for avoiding the consequences of downtime.
Are backup and disaster recovery the same thing?
Not at all. There’s an important distinction between backup and disaster recovery. Backup is the process of making an extra copy (or multiple copies) of data. You back up data to protect it. You might need to restore backup data if you encounter an accidental deletion, database corruption or problem with a software upgrade.
Disaster recovery, on the other hand, refers to the plan and processes for quickly reestablishing access to applications, data and IT resources after an outage. That plan might involve switching over to a redundant set of servers and storage systems until your primary data center is functional again.
Some organizations mistake backup for disaster recovery. But as they may discover after a serious outage, simply having copies of data doesn’t mean you can keep your business running. To ensure business continuity, you need a robust, tested disaster recovery plan.
The importance of backup and disaster recovery planning
Your organization cannot afford to neglect backup or disaster recovery. If it takes hours to retrieve lost data after an accidental deletion, your employees or partners will sit idle, unable to complete business-critical processes that rely on your technology. And if it takes days to bring your business back online after a disaster, you stand to permanently lose customers. Given the amount of time and money you could lose in both cases, investments in backup and disaster recovery are completely justified.
Key terms in backup and disaster recovery
Understanding a few essential terms can help shape your strategic decisions and enable you to better evaluate backup and disaster recovery solutions.
- Recovery time objective (RTO) is the maximum amount of time your business can tolerate between an outage and the recovery of normal operations. As you look to set your RTO, you’ll need to consider how much time you’re willing to lose and the impact that time will have on your bottom line. The RTO might vary greatly from one type of business to another. For example, if a public library loses its catalog system, it can likely continue to function manually for a few days while the systems are restored. But if a major online retailer loses its inventory system, even 10 minutes of downtime, and the associated loss in revenue, would be unacceptable.
- Recovery point objective (RPO) refers to the amount of data you can afford to lose in a disaster, usually expressed as a span of time. You might need to copy data to a remote data center continuously, so that an outage will not result in any data loss. Or you might decide that losing five minutes or one hour of data would be acceptable.
- Failover is the disaster recovery process of automatically offloading tasks to backup systems in a way that is seamless to users. You might fail over from your primary data center to a secondary site, with redundant systems that are ready to take over immediately.
- Failback is the disaster recovery process of switching back to the original systems. Once the disaster has passed and your primary data center is back up and running, you should be able to fail back seamlessly as well.
- Restore is the process of transferring backup data to your primary system or data center. The restore process is generally considered part of backup rather than disaster recovery.
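To make the RTO and RPO definitions above concrete, the following minimal Python sketch compares the results of a recovery test against both targets. The names `Objectives` and `check_objectives` are hypothetical, and the sketch assumes that data is copied at a fixed interval, so the worst-case data loss equals that interval.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Objectives:
    rto: timedelta  # longest tolerable time to restore normal operations
    rpo: timedelta  # longest tolerable window of lost data


def check_objectives(objectives, measured_recovery, copy_interval):
    """Return a list of gaps; an empty list means both targets are met.

    measured_recovery: how long recovery took in a test (assumed input).
    copy_interval: time between backup or replication runs (assumed input).
    """
    gaps = []
    if measured_recovery > objectives.rto:
        gaps.append(
            f"RTO missed: recovery took {measured_recovery}, "
            f"target is {objectives.rto}"
        )
    # With periodic copies, the worst case is a failure just before the
    # next copy, which loses one full interval of data.
    if copy_interval > objectives.rpo:
        gaps.append(
            f"RPO missed: up to {copy_interval} of data could be lost, "
            f"target is {objectives.rpo}"
        )
    return gaps


if __name__ == "__main__":
    retailer = Objectives(rto=timedelta(minutes=10), rpo=timedelta(minutes=5))
    for gap in check_objectives(retailer, timedelta(minutes=30), timedelta(hours=1)):
        print(gap)
```

A check like this is only as good as its inputs: the recovery time should come from an actual test of the plan, not from an estimate.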
One last term might be helpful as you consider alternatives for managing your disaster recovery processes and your disaster recovery environment:
- Disaster recovery as a service (DRaaS) is a managed approach to disaster recovery. A third party hosts and manages the infrastructure used for disaster recovery. Some DRaaS offerings might provide tools to manage the disaster recovery processes or enable organizations to have those processes managed for them.
Once you understand the key concepts, it’s time to apply them to your workloads. Many organizations have multiple RTOs and RPOs that reflect the importance of each workload to their business.
For a major bank, the online banking system might be a critical workload — the bank needs to minimize time and data loss. However, the bank’s employee time-tracking application is less important. In the event of a disaster, the bank could allow that application to be down for several hours or even a day without having a major negative impact on the business. Defining workloads as Tier 1, Tier 2 or Tier 3 can help provide a framework for your disaster recovery plan.
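One way to picture tiering is as a lookup from each workload's tolerances to the least demanding tier that still satisfies them. The sketch below is illustrative only: the tier thresholds, the `TIERS` table and the `assign_tier` function are assumptions made for this example, not values taken from any standard or product.

```python
from datetime import timedelta

# Hypothetical tiers, ordered from least to most demanding.
# Each entry: (name, RTO the tier delivers, RPO the tier delivers).
TIERS = [
    ("Tier 3", timedelta(days=1), timedelta(days=1)),
    ("Tier 2", timedelta(hours=4), timedelta(hours=1)),
    ("Tier 1", timedelta(minutes=15), timedelta(0)),
]


def assign_tier(max_downtime, max_data_loss):
    """Return the least demanding tier that meets the workload's tolerances."""
    for name, rto, rpo in TIERS:
        if rto <= max_downtime and rpo <= max_data_loss:
            return name
    raise ValueError("no tier is strict enough for this workload")


if __name__ == "__main__":
    # Tolerances below are invented for the bank example in the text.
    workloads = {
        "online banking": (timedelta(minutes=15), timedelta(0)),
        "employee time tracking": (timedelta(days=1), timedelta(days=1)),
    }
    for workload, (downtime, data_loss) in workloads.items():
        print(workload, "->", assign_tier(downtime, data_loss))
```

Searching from the least demanding tier upward reflects the usual cost trade-off: stricter tiers tend to cost more, so a workload should land in the cheapest tier that still protects it.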
Evaluate deployment options
The next step in designing a disaster recovery plan is to evaluate deployment options. Do you need to keep some disaster recovery functions or backup data on premises? Would you benefit from a public cloud or hybrid cloud approach?
Cloud-based backup and disaster recovery solutions are becoming increasingly popular among organizations of all sizes. Many cloud solutions provide the infrastructure for storing data and, in some cases, the tools for managing backup and disaster recovery processes.
By selecting a cloud-based backup or disaster recovery offering, you can avoid the large capital investment for infrastructure as well as the costs of managing the environment. In addition, you gain rapid scalability plus the geographic distance necessary to keep data safe in the event of a regional disaster.
Cloud-based backup and disaster recovery solutions can support both on-premises and cloud-based production environments. You might decide, for example, to store only backed up or replicated data in the cloud while keeping your production environment in your own data center. With this hybrid approach, you still gain the advantages of scalability and geographic distance without having to move your production environment. In a cloud-to-cloud model, both production and disaster recovery are located in the cloud, although at different sites to ensure enough physical separation.
In some cases, keeping certain backup or disaster recovery processes on premises can help you retrieve data and recover IT services rapidly. Retaining some sensitive data on premises might also seem appealing if you need to comply with strict data privacy or data sovereignty regulations.
For disaster recovery, a plan that relies wholly on an on-premises environment is risky. If a natural disaster or power outage strikes, your entire data center, including both primary and secondary systems, could be affected. That’s why most disaster recovery strategies employ a secondary site that is some distance away from the primary data center. You might locate that other site across town, across the country or across the globe depending on how you decide to balance factors such as performance, regulatory compliance and physical accessibility to the secondary site.
Examine various technologies for backup and disaster recovery
Depending on which deployment options you choose, you might have several alternatives for the types of technologies and processes you employ for backup and for disaster recovery.
Despite having been around for decades, traditional magnetic tape storage can still play a role in your backup plan. With a tape solution, you can store a large amount of data reliably and cost-effectively.
While tape can be effective for backup, it is not usually employed for disaster recovery, which requires the faster access time of disk-based storage. Also, if you need to physically retrieve a tape from an offsite vault, you could lose several hours or even days of availability.
A snapshot-based backup captures the current state of an application or disk at a moment in time. By writing only the data that has changed since the last snapshot, this method can help protect data while conserving storage space.
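The idea of writing only changed data can be sketched in a few lines of Python. This is a simplified illustration, not how any particular product implements snapshots: it assumes fixed-size blocks, detects changes by comparing SHA-256 hashes, and uses a tiny block size so the effect is easy to see. The function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny for demonstration; real systems use far larger blocks


def split_blocks(data, block_size=BLOCK_SIZE):
    """Split bytes into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def take_snapshot(data, previous_hashes):
    """Return (changed_blocks, hashes) for the current data.

    changed_blocks maps block index -> bytes for blocks that are new or
    differ from the previous snapshot; only these need to be stored.
    """
    blocks = split_blocks(data)
    hashes = [hashlib.sha256(block).hexdigest() for block in blocks]
    changed = {
        i: block
        for i, (block, digest) in enumerate(zip(blocks, hashes))
        if i >= len(previous_hashes) or previous_hashes[i] != digest
    }
    return changed, hashes


def restore(snapshots):
    """Rebuild the latest data by replaying snapshots from oldest to newest."""
    blocks = {}
    block_count = 0
    for changed, hashes in snapshots:
        blocks.update(changed)
        block_count = len(hashes)
    return b"".join(blocks[i] for i in range(block_count))


if __name__ == "__main__":
    first = take_snapshot(b"aaaabbbbcccc", [])
    second = take_snapshot(b"aaaaXXXXcccc", first[1])
    print(len(first[0]), "blocks stored, then", len(second[0]))
    print(restore([first, second]))
```

In this sketch the first snapshot stores every block, while the second stores only the one block that changed, which is where the storage savings come from.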
Snapshot-based replication can be used for backup or disaster recovery. Of course, your data is only as complete as your most recent snapshot. If you take snapshots every hour, you must be willing to lose up to an hour’s worth of data.
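The relationship between snapshot frequency and data loss can be illustrated with a small Python sketch. Given a sorted list of snapshot times and the time of a failure, it finds the most recent usable snapshot and the window of data that would be lost. The function name `latest_recovery_point` and the timestamps are hypothetical, chosen only for this example.

```python
from bisect import bisect_right
from datetime import datetime


def latest_recovery_point(snapshot_times, failure_time):
    """Return (snapshot_time, data_loss) for the newest snapshot taken
    at or before failure_time.

    snapshot_times must be sorted in ascending order.
    """
    index = bisect_right(snapshot_times, failure_time)
    if index == 0:
        raise ValueError("no snapshot exists before the failure")
    chosen = snapshot_times[index - 1]
    return chosen, failure_time - chosen


if __name__ == "__main__":
    # Hourly snapshots on an arbitrary example day.
    snapshots = [datetime(2024, 1, 1, hour) for hour in (9, 10, 11)]
    failure = datetime(2024, 1, 1, 11, 59)
    point, loss = latest_recovery_point(snapshots, failure)
    print("restore from", point, "losing", loss)
```

A failure just before the next scheduled snapshot is the worst case, which is why the snapshot interval sets an upper bound on data loss.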
Many organizations are moving toward continuous replication for disaster recovery as well as for backup. With this method, the latest copy of a disk or application is continuously replicated to another location or the cloud, minimizing downtime and providing more granular recovery points.
Don’t wait for disaster
For most organizations, backup and disaster recovery strategies are absolutely critical to maintain the health of the business. As you evaluate and update your strategies, consider exploring managed and cloud-based services, which can help you control complexity and cost. Whatever you do, don’t wait to assess your strategies. Backup and disaster recovery plans can help only if they are designed, deployed and tested long before they are needed.