What Is Backup and Disaster Recovery?

Backup and disaster recovery involves periodically creating or updating one or more copies of files, storing them in one or more remote locations, and using the copies to continue or resume business operations in the event of data loss due to file damage, data corruption, cyberattack or natural disaster.

The two subprocesses, ‘backup’ and ‘disaster recovery’, are sometimes mistaken for each other or for the entire process. Backup is the process of making the file copies. Disaster recovery is the plan and processes for using the copies to quickly reestablish access to applications, data and IT resources after an outage. That plan might involve switching over to a redundant set of servers and storage systems until your primary data center is functional again.

Simply having copies of data doesn’t mean that a company can keep the business running. Ensuring business continuity requires a robust and tested backup and disaster recovery plan.

The importance of planning

Your organization cannot afford to neglect backup or disaster recovery. If it takes hours to retrieve lost data after an accidental deletion, your employees or partners will sit idle, unable to complete business-critical processes that rely on your technology. And if it takes days to bring your business back online after a disaster, you stand to permanently lose customers. Given the amount of time and money you might lose in both cases, investments in backup and disaster recovery are completely justified.

Key terms

Understanding a few essential terms can help shape your strategic decisions and enable you to better evaluate backup and disaster recovery solutions.

Recovery time objective (RTO) is the maximum amount of time that you can allow for recovering normal business operations after an outage. As you look to set your RTO, you’ll need to consider how much time you’re willing to lose and the impact that time will have on your bottom line. The RTO might vary greatly from one type of business to another. For example, if a public library loses its catalog system, it can likely continue to function manually for a few days while the systems are restored. But if a major online retailer loses its inventory system, even 10 minutes of downtime, and the associated loss in revenue, would be unacceptable.

Recovery point objective (RPO) refers to the amount of data that you can afford to lose in a disaster. You might need to copy data to a remote data center continuously so that an outage will not result in any data loss. Or you might decide that losing five minutes or one hour of data would be acceptable.

Failover is the disaster recovery process of automatically offloading tasks to backup systems in a way that is seamless to users. You might fail over from your primary data center to a secondary site, with redundant systems that are ready to take over immediately.

Failback is the disaster recovery process of switching back to the original systems. Once the disaster has passed and your primary data center is back up and running, you should be able to fail back seamlessly as well.

Restore is the process of transferring backup data to your primary system or data center. The restore process is considered part of backup rather than disaster recovery.
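
To make the RTO and RPO definitions concrete, the following minimal Python sketch checks a backup schedule and a recovery drill against a pair of targets. The function names and the target values are hypothetical and chosen only for illustration. The sketch assumes that the worst-case data loss equals the interval between copies.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst-case data loss is assumed to equal the time between copies,
    so the interval must not exceed the RPO."""
    return backup_interval <= rpo


def meets_rto(measured_recovery: timedelta, rto: timedelta) -> bool:
    """A recovery drill meets the RTO if operations resumed within it."""
    return measured_recovery <= rto


# Hypothetical targets, for illustration only.
rpo = timedelta(minutes=5)
rto = timedelta(minutes=10)

# Hourly copies cannot satisfy a five-minute RPO.
print(meets_rpo(timedelta(hours=1), rpo))    # False
# An eight-minute recovery drill fits inside a ten-minute RTO.
print(meets_rto(timedelta(minutes=8), rto))  # True
```

A check like this is only as good as its inputs: the recovery time should come from an actual test of the plan, not from an estimate.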

One last term might be helpful as you consider alternatives for managing your disaster recovery processes and your disaster recovery environment:

Disaster recovery as a service (DRaaS) is a managed approach to disaster recovery. A third party hosts and manages the infrastructure used for disaster recovery. Some DRaaS offerings might provide tools to manage the disaster recovery processes or enable organizations to have those processes managed for them.

Prioritize workloads

Once you understand the key concepts, it’s time to apply them to your workloads. Many organizations have multiple RTOs and RPOs that reflect the importance of each workload to their business.

For a major bank, the online banking system might be a critical workload—the bank needs to minimize time and data loss. However, the bank’s employee time-tracking application is less important. In the event of a disaster, the bank might allow that application to be down for several hours or even a day without having a major negative impact on the business. Defining workloads as Tier 1, Tier 2 or Tier 3 can help provide a framework for your disaster recovery plan.
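
One way to record tiers is as a small table of workloads, each with its own RTO and RPO. The sketch below is an illustration under stated assumptions: the `Workload` class, the `recovery_order` helper, the "internal-reporting" workload and all of the time values are hypothetical and do not come from any specific product.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Workload:
    name: str
    tier: int        # 1 = most critical
    rto: timedelta   # maximum tolerated downtime
    rpo: timedelta   # maximum tolerated data loss, expressed as time


def recovery_order(workloads: list[Workload]) -> list[str]:
    """Return workload names sorted by tier, then by the tightest RTO."""
    ranked = sorted(workloads, key=lambda w: (w.tier, w.rto))
    return [w.name for w in ranked]


# Hypothetical values for the bank example.
workloads = [
    Workload("employee-time-tracking", tier=3,
             rto=timedelta(hours=24), rpo=timedelta(hours=24)),
    Workload("online-banking", tier=1,
             rto=timedelta(minutes=5), rpo=timedelta(0)),
    Workload("internal-reporting", tier=2,
             rto=timedelta(hours=4), rpo=timedelta(hours=1)),
]

print(recovery_order(workloads))
# ['online-banking', 'internal-reporting', 'employee-time-tracking']
```

Keeping the targets next to each workload makes it easier to see which systems must be restored first and which can wait.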

Evaluate deployment options

The next step in designing a disaster recovery plan is to evaluate deployment options. Do you need to keep some disaster recovery functions or backup data on premises? Would you benefit from a public cloud or hybrid cloud approach?

Cloud

Cloud-based backup and disaster recovery solutions are becoming increasingly popular among organizations of all sizes. Many cloud solutions provide the infrastructure for storing data and, in some cases, the tools for managing backup and disaster recovery processes.

By selecting a cloud-based backup or disaster recovery offering, you can avoid the large capital investment for infrastructure as well as the costs of managing the environment. In addition, you gain rapid scalability plus the geographic distance necessary to keep data safe in the event of a regional disaster.

Cloud-based backup and disaster recovery solutions can support both on-premises and cloud-based production environments. You might decide, for example, to store only backed up or replicated data in the cloud while keeping your production environment in your own data center.

With this hybrid approach, you still gain the advantages of scalability and geographic distance without having to move your production environment. In a cloud-to-cloud model, both production and disaster recovery are located in the cloud, although at different sites to ensure enough physical separation.

On-premises

In some cases, keeping certain backup or disaster recovery processes on premises can help you retrieve data and recover IT services rapidly. Retaining some sensitive data on premises might also seem appealing if you need to comply with strict data privacy or data sovereignty regulations.

For disaster recovery, a plan that relies wholly on an on-premises environment would be challenging. If a natural disaster or power outage strikes, your entire data center—with both primary and secondary systems—would be affected. That’s why most disaster recovery strategies employ a secondary site that is some distance away from the primary data center.

You might locate that other site across town, across the country or across the globe depending on how you decide to balance factors such as performance, regulatory compliance and physical accessibility to the secondary site.

Technologies

Depending on which deployment options you choose, you might have several alternatives for the types of technologies and processes you employ for backup and for disaster recovery.

Traditional tape

Despite having been around for decades, traditional magnetic tape storage can still play a role in your backup plan. With a tape solution, you can store a large amount of data reliably and cost-effectively.

While tape can be effective for backup, it is not usually employed for disaster recovery, which requires the faster access time of disk-based storage. Also, if you need to physically retrieve a tape from an offsite vault, you might lose several hours or even days of availability.

Snapshot-based replication

A snapshot-based backup captures the current state of an application or disk at a moment in time. By writing only the data that has changed since the last snapshot, this method can help protect data while conserving storage space.

Snapshot-based replication can be used for backup or disaster recovery. However, your data is only as complete as your most recent snapshot. If you take snapshots every hour, you must be willing to lose an hour’s worth of data.
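
The idea of writing only changed data can be sketched in a few lines of Python. This is a simplified illustration, not how any particular storage product works: it compares whole files by SHA-256 fingerprint, whereas real snapshot systems typically track changes at the block level. The function names and the sample file names are hypothetical, and deleted files are not handled.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 digest that identifies the content."""
    return hashlib.sha256(content).hexdigest()


def incremental_snapshot(files: dict[str, bytes], previous: dict[str, str]):
    """Return (changed, index).

    changed: files whose content differs from the previous snapshot.
    index:   fingerprints of all current files, to pass to the next call.
    """
    index = {path: fingerprint(data) for path, data in files.items()}
    changed = {
        path: files[path]
        for path, digest in index.items()
        if previous.get(path) != digest
    }
    return changed, index


# Hypothetical disk contents.
disk = {"orders.db": b"v1", "catalog.db": b"v1"}
first, index = incremental_snapshot(disk, {})      # first snapshot copies everything

disk["orders.db"] = b"v2"                          # only one file changes
second, index = incremental_snapshot(disk, index)  # second snapshot copies only that file

print(sorted(first))   # ['catalog.db', 'orders.db']
print(sorted(second))  # ['orders.db']
```

Any write made after the most recent snapshot exists only on the primary system, which is why the snapshot interval bounds the amount of data you could lose.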

Continuous replication

Many organizations are moving toward continuous replication for disaster recovery as well as for backup. With this method, the latest copy of a disk or application is continuously replicated to another location or the cloud, minimizing downtime and providing more granular recovery points.
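
The more granular recovery points mentioned above can be illustrated with a toy replication journal. In this hedged sketch, every write is sent to a replica together with its timestamp, so the data can be rebuilt as of any moment rather than only as of the last snapshot. The `ReplicaJournal` class, its methods and the sample values are hypothetical, and the sketch assumes that writes arrive in time order.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaJournal:
    """Hypothetical replica that records every write with its timestamp."""
    entries: list[tuple[float, str, str]] = field(default_factory=list)

    def replicate(self, timestamp: float, key: str, value: str) -> None:
        # Each write is shipped to the replica as soon as it happens.
        self.entries.append((timestamp, key, value))

    def state_at(self, timestamp: float) -> dict[str, str]:
        """Rebuild the data as it stood at the given time."""
        state: dict[str, str] = {}
        for ts, key, value in self.entries:
            if ts <= timestamp:
                state[key] = value  # later writes overwrite earlier ones
        return state


journal = ReplicaJournal()
journal.replicate(1.0, "balance", "100")
journal.replicate(2.0, "balance", "80")
journal.replicate(3.0, "balance", "0")  # for example, a corrupting write

# Recover to a point just before the bad write.
print(journal.state_at(2.5))  # {'balance': '80'}
```

With hourly snapshots, the closest recovery point in this example could be up to an hour old. With a journal of individual writes, you can choose the moment immediately before the problem occurred.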