What is disaster recovery as a service (DRaaS)?

Published: 12 December 2023
Contributors: Mesh Flinders, Ian Smalley

What is DRaaS?

Disaster recovery as a service (DRaaS) is a third-party solution that delivers data protection and disaster recovery (DR) capabilities to enterprises on demand, over the internet and on a pay-as-you-go basis.

DRaaS solutions replicate and host both physical and virtual servers so they can provide failover in the event of a disaster. Failover is a process in which IT operations are switched to a secondary system when the primary one has failed. Effective DRaaS helps limit downtime and shorten recovery point objectives (RPOs) and recovery time objectives (RTOs) when a disaster strikes.

DR solutions have been gaining popularity in recent years due to a growing awareness in the business community around the importance of data security. Companies that take the DRaaS approach essentially outsource their DR planning to a third party. According to a recent report by Global Market Insights (GMI), the market size for DRaaS was USD 11.5 billion in 2022 and was poised to grow by 22% this year.

What is a disaster recovery plan?

DRaaS solutions rely on disaster recovery plans (DRPs), which are detailed documents outlining how an organization responds to an unplanned incident. Along with business continuity plans (BCPs), DR plans help ensure that businesses are prepared to face many different types of threats, including ransomware and malware attacks, natural disasters and many more. 

A strong DRP can help restore connectivity and repair data loss after a disaster. In an unplanned event, a third-party vendor providing DRaaS support is less likely to suffer the same shutdown as its customers, enabling the DRaaS vendor to enact the customer’s DRP more effectively than the customer itself.

What is failover/failback?

Failover and failback are concepts that are central to DRaaS, helping third-party vendors effectively support their customers and implement their DRP regardless of the severity of the incident they are facing. Failover is a process where IT operations are moved to a secondary system when a primary one has failed due to a power outage, cyberattack or other threat. 

Failback is the process of switching back to the original system once full functionality has been restored. In a DRaaS service model, a vendor might fail over from a customer’s data center onto a secondary site where a redundant system would take effect instantly. If executed properly, failover and failback can create a seamless experience where a user isn’t even aware they are being moved to a secondary system.
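
The failover and failback behavior described above can be sketched as a small state machine. This is only an illustrative model, not how any particular DRaaS product works: the `Site` and `FailoverController` names, the site names and the boolean health flag are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """A hypothetical site that can serve IT operations."""
    name: str
    healthy: bool = True


class FailoverController:
    """Keeps operations on the primary site, or on the secondary while the primary is down."""

    def __init__(self, primary: Site, secondary: Site) -> None:
        self.primary = primary
        self.secondary = secondary
        self.active = primary

    def evaluate(self) -> str:
        """Re-check site health and return a label for the action taken."""
        if self.active is self.primary and not self.primary.healthy:
            if not self.secondary.healthy:
                return "no healthy site"
            # Failover: move operations to the secondary system.
            self.active = self.secondary
            return "failover"
        if self.active is self.secondary and self.primary.healthy:
            # Failback: return to the original system once it is restored.
            self.active = self.primary
            return "failback"
        return "no change"


primary = Site("customer-datacenter")
secondary = Site("draas-site")
controller = FailoverController(primary, secondary)

primary.healthy = False  # simulated outage at the primary site
print(controller.evaluate(), "->", controller.active.name)  # failover -> draas-site

primary.healthy = True  # primary restored
print(controller.evaluate(), "->", controller.active.name)  # failback -> customer-datacenter
```

In practice the health signal would come from monitoring rather than a flag set by hand, and failback usually involves synchronizing data written on the secondary site first; both are left out of this sketch.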

How does DRaaS work?

The first step to DRaaS is selecting the right vendor or managed service provider (MSP) for your organization. This is the company that will deliver DRaaS capabilities, including the establishment of RTOs and RPOs, and help create a business continuity plan (BCP). Typically, DRaaS MSPs compete to deliver lower and lower RTOs and RPOs, so this is a good metric to start from when assessing whether they are a good fit for your needs.

Recovery time objective (RTO): RTO refers to the amount of time that it will take to restore business operations after an unplanned incident. In a DRaaS model, the service level agreements (SLAs) cover the RTO and explain how the MSP and the organization will work together to achieve it.

Recovery point objective (RPO): The RPO is the amount of data an organization can afford to lose in a disaster and still recover. Some enterprises require their data to be constantly copied to a remote data center to ensure continuity in case of a breach; others can tolerate an RPO of a few minutes (or even hours). Setting clear expectations about an organization’s RPO is a critical step toward setting up a DRaaS solution.

Business continuity plan: Like DRPs, RTOs and RPOs, business continuity plans (BCPs) are critical to the DRaaS process. BCPs typically take a broader look at various threats and resolution options than a DRP would, focusing on what an organization and DRaaS provider will need to do to restore basic business functions after an incident. Under the DRaaS model, an organization’s BCP is usually developed by the MSP providing the DRaaS services in close consultation with the organization’s leadership.
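
As a rough illustration of the RTO and RPO definitions above, the following sketch compares what was actually achieved in an incident against the agreed objectives. The function names, timestamps and targets are hypothetical, and the RPO is expressed here as a time window (the gap between the last usable copy and the incident), which is one common way to quantify "the amount of data an organization can afford to lose".

```python
from datetime import datetime, timedelta


def achieved_rpo(last_good_copy: datetime, incident: datetime) -> timedelta:
    """Data-loss window: time between the last usable copy and the incident."""
    return incident - last_good_copy


def achieved_rto(incident: datetime, restored: datetime) -> timedelta:
    """Downtime: time between the incident and restored operations."""
    return restored - incident


def meets_objectives(
    last_good_copy: datetime,
    incident: datetime,
    restored: datetime,
    rpo_target: timedelta,
    rto_target: timedelta,
) -> dict:
    """Report whether each objective was met for a single incident."""
    return {
        "rpo_met": achieved_rpo(last_good_copy, incident) <= rpo_target,
        "rto_met": achieved_rto(incident, restored) <= rto_target,
    }


# Hypothetical incident: last copy at 09:50, outage at 10:00, service back at 10:45.
result = meets_objectives(
    last_good_copy=datetime(2023, 12, 12, 9, 50),
    incident=datetime(2023, 12, 12, 10, 0),
    restored=datetime(2023, 12, 12, 10, 45),
    rpo_target=timedelta(minutes=15),
    rto_target=timedelta(hours=1),
)
print(result)  # {'rpo_met': True, 'rto_met': True}
```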

Choosing a backup destination

DRaaS relies on the backing up of critical systems so they can be restored after an unplanned incident. Choosing the location and type of backups that will be used is one of the most important decisions an organization will make during the DRaaS process. There are three options to choose from: data center, cloud and hybrid backups.

Data center

When an organization opts to back up its most critical data by using a data center, the data is moved offsite, protecting it from a natural disaster or localized cyberattack. Because of the need for more infrastructure—such as offsite facilities and physical servers, systems and staff—data center backups are often the most expensive option. Additionally, restoring data from an offsite data center is a lengthier process than restoring from the cloud and can sometimes take days or even weeks.

Cloud

Cloud disaster recovery plans are often the most scalable and inexpensive because they don’t need physical infrastructure and support. Cloud-based DR plans store critical data in the cloud, allowing an organization to create a virtual machine (VM) instance that can be turned on in minutes—even seconds—in the event of a disaster.

Hybrid cloud

Hybrid DRaaS plans use both a public cloud environment and a data center for backup purposes. Hybrid cloud services are the most flexible of the three options available, and they are a good option for small- to medium-sized businesses (SMBs) that want enterprise-level DRaaS capabilities without investing in physical infrastructure.

Backup as a Service

Many organizations considering DRaaS also look at Backup as a Service (BaaS) as a less expensive option. BaaS is a managed service provided by a third-party vendor that helps enterprises restore their most valuable data after an incident. BaaS solutions store data in a secure, offsite location—often in the cloud—where it is safe from various threats. 

BaaS data backup can include anything of value to an organization, such as files, records and even entire workloads. Like DRaaS, BaaS is a service provided by an MSP and governed by an SLA that outlines all responsibilities and expectations from both parties.

There are three key differences between BaaS and DRaaS solutions that are worth considering:

Backup requirements: While DRaaS backs up both data and infrastructure, BaaS only backs up data. DRaaS MSPs typically take responsibility for keeping critical infrastructure like servers, office buildings and networks available and accessible to users during and immediately after an incident. BaaS providers offer no such services.

Restoration time: BaaS providers can restore data and perform data recovery, but it takes longer than it would with a DRaaS provider because of the amount of data involved. BaaS deployments typically deal with larger volumes of data than DRaaS deployments do and measure their RPOs and RTOs in hours and days. DRaaS providers can measure RPO and RTO in minutes, and sometimes even seconds.

Solution pricing: BaaS costs significantly less than DRaaS. Mainly, this is due to the cost of the resources being deployed. In a DRaaS deployment, enterprises pay for resources such as replication software and compute infrastructure in addition to storage resources, while in a BaaS deployment the organization only pays for storage resources. 
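
The pricing difference can be made concrete with a toy cost model. This is a sketch under stated assumptions: the per-GB rate and the flat replication and compute fees are invented numbers for illustration, not real vendor pricing, and real contracts are usually more complex.

```python
def baas_monthly_cost(storage_gb: float, storage_rate: float) -> float:
    """BaaS: the organization pays only for storage."""
    return storage_gb * storage_rate


def draas_monthly_cost(
    storage_gb: float,
    storage_rate: float,
    replication_fee: float,
    compute_fee: float,
) -> float:
    """DRaaS: storage plus replication software and compute infrastructure."""
    return storage_gb * storage_rate + replication_fee + compute_fee


# Hypothetical figures: 5,000 GB at USD 0.02 per GB per month.
baas = baas_monthly_cost(5000, 0.02)
draas = draas_monthly_cost(5000, 0.02, replication_fee=300.0, compute_fee=450.0)
print(f"BaaS:  USD {baas:.2f} per month")
print(f"DRaaS: USD {draas:.2f} per month")
```

The storage term is identical in both functions; the gap comes entirely from the extra resources a DRaaS deployment keeps available.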

How does BCDR work?

Most organizations divide business continuity and disaster recovery (BCDR) planning into two separate processes: business continuity and disaster recovery. This is an effective approach because while the two processes share many steps, there are also key differences in how the plans are built, implemented and tested.

The primary difference is that BCPs tend to be proactive, while DRPs tend to be more reactive. It’s good to keep this in mind when building the two parts of your BCDR plan because it governs how the two processes relate to each other.

A strong business continuity strategy focuses on processes, procedures and roles that are critical to business operations before, during and immediately following a disaster. DR planning is more geared toward reacting to an incident and taking appropriate actions to recover from it. 

Both processes depend heavily on two critical components, recovery time objective (RTO) and recovery point objective (RPO):

  • Recovery time objective (RTO): RTO refers to the amount of time that it takes to restore business processes after an unplanned incident. Establishing a reasonable RTO is one of the first things businesses need to do when they’re creating their DRP.
  • Recovery point objective (RPO): Your business’s RPO is the amount of data that it can afford to lose in a disaster and still recover. Since data protection is a core capability of many modern enterprises, some constantly copy data to a remote data center to ensure continuity in case of a massive breach. Others set a tolerable RPO of a few minutes (or even hours) for business data to be recovered from a backup system and know they are able to recover from whatever was lost during that time.

How to build a business continuity plan

1. Conduct Business Impact Analysis (BIA)

To build an effective BCP, you’ll first need to understand the various risks your organization faces. Business impact analysis (BIA) plays a crucial role in risk management and business resilience. BIA is the process of identifying and evaluating the potential impact of a disaster on normal operations.

Strong BIA includes an overview of all potential existing threats and vulnerabilities—internal and external—as well as detailed plans for mitigation. Additionally, the BIA must identify the likelihood of an event occurring so the organization can prioritize accordingly.

2. Design responses

Once your BIA is complete, the next step in building your BCP is planning effective responses to each of the threats you’ve identified. Different threats will naturally require different disaster recovery strategies, so each of your responses should have a detailed plan for how the organization will spot a specific threat and address it.

3. Identify key roles and responsibilities

This step dictates how key members of your team will respond when facing a crisis or disruptive event. It documents expectations for each team member as well as the resources required for them to fulfill their roles.

This is a good part of the process to consider how individuals will communicate in the event of an incident. Some threats will shut down key networks—such as cellular or Internet connectivity—so it’s important to have fallback methods of communication your employees can rely on.

4. Test and update your plan

For your BCDR plan to be actionable, you need to practice and refine it constantly. Constant testing and training of employees will lead to a seamless deployment when an actual disaster strikes. Rehearse realistic scenarios like cyberattacks, fires, floods, human error, massive outages and other relevant threats so team members can build confidence in their roles and responsibilities.

How to build a disaster recovery plan

Like BCPs, DRPs require business impact analysis (BIA), the outlining of roles and responsibilities, and constant testing and refinement. But because DRPs are more reactive in nature, there is more of a focus on risk analysis and data backup and recovery. Steps 2 and 3 of DRP development, performing risk analysis (RA) and creating an asset inventory, are not part of the BCP development process at all.

Here's a widely used five-step process for creating a DRP:

1. Conduct business impact analysis

As in your BCP process, start by assessing each threat your company could face and what its ramifications might be. Consider how potential threats might impact daily operations, regular communication channels and worker safety.

Additional considerations for a strong BIA include loss of revenue, cost of downtime, cost of reputational repair (public relations), loss of customers and investors (short and long term) and any incurred penalties from compliance violations.
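
The cost considerations listed above can be rolled into a rough impact estimate for a single scenario. The sketch below is one possible way to add them up; the `ImpactEstimate` class, its field names and the dollar figures are assumptions for illustration only, and harder-to-quantify items such as loss of customers and investors are left out.

```python
from dataclasses import dataclass


@dataclass
class ImpactEstimate:
    """Hypothetical inputs for one disaster scenario in a BIA."""
    downtime_hours: float
    revenue_per_hour: float        # revenue lost while systems are down
    downtime_cost_per_hour: float  # idle staff, recovery work and similar costs
    reputational_repair: float     # public relations spend
    compliance_penalties: float    # fines for compliance violations

    def total(self) -> float:
        lost_revenue = self.downtime_hours * self.revenue_per_hour
        downtime_cost = self.downtime_hours * self.downtime_cost_per_hour
        return (
            lost_revenue
            + downtime_cost
            + self.reputational_repair
            + self.compliance_penalties
        )


# Hypothetical scenario: an 8-hour outage.
outage = ImpactEstimate(
    downtime_hours=8,
    revenue_per_hour=10_000,
    downtime_cost_per_hour=2_000,
    reputational_repair=25_000,
    compliance_penalties=50_000,
)
print(outage.total())  # 171000
```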

2. Analyze risks

DRPs typically require more careful risk assessment than BCPs since their role is to focus on recovery efforts from a potential disaster. During the risk analysis (RA) portion of planning, consider a risk’s likelihood and potential impact on your business.
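
One simple way to weigh likelihood against impact is to multiply them and rank the results. This is a minimal sketch of that common scoring approach, not a method prescribed by the article; the risk names, probabilities and cost figures are made up for the example.

```python
from typing import NamedTuple


class Risk(NamedTuple):
    name: str
    likelihood: float  # estimated probability of occurring in a year (0 to 1)
    impact: float      # estimated cost if it occurs


def expected_loss(risk: Risk) -> float:
    """Likelihood multiplied by impact."""
    return risk.likelihood * risk.impact


def rank_risks(risks: list) -> list:
    """Return risks ordered from highest to lowest expected loss."""
    return sorted(risks, key=expected_loss, reverse=True)


risks = [
    Risk("power outage", likelihood=0.5, impact=50_000),
    Risk("flood", likelihood=0.02, impact=2_000_000),
    Risk("ransomware", likelihood=0.3, impact=500_000),
]

for risk in rank_risks(risks):
    print(f"{risk.name}: {expected_loss(risk):,.0f}")
```

With these inputs the ranking is ransomware, then flood, then power outage, which shows how a rare but costly event can outrank a frequent but cheap one.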

3. Create an asset inventory

To create an effective DRP, you must know exactly what your enterprise owns, its purpose and function and its condition. Doing regular asset inventory helps identify hardware, software, IT infrastructure and anything else your organization might own that is crucial to your business operations. Once you’ve identified your assets, you can group them into three categories—critical, important and unimportant:

  • Critical: Only label assets as critical if they are required for normal business operations.
  • Important: Give this label to assets that are used at least once a day and, if disrupted, would have an impact on business operations (but not shut them down entirely).
  • Unimportant: These are assets your business uses infrequently that are not essential for normal business operations.
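
The three categories above translate directly into a small classification rule. The sketch below encodes them as written; the `Asset` fields and the sample inventory are hypothetical, and a real inventory would carry more detail such as owner, location and condition.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    """Hypothetical inventory record."""
    name: str
    required_for_operations: bool        # normal business operations stop without it
    uses_per_day: float                  # average number of uses per day
    disruption_affects_operations: bool  # losing it hurts but does not halt the business


def classify(asset: Asset) -> str:
    """Assign one of the three categories: critical, important or unimportant."""
    if asset.required_for_operations:
        return "critical"
    if asset.uses_per_day >= 1 and asset.disruption_affects_operations:
        return "important"
    return "unimportant"


inventory = [
    Asset("order database", True, 5000, True),
    Asset("internal wiki", False, 40, True),
    Asset("archived marketing site", False, 0.1, False),
]

for asset in inventory:
    print(f"{asset.name}: {classify(asset)}")
```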

4. Establish roles and responsibilities

Just like in your BCP development, you’ll need to clearly outline responsibilities and ensure team members have what they need to perform their required duties. Without this crucial step, no one will know how to act during a disaster. Here are some roles and responsibilities to consider when building your DRP:

  • Incident reporter: Someone who maintains contact information for relevant parties and communicates with business leaders and stakeholders when disruptive events occur.
  • DRP supervisor: The DRP supervisor ensures team members perform the tasks they’ve been assigned during an incident. 
  • Asset manager: Someone whose job it is to secure and protect critical assets when a disaster strikes. 
  • Third-party liaison: The person who coordinates with any third-party vendors or service providers you’ve hired as part of your DRP and updates stakeholders accordingly on how the DRP is going. 

5. Test and refine

Like your BCP, your DRP requires constant practice and refinement to be effective. Practice it regularly and update it according to any meaningful changes that need to be made. For example, if your company acquires a new asset after your DRP has been formed, you’ll need to incorporate it into your plan to ensure it’s protected going forward.
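
Part of this refinement can be automated by comparing the current asset inventory against the assets the DRP already covers. This is a minimal sketch of such a check; the function name and the asset names are hypothetical.

```python
def unprotected_assets(inventory: set, drp_covered: set) -> list:
    """Return assets that exist in the inventory but are missing from the DRP."""
    return sorted(inventory - drp_covered)


# Hypothetical state: a payment gateway was acquired after the DRP was written.
inventory = {"order database", "email server", "payment gateway"}
drp_covered = {"order database", "email server"}

missing = unprotected_assets(inventory, drp_covered)
print(missing)  # ['payment gateway']
```

Running a check like this as part of each scheduled DRP test gives a simple signal that the plan has fallen behind the inventory.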

    Benefits of DRaaS

    Modern enterprises understandably have less and less tolerance for downtime. 
    Every day, it seems another cyberattack or unplanned event has made the headlines, costing enterprises millions. Disaster recovery solutions like DRaaS provide effective data protection and disaster recovery for a variety of threats.

    Outsourcing the implementation of your DRP and backing up your organization’s most valuable data in a separate location—two key elements of DRaaS—help ensure you can recover swiftly and fully in the event of a disaster.

    Here are some of the key benefits of DRaaS solutions:

    Faster recovery times

    Today’s most competitive enterprises rely on technology for their most critical business operations. When a disaster strikes, the days, hours—even the minutes—that normal processes are knocked out can cost millions. Beyond that, cyberattacks and downtimes at well-known companies often make the news, adding reputational costs to the list of recovery expenses. Effective DRaaS provides enterprises with data protection and backup and increases their ability to bounce back from whatever threats they face.

    Reduced costs

    The cost of recovering from a disaster rises every year. Looking at just one type of unplanned incident, data breaches, IBM’s recent Cost of a Data Breach Report found that the average cost of a breach in 2023 was USD 4.45 million, a 15% increase over the last 3 years.

    When disaster strikes, enterprises that use a DRaaS provider enjoy two significant advantages over those that don’t: their backed-up data and the party responsible for executing their DRP are both in another physical location. This makes it far less likely that the DRaaS provider will be affected by the same incident threatening the organization. Additionally, DRaaS providers offer subscription-based or pay-as-you-go models, eliminating the need for upfront investment in IT infrastructure.

    Greater scalability

    DRaaS is a highly adaptive solution that can be tailored to fit almost any enterprise’s needs. DRaaS providers leverage cloud-based functionality and the automation of key processes and tasks to maximize efficiencies and reduce overhead. Enterprises that deploy DRaaS solutions can dedicate critical resources toward more core business functions that would otherwise be taken up developing, implementing and managing their DRP.

    Improved compliance

    Regulators in heavily regulated sectors like healthcare and personal finance levy punishing fines on companies that are the victims of data breaches. Often, the amount of the fines is tied to the length of downtime during an attack and the amount of data that was compromised. DRaaS helps shorten response and recovery times, which can reduce the financial penalties associated with data breaches.

    Stronger security

    DRaaS providers deploy the strongest cybersecurity and encryption measures available for one simple reason: It’s the core of their business. When you hire a DRaaS provider, you’re hiring specialists in data security, theft prevention and disaster recovery to do what they do best—keep you and your most critical data safe.  

    The three kinds of DRaaS

    Disaster Recovery as a Service (DRaaS) solutions come in three varieties: self-service, assisted and managed DRaaS. Depending on an organization’s needs and resources, there are important differences between these options worth considering.

    Self-service DRaaS

    Self-service DRaaS offers organizations the business tools and resources that they need to build and manage their own DRP. It’s a good fit for technologically advanced organizations with dedicated, in-house IT teams that require a high level of control over their processes. 

    Because it is so bare bones, self-service DRaaS has lower pricing options than other plans and is far more flexible. But organizations should know that with a self-service DRaaS solution, they are on their own during the planning, testing and managing phases of their DRP.

    Assisted DRaaS

    Assisted DRaaS is a good fit for organizations that need to balance their need for control with robust support from a third-party MSP. In assisted DRaaS, the MSP helps build, plan, implement, test and refine the DRP and offers valuable expertise and guidance throughout the process.

    Managed DRaaS

    Managed DRaaS is a fully outsourced DRaaS solution where the MSP takes total control over—and responsibility for—the development and implementation of an organization’s DRP. A strong option for businesses that don’t have their own IT departments, managed DRaaS provides the most robust DRaaS offering available. Unsurprisingly, it is often the most expensive as well.
