As cloud adoption accelerates and enterprise- and production-level applications move to the cloud, administrators face challenges because traditional, agent-based backup solutions are ill-suited to a cloud environment. Agentless backup and recovery technologies offer advantages over the traditional approaches that can help simplify and speed up data recovery.
This article describes agentless backup and recovery and how it works with the cloud.
Why agentless backup and recovery matters
The IT world is changing: Cloud computing backup and recovery is not immune to this change. Traditional backup approaches used with monolithic applications and data centers, as well as data protection schemes that were primarily designed for web and consumer applications that drove the early growth of the cloud, are not optimally designed for enterprise and production applications running in the cloud.
Agentless backup and recovery technology is generally designed from the ground up for cloud environments in order to provide a solution that is low-touch, less resource-intensive, enables flexible data recovery, and is simple to manage. For cloud users, this means less effort and associated costs (much of that through automation) and it means faster recovery.
Now let's look at the evolution of backup, right up to the present-day cloud.
The evolution of backup and the road to cloud
Let's examine two more traditional types of web-based backup:
- Agent-based backup
- Virtualization image backup
Traditional agent-based backup
Agents have historically been used to scan and collect data from operating systems, file systems, and applications. Agents can back up the entire data set, incremental file changes, or incremental block changes.
More recently agent functionality has grown to include deduplication, compression, and encryption. All of these functions require a certain amount of system resources. Figure 1 illustrates a typical agent-based backup system.
Figure 1. Typical agent-based backup software deployment
Application agents for structured database backup (RDBMS, email, ERP, etc.) are typically a unique agent or piece of code that is additional to the system agent. Each agent is unique and cannot be shared by other systems or applications.
Agents must be installed on each and every device. Most traditional backup and data protection software does not push the agents out; the administrator must manually install the agent on each device. This is true for patches, fixes, and upgrades as well.
Many backup agents require a disruptive application system reboot as well. This requires all implementations, upgrades, patches, and fixes to be scheduled and flash cut. If you have large numbers of agents, it makes this process a bit unwieldy and often leads to backup administrators putting off upgrades or patches until a scheduled maintenance period.
When server virtualization first became popular, backups were done the same way on virtual machines (VMs) as they were on physical ones. This reduced VM concentrations and consolidation because each agent consumed resources; more VMs meant more agent resources.
Backups also created contention on the I/O since each agent attempted to back up concurrently, often because they did not know they were contending for the same resources. Backup performance slowed and backup windows were often missed.
The emergence of virtualization image backup
Hypervisor vendors found agents' overhead unacceptable and developed a different approach to backing up virtual machines. Most hypervisors today have some form of API that allows backup software to use hypervisor-native snapshots, offloading the backup process without the use of agents.
Through the hypervisor API, the backup software media server initiates a snapshot of a specific VM or series of VMs (more specifically, the virtual disks associated with the VMs). Virtualization snapshots allow a complete VM rollback recovery, often erroneously referred to as bare metal restore (BMR). BMR can be a rather long process given the amount of data involved, and user polls have consistently shown that more than 90 percent of all recoveries and restores are of a single file, not an entire machine. But when a VM rollback is required, BMR is a fast, easy way to get it done.
There are several on-premises backup products that address file-level recovery within virtualization image backup, but they tend to be virtualization-platform- and hypervisor-specific. There are situations where a hypervisor snapshot or image capture requires the cloud servers to be offline (scheduled downtime), additional VM resources to recover data, or large increases in storage capacity to capture multiple point-in-time or clone image snapshots. Figure 2 shows a virtualization image backup and snapshots setup.
Figure 2. Virtualization image backup and snapshots
Now let's move on to agentless backup and recovery.
Smarter cloud backup: Agentless
Instead of installing agent software on all servers, applications, and devices, when you use an agentless backup design, you consolidate data access onto one or more physical or virtual data collectors. Each data collector pulls data from the source servers, applications, and devices for backup. Data collectors are provided with the appropriate credentials for each backup target and leverage native APIs to pull the data. Data collection operates at LAN speed to ensure fast backups. Figure 3 demonstrates this.
Figure 3. Agentless backup software deployment
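The pull model described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's implementation: BackupTarget, DataCollector, and fake_linux_fetcher are hypothetical names, and the fetcher is an in-memory stand-in for a native API call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class BackupTarget:
    """A source machine the collector pulls from (hypothetical structure)."""
    host: str
    kind: str        # for example "windows", "linux", or "vm"
    credential: str  # placeholder for whatever secret the native API needs


# A fetcher takes a target and returns {item_name: content}.
Fetcher = Callable[[BackupTarget], Dict[str, bytes]]


class DataCollector:
    """Pulls data from many targets; nothing is installed on the targets."""

    def __init__(self) -> None:
        self._fetchers: Dict[str, Fetcher] = {}

    def register(self, kind: str, fetcher: Fetcher) -> None:
        """Associate a target kind with the function that reads from it."""
        self._fetchers[kind] = fetcher

    def collect(self, targets: List[BackupTarget]) -> Dict[str, Dict[str, bytes]]:
        """Pull data from every target, keyed by host name."""
        results: Dict[str, Dict[str, bytes]] = {}
        for target in targets:
            fetcher = self._fetchers.get(target.kind)
            if fetcher is None:
                raise ValueError(f"no fetcher registered for kind {target.kind!r}")
            results[target.host] = fetcher(target)
        return results


def fake_linux_fetcher(target: BackupTarget) -> Dict[str, bytes]:
    # Stand-in for a real remote read; no network access happens here.
    return {"/etc/hostname": target.host.encode()}


collector = DataCollector()
collector.register("linux", fake_linux_fetcher)
backup = collector.collect([BackupTarget("web01", "linux", "ssh-key-placeholder")])
```

The design point is that supporting a new kind of source means registering one more fetcher on the collector, while the source machines themselves stay untouched.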
Let's look at several versions of agentless backup:
- File-based backup
- VM snapshot backup
- Application-aware backup
Agentless file-based backup
Agentless file-based backup is achieved using OS-based APIs for file access:
- For Windows® servers, appropriate credentials are provided to the data collector and it then uses Windows file services APIs to perform file-based backup.
- For Linux®- and Unix®-based servers, appropriate credentials are provided to the data collector and it then uses SSH or NFS protocols to perform file-based backup.
Figure 4 illustrates this.
Figure 4. Agentless file-based backup data access
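As a rough illustration of file-based collection, the sketch below assumes the source file system is already reachable from the data collector as a local path (for example, an NFS export that has been mounted). It copies only files modified since the previous run. The function name incremental_copy and the directory names are hypothetical, and comparing modification times is a simplification of how real products detect changes.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import List


def incremental_copy(source_root: Path, dest_root: Path, since: float) -> List[Path]:
    """Copy files under source_root whose mtime is newer than `since`.

    Returns the relative paths that were copied, in sorted order.
    """
    copied: List[Path] = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            src = Path(dirpath) / name
            if src.stat().st_mtime <= since:
                continue  # unchanged since the last backup
            rel = src.relative_to(source_root)
            dst = dest_root / rel
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also preserves timestamps
            copied.append(rel)
    return sorted(copied)


# Demonstration with temporary directories standing in for a mounted export.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "mounted_export"
    vault = Path(tmp) / "vault"
    (source / "etc").mkdir(parents=True)
    old_file = source / "etc" / "old.conf"
    new_file = source / "etc" / "new.conf"
    old_file.write_text("unchanged")
    new_file.write_text("changed")
    os.utime(old_file, (1_000, 1_000))  # pretend it was modified long ago
    os.utime(new_file, (5_000, 5_000))  # pretend it was modified recently
    copied = incremental_copy(source, vault, since=2_000)
    vault_has_new = (vault / "etc" / "new.conf").exists()
    vault_has_old = (vault / "etc" / "old.conf").exists()
```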
Agentless VM snapshot backup
Agentless backup of virtual cloud servers can be achieved through virtualization hypervisor APIs or through agentless file backup. You can back up VM snapshots by making calls to the hypervisor API that tell the hypervisor to snap a VM or multiple VMs, and then copying the snapshot image out. (In the case of VMware, only the changed blocks are copied after the first snapshot, using Changed Block Tracking, or CBT.)
This operation provides a complete VM image backup. Each VM image can be rolled back on demand. In failure scenarios, the image can be mounted and run directly on another cloud server.
Using hypervisor APIs or agentless file backup enables fast and flexible data recovery regardless of the hypervisor or operating system. Figure 5 demonstrates this.
Figure 5. Agentless image or snapshot backup data access
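The idea of copying only changed blocks after the first snapshot can be illustrated with a small Python sketch. Note the substitution: a hypervisor with changed block tracking records which blocks were written, whereas this example finds them by comparing per-block hashes. The outcome is the same (only changed blocks are transferred), but the mechanism is an assumption made for illustration, as are all the function names.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4  # tiny block size so the example is easy to follow


def block_hashes(image: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Return one SHA-256 digest per fixed-size block of the image."""
    return [
        hashlib.sha256(image[i:i + block_size]).hexdigest()
        for i in range(0, len(image), block_size)
    ]


def changed_blocks(previous: List[str], image: bytes,
                   block_size: int = BLOCK_SIZE) -> Dict[int, bytes]:
    """Return {block_index: data} for blocks that differ from the last backup."""
    delta: Dict[int, bytes] = {}
    for index, digest in enumerate(block_hashes(image, block_size)):
        if index >= len(previous) or previous[index] != digest:
            delta[index] = image[index * block_size:(index + 1) * block_size]
    return delta


def apply_delta(base: bytes, delta: Dict[int, bytes],
                block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the newer image from the older copy plus the changed blocks."""
    blocks = [base[i:i + block_size] for i in range(0, len(base), block_size)]
    for index, data in sorted(delta.items()):
        if index < len(blocks):
            blocks[index] = data   # overwrite a block that changed
        else:
            blocks.append(data)    # the image grew past the old end
    return b"".join(blocks)


first_snapshot = b"AAAABBBBCCCC"
second_snapshot = b"AAAAXXXXCCCCDDDD"

baseline = block_hashes(first_snapshot)
delta = changed_blocks(baseline, second_snapshot)
restored = apply_delta(first_snapshot, delta)
```

In this run only two of the four blocks need to be transferred. The sketch assumes the image never shrinks between snapshots; handling that case would require recording the new image length as well.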
Agentless application-aware backup
To minimize backup agents used in structured database application backup, agentless technologies integrate directly with application-specific APIs. Typically a thin client using the application-specific APIs is installed on the backup target system running that application. The thin client enables the data collector to tell the database APIs to quiesce (or pause) the database, flush the cache, complete the writes, dump the database data into a flat file, and then resume the database. This is known as a hot backup and eliminates requirements for taking the database offline.
Depending on the API's capabilities, backup and recovery can potentially target specific tables or items in the database. Some examples of database APIs include RMAN/SBT for Oracle, DB2 backup and restore API, and SQL Server SQLVDI.
Windows servers have a built-in API pause mechanism called VSS, or Volume Shadow Copy Service, which eliminates the need for a thin client. This is quite useful for Microsoft® applications, as well as other structured database applications modified to work with VSS. Figure 6 illustrates this.
Figure 6. Application-aware backup data access
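The quiesce, flush, dump, and resume sequence of a hot backup can be sketched as follows. FakeDatabase and its methods are hypothetical stand-ins that only record calls; real application APIs such as those named above have their own interfaces. The point of the sketch is the ordering and the guarantee that the database is resumed even if the dump fails.

```python
from contextlib import contextmanager
from typing import Iterator, List


class FakeDatabase:
    """Hypothetical stand-in for an application API; it only records calls."""

    def __init__(self) -> None:
        self.calls: List[str] = []
        self.rows = ["row-1", "row-2"]

    def quiesce(self) -> None:
        self.calls.append("quiesce")

    def flush(self) -> None:
        self.calls.append("flush")

    def dump(self) -> str:
        self.calls.append("dump")
        return "\n".join(self.rows)

    def resume(self) -> None:
        self.calls.append("resume")


@contextmanager
def quiesced(db: FakeDatabase) -> Iterator[FakeDatabase]:
    """Pause the database and flush its cache; always resume afterwards."""
    db.quiesce()
    try:
        db.flush()
        yield db
    finally:
        db.resume()  # runs even if the dump raises an exception


def hot_backup(db: FakeDatabase) -> str:
    """Return the flat-file dump taken while the database is paused."""
    with quiesced(db) as paused:
        return paused.dump()


database = FakeDatabase()
flat_file = hot_backup(database)
```

Using a context manager keeps the pause window as short as possible and makes it hard to forget the resume step.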
Now let's look at a real-world example of an agentless backup and recovery system we had a hand in designing — Asigra® Cloud Backup.
Asigra Cloud Backup on IBM SmartCloud
The genesis of agentless backup at Asigra started more than 25 years ago in 1986. Asigra's founder, David Farajun, set out to solve a problem: How to help businesses recover lost data. Farajun began development committed to five design principles:
- As little human involvement as possible
- Offsite storage of the backed up information
- Centralization of the backup of all the business' information
- Protect the computing environment
- Quick, reliable recovery
To minimize human involvement and simultaneously centralize all of a business' information, Farajun focused on creating a platform that could deliver backup and recovery services over dial-up modems. This drove the development of the industry's first agentless backup and recovery platform.
Asigra Cloud Backup software consists of two core components in the data path: DS-Client and DS-System.
- DS-Client is the agentless data collector installed on Windows, Linux, or Mac OS X operating systems running on a physical server appliance or virtual one. Each DS-Client backs up dozens to hundreds of physical and virtual servers or desktops and additional DS-Clients can be added and connected in a grid configuration for scalability, failover, and load balancing. DS-Clients can back up cloud servers running on SmartCloud Enterprise and they can also be placed on customer premises to back up local physical and virtual servers, structured database applications, and desktops.
Each DS-Client is designed to minimize backup requirements, backup windows, and WAN bandwidth. It starts with delta change block tracking to ensure only new and changed data is captured after the initial backup. Then, before data is sent to the backup vault, or DS-System (what most backup vendors call a media server), the data is deduplicated, compressed, and then encrypted with NIST FIPS 140-2 certified and compliant 256-bit data encryption, which can be configured separately for each DS-Client.
- DS-System is the multi-tenanted data vault that aggregates and stores all backup data sent from multiple DS-Clients. DS-System can be installed on Windows or Linux operating systems and can be located in the same or different data center as the DS-Clients. For further redundancy, replicated DS-Systems can be installed at remote data centers for disaster recovery. A single DS-System can aggregate backup data from dozens to hundreds of DS-Clients and additional DS-Systems can be added and connected in an N+1 configuration for scalability, failover, and load balancing.
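To make the client-side data reduction steps more concrete, here is a generic Python sketch of deduplication followed by compression. It is not Asigra's implementation; the function names, the fixed chunk size, and the use of SHA-256 and zlib are all assumptions chosen for illustration. Encryption is left as a comment because the Python standard library does not include a suitable cipher; in a real pipeline it would follow compression.

```python
import hashlib
import zlib
from typing import Dict, List, Tuple

CHUNK_SIZE = 8  # small fixed-size chunks for illustration


def reduce_for_transfer(data: bytes,
                        chunk_size: int = CHUNK_SIZE
                        ) -> Tuple[List[str], Dict[str, bytes]]:
    """Split data into chunks and keep one compressed copy per unique chunk.

    Returns (recipe, store): the recipe lists chunk digests in order, and the
    store maps each digest to its zlib-compressed chunk.
    """
    recipe: List[str] = []
    store: Dict[str, bytes] = {}
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        recipe.append(digest)
        if digest not in store:                   # deduplication
            store[digest] = zlib.compress(chunk)  # compression
            # An encryption step would go here in a real pipeline.
    return recipe, store


def reassemble(recipe: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild the original data from the recipe and the chunk store."""
    return b"".join(zlib.decompress(store[digest]) for digest in recipe)


payload = b"ABCDEFGH" * 3 + b"12345678"
recipe, store = reduce_for_transfer(payload)
```

Here the payload splits into four chunks, but only two unique chunks need to be stored and sent, which is where the WAN bandwidth saving comes from.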
Data backup on its own is useless without reliable data recovery. To improve data integrity, DS-System runs an autonomic healing process, an automated health check that verifies data integrity and automatically fixes various data corruption issues. DS-System also automatically restores data in the background to validate that backed-up data is always recoverable.
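The general idea behind an automated integrity check can be sketched as recomputing a checksum for each stored item and comparing it with the digest recorded at backup time. The source does not describe how the healing process works internally, so everything here is an assumption: the function names, the use of SHA-256, and in particular the repair step, which simply takes a verified copy from a second store.

```python
import hashlib
from typing import Dict, List


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def find_corrupt(vault: Dict[str, bytes], catalog: Dict[str, str]) -> List[str]:
    """Return names whose stored bytes no longer match the recorded digest."""
    return sorted(
        name for name, expected in catalog.items()
        if name not in vault or digest(vault[name]) != expected
    )


def heal(vault: Dict[str, bytes], catalog: Dict[str, str],
         replica: Dict[str, bytes]) -> List[str]:
    """Replace corrupt items with replica copies that pass verification."""
    repaired: List[str] = []
    for name in find_corrupt(vault, catalog):
        candidate = replica.get(name)
        if candidate is not None and digest(candidate) == catalog[name]:
            vault[name] = candidate
            repaired.append(name)
    return repaired


catalog = {"a.txt": digest(b"alpha"), "b.txt": digest(b"beta")}
vault = {"a.txt": b"alpha", "b.txt": b"bXta"}    # b.txt is damaged
replica = {"a.txt": b"alpha", "b.txt": b"beta"}  # assumed second copy

corrupt_before = find_corrupt(vault, catalog)
repaired = heal(vault, catalog, replica)
corrupt_after = find_corrupt(vault, catalog)
```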
Let's look at some of the workload topologies of Asigra Cloud Backup images on SmartCloud Enterprise, then show you how to get started on SmartCloud.
Asigra Cloud Backup images
Asigra Cloud Backup software comes pre-loaded on IBM SmartCloud Enterprise as part of the public image catalog. Asigra DS-Client, DS-System, and other Asigra tools and management software are included in the public images. The software images can be arranged into various topologies to support a variety of backup use cases.
Backup of SmartCloud cloud servers
In the example illustrated in Figure 7, one DS-Client is deployed for Windows backup and another DS-Client for Linux backup. Though not required, both DS-Clients, as well as the Windows and Linux servers, are on the same VLAN.
Figure 7. Deployment for backing up SmartCloud cloud servers in the same data center
The DS-Clients deduplicate, compress, and encrypt the data before sending the backup data to the DS-System(s). The DS-System is by default on the same VLAN and at the same SmartCloud data center, but can optionally reside at another SmartCloud data center (see Figure 8) for geographic separation or as a secondary copy of backup data.
Figure 8. Deployment for backing up SmartCloud servers to a different data center
Backup of servers and devices outside of SmartCloud into SmartCloud
The primary difference between this example and the previous one is that the DS-Client(s) used to back up remote physical and virtual servers, structured data applications, and desktops reside outside of SmartCloud. Each DS-System can support multiple DS-Clients with no restriction on their physical location, provided there is network connectivity between them.
In Figure 9, the DS-Clients reside on the local network of the machines they need to back up. For mobile devices such as laptops, smart phones, and tablets, a stripped down DS-Client resides on the device as an application.
Figure 9. Deployment for backing up servers and devices outside of SmartCloud into SmartCloud
Getting started on SmartCloud
This section provides an overview of the major steps required to start backing up your SmartCloud server instances using Asigra Cloud Backup. For detailed step-by-step instructions, references to various Asigra user guides are provided.
Basically, it goes like this:
- Create an Asigra instance.
- Handle licensing.
- Register a DS-Client account.
- Create a backup set.
- Run backups.
- Restore data.
Step 1. Create new Asigra instance(s) from the public image catalog
Figure 10 shows the Asigra instances in the SmartCloud public image catalog:
Figure 10. Asigra instances in SmartCloud public image catalog
There are two Red Hat Linux images and one Windows image:
- Asigra Cloud Backup v11.2 64b (BYOL): This image is intended as a base Linux configuration which can be configured as a DS-Client or DS-System, as well as other Asigra software components. Installation and configuration instructions are provided in <PATH> in the image.
- Asigra Cloud Backup v11.2 demo 64b (BYOL): This image is intended for backup of small data loads and demo environments. All major Asigra software components come pre-installed and configured.
- Asigra Cloud Backup v11.2 Windows 64b (BYOL): This image is intended as a base Windows configuration which can be configured as a DS-Client or DS-System as well as other Asigra software components. Installation and configuration instructions are provided in <PATH> in the image.
We're using the second one since both the DS-System and DS-Client are pre-installed.
Step 2. Connect to a license server
Asigra Cloud Backup software is licensed as BYOL (Bring Your Own License) on SmartCloud. You must obtain your license from Asigra or your Asigra service provider or reseller which will allow you to connect your DS-System to a license server.
To connect to a license server, log in remotely to your DS-System server and launch the DS-Operator GUI. In DS-Operator, navigate to Setup menu > License Server and enter the Primary License Server IP address or DNS name (and optionally the Emergency License Server IP address or DNS name; see Figure 11):
Figure 11. License server dialog box input from DS-Operator GUI
Step 3. Create and register DS-Client account
For a DS-Client to connect to a DS-System, a DS-Client account needs to be created and registered on the DS-System via the DS-Operator GUI. (There is more information on this in Section 4 on DS-Clients in the DS-Operator Manual included with the image.)
For the demo image we are referring to there is already one DS-Client account registered with the DS-System.
Step 4. Create a backup set
Agentless backup is initiated from the DS-Client. The DS-User GUI is used to configure and manage DS-Clients. Log in to the DS-Client server and launch the DS-User GUI. Various applications are supported by the Windows and Linux versions of DS-Client.
The New Backup Set Wizard walks you through a step-by-step procedure to create and configure a backup set (Figure 12).
Figure 12. New Backup Set Wizard for both Windows and Linux from DS-User GUI
The wizard lets you choose a backup set type, enter server and application credentials, specify items to back up, specify options, and configure retention rules. For more information, refer to Section 4 of the included DS-Client User Guide on Creating and Modifying Backup Sets.
Step 5. Running backups
Backups can be scheduled to run at pre-determined times or can be triggered to run on-demand from the DS-User GUI. Backup scheduling and on-demand backup are described in detail in the on-cloud DS-User Guide's Section 3 and Section 8 respectively. The window looks like Figure 13; Figure 14 shows the on-demand backup wizard.
Figure 13. Backup Schedule window from DS-User GUI
Figure 14. Backup Now on-demand backup wizard
Step 6. Restoring data
The DS-User GUI provides a restore wizard to restore data. Depending on the type of backup set and original source data, there are a variety of options available on what (individual files, directories, databases, tables, etc.), where (original or alternative locations), and how (performance and restore options) to restore your data. For more information please refer to the on-cloud DS-Client User Guide, Section 9: Restoring Backups. Figure 15 illustrates this.
Figure 15. Restore Now wizard from the DS-User GUI
This is just a simple example of backup and recovery on SmartCloud with Asigra Cloud Backup. The software provides a comprehensive feature set for backup and recovery that is not discussed in detail in this article.
Backup has never been a fun task for administrators, and the new opportunities provided by cloud computing bring new challenges for backup. Though there are many ways to protect and back up data, most traditional approaches are not well suited for the cloud. Asigra Cloud Backup on IBM SmartCloud Enterprise, with its automated and agentless approach to backup, provides significant advantages in addressing the challenges and requirements of cloud server backup. It provides users with a variety of options to back up and recover servers and applications running on SmartCloud, as well as the ability to extend deployments to back up servers and devices outside of SmartCloud into SmartCloud.