February 27, 2013 | Written by: Yaxiao Liu
“Is my data safe?” I think this is the most asked question in the cloud business. Most people ask it to make sure their data will not be stolen by either an intruder or the operator. However, there is another side to data safety: data backup. Who should be responsible for data backup? How should the data be recovered, or imported at start-up?
Data backup and recovery
In the non-cloud world, data backup and recovery has a long history. Many solutions and software products have been deployed to carry out such tasks. I will not cover those complex architectures or solutions here. Basically, there are three types of data backup in everyday practice:
- File-system backup, which protects and recovers the basic installation or configuration files and, in most cases, the contents of the file systems.
- Database DB file, which is widely used in most database backup cases. The database backs up its transactions to ensure a safe checkpoint for recovery. In this scenario, the backup file covers data, user relations and even stored procedures. As a physical backup, it can only be restored to the same database runtime.
- Database import/export, which acts as the ‘logical’ part of data backup. Data can be exported from the database instance as an XML file or another data file. The data relationships are still maintained in this file, so it can be imported into other databases on different systems. However, some of the database schema definition will be lost.
In real architectures, I combine different kinds of backup solutions. For example, I use data export to transfer data to a data requestor, and I use a combination of file-system backup and database DB files to ensure the integrity of the architecture's deployment units.
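To make the difference between a physical and a logical backup more concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for whatever database you actually run, and the table and column names are hypothetical. The backup() call copies the database as a whole, while iterdump() exports portable SQL text that can be replayed into a different database.

```python
import sqlite3

# Hypothetical source database with one table (names are illustrative only).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
source.executemany("INSERT INTO orders (item) VALUES (?)", [("bolt",), ("nut",)])
source.commit()

# Physical-style backup: copy the database pages into another SQLite database.
physical_copy = sqlite3.connect(":memory:")
source.backup(physical_copy)

# Logical export: dump schema and rows as portable SQL text.
logical_dump = "\n".join(source.iterdump())

# Logical import: replay the SQL text into a fresh database.
imported = sqlite3.connect(":memory:")
imported.executescript(logical_dump)

physical_rows = physical_copy.execute("SELECT item FROM orders ORDER BY id").fetchall()
imported_rows = imported.execute("SELECT item FROM orders ORDER BY id").fetchall()
```

The physical copy is tied to SQLite itself, whereas the SQL text from the logical export could, with some adjustment, be loaded into another database product.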
Cloud backup and backup in cloud
Cloud backup means that you have data outside the cloud, and you use the cloud as a backup center for such data.
Cloud backup is one of the most popular features of the cloud. While Amazon EC2 is often cited as a successful cloud business, there are other kinds of cloud as well, such as Apple iCloud and Google Docs.
This kind of cloud gives users cloud backup capabilities for mobile devices or for desktop documents and content. Users can synchronize, store, retrieve and share their data in such clouds.
Enterprises can use the same scheme. An enterprise user could apply for a virtual server in IBM SmartCloud Enterprise (IBM SCE) and back up all of his data, either logical DB exports or physical DB files, to that virtual server. This may be cheaper than traditional tape or virtual tape library (VTL) solutions, because you don’t need to pay for the virtual servers when you are not using them for data transmission.
The cloud backup solution brings a lower total cost of ownership to non-critical solutions, such as home office data, sales data archives and content like pictures.
Backup in cloud
Backup in cloud means you have data in the cloud. How can you make sure that your data is properly backed up?
There are different solutions for backup in cloud. For example, IBM offers Tivoli Storage Manager (TSM) in IBM SmartCloud Enterprise (IBM SCE) so that users can back up data from their production virtual machine to the storage of a backup and restore virtual machine.
However, things are more difficult in other scenarios. Different data models lead to different data backup solutions and to different responsibilities among the services creator, services provider and services consumer.
Let’s have a look at a software as a service (SaaS) model diagram:
Figure 1 Data Models in SaaS Environment
There are three major parties in this context:
- Services Operator: It provides the SaaS environment with infrastructure, BSS/OSS and other cloud-enabled middleware.
- Services Creator: It provides the business applications that run on the Operator’s SaaS environment. For example, the Operator may be Google App Engine, Service Creator 1 may be an ERP solution provider and Service Creator 2 may be a Lotus Live services provider.
- Tenant: I use ‘Tenant’ instead of Services Consumer because a tenant may have its own traditional business users at the same time. The tenants rent applications such as ERP or Lotus Live and let the users inside the tenant use them.
The tenant should be responsible for overall data safety. However, since the services creator provides the whole application directly, the tenant has little opportunity to know how many databases really exist.
If the services creator is responsible for data backup, the creator needs to rely on the operator’s infrastructure to back up his tenants’ databases. But the tenant would not want the creator to know its data logic.
The operator cannot provide the full service because he doesn’t know which servers are database servers or how the databases are interconnected. What he can do for the others is a full file-system backup.
From this scenario, you can see how the data models and responsibilities are distributed. The operator should provide data backup/restore capabilities for the creator. The creator should provide data import/export capabilities for the tenants. And the tenant should be responsible for his logical data.
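The distribution above can be written down as a small lookup structure, which is sometimes handy when a team wants to record who owns which capability. This is only an illustrative sketch: the Responsibility class, the scope descriptions and the responsible_party helper are my own hypothetical names, not part of any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Responsibility:
    party: str     # who is responsible
    provides: str  # what that party provides
    scope: str     # hypothetical description of what it covers


RESPONSIBILITIES = [
    Responsibility("operator", "backup/restore capabilities",
                   "file systems and infrastructure used by the creator"),
    Responsibility("creator", "import/export capabilities",
                   "application data exposed to the tenants"),
    Responsibility("tenant", "ownership of logical data",
                   "the tenant's own business data"),
]


def responsible_party(capability: str) -> str:
    """Return the party whose 'provides' entry mentions the given capability."""
    for entry in RESPONSIBILITIES:
        if capability in entry.provides:
            return entry.party
    raise KeyError(capability)
```

For example, responsible_party("import/export") returns "creator", matching the split described above.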
The SaaS model is just a demonstration from the point of view of user data. Some may think that data backup/restore is important but rarely used, kept around just in case. I see it differently. Backup and restore functions are also useful for:
- Data initialization. When tenants first put the application into runtime, the restore function, and especially the import function, can help the tenant load data from any existing data store. This turns the runtime into a real production environment.
- Application cloning. When the tenant would like to expand his concurrent capacity, he can use database restore to set up a cloned database quickly.
- Application development and test. The creator can develop a customized solution based on tenants’ requirements. In this case, he can set up a development and testing database on the cloud.
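As a rough illustration of the data initialization and cloning cases, the sketch below imports a CSV export into a fresh database and then clones it for testing. It again uses Python's built-in sqlite3 as a stand-in; the inline CSV, the stock table and the initialize_from_export function are hypothetical and chosen only for the example.

```python
import csv
import io
import sqlite3

# Hypothetical export from an existing data store (inline CSV for illustration).
EXISTING_EXPORT = "sku,quantity\nA-100,12\nB-200,7\n"


def initialize_from_export(conn, export_text):
    """Create the table and load rows from a CSV export; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stock (sku TEXT PRIMARY KEY, quantity INTEGER)"
    )
    reader = csv.DictReader(io.StringIO(export_text))
    rows = [(r["sku"], int(r["quantity"])) for r in reader]
    with conn:  # commit on success, roll back on error
        conn.executemany("INSERT INTO stock (sku, quantity) VALUES (?, ?)", rows)
    return len(rows)


# Data initialization: import existing data into the new runtime database.
production = sqlite3.connect(":memory:")
loaded = initialize_from_export(production, EXISTING_EXPORT)

# Application cloning or test setup: copy the initialized database.
test_copy = sqlite3.connect(":memory:")
production.backup(test_copy)
total = test_copy.execute("SELECT SUM(quantity) FROM stock").fetchone()[0]
```

The same import path serves the first bullet, and the copy step serves the second and third.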
Backup in cloud is a shared responsibility. Successful protection of data may involve many parties, so it is very important to agree on a safe backup-in-cloud approach before we start our work. Here I provide a sample table of the responsibilities and sample work:
Backup in cloud goes well beyond what I have just discussed. For example, suppose the creator provides some ‘standard’ ERP services. Commonly, the tenant uses the services as part of his business processes. In many cases, he will develop a reporting system or integrate the cloud ERP with a Warehouse Management System (WMS). The integration helps the ERP system get real-time stock status and generate real-time business reports.
Such a setup may require integrating different backup/restore strategies. The strategies may include:
- Integration of different backup tools
- Synchronization of different backup/archive copies
- Integrity check and restore for distributed databases
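One simple building block for the synchronization and integrity items is comparing checksums of the copies. The sketch below is a minimal illustration using Python's hashlib, assuming each copy can be read as bytes; the site names and the find_mismatched_copies function are hypothetical, and a real distributed database would need far more than this.

```python
import hashlib
from typing import Dict, List


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a backup payload."""
    return hashlib.sha256(data).hexdigest()


def find_mismatched_copies(reference: bytes, copies: Dict[str, bytes]) -> List[str]:
    """Return the names of copies whose digest differs from the reference."""
    expected = digest(reference)
    return sorted(name for name, data in copies.items() if digest(data) != expected)


# Hypothetical copies held at different locations.
copies = {
    "site-a": b"ledger-v1",
    "site-b": b"ledger-v1",
    "archive": b"ledger-v0",
}
stale = find_mismatched_copies(b"ledger-v1", copies)
```

Here stale lists the copies that are out of sync with the reference and would need to be refreshed or restored.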
This discussion may lead to a complex architecture design. I have discussed that in another post: Architecting your solutions to cloud.