We spend a lot of time at customer and user group speaking engagements presenting the latest and greatest features in DB2 Universal Database for Linux, UNIX®, and Windows® (DB2). It never fails: we're presenting on a topic and ask the audience if they are familiar with a new key feature, and most of the folks haven't heard about it, even though it's been out for a while. In this article, we'll take you on a tour of the BACKUP utility and how it works in DB2. We'll cover its internals, and the new features that have been added along the way to make it faster and more feature-rich.
Why back up?
There are lots of reasons why you should run regular backups (and test their recoverability). If the fact that the data contained in the backup is the lifeline of your business doesn't do it for you, then think of it as your job.
Backups are required for recovering from application errors, copying a database (for example, populating development or test systems), moving a database to new hardware, migrating to a new level, ensuring recoverability before and after software update protection, setting up some sort of disaster recovery (DR) or high availability (HA) topology, and so on.
Interestingly enough, the biggest reason why people back up today is to ensure the ability to recover from application errors. It's fair to say that today's hardware (H/W) is fairly safe. For example, dual power supplies, RAID, dual controllers, and so on are pretty much the norm and can all but guarantee that you're unlikely to suffer a H/W outage if you've set everything up correctly (there are obvious exceptions). But how do you protect against human error?
Why a DB2 backup and not a file system backup?
We hear this question a lot from new database administrators (DBAs). The main reason is that DB2 is very aggressive in trying to keep a hot cache (essentially, keeping the data your application needs in memory). When you add in the proliferation of the 64-bit model, this is a trend that isn't slowing down (nor should it: the more data you can place in a processor's L1, L2, or L3 cache, or in main memory, the faster the workload goes). Essentially, you want aggressive data caching because it keeps trips to disk to a minimum and avoids expensive I/O cycles.
If you were to back up the files on a file system while DB2 was running, you could definitely run into inconsistencies, and DB2 would not be able to guarantee that you could recover your data. For example, running a file system copy while the database is running won't give you a snapshot of your database at a particular point in time (PIT). Unless you take the entire DB2 instance offline, you should always use a DB2 backup to guarantee your data's consistency; without one, there are no guarantees.
As if that wasn't enough, with a DB2-based backup you can take advantage of its online capabilities, which permit DDL and DML during the backup process, so that business operations can continue as normal. You also have granular control, since you can back up at the table space level. This lets you isolate key tables for backup, leaving out tables that don't need to be backed up at all, or as often.
A DB2 backup aids in recoverability too. You can roll forward through the logs to a PIT of your choice. In other words, you have granular control with respect to how you want your system to look when it "comes back to life", in contrast to what you would get with a static snapshot (that is likely inconsistent) using the file system method.
DB2 also enables "subset" recovery. For example, if you back up five table spaces, you can choose to restore just the one table space whose media failed. A file system backup, on the other hand, is all or nothing.
About DB2 backup
The DB2 backup utility is fully integrated into the DB2 engine; it is not an add-on utility. As we've already mentioned, the DB2 backup utility has granular capabilities. Backup images can be any combination of:
- database or list of table spaces
- offline or online
- full, incremental, or delta
The backup utility has a number of "knobs" that can be used to tune backup (though as of DB2 V8.2, this can be done for you -- more on that later). For example, some of the parameters you can set include how many processes will be used to read or write from the database, the number and size of the buffers used to write to the target media, and so on.
The backup utility in DB2 makes physical copies of the data pages. It is not a file system copy; it operates at the database level and understands the database's structure. A DB2 backup image includes additional information beyond the data, including metadata, the database configuration, the history file, table space definitions, and more.
When you back up a system, DB2 reads the data from the disk into its input/output (I/O) buffers and writes them out to your destination devices or some third party storage management software (such as IBM Tivoli Storage Manager). As of DB2 V8.2, for an online backup, the log files are part of the backup image (obviously, an offline backup wouldn't need the log files to restore). It is also worth noting that temporary table spaces and free DMS extents are not part of the backup image. The DB2 backup utility also has a compression feature (as of DB2 V8.1.4) and throttling capabilities (as of DB2 V8.2). We'll discuss these features later in this article.
DB2 has a very efficient and optimized backup utility that multiplexes the data pages it reads from the database and writes them to the target device in a randomized order. In other words, the data pages are not in any order that resembles their table association on your archive media. DB2 does this to optimize the performance of the BACKUP utility (we assume you are performing more backups than restores). The DB2 backup utility also supports raw devices.
There are three different kinds of backups available in DB2:
- A full backup gives you a complete copy of the database (sometimes it is referred to as a level 0 backup).
- An incremental backup grabs all changes since the last full backup image (sometimes called a level 1 backup).
- Finally, a delta backup captures anything that has changed since the last backup of any kind (sometimes called a level 2 backup).
These types of backups can be done at a database or table space level, and either online or offline, as long as the database is properly configured to support such actions.
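For instance, the flavors above map onto the BACKUP command as follows (SAMPLE and the table space name are placeholders, and an online backup also requires archive logging to be enabled):

```shell
db2 backup db sample                                  # full offline database backup
db2 backup db sample online to /backups               # full online database backup
db2 backup db sample tablespace (userspace1) online   # online backup of a single table space
```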
The backup process model
It's worth spending some time discussing the DB2 backup process model. It helps you to understand the performance of your system if you know what the DB2-spawned processes are doing. Figure 1 illustrates the backup processes in DB2:
Figure 1. The backup process model.
On the left you can see the DB2 table spaces and their associated containers. When you invoke the BACKUP utility, the db2agent process is spawned to control the flow between the buffer manipulators (the db2bm processes are used to read data off disk and into shared memory) and the db2med process (which reads from shared memory and writes the pages out to the target devices).
There are no restrictions on how fast these processes will run; however, you can control them in relation to your environment's workload by using the DB2 throttling feature. To architect this utility to run as fast as possible, the buffer manipulators have been coded such that they don't have to hand data to a specific device controller. Think of it as a "race": DB2 doesn't care in what order the pages are placed on the backup media, just how fast it can get them there.
There is, however, one level of association: each table space will be assigned to one process that is solely responsible for processing all the data in that table space. The number of buffer manipulators is controlled by the parallelism option used when invoking the BACKUP utility. For example, if you set this option to 2, you will have two db2bm processes, each reading a separate table space in parallel.
The db2med processes spawned are equal to the number of targets you give it. For example, with Tivoli Storage Manager, if you wanted to open three sessions, DB2 sets up three streams to the Tivoli server. This helps DB2 drive parallelism to the archive media.
If you are backing your data to a file system, and that file system is a virtualization of multiple disks, you should specify the mount point multiple times. For example, in a DB2 for Windows environment, you would enter the command as follows:
Listing 1. Backing up your data to a file system when that file system is a virtualization of multiple disks
backup database sample to c: c: c:
In this case, DB2 would spawn three db2med processes to the archive media and write the data pages from the db2bm processes to them in parallel.
Incremental backup first made its way into the product in the V7.2 release. We've been seeing the popularity of this type of backup increasing since it was first introduced -- especially for data warehouses where a small percentage of the data changes.
Incremental backup allows you to back up only index and data pages that have changed since a previous backup. The exception is for long field and large object data in "dirty" table spaces, which are always backed up. There is no incremental support for this type of data because these data types have a different physical format than our index and data pages, and currently DB2 cannot determine at backup time if that data has changed. This exception will be removed in a future release of DB2.
Figure 2 illustrates the different kinds of partial backups available in DB2:
Figure 2. Incremental and delta backups.
From this figure you can see that an incremental backup is really based on the last full backup. This means that an incremental backup taken on Tuesday would include all of Monday's and Tuesday's changes since the full backup (in our example) was performed on Sunday.
Delta backups are based on the last incremental or delta backup. For a delta backup, you need to maintain all the backups taken since the last full backup in order to be able to reconstruct the data. For example, to recover data to the end of business day Wednesday, you would need the delta backup images for Monday, Tuesday, and Wednesday (or Wednesday's log files). If you took an incremental backup on Tuesday, then you would only need the incremental image from Tuesday, and Wednesday's delta backup image (or log files).
In addition to the changed pages, an incremental backup will include the database's metadata (database configuration, history file, table space definitions, and so on) to assist in recovery. This metadata is not incremental; it is a full copy each and every time.
By default, a DB2 database is not configured for incremental backups, because enabling DB2 to perform this kind of backup incurs a small impact on runtime performance. To enable this kind of backup, set the TRACKMOD database configuration parameter to ON (changing this parameter won't take effect until the next database activation).
When TRACKMOD is enabled, the first write operation will mark the data's hosting table space as dirty. If the table space isn't dirty, DB2 won't even look at it when a backup is started. If DB2 sees a dirty bit for a table space, it will go and inspect the extents within the marked table space (they are also marked with dirty bits) and ultimately only pull changed data pages into the backup image. The tracking feature used to support incremental backup is entirely internal and doesn't require any storage considerations.
An incremental backup is not permitted until a non-incremental backup is taken to set a base from which it can recover -- this is to support incremental restore which always requires a non-incremental base image.
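Putting the pieces together, a minimal sequence to enable and use incremental backups might look like this (the database name and the deactivate/activate steps are illustrative):

```shell
db2 update db cfg for sample using TRACKMOD ON   # takes effect at the next activation
db2 deactivate db sample
db2 activate db sample
db2 backup db sample                   # non-incremental base image (level 0)
db2 backup db sample incremental       # changes since the last full backup (level 1)
db2 backup db sample incremental delta # changes since the last backup of any kind (level 2)
```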
Online backup considerations
DB2 can perform an online or offline backup. Online backups can run during normal SELECT, INSERT, DELETE, and UPDATE activity against the database. The only restriction when running an online backup in DB2 is that you cannot drop a table space while that table space is actively being backed up.
With an offline backup, DB2 knows that it's the only application reading from the database, and therefore it doesn't need to worry about locks. With an online backup, things are a little different. DB2 has to implement a locking strategy for an online backup. For large objects and long field data, DB2 escalates Intent None (IN) locks to Share (S) locks and therefore is approximately 10% slower.
An online backup will likely require some more memory from the UTIL_HEAP memory allocation to allocate some internal structures that help support this operation.
The database history file
The database history file is becoming a more and more crucial part of the database engine. The database history file is a record of administrative operations. It can be individually restored from any type of backup image (it is part of the metadata we detailed earlier). Events recorded in this file include operations such as backup, restore, rolling forward of the logs, loads, quiescing of the database or table space, table space alters, and dropped tables (when dropped table recovery is enabled). The information recorded about these operations includes the objects affected (database, table space, or table), the location and device type (backup image or load copy), the range of relevant log files, the start and completion time of the operation, and the resultant SQLCA code. Previously, this was an informational file that you could query. DB2 now uses this file to support recoverability features such as automatic restore, and the new log manager uses it as well.
This information is kept in a file and not in a DB2 table because the information is needed for recovery. If the database was ever not available, we could not leverage it for database recoverability. Therefore, the database history is stored in an ASCII file and included in the backup image where we can retrieve and process it.
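You can inspect the history file yourself with the LIST HISTORY command; for example, to see the recorded backup operations for a database named SAMPLE:

```shell
db2 list history backup all for sample
```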
Third party backup vendor support
The media processes that DB2 uses to write data out during a backup are built on a published set of interfaces that have been available to the open market since 1993. This has led to broad DB2 support for today's most popular backup vendors, including IBM Tivoli Storage Manager (TSM), Veritas NetBackup, Legato NetWorker, Computer Associates, and more.
Any vendor can use these interfaces to integrate their archive solution into DB2, as shown in Figure 3:
Figure 3. The DB2 backup interfaces.
When "plugged" into a vendor's archive software, instead of writing the backup information to a file, DB2 will write the backup data to these interfaces, sent as a stream of bits directly to the target archive server.
For example, if you're using TSM, then DB2 will load the TSM API, and so on. These libraries are loaded directly into the DB2 kernel and run in our address space. You don't have to worry about the quality of the vendor's plug-in code (as of DB2 V7+FP11), as DB2 will protect the instance address space from a failure in a partner's code. In fact, DB2 saves the state of the signal handlers before every operation and resets them afterwards. This means that even if the vendor's code traps, the database engine will not go down (obviously, the backup operation itself will fail).
DB2 has a long history of integration with Tivoli Storage Manager. In fact, DB2 was the second application to ever add support for the TSM API. Because of its long standing history with Tivoli (and the fact it's an IBM product), we ship direct support for TSM free of charge.
Tivoli Storage Manager hints and tips
Setting up DB2 to use TSM is simple.
First, you need to run the dsmapipw utility (as a user with administrative authority) to set the TSM password. This utility is located in the sqllib\adsm directory. This utility encrypts and stores the TSM password for the node on disk. Failure to perform this step alone is responsible for about 60% of the problems that people run into when using TSM and DB2. If the password isn't set correctly, you will receive a 137 error code.
Next, export the DB2 TSM-specific environment variables. There are three environment variables that can be set:
- DSMI_DIR is the directory where the TSM client is installed.
- DSMI_CONFIG is the fully qualified name of the TSM client options file (dsm.opt).
- DSMI_LOG specifies the files where any errors are written.
All of these settings are captured when the instance is started, so if you change any of them (including the first time you set up DB2 to work with TSM), you have to restart the instance. Note that this applies only to the environment variables themselves: if you change settings inside the TSM configuration file (for example, which TSM server you want to communicate with), you don't need to restart the database engine. These environment variables are typically located in the DB2 instance's user profile.
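On UNIX, a typical setup in the instance owner's profile might look like the following (the paths shown are examples and will vary by platform and TSM client version):

```shell
export DSMI_DIR=/usr/tivoli/tsm/client/api/bin     # where the TSM API client is installed
export DSMI_CONFIG=/usr/tivoli/tsm/client/api/bin/dsm.opt
export DSMI_LOG=/home/db2inst1/tsmlogs
db2stop
db2start                                           # restart so the instance picks up the variables
```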
TSM uniquely identifies all backups with a timestamp. DB2 does not use the expiration policies on the TSM server. This is an important consideration to keep in mind because it means that your backups will not expire and you'll need to devise a plan for handling this.
In DB2 V7, instead of deleting backups, DB2 would mark them inactive, so you had to set up TSM to retain inactive copies (not the default). This was changed in DB2 V8: now, when you want a backup deleted, DB2 will delete it regardless of the TSM management class definition.
In DB2 V7, if you backed up a database on your node using TSM, only the user executing the backup could restore it. This caused issues in a DB2 V7 environment in that, to go to another server, you had to "masquerade" as the original node and know its password. You had to create an empty database on the target node; set the TSM_NODENAME, TSM_PASSWORD, and TSM_OWNER database configuration parameters; set PASSWORDACCESS=PROMPT; and comment out NODENAME in the dsm.opt file.
In DB2 V8.2, this complexity has been solved with the addition of "vendor options" support. This feature allows you to send optional parameters directly to the TSM API, including the node name from which the backup image was generated as well as the original DB2 instance ID. You are no longer required to know the original node's TSM password. Restoring a backup image to any node requires two steps:
- After generating the backup image you must grant access to this image to the required nodes by using the db2adutl grant option. For example:
db2adutl grant newuser on nodename newhost for db mydb
- On the new target node, you must use the OPTIONS field on the RESTORE command. For example:
db2 restore db mydb use tsm options '-fromnode=originalnode -fromowner=originalinstance'
Finally, there are enhancements in DB2 V8 to how you manage DB2 backups on the TSM server with the db2adutl utility. Previously, managing backups caused some complexity because of the DB2 naming convention: DB2 names the backup files, so you cannot rely on the TSM management class to manage them for you, since they are uniquely named and will never expire.
The db2adutl command has seven options:
- DELETE: Used to mark backups inactive (DB2 version < 8), delete backups (DB2 version 8), and delete logs. For DB2 version < 8, TSM will flag the inactive DB2 database backup objects for deletion based on the backup copy group definition in the associated management class. When the TSM Expire Inventory command runs, the objects flagged for deletion will be removed from the TSM server. For db2adutl to be able to remove database backups, the Backdel parameter must be set to yes for the TSM node. To erase DB2 logs that have been saved on the TSM server, the Archdel Yes parameter would need to be set for the node. These parameters can be specified when registering the TSM node, or by performing an update of the node information on the TSM server.
- QUERY: Lists all or specific DB2 objects created on that node. The Show inactive clause can be used to view backup images that have been marked inactive.
- EXTRACT: Creates disk images from TSM objects. If a database image is marked inactive, it can no longer be restored directly; however, it can still be extracted from the TSM server, and the extracted image can be used to perform a restore. Note that in DB2 version 8, when db2adutl is used to delete an object, it is not just marked inactive but is set for immediate expiration from the TSM server.
- VERIFY: This option has been shipping with the db2adutl utility since V6. It allows the image to be read from the TSM server and verified. It's basically a way to run the db2ckbkp utility (formerly known as dumpimage) through TSM to check whether the image will be restorable.
- GRANT: Gives a specified user on a specified host access to the DB2 backup images and log files associated with databases generated on the node issuing the GRANT option.
- REVOKE: Removes a user's or host's access to the DB2 backup images and log files associated with the specified or ALL databases generated on the node issuing the REVOKE option.
- QUERYACCESS: Queries the access that has been granted to the DB2 backup images and log files associated with the specified or ALL databases generated on the node issuing the QUERYACCESS option.
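For example, to list and then verify the backup images stored on the TSM server for a database named SAMPLE (the timestamp shown is a placeholder):

```shell
db2adutl query full db sample
db2adutl verify full taken at 20040601120000 db sample
```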
For completeness, the following table shows the support for TSM and DB2 V7:
Figure 4. TSM support in DB2 V7.
In DB2 V8, this table looks like:
Figure 5. TSM support in DB2 V8.
What's new in BACKUP in DB2 V8.1 and beyond
In this section we'll discuss some of the new features introduced in DB2 V8.1, and others that subsequently followed.
In DB2 V8.1.2, we enabled the BACKUP utility to be throttled. This is a tremendously powerful feature that DBAs trying to adhere to stringent service-level agreements (SLAs) are bound to want to try out.
Let's first start by defining the problem. As we've outlined in this article, the DB2 BACKUP utility was designed with one thing in mind: performance. It's exactly what our customers asked us to do. Customers asked for performance because they were faced with shrinking batch windows and had to fit more and more maintenance into smaller and smaller maintenance windows. Then the world changed. The business environment morphed into a continuously available one, 24x7, never stopping for anyone or anything.
Ironically, the same customers that asked us to make our BACKUP utility as fast as possible now asked us to figure out a way to make it run slower. Enter utility throttling.
Utility throttling lets you define a policy for the BACKUP utility such that this utility will not impact the production workload by a certain percentage of "normal operations" -- in other words, it limits the resources that can be used by DB2. You can speed up, slow down, or pause the utility in response to varying factors in your business environment.
Figure 6 shows an example of the effect of throttling a utility such as backup.
Figure 6. The effects of the Backup utility on performance -- think you don't need throttling?
In the previous chart, you can see the throughput (transactions per minute) when no backup is run. When BACKUP runs full out, the transactional rate of this system drops to under 100 tpm. Throttling allows you to set a non-adaptive rate (the numbers indicate the allowable impact on the production workload) such that you can customize the throughput of BACKUP to your environment.
If your SLA was based on maintaining just over 300 transactions per minute, you may want to set a non-adaptive throttle rate at 70% (meaning that BACKUP could consume resources such that it would impact the production environment by 30%). DB2 would then adjust the resources available to BACKUP such that the average impact to your production workload was 30%.
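Throttling is driven by the UTIL_IMPACT_LIM database manager configuration parameter together with a priority on the utility itself. A sketch of the relevant commands (the values and utility ID shown are illustrative):

```shell
db2 update dbm cfg using UTIL_IMPACT_LIM 30         # allow at most 30% impact on the workload
db2 backup db sample online util_impact_priority 50 # run this backup as a throttled utility
db2 list utilities show detail                      # find the running utility's ID
db2 set util_impact_priority for 2 to 75            # re-prioritize utility ID 2 on the fly
```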
In DB2 V8.1.4, we enhanced the BACKUP utility with an optional compression feature and some new TSM integration commands (GRANT and REVOKE).
You can ask DB2 to compress your database backups. The backup utility compresses the data in its buffers before it is written to the target media. The data is compressed as it is read from the database using a modified Lempel-Ziv (LZ) algorithm (described in "A Technique for High Performance Data Compression," Welch, Terry A., IEEE Computer, vol. 17, no. 6 (June 1984), pp. 8-19). The default algorithm is platform independent, so you can take a backup from a DB2 for AIX machine and restore it on a DB2 for HP machine (little endian backup images can only be restored to little endian databases, and the same is true for big endian backup images).
You can provide your own compression algorithm through the BACKUP utility's open plug-in interface. If you specify your own compression library and you lose your machine, don't worry: DB2 copies your algorithm's executables into the backup image so it can restore your image.
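For example (the library path in the second command is a hypothetical user-supplied compression plug-in):

```shell
db2 backup db sample to /backups compress
db2 backup db sample to /backups compress comprlib /home/db2inst1/libmycompress.a
```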
The following diagram shows a sample of the space savings using the DB2 BACKUP compression algorithm. In this example, a typical ERP database was backed up with the compression option. The backup size went from approximately 180 MB to 30 MB.
Figure 7. Backup compression results
In another test, a completely randomized database (consisting of both ASCII characters and numerics) went from 250MB to 185MB.
Remember that this compression is software-based, so its benefits will cost your environment extra CPU cycles. For example, in the ERP example, the backup time went from 56 minutes to 1 hour and 46 minutes. However, if your network's bandwidth is a bottleneck, compression could help, since you'll typically experience decreased media I/O time due to the smaller image size.
DB2 V8.1.4 also introduced the new TSM GRANT and REVOKE commands, as described in the discussion of db2adutl in the TSM hints and tips section earlier in this article.
DB2 V8.2 introduced many enhancements in the BACKUP arena. First, the BACKUP utility is now self-tuning (or at least better at it). In fact, in most cases, you won't have to spend time deciding what values to specify for the number of buffers, the degree of parallelism you need, or the size of the buffers. Some examples of running BACKUP in DB2 V8.2 compared to DB2 V8.1 are shown in Figure 8.
Figure 8. Self-tuning backup
In DB2 V8.2, the BACKUP utility has autonomic and optimized settings for the number of buffers, degree of parallelism, and the buffer size. These settings are picked by DB2 based on the number of table spaces, the EXTENTSIZE of each table space, the number of available CPUs, and available memory.
The autonomic settings are chosen by DB2 for you each time you run a BACKUP command, provided you omit these parameters from the command invocation (or set them to 0 when calling BACKUP from an API).
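In practice, that means you can simply omit the tuning parameters and let DB2 choose, or still pin them down yourself; for example:

```shell
db2 backup db sample to /backups   # DB2 picks buffers, buffer size, and parallelism
db2 backup db sample to /backups with 4 buffers buffer 4096 parallelism 2   # manual tuning
```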
DB2 V8.2 also introduced the capability for DB2 to alert you when a backup is needed, to actually perform the backup, or both. This goes beyond scheduling a database backup at some interval: you can define business-based policies that trigger the backup operation. For example, the total log space consumed since the last backup could trigger an automatic backup. Those DBAs that don't want DB2 to do anything for them can still leverage this capability to receive an email when DB2 believes it is time to run another backup based on the policy the DBA defined. And of course, DB2 combines this capability with throttling to ensure the production environment is not impacted.
Finally, online backup images of databases and table spaces can include the set of log files necessary to restore and recover the backup image to a consistent point in time. This gives you a single object to ship to disaster recovery sites that allows you to bring the database back to a consistent point in time. In our experience, lots of clients weren't grabbing the log files when doing the backup, and in the event of a failure, they weren't able to recover. This feature is available only for online backup and is invoked by specifying the INCLUDE LOGS option on the BACKUP command. During restore, the log files will only be extracted from the backup image if the LOGTARGET field is supplied; and if the LOGSONLY parameter is set, only the log files will be extracted from the backup image, not the database data.
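A sketch of the commands involved (the paths are placeholders, and the exact log-extraction syntax may vary by fix pack):

```shell
db2 backup db sample online to /backups include logs
db2 restore db sample from /backups logtarget /db2/retrievedlogs
```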
- DB2 Technical Support is the ideal place to locate resources such as the Version 8 Information Center and PDF product manuals.
- The DB2 UDB zone is a resource for the latest info on DB2 UDB for Linux, UNIX, and Windows, with many articles covering details on the newest features.