Back up, restore, and roll forward in an InfoSphere Warehouse data partitioning environment
The data processing resources of business organizations increasingly rely on relational database management systems (RDBMS) for their ever-growing data storage and processing needs. As databases and data warehouses become larger, management of an increasing amount of stored data becomes a key factor for performance of RDBMS. To handle database growth, database systems must demonstrate scalable performance so as to be able to apply additional computing resources. The data partitioning capability available with IBM InfoSphere Warehouse enables scaling of a single database to more than one server. With partitioning, you can distribute data across various partitions.
Previous to July 2010, this capability was called the Database Partitioning Feature (DPF) and was available for DB2 Enterprise Server Edition. Now it is no longer available as a DB2 feature, but comes with InfoSphere Warehouse, which is based on DB2. We'll refer to this partitioning capability as DPF in this article.
With DPF, your database is scalable, because you can add new machines or nodes as your data grows, and spread your database across them. This means more CPUs, more memory, and more disks from each of the additional machines for your database. DPF is ideal for managing data warehousing, data mining, and online analytical processing (OLAP) workloads. It can also work well with online transaction processing (OLTP) workloads.
Backup and restore are the key capabilities for ensuring data recoverability for any database management system. Performing these operations in a partitioned database environment is a bit tricky. The goal of this article is to provide straightforward, step-by-step instructions showing how to perform backup, restore and rollforward in a DPF environment.
This article begins by reviewing the concepts and considerations. It then explains to accomplish the following tasks:
- Back up a database in a DPF environment
- Restore a database back into a DPF environment
- Roll forward the logs in DPF
- Perform a redirected restore in DPF
- Perform a point-in-time recovery of a database in this environment
Finally, it includes guidelines for calculating log space requirements for archival logs.
Quick review of DPF
As the name implies, when you use the Database Partitioning Feature, your database spans two or more database partitions, allowing data to be distributed across each partition. The database partition, also called the database node, contains its own data, indexes, configuration files, and transaction log files, and holds just a portion of a much larger database. Data retrieval and update requests from users are decomposed automatically into sub-requests and executed in parallel among all applicable database partitions. The fact that a database is distributed across multiple database partitions is transparent to the end users issuing SQL statements.
The main purpose of the database partitioning is to improve the data processing performance by executing queries in parallel. Parallel query execution refers to the ability to execute queries using multiple CPUs simultaneously. When used, query parallelism can reduce the amount of time required to execute large queries by increasing the number of CPUs used to process the query. DPF also simplifies data management.
In a partitioned database, the partition from which the
CREATE DATABASE command is executed is called the catalog
partition. The catalog partition is where all system catalog tables are stored. Thus all access to the system tables must go
through this partition. The partition or partitions that accept connection requests from clients are known as coordinator partitions.
This article demonstrates backup and recovery in a sample DPF environment with two physical partitions, each having two logical partitions. In other words, there are four partitions—0,1,2, and 3—where 0 and 1 partitions are located on one server, SERVER_A, and 2 and 3 are located on another server, SERVER_B. Partition zero will be considered the catalog node, as shown in Figure 1.
Figure 1. Partitioned database environment
Partitions are listed in the db2nodes.cfg file. The first column represents the partition number. The second column represents the server address or the host name, and the third column represents the logical ports.
Listing 1. Output of db2nodes.cfg.
$cat db2nodes.cfg #this file exist in the sqllib directory of your instance directory 0 SERVER_A 0 1 SERVER_A 1 2 SERVER_B 0 3 SERVER_B 1
Backing up the database
The backup operation in DB2 can either be online or offline. You never know when a disaster or failure may hit your system, resulting in downtime. Therefore, it is best to be prepared and protect your data not only from external factors, but also from internal users who might inadvertently corrupt your database with incorrect information. For these reasons, it is critical that you understand how backup and recovery features function for your database management system, and that you have a well-planned backup strategy.
In DPF, you can run the commands sequentially on one database partition server after another, or you can run the commands in parallel.
db2_all, you can run commands on all database partition
servers that you specify. You can restrict the list to any number of database
partition servers using the
<<+xxx< prefix sequences.
<<-xxx< prefix, you can issue a command to all partitions in DPF, except the node specified by
xxx. Similarly, using
prefix, you can issue a command only to partition specified by
db2_all prepends the following assignment to your command.
export DB2NODE=xxx (Korn shell syntax)
In this assignment,
xxx is the database partition number and
can be taken from db2nodes.cfg file, so the command gets routed to a specific database partition server.
Here is an example of a backup of a database across all nodes in parallel.
Listing 2. Backup across all nodes
$ db2 backup db TEST on all dbpartitionnums Part Result ---- ------------------------------------------------------------------------ 0000 DB20000I The BACKUP DATABASE command completed successfully. 0001 DB20000I The BACKUP DATABASE command completed successfully. 0002 DB20000I The BACKUP DATABASE command completed successfully. 0003 DB20000I The BACKUP DATABASE command completed successfully. Backup successful. The timestamp for this backup image is : 20110302231028
Each database partition is backed up separately. The backup taken at the same time for all the database partitions is known as a version backup. These backup images of all the database partitions are present under the current directory.
Listing 3. List all the backup files
$ls -l -rw------- 1 root root 78454784 Mar 02 23:10 TEST.0.inst_test.NODE0000.CATN0000.20110302231028.001 -rw------- 1 root root 18116608 Mar 02 23:10 TEST.0.inst_test.NODE0001.CATN0000.20110302231028.001 -rw------- 1 root root 18116608 Mar 02 23:10 TEST.0.inst_test.NODE0002.CATN0000.20110302231028.001 -rw------- 1 root root 18116608 Mar 02 23:10 TEST.0.inst_test.NODE0003.CATN0000.20110302231028.001
You have four backup images representing the four partitions in the environment. The name of the backup files
provide detailed information about the backup. For example, let's look at the backup file name
TEST: Represents the name of the database
0: Type of backup operation, 0 represents a full offline database-level backup
inst_test: Instance name
NODE0000: The partition number
CATN0000: The database partition number where the catalog node (syscatspace) resides
001: Sequence number
Here is the backup of a database on a single node.
Listing 4. Backup on single node
# Backup taken at Node 2 $ export DB2NODE=2 $ db2 terminate DB20000I The TERMINATE command completed successfully. $ db2 backup db TEST Backup successful. The timestamp for this backup image is : 20151010061808 $ ls TEST.0.inst_test.NODE0002.CATN0000.20110302231028.001
Restoring the database
In the event of data corruption or data loss, the duplicate copy of the database can be used to restore the database at least to the state that existed at the time that the backup operation was performed on the database. Restore is also a part of database upgrade or migration. In DPF, restore is a bit tricky operation. If you are working with DPF, you need to restore the catalog node first and then continue restore on the rest of the nodes. In this example, the partition number zero is the catalog node.
Listing 5. Restoring database on catalog node
$ db2_all '<<+0< db2 RESTORE DATABASE TEST into TEST REPLACE EXISTING' SQL2539W Warning! Restoring to an existing database that is the same as the backup image database. The database files will be deleted. DB20000I The RESTORE DATABASE command completed successfully. 126.96.36.199: db2 RESTORE DATABASE ... completed ok
Now restore the remaining partitions.
Listing 6. Restoring database on remaining partitions
$ db2_all '<<-0< db2 RESTORE DATABASE TEST into TEST REPLACE EXISTING' rah: omitting logical node 0 SQL2539W Warning! Restoring to an existing database that is the same as the backup image database. The database files will be deleted. DB20000I The RESTORE DATABASE command completed successfully. 188.8.131.52: db2 RESTORE DATABASE ... completed ok SQL2539W Warning! Restoring to an existing database that is the same as the backup image database. The database files will be deleted. DB20000I The RESTORE DATABASE command completed successfully. 184.108.40.206: db2 RESTORE DATABASE ... completed ok SQL2539W Warning! Restoring to an existing database that is the same as the backup image database. The database files will be deleted. DB20000I The RESTORE DATABASE command completed successfully. 220.127.116.11: db2 RESTORE DATABASE ... completed ok
If you have archival logging enabled and you try to connect to database after restoring the database, you will get an error saying that database is in roll-forward pending state.
Listing 7. Database in rollforward pending state after restore
$ db2 connect to test SQL1117N A connection to or activation of database "TEST" cannot be made because of ROLL-FORWARD PENDING. SQLSTATE=57019
To remove this inconsistency, you need to roll forward the database to the end of logs to bring it to the consistent level.
For offline backup restore, if you specify
WITHOUT ROLLING FORWARD option in restore command, the database will not to be
placed in rollforward pending state after it has been restored successfully. For online backup restore, rollforward is mandatory.
Database rollforward enables you to maintain consistency in the database. It recovers a database by applying transactions
recorded in the database log files. Rollforward is a process invoked after a database or a tablespace backup image has been restored.
The database is said to be recoverable if archival logging is enabled, that is, if
logarchmeth2 database configuration
parameters are set to a value other than
OFF. Rollforward phase is applicable to database or tablespace only if
the database is recoverable.
In a partitioned database environment, this command can only be invoked from the catalog partition. A database or tablespace rollforward
operation to a specified point in time affects all partitions that are listed in the
db2nodes.cfg file. A database or tablespace rollforward
operation to the
end of logs can be partition-specific. It affects the partitions that are specified. If no partitions are
affects all partitions that are listed in the
db2nodes.cfg file. If rollforward recovery is not needed on a particular partition, you can
ignore that partition.
Listing 8. Database rollforward.
$ db2 rollforward db test to end of logs and complete Rollforward Status Input database alias = test Number of nodes have returned status = 4 Node number Rollforward Next log Log files Last committed status to be read processed transaction ----------- ------------- ---------- ----------- ------------------------------ 0 not pending - 2011-03-03-06.28.31.000000 UTC 1 not pending - 2011-03-03-06.28.28.000000 UTC 2 not pending - 2011-03-03-06.31.45.000000 UTC 3 not pending - 2011-03-03-06.31.45.000000 UTC DB20000I The ROLLFORWARD command completed successfully.
Now the database has been rolled forward and you can successfully connect to the database.
db2 rollforward db test to end of logs and complete specifies that all committed
transactions from the archive log files listed in the database
configuration parameter 'logpath' are to be applied.
Roll forward to a point in time
In DB2, you can recover database to any specified point in time using point-in-time rollforward, provided that the point in time you specify is greater than the minimum recovery time (MRT). The minimum recovery time is the earliest point in time during a rollforward when a database is consistent. This minimum point in time ensures that the tablespace and logs are consistent with system catalogs. A minimum point in time is updated when DDL SQLs are run against the tablespace, or against tables in the tablespace. Point-in-time recovery is very handy in situations where you have lost some of the logs, when you wish to undo some unintentional changes to the database, or when you wish to perform disaster recovery of the database.
The specified point in time for the rollforward operation must be equal to or later than MRT. In case,
if rollforward is for restored image from online backup operation, the point in time should be time later than the time at which the online
backup operation completed. Database history (
list history backup all for db <dbname> command) and the backup image
'db2ckbkp' command) can give you the completion
timestamp of the backup, after which any timestamp will be valid for rolling forward.
If you are performing a point-in-time rollforward recovery, all database partitions must be rolled forward to ensure that all partitions are at the same level. Point-in-time recovery applies to all database partitions.
Below are the various ways to identify the minimum point in time.
- Use a very old timestamp (for example, two years back) in rollforward command. DB2 will throw an error with minimum rollforward time to which you can roll forward.
- Issue a rollforward stop, without specifying end of logs or PIT.
db2 rollforward db <dbname> stop. This command will return SQL1276N and give you the PIT in Coordinated Universal Time (CUT) which is the minimum PIT
db2 list history backup all for db <dbname>and look for the timestamp of your backup image. Once you find it, look for 'Earliest Log' and 'Current Log' values. These values show the logs' range that you need to apply in order to roll forward to a minimum PIT. You can copy these logs to some directory and issue a rollforward command using the
overflow log pathand
Note that the time you indicate in the rollforward command is in CUT, unless you also specify 'using local time'.
You can execute
db2 "values current timezone" to get the difference between CUT and local time.
Below is one example showing rollforward to PIT.
Listing 9. Rollforward to PIT
$ db2 "rollforward db TESTDB to 2011-09-05-04.58.47 using local time and stop" Rollforward Status Input database alias = TESTDB Number of nodes have returned status = 1 Node number = 0 Rollforward status = not pending Next log file to be read = Log files processed = S0000024.LOG - S0000025.LOG Last committed transaction = 2011-09-05-04.54.04.000000 Local or more containers is inaccessible DB20000I The ROLLFORWARD command completed successfully.
To determine the minimum point in time of recovery for a tablespace, you can use either of the following methods:
LIST TABLESPACES SHOW DETAILcommand.
- Obtain a tablespace snapshot with the
GET SNAPSHOT FOR TABLESPACE ON <dbname>command.
Although you can roll forward your database or tablespace to any point in time after the minimum point in time, there is no guarantee that the end time you choose to roll forward will have all data in a consistent state.
Performing a redirected restore in the DPF environment
If you want to restore a backup image to a target machine that is different from the source machine then you can restore the backup image using redirected restore option. You can opt for redirected restore typically in following situations:
- If you want to restore tablespace containers of the database into a different physical location, or
- Restore operation has failed because one or more containers is inaccessible
During a redirected restore, directory and file containers are created automatically, if they do not already exist. Removing, adding or changing table space containers of only DMS table space is supported by DB2. For SMS tablespace, the only method to modify the tablespace container configuration is redirected restore.
To perform redirected restore on a partitioned environment, you must restore the database on each partition starting with the catalog partition, set the tablespace, then restore continue. It is performed in three subsequent steps.
- Back up database TESTDB on all partitions. Issue the following commands from
catalog node, which in our example is node 0.
After executing the command below, you have the backup images under '/home/backup' directory'.
Listing 10. Backup in DPF.
$ db2_all '<<+0< db2 BACKUP DATABASE testdb to /home/backup' $ db2_all '<<-0< db2 BACKUP DATABASE testdb to /home/backup'
- Now, do a redirected restore it to newdb with different containers. On partition 0, issue the below commands:
Listing 11. Redirected restore on catalog node
$ db2_all '<<+0< db2 create db NEWDB on /db2/newdb' $ db2_all '<<+0< db2 restore db TESTDB from /home/backup/ into NEWDB redirect without prompting' $ db2_all '<<+0< db2 set tablespace containers for 0 using (path '/db2/newdb/NODE0000/catalog' $ db2_all '<<+0< db2 set tablespace containers for 1 using (path '/db2/newdb/NODE0000/temp" $ db2_all '<<+0< db2 set tablespace containers for 2 using (path '/db2/newdb/NODE0000/user' $ db2_all '<<+0< db2 set tablespace containers for 3 using (path '/db2/newdb/NODE0000/table' $ db2_all '<<+0< db2 restore database TESTDB continue'
Now database must be restored and tablespace containers need to be set individually on each database partition.
Listing 12. Redirected restore on catalog node
//For node 1 $ db2_all '<<+1< db2 restore db TESTDB from /home/backup/ into NEWDB redirect without prompting' $ db2_all '<<+1< db2 set tablespace containers for 1 using (path '/db2/newdb/NODE0001/temp' $ db2_all '<<+1< db2 set tablespace containers for 2 using (path '/db2/newdb/NODE0001/user' $ db2_all '<<+1< db2 set tablespace containers for 3 using (path '/db2/newdb/NODE0001/table' $ db2_all '<<+1< db2 restore database TESTDB continue' //For node 2 $ db2_all '<<+2< db2 restore db TESTDB from /home/backup/ into NEWDB redirect without prompting' $ db2_all '<<+2< db2 set tablespace containers for 1 using (path '/db2/newdb/NODE0002/temp')' $ db2_all '<<+2< db2 set tablespace containers for 2 using (path '/db2/newdb/NODE0002/user')' $ db2_all '<<+2< db2 set tablespace containers for 2 using (path '/db2/newdb/NODE0002/user')' $ db2_all '<<+2< db2 set tablespace containers for 3 using (path '/db2/newdb/NODE0002/table')' $ db2_all '<<+2< db2 restore database TESTDB continue' //For node 3 $ db2_all '<<+3< db2 restore db TESTDB from /home/backup/ into NEWDB redirect without prompting' $ db2_all '<<+3< db2 set tablespace containers for 1 using (path '/db2/newdb/NODE0003/temp')' $ db2_all '<<+3< db2 set tablespace containers for 2 using (path '/db2/newdb/NODE0003/user')' $ db2_all '<<+3< db2 set tablespace containers for 3 using (path '/db2/newdb/NODE0003/table')' $ db2_all '<<+3< db2 restore database TESTDB continue'
Finally, roll forward to the end of logs using catalog node. Copy the archived log files to
a directory, for example, '/home/logs'.
After completing the restore you can roll forward using following commands.
Listing 13. Roll forward to end of logs.
$ export DB2NODE=0 $ db2 terminate $ db2 rollforward database newdb to end of logs on all dbpartitionnums overflow log path /home/logs $ db2 rollforward database newdb complete
Computing the log space requirement
If you have circular logging and wish to enable archival logging for your DPF environment, you need to decide
'logprimary' and 'logsecond' values. You need to first calculate the number of logs you require per day using
db2pd -logs and check how many times the logs were written
over. You need to monitor your system over a period of time,
for different workloads. Once that has been done, you can predict the maximum active logs that can be used by the system.
Then allow 20-30% extra on what has been calculated using during observation. This is to ensure that no active logs are archived.
This calculated log space can be divided as per convenience between
This article introduced the basics of backup and recovery with the InfoSphere Warehouse data partitioning feature. It demonstrated step-by-step procedures to perform backup, restore, and rollforward in a DPF environment. Once you have mastered how to perform these operations, you will be able to take advantage of the scaling capability provided by InfoSphere Warehouse and boost the performance of your database system.
- Learn more about DB2 in the DB2 for Linux, UNIX, and Windows page on developerWorks. You'll find technical documentation, how-to articles, education, downloads, product information, and more.
- Visit the developerWorks DBA Central page to expand your database administration and tuning skills.
- Download a free trial version of InfoSphere Warehouse Enterprise Edition.