IBM Support

Reducing Backup Runtimes by Redistributing Tables over Tablespaces

Technical Blog Post


Abstract

Reducing Backup Runtimes by Redistributing Tables over Tablespaces

Body

Problem Description

DB2 Backup is able to utilize IO parallelism by reading and backing up multiple tablespaces at the same time. Number of concurrent threads reading the tablespaces (called "buffer manipulators", with db2bm.X EDU name) is controlled by "PARALLELISM" keyword. By default the value will be computed based on the size of utility hep, number of tablespaces and CPUs.

Each tablespace in a DB2 database is read by one single buffer manipulator (db2bm EDU). The backup starts with the largest tablespace. As soon as the backup of one tablespace is completed successfully, the next tablespace is read until all tablespaces are backed up.

It is possible that the largest tablespace has reached a size that is larger than all other tablespaces together. This leads to a situation that this tablespace determines the complete backup runtime.

This situation can be checked in the barstats written into the db2diag.log after each backup. One of the db2bm will show a substantially longer runtime than the other db2bm EDUs:
2016-09-07-08.44.18.108926+120 E5729808E1726         LEVEL: Info
PID     : 8829                 TID : 139796638852864 PROC : db2sysc 0
INSTANCE: jklauke              NODE : 000            DB   : TKL
APPHDL  : 0-37                 APPID: *LOCAL.jklauke.160907064019
AUTHID  : JKLAUKE              HOSTNAME: oc2040617600.ibm.com
EDUID   : 22                   EDUNAME: db2agent (TKL) 0
FUNCTION: DB2 UDB, database utilities, sqluxLogDataStats, probe:395
MESSAGE : Performance statistics
DATA #1 : String, 1212 bytes

Parallelism       = 5
Number of buffers = 6
Buffer size       = 3674112 (897 4kB pages)

BM#    Total      I/O      MsgQ      WaitQ      Buffers   MBytes
---  --------  --------  --------  --------    --------  --------
000    238.01    230.23      0.21      0.01       16701     58443
001    238.01      4.30      0.00    233.65          67       233
002    238.01      0.02      0.00    237.93           1         1
003    238.01      0.09      0.00    237.89           6        19
004    238.01     16.84      0.01    220.63        1180      4129
---  --------  --------  --------  --------    --------  --------
TOT   1190.06    251.50      0.22    930.14       17955     62826

MC#    Total      I/O      MsgQ      WaitQ      Buffers   MBytes
---  --------  --------  --------  --------    --------  --------
000    238.01      0.05    237.81      0.00        5988     20974
001    238.01      0.05    237.79      0.01        5985     20967
002    238.01      0.05    237.79      0.01        5985     20967
---  --------  --------  --------  --------    --------  --------
TOT    714.04      0.17    713.40      0.02       17958     62909

As you can see buffer manipulator 000 is doing most of the I/Os (note "I/O" column - time spent doing IO and "MBytes" - number of megabytes read) whereas the other manipulators are waiting the most of the time ("WaitQ", time when given db2bm thread is just waiting for others to complete)

The reason for this in this example is the following imbalance of the sizes of the tablespaces:

$ db2 "select substr(tbsp_name,1,16) tbsp_name, tbsp_total_size_kb from sysibmadm.mon_tbsp_utilization order by tbsp_total_size_kb desc"

TBSP_NAME        TBSP_TOTAL_SIZE_KB  
---------------- --------------------
USERSPACE1                   59867136
SYSCATSPACE                    262144
IBMDB2SAMPLEXML                 32768
SYSTOOLSPACE                    32768
IBMDB2SAMPLEREL                 13056
TEMPSPACE1                          8

  6 record(s) selected.

As the result, there is very little benefit from backup parallelism because there is a single tablespace that contains majority of the data.

The following tables reside in the tablespace USERSPACE1:

$ db2 "select substr(tabname,1,20) as tabname, fpages from syscat.tables where tbspace = 'USERSPACE1'"

TABNAME              FPAGES              
-------------------- --------------------
DATA1                             3085586
DATA2                                6426
DATA3                               44978
DATA4                               96384
DATA5                              128513
DATA6                             3599640

  6 record(s) selected.

To relax this situation a little bit this tablespace can be split and some of the tables, namely DATA5 and DATA3 could be moved to a new tablespace. This would have the size of the tablespace USERSPACE1 and should limit the backup runtime. In this example it would even be better to split the tablespace into three.

 

Create a New Tablespace

In the first step a new tablespace needs to be created. To find out the parameters for USERSPACE1 the following command can be used:
db2look -d <dbname> -l
The following snippet can be found in the output:

CREATE LARGE TABLESPACE "USERSPACE1" IN DATABASE PARTITION GROUP IBMDEFAULTGROUP
     PAGESIZE 8192 MANAGED BY AUTOMATIC STORAGE
     USING STOGROUP "IBMSTOGROUP"
     AUTORESIZE YES
     INITIALSIZE 32 M
     MAXSIZE NONE
     EXTENTSIZE 4
     PREFETCHSIZE AUTOMATIC
     BUFFERPOOL "IBMDEFAULTBP"
     DATA TAG INHERIT
     OVERHEAD INHERIT
     TRANSFERRATE INHERIT
     DROPPED TABLE RECOVERY ON;

It may be possible that the tablespace was part of the create database statement. If so you can find the parameters in the output of the following command:

db2look -d <dbname> -createdb

Now the tablespace USERSPACE2 can be created in the same way as USERSPACE1.

 

Moving Tables

Now it is time to move tables from the old tablespace USERSPACE1 to the new tablespace USERSPACE2. For this purpose the procedure ADMIN_MOVE_TABLE will be used. This procedure takes the following parameters:

>>-ADMIN_MOVE_TABLE--(--tabschema--,--tabname--,---------------->
>--data_tbsp--,--index_tbsp--,--lob_tbsp--,--------------------->
>--organize_by_clause--,--partkey_cols--,--data_part--,--------->
              .---------.                   
              V         |                   
---coldef--,----options-+--,--operation--)---------------------><

So in this case for table DATA5 the following command would be issued:

$ db2 “CALL SYSPROC.ADMIN_MOVE_TABLE(
'JKLAUKE',
'DATA5',
'USERSPACE2',
'USERSPACE2',
'USERSPACE2',
'',
'',
'',
'',
'',
'MOVE')”

This command will move table DATA5 to the new tablespace USERSPACE2. Note that other tablespace can be specified for index and lob tablespaces. The table DATA3 will be moved accordingly.

CALL SYSPROC.ADMIN_MOVE_TABLE( 'JKLAUKE', 'DATA6', 'USERSPACE2', 'USERSPACE2', 'USERSPACE2', '', '', '', '', 'COPY_USE_LOAD "NONRECOVERABLE"', 'MOVE')

Result set 1
--------------

KEY                              VALUE
-------------------------------- ----------------------------------------------------
AUTHID                           JKLAUKE
CLEANUP_END                      2016-09-07-12.28.04.948662
CLEANUP_START                    2016-09-07-12.28.04.295206
COPY_END                         2016-09-07-12.00.20.117581
COPY_OPTS                        LOAD NONRECOVERABLE,NON_CLUSTER
COPY_START                       2016-09-07-09.23.54.112229
COPY_TOTAL_ROWS                  95724175
INDEXNAME                        
INDEXSCHEMA                      
INIT_END                         2016-09-07-09.23.54.014399
INIT_START                       2016-09-07-09.23.52.670441
ORIGINAL_TBLSIZE                 29491200
REPLAY_END                       2016-09-07-12.28.04.037688
REPLAY_START                     2016-09-07-12.00.20.118824
REPLAY_TOTAL_ROWS                0
REPLAY_TOTAL_TIME                1
STATUS                           COMPLETE
SWAP_END                         2016-09-07-12.28.04.217737
SWAP_RETRIES                     0
SWAP_START                       2016-09-07-12.28.04.079908
UTILITY_INVOCATION_ID            0100000009000000080000000000000000002016090709235401723700000000
VERSION                          11.01.0000

  22 record(s) selected.

  Return Status = 0

 

Shrinking the old tablespace

The final step is to shrink the old tablespace. If autostorage is used this can be performed by the following simple command:
db2 alter tablespace USERSPACE1 reduce max
The movement of the extents can be monitored with the following query:
$ db2 "SELECT * FROM TABLE(SYSPROC.MON_GET_EXTENT_MOVEMENT_STATUS('', -1))"

TBSP_NAME       TBSP_ID CURRENT_EXENT LAST_EXTENT NUM_EXTENTS_MOVED NUM_EXTENTS_LEFT TOTAL_MOVE_TIME

--------------- ------- ------------- ----------- ----------------- ---------------- ---------------
SYSCATSPACE           0            -1          -1                -1               -1              -1
TEMPSPACE1            1            -1          -1                -1               -1              -1
USERSPACE1            2        137038      137039               308            15761            7607
IBMDB2SAMPLEREL       3            -1          -1                -1               -1              -1
IBMDB2SAMPLEXML       4            -1          -1                -1               -1              -1
SYSTOOLSPACE          5            -1          -1                -1               -1              -1
USERSPACE2            6            -1          -1                -1               -1              -1

 

  7 record(s) selected.

 

Results

The sizes of the tablespaces of the database look like the following now:

$ db2 "select substr(tbsp_name,1,16) tbsp_name, tbsp_total_size_kb from sysibmadm.mon_tbsp_utilization order by tbsp_total_size_kb desc"

TBSP_NAME        TBSP_TOTAL_SIZE_KB  
---------------- --------------------
USERSPACE1                   31047424
USERSPACE2                   29523968
SYSCATSPACE                    262144
IBMDB2SAMPLEXML                 32768
SYSTOOLSPACE                    32768
IBMDB2SAMPLEREL                 13056
TEMPSPACE1                          8

  7 record(s) selected.

The backup statistics show the following picture now:

2016-09-07-13.37.26.511820+120 E5956251E1726         LEVEL: Info
PID     : 8829                 TID : 139794881439488 PROC : db2sysc 0
INSTANCE: jklauke              NODE : 000            DB   : TKL
APPHDL  : 0-448                APPID: *LOCAL.jklauke.160907113403
AUTHID  : JKLAUKE              HOSTNAME: oc2040617600.ibm.com
EDUID   : 651                  EDUNAME: db2agent (TKL) 0
FUNCTION: DB2 UDB, database utilities, sqluxLogDataStats, probe:395
MESSAGE : Performance statistics
DATA #1 : String, 1212 bytes

Parallelism       = 5
Number of buffers = 6
Buffer size       = 3674112 (897 4kB pages)
                                                                 
BM#    Total      I/O      MsgQ      WaitQ      Buffers   MBytes
---  --------  --------  --------  --------    --------  --------
000    202.43      2.94      0.00    199.42          69       233
001    202.43    117.76      0.08     80.94        8663     30319
002    202.43    198.90      0.08      0.02        8232     28811
003    202.43      0.10      0.00    202.29           6        19
004    202.43      0.11      0.00    202.26           4        13
---  --------  --------  --------  --------    --------  --------
TOT   1012.15    319.82      0.16    684.95       16974     59397

MC#    Total      I/O      MsgQ      WaitQ      Buffers   MBytes
---  --------  --------  --------  --------    --------  --------
000    202.43      0.03    202.29      0.00        5659     19821
001    202.43      0.03    202.28      0.01        5658     19821
002    202.43      0.03    202.28      0.01        5660     19828
---  --------  --------  --------  --------    --------  --------
TOT    607.29      0.11    606.85      0.02       16977     59471

At the beginning the backup was taken in 238 seconds. The runtime was decreased to 202 seconds only be re-balancing the data in the tablespaces. The impact will even be higher on enterprise server systems.

 

References

For further information on admin_move_table check the following link: http://www.ibm.com/support/knowledgecenter/SSEPGG_11.1.0/com.ibm.db2.luw.sql.rtn.doc/doc/r0055069.html
Further information for backup and restore statistics (barstats) can be found here: https://www.ibm.com/developerworks/community/blogs/DB2LUWAvailability/entry/backup_and_restore_statistics?lang=en

 

[{"Business Unit":{"code":"BU029","label":"Data and AI"}, "Product":{"code":"SSEPGG","label":"DB2 for Linux- UNIX and Windows"},"Component":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"","Edition":""}]

UID

ibm11140676