IBM Support

Reducing Backup Runtimes by Redistributing Tables over Table Spaces

Technical Blog Post


Abstract

Reducing Backup Runtimes by Redistributing Tables over Table Spaces

Body

Problem Description

DB2 Backup is able to use IO parallelism by reading and backing up multiple table spaces at the same time. Number of concurrent threads reading the table spaces (called "buffer manipulators", with db2bm.X EDU name, EDU stands for Engine Dispatchable Unit) is controlled by "PARALLELISM" keyword. By default the value computes based on the size of utility heap, number of table spaces and CPUs.

Each table space in a DB2 database is read by one single buffer manipulator (db2bm EDU). The backup starts with the largest table space. As soon as the backup of one table space is completed successfully, the next table space is read until all table spaces are backed up.

It is possible that the largest table space has reached a size that is larger than all other table spaces together. This leads to a situation that this table space determines the complete backup runtime.

This situation can be checked in the barstats (backup and restore statistics) written into the db2diag.log after each backup. One of the db2bm shows a substantially longer runtime than the other db2bm EDUs:
2016-09-07-08.44.18.108926+120 E5729808E1726         LEVEL: Info
PID     : 8829                 TID : 139796638852864 PROC : db2sysc 0
INSTANCE: db2inst1              NODE : 000            DB   : SAMPLE
APPHDL  : 0-37                 APPID: *LOCAL.db2inst1.160907064019
AUTHID  : DB2INST1             HOSTNAME: XXX.XXX.XXX
EDUID   : 22                   EDUNAME: db2agent (SAMPLE) 0
FUNCTION: DB2 UDB, database utilities, sqluxLogDataStats, probe:395
MESSAGE : Performance statistics
DATA #1 : String, 1212 bytes

Parallelism       = 5
Number of buffers = 6
Buffer size       = 3674112 (897 4kB pages)

BM#    Total      I/O      MsgQ      WaitQ      Buffers   MBytes
---  --------  --------  --------  --------    --------  --------
000    238.01    230.23      0.21      0.01       16701     58443
001    238.01      4.30      0.00    233.65          67       233
002    238.01      0.02      0.00    237.93           1         1
003    238.01      0.09      0.00    237.89           6        19
004    238.01     16.84      0.01    220.63        1180      4129
---  --------  --------  --------  --------    --------  --------
TOT   1190.06    251.50      0.22    930.14       17955     62826

MC#    Total      I/O      MsgQ      WaitQ      Buffers   MBytes
---  --------  --------  --------  --------    --------  --------
000    238.01      0.05    237.81      0.00        5988     20974
001    238.01      0.05    237.79      0.01        5985     20967
002    238.01      0.05    237.79      0.01        5985     20967
---  --------  --------  --------  --------    --------  --------
TOT    714.04      0.17    713.40      0.02       17958     62909

As you can see buffer manipulator 000 is doing most of the I/Os (note "I/O" column - time spent doing IO and "MBytes" - number of megabytes read) whereas the other manipulators are waiting the most of the time ("WaitQ", time when given db2bm thread is just waiting for others to complete)

The reason for this in this example is the following imbalance of the sizes of the table spaces:

$ db2 "select substr(tbsp_name,1,16) tbsp_name, tbsp_total_size_kb from sysibmadm.mon_tbsp_utilization order by tbsp_total_size_kb desc"

TBSP_NAME        TBSP_TOTAL_SIZE_KB  
---------------- --------------------
USERSPACE1                   59867136
SYSCATSPACE                    262144
IBMDB2SAMPLEXML                 32768
SYSTOOLSPACE                    32768
IBMDB2SAMPLEREL                 13056
TEMPSPACE1                          8

  6 record(s) selected.

As the result, there is little benefit from backup parallelism because there is a single table space that contains majority of the data.

The following tables reside in the table space USERSPACE1:

$ db2 "select substr(tabname,1,20) as tabname, fpages from syscat.tables where tbspace = 'USERSPACE1'"

TABNAME              FPAGES              
-------------------- --------------------
DATA1                             3085586
DATA2                                6426
DATA3                               44978
DATA4                               96384
DATA5                              128513
DATA6                             3599640

  6 record(s) selected.

To relax this situation slightly, this table space can be split and some of the tables, namely DATA5 and DATA3 could be moved to a new table space. This would have the size of the table space USERSPACE1 and should limit the backup runtime. In this example it would even be better to split the table space into three.

Create a New Table Space

In the first step, a new table space needs to be created. To find out the parameters for USERSPACE1 the following command can be used:
db2look -d <dbname> -l
The following snippet can be found in the output:
 
CREATE LARGE TABLESPACE "USERSPACE1" IN DATABASE PARTITION GROUP IBMDEFAULTGROUP
     PAGESIZE 8192 MANAGED BY AUTOMATIC STORAGE
     USING STOGROUP "IBMSTOGROUP"
     AUTORESIZE YES
     INITIALSIZE 32 M
     MAXSIZE NONE
     EXTENTSIZE 4
     PREFETCHSIZE AUTOMATIC
     BUFFERPOOL "IBMDEFAULTBP"
     DATA TAG INHERIT
     OVERHEAD INHERIT
     TRANSFERRATE INHERIT
     DROPPED TABLE RECOVERY ON;

It may be possible that the table space was part of the create database statement. If so you can find the parameters in the output of the following command:

db2look -d <dbname> -createdb

Now the table space USERSPACE2 can be created in the same way as USERSPACE1.

Moving Tables

Now it is time to move tables from the old table space USERSPACE1 to the new table space USERSPACE2. For this purpose the procedure ADMIN_MOVE_TABLE is used. This procedure takes the following parameters:

>>-ADMIN_MOVE_TABLE--(--tabschema--,--tabname--,---------------->
>--data_tbsp--,--index_tbsp--,--lob_tbsp--,--------------------->
>--organize_by_clause--,--partkey_cols--,--data_part--,--------->
              .---------.                   
              V         |                   
---coldef--,----options-+--,--operation--)---------------------><

So in this case for table DATA5 the following command would be issued:

$ db2 “CALL SYSPROC.ADMIN_MOVE_TABLE(
'TAB1',
'DATA5',
'USERSPACE2',
'USERSPACE2',
'USERSPACE2',
'',
'',
'',
'',
'',
'')”

This command moves table DATA5 to the new table space USERSPACE2. Note that other table space can be specified for index and lob table spaces. The table DATA3 is moved accordingly.

CALL SYSPROC.ADMIN_MOVE_TABLE( 'TAB1', 'DATA6', 'USERSPACE2', 'USERSPACE2', 'USERSPACE2', '', '', '', '', '', 'MOVE')

Result set 1
--------------

KEY                              VALUE
-------------------------------- ----------------------------------------------------
AUTHID                           DB2INST1
CLEANUP_END                      2016-09-07-12.28.04.948662
CLEANUP_START                    2016-09-07-12.28.04.295206
COPY_END                         2016-09-07-12.00.20.117581
COPY_OPTS
COPY_START                       2016-09-07-09.23.54.112229
COPY_TOTAL_ROWS                  95724175
INDEXNAME                        
INDEXSCHEMA                      
INIT_END                         2016-09-07-09.23.54.014399
INIT_START                       2016-09-07-09.23.52.670441
ORIGINAL_TBLSIZE                 29491200
REPLAY_END                       2016-09-07-12.28.04.037688
REPLAY_START                     2016-09-07-12.00.20.118824
REPLAY_TOTAL_ROWS                0
REPLAY_TOTAL_TIME                1
STATUS                           COMPLETE
SWAP_END                         2016-09-07-12.28.04.217737
SWAP_RETRIES                     0
SWAP_START                       2016-09-07-12.28.04.079908
UTILITY_INVOCATION_ID            0100000009000000080000000000000000002016090709235401723700000000
VERSION                          11.01.0000

  22 record(s) selected.

  Return Status = 0

Shrinking the old table space

The final step is to shrink the old table space. If autostorage is used this can be performed by the following simple command:
db2 alter tablespace USERSPACE1 reduce max
The movement of the extents can be monitored with the following query:

$ db2 "SELECT * FROM TABLE(SYSPROC.MON_GET_EXTENT_MOVEMENT_STATUS('', -1))"

TBSP_NAME       TBSP_ID CURRENT_EXENT LAST_EXTENT NUM_EXTENTS_MOVED NUM_EXTENTS_LEFT TOTAL_MOVE_TIME

--------------- ------- ------------- ----------- ----------------- ---------------- ---------------
SYSCATSPACE           0            -1          -1                -1               -1              -1
TEMPSPACE1            1            -1          -1                -1               -1              -1
USERSPACE1            2        137038      137039               308            15761            7607
IBMDB2SAMPLEREL       3            -1          -1                -1               -1              -1
IBMDB2SAMPLEXML       4            -1          -1                -1               -1              -1
SYSTOOLSPACE          5            -1          -1                -1               -1              -1
USERSPACE2            6            -1          -1                -1               -1              -1


  7 record(s) selected.

Results

The sizes of the table spaces of the database look like the following now:

$ db2 "select substr(tbsp_name,1,16) tbsp_name, tbsp_total_size_kb from sysibmadm.mon_tbsp_utilization order by tbsp_total_size_kb desc"

TBSP_NAME        TBSP_TOTAL_SIZE_KB  
---------------- --------------------
USERSPACE1                   31047424
USERSPACE2                   29523968
SYSCATSPACE                    262144
IBMDB2SAMPLEXML                 32768
SYSTOOLSPACE                    32768
IBMDB2SAMPLEREL                 13056
TEMPSPACE1                          8

  7 record(s) selected.

The backup statistics show the following picture now:

2016-09-07-13.37.26.511820+120 E5956251E1726         LEVEL: Info
PID     : 8829                 TID : 139794881439488 PROC : db2sysc 0
INSTANCE: db2inst1              NODE : 000            DB   : SAMPLE
APPHDL  : 0-448                APPID: *LOCAL.db2inst1.160907113403
AUTHID  : DB2INST1             HOSTNAME: XXX.XXX.XXX
EDUID   : 651                  EDUNAME: db2agent (SAMPLE) 0
FUNCTION: DB2 UDB, database utilities, sqluxLogDataStats, probe:395
MESSAGE : Performance statistics
DATA #1 : String, 1212 bytes

Parallelism       = 5
Number of buffers = 6
Buffer size       = 3674112 (897 4kB pages)
                                                                 
BM#    Total      I/O      MsgQ      WaitQ      Buffers   MBytes
---  --------  --------  --------  --------    --------  --------
000    202.43      2.94      0.00    199.42          69       233
001    202.43    117.76      0.08     80.94        8663     30319
002    202.43    198.90      0.08      0.02        8232     28811
003    202.43      0.10      0.00    202.29           6        19
004    202.43      0.11      0.00    202.26           4        13
---  --------  --------  --------  --------    --------  --------
TOT   1012.15    319.82      0.16    684.95       16974     59397

MC#    Total      I/O      MsgQ      WaitQ      Buffers   MBytes
---  --------  --------  --------  --------    --------  --------
000    202.43      0.03    202.29      0.00        5659     19821
001    202.43      0.03    202.28      0.01        5658     19821
002    202.43      0.03    202.28      0.01        5660     19828
---  --------  --------  --------  --------    --------  --------
TOT    607.29      0.11    606.85      0.02       16977     59471

At the beginning, the backup was taken in 238 seconds. The runtime was decreased to 202 seconds only be re-balancing the data in the table spaces. The impact is even higher on enterprise server systems.

References

For further information on admin_move_table check the following link: http://www.ibm.com/support/knowledgecenter/SSEPGG_11.1.0/com.ibm.db2.luw.sql.rtn.doc/doc/r0055069.html
Further information for backup and restore statistics (barstats) can be found here: https://www.ibm.com/developerworks/community/blogs/DB2LUWAvailability/entry/backup_and_restore_statistics?lang=en

[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSEPGG","label":"Db2 for Linux, UNIX and Windows"},"Component":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

UID

ibm11140676