Technical Blog Post
Handling large DB tables in BPM
Sometimes the BPMDB (Business Process Manager database) will occupy a large amount of disk space. If this space is too large, you have several ways to check the database size and tune up your Database. This article describe the situation when the large DB tables resides in the LSW_MANAGED_ASSET_DATA tables as well as BPMCOACHNG table. These 2 tables, especially for LSW_MANAGED_ASSET_DATA is very likely to increase unbounded when large BLOB (Binary Large Object) data resides in the DB.
First, there are several SQLs on hand to help you better understand which table is taking up a large amount of space:
SQL1: DB2 ROW COUNT FOR ALL TABLES
select tabschema, tabname, card, overflow, stats_time, npages, fpages, avgrowsize from syscat.tables;
SQL2: DB2 SIZE OF BPM TABLES IN MEGABYTES (MB)
SELECT TABSCHEMA as "Schema",
TABNAME as "Table Name",
SUM(DATA_OBJECT_P_SIZE)/1024 as "Table Physical size in MB",
SUM(INDEX_OBJECT_P_SIZE)/1024 as "Index Size",
SUM(LONG_OBJECT_P_SIZE)/1024 as "Physical disk for long data types",
SUM(LOB_OBJECT_P_SIZE)/1024 as "Physical disk LOB fields",
SUM(XML_OBJECT_P_SIZE)/1024 as "XML physical data size"
GROUP BY TABSCHEMA,TABNAME;
Tips to load the output for analysis
If you output the data into a txt format, I would recommend you use MS Excel to load the data (Usually when you open the txt format, Excel will ask you how to recognize the data in each cell, I prefer to use the fixed size because a DB2 export always exports in fixed size):
1) Select Fixed width
2) Examine the data to make sure the column name is aligned with the data
3) Then, finish the text import wizard
Then you should get well formatted data into Excel. You can easily filter each column from biggest to smallest, also sum up each column's total value.
Analysis for the 2 SQLs
In my example, after filtering with NPAGES, the output displays as follows (from biggest to smallest):
NPAGES means the number of pages for the selected table. For BPMDB, we use 32k as page size.
create database BPMDBPS automatic storage yes using codeset UTF-8 territory US pagesize 32768;
row size of table = pagesize * number of pages
You can see the largest row size tables are LSW_EPV_VAR_VALUE (1093MB) and LSW_PO_VERSIONS (98MB). The LSW_MANAGED_ASSET_DATA and BPM_COACHNG do not show a large size.
But if you run the second SQL, it will be (size in MB):
You can see the 2 tables have more than 20+ GB in LOB fields.
How to clean up
Clean up the snapshots
Cleaning up the large number of artifacts is usually done by cleaning up the snapshots.
1) Apply required fixes. See Issues with BPMDeleteSnapshot and BPMSnapshotCleanup commands in IBM Business Process Manager (BPM).
If you are not on BPM v8.5.6, you must apply the above fixes before cleaning up the snapshots.
For Process Center see BPMSnapshotCleanup command in the product documentation.
For Process server see BPMDeleteSnapshot command in the product documentation.
Clean up the DB2 space
3) After you clean up the snapshot, the DB size will not automatically shrink. You need to run the commands below:
a) Run reorg, runstats for DB row size:
db2 reorg table <schema name>.<table name> inplace allow write access
db2 reorg indexes all for table <schema name>.<table name> allow write access
db2 runstats on table <schema name>.<table name> and detailed indexes all allow write access
b) Run reorg again against the longlobdata option:
db2 reorg table <schema name>.<table name> longlobdata --> this is important to reorganize the BLOB data. This may take very long time if your BLOB size is very huge
For this case, we need to run the commands for LSW_MANAGED_ASSET_DATA and BPM_COACHNG separately.
Now the DB size should be reduced.
4) But the tablespace utilization rate may still shows the old level of usage if you run the below command:
SQL3: TableSpace utilization
tbsp_type as Type,
substr(tbsp_state,1,20) as Status,
(tbsp_total_size_kb / 1024 ) as Size_Meg,
decimal((float(tbsp_total_size_kb - tbsp_free_size_kb)/ float(tbsp_total_size_kb))*100,3,1)as Percent_used_Space,
int((tbsp_free_size_kb) / 1024 )as Meg_Free_Space
from sysibmadm.tbsp_utilization where tbsp_type='DMS'
That is because the tablespace size and high water mark (hwm) of the tablespace still needs to be reduced.
5) Determine if you need to reduce the hwm and tablespace:
db2pd -db BPMDB -tablespaces
Output will be like below:
We only focus on USERSPACE1 because BPMDB is stored in this tablespace. From the above, you can see there are 696128 free pages in the tablespace. That means the reduce tablespace command would reallocate this space back to the OS.
6) Run the commands:
db2 ALTER TABLESPACE userspace1 LOWER HIGH WATER MARK
db2 ALTER TABLESPACE userspace1 REDUCE max
Then, re-run the db2pd command again:
You can see the total pages shrink and there is 0 free pages left. All the free pages have been given back to the OS.
Now the tablespace utilization should be correct.
1) The 2nd SQL is more accurate to calculate the total size of tables because it includes more than the row size, it has the physical size, LOB fields, etc. You need to check if the large LOB fields are using up the space.
2) Use longblob data to cleanup the BLOB data.
3) Use the db2pd command to determine if the high water mark and tablespace need to be reduced.
- Retrieve Table size full information: ADMINTABINFO administrative view and ADMIN_GET_TAB_INFO table function - retrieve table size and state information
- High Water Mark: Reclaimable storage
- Purge BPM Data: Purging data in IBM Business Process Manager