Cleaning up the database with the DBPurge script

It is a good maintenance practice to keep the IBM® Security Identity Manager Server database to a manageable size.

You can use the DBPurge script to clean up the audit trail in the database by removing records that are related to completed workflow processes. The script handles only removal, not archiving, of these records. Use the script sparingly to avoid any unforeseen problems.

Command syntax

Use the following command to delete historical workflow audit data, non-workflow audit events, and reconciliation reporting entries from the database.
DBPurge.[sh|cmd] -age num_days | -date yyyy-mm-dd[-HH:mm]
           [-grouping group_size]
           [-workflow wf_flag [-process_type proc_type]]
           [-audit audit_flag] [-recon recon_flag] [-threads count]
Depending on your operating system, the command is located in one of the following directories:
  • ISIM_HOME\bin\win
  • ISIM_HOME/bin/unix
-age num_days
Specifies the age by the number of days of the records you want to remove. Records that were completed more than or equal to num_days ago are eligible for cleanup. The value must be greater than or equal to zero. A value of zero removes all data that is currently in the system.
-date date
Specifies the deletion date and optional time in an alternative way. For example, '2010-08-15-22:00'. All records created on this date or earlier are deleted, based on the server timezone.
-grouping group_size
Optional: Specifies the number of deleted entries in a single commit. The group size must be between 1 - 100, where 50 is the default value.
-workflow wf_flag
Optional: Determines whether workflow data is removed. The flag is Boolean, and its default setting is true.
-process_type proc_type
Optional: Specifies a two-character parameter, which restricts the deletion of processes to the specified type.

For example, 'AP' removes only processes of type Account Password Change. This parameter is relevant only when workflow data is removed. If you do not specify this parameter, then processes of any type are removed if they match the other parameters. For more information about the valid values, see the TYPE column description in the Database and Directory > Server Schema Reference > Database tables reference > Workflow tables > PROCESS table section on the Identity Manager documentation.

-audit audit_flag
Optional: Determines whether non-workflow data is removed. The flag is Boolean, and its default setting is true.
-recon recon_flag
Optional: Determines whether historical reconciliation data is removed. The flag is Boolean, and its default setting is true.

This option does not remove the reconciliation process-related data. To remove the reconciliation processes-related data (specifically, the data that appears in View requests), you must specify -workflow 'true' and -process_type 'RC' with the -recon option. For more information, see the -workflow and -process_type parameters.

-threads count
Optional: Specifies the number of threads to be created by the DBPurge process for the DB2 database. Allowable values are 1 - 8. The default value is 4.
Note: You must set at least one of the data types such as workflow, audit, or reconciliation to true.

After you run the command, it reports the number of primary workflow, audit-based records, and reconciliation data that were removed. It also shows any errors or warnings.

Processing description

The following description illustrates the cleanup processing that occurs when you run DBPurge. Additional archive utilities can be built and run before running DBPurge. The exact implementation might vary.

DBPurge runs the following queries to locate the primary records to remove:
1. SELECT ID FROM PROCESS WHERE COMPLETED <= timestamp
2. SELECT ID FROM AUDIT_EVENT WHERE TIMESTAMP <= timestamp AND WORKFLOW_PROCESS_ID IS NULL
3. SELECT RECONID, ACCOUNTID FROM RECONCILIATION_INFO WHERE RECONID IN (SELECT RECONID FROM 
RECONCILIATION WHERE COMPLETED <= timestamp)
4. SELECT RECONID FROM RECONCILIATION WHERE COMPLETED <= timestamp

The value of timestamp is based on the specified -age parameter and uses the Identity Manager date format yyyy-MM-dd HH:mm:ss:SSS GMT. As the primary records are selected, the data is removed along with data from the secondary, dependent tables that reference these identifiers. The deletion is done in groups.

An adjusted age specification supports consistency so that the record age accurately reflects the time zone of the record time stamps. This strategy supports consistent handling of record time zones. The following values are valid:
0
Deletes any records that completed before the current time.
1
Deletes any records completed before exactly 24 hours ago.

This utility includes multi-threaded deletion. For all databases, separate threads and database connections to read record identifiers and to carry out deletions. For DB2® databases, multiple threads carries out the deletion and improve performance. Each thread requires its own database connection. The utility fails if the appropriate number of database connections is not available. For DB2 databases, DBPurge requires five connections; for other databases, it requires only two.

The following example is a high-level version of the statements for each table, and it illustrates the rows that are removed from each table.

Example

The following delete statements remove rows that reference identifiers from query (1) and from the PROCESS table:

DELETE FROM WORKITEM WHERE PROCESS_ID = ?
DELETE FROM ACTIVITY_LOCK WHERE PROCESS_ID = ?
DELETE FROM PROCESSLOG WHERE PROCESS_ID = ?
DELETE FROM PROCESSDATA WHERE PROCESS_ID = ?
DELETE FROM PENDING WHERE PROCESS_ID = ?
DELETE FROM PASSWORD_TRANSACTION WHERE PROCESS_ID = ?
DELETE FROM ACTIVITY WHERE PROCESS_ID = ?
DELETE FROM WORKFLOW_CALLBACK WHERE PROCESS_ID = ?
DELETE FROM SYNCH_POINT WHERE PROCESS_ID = ?
DELETE FROM AUDIT_MGMT_PROVISIONING WHERE EVENT_ID IN 
   (SELECT ID FROM AUDIT_EVENT WHERE WORKFLOW_PROCESS_ID = ?)
DELETE FROM AUDIT_MGMT_TARGET WHERE EVENT_ID IN 
   (SELECT ID FROM AUDIT_EVENT WHERE WORKFLOW_PROCESS_ID = ?)
DELETE FROM AUDIT_MGMT_DELEGATE WHERE EVENT_ID IN 
   (SELECT ID FROM AUDIT_EVENT WHERE WORKFLOW_PROCESS_ID = ?)
DELETE FROM AUDIT_EVENT WHERE WORKFLOW_PROCESS_ID = ?
DELETE FROM SCHEDULED_MESSAGE WHERE REFERENCE_ID = ?
DELETE FROM LCR_INPROGRESS_TABLE WHERE CHILD_ID = ?
DELETE FROM PROCESS WHERE ID = ?

The following delete statements remove rows that reference identifiers from query (2) and from the AUDIT_EVENT table:

DELETE FROM AUDIT_MGMT_PROVISIONING WHERE EVENT_ID = ?
DELETE FROM AUDIT_MGMT_TARGET WHERE EVENT_ID = ?
DELETE FROM AUDIT_MGMT_DELEGATE WHERE EVENT_ID = ?
DELETE FROM AUDIT_EVENT WHERE ID = ?

The following delete statements remove rows that reference identifiers from query (3) and from the RECONCILIATION_INFO table:

DELETE FROM RECONCILIATION_INFO WHERE RECONID = ? AND ACCOUNTID = ?

The following delete statements remove rows that reference identifiers from query (4) and from the RECONCILIATION table:

DELETE FROM RECONCILIATION WHERE RECONID = ?