Configuring and running

The purge utility is provided in the artifacts container for FTM. Copy the files to a computer that you want to run this utility from. For example, the utility is extracted to extract_directory/ibm/ftm/v409/tools/purgeDB directory.

The utility itself is implemented in Java™ and is provided in the purgeDB.jar file.

The following property files are also supplied:
fxhPurgeMsgBundle.properties
This property file has English language error messages that are used by the utility.
purgeDB.properties
This property file has the runtime configuration to control the utility. It also identifies the name of the properties file that defines the database connectivity.
dbConnection4DB2.properties
Connection properties files are examples of the connectivity files that define the database connection and can be referenced from purgeDB.properties.
Note: The properties file that is used to define the database connection must be encrypted for use with the utility. This encryption is to make sure that the database password is not stored in clear text in the file. For more information about encrypting properties files, see Encryption of properties files. The main properties file can be optionally encrypted.
The utility also depends on classes from the FTM installation folder and JDBC drivers for the database:
FTM installation folder classes
  • extract_directory/ibm/ftm/v409/run/classes/icu4j.jar
  • extract_directory/ibm/ftm/v409/run/classes/FTMCoreJava.jar
JDBC drivers for Db2®
db2jcc4.jar - JDBC driver.

Original algorithm

The original algorithm works by using a PURGE_LIST table to store the IDs of the objects that are expired. Identifying the expired records begins by using master transmission records. More records are located by using direct child relationships; for example, batch and transaction. Related records are located by using OBJ_OBJ_REL relationships. The related records are checked iteratively until no more are found.

All IDs for the expired objects that were found are stored in the PURGE_LIST table. Then, the objects and related records are purged. The purged IDs are removed from the PURGE_LIST table.

The original algorithm can be run in the following modes:
  • Discovery
  • Purge
  • Discovery and purge

Original algorithm enhancements

The following enhancements were made to the original algorithm.
  • Added a profile parameter that you can use to define multiple purge profiles. By using multiple profiles, different types of records can be targeted with different criteria.
  • Added a retention period property that supplements the start date and end date properties.
  • Transmission subtypes can be used to locate expired transmissions that are not master transmissions. Combined with the other properties, you can configure the purge utility to isolate and purge sets of objects by the business purpose and with specific retention periods.

New alternative algorithm

The original algorithm had the following fundamental problems that making enhancements to the algorithm couldn't address.
  1. The process is serial by design.
  2. Extra database workload is generated by inserting, deleting, or both, from the PURGE_LIST table.

The new alternative algorithm is intended to increase the efficiency and scalability of the purge utility. This algorithm updates the purge utility by maintaining the lists of IDs in memory and not in the PURGE_LIST table. The expiry identification algorithm is simplified to not locate objects across the OBJ_OBJ_REL records. In the old algorithm, the queries that are used to locate objects across these records can be expensive. Because the new algorithm does not use a central database table, the purge utility can run multiple purge jobs that are simultaneously targeted at specific data sets with different retention periods.

FTM applications can be configured to not log all physical transmissions, transactions, or both. The new algorithm can be used to start transaction level purges where the transmission records are not logged.

Purge utility properties

For a list of the runtime configuration properties for the purge utility, see the following table.

Table 1. Runtime configuration properties for the purge utility
Property name Description Algorithm (purge.algorithm value) See table note
purge.algorithm - Integer

1 = Original algorithm
2 = New alternative algorithm

Moving to algorithm 2 is recommended.
   
purge.application.name Specifies the name of the FTM application that is to be purged. When this property is not specified, purge is available for all applications. All  
purge.subtypesPT This property defines the subtypes that a physical transmission level purge is to purge. It is specified as a comma-separated list. This property replaces the deprecated purge.subtypes property.
To make sure that physical transmission objects that have a subtype of NULL are also purged, add the $Null keyword to the list of subtypes. For example,
purge.subtypesPT=$Null,PAY_ORIG, PAY_ORIG_BAT, PAYMENT_ORIGINATION, . . .
All 1
purge.subtypesTxn This property defines the subtypes that are to be purged in a transaction level purge. It is specified as a comma-separated list.
To make sure that transaction objects that have a subtype of NULL are also purged, add the $Null keyword to the list of subtypes. For example,
purge.subtypesTXN=$Null,PAY_ORIG, PAY_ORIG_BAT, PAYMENT_ORIGINATION, . . .
Alternative (2) 1
purgable.statesPT This property defines the purgeable states for a physical transmission level purge. It is specified as a comma-separated list of states. This property replaces the deprecated purgable.states property. All 2
purgable.statesTxn This property defines the purgeable states for a transaction level purge. It is specified as a comma-separated list of states. These states are used only for transactions that were never assigned to a transmission object or were orphaned due to a prior purge. Alternative (2) 2
purge.retention.days Records that are older than this date are included in the purge. All 3
purge.retention.days.events Event records that are older than this date are purged. Alternative (2) 4
event.references.ignore - true/false

If false, which is the default, purge only those events that the base objects do not reference. If true, purge all events regardless of any existing references.

Alternative (2)  
purge.populate - true/false

Specifies whether the PURGE_LIST table is populated. If the PURGE_LIST table already has entries, setting this property to false skips the population phase during execution. The preexisting list can be purged immediately. Generally, it is advised for this property to be set to true.

Original (1)  
purge.execute - true/false

Specifies whether the objects listed in PURGE_LIST are purged from the database. Set this property to false to allow the PURGE_LIST table to be populated, but not purged immediately. Generally, it is advised for this property to be set to true.

Original (1)  
populate.mode.batch - true/false

For large purge jobs, it is preferable to populate the PURGE_LIST and purge the entries in a series of batches instead of using a single PURGE_LIST to purge everything. Generally, it is advised for this property to be set to true.

All  
populate.batch.size - int

Specifies the number of master objects per iteration that are selected and used to populate the PURGE_LIST table. An appropriate value for this property is highly dependent on the structure of the data that is going to be purged. If the physical transmissions to be purged contain many nested elements, or child objects, a smaller value is appropriate. If the physical transmissions contain a relatively small number of child objects, a larger value is acceptable.

You might need to use a strategy of experimenting and tuning to figure out the correct value to use. Consider starting with 100 as a default value. If the population phase has no problems, increase it. If the population phase appears to be failing to complete, rule out other causes for the problem before you decrease this value.

All  
purge.window.start - String that represents a date, in the format yyyy-mm-dd.

Defines the date from which objects that are newer than the date can be selected for deletion from the database. The transmission objects with a created date that is later than this date, and all of their related objects, can be in the purge window.

All  
purge.window.end - String that represents a date in the format yyyy-mm-dd.

Defines the date from which objects that are older than the date can be selected for deletion from the database. The transmission objects with a created date that is earlier than this date, and all of their related objects, can be in the purge window.

All 3
max.deletion.size - integer

The maximum number of objects that one delete statement deletes from a table. Delete statements are written in the format: delete from *table* where *id* in ( *sequence of IDs*)

All 5
purge.listlimit - integer

The maximum number of parameter markers to be used in the select statements.

Alternative (2) 5
oor.batch size - int

Using the OBJ_OBJ_REL table to discover the new objects that are related to the objects already in the purge list can overwhelm the purge utility for large volumes of data. To address this issue, you can use this property to set a batch size to use. The utility uses multiple SQL queries with a set size to discover the relationships for batches of object IDs. It uses this method instead of finding and returning all related objects by using a single query.

The value of this property cannot be larger than 32,767. A default value that is not larger than 10,000 is advised.

Original (1)  
purge.start.threshold - int

If the purge list is being populated with the purge.populate, purge.execute, and populate.mode.batch properties all set to true, this value limits how large the purge list can grow before the contents are automatically purged. After the automatic purge, the population phase resumes.

Try starting with a value of 5000. Consider increasing this value if the purge phase is running after only a short amount of time spent in the population phase.

Original (1)  
select.oor - Boolean

Specifies whether OBJ_OBJ_REL records are traversed to find related objects.

Original (1)  
select.oor.ignore - string
This string has the following format.
'example','example','example'

Each example represents a relationship type in the OBJ_OBJ_REL table. The objects with this relationship type are ignored and not added to the PURGE_LIST when it is being populated. This property can have one type, many types, or no types at all when it is left blank.

Original (1)  
logging.level - string

Valid values are INFO, WARNING, or SEVERE, for general information, warnings, or errors.

The following logging levels for application output are supported.
  • INFO - all application information, warnings, and errors are output.
  • WARNING - only warning and error messages are output.
  • SEVERE - prints only error messages.
When the utility is configured properly, a value of WARNING or SEVERE results in no console output when the purge utility runs successfully. An invalid or blank value for logging.level defaults to WARNING.
All  
logging.file.location - string

Specify the path to the directory where the log file is written. If this entry is not specified or is left blank, it defaults to the current path.

All  
logging.file.name - string

Specify the name of the log file. If this entry is not specified or is left blank, it defaults to purgeDB.log.

All  
purge.trace - true/false

If this property is set to true and the logging level is INFO, the SQL statements and the execution time are output to the console.

All  
dbConnection.properties.filename - file name

The name of the encrypted properties files that are used to create the JDBC database connection.

All  
purge.retention.days.schedulertask For the scheduler task, the audit log records that are older than the number of days that are specified in this property are purged from the H_OBJ_BASE table. Alternate (2) 6
purge.retention.days.svcparticipant The audit log records for the service participant that are older than the number of days that are specified in this property are purged from the H_OBJ_BASE table. Alternate (2) 6
Table notes:
  1. When the new alternative algorithm is used, a purge job must define one or both of these properties.
  2. When the new alternative algorithm is used, a purge job must define one or both of these properties.
  3. The retention days property is considered a more practical property for production use and takes precedence.
  4. If multiple, concurrent purge jobs are using the new alternative algorithm, have only one job purge the event records to avoid database conflicts.
  5. Typically, a database product can impose a limit such as 32,767. A value that is no larger than 10,000 is advised as a practical limit.
  6. The standard algorithm, 1, also purges these records, but uses the purge.window.start, purge.window.end, purge.retention.days properties.
For the database connection properties file, the following sample template files are supplied:
dbConnection4DB2.properties
This file is the sample for a Db2 database.
The properties that are specified in these files are used to create a JDBC connection to the database.
For more information about the JDBC properties that are supported for Db2, see the IBM Data Server Driver for JDBC and SQLJ topic in the following documentation for your database platform:
  • The Db2 for Linux®, UNIX, and Windows documentation.
These database connection properties files have the following property keys:
connection.url=<connection string>
The JDBC database connection string that contains the hostname, port, and database name.
For Db2, the connection string has the following format: jdbc\:db2\://<hostname>\:<port>/<datasource>.
connection.schema=<schema>
The database schema to be used.
user=<user>
The username to use to connect to the database.
password=<password>
The password to use to connect to the database.
For Db2, the following JDBC properties were added to encrypt the user credentials as they are passed to the database server.
securityMechanism=<security mechanism number>
The supplied template contains securityMechanism=9. For this property, the value of 9 is com.ibm.db2.jcc.DB2BaseDataSource.ENCRYPTED_USER_AND_PASSWORD_SECURITY.
To use this option, the following configuration changes must be enabled on the Db2 Server:
  • Set the AUTHENTICATION property to SERVER_ENCRYPT.
  • Set the DB2AUTH property to JCC_ENFORCE_SECMEC.
encryptionAlgorithm=<1>
The encryption algorithm that is used by the specified securityMechanism. The supplied template contains the value of 1 = Data Encryption Standard(DES) with 56-bit encryption. A more secure algorithm, with a value of 2 = Advanced Encryption Standard(AES) with 256-bit encryption, is also available. However, this algorithm requires the installation of the IBM® Java Cryptology Extension (JCE) files.
Note: If the securityMechanism that is selected does not require encryption, remove or comment out this property. For example, a security mechanism value of 3 = com.ibm.db2.jcc.DB2BaseDataSource.CLEAR_TEXT_PASSWORD_SECURITY doesn't require encryption.
Note: If you connect to Db2 over SSL, and in the nonproduction environment your certificate hostname does not match one of the Db2 instances, you can disable the hostname validation. To do so, follow the steps that are listed.
  • Add sslHostnameValidation=OFF to dbConnection4DB2 properties.
  • Add -Dcom.ibm.db2.jcc.override.sslClientHostnameValidation=OFF to the command that is used to run the tool.

Running the utility

To run the utility, begin by creating a working directory on the disk. For example, C:\purge_work_dir. Copy each of the following files into it, as needed:
  • purgeDB.jar
  • purgeDB.properties
  • fxhPurgeMsgBundle.properties
  • icu4j.jar
  • FTMCoreJava.jar
  • dbConnection4DB2.properties (for Db2)
  • db2jcc4.jar (for Db2)

Use the descriptions for the purge utility properties to edit the database connection properties files and the purgeDB.properties file to work for your environment. Encrypt the database connection properties file and then encrypt the purgeDB.properties file if you want. Remove the nonencrypted version of any property files that were encrypted. For more information about encrypting properties files, see Encryption of properties files.

The purge utility is a Java program that you run from a command line. From the command line, go to the working directory that you created, which is C:\purge_work_dir in this example, and issue the following command:
<java path> -Djdbc.drivers=<JDBC driver> -cp <classpath> com.ibm.fxh.purge.purge <properties file>
where:
  • <java path> is the path and file name of the Java executable file to use. The encryption and decryption functions that are in the purge utility and the encryption tool are sensitive to the version of Java that is used to run them. To avoid incompatibility issues, start each tool with the same IBM-provided version of Java. For example, use the version of the Java executable file that is in the C:\Program Files\IBM\MQSI\9.0.0.6\jre17\bin\java file, which is provided by IBM App Connect Enterprise.
  • <JDBC driver> is the appropriate JDBC driver value. For Db2, use com.ibm.db2.jcc.DB2Driver.
  • <classpath> is the path to all the various necessary resources in the working directory. In this example, C:\purge_work_dir\*.
  • <properties file> is the name of the main properties file without its file extension. In this example, this value is purgeDB for a file that is named purgeDB.properties or a file that is named purgeDB.properties.en3.
The following example shows a sample command for Db2 on the Windows operating system:
"C:\Program Files\IBM\MQSI\9.0.0.6\jre17\bin\java" -Djdbc.drivers=com.ibm.db2.jcc.DB2Driver -cp "C:\purge_work_dir\*" com.ibm.fxh.purge.purge purgeDB