IBM® Business Process Manager is a stateful product that accumulates data over time. As with any stateful product, it's essential to its ongoing health to have a strategy for purging some of that state occasionally. This article explores the areas of IBM BPM where data is collected and the methods that exist today to purge that data. This content is part of the IBM Business Process Management Journal.

Dave Spriet (spriet@ca.ibm.com), BPM Consultant, IBM

Dave Spriet is a BPM Consultant for IBM Software Services for WebSphere, a role that involves BPM architecture reviews, installation, migration, upgrade and customer enablement. Dave has been with IBM since 1998 and has focused primarily on BPM, SOA and connectivity throughout his career. He has a Bachelor's degree (with Honors) in Computer Science and Statistics from McMaster University, Canada.



Phil Coulthard (coulthar@ca.ibm.com), Chief Architect, BPM and SOA Tools, IBM

Phil Coulthard is a member of the BPM and ODM CTO Office, which has architectural oversight of IBM Business Process Manager and Operational Decision Manager. He is the lead architect for the tools for IBM BPM. He has worked in BPM and Integration for the last nine years, prior to which he was the lead architect of compilers and tools for IBM i (known then as iSeries).



July 2014 (First published 04 December 2013)


If data grows without bounds, it can over time lead to disk space issues and to performance issues as database queries take ever longer. In this article, we cover all the areas where IBM BPM collects data, either in a database or in the file system. We point out where information is specific to IBM BPM Advanced or Standard editions, and provide release-specific information. We go back as far as the V7.5.1.1 release of IBM BPM, though much of what is stated here will also apply to the predecessor WebSphere Process Server and WebSphere Lombardi Edition products.

We will cover the following topics in this article:

  • Process Center projects and snapshots
  • Process Server process and task instance data
  • Performance Data Warehouse event data
  • Some additional secondary data that can accumulate, though not as fast as the above
  • IBM Business Monitor, a complementary product to IBM BPM.

Process Center

The Process Center holds projects, which are either deployable process apps or reusable toolkits.

Process apps and toolkits

Process apps and toolkits can be archived from the Manage tab of the Process Center for a project, as shown in Figure 1.

Figure 1. Archiving process apps and toolkits

Archiving a project does not delete it or reclaim its space in the database, but merely marks it so it doesn't show by default in the Process Center UI. To actually delete a project, select to show the archived projects and then delete the project, as shown in Figure 2. To do this, the project must be archived first.

Figure 2. Deleting archived process apps and toolkits

Deleting a project will delete all snapshots and process instances, as well as all BPM Advanced content such as associated BPEL process instances, business-level applications and enterprise applications.

This deletion capability was added in BPM V7.5.1.0 and is available only from the user interface. There is no scripted way to do this.

Snapshots in Process Center

Note: Do not delete the snapshots until all the required fixes have been applied as directed in this Flash alert.

When you edit a process app or toolkit using Process Designer, you are changing a special version or snapshot of it called Current (or Tip historically). At any point in time you can take a new snapshot and give that snapshot a name. Named snapshots are deployable to Process Server, but only the Current snapshot is editable.

What you may not know is that every time you save artifacts in Process Designer an unnamed snapshot is created in the database. This is helpful to enable you to see history, but comes at a price due to the database growth.

You can archive individual named snapshots instead of archiving the entire project and all of its snapshots. You can do this from the Snapshots page of the process app or toolkit, using the dropdown for each snapshot. Once again, this does not delete the snapshot, it merely marks it and hides it.

So how do you delete a snapshot in Process Center? BPM V8.5.0 introduced the BPMSnapshotCleanup wsadmin command, which allows you to delete both named and unnamed snapshots. The named snapshots must first be archived in order to delete them.

This important command has been backported to BPM V8.0.1 as part of the 8.0.1.2 fixpack. If you are on V8.0.1, you are encouraged to update to V8.0.1.2 to get this command, as well as a couple more purging-related commands we'll talk about shortly. On V8.0.1.2, ensure that you have iFixes JR49267, JR48877, and JR49374 installed.

This command has also been backported to the V7.5.1.2 fixpack, so similarly 7.5.1.x customers are encouraged to move to it. On V7.5.1.2, ensure you have iFixes JR48877 and JR49374 installed.
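The eligibility rule described above for BPMSnapshotCleanup (unnamed snapshots can be deleted, while named snapshots must be archived first) can be sketched in a few lines of Python. This is only an illustration of the rule, not the command's implementation: the Snapshot class and the deletable function are hypothetical names that we introduce here and are not part of IBM BPM.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Snapshot:
    """Hypothetical in-memory record; not an IBM BPM data structure."""
    name: Optional[str]      # None models an unnamed snapshot
    archived: bool = False


def deletable(snapshots: List[Snapshot]) -> List[Snapshot]:
    """Return the snapshots that satisfy the rule described in the text."""
    result = []
    for snap in snapshots:
        if snap.name is None:
            # Unnamed snapshots have no archive prerequisite.
            result.append(snap)
        elif snap.archived:
            # Named snapshots qualify only once they have been archived.
            result.append(snap)
    return result


history = [
    Snapshot(name=None),
    Snapshot(name="V1.0", archived=True),
    Snapshot(name="V2.0", archived=False),
]
eligible = deletable(history)
print([snap.name for snap in eligible])
```

In this sketch, V2.0 is left out of the result until it is archived, which mirrors the prerequisite stated for named snapshots.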

The V8.5.0.1 fixpack introduced a new capability to enable automatic clean-up of unnamed snapshots. You can enable this by adding lines such as the following to your 100custom.xml file:

<unnamed-snapshots-cleanup-config>
    <enabled>true</enabled>
    <cleanup-start-time>23:23:59</cleanup-start-time>
    <cleanup-duration-minutes>5</cleanup-duration-minutes>
    <clean-after-number-named-snapshots>4</clean-after-number-named-snapshots>
</unnamed-snapshots-cleanup-config>
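
If you maintain several environments, it can help to generate this fragment rather than hand-edit it. The following Python sketch builds the same elements with the standard xml.etree.ElementTree module. The element names come from the fragment above; the build_cleanup_config function and its parameter names are our own invention for illustration. The sketch also does not show where the fragment must be nested inside 100custom.xml, so check the Information Center topic for that.

```python
import xml.etree.ElementTree as ET


def build_cleanup_config(enabled, start_time, duration_minutes, named_snapshots):
    """Build the cleanup fragment; function and argument names are hypothetical."""
    root = ET.Element("unnamed-snapshots-cleanup-config")
    ET.SubElement(root, "enabled").text = "true" if enabled else "false"
    ET.SubElement(root, "cleanup-start-time").text = start_time
    ET.SubElement(root, "cleanup-duration-minutes").text = str(duration_minutes)
    ET.SubElement(root, "clean-after-number-named-snapshots").text = str(named_snapshots)
    return root


# Same values as the fragment shown in the article.
config = build_cleanup_config(True, "23:23:59", 5, 4)
fragment = ET.tostring(config, encoding="unicode")
print(fragment)
```

Generating the fragment guarantees that every opening tag has its matching closing tag, which is easy to get wrong when editing the file by hand.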

See Deleting unnamed snapshots, automated in the IBM BPM V8.5 Information Center for more details.

In the meantime, there is another poor man's way to delete unnamed snapshots: export a snapshot to a twx file and re-import it into a different Process Center. Unnamed snapshots are not exported.

Advanced content in Process Center

Note: Do not delete the snapshots until all the required fixes have been applied as directed in this Flash alert.

If you have IBM BPM Advanced, and you have process apps and toolkits that contain advanced content, such as BPEL processes, you need to have a strategy for deleting the business-level applications (BLA) and enterprise applications created in the Process Center playback server by the existence of that content.

For every process app or toolkit in Process Center containing a module or library, either directly or inherited through a toolkit, a BLA will be created for the Current snapshot and for every named snapshot of both toolkits and process apps containing advanced content. The BLA will be named <Acronym>-<Snapshot-name>, for example PA1-V1.0.

Within those BLAs, every module or library will produce an EAR that is an asset within it. These BLAs are created on demand, either when doing a playback from Process Designer or publishing to Process Center from Integration Designer, and will remain until the process app or toolkit is inactivated. As you create toolkits with advanced content, consume those toolkits from process apps, and snapshot the toolkits and process apps, this advanced content can quickly accumulate, affecting server start time, memory consumption and general performance.

You typically only really need the BLAs for the Current snapshot of a process app, so you can delete them from the Current snapshot of any toolkits and any named snapshots of either toolkits or process apps. To delete the BLAs, you simply need to "undeploy" your snapshot. You'll see this option on the Snapshots page of Process Center, in the dropdown for the Current snapshot if that snapshot's advanced content has been deployed. For named snapshots, you first need to use the Deactivate action, and then you will see the Undeploy action if there is deployed advanced content. Of course, you can always use the WebSphere administrative console or commands to manually remove the BLAs and EARs. Don't worry about deleting something that may be needed, as these BLAs will be recreated on demand if needed.

If you are using advanced integration services (AIS), which are the bridge between BPMN and BPEL processes, we recommend the façade pattern to reduce the size of the EARs involved. This pattern is described in the developerWorks article Implementing the facade pattern using IBM Business Process Manager Advanced V7.5.


Process Server

The Process Server is where you install process application snapshots and run the processes within them. What you need to think about primarily in the Process Server, in terms of accumulating content, is how to delete these installed snapshots, and how to delete processes once they complete or terminate.

Snapshots in Process Server

You can and will install multiple snapshots of the same process app to a given Process Server. Over time, these snapshots can accumulate and it becomes prudent to delete the ones that are no longer used.

For BPM Advanced customers, it's important to remember that if a process app contains any advanced content, such as a module or library from Integration Designer, a business-level application with EARs will be created for this process app. Conceptually, the BPM content is installed into a Process Server, after which the BPM Advanced content is deployed, which amounts to generating and installing the BLA and constituent EARs. This is the Rosetta Stone needed to understand the related wsadmin commands.

You can only remove snapshots from Process Server in BPM V8.0.1 or higher, with the introduction of the BPMDeleteSnapshot wsadmin command. For this command to work a number of pre-conditions must be met, including:

  • The snapshot cannot have any running instances and cannot be the default snapshot. Use the BPMShowSnapshot command to determine whether either of these is true.
  • The snapshot cannot be active. Use the BPMDeactivateSnapshot command to deactivate the snapshot. For BPM Advanced processes, you also need to use the BPMStop command. These commands prevent new instances of BPMN and BPEL processes from starting and allow existing instances to quiesce.
  • The BPM Advanced content, including the BLA and EARs, needs to be undeployed with the BPMUndeploy command.
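
The pre-conditions above lend themselves to a simple checklist. The following Python sketch models them as a function that reports what still blocks a deletion. It does not call any IBM BPM command: the SnapshotState class, its field names, and the deletion_blockers function are hypothetical, and in practice you would gather this state yourself, for example from the output of BPMShowSnapshot.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SnapshotState:
    """Hypothetical summary of a snapshot's state; field names are illustrative."""
    running_instances: int
    is_default: bool
    is_active: bool
    advanced_content_deployed: bool


def deletion_blockers(state: SnapshotState) -> List[str]:
    """List the pre-conditions from the text that are not yet satisfied."""
    blockers = []
    if state.running_instances > 0:
        blockers.append("snapshot has running instances")
    if state.is_default:
        blockers.append("snapshot is the default snapshot")
    if state.is_active:
        blockers.append("snapshot is active; deactivate it first")
    if state.advanced_content_deployed:
        blockers.append("advanced content is deployed; undeploy it first")
    return blockers


ready = SnapshotState(0, False, False, False)
busy = SnapshotState(3, False, True, True)
print(deletion_blockers(ready))   # empty list: nothing blocks deletion
print(deletion_blockers(busy))
```

Returning a list of reasons, rather than a single yes or no, makes it easier to see which preparatory command still has to be run.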

When you successfully delete a snapshot, note that any business process definition (BPD) instances for it are deleted with it.

Instances in Process Server

There are two types of instances to consider here: user or human task instances, and process instances. This is true for both BPMN BPD processes and BPEL processes. Both task and process instances remain recorded in the database even after they have completed. Hence, it is important to think about occasionally purging older instances.

When you delete process instances, task instances are also deleted. You can also delete BPEL human task instances independently of their processes, but this is not the case for BPD user tasks.

To delete BPD process instances and their associated task instances, beginning in BPM V8.0.1, you can use the BPMProcessInstancesCleanup wsadmin command. This allows you to identify either the specific instances to delete or the date range within which any instances that completed will be deleted. You also identify whether to delete completed, canceled, failed or all types of instances. This command is enhanced in V8.0.1.2 and V8.5.0.1 with additional parameters that specify the maximum duration and the maximum number of instances to delete. This makes it a good candidate for a regularly scheduled cron job, because you can limit both its work and its impact on the system.
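
To make the selection criteria of BPMProcessInstancesCleanup concrete, the following Python sketch filters a list of instances by status and end-date range and caps the result at a maximum count. It is a model of the idea only, not the command itself. The Instance class, the select_for_cleanup function and all parameter names are hypothetical, the oldest-first ordering is our assumption, and the maximum duration limit is left out for brevity.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Instance:
    """Hypothetical process instance record, for illustration only."""
    instance_id: int
    status: str                  # e.g. "completed", "canceled", "failed", "active"
    ended: Optional[datetime]    # None while the instance is still running


def select_for_cleanup(instances, statuses, ended_after, ended_before, max_instances):
    """Pick instances in the given states that ended inside the date range."""
    matching = [
        inst for inst in instances
        if inst.status in statuses
        and inst.ended is not None
        and ended_after <= inst.ended < ended_before
    ]
    matching.sort(key=lambda inst: inst.ended)   # oldest first (our assumption)
    return matching[:max_instances]              # cap the work done in one run


instances = [
    Instance(1, "completed", datetime(2013, 1, 10)),
    Instance(2, "failed", datetime(2013, 2, 5)),
    Instance(3, "completed", datetime(2013, 3, 1)),
    Instance(4, "active", None),
    Instance(5, "completed", datetime(2014, 1, 1)),
]
batch = select_for_cleanup(
    instances, {"completed"}, datetime(2013, 1, 1), datetime(2013, 12, 31), 1
)
print([inst.instance_id for inst in batch])
```

Capping each run means that a large backlog is worked off over several scheduled runs instead of in one long-running operation.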

This command does not exist on V7.5.x, but you can use a supplied stored procedure LSW_BPD_INSTANCE_DELETE to delete explicitly identified process instances. Be aware there is an interim fix available for V7.5.1.1 that considerably improves the performance of this stored procedure. For details see APAR JR46453. This fix is rolled into the V7.5.1.2 fixpack.

BPM Advanced provides a few options for deleting Business Process Choreographer (BPC) human task and process instances:

  1. When modeling the human tasks and BPEL processes in Integration Designer, specify to automatically delete instances when complete.
  2. Use the BPC Explorer to delete tasks or processes individually.
  3. Create a custom utility using the BPC APIs.
  4. Use the supplied jython scripts deleteCompletedTaskInstances.py or deleteCompletedProcessInstances.py to delete a batch of task or process instances by state, owning user, or completion date.
  5. Use the Cleanup Service in the Human Task Manager or Business Flow Manager administrative console pages, to schedule jobs to automatically delete task or process instances. You can specify when to run, how long to run, how many instances to delete, and which instances to delete via state and date criteria, as shown in Figure 3.

Figure 3. Cleanup Service scheduling for BPEL process instance purging

In BPM Advanced, there are some other things that can be purged, including:

  • BPM process and human task templates that are no longer needed
  • Audit log entries
  • Failed messages in the hold queue
  • Unused shared work items

Information about how to delete these can be found in the Information Center topic Cleanup procedures for Business Process Choreographer. We also highly recommend the developerWorks series Operating a WebSphere Process Server environment.

Durable subscription events in Process Server

Message events in an intermediate message activity of a BPD can be made durable, which you enable by checking Durable Subscription in the properties, as shown in Figure 4.

Figure 4. Specifying a durable subscription for intermediate events

If specified in Process Designer, these durable messages will accumulate and require occasional clean-up, even if you select Consume Message. Historically, there was no built-in way to do this clean-up, but this has been addressed in V7.5.1.2, V8.0.1.2, and V8.5.0.1 with a new wsadmin command named BPMDeleteDurableMessages, which takes three parameters:

  • olderThan: only events older than this number of days will be deleted
  • maximumDuration: the command will only run for this amount of time
  • transactionSlice: the number of events to delete per transaction

For example:
BPMDeleteDurableMessages {-olderThan 30 -maximumDuration 60 -transactionSlice 100}
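
If you schedule this command from a wsadmin Jython script, you need the options as a single bracketed string. The following sketch builds that string from the three parameters described above. The durable_message_args helper is a name we made up, rejecting negative values is our own precaution, and the AdminTask call shown in the comment assumes the usual wsadmin Jython convention for AdminTask commands, so verify it against your release.

```python
def durable_message_args(older_than, maximum_duration, transaction_slice):
    """Build the option string for BPMDeleteDurableMessages (hypothetical helper)."""
    values = {
        "olderThan": int(older_than),
        "maximumDuration": int(maximum_duration),
        "transactionSlice": int(transaction_slice),
    }
    for name, value in values.items():
        if value < 0:
            # Our own sanity check; the product may apply different rules.
            raise ValueError("%s must not be negative" % name)
    return "[-olderThan %d -maximumDuration %d -transactionSlice %d]" % (
        values["olderThan"],
        values["maximumDuration"],
        values["transactionSlice"],
    )


args = durable_message_args(30, 60, 100)
print(args)

# Inside wsadmin (Jython), assuming the standard AdminTask calling convention:
#   AdminTask.BPMDeleteDurableMessages(args)
```

Building the string in one place keeps the parameter names consistent across every scheduled job that uses them.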


Performance Data Warehouse

BPD processes can support tracking, which means events are sent to the Performance Data Warehouse and logged in its database. How many events are sent is primarily determined by whether or not your BPD has autotracking turned on, which has been the default for new BPDs.

With autotracking on, the Performance Data Warehouse can quickly accumulate a lot of data. Unfortunately, there has historically not been any product-supplied and supported way to delete any of this data. As a result, many customers have resorted to custom SQL code that directly accesses the database to delete or move certain data according to age or state criteria.

Alternatively, one can completely remove all Performance Data Warehouse data by dropping and recreating the tables, as documented in these two technical notes:

The good news is that there is now product support for pruning selective data from the Performance Data Warehouse, supplied as part of the 8.5.0.1, 8.0.1.2, and 7.5.1.2 fix packs. With these releases, the perfDWTool gets a new option named prune to allow purging of data that is beyond a given age in days, as shown here:
perfDWTool.sh -u uid -p pwd -nodeName node prune -daysOld days

This will delete all data older than the specified number of days. Note that this command needs to run from an active node in the cluster and all members of the cluster should be running. You should try to run it when the server is least busy. Its operation can be affected by three new settings you can override in 100custom.xml:

  • prune-batch-size: The number of records to be deleted in a single prune operation. The default value is 1000.
  • prune-operation-time-box: The amount of time the operation will run, in seconds. The default is 10800 or 3 hours.
  • prune-operation-time-box-retry: The number of times the operation will be tried. The default is 4 such that it will retry 3 times.
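
To show how a batch size and a time box interact, the following Python sketch simulates a prune over a list of timestamps. It is our own simplified model, not the perfDWTool implementation: the prune function and its parameters are hypothetical, the defaults merely echo the values listed above, and the retry setting is not modeled.

```python
import time
from datetime import datetime, timedelta


def prune(records, now, days_old, batch_size=1000, time_box_seconds=10800,
          clock=time.monotonic):
    """Delete timestamps older than days_old, one batch at a time.

    Returns (remaining_records, deleted_count). All names are illustrative.
    """
    cutoff = now - timedelta(days=days_old)
    deadline = clock() + time_box_seconds
    remaining = sorted(records)          # oldest first
    deleted = 0
    # Stop when nothing old is left or when the time box is used up.
    while remaining and remaining[0] < cutoff and clock() < deadline:
        batch = [stamp for stamp in remaining[:batch_size] if stamp < cutoff]
        remaining = remaining[len(batch):]   # the old records form a sorted prefix
        deleted += len(batch)
    return remaining, deleted


now = datetime(2014, 7, 1)
records = [now - timedelta(days=age) for age in (1, 10, 40, 50, 90)]
kept, deleted = prune(records, now, days_old=30, batch_size=2)
print(deleted, len(kept))
```

A smaller batch size means more, shorter delete operations, while the time box bounds the total run regardless of how much old data remains.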

Additional data

While most data accumulation comes from processes, tasks and events, there are additional areas of accumulation you need to consider over time, which we'll cover in this section.

Document attachments

Heritage Coaches support uploading documents to an internal document store via APIs as well as pre-supplied controls. These documents are not separately deletable, but they are associated with process instances and as such they will be deleted when process instances are deleted. Alternatively, you can write your own service to delete these using the JavaScript method deleteAllVersions in TWDocument.

When next generation Coaches were first introduced in V8.0.0 and V8.0.1, there was no equivalent built-in document attachment store capability supplied, although there is support for accessing external enterprise content management (ECM) systems. However, this built-in attachment capability was re-introduced in V8.5.0 using the same Coach views that are used for external ECM systems. As with heritage Coaches, there is no explicit means of deleting these attachments, but they are deleted when their parent process instances are deleted, and can be programmatically purged using deleteAllVersions.

Temp directory

During install, and during some operations at runtime, files are placed in the system temp directory (%temp%). These can accumulate over time, so you should keep an eye on it and have a policy to occasionally purge it.
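
One way to implement such a policy is a small script that removes files older than a given age. The following Python sketch is a generic example that uses only the standard library. It is not an IBM-supplied tool, the purge_old_files name and the 30-day threshold are our own choices, and the demonstration runs against a scratch directory rather than your real temp directory. Before pointing it at a real system, consider which files may still be in use by running servers.

```python
import os
import tempfile
import time


def purge_old_files(directory, max_age_days, now=None):
    """Remove files under directory last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                if os.path.getmtime(path) < cutoff:
                    os.remove(path)
                    removed.append(path)
            except OSError:
                # The file vanished or is locked by another process; skip it.
                continue
    return removed


# Demonstration in a scratch directory with one stale and one fresh file.
with tempfile.TemporaryDirectory() as demo:
    old_file = os.path.join(demo, "old.tmp")
    new_file = os.path.join(demo, "new.tmp")
    for path in (old_file, new_file):
        open(path, "w").close()
    stale = time.time() - 40 * 86400
    os.utime(old_file, (stale, stale))     # pretend it is 40 days old
    removed = purge_old_files(demo, max_age_days=30)
    survivors = sorted(os.listdir(demo))

print([os.path.basename(path) for path in removed], survivors)
```

The script skips files it cannot remove rather than failing, so a locked file does not abort the rest of the clean-up.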


IBM Business Monitor

If you use IBM Business Monitor, you should also occasionally purge data there because it can accumulate quite quickly. There is mature support in IBM Business Monitor for both deleting and archiving data. Business Monitor comes with an event purging capability in the administrative console that you configure per monitor model, as shown in Figure 5.

Figure 5. Purging Event Console page in IBM Business Monitor

You can specify the age of the instances to be deleted, and optionally identify a directory in which to archive the purged instances as a CSV file.

Note: This function only purges terminated instances; it never removes in-flight instances, regardless of age.
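
The purge rule just described can be modeled in a few lines. The following Python sketch removes only terminated instances older than a given age and optionally writes them to a CSV archive. It is not Business Monitor code: the purge_and_archive function, the dictionary keys and the CSV columns are hypothetical, and measuring age from the termination time is our assumption.

```python
import csv
import io
from datetime import datetime, timedelta


def purge_and_archive(instances, now, min_age_days, archive=None):
    """Return the instances that are kept; archive the purged ones as CSV rows."""
    cutoff = now - timedelta(days=min_age_days)
    writer = csv.writer(archive) if archive is not None else None
    kept = []
    for inst in instances:
        terminated_at = inst["terminated_at"]
        if terminated_at is not None and terminated_at < cutoff:
            # Terminated long enough ago: purge, and archive if requested.
            if writer is not None:
                writer.writerow([inst["id"], terminated_at.isoformat()])
        else:
            # In-flight instances are never purged, regardless of age.
            kept.append(inst)
    return kept


now = datetime(2014, 7, 1)
instances = [
    {"id": "A", "terminated_at": datetime(2014, 1, 1)},
    {"id": "B", "terminated_at": datetime(2014, 6, 25)},
    {"id": "C", "terminated_at": None},   # still in flight
]
archive = io.StringIO()
kept = purge_and_archive(instances, now, min_age_days=30, archive=archive)
print([inst["id"] for inst in kept])
print(archive.getvalue().strip())
```

Here instance B survives because it terminated too recently, and instance C survives because it has not terminated at all.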

This purging and archiving can be done once, as described in Purging and archiving instance data, or it can be scheduled to run regularly, as described in Schedule purging and archiving instance data.


Conclusion

In this article, we pointed out the key areas of accumulating data you need to be aware of in IBM Business Process Manager and IBM Business Monitor, and described how to purge that data. Table 1 summarizes the data types and the options for purging it in various releases.

Table 1. Options for purging various types of data based on release
Data | V7.5.1.x | V8.0.1.x | V8.5.0.x
Process apps and toolkits in Process Center | Archive and delete with Process Center UI | Archive and delete with Process Center UI | Archive and delete with Process Center UI
Snapshots in Process Center | BPMSnapshotCleanup command in V7.5.1.2 | BPMSnapshotCleanup command in V8.0.1.2 | BPMSnapshotCleanup command, plus schedulable service added in V8.5.0.1
Advanced BLA and EARs in Process Center | Undeploy with Process Center UI to remove BLA and EARs | Undeploy with Process Center UI to remove BLA and EARs | Undeploy with Process Center UI to remove BLA and EARs
Snapshots in Process Server | BPMDeleteSnapshot command in V7.5.1.2 | BPMDeleteSnapshot command in V8.0.1.0 | BPMDeleteSnapshot command
BPD Process Instances | LSW_BPD_INSTANCE_DELETE stored procedure | BPMProcessInstancesCleanup command in V8.0.1.2 | BPMProcessInstancesCleanup command, enhanced in V8.5.0.1
BPEL Process Instances | deleteCompletedProcessInstances.py script or Cleanup Service in BFM | deleteCompletedProcessInstances.py script or Cleanup Service in BFM | deleteCompletedProcessInstances.py script or Cleanup Service in BFM
BPD durable events | BPMDeleteDurableMessages command in V7.5.1.2 | BPMDeleteDurableMessages command in V8.0.1.2 | BPMDeleteDurableMessages command
Performance Data Warehouse | prune command in V7.5.1.2 | prune command in V8.0.1.2 | prune command in V8.5.0.1
Business Monitor | The Purge and Archive Instance Data console action and schedulable service | The Purge and Archive Instance Data console action and schedulable service | The Purge and Archive Instance Data console action and schedulable service

We hope this information will help you maintain a healthy BPM system. Note that at the time of this writing, the latest fixpacks were 7.5.1.2, 8.0.1.2 and 8.5.0.1. As new versions of IBM BPM become available, check out the "What's New" sections of their Information Centers to see if any additional capabilities have been added regarding data purging. You can find all versions of the BPM Information Centers at the IBM BPM Library page.
