This topic applies only to the IBM Business Automation Workflow Advanced
configuration.

BPEL process archive overview

Last updated on 2025-03-13 12:15

The Business Process Archive Manager is an optional component that allows you to use a script to move completed BPEL process instances and human tasks from the Business Process Choreographer database to an archive database. By archiving regularly, you can prevent the runtime database from filling up with old objects, which can degrade database performance over time. You can use the Business Process Archive Explorer or the Business Process Archive Manager API to access processes and tasks that were moved to the archive database. Because it is not possible to move data from an archive database back to a runtime database, this archiving facility does not provide any backup protection.

Architecture

The BPEL process archive facility consists of the following elements:

Business Process Archive Manager
Business Process Archive Explorer
Business Process Archive database
The archive.py script
Business Process Archive Manager EJB API

The following figure illustrates the relationships between the Business Process Archive Manager configuration, a Business Process Choreographer configuration, and their databases.

Business Process Archive Manager

To create a Business Process Archive Manager, you must add entries to the properties file before you create the deployment environment. For more information about editing the properties file, see Configuring Business Process Archive Manager.

The following conditions apply:

You can create a Business Process Archive Manager only on the support cluster in a three-cluster setup.
A Business Process Choreographer configuration can use only a Business Process Archive Manager configuration that is in the same cell.
A Business Process Archive Manager configuration can be used to archive data from only one Business Process Choreographer configuration.
Each Business Process Archive Manager configuration must have its own Business Process Archive database.

Business Process Archive database

Each Business Process Archive Manager requires its own database. The database must be of the same type and structure as the Business Process Choreographer database. The default name for the archive database is BPARCDB.

The archive.py administrative script

A WebSphere® system administrator can run this script to archive data from the runtime database of one Business Process Choreographer configuration to the archive database of a Business Process Archive Manager configuration. You can specify various parameters to control which instances are archived, how many to archive in total, and how many to archive in each database transaction. The source and destination are specified by their deployment servers or clusters. For more information about this script, see Archiving completed BPEL process and task instances.
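The relationship between the total limit and the per-transaction limit can be pictured with a small Python model. This is only an illustrative sketch, not part of archive.py: the function name plan_batches and its parameters are hypothetical and do not correspond to actual script options.

```python
def plan_batches(eligible_count, max_total, per_transaction):
    """Return the batch sizes needed to archive up to max_total instances.

    Hypothetical model: each batch stands for one database transaction.
    """
    if per_transaction <= 0:
        raise ValueError("per_transaction must be positive")
    # Never archive more than what is eligible or more than the total limit.
    remaining = min(eligible_count, max_total)
    batches = []
    while remaining > 0:
        size = min(per_transaction, remaining)
        batches.append(size)
        remaining -= size
    return batches
```

In this model, a smaller per-transaction value produces more, shorter transactions for the same total number of archived instances.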

Business Process Archive Manager EJB API support

Only a subset of the actions that are available through the Business Flow Manager and Human Task Manager EJB APIs can be used against a Business Process Archive Manager configuration. These actions read and delete process instances and human tasks that are in the archive database. The other APIs are not supported by the Business Process Archive Manager.

A new method, OperationMode getOperationMode(), indicates whether the client is connected to a Business Process Choreographer configuration or a Business Process Archive Manager configuration. You can use this method to write custom clients that connect to both runtime configurations and archive configurations and operate appropriately on each.
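The idea of a client that adapts to the operation mode can be sketched as follows. The real API is a Java EJB interface; this Python model only mirrors the branching logic. The enumeration values, the FakeConnection class, and the action labels are all assumptions made for illustration, and the runtime action set is deliberately simplified. Refer to the Javadoc for the actual constants and methods.

```python
from enum import Enum


class OperationMode(Enum):
    # Hypothetical stand-ins; the real constants are defined in the Javadoc.
    RUNTIME = "runtime"
    ARCHIVE = "archive"


class FakeConnection:
    """Stand-in for a client connection (not a real API)."""

    def __init__(self, mode):
        self._mode = mode

    def get_operation_mode(self):
        return self._mode


def supported_actions(connection):
    """Choose the actions that a custom client should offer."""
    if connection.get_operation_mode() is OperationMode.ARCHIVE:
        # An archive configuration supports only reading and deleting.
        return {"read", "delete"}
    # Simplified placeholder for the wider set available at run time.
    return {"read", "delete", "create", "update"}
```

A client built this way can hide or disable actions such as creating and updating instances when it detects an archive configuration.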

For more information about the Business Process Archive Manager API, see the Javadoc for the packages com.ibm.bpe.api and com.ibm.task.api.

Business Process Archive Explorer

The Business Process Archive Explorer is similar to the Business Process Choreographer Explorer except that it connects to an archive database associated with a Business Process Archive Manager configuration. The Business Process Archive Explorer is configured when you configure the Business Process Archive Manager.

Depending on your authorization, you can use the Business Process Archive Explorer to browse instances and, if permitted, delete them. You cannot update instances or create new instances.

Authorization

The actions that you can perform with the Business Process Archive Manager EJB API or the Business Process Archive Explorer depend on the following Java™ Platform, Enterprise Edition (Java EE) roles:

Users who are in the Business Process Archive Manager system monitor role can read and view all process instances and all task instances in the archive database.
Users who are in the Business Process Archive Manager system administrator role can also delete any top-level process instances and top-level task instances in the archive database.
Users who are not in the system monitor or system administrator roles can see only the instances that they created or started themselves, but they cannot view any details about the instances.
No one (not even users in the system administrator role) can modify any of the data that is associated with any instances in the archive database.
Instance-based authorization information, such as the potential owner or reader information, is not archived. Therefore, this data is not available in the archive. The only exception to this rule is the information about the starter and creator of processes and tasks.
Users must be in the WebClientUser role to use the Business Process Archive Explorer.
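The role rules above can be summarized as a small decision function. This is a sketch under stated assumptions: the role strings and action labels are made up for illustration and are not real Java EE role identifiers, and the WebClientUser requirement for the Explorer is left out.

```python
def archive_permissions(roles, is_starter_or_creator=False):
    """Return the actions that a user may perform on one archived instance.

    The role names and action labels are illustrative only.
    """
    if "administrator" in roles:
        # Administrators can read everything and delete top-level instances.
        return {"list", "view_details", "delete_top_level"}
    if "monitor" in roles:
        # Monitors can read and view all instances but cannot delete.
        return {"list", "view_details"}
    if is_starter_or_creator:
        # Other users see only their own instances, without any details.
        return {"list"}
    return set()
```

No branch returns a modify action, which reflects the rule that archived data is read-only for everyone.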

Which data is archived

Only top-level process instances and top-level stand-alone human task instances that have reached one of the end states (Finished, Terminated, Failed, or Expired) can be moved to the archive database. When a top-level instance is archived, certain data is also moved with it to the archive, and other data is deleted.

For completed top-level process instances, including business state machine instances:

Instance data, such as activities, variables, inline human tasks, input messages, and output messages, is moved.
Child processes and related data are moved recursively.
If any related metadata such as process templates and task templates are not already in the archive database, a copy of them is created.
Query tables and stored queries are neither moved nor copied to the archive database.
Work items that are associated with an archived instance are deleted without being archived.

For completed top-level stand-alone human tasks:

Instance data, such as input messages and output messages, is moved.
Escalation instances are moved.
Child tasks, including follow-on tasks, are moved.
If related metadata such as task templates are not already in the archive database, a copy of them is created.
Work items that are associated with an archived instance are deleted without being archived.
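The eligibility rule at the start of this section (top level, and in an end state) can be expressed as a simple check. This is a minimal sketch: the dictionary keys kind, state, parent, and stand_alone are a hypothetical representation and do not reflect the actual database schema or API objects.

```python
END_STATES = {"Finished", "Terminated", "Failed", "Expired"}


def is_archivable(instance):
    """Return True if the instance can be archived on its own.

    instance is a dict with the hypothetical keys 'kind' ('process' or
    'task'), 'state', 'parent' (None for top level), and 'stand_alone'.
    """
    if instance.get("parent") is not None:
        # Child processes and child tasks move only with their parent.
        return False
    if instance["kind"] == "task" and not instance.get("stand_alone", False):
        # Inline human tasks move with their process, not separately.
        return False
    return instance["state"] in END_STATES
```

Instances that fail this check are not necessarily left behind: child processes, child tasks, and inline human tasks are moved when their top-level parent is archived.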

Metadata

Extra metadata, such as process and task template information, is copied to the archive when necessary to allow the archived data to be interpreted and displayed correctly. The metadata in the archive database is deleted when it is no longer required, that is, when the last process instance or human task that references the metadata is deleted.
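This copy-on-demand and delete-when-unreferenced behavior resembles reference counting. The following sketch models it in memory; the class and method names are hypothetical, and the real archive database tracks these references in its own way.

```python
class ArchiveMetadata:
    """Toy model of template metadata in an archive database."""

    def __init__(self):
        # Maps each template name to the IDs of instances that reference it.
        self._references = {}

    def archive_instance(self, instance_id, template):
        # The template entry is created only if it is not already present.
        self._references.setdefault(template, set()).add(instance_id)

    def delete_instance(self, instance_id, template):
        referencing = self._references.get(template)
        if referencing is None:
            return
        referencing.discard(instance_id)
        if not referencing:
            # The last referencing instance is gone, so drop the metadata.
            del self._references[template]

    def templates(self):
        return set(self._references)
```

For example, if two archived instances share one template, deleting the first instance leaves the template in place, and deleting the second removes it.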

What is not archived

Other Business Process Choreographer data, such as configuration data, XSD and WSDL artifacts, SCA modules, applications, work baskets, business categories, business rules, messages, and audit trail data, cannot be moved to the archive.