Skip to main content

Enhanced audit log data analysis and query for BPEL processes with Process Choreographer 5.1

Frank Neumann (frank_neumann@de.ibm.com), Business Process Choreographer Development, IBM, Software Group
Author photo
Frank Neumann joined IBM in 1997 and he has been with the Process Choreographer development team since the very beginning. He is team lead and architect for the Process Choreographer database access layer and work item manager. Frank holds a master's degree in computer science from the University of Karlsruhe.
Anke Robeller (anke@de.ibm.com), Business Process Choreographer Development, IBM, Software Group
Author photo
Anke Robeller works in the Process Choreographer development team in the IBM development lab in Boeblingen. She joined IBM Germany in 1999 and has been working in the development of the Process Choreographer navigation component since 2001. Anke holds a master's degree in computer science from the University of Stuttgart, Germany.

Summary:  This article describes how to exploit the audit log data written by the business process choreographer, part of IBM® WebSphere® Business Integration Server Foundation V5.1.x, for Business Process Execution Language (BPEL) processes.

Date:  21 Feb 2005
Level:  Intermediate
Activity:  593 views

Introduction

The business process choreographer in IBM WebSphere Business Integration Server Foundation V5.1 writes business-relevant information about navigated Business Process Execution Language (BPEL) processes to a database audit log table. This information can be queried and analyzed using standard SQL statements for the underlying database system. This article describes how the audit log is set up, and what data is written to the audit log table. It also provides information on the best practices for using this data to answer typical statistical and historical questions regarding business processes.

The article refers to IBM WebSphere Business Integration Server Foundation V5.1.x and BPEL processes. For information regarding prior versions of the process choreographer processes, see Resources.


Setting up the business process for auditing

When auditing is enabled, the process engine generates audit events during the process lifetime. Each audit event leads to a new row in the audit log table (AUDIT_LOG_T). In contrast to other process choreographer database tables, the audit log entries are not deleted automatically when a process completes or on a delete request. You can, therefore, use the audit log to query information about active processes and to track historical data about processes that ran in the past.

An audit event is associated with an entity of the process. These entities are:

  • Process template
  • Process instance
  • Activity
  • Link
  • Variable

You can enable audit logging for each of these entities when you model the process in IBM WebSphere Studio Application Developer Integration Edition. For processes, this is done on the Server tab of the business process properties. Ensure that the Business Relevant attribute is checked, as shown in Figure 1.


Figure 1. Enable auditing for a process
Figure 1. Enable auditing for a process

You can enable auditing for each individual activity and variable of a business process in the Server settings of the property pages by checking the Business Relevant attribute (Figures 2 and 3). You can turn on activity and variable auditing independently of process level auditing. This means that activities and variables can be audited even if auditing for the process is turned off.


Figure 2. Enable auditing for an activity instance
Figure 2. Enable auditing for an activity instance

Figure 3. Enable auditing for a variable
Figure 2. Enable auditing for an activity instance

Auditing is enabled for links if the corresponding flow activity (the flow where the link is defined) has the business relevant attribute set to true.

When an enterprise application that contains a business process is installed or uninstalled, an audit log entry is generated. This behavior cannot be disabled.

For activities, the business relevant option has additional effects. If the business relevant attribute is set for an activity, the activity instance is always persisted. As a result, the activity is shown in the instance monitor of the process choreographer Web client. If the Common Event Infrastructure (CEI) component is installed and configured, the business relevant attribute for processes, activities, and variables also triggers the production of CEI events. These events are a subset of the audit events; they are generated at the same time as audit events and carry the same information.


Navigation and states of a business process

An audit event is generated whenever something happens that is of significance to the business process. This event generation is tightly coupled with process and activity state transitions.

Process instance events

Figure 4 shows the state transitions of a business process and the audit events that are produced. When a process is started, the state changes to running and a "process started" event (21000) is generated. If the process ends successfully, this leads to the finshed state and a "process completed" event (21020) is written. When a fault reaches the process boundary the process state changes to failing ("process failing" event, 42010) until all of the active and long-running activities and active scopes are terminated. Finally, the process is in the failed state and a "process failed" event (42001) is logged.

A process can be terminated by a forceTerminate request or by a terminate activity in the process. The termination of a process can span multiple navigation steps and thus multiple chained transactions. For example, all of the active subprocesses in the process must be terminated in a separate step. In this interim phase, the process state is terminating and a "process terminating" event (42009) is written. When the termination of these subcomponents is complete, the process ends in the terminated state and a "process terminated" event (21005) is generated.

When a compensable, long-running process is completed abnormally, that is, the fault handler of a process is navigated, process compensation is started. The process state changes to compensating and a "process compensating" event (42003) is produced. During process compensation, all of the undo services of successfully completed invoke activities are run. If the execution of one of these undo services fails, the process is put into the compensation-in-doubt state and the corresponding audit event "process compensation in doubt" (42030) is written. The compensation of the process can now only be repaired manually by the process administator. When the compensation of a process succeeds, the final process state is compensated ("process compensated" event, 42004). For more information on compensation and compensation repair, see Resources.

A BPEL process can have correlation sets. A correlation set consists of a set of properties. It is used to identify a process instance. A correlation set that has been initialized with the process-creating receive or pick activity is written to the audit log table with a "process correlation set initialized" event (42027). Only the first and initializing correlation set is persisted in the log.

The deletion of a process instance is also logged with a "process deleted" event (21020). The deletion can be caused by automatically deleting the process instance on completion or by a deletion request.


Figure 4. Process state transitions and related audit events
Figure 4. Process state transitions and related audit events

Activity instance events

The state transitions and generated audit events are highly dependent on the kind of the activity. BPEL introduces a variety of activity kinds. In addition, process choreographer defines some additional activity kinds, namely, staff activities and Java™ snippets. In the following, all of these these activity kinds are divided into groups. Both the state transitions and the generated audit events are described for these groups.

Structured activities: flow, sequence, switch, while, and pick

Structured activities, such as flow, sequence, switch, while, and pick, can contain a series of other activities that can be either structured or basic activities. Structured a ctivities prescribe the control flow of a business process and usually have a low business relevance compared to, for example, message activities (receive, pick, reply, and invoke activities) that communicate with the outside world.

In general, process choreographer does not produce audit events for a structured activity, but it does generate events that can be used to follow the control path that was taken through the process during its execution. For example, if auditing is enabled for a flow activity and the transition condition of a link evaluates to true, this results in a "link condition evaluated to true" event (21034). A business-relevant while activity produces events that report the Boolean value of the while condition; the corresponding events are: "activity loop condition true" (42037) and "activity loop condition false" (42038).

In contrast to other structured activities, a pick activity has a more complex lifecycle. The navigation of a pick activity generates a series of audit events as shown in Figure 5. When a pick activity is started, an "activity started" event (21007) is fired and the activity state changes to the waiting state. A fault that happens in the activation phase of the pick results in an "activity failed" event (21080). When an on-message branch of the pick is started by a receiving message, an "activity message received" event (42036) is logged. The activation of an on-alarm branch leads to an "activity expired" event (21081). If the enclosing scope of the pick activity is terminated due to a fault or termination request, the pick activity is terminated and an "activity terminated" event (21027) is fired.


Figure 5. State transitions and generated audit events for a pick activity
Figure 5. State transitions and generated audit events for a pick activity

Short-lived basic activities: assign, empty, reply, terminate, and throw

Assign, empty, reply, terminate, and throw activities all have a short lifetime. They are always run inside one transaction; that is, they are always started and completed in the same transaction. As a consequence, a started audit event is not written. An "activity completed" event (21011) is usually the first and only event for these activity kinds. Assign, reply, and throw activities can also fail, which leads to an "activity failed" (21080) audit event. All of these state transitions are summarized in Figure 6.


Figure 6. State transitions and audit events for short-lived basic activities
Figure 6. State transitions and audit events for short-lived basic activities

Basic activities that wait for an external event: wait, receive

After a receive or wait activity has been activated, the navigation transaction is committed and the process waits for either a specified time or for a message to be received. An "activity started" (21007) event is written and the activity stays in the waiting state. When the message is received or the specified time has elapsed, an "activity completed" (21070) event is generated. When the enclosing scope of the activity is terminated, the activity is terminated too ("activity terminating", 21027). See Figure 7.


Figure 7. State transitions and audit events for receive and wait activities
Figure 7. State transitions and audit events for receive and wait activities

Repairable activities: invoke and Java snippets

Invoke activities and Java snippets are put in a stopped state ("activity stopped" event, 42015) if all of the following happen:

  • A fault is raised by the activity.
  • The fault is not handled on the enclosing scope.
  • The continueOnError attribute of the activity is set to false.

The activity can then be repaired by manual, administrative interaction. The retry of the activity can be forced ("activity force retried" event, 42031) which causes the activity to be run again from the beginning. The completion of the activity can also be forced ("activity force ompleted" event, 43031). In this case, the administrator predefines the output of the activity and the navigation continues at the end of the activity.

Note that invoke activites, staff activities, and Java snippets are the only activity kinds that can be repaired in this way.

A Java snippet is similar to an empty or an assign activity in that it is short-lived. An "activity started" event (21007) is not created for a Java snippet. However, an invoke activity can be long-running and it thus triggers the started event. A long-running invoke activity can also expire ("activity expired" event, 21081). The audit events that are generated only for invoke activities are indicated by an asterisk (*) in Figure 8.


Figure 8. State transitions and audit events of invoke activities and Java snippet
Figure 8. State transitions and audit events of invoke activities and Java snippets

When a process is compensated, successfully completed invoke activities that have an undo action are undone. For invoke activities in a long-running process, the following events are generated to document the compensation of the activity:

  • "Undo started" event (42033), fired when the undo service is about to be invoked.
  • "Undo completed" event (42035), fired when the undo service invocation succeeds.
  • "Undo skipped" event (42034), fired when the undo service invocation fails and the undo is skipped by administrative, manual interaction.

Staff activities

A staff activity can be considered to be an invoke activity that is implemented by a human being. The running state of an invoke activity corresponds to the two staff activity states ready and claimed. The ready state expresses that the activity is ready to be handled by a human being. A user can then decide to work on the activity and claim it ("activity claimed" event, 21022). To finish the work, the user completes the activity ("activity completed" event, 21011). Alternatively, after the staff activity has been claimed, it can be released again by a cancel claim request. This interaction is logged with an "activity claim canceled" event (21021).

The remaining state transitions and audit events for staff activities are the same as for invoke activities. The state transitions and generated audit events of a staff activity are shown in Figure 9.


Figure 9. State transitions and audit events for staff activities
Figure 9. State transitions and audit events for staff activities

In addition to the events related to state transitions, an audit event can signal that a staff activity has been manipulated by human interaction, but this interaction has not resulted in a state change. The output or fault message of a claimed staff activity can be set without completing the activity, for example, to store intermediate results. In these cases, an "activity output message set" event (42012) or an "activity fault message set" event (42013) is written.

All events

When a process model is installed or uninstalled a "process template installed" (42006) or "process template uninstalled" (42007) event is written.

A "variable updated" (21090) event is generated whenever the contents of a variable is changed. A variable can be changed for various reasons: either in a Java snippet or through the completion of a receive, pick, invoke, assign, or staff activity.

With WebSphere Business Integration Server Foundation V5.1 process choreographer you can transfer, create, and delete work items. Whenever a work item is created or deleted, the following audit events are generated: "process work item created" (42042), "process work item deleted" (42041), "activity work item created" (42040), "activity work item deleted" (42039). Note that transferring a work item corresponds to the creation of a new work item and the deletion of an existing one. Thus, a transfer work item request normally results in a delete and a create work item event. The creation and deletion of work items during normal navigation do not result in work item events. For example, when a staff activity is activated and the work items for a potential owner are created, no work-item-created event is generated.

The following table provides a summary of the process choreographer audit events.

Event nameEvent numberEvent nameEvent numberEvent nameEvent number
activity claim canceled 21021activity claimed 21022activity completed 21011
activity expired 21081 activity failed 21080 activity failing 42011
activity fault message set 42013activity force completed 42032activity force retried 42031
activity loop condition false 42038activity loop condition true 42037activity message received 42036
activity output message set 42012activity ready 21006activity started 21007
activity stopped 42015activity terminated 21027 activity undo completed 42035
activity undo skipped 42034activity undo started 42033activity work item created 42040
activity work item deleted 42039link evaluated to true 21034 process compensated 42004
process compensating 42003process compensation indoubt 42030process completed 21004
process correlation set initialized 42027process deleted 21020 process failed 42001
process failing 42010process started 21000 process template installed 42006
process template uninstalled 42007process terminated 21005 process terminating 42009
process work item created 42042process work item deleted 42041variable updated 21090

The event numbers are available as constants in the StateObserverEvent class. For example, to access the value of an "activity started" event, you can use:

com.ibm.bpe.api.StateObserverEvent.ACTIVITY_STARTED


Audit log data

Process choreographer writes audit log data to the AUDIT_LOG_T table of the process choreographer database.

Two views, AUDIT_LOG_B and AUDIT_LOG are provided to access this data. The AUDIT_LOG view is filled with data for both BPEL business processes data and Version 5.0 business processes. The AUDIT_LOG_B view contains only the audit log data of BPEL processes.

The layout of the AUDIT_LOG_B view is described here. The listing first specifies the name and database type of the column, and then states for what event types the column is filled. The event type is defined by the object, for example, process instance, associated with the event. In addition, a short description is given for each column.

ALID

Database type: binary.
Event types: all.

Internal ID and primary key for a row.

EVENT_TIME

Database type: timestamp.
Event types: all.

Timestamp of when the event occurred in Coordinated Universal Time (UTC) format.

AUDIT_EVENT

Database type: number.
Event types: all.
For a list of audit event codes, refer to previous section.
PTID
Database type: binary.
Event types: all.
Process template ID of the process that is related to the current event.
PIID
Database type: binary.
Event types: process instance, activity, link and variable.
Process instance ID of the process instance that is related to the current event.
PROCESS_TEMPL_NAME
Database type: text.
Event types: all.
Process template name of the process template that is related to the current event.
TOP_LEVEL_PIID
Database type: binary.
Event types: process instance, activity, link and variable.
Identifier of the top-level process that is related to the current event.
PARENT_PIID
Database type: binary.
Event types: process instance, activity, link and variable.
Process instance ID of the parent process, or null if no parent exists.
VALID_FROM
Database type: timestamp.
Event types: all.
Valid-from date of the process template that is related to the current event.
ACTIVITY_NAME
Database type: text.
Event types: activity.
Name of the activity for which the event occurred. To analyze the audit log, it is beneficial to specify a name for each business-relevant activity. BPEL defines the activity name attribute a optional.
ACTIVITY_KIND
Database type: number.
Event types: activity.

Kind of the activity for which the activity occurred. It can have the following values:

  • invoke -- 21
  • receive -- 23
  • reply -- 24
  • throw -- 25
  • terminate -- 26
  • wait -- 27
  • empty -- 3
  • sequence -- 30
  • switch -- 32
  • while -- 34
  • pick -- 36
  • flow -- 38
  • scope -- 40
  • Java Snippet -- 42
  • staff -- 43
  • assign -- 44

These constants are also defined in ActivityInstanceData.KIND_*. For example, to access the value of an invoke activity use:

 com.ibm.bpe.api.ActivityInstanceData.KIND_INVOKE

ACTIVITY_STATE
Database type: number.
Event types: activity.
State of the activity that is related to the event. Possible values are:
  • inactive -- 1
  • ready -- 2
  • running -- 3
  • skipped -- 4
  • finished -- 5
  • failed -- 6
  • terminated -- 7
  • claimed -- 8
  • failing -- 10
  • waiting -- 11
  • expired -- 12
  • stopped -- 13

These constants are also defined in ActivityInstanceData.STATE_*. For example, to access the value of activity state "running" use:

 com.ibm.bpe.api.ActivityInstanceData.STATE_RUNNING

CONTROL_LINK_NAME
Database type: text.
Event types: link.
Name of the link that is related to the current link event.
PRINCIPAL
Database type: text.
Event types: process instance, activity.
Name of the principal that started the request that caused the audit event.
VARIABLE_DATA
Database type: LOB.
Event types: variable.
Contents of the variable that has been updated. The variable data contains both, the message definition and the parts defined in there as well as the name and value of all parts that are currently set in the WSDL message. Note that just simple parts are logged, the contents of complex part types are not visible in the audit log. The format is described in more detail in the Accessing variable contents section.
EXCEPTION_TEXT
Database type: LOB.
Event types: process instance and activity.
Exception message that caused an activity or process to fail. Applicable for: "process failed" and "activity failed" events.
DESCRIPTION
Database type: text.
Event types: process instance, activity, variable and link.
Description of activity or process, containing potentially resolved replacement variables.
CORR_SET_INFOR
Database type: LOB.
Event types: process instance.
The stringified form of the correlation set that was initialized at process start time. The correlation set information is stored in an XML format. For example, a correlation set, named "order", with a string property "customer_name" and a integer property "order_number" is represented like this:
<?xml version="1.0"?>
<correlationSet name="order">
  <property name="customer_name" value="Anke"/>
  <property name="order_number" value="12345"/>
</correlationSet>

USER_NAME
Database type: text.
Event types: process instance and activity.
The name of the user whose work item has been changed, applicable for:
  • Process instance work item deleted
  • Activity instance work item deleted
  • Process instance work item created
  • Activity instance work item created.
VARIABLE_NAME
Database type: text.
Event types: variable.
The name of the variable related to the current event.
SIID
Database type: binary.
Event types: -
The column is intended to log the ID of the scope instance for scope events.Scope events are currently not generated.
ATID
Database type: binary.
Event types: activity.
The ID of the activity template related to the current event.

Views and tables in the database schema

Whenever you access audit log data for BPEL processes, you should use the AUDIT_LOG_B view to do this. The AUDIT_LOG view is maintained for compatibility reasons and for Version 5.0 processes. This view contains additional columns that are not required or used by BPEL processes.

The underlying database table for both views is AUDIT_LOG_T. As with all tables in the process choreographer database schema, these tables are subject to change in future releases without further notice. Every effort is made to keep the (published) views as stable as possible to allow custom programs and scripts to work with future releases. It is, therefore, recommended that you use the audit log views and do not access the audit log table directly. Because both of these views are defined with a single SQL select statement on the underlying table, the query performance will not suffer.

In addition to the audit log view, process choreographer defines a couple of other views that have current run time information, for example, about processes, activities, or events. The white paper about work items and the query API() (see Resources) discusses these views in detail and demonstrates how the query() API can be used to retrieve data.

Running queries on these (instance data) views is a good choice if the business relevant attribute is not set for a business process. However, all business process instance data is usually gone when the process is deleted. Thus, queries on these views are usually not feasible to gather statistical or historical data.

Furthermore, queries on the instance data views affect the performance of running business processes. They might also lead to locking issues, for example, if a query requires a table scan on one of the underlying tables, a large number of read-locks might be acquired that can cause lock wait or even deadlocks.


Accessing audit log data in the database

Depending on your objectives, you can access the audit log data in the pocess choreographer database in different ways:

  • Direct SQL usage
    You can use your database runtime client to process SQL statements that directly access the AUDIT_LOG_B view. This method provides a quick and easy way to gather information from the audit log without the effort of writing specific application code.
  • Audit application
    You can write a Java application and use JDBC to access data in the AUDIT_LOG_B view. An application can gather data, post-process it, and present it in a more readable form.
  • Offline analysis
    You can copy the content of the AUDIT_LOG_B view to another table and then do offline analysis. This method can be beneficial if you want to avoid performance penalties when querying online data.

No matter which method you choose, you need to design your SQL statements to get the desired results. The next section provides sample statements for commonly requested queries in the area of volume and response information. They can be used as-is or modified to meet individual needs.

All samples given in this article are for IBM DB2® Universal Database on Windows®. SQL syntax for other database systems might vary but the statements are easy to adapt.

Handling timestamps

The 5.1.x version of the business process chorographer uses the default JDBC setTimestamp() function to write timestamp data to the database.

Depending on the database system and the implementation of the JDBC driver, this usually results in timestamp data written in the time zone of the application server.


Listing 1. Displaying timestamps of audit events

SELECT AUDIT_EVENT, EVENT_TIME FROM AUDIT_LOG_B

AUDIT_EVENT EVENT_TIME
----------- --------------------------
      21000 2004-12-20-07.32.41.123000
      21007 2004-12-20-07.32.41.123000
      21090 2004-12-20-07.32.41.143000
      21011 2004-12-20-07.32.41.143000
      21090 2004-12-20-07.32.42.505000
      21011 2004-12-20-07.32.42.505000
      21007 2004-12-20-07.32.42.535000
      21090 2004-12-20-07.32.42.545000
      21011 2004-12-20-07.32.42.545000
      21011 2004-12-20-07.32.43.096000
      21004 2004-12-20-07.32.43.106000

  11 record(s) selected.

If you plan to implement audit log analysis in a Java program, you can use the Timezone and DateTime class. These classes provide ready-to-use time zone conversions.

Important: If you run Oracle as the database system for process choreographer, Oracle 8i does not support the TIMESTAMP column type, and so the process choreographer database schema defines DATE columns instead. While Oracle 9i does support the TIMESTAMP column type, timestamp resolution is less exact than with other database systems. As a result, multiple sequential events can appear to have happened at exactly the same time.

When timestamps are subtracted from one another, the result is a timestamp duration. Most database systems support timestamp arithmetics. The following samples are for DB2 Universal Database.

Given two timestamps, you are interested in the duration between them:


Listing 2. Subtracting two timestamp columns

  SELECT TIMESTAMP('2004-11-29-12.10.20')-
         TIMESTAMP('2004-11-29-12.09.10') FROM SYSIBM.SYSDUMMY1 

  1
  ----------------------
              110.000000

The resulting timestamp duration of 110.000000 is interpreted in the format yyyyxxddhhmmss.zzzzzz, where yyyy, xx, dd, hh, mm, ss and zzzzzz represent the number of years, months, days, hours, minutes, seconds and microseconds, respectively.

In this example, the duration is one minute and ten seconds. However, aggregation functions, such as AVG, do not work with the timestamp duration because the duration is computed based on the numerical value. If you want, for example, to compute the average time between two events, one resulting in a timestamp difference of 110.000000 (which is one minute and ten seconds) and another resulting in 230.000000 (which is two and a half minutes) then the average would be one minute and 50 seconds while the average computation of the values 110 and 230 results in 170 (which is the wrong result of one minute and 70 seconds).

The solution for aggregation of timestamp durations is to use TIMESTAMPDIFF, a built-in UDF (user defined function) that is included with DB2 UDB, as shown in the following example.


Listing 3. Subtracting two timestamp columns using TIMESTAMPDIFF

  SELECT TIMESTAMPDIFF(2,
         CHAR(TIMESTAMP('2004-11-29-12.10.20')-TIMESTAMP('2004-11-29-12.09.10')))
         FROM SYSIBM.SYSDUMMY1 
  1
  -----------
           70

The first argument determines resolution of the result, for example, 2 for seconds, 4 for minutes, and the second argument is the result of subtracting two timestamps and converting the result to CHAR.

Sample SQL statements

If you want to know the average time (in seconds) that it takes for a business process to complete in your system, select the "process started" (21000) and "process completed" (21004) events of a process instance and let the database system compute the average difference between both events:


Listing 4. Average time for a business process

   SELECT  
     AVG(TIMESTAMPDIFF(2,
         CHAR(AL_COMPLETED.EVENT_TIME-AL_STARTED.EVENT_TIME))) 
   FROM    
     AUDIT_LOG_B AL_STARTED, AUDIT_LOG_B AL_COMPLETED
   WHERE   
     AL_STARTED.AUDIT_EVENT = 21000 AND 
     AL_COMPLETED.AUDIT_EVENT = 21004 AND
     AL_STARTED.PIID = AL_COMPLETED.PIID

The process instance IDs and corresponding template names of processes that failed can be retrieved by searching for "process failed" events (42001):


Listing 5. Failed business processes

   SELECT  
     PIID, PROCESS_TEMPL_NAME AS NAME 
   FROM    
     AUDIT_LOG_B
   WHERE   
     AUDIT_EVENT = 42001 

If you want the current number of processes that are still running, look for a "process started" (21000) entry, but none of these events have an end state (finished, failed, terminate, compensated):


Listing 6. Not yet finished processes

   SELECT COUNT(*) 
   FROM 
     AUDIT_LOG_B AL1
   WHERE    
     AL1.AUDIT_EVENT = 21000 AND NOT EXISTS
    (SELECT 1 
       FROM AUDIT_LOG_B AL2 
     WHERE  
       AL1.PIID = AL2.PIID AND 
       AL2.AUDIT_EVENT IN (21004, 21005, 42001, 42004) 
    )               

In the next example, you want to know how many business processes completed regularly in November 2004. Note that times must be specified in a format known by the database system:


Listing 7. Number of finished processes in a certain month

   SELECT COUNT(*) 
   FROM 
     AUDIT_LOG_B
   WHERE    
     AUDIT_EVENT = 21004 AND 
     EVENT_TIME >= '2004-11-01 00:00:00' AND 
     EVENT_TIME <  '2004-12-01 00:00:00'

A user, "Frank," wants to know how often he initiated different business process templates since January 2004. The result of the following query displays the process template name in the first column and shows how often this process has been started in the second column:


Listing 8. How often did I start different processes?

   SELECT
     PROCESS_TEMPL_NAME AS NAME, COUNT(*) AS FREQUENCY 
   FROM 
     AUDIT_LOG_B
   WHERE 
     AUDIT_EVENT = 21000 AND 
     PRINCIPAL='Frank' AND
     EVENT_TIME >= '2004-01-01 00:00:00' 
   GROUP BY PROCESS_TEMPL_NAME

You are interested in the average time that staff activities took to be completed. To design the right SQL statement, you have to know:

  • Entries for staff activities have ACTIVITY_KIND=43
  • The event code for "activity ready" is 21006 (note that staff activities don't go to the started state).
  • The event code for "activity completed" is 21011.
  • Two events belong to the same activity instance if they have the same activity template ID (ATID) and the same process instance ID (PIID).

Listing 9. Average time for staff activities

   SELECT
     AVG(TIMESTAMPDIFF( 2, CHAR(AL2.EVENT_TIME-AL1.EVENT_TIME))) 
     AS AVERAGE_TIME 
   FROM 
     AUDIT_LOG_B AL1, AUDIT_LOG_B AL2 
   WHERE 
     AL1.ACTIVITY_KIND = 43 AND AL2.ACTIVITY_KIND=43 AND
     AL1.AUDIT_EVENT = 21006 AND AL2.AUDIT_EVENT = 21011 AND
     AL1.PIID = AL2.PIID AND AL1.ATID = AL2.ATID

However, if the staff activity is inside a while loop, the previous statement is not correct because each loop run creates an entry in the log with the same ATID/PIID pair for an activity. Thus, if you have staff activities in loops and want to run the above request, you need to ensure that you subtract only events which belong together. Because activities in a loop are still run in sequence and not in parallel, you can use the event time to get the "next" completed event to a certain "ready" event.


Listing 10. Average time for staff activities, considering also loops

   SELECT
     AVG(TIMESTAMPDIFF(2, CHAR(AL2.EVENT_TIME-AL1.EVENT_TIME))) 
     AS AVERAGE_TIME 
   FROM 
     AUDIT_LOG_B AL1, AUDIT_LOG_B AL2 
   WHERE 
     AL1.ACTIVITY_KIND = 43 AND AL1.AUDIT_EVENT = 21006 AND
     AL2.EVENT_TIME =  
      (SELECT 
         MIN(AL.EVENT_TIME) 
       FROM 
         AUDIT_LOG_B AL 
       WHERE
          AL.ACTIVITY_KIND=43 AND AL.AUDIT_EVENT=21011 AND 
          AL.PIID = AL1.PIID AND AL.ATID = AL1.ATID AND
          AL.EVENT_TIME > AL1.EVENT_TIME AND
          AL2.ALID = AL.ALID
      )

Performance considerations

Writing audit log events is comparable to writing database log records in a database system. The logs can grow very quickly and get very large. You should try to use the fastest disc you have for this kind of data and also regularly back up and delete the entries in this table. Audit logs can easily become the bottleneck of the entire system if large amounts of data are written and the system is not configured appropriately.

To get reasonable response times for queries on large tables, it is a best practice to define indexes for certain columns or combination of columns. However, be aware that the database system has to maintain these indexes and this, particularly for audit logging, slows down the overall performance of process navigation if audit logging is enabled. This is the reason why the process choreographer database schema does not define any indexes by default for the audit log table. Depending on your requirements, you have to balance the benefit of faster audit log queries with a potential penalty in overall system performance. If you run audit log queries infrequently and the overall performance is important to you, consider running these queries without any indexes or create an index temporarily when you are do audit log analysis.

When you run queries for very large audit log tables, you can end up with a situation where your database system has to read thousands of pages into the bufferpool to compute the result. Depending on your database memory size, other database pages that are used for navigation of business processes might get displaced during this processing. This can be another reason why process choreographer performance degrades if you run audit log analysis in parallel with process navigation. To reduce impact on regular process navigation, make sure you store the AUDIT_LOG_T table in a separate tablespace (and on a separate physical disc) and also define a separate bufferpool for audit log data.

If you plan to run audit log analysis often and you do not want to impact running business processes, consider copying audit log entries into a new table (again, using a separate bufferpool) or even to a different system, and then create indexes that best match your queries and run these queries on the new table.

Due to the large amount of data in the audit log table, a single INSERT statement will usually fail because the data per transaction exceeds the log file size of the database system.


Listing 11. Simple approach using INSERT

   CREATE TABLE AUDIT_LOG_COPY LIKE AUDIT_LOG_T
   INSERT INTO AUDIT_LOG_COPY SELECT * FROM AUDIT_LOG_T 

Thus, you should use the faster DB2 export and load statements to copy the table, for example:


Listing 12. Using export/load to copy audit data

   CREATE TABLE AUDIT_LOG_COPY LIKE AUDIT_LOG_T
   EXPORT TO C:\TMP\AUDIT_LOG_T OF IXF SELECT * FROM AUDIT_LOG_T 
   LOAD FROM C:\TMP\AUDIT_LOG_T OF IXF INSERT INTO AUDIT_LOG_COPY 

Deleting audit log data

Audit log data can increase rapidly in a production environment and you might reach a situation where even a simple DELETE FROM AUDIT_LOG_T statement fails because the amount of data exceeds the current log file settings. Process choreographer ships with a powerful utility to delete entries in the audit log:


Listing 13. Command line utility to delete audit log data

BPEAuditLogDelete {-processtime processtime| -time time|-all [slice]}
                   -dbs database_system 
                   -dbn database_name 
                   [-u userId] [-p password] 
                   [-host host] [-port port]

This utility uses slicing to delete entries in the audit log table; that is, only the specified number of entries are deleted in a series of single transactions. Apart from this feature, the utility can also delete 'old' audit log entries, written before a given time. For more information on the BPEAuditLogDelete utility, refer to the information center.


Writing a JDBC application to access audit log data

Writing a Java program using Java Database Connectivity (JDBC) to query the audit log has several advantages:

  • Flexibility
    You can pass command line arguments to a program, thereby allowing multiple variations of the query to be run with a single program.
  • Performance
    If your JDBC program uses prepared statements and parameter markers, queries are processed faster if they are found in the database statement cache.
  • Capability
    If it is hard to write a single ('closed') SQL query for the desired task, it might be easier to run multiple queries and post-process the results in a program.
  • Usability
    A program can present retrieved data in a user friendly format, for example, using graphics in a GUI, to make the data easier to read and use.

Here is a small Java program that you can use as a starting point. The program takes the name of a process template as a command line parameter and prints out start times of all of the corresponding process instances. The code assumes again a DB2 Universal Database. Make sure that you have the CLI-based JDBC Driver (db2java.zip) in your classpath when compiling and running this sample code. You can also use the Universal JDBC Driver (db2jcc.jar) that ships with DB2 Universal Database V8. In this case, make sure that you replace references to COM.ibm.db2.jdbc.DB2DataSource with com.ibm.db2.jcc.DB2SimpleDataSource in the following code.

To keep the code simple, there is no error or resource handling, such as closing resources if an error occurs. This example shows with just a few lines of code how easy it is to write a small Java application.


Listing 14. JDBC program displaying start times of processes

  import java.sql.*;
    
  public class AuditLogInfo
  {
    public static void main(String argv[])     
    {
      if( argv.length != 1 )
      {
        System.err.println("usage: " + 
                           AuditLogInfo.class.getName () + 
                           " <process template name>"); 
        return;
      }
      System.out.println("Start times of process instances for: " +
                         argv[0] );
  
      COM.ibm.db2.jdbc.DB2DataSource db2ds = 
          new COM.ibm.db2.jdbc.DB2DataSource();
      // Adapt "BPEDB" if you've chosen a different name 
      // during configuration
      db2ds.setDatabaseName("BPEDB");
  
      try
      {
        Connection con = db2ds.getConnection();
        PreparedStatement pstmt = con.prepareStatement(
            "SELECT EVENT_TIME FROM AUDIT_LOG_B WHERE " +                      
            "PROCESS_TEMPL_NAME = ? AND " +
            "AUDIT_EVENT=21000");
        pstmt.setString(1, argv[0] );
        ResultSet resultSet = pstmt.executeQuery();
        while( resultSet.next() )
        {
          System.out.println( resultSet.getTimestamp(1) );
        }
        resultSet.close();
        pstmt.close();
        con.close();
      }
      catch( SQLException exc )
      {
        exc.printStackTrace();
      }
    }
  }

Accessing variable contents

Because variable update events (event number 21090) write update information into a BLOB column, reading and interpreting this data requires some code and cannot be achieved by a command line SQL statement.

Most of the BLOB columns in the process choreographer database schema contain serialized Java objects. For the VARIABLE_DATA column, and for version 5.1.x, this is currently a text string as described in the Audit log data section.

To deserialize the LOB column and get to the text string again, use the following code:


Listing 15. Reading variable update events

   // ds points to a valid data source
   DataSource ds = ...
   con = ds.getConnection();
   String sql = "SELECT EVENT_TIME, VARIABLE_NAME, VARIABLE_DATA " +
                "FROM AUDIT_LOG_B WHERE AUDIT_EVENT=21090";
   Statement stmt = con.createStatement ();
   ResultSet result = stmt.executeQuery( sql );
   while( result.next() )
   {
     System.out.println( "Event Time: " + result.getTimestamp(1) );
     System.out.println( "Variable Name: " + result.getString(2) );
     ObjectInputStream ois = new ObjectInputStream(result.getBinaryStream(3));
     System.out.println( "Variable Value: " + (String) ois.readObject () );
   }
   result.close();
   stmt.close();
   con.close();

The result for one event might look as follows:


Listing 16. Resulting output

Event Time: 2004-12-20 17:32:41.143
Variable Name: InputVariable
Variable Value: QName = http://www.example.com/process80649267/interface/InputMessage,
[0] key = 'contents' 
value = 'Part: name=contents typeName={http://www.w3.org/2001/XMLSchema}string';--
[partName = contents, type = java.lang.String, value    = 123];

The corresponding BPEL process defines an InputVariable with a single part "contents". Its value (of type String) is set to "123".

The first part of the text in the example contains information about the message template, that is, the qualified name of message definiton and the parts defined in the message definition. The second part, the last line in the listing, lists all of the parts that are currently defined in the message. The two parts are separated by two hyphens ("--").

You find all of the information about types of the message parts in this text string. In most cases, you know exactly which variable parts are of interest and need to be displayed. In this case, parsing the text string is quite straightforward because the parts are enclosed by "[" and "]" and the value is the last string after the equals sign. The above code can be modified to display the value of the InputVariable.


Listing 17. Displaying updated variable values

  // ds points to a valid data source
   DataSource ds = ...
   con = ds.getConnection();
   String sql = "SELECT EVENT_TIME, VARIABLE_NAME, VARIABLE_DATA " +
                "FROM AUDIT_LOG_B WHERE AUDIT_EVENT=21090";
   Statement stmt = con.createStatement ();
   ResultSet result = stmt.executeQuery (sql);
   while( result.next() )
   {
     if( result.getString(2).equals("InputVariable") )
     {
        System.out.println( "Event Time: " + result.getTimestamp(1) );
        ObjectInputStream ois = new ObjectInputStream(result.getBinaryStream(3));
        String variableData = (String) ois.readObject ();
        // Search the value of the (only) part in
        // the variable data string.
        // It is located between the last occuring '=' and ']'
        int ix1 = variableData.lastIndexOf ("=");
        int ix2 = variableData.lastIndexOf ("]");                 
        System.out.println( "Variable Data: " + variableData.substring ( ix1+1, ix2 ));
     }
   }
   result.close();
   stmt.close();
   con.close();

As a result, you get the following output.


Resulting output
Listing 18. Resulting output

Event Time: 2004-12-20 17:32:41.143
Variable Data:  123

Accessing audit log data from a J2EE application

Beyond using plain SQL statements or a J2SE-based JDBC program, you can exploit the power of WebSphere Application Server and the extensions in WebSphere Business Integration Server Foundation to build an enterprise application that uses servlets, EJB components, and WebSphere resources to connect to the database table.

Because process choreographer runs inside WebSphere, there is already a data source pointing to the run time database ("BPEDB"). You simply have to add a resource reference to this data source for your enterprise application and can use a JNDI lookup to retrieve it.

You can also use CMP Enterprise JavaBeans to wrap the audit log table and allow users to access information via the dynamic EJB query language. WebSphere Application Server offers performance benefits, such as statement caching and allows you to access audit log information from any remote location using a Web browser. You can use WebSphere Studio to build your own JSP-based or servlet-based Web application to display audit log data.


Conclusion

The process choreographer audit log data is a powerful data resource and with the capabilities provided by the SQL query language, you can easily gather information about relevant data and interesting coherences of business processes.



Download

DescriptionNameSizeDownload method
Download file 1AuditLogUsage51.zip260 KBFTP|HTTP

Information about download methods


Resources

About the authors

Author photo

Frank Neumann joined IBM in 1997 and he has been with the Process Choreographer development team since the very beginning. He is team lead and architect for the Process Choreographer database access layer and work item manager. Frank holds a master's degree in computer science from the University of Karlsruhe.

Author photo

Anke Robeller works in the Process Choreographer development team in the IBM development lab in Boeblingen. She joined IBM Germany in 1999 and has been working in the development of the Process Choreographer navigation component since 2001. Anke holds a master's degree in computer science from the University of Stuttgart, Germany.

Comments (Undergoing maintenance)



Trademarks  |  My developerWorks terms and conditions

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=WebSphere
ArticleID=49004
ArticleTitle=Enhanced audit log data analysis and query for BPEL processes with Process Choreographer 5.1
publish-date=02212005
author1-email=frank_neumann@de.ibm.com
author1-email-cc=
author2-email=anke@de.ibm.com
author2-email-cc=

My developerWorks community

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere).

My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Rate a product. Write a review.

Special offers