Skip to main content

skip to main content

developerWorks  >  WebSphere  >

WebSphere Application Server Enterprise Process Choreographer: Work items and the query() API call

developerWorks
Document options

Document options requiring JavaScript are not displayed


Rate this page

Help us improve this content


Level: Intermediate

Frank Neumann (frank_neumann@de.ibm.com), Developer, IBM

23 Jul 2003

This article describes the underlying implementation of the query functions in WebSphere Application Server Enterprise Process Choreographer and provides best practices on how to use them.

1. Introduction

IBM WebSphere Application Server Enterprise Process Choreographer Version 5 introduced person activities, which allow human interactions in business processes.

To manage and describe the relationship between a person and a business process entity (such as, a person activity), work items are created. A customer application (EJB client) or a JSP can query these work items and information for associated entity objects using the Process Choreographer's query() EJB API function.

Work items are also used to implement a security concept, describing who is allowed to get information about and operate on certain business process objects.

This article describes the underlying implementation of the query functions in Process Choreographer and provides best practices on how to use them.



Back to top


2. Work item management

2.1. How and when work items get created

Work items, like other business process entity objects, such as process or activity instances, are created after starting a business process with human interaction. Depending on the staff assignment expression in the business process model, the navigation engine in Process Choreographer triggers the creation of appropriate work items for the business process entities.

Process Choreographer entities dealing with human interaction include:

  • Business processes
  • Person activities
  • Events

Work Items for a person or a group are created when an entity is created or enters the ready state.

Staff assignment expressions define, who (a person or a group of people) can perform a certain role for an entity. When an entity is activated, a work item with a corresponding reason is created for each qualifying user:

  • Potential starter (for a business process)
  • Potential owner (for a person activity)
  • Reader (for a business process or a person activity)
  • Administrator (for a business process or a person activity)
  • Editor (for a person activity)
  • Potential Sender (for an event)

In addition to these model-driven work items, the Process Choreographer navigation engine creates the following dedicated work items:

  • Starter (for a business process)
  • Administrator (for a person activity)
  • Owner (for a person activity)

An administrator work item for a person activity is created for person receiving an administrator work item for the enclosing business process.

Since work items are also used to implement security (the right to access and read the data for a certain entity), the additional administrator work items for a person activity also allow the business process administrators to access the enclosed person activities. See Security for Entities for details.

2.2. Database tables for work items

During configuration of Process Choreographer, a relational database is associated with the business process container. The database stores all template (model) and instance (run-time) data necessary to manage the business processes.

Since business processes are navigated transactionally, and processing can run in a distributed (clustered) environment, a relational database is the best and safest choice.

The Process Choreographer database schema contains tables and views (see Additional queryable views).

Data relating to work items is stored in the following tables (note table names in the Process Choreographer schema usually end with _T, while views do not have this convention) :

  • WORK_ITEM_T
    This table contains one row for a relationship between an entity (for example, a person activity) and a person or a group of people. If more than one role results in this relationship (for example, if a person is both administrator and starter of a process), one row is created for each role (or reason).
  • RETRIEVED_USER_T
    If the relationship is between an entity and a group of people (usually the result of a staff assignment expression in the business process), this table contains all qualifying users retrieved from the staff directory for a certain staff assignment expression.
  • STAFF_QUERY_TEMPLATE_T
    During the deployment of objects that can have staff assignment expressions, the expressions are stored in a precompiled format in this table. If a process defines the same staff assignment expression more than once, there is only one entry for this expression in the table, that is, if staff assignment expressions are syntactically identical, they are shared in this table.
  • STAFF_QUERY_INSTANCE_T
    If a staff assignment expression is evaluated during process navigation, an instance of the expression is created together with a timestamp when it expired. The result of the staff assignment evaluation goes into the RETRIEVED_USER_T table.

The WORK_ITEM view is defined as a join of the tables WORK_ITEM_T and RETRIEVED_USER_T to contain again one row per relationship and user, even if a group of people were assigned to a certain role. This view is the target of all operations based on the query() API function.

2.3. Additional queryable views

Since not only the work items themselves are of interest but also information about the referenced objects, such as business processes and person activities, the query() API function also provides access to this information.

In order to do so (and to introduce an additional abstraction layer that allows the underlying tables to be changed in future versions of Process Choreographer), a couple of additional views are defined to be queryable by the query() API function:

  • PROCESS_TEMPLATE
  • PROCESS_ATTRIBUTE
  • ACTIVITY
  • ACTIVITY_ATTRIBUTE
  • EVENT

For a detailed list of the available columns/attributes for those views, see [2].

To process the query() API call, some information about column types is required. Thus, no other tables or views than the ones listed, can be queried using the query() API call.



Back to top


3. The QueryResultSet

After getting the home and remote EJB interfaces for Process Choreographer's BusinessProcess interfaces, the query API function can be called and returns a QueryResultSet object.

Sample of a query() call and result set processing:

 
   import com.ibm.bpe.api.*; 
   ... 
   InitialContext initialContext = new InitialContext(...); 
 
   // lookup the EJB home interface 
   Object object = initialContext.lookup("com/ibm/bpe/api/BusinessProcessHome"); 
   BusinessProcessHome processHome = (BusinessProcessHome) 
       javax.rmi.PortableRemoteObject.narrow(object, BusinessProcessHome.class); 
 
   // get the remote interface 
   BusinessProcess process = processHome.create(); 
 
   // run a query to show all activity names that are ready to be started 
   // and that I can claim (where I have a potential owner work item) 
   QueryResultSet resultSet = process.query( 
      // select clause - what do we want to get? 
      "PROCESS_INSTANCE.NAME, ACTIVITY.TEMPLATE_NAME",  
      // where clause - what are the qualifying rows? 
      "WORK_ITEM.REASON=WORK_ITEM.REASON.REASON_POTENTIAL_OWNER AND " + 
     "ACTIVITY.STATE = ACTIVITY.STATE.STATE_READY", 
      // order clause - specify the sort order 
      "PROCESS_INSTANCE.NAME", 
      // threshold - return first 10 entries only 
      new Integer(10),  
      // no timezone specified 
      null 
   ); 
 
   // loop over results in the result set 
   while( resultSet.next() ) 
   { 
      // print out selected columns, keep in mind column indexes start with "1" 
      System.out.println( "Process instance name = " + resultSet.getString(1) ); 
      System.out.println( "Activity template name = " + resultSet.getString(2) ); 
   } 
            

Note, that a QueryResultSet is similar to a JDBC ResultSet, particularly with regard to cursor-driven navigation and column indexes starting with "1" instead of "0":

  • After a query() call, the result set cursor is positioned before the first entry.
  • Applying next() moves the cursor one line down and returns "false" if the end of the result set is reached.
  • first() and last() can be used to revisit already read entries.

However, in contrast to a JDBC ResultSet, a QueryResultSet object is serializable and can be sent to a (remote) client. It also includes Process Choreographer specific enhancements, such as ID and timestamp handling.

3.1. Read QueryResultSet data

The QueryResultSet knows six column types:

  • TYPE_STRING
    For any text string columns, this type is also used for columns stored in CLOBs.
  • TYPE_NUMBER
    For any numerical data, such as short, integer, or big integer.
  • TYPE_TIMESTAMP
    For any timestamp data. The resolution is normally milliseconds (where supported by the database system).
  • TYPE_BINARY
    For binary strings and BLOB data.
  • TYPE_BOOLEAN
    For storing Boolean values (true or false), the database representation is a smallint column.
  • TYPE_ID
    For entity identifiers - each entity is identified by one or more of these identifiers, for example, process instance ID (PIID) and activity instance ID (AIID). They are stored in the database as a 16 byte long binary field.

The QueryResultSet provides functions to read data from the current cursor position

  • byte [] getBinary()
    Returns a column of TYPE_BINARY. For example, a VARBINARY column.
  • Boolean getBoolean()
    Returns "true", "false", or null. This function can also be applied to any numeric column type and returns true for any non-null value.
  • Integer getInteger(), Long getLong(), Short getShort()
    Returns the corresponding numerical value or null.
    Note: These functions can be applied to all numerical types, however, if the underlying data type is different, its type is "casted" which might result in a loss of data.
  • Object getObject()
    Returns a generic object for a column value. You can use this function if you do not know the type or if it is only needed to be passed to another function.
  • OID getOID()
    If the column is an identifier, this function returns an OID object that is needed for other API calls. You may have to cast the OID (base interface) to the more specific OID class, such as PIID or AIID.
  • String getString()
    Returns a string representation of the specified column. This function is not only applicable to character based database columns, it can also be used to get a string representation of numerical values, timestamps, and OID values. For constant values, this functions returns the descriptive name of the constant rather than the numerical value.
  • Calendar getTimestamp()
    Returns the timestamp value for the specified column. This function takes into account the current client time zone setting. For details, see Timestamps and Timezones.

If a column index is specified that is outside the range of the number of columns, an IndexOutOfBoundsException (runtime) exception is thrown.

If the function is not applicable to the corresponding column type, a ClassCastException is thrown.

The safest way to avoid these exceptions is to examine the column types first and then call the appropriate data accessor function on the query result set.

3.2. QueryResultSet meta data

Normally, a program calling the query() API function knows which column in the database it wants to query. However, if a user is allowed to specify arbitrary columns, a (generic) GUI must find out what the result set contains before it displays the data.

For this purpose, the QueryResultSet offers the following functions to get meta data about the selected columns:

  • numberColumns()
    Returns the number of columns in the select clause.
  • getColumnDisplayName(int columnIndex)
    Returns the name of the column, which is useful for displaying headings on a result table.
  • Since 5.0.2: getTableDisplayName(int columnIndex)
    Returns the name of the table or view. This is useful for displaying headings on a result table or to clarify to which view a column belongs.
  • Since 5.0.2: getColumnType(int columnIndex)
    Returns the type of the column. This function returns one of the constants defined for column types in the QueryColumnInfo interface: TYPE_STRING, TYPE_NUMBER, TYPE_TIMESTAMP, TYPE_BINARY, TYPE_BOOLEAN, TYPE_ID.
  • size()
    Returns the number of entries in the result set. This can be useful for allocating memory before reading the result set data.

The following sample demonstrates how to display result set contents where the structure (columns) are unknown because, for example, the query clauses have been passed dynamically from user input.

 
   void displayResultSet( QueryResultSet resultSet ) 
   { 
       // Print table heading 
       for( int i=0; i<resultSet.numberColumns(); i++ ) 
       { 
          System.out.print( resultSet.getColumnDisplayName(i) ); 
          System.out.print( "\t" ); 
       } 
       System.out.println(); 
 
       // Print row data (String representations) 
       while( resultSet.next() ) 
       { 
          for( int i=0; i<resultSet.numberColumns(); i++ ) 
          { 
             System.out.print( resultSet.getString(i) ); 
             System.out.print( "\t" ); 
          } 
          System.out.println(); 
       } 
   } 
            



Back to top


4. Query processing

The query() API call in Process Choreographer was designed to be as similar as possible to SQL syntax and to assist in describing Process Choreographer specifics where it makes sense.

The operations necessary to derive an SQL statement from the query() parameters include:

  • Collecting referenced views and replacing view names with defined correlation names.
  • Replacing string and number literals with parameter markers.
  • Parsing timestamps and IDs, and converting them to JDBC values in a prepared statement.
  • Adding an appropriate "from" clause to the SQL statement.
  • Adding result set limitation threshold value, which depends on the database system.

The result is a JDBC prepared statement that is processed by the database system.

The details of these steps are discussed later in this document.

It is important to be aware that Process Choreographer does not parse the clauses passed to it because parsing is performed by the database. This allows the SQL syntax that is available for the database system to be fully exploited, rather than requiring a restricted subset supported by all possible database systems.

4.1. Parameter markers and prepared statements

Each literal expression, such as a string or a numerical constant value, that is found in the where clause of a query() API call is replaced by a parameter marker.

For example, the following comparison expression in a where clause:

"... WI.OWNER_ID = 'Frank' ..."

is replaced using a parameter marker:

"... WI.OWNER_ID = ? ..."

This allows JDBC "prepared" statements to be used instead of "statements", even for string literals that are passed in by a client application.

Applications, such as a Web client, normally run a limited set of queries, differing only in string literals (for example, consider a query to retrieve all work items belonging to the logged in user).

Without the use of prepared statements, these queries would have to be compiled in the database system for each user, causing a significant performance impact.

After replacing parameter markers, however, these queries are syntactically identical, and the database system only needs to build an access plan once, and can reuse it for subsequent queries.

4.2. Correlation names

To shorten the SQL statement and to allow for complex subqueries, correlation names are assigned to all queryable views:

PROCESS_TEMPLATEPT
PROCESS_ATTRIBUTEPA
PROCESS_INSTANCEPI
ACTIVITYAI
ACTIVITY_ATTRIBUTEAA
EVENTEI
WORK_ITEMWI

A query, such as
query("WORK_ITEM.WIID","WORK_ITEM.OWNER_ID='Frank'", null, null, null );

results in an SQL statement
SELECT WI.WIID FROM WORK_ITEM WI WHERE WI.OWNER_ID='Frank'



Back to top


5. Query clauses

5.1. Select clause

As in SQL, the select clause specifies which information is to be retrieved from the database system. The view and column names that are available are listed in [2].

The select clause in Process Choreographer's query() API call must conform to the following syntax:

  • The select clause can contain one or more column specifications, each of which describes one column of a queryable table or view.
  • Multiple column specifications are separated by a comma ",".
  • Each column specification must contain exactly one token of the form view.column, where view is one of the queryable views listed previously, and column is a column in this view. Each expression in the select clause containing a period "." is assumed to be a view-column token.
  • The view.column token can be surrounded by any SQL "decoration" that is understood by the database system. This decoration, however, must not change the returned type. Aggregation functions (COUNT, MIN, MAX), for example, change the returned type and are therefore not supported.

If the table or column name is not recognized, a QueryUnknownTableException or QueryUnknownColumnException is thrown.

Examples of valid select clauses are:

  • Get all work item IDs
    "WORK_ITEM.WIID"
  • Get distinct activity IDs and work item reason
    "DISTINCT ACTIVITY.AIID, WORK_ITEM.REASON"
  • Get process name and creation time
    "PROCESS_INSTANCE.NAME, PROCESS_INSTANCE.CREATED"
  • Basic arithmetic, as long as it does not change the resulting column type and the database system supports it
    "WORK_ITEM.REASON + 7"
  • Column aliases
    "WORK_ITEM.REASON AS ROLE"

Samples of invalid select clauses are:

  • Aggregation functions
    "COUNT(WIID)"
  • Type conversion (casting) that results in incompatible types
    "CAST (ACTIVITY.CREATED AS CHAR)"
  • Unknown table or column names in a view.column token
    "WORK_ITEM.DOES_NOT_EXIST"
    "MYVIEW.VALUE"
  • Expressions without a valid view.column, for example, returning a constant value, calling a stored procedure or UDF
    "WORK_ITEM.WIID, 'text'"

5.2. Where clause

The where clause restricts the result set of the query and specifies filter criteria.

It is an optional parameter. If this parameter is null, a filter is not applied.

The where clause is processed in a similar way to the select clause. You must follow certain rules and most of the syntax of the underlying database system can be used for the expressions:

  • Each view.column token must refer to a known queryable view and column - as described in select clause. If the table or column name is not recognized, a QueryUnknownTableException or QueryUnknownColumnException is thrown.
  • Subselects, which perform another full query in the where clause, are not generally supported. However, see the Complex query section for ways to get around this limitation.

Note: Certain column types, such as object IDs and timestamps, require a special syntax that is described later in this paper.

5.3. Order-by clause

The order-by clause can specify one or more columns that are used for ordering the result set. This is an optional parameter, and ordering is not applied if the parameter is null.

Processing the order clause is similar to what happens with the select clause, and the same syntactical restrictions apply.

You can specify the order direction by adding descending or ascending tokens.

The ordering operation is completely processed by the underlying database system, which means that the character order for certain locales and code pages must be specified there. Process Choreographer performs no post-processing when it creates the query result set.

5.4. Threshold

Optionally, the number of rows fetched and returned in the QueryResultSet can be limited.

You should consider using this when the expected result is large and it is acceptable to limit the number of rows returned.

The implementation differs, and depends on what the underlying database system supports.

Here are some examples that show how Process Choreographer uses this value to restrict the result set:

  • For DB2, the resulting SELECT statement is restricted by a "FETCH xxx FIRST ROWS ONLY"
  • For Oracle, the where clause of the SELECT statement is extended by "AND ROWNUM <= xxx"
  • For all other database systems, the result set limitation is specified on the JDBC result set, which is, for example, the recommended way for Cloudscape.

Note: Combining an order-by clause with a threshold can lead to different results, depending on the implementation by the database system. For example, some database systems perform the sort operation first according to the order-by-clause operation and then cut the result list, other database systems apply the threshold first and then order the remaining rows.

5.5. Timestamps and time zones

The representation of times and dates normally depends on the current locale and time-zone settings.

To support a multi-time zone environment, where a client application, the WebSphere Application Server, and the database back end can each reside in a different time zone, Process Choreographer has to take the different time zones into account, and provides ways to perform any necessary conversions.

Time information in normally stored in UTC in Process Choreographer's database, any trace or audit log time stamps are based on UTC so that results written from Application Servers running in different time zones can easily be correlated. A conversion is required both before presenting time information on a user interface, and after getting time information from a user interface.

This conversion affects two functions: query() and getTimestamp() on the query result set.

The query() API call expects timestamps in where clauses to be passed using a TS('') pseudo function, optionally a time zone can be specified. If no time zone is given (the parameter is null), timestamps are assumed to be in UTC and are stored in the database without any conversion.

Sample for passing a timestamp in UTC using the TS() function:

 
  process.query( "ACTIVITY.AIID", 
                 "ACTIVITY.STARTED > TS('2003-05-13T06:01:07')", 
                 null, null, null ); 
            

The same sample if the timestamp is in Pacific Standard Timezone (PST):

 
  process.query( "ACTIVITY.AIID", 
                 "ACTIVITY.STARTED > TS('2003-05-13T06:01:07')", 
                 null, null, TimeZone.getTimeZone("PST") ); 
            

That is all you need to do in terms of time-zone conversion to pass timestamp information from a client environment to Process Choreographer; the query() function converts the timestamp to UTC and continue working with the converted value.

The timestamp in the TS('') function is expected to be in the format:

YYYY-MM-DDThh:mm.ss

Everything except the year is optional, that is, can be omitted and is then set with default values. Thus, a timestamp expression only specifying the date part TS('2003-05-13') is valid, too.

Keep in mind, that if your multi-tier application runs on an application server processing requests from a remote Web client, you must get the time-zone information from the user's Web browser.

Now we consider what is returned by the getTimestamp() function for the QueryResultSet. First, the returned Java Calendar object contains both time and time-zone information. The time is always UTC, independent of the time-zone information. The time zone, that is set in the Calendar object is either UTC or the one passed in the previous query() call.

If you have a multi-tier environment where a Web client may have a different time-zone setting compared to the machines where your JSP runs, this time-zone information can help with the following conversion. If your code is running on the client machine, you can also ignore the time-zone setting in the Calendar object and use the local settings.

To display time stamp information, you have to use a formatter class, like DateFormat or SimpleDateFormat. The following sample shows how the value in a Calendar object can be displayed taking into account the time-zone information that comes with the Calendar object:

 
  resultSet = process.query( "ACTIVITY.STARTED", null, null, null, 
                             TimeZone.getTimeZone("Europe/Berlin") ); 
  while( resultSet.next() ) 
  { 
     Calendar cal = resultSet.getTimestamp( 1 ); 
     // target timezone as specified in query() (or UTC if not specified) 
     // (you may also set the browser's timezone here) 
     TimeZone tz = cal.getTimeZone(); 
     // Use either DateFormat or SimpleDateFormat 
     DateFormat fmt = DateFormat.getDateTimeInstance( DateFormat.LONG, 
                                                      DateFormat.LONG ); 
     fmt.setTimeZone( tz ); 
     // Format the time and print the result 
     System.out.println( "Activity started: " + fmt.format( cal.getTime() ) ); 
  } 
            

The way that Process Choreographer fills the Calendar object when using a QueryResultSet object is the same for API objects like ActivityInstanceData, returned by an API call like getActivityInstance( AIID ). Since you do not pass a time-zone parameter to this call, the Calendar object has the current local time zone set as a default.

5.6. CURRENT_DATE

With Version 5.0.1 (PTF1) of Process Choreographer, the CURRENT_DATE token was introduced to specify the current date as a timestamp. Instead of explicitly passing the current date in a TS() function call, the CURRENT_DATE token can be used to express this. This is important for defining worklists that should return a result that depends on the current time.

Most database systems also support arithmetic with timestamp columns. Because the CURRENT_DATE token is replaced with a parameter marker in further processing, adding and subtracting time to the CURRENT_DATE token is not supported, however, you can specify additional time information with the comparing column.

The following example shows how to use CURRENT_DATE in where clauses with DB2 Universal Database syntax, the syntax for other database systems might be different:

"... ACTIVITY.STARTED + 3 DAYS > CURRENT_DATE ..."

Example of invalid use:

"... ACTIVITY.STARTED > CURRENT_DATE &minus; 3 DAYS"

5.7. Object IDs

To identify application objects, such as activities or business processes, identifiers (IDs) are introduced for most of these objects. In the database, IDs are represented as 16 bytes of raw (binary) data.

To handle IDs in application programs, there is also a string representation that can be used in a where clause. A process instance, for example, might be identified by an ID, such as:

_PI:800300f3.9aee3366.9e1c67f6.27

To pass this ID in a qualifying where clause of a query() call, it has to be specified using the ID() pseudo function:

"... AND PROCESS_INSTANCE.PIID = ID('_PI:800300f3.9aee3366.9e1c67f6.27') ..."

The ID() pseudo function translates the ID string representation back to its binary format and passes it to the underlying database system. Again, the corresponding SQL statement uses parameter markers and replaces correlation names:

" ... AND PI.PIID = ?"

If an ID is selected in the result set, either getOID() or getString() can be applied to the result set. The first returns an OID object, the second returns the corresponding string representation. If you have the choice of working either with OID objects or with strings, you should use the OID representation, because passing an object is considerably faster than transforming a string representation.

Note: If you omit the ID() pseudo function for object IDs, and the database system performs an implicit type conversion from an object ID string to the binary database representation, you do not get an SQL error. In this case, you will only notice that the result set is empty and no rows are returned.

It is important to check your where clauses carefully to ensure object IDs are handled correctly.

5.8. Syntax for constant values

Some columns in queryable views are defined as numeric, but should only contain certain, distinct values. This is comparable to an enumeration in programming languages.

Instead of comparing these column values with the concrete integer value, it is good practice to compare these values against predefined constants.

For example, the WORK_ITEM.REASON column could have one of the following values:

  • REASON_POTENTIAL_OWNER=1
  • REASON_EDITOR=2
  • REASON_READER=3
  • REASON_OWNER=4
  • REASON_POTENTIAL_STARTER=5
  • REASON_STARTER=6
  • REASON_ADMINISTRATOR=7
  • REASON_POTENTIAL_SENDER=8

Assume you are interested in all work items for objects for which you have an owner work item (the ones you claimed).

A valid where clause for this query would be:

" ...WORK_ITEM.REASON = 4 ..."

Since this obscures the meaning, it is recommended that you use the predefined constant values which with the following syntax:

" ...WORK_ITEM.REASON = WORK_ITEM.REASON.REASON_OWNER ..."

Note: The constant expression on the right is constructed according to the format:

<view>.<column>.<constant name>

The constant name must be fully-qualified in the sense that it is has to be scoped by view.column to avoid problems with constants defined for other columns.

Another way to replace fixed numerical values by the more flexible named constants is to use the values defined in the corresponding API object:

" ... WORK_ITEM.REASON = " + WorkItemData.REASON_OWNER + "..."

The corresponding API objects not only contain all defined constant expressions as integer values, but also have additional valuable information for all columns.



Back to top


6. Security for entities

Work items are used by Process Choreographer to allow access to entities, such as activities and business processes.

If the principal name of a user who is logged onto WebSphere Application Server has at least one work item for an entity, the user is allowed to query its contents.

Further checks in the Process Choreographer API and navigation engine can restrict additional operations depending on the type of a work item (the reason), however, to retrieve information using a query() operation, the existence of any work item is sufficient.

The next section explains how the query() API function implements this.

6.1. From clause, join algorithm, join hierarchy

For those familiar with SQL, it might seem strange that select clauses and where clauses must be given to the query() API call, while the from clause is omitted.

The reason is that the from clause, which specifies the views to be queried is computed based on the referenced columns in the select, where, and order-by clauses. While these clauses are scanned, a list of referenced views is built that becomes the basis for the from clause.

The WORK_ITEM view plays a special role in this process because it is always added to the from clause to guarantee the security restrictions. All referenced views are "joined" with the work item view to make sure that only information for which the user has a work item can be returned.

To make this work, additional join conditions are appended to the passed where clause, otherwise, the cartesian product of both views would be returned.

For example, consider the following select clause (assuming both where and order-by clauses are null) where a user wants to query the state of all activities for which he has a work item:

SelectClause = "ACTIVITY.STATE"

Internally, the from clause is formed by all referenced views, which in this example is only ACTIVITY, plus the WORK_ITEM view:

FromClause = "WORK_ITEM WI, ACTIVITY AI"

The where clause gets the proper join condition appended:

WhereClause = "WI.OBJECT_ID = AI.AIID"

While this appears to be a simple operation, it can become more complex if multiple entities are involved.

Consider the following select clause (again for simplicity, the where clause is assumed to be null) where the user wants to do the same query as above but is also interested in the name of the enclosing process instance:

SelectClause = "ACTIVITY.STATE, PROCESS_INSTANCE.NAME"

The from clause becomes:

FromClause = "WORK_ITEM WI, ACTIVITY AI, PROCESS_INSTANCE PI"

However, what about the additional join conditions for the where clause?

Process Choreographer uses a built-in hierarchy of entities and knowledge about which columns to use for the join condition that results in the user's expectation:

WhereClause = "WI.OBJECT_ID = AI.AIID AND AI.PIID = PI.PIID""

An alternative (rejected) join condition could have been:

"WI.OBJECT_ID=PI.PIID AND PI.PIID=AI.PIID"

However, this would have returned information about all activities where the user has a work item for the enclosing process rather than having a work item for the activity, which would have broken the security rule stated previously.

Without explaining more details about this processing, this gives an impression of what happens when Process Choreographer composes the final SQL statement, and might help to explain some unexpected results returned by certain SQL statements processed by your database system.

6.2. Adding the user - the BPESystemAdministrator role

To return only the objects for which the calling user has a work item, an additional expression is added to the where clause of each call:

"WORK_ITEM.OWNER_ID = '...'"

where "..." becomes the principal name of the logged in user.

With the previously described mechanism of adding the WORK_ITEM view and proper join conditions, this completes the security concept for the query() API call in Process Choreographer.

There is an exception to the rule of adding the restricting "WORK_ITEM.OWNER_ID = ..." to the where clause of all query() API calls: users that are in the J2EE role BPESystemAdministrator are allowed to query all work items, even those not belonging to themselves. However, omitting the "WORK_ITEM.OWNER_ID = ..." clause does not lift the restriction that at least one work item must exist to gather information for a certain entity.

It is important not to confuse the roles assigned during modeling with the J2EE role assigned during deployment. To be a member in the BPESystemAdministrator J2EE role changes the behavior of business processes and how the resulting SQL statement is built. However, the roles, such as "Administrator" or "Potential Owner", which are assigned to business processes or person activities during modeling, are per business process asignments that determine the authority to perform certain tasks on a business process entity.

Because Process Choreographer uses the principal name of the user logged on to WebSphere Application Server for database comparisons, you must be careful if the Application Server uses a case-insensitive directory for authentication, for example, the user registry with an Application Server running on Windows. Although the user can log on to the server with different case spellings of her logon name, the query() call might return no results because the underlying database system performs a case-sensitive comparison for the principal name. In such an environment, you must either handle this in the client code, or make the users aware of the need to match cases exactly when entering their logon user ID.



Back to top


7. Complex queries

As already mentioned, subqueries are, in general, not supported because of the way queries are scanned in Process Choreographer. However, if you know the rules and algorithms used, you can run subqueries.

Consider the common requirement for querying the most current business process template.

Each process template includes a VALID_FROM field that specifies when the template becomes valid. To discover which template is selected by Process Choreographer if a new process with with name processTempl1 is started, see the following example. It shows a where clause passed to a queryProcessTemplate() API call (which is very similar to the query() API call except that it returns ProcessTemplate API objects rather than a QueryResultSet):

"PROCESS_TEMPLATE.NAME = 'processTempl1' AND PROCESS_TEMPLATE.VALID_FROM = (SELECT MAX(VALID_FROM) FROM PROCESS_TEMPLATE WHERE NAME=PROCESS_TEMPLATE.NAME AND VALID_FROM <= CURRENT_DATE)"

The subselect clause returns the maximum value of VALID_FROM, not later than the current date. Understanding how the name comparison, NAME=PROCESS_TEMPLATE.NAME works, requires knowledge from the previous sections of this document: This clause gets transformed using the PT alias for the PROCESS_TEMPLATE view and parameter markers:

"PT.NAME = ? AND PT.VALID_FROM = (SELECT MAX(VALID_FROM) FROM PROCESS_TEMPLATE WHERE NAME=PT.NAME AND VALID_FROM <= CURRENT_DATE)"

Note that the VALID_FROM column in MAX() and NAME column at the end are replaced because they are not recognized and do not contain a period ".". The same is true for PROCESS_TEMPLATE.

Because PROCESS_TEMPLATE.NAME is replaced by PT.NAME, the query returns the desired result.



Back to top


8. Staff assignment expiration and caching

For performance reasons, Process Choreographer caches the user IDs retrieved from the staff plug-in during the evaluation of staff assignment expressions (for details about staff handling, see [1]). In addition to this, the results of syntactical equivalent staff assignment expressions are shared within one process model.

By default, the list of stored user IDs are assumed to be valid for one hour. However, because no daemon process re-evaluates all staff assignment expressions after this time, they only get recomputed when another request accesses the same, shared staff query expression.

To illustrate this and understand the implications of this implementation, consider the following scenario:

A business process template, PT, defines a person activity, AT, with a staff assignment expression for the role "potential owner" of "all persons in department D3056".

If a process instance for template PT is started for the first time and the activity instance for AT becomes ready, the staff expression is evaluated for the first time and returns a list of user IDs. This list is stored in the RETRIEVED_USER_T table and causes corresponding entries in the WORK_ITEM view; one row for each returned user, associated with the activity instance and a reason of potential owner.

If a second process instance for PT is started within an hour of the first one, and the instance of AT becomes ready, the staff expression is not evaluated again because it has already been cached and is still valid. The work item points to the same RETRIEVED_USER_T entries that were created for the first process instance and again, the WORK_ITEM view has a work item for each user.

If, after an hour (the default expiration time), a third process instance for template PT is started, the result of the staff expression has expired and the result is recomputed calling the staff directory.

After the new result is stored in the RETRIEVED_USER_T table, all work items for the three instances (if still running) point to the refreshed results.

This example shows that recomputing staff assignment results (picking up directory changes) depends on activities sharing the same staff assignment expressions. If no such activities become ready or if there are no shared staff assignment expressions, any staff directory changes do not take effect, even if the expiration time has passed.

The same caching applies if the same staff assignment expression was defined for multiple activities within a business process, for example, where there are two activities, and the same group of people are allowed to process a certain operation.



Back to top


9. queryProcessTemplates()

The query() API call described in this paper is designed to query work items and the associated instance objects such as process instances and person activity instances.

To get information about business process templates, Process Choreographer provides the queryProcessTemplates().

The reason for providing an additional API call for business process templates is because work items are not created for template data. Thus, evaluating the access rights for a process template (mainly the REASON_POTENTIAL_STARTER) is computed differently and there is no join operation with the WORK_ITEM view.

With this limitation, the queryProcessTemplates() API call is simpler, because it does not expects a select clause and it returns the entire ProcessTemplateData API objects.

The where and order-by clauses, threshold, and timezone are processed in the same way as in the query() API call.



About the author

The author of this whitepaper, Frank Neumann (frank_neumann@de.ibm.com) works in the WebSphere Application Server Enterprise Process Choreographer development team.




Rate this page


Please take a moment to complete this form to help us better serve you.



 


 


Not
useful
Extremely
useful
 


Share this....

digg Digg this story del.icio.us del.icio.us Slashdot Slashdot it!



Back to top