Process Portal Search Index Comprehensive

White Papers

Abstract

This white paper provides a deep explanation of the Process Portal search index processes and operations, common issues, troubleshooting steps, maintenance recommendations, and more.

Content

Introduction

The search index in IBM Business Automation Workflow Process Portal is used to allow users to search for tasks and process instances. The search index is also used to generate various charts and data throughout Process Portal including in the Team Performance and Process Performance dashboards. Unfortunately, the search index can cause problems for users when the search index data falls out of sync with the operational task and process instance data.

The purpose of this article is to help Administrators have a deeper understanding of the Process Portal search index. The article details how the indexing process works, the configurations available, how to understand when there is a problem, and how to resolve common issues related to the search index. The search indexing process is a “black box” operating behind the scenes in Business Automation Workflow. Administrators are largely unaware when the search index is experiencing issues (such as being out of sync) until users report discrepancies in the data they observe in Process Portal.

Understanding the search indexing process helps Admins troubleshoot the indexer and determine the root cause by being able to identify the “point of failure” in the process. This document provides best practices for configuring, maintaining, cleaning up, and resolving common issues that can arise with the Process Portal search index.

How it Works

Brief Description

The Process Portal search index is a file-based Lucene index. By default, the search index is enabled in Process Portal and by default only human tasks are indexed. System tasks can be indexed by updating a setting. Each node in a BPM topology maintains its own search index. In some cases, it can be appropriate to configure a shared search index. There are three search-index-tracking database tables that are involved with the search indexing process. They are the following:

BPM_TASK_INDEX – keeps records of BPM tasks, reflection of the operational LSW_TASK data
BPM_INSTANCE_INDEX – keeps records of BPM process instances, reflection of the operational LSW_BPD_INSTANCE data
BPM_TASK_INDEX_JOB – keeps records of each indexing “job” for each node

There are four main phases of the search indexing process. Each phase is described in detail in the following sections.

Phase 1 - Operational Data Changes Trigger Runtime Events

Operational data is stored in the LSW_TASK and LSW_BPD_INSTANCE tables. When operational data is created, updated, or deleted the runtime engine fires events. The events correspond to starting processes, updating process instance data, completing a process instance, tasks starting, tasks being assigned or reassigned, tasks completing, and so on.

Phase 2 - Search Event Point Listener Subscribed to the Events Triggers Updates to Index-Tracking Tables

When the listener detects that a task or process instance is created, a new row is added to the BPM_TASK_INDEX or BPM_INSTANCE_INDEX table. The MAJOR_EVENT_DATETIME field of the new row is set to the time stamp “1969-12-31T00:00:00”, which is one day before epoch time. Stamping rows with this value creates a virtual queue. Similarly, when a process instance or a task is updated, the MAJOR_EVENT_DATETIME field is set to this same time stamp. A single row is inserted or updated at a time for each of these operations.

When process instance data and tasks are deleted from Business Automation Workflow, a SQL query updates up to 50,000 rows at once in the corresponding tracking tables. This query changes the DELETED_DATETIME field from NULL to the current database time stamp.

Phase 3 – De-queueing the Task and Process Instances that need to be updated

Another thread, ProcessIndexQueueDaemon, runs continuously in parallel and updates the MAJOR_EVENT_DATETIME field from the queued time stamp (one day before epoch time) to the current database time. This operation occurs in batches of 20 rows at a time. If this SQL transaction takes longer than 1 second to complete, then a warning is logged in the server systemout.log file:

CWLLG3231W: Task Index entries {LIST OF TASK IDS} took longer than 1 second to commit, retrying.
CWLLG3230W: Instance Index entries {LIST OF INSTANCE IDS} took longer than 1 second to commit, retrying.

This process acts as a "dequeue" for the tasks or process instances, allowing the indexing job to consume the updated records in order to update the corresponding documents in the search index.
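The virtual queue described in Phases 2 and 3 can be illustrated with a small in-memory model. This is only a sketch of the mechanism as described above, not the product's implementation: the dictionary standing in for a tracking table and the function names enqueue and dequeue_batch are hypothetical.

```python
from datetime import datetime, timezone

# Time stamp that marks a row as "queued" (one day before epoch time, per the text).
QUEUED_TIMESTAMP = datetime(1969, 12, 31, tzinfo=timezone.utc)
BATCH_SIZE = 20  # rows dequeued per transaction, per the text


def enqueue(tracking_table, record_id):
    """Phase 2: insert or update one row and stamp it with the queued time stamp."""
    tracking_table[record_id] = QUEUED_TIMESTAMP


def dequeue_batch(tracking_table, now, batch_size=BATCH_SIZE):
    """Phase 3: stamp up to batch_size queued rows with the current time."""
    queued = [rid for rid, ts in tracking_table.items() if ts == QUEUED_TIMESTAMP]
    batch = queued[:batch_size]
    for rid in batch:
        tracking_table[rid] = now
    return batch


def queued_count(tracking_table):
    """Number of rows still waiting to be dequeued."""
    return sum(1 for ts in tracking_table.values() if ts == QUEUED_TIMESTAMP)
```

In this model, enqueueing 45 records and calling dequeue_batch once leaves 25 records queued. Those records stay invisible to the indexing job until a later batch stamps them with a current time.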

Phase 4 - Updating the Lucene Search Index

The index job thread, ProcessIndexUpdaterDaemon, finds records from the index-tracking tables that have a MAJOR_EVENT_DATETIME that falls within the indexing window (last index job time stamp to the current indexing job time stamp). The indexing job then updates the corresponding Lucene search index documents.

The purge process for deleting documents that are deleted from the operational tables also occurs in this thread. This thread checks for records in the tracking tables that have a DELETED_DATETIME value that is newer than the last index purge time.

Each indexing job generates new records in the BPM_TASK_INDEX_JOB database table. This thread also deletes records of indexing jobs that are older than 7 days. The records are checked hourly to verify whether any job records need to be deleted.

Additionally, the ProcessIndexUpdaterDaemon thread is responsible for re-creating the Lucene search index during the processIndexFullReIndex command. This activity is completed by directly parsing the search indexing tables, which creates the corresponding documents in the Lucene index.

Configuring the Process Portal Search Index

There are a number of configurable properties and configuration changes that can be made to the search index. The configurable properties can be updated by using a custom XML file and then restarting the server. To verify that the property changes persisted, you can check the TeamworksConfiguration.running.xml file to confirm the new value.

Use the following template and update the values that need to change. This snippet must be placed between the opening and closing <properties> tags.

<search-index>
     <task-index-enabled merge="replace">true</task-index-enabled>
     <task-index-update-interval merge="replace">5</task-index-update-interval>
     <task-index-update-completed-tasks merge="replace">false</task-index-update-completed-tasks>
     <task-index-store-fields merge="replace">false</task-index-store-fields>
     <task-index-work-manager merge="replace">wm/default</task-index-work-manager>
     <task-index-include-system-tasks merge="replace">false</task-index-include-system-tasks>
     <process-index-instance-completion-best-effort merge="replace">false</process-index-instance-completion-best-effort>
</search-index>
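To confirm that changed values persisted, the <search-index> section can be read programmatically. The following is a minimal sketch that assumes the section is nested somewhere under a <properties> element, as the template placement suggests. The sample document and the function name read_search_index_settings are hypothetical, and the real TeamworksConfiguration.running.xml file contains much more than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened stand-in for a running configuration file.
SAMPLE_CONFIG = """<properties>
  <search-index>
    <task-index-enabled>true</task-index-enabled>
    <task-index-update-interval>5</task-index-update-interval>
    <task-index-include-system-tasks>false</task-index-include-system-tasks>
  </search-index>
</properties>"""


def read_search_index_settings(xml_text):
    """Return the child elements of <search-index> as a {name: value} dict."""
    root = ET.fromstring(xml_text)
    if root.tag == "search-index":
        section = root
    else:
        section = root.find(".//search-index")
    if section is None:
        return {}
    return {child.tag: (child.text or "").strip() for child in section}


if __name__ == "__main__":
    for name, value in read_search_index_settings(SAMPLE_CONFIG).items():
        print(name, "=", value)
```

Running the same function against the file from each node makes it easier to spot a node whose settings differ from the others.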

<task-index-enabled>

By default, the value is true. Updating the value to false disables the search index on the node in which the change was made. The search filter field is no longer available in Process Portal. Users observe the change in the Work, Processes, Team Performance, and Process Performance dashboards. Users observe that Quick Stats are unavailable.

If you have multiple nodes and disable indexing on only some nodes, then the search index can get out of sync for all nodes. The nodes with indexing disabled do not register the runtime events and thus do not trigger the updates to the index-tracking tables.

<task-index-update-interval>

The default value for this property is 5 seconds. This property controls the amount of time between index updates. The value specifies the window that the indexer uses to look for tasks and process instances to update. The indexer takes the latest INDEX_END_TIME from the BPM_TASK_INDEX_JOB table for the node on which the indexer is running. The indexer then adds the value of the property and looks for all task and process instance records whose MAJOR_EVENT_DATETIME falls in that timeframe.

Increasing this property causes the indexer thread to wait longer between passes over the records that are waiting to be indexed, which decreases the load of the indexer. Users might also observe that tasks and process instances take longer to become available for searches and that Quick Stats take longer to update in the dashboards.
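The window calculation can be sketched as follows. This is an illustration under stated assumptions rather than the product's logic: the function names are hypothetical, and treating the window as inclusive at the start and exclusive at the end is an assumption that the text does not confirm.

```python
from datetime import datetime, timedelta


def indexing_window(last_index_end_time, update_interval_seconds=5):
    """Window that starts at the last INDEX_END_TIME and spans the configured interval."""
    window_end = last_index_end_time + timedelta(seconds=update_interval_seconds)
    return last_index_end_time, window_end


def records_in_window(records, window):
    """Return IDs of (record_id, major_event_datetime) pairs that fall in the window.

    Boundary handling (start inclusive, end exclusive) is an assumption of this sketch.
    """
    start, end = window
    return [rid for rid, major_event in records if start <= major_event < end]
```

For example, with a last INDEX_END_TIME of 12:00:00 and the default interval of 5 seconds, a record stamped 12:00:02 is selected, while a record stamped 12:00:07 waits for a later job.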

<task-index-update-completed-tasks>

The default value is false. With this setting, the indexer updates the Lucene index for tasks while they are open and once more when they are completed. When this property is true, completed tasks are also updated in the Lucene search index when instance-level data, such as business variable values, is updated. Searches are then based on the most recent values of the instance-level data. Setting this property to true increases the load of the indexer because it requires processing more data.

<task-index-store-fields>

The default value is false. This property determines whether the values of the fields are stored as separate fields.

<task-index-work-manager>

The default value is the string "wm/default". This property sets the work manager that is used by the indexing process to update the search index. You can improve the performance of search index creation by changing the work manager to a new dedicated work manager that has a higher number of available threads. You can create the new work manager in the WebSphere Application Server Admin Console.

<task-index-include-system-tasks>

By default, this value is false. This property can be updated if business requirements determine that system tasks must be available for search queries and for inclusion in the Gantt Chart generated in Process Performance dashboard. Generally, system tasks do not need to be indexed as they are handled entirely without human interaction. Indexing system tasks also greatly increases the load on the system.

<process-index-instance-completion-best-effort>

The default setting is false. This property controls whether completion dates are created for instances migrated from previous versions of BPM. If the value is true, the latest completion date of the associated tasks is used for the instance completion date. If there are not any associated tasks, then the last modified time stamp of the instance is used.

<process-index-queue-update-size>

The default value for this property is 20. This property was added with APAR JR55895 to introduce batching to the part of the indexing process that updates the Lucene search index. The property limits the number of records per batch to ensure that the commit transaction to update the search index completes in less than 1 second.

Using a Shared Search Index

Each node maintains its own search index. In some environment topologies with multiple nodes, it becomes difficult for each search index to stay up to date when all search indexes are trying to access the index-tracking database tables. Competing for resources can be a challenge when there are more than 4 nodes configured. It can be beneficial to set up a shared search index between the nodes.

The index can be maintained in only one cluster member at a time. This restriction is enforced by using a locking strategy on the index.

To set up a shared search index, you can use a shared network storage solution for your index. In the WebSphere Admin Console, navigate to the environment variables by clicking Environment > WebSphere variables. Update the value of the cell scoped BPM_SEARCH_TASK_INDEX_ROOT variable to point to the common location for each cell.

Maintaining the Process Portal Search Index

Stay Current on the Latest Fixes

Be aware of the Business Automation Workflow maintenance strategy: https://www.ibm.com/support/pages/node/871252

Index Diagnostic Utility Tool

The Index Diagnostic tool was created by a team of BPM Developers. It is packaged as a .war file and can be deployed in the AppCluster as a WebSphere application. The tooling is quick and easy to install and access. It does not require stopping or restarting the server and takes only minutes to install. The tool comes with a .pdf file that provides detailed installation instructions and instructions for generating reports and interpreting the results. The tool also provides a means to resync the search-index-tracking tables and the Lucene search index directly in the tool. More in-depth information about the Diagnostic tool is provided in the next section.

Index Diagnostic Utility Tool v3.0

This version of the Index Diagnostic tool is supported for versions of IBM Business Automation Workflow up to version 23.0.2.
Download the Index Diagnostic Utility Tool V3.0: index-diagnostic.zip

Index Diagnostic Utility Tool v4.0

This version of the Index Diagnostic tool is supported for IBM Business Automation Workflow 24.0.0 and 24.0.1 and future releases.
Download the Index Diagnostic Utility Tool V4.0: index-diagnostic_4.0.zip

Clean up the Search-Index-Tracking database Tables

When operational Business Automation Workflow data is deleted by using the appropriate wsadmin commands, it is deleted from the operational database tables and the Process Portal search index. The corresponding records are not automatically deleted from the index-tracking tables, BPM_TASK_INDEX and BPM_INSTANCE_INDEX. Instead, only the DELETED_DATETIME column is updated in the tables. Over time, as records build up in these tables, issues such as database lock timeouts and high CPU usage can occur.

Verify the number of records in your tracking tables that represent deleted tasks and process instances by using the following queries, substituting <schema_name> where indicated.

SELECT COUNT(1) FROM <schema_name>.BPM_TASK_INDEX WHERE DELETED_DATETIME IS NOT NULL;  
SELECT COUNT(1) FROM <schema_name>.BPM_INSTANCE_INDEX WHERE DELETED_DATETIME IS NOT NULL;
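To put those counts in context, it can help to compare them with the total row count of each table. The following sketch runs the same style of query against an in-memory SQLite table so that it is self-contained. The SQLite database, the simplified two-column table, and the function names are assumptions made for illustration only; a real environment would use its own database client and the real schema.

```python
import sqlite3

KNOWN_TABLES = ("BPM_TASK_INDEX", "BPM_INSTANCE_INDEX")


def deleted_row_counts(connection, table_name):
    """Return (rows_marked_deleted, total_rows) for one index-tracking table."""
    # The table name is concatenated into the SQL, so only known names are allowed.
    if table_name not in KNOWN_TABLES:
        raise ValueError("unexpected table: " + table_name)
    deleted = connection.execute(
        "SELECT COUNT(1) FROM " + table_name + " WHERE DELETED_DATETIME IS NOT NULL"
    ).fetchone()[0]
    total = connection.execute("SELECT COUNT(1) FROM " + table_name).fetchone()[0]
    return deleted, total


def demo_connection():
    """Build a tiny stand-in table; the TASK_ID column is hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE BPM_TASK_INDEX (TASK_ID INTEGER, DELETED_DATETIME TEXT)")
    conn.executemany(
        "INSERT INTO BPM_TASK_INDEX VALUES (?, ?)",
        [(1, None), (2, "2024-01-01 00:00:00"), (3, "2024-01-02 00:00:00"), (4, None)],
    )
    return conn


if __name__ == "__main__":
    deleted, total = deleted_row_counts(demo_connection(), "BPM_TASK_INDEX")
    print(deleted, "of", total, "rows are marked as deleted")
```

A high share of rows marked as deleted suggests that the table is a good candidate for cleanup.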

The indexTablesCleanup command was included with the product starting with IBM Business Process Manager 8.6.0 CF2018.03. The indexTablesCleanup command cleans up the obsolete records from the index-tracking tables.

If you have inactive nodes for Disaster Recovery, you might need to run the processIndexRemoveDeleted command before you can successfully run the indexTablesCleanup command. If the indexTablesCleanup command returns an error regarding the value of the parameter that is provided for the number of nodes, then complete some additional steps to run the command successfully. These additional steps are documented in this technote: https://www.ibm.com/support/pages/methods-provided-ibm-clean-bpmtaskindex-and-bpminstancetable-do-not-work

Maintain and Tune the database

In some cases, it is important to tune the properties related to locking and perform tuning and upkeep of the database environment. Regularly schedule cleanup maintenance activities to purge Business Automation Workflow process instance data by using the BPMProcessInstancesPurge command and follow that up with running the indexTablesCleanup command.

How to Determine an Issue Exists

This section helps you determine when problems arise with the Process Portal Search Index and how to resolve them. It may not always be obvious when there is a problem with the search indexer. The following methods can be used to definitively determine whether there is a major problem with the health of the Search Index.

Process Portal Observations

Any number of discrepancies might be reported by BPM users that can indicate out of sync conditions within the search index. The search index stores many different fields of data for tasks and process instances including status, owner, team, exposed business data, task and instance IDs, and so on. The Process Portal Search Index is used to generate search results, task lists, and other data throughout Process Portal. Users might report that a closed task appears with an Open status in the results of a search against their Work task list. Users might report that incorrect numbers are shown in the Quick Stats views in the out of the box Process Portal dashboards. Users might be able to identify these discrepancies by comparing the data to another data point in Process Portal. Screen captures are a useful way to prove that these conditions exist.

Index Diagnostic Reports

The Index Diagnostic reports provide snapshots of the health of the Search Index. You can review these reports to determine whether the search-index-tracking tables and the Lucene search index are in sync with the operational data by checking the Index Summary Report. You can also check that the search indexing process is running on every interval to ensure that there are not any issues with the indexing jobs. You might find that they are not running as expected, indicating a problematic condition occurred that is preventing the jobs from running efficiently and successfully as expected.

Server Logs

Frequently check the systemout.log files and the FFDC files for exceptions or messages related to the search index. Search indexing activity exceptions can be identified by the strings “com.ibm.bpm.search” and “com.ibm.bpm.server.search.eventpoints.listeners”. Exceptions are often also written to the FFDC files, which provide more detail about the error. Other search terms related to the search index are “ProcessIndexU”, “ProcessIndexQ”, “ProcessIndexB”, and “SearchEventPo”.
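Checking the logs for these strings can be scripted. The following is a minimal sketch that scans log lines for the search terms listed above. The function name find_search_index_lines is hypothetical, and the way the log file is opened (path and encoding) is left to the caller because it varies by environment.

```python
# Search terms taken from the text above.
SEARCH_INDEX_MARKERS = (
    "com.ibm.bpm.search",
    "com.ibm.bpm.server.search.eventpoints.listeners",
    "ProcessIndexU",
    "ProcessIndexQ",
    "ProcessIndexB",
    "SearchEventPo",
)


def find_search_index_lines(lines):
    """Return (line_number, text) for every line that contains a search index marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in SEARCH_INDEX_MARKERS):
            hits.append((number, line.rstrip("\n")))
    return hits
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example `find_search_index_lines(open(path, errors="replace"))`, where `path` points to a copy of the log file.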

Database Deadlocks & Lock Escalations

Contact your DBA to determine whether deadlocks or lock escalations are preventing the indexing process from running successfully. Your DBA is able to point to any locking issues with the search-index-tracking tables BPM_TASK_INDEX, BPM_INSTANCE_INDEX, and BPM_TASK_INDEX_JOB.

Javacore & Heap dumps

If you are experiencing high CPU usage or OutOfMemoryError exceptions, then analyze any javacore files or heap dumps that are generated. If the analysis reveals that “com/ibm/bpm/search/…/…/…” is the culprit, then the search index is involved in the issue. You might also find hung threads reported in the server logs indicating “com.ibm.bpm.search…” which reveals hung threads with the search indexer, most likely due to deadlocks in the database.

How to Resolve Common Issues

The most common Process Portal search index issues are caused by a buildup of records in the BPM_TASK_INDEX or the BPM_INSTANCE_INDEX tables. The first step is to check whether there is a need to clean up these tables. In the Process Admin Console, navigate to Performance > Dashboard. Expand the "Other Data" section. Check the values of the "Number of task index table rows available for clean up" and "Number of instance index table rows available for cleanup" entries. These values show how many records in the index-tracking tables can be cleaned up by using the indexTablesCleanup command.

You can further clean up the index-tracking tables by running the BPMProcessInstancesPurge command first and then running the indexTablesCleanup command to reduce the volume of the index-tracking database tables.

Solving Simple Out of Sync Scenarios

Out of sync conditions can be verified easily by Administrators with the Index Diagnostic tool reports. The report shows the counts of tasks and process instances listed in the operational database tables versus the index-tracking tables versus the Lucene search index. The report can be used to quickly determine whether an out of sync condition exists. If an out of sync condition exists, there are a number of methods to get back in sync.

1. Check the Search Index Configuration Properties for each Node

Check the search index configuration properties by accessing the Index Diagnostic Utility tool – Index Summary Report or by reviewing the <search-index> section in the TeamworksConfiguration.running.xml file.

Ensure indexing is enabled. By default, system tasks are not indexed, so if you expect Process Portal to return system tasks in search results against tasks, then you must configure system tasks to be included in the indexing process.

If you have multiple nodes in your environment and need search indexing enabled to use quick search in Process Portal, then search indexing must be enabled on all nodes. If your configuration has indexing disabled on some nodes but enabled on others, then the search index goes out of sync. Activities on the nodes with indexing disabled do not trigger updates to the index-tracking tables shared by all nodes and are not persisted to the other active search indexes. If you must disable indexing on one node, then you must also bring the node offline so that users are not able to perform activities against tasks and process instances on that node.

2. Index Diagnostic Utility Tool Reconciliation Feature

The easiest way to resync the index-tracking tables to the operational database and to sync the Lucene search index to the index-tracking tables is to run the reconciliation features in the Index Diagnostic tool. The tool reports when discrepancies exist and identifies the task and process instance records or documents that are out of sync. The reconciliation is executed with the click of a button inside the report.

3. Run the Search Index Rebuild Command

The processIndexFullReIndex command rebuilds the search index. This command must be run against each node in the environment. While this command is executing, the search filters usually available in Process Portal are temporarily disabled and unavailable to users. This command can be run while users are working in the environment.

To monitor the progress and completion of the command, check the systemout.log file for the following messages:

CWLLG0763I: The IBM BPM process search full re-index job was successfully requested. Process indexing will begin on the next scheduled process index update interval.

CWLLG0764I: The BPM process search full re-index job was successfully started.

CWLLG0765I: The IBM BPM process search full re-index job was successfully completed.

When the command completes, the CWLLG0765I message indicates how many tasks and process instance documents were updated in the search index.
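Monitoring for these messages can be automated. The following sketch reports the most recent re-index stage found in a sequence of log lines, based only on the three message IDs listed above. The function name reindex_status and the stage labels are hypothetical.

```python
# Message IDs taken from the text above, mapped to illustrative stage labels.
REINDEX_MESSAGES = {
    "CWLLG0763I": "requested",
    "CWLLG0764I": "started",
    "CWLLG0765I": "completed",
}


def reindex_status(lines):
    """Return the stage of the last re-index message seen, or "not requested"."""
    latest = "not requested"
    for line in lines:
        for message_id, stage in REINDEX_MESSAGES.items():
            if message_id in line:
                latest = stage
    return latest
```

Because the command must be run against each node, a check like this would be applied to the log of every node separately.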

High CPU Usage & OutOfMemoryError Exceptions due to TaskIndexVO

This issue can occur when a large volume of tasks or process instances is cleaned up from the environment. The search indexer tries to update many records of the search-index-tracking tables, which leads to this issue. To resolve this issue, clean up the index-tracking tables by using the indexTablesCleanup command and then run the processIndexFullReIndex command.

Search Index Job Commits Frequently Failing

You might find that the search index is failing to commit updates to the Lucene search index indicated by a high volume of the following message generated in the systemout.log file:

ProcessIndexQ W com.ibm.bpm.search.process.spi.impl.ProcessIndexQueueDaemon updateTaskQueueItems() Task Index [1935094] took longer than 1 second to commit, retrying.

This message indicates that the number of tasks and instances that the indexing job attempts to update at one time is too high and needs to be regulated. The fix for APAR JR55895 resolves this issue by introducing batching to the indexing process. By default, each batch contains 20 records. APAR JR55895 provides a new configurable property, <process-index-queue-update-size>, which changes the number of records that can be updated in one batch.

Perform the following steps to resolve the issue:

Tune the <process-index-queue-update-size> property. Decreasing the value reduces the time that each batch takes to commit.
Tune the database to ensure that allocated resources are sufficient, and run REORG and RUNSTATS on the tracking tables.
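To judge whether tuning has helped, the volume of slow-commit warnings can be counted before and after the change. The following sketch counts them by type, matching on the wording of the warning shown above. The regular expression and the function name count_slow_commits are assumptions of this sketch, and the exact message text might differ between versions.

```python
import re
from collections import Counter

# Matches the wording of the warning, for example:
# "Task Index [1935094] took longer than 1 second to commit, retrying."
SLOW_COMMIT = re.compile(r"(Task|Instance) Index\b.*took longer than 1 second to commit")


def count_slow_commits(lines):
    """Count slow-commit warnings, keyed by "Task" or "Instance"."""
    counts = Counter()
    for line in lines:
        match = SLOW_COMMIT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Comparing the counts for the same length of time before and after a change gives a rough indication of whether the batch size or the database tuning made a difference.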

Database Deadlocks & Lock Escalations

If you do not maintain the health of the tracking tables by cleaning up records that correspond to deleted tasks and process instances, then you might find that deadlocks occur during the indexing process. Deadlocks prevent the search index from syncing successfully and cause high CPU usage in the database server. To resolve these issues, run the indexTablesCleanup command followed by the processIndexFullReIndex command.

NullPointerException Errors Reported in the Server Logs

The server logs might indicate NullPointerException errors related to “com.ibm.bpm.search…”. In this case, open a case with IBM Support.

Providing MustGather data for IBM BPM Support

Provide the information in this list to IBM Support to address search index-related issues:

1. Problem Description

a. What problems are observed in the BPM environment?

b. What are the indications that the issue is related to the search index?

2. Business Impact Statement

a. How are users impacted?

b. How are business requirements impacted?

c. Are there any deadlines impacted – if yes, when is the deadline?

3. What type of environment is the issue occurring within? DEV/TEST/PROD?

4. Upload screen captures of the search results or other symptoms that indicate the Process Portal search index is out of sync.

5. Provide the TeamworksConfiguration.running.xml and custom XML files for each node

6. Use the Index Diagnostic tool to create a Diagnostic Summary Report for Support and upload the .zip file

7. Upload the entire logs directory including FFDC files

8. Provide the results of the following SQL queries:

a. SELECT COUNT(1) FROM <schema_name>.BPM_TASK_INDEX WHERE DELETED_DATETIME IS NOT NULL;

b. SELECT COUNT(1) FROM <schema_name>.BPM_INSTANCE_INDEX WHERE DELETED_DATETIME IS NOT NULL;
