IBM Support

A looping or runaway business process definition (BPD) or service might occur in IBM Business Process Manager (BPM)

Question & Answer


Question

During testing and development, a business process definition or service might accidentally enter into an infinite loop. What is the procedure to clean up these unnecessary instances and tasks and stop the looping?

Cause

IBM Business Process Manager is a development platform and has many of the same programming structures and features that are found in traditional programming, such as looping, recursion, and so on. Looping on a service or business process definition is possible. As with any looping event, an exit condition must exist. Sometimes during development or production runtime, an infinite loop can occur.
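The exit-condition point can be illustrated outside the product. The following is a minimal Python sketch, not IBM Business Process Manager code. The function name run_with_guard and the max_iterations limit are hypothetical. The sketch only shows the general pattern of pairing a loop's intended exit condition with a hard iteration cap, so that a faulty condition cannot loop forever.

```python
def run_with_guard(step, is_done, max_iterations=1000):
    """Call step() until is_done() is true, but never more than max_iterations times.

    Returns the number of iterations performed. Raises RuntimeError if the
    cap is reached first, which signals a probable infinite loop.
    """
    iterations = 0
    while not is_done():
        if iterations >= max_iterations:
            raise RuntimeError(
                f"no exit after {max_iterations} iterations; possible infinite loop"
            )
        step()
        iterations += 1
    return iterations
```

A loop whose exit condition can never become true then fails fast with an error instead of consuming the server indefinitely.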

IBM Business Process Manager has a component called the Event Manager. The Event Manager moves tokens in the business process definition engine and the service engine. The Event Manager continues to process a looping event until either the Event Manager is stopped or the loop ends.

Answer

Use the Process Monitor to view instances with high loop counts

Open the Process Admin Console, click Monitoring, and then click Process Monitor. This view lists all processes and services that are currently executing. You can select an executing process or service and click Halt to stop it.


Stop the Event Manager

Stopping the Event Manager is the first step in controlling a runaway process. If the environment is a cluster, stop the Event Manager on each node by using the following steps in the Process Admin Console:

  1. Select Event Manager from the left column.
     
  2. Select each Event Manager and click Pause.
     

If the server is under extreme load, you might be unable to reach this screen, or the Event Manager might not pause when you request it from this screen.

The next option is to have the Event Manager start in a paused state at server startup by using the following steps:

  1. Create an XML document called 120pauseEM.xml that contains the following code:
    <properties>
      <event-manager>
         <scheduler>
            <start-paused merge="replace">true</start-paused>
         </scheduler>
      </event-manager>
    </properties>

     
  2. Place the file in the same directory as the 100custom.xml file for the Process Server. For more information, see the Configuration file overview and explanation for Lombardi Teamworks, WebSphere Lombardi Edition (WLE), and the IBM Business Process Manager (BPM) products document.
     
  3. Restart the servers.
     

Now the Event Manager is paused. The business process definitions and services do not run.


Note: For clusters, place the file in the deployment manager section and synchronize the nodes.
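A typo in a hand-written override file is easy to make, so it can help to check the file before you copy it into place. The following minimal Python sketch uses only the standard library and the XML content from step 1. The function name start_paused_value is hypothetical. The sketch checks only that the XML is well formed and that the start-paused element is present. It does not check how the product merges the file.

```python
import xml.etree.ElementTree as ET

# Content of 120pauseEM.xml, copied from step 1.
PAUSE_EM_XML = """<properties>
  <event-manager>
    <scheduler>
      <start-paused merge="replace">true</start-paused>
    </scheduler>
  </event-manager>
</properties>
"""


def start_paused_value(xml_text):
    """Parse the override and return the text of start-paused, or None if absent."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError if not well formed
    node = root.find("./event-manager/scheduler/start-paused")
    return None if node is None else (node.text or "").strip()


print(start_paused_value(PAUSE_EM_XML))  # true
```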


Stop and delete looping instances and tasks

After the server is started (with the Event Manager paused), find the task or instance that is looping and delete it. If there are only a few instances, use the Process Inspector in IBM Process Designer to delete these rogue instances and tasks. In the Process Admin Console, use the Process Monitor to find business process definitions or services that have run a large number of times. This approach gives you an idea of where to look for the looping instance.


For a large volume of erroneous instances or tasks that need to be deleted, use the REST API to terminate the instances. The REST call must be made once for each instance. Terminating an instance stops all of its tasks and allows the instance to be deleted with the wsadmin command BPMProcessInstancesCleanup (BPM 8.0-8.5.6) or BPMProcessInstancesPurge (BPM 8.5.7).
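Because the terminate call is made once per instance, a small script can loop over a list of instance IDs. The following Python sketch is illustrative only. The resource path /rest/bpm/wle/v1/process/{instanceId} with action=terminate, the use of the PUT method, and basic authentication are assumptions. Confirm them against the REST API documentation for your release. The host name, the credentials, and the function names are hypothetical.

```python
import base64
import urllib.parse
import urllib.request

# Assumption: shape of the process-instance resource. Verify for your release.
REST_PATH = "/rest/bpm/wle/v1/process/{instance_id}"


def terminate_url(base_url, instance_id):
    """Build the URL that asks the server to terminate one instance."""
    path = REST_PATH.format(
        instance_id=urllib.parse.quote(str(instance_id), safe="")
    )
    query = urllib.parse.urlencode({"action": "terminate"})
    return f"{base_url.rstrip('/')}{path}?{query}"


def terminate_instances(base_url, instance_ids, user, password):
    """Send one terminate request per instance; return {instance_id: HTTP status}."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    results = {}
    for instance_id in instance_ids:
        request = urllib.request.Request(
            terminate_url(base_url, instance_id),
            method="PUT",  # assumption: the server accepts PUT for this action
            headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        )
        # urlopen raises urllib.error.HTTPError on a 4xx or 5xx response.
        with urllib.request.urlopen(request) as response:
            results[instance_id] = response.status
    return results
```

A call might look like terminate_instances("https://bpm.example.com:9443", [1107, 1108], "admin", "secret"), where every argument is a placeholder. The terminated instances still need to be deleted afterward with the wsadmin command.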

Other considerations

What if the problem is an undercover agent (UCA)?
  • A scheduled, time-based undercover agent might be set to run every minute, and this processing interval can cause too much overhead. You can disable the undercover agent through the Installed Apps tab in the Process Admin Console. Complete the following steps:
    1. Find the application and disable the undercover agent.
       
    2. Make a workspace from the deployed application.
       
    3. Correct the settings for the scheduled undercover agent.
       
    4. Create a new snapshot in the workspace.
       
    5. Deploy the application to the runtime server to correct this version.
       
  • For an event-based undercover agent that kicked off some set of instances, complete the following steps:
    1. If a service or business process definition looped calling an undercover agent, delete this service.
       
    2. Delete the business process definitions that received the messages.

  • Note: There are additional considerations with event-based UCAs. Perform this procedure only in a non-production environment. In production, business data from other, valid sources might be present.

    When a UCA is triggered, a Java™ Message Service (JMS) message is created and sent. This message is picked up by a start message event (SME) or intermediate message event (IME). Until an SME or IME picks up the message during the Event Manager poll, the message is stored in the messaging system, in the service integration bus (SIB) tables in the process database. Depending on your database and topology, the SIB tables might be in a different schema than the Process Server database. To remove all messages in transit, stop all IBM Business Process Manager servers, delete the SIB tables, and then restart the servers. Before you delete the tables, confirm that the check box for re-creating the tables is selected. For more information, see the A malformed Java Message Service (JMS) message causes a repeating error in SystemOut.log file for IBM Business Process Manager (BPM) document.



Identifying runaway instances from the database

You might be unable to determine which instances are runaway instances. An instance might be looping and creating hundreds or thousands of tasks. The following query counts the open tasks for each instance and orders the instances from the highest count to the lowest. The InstanceCount column holds the number of open tasks for the instance, and the create time helps you determine when each instance was created. For example, if most instances have around 50 tasks and the report shows an instance with 2,000 tasks, that instance is likely looping.
 
  • select COUNT(t.BPD_INSTANCE_ID) as InstanceCount, t.BPD_INSTANCE_ID, bpd.INSTANCE_NAME, bpd.CREATE_DATETIME
    from LSW_TASK t inner join LSW_BPD_INSTANCE bpd on bpd.BPD_INSTANCE_ID = t.BPD_INSTANCE_ID
    where t.status <> 32
    group by t.BPD_INSTANCE_ID, bpd.INSTANCE_NAME, bpd.CREATE_DATETIME
    order by InstanceCount desc, bpd.CREATE_DATETIME desc
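
When the result set is long, the comparison against a typical task count can be automated. The following Python sketch assumes that the first three columns of the query result have been loaded into tuples. The function name flag_suspects, the factor of 10, and the sample rows are all made up for illustration. The median is used as the typical value because a single runaway instance would distort an average.

```python
from statistics import median


def flag_suspects(rows, factor=10):
    """Return rows whose open-task count exceeds factor times the median count.

    Each row is (task_count, instance_id, instance_name), matching the first
    three columns of the query result.
    """
    if not rows:
        return []
    typical = median(count for count, _, _ in rows)
    return [row for row in rows if row[0] > factor * typical]


# Made-up sample data, not real query output.
rows = [
    (2000, 1107, "Order 1107"),
    (55, 1093, "Order 1093"),
    (50, 1101, "Order 1101"),
    (45, 1088, "Order 1088"),
]
print(flag_suspects(rows))  # [(2000, 1107, 'Order 1107')]
```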

Identifying services which are looping

Looping services that were started in an instance can be halted from the Process Monitor screen. Turning on traces lets you discover which task ID is associated with the loop. See this dwAnswer post for more information.



What if an outside system caused the influx, such as a web service or email system?
Add controls and logic to prevent outside systems from steamrolling IBM Business Process Manager. You need to identify how the instances are created. For example:
  • Does this situation involve a call to a web service that then triggers the start of services or business process definitions?
     
  • Is there an undercover agent that reads email messages and kicks off tasks, and was the inbox spammed?



There are two approaches to prevention.
  • Ensure that the kick off system does not go out of control.
     
  • Put extra filters or limitations on the receiving IBM Business Process Manager side. An example might be that you can only receive one request per customer ID per day. Another example might be if you receive more than X requests in Y time, stop processing.
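
The second approach, limiting requests on the receiving side, can be sketched as a sliding-window throttle. The following Python code is a generic illustration, not an IBM Business Process Manager API. The class name RequestThrottle and its parameters are hypothetical. In the product, equivalent logic would live in the service that receives the incoming request.

```python
from collections import deque


class RequestThrottle:
    """Allow at most max_requests per window_seconds for each key.

    The key could be, for example, a customer ID.
    """

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._seen = {}  # key -> deque of timestamps of accepted requests

    def allow(self, key, now):
        """Return True and record the request if key is under its limit at time now."""
        stamps = self._seen.setdefault(key, deque())
        # Drop timestamps that have aged out of the window.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.max_requests:
            return False
        stamps.append(now)
        return True
```

For the rule of one request per customer ID per day, create RequestThrottle(max_requests=1, window_seconds=86400) and pass the customer ID as the key. For the rule of X requests in Y time, use a single shared key and set max_requests to X and window_seconds to Y.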
 

Additional cleanup of Performance Data Warehouse data

A looping business process definition (BPD) with tracking enabled generates large volumes of data. The Performance Data Warehouse database might also grow substantially, in the range of several gigabytes.

Considerations for non-production environments

Process Center and other non-production environments are for testing and development. Any data in the Performance Data Warehouse (PDW) is from testing and does not have business value. Often the easiest method is to stop the Performance Data Warehouse server and rebuild the database tables with the installation SQL scripts. After this step, you need to resend the tracking definitions for all active applications. You can resend tracking definitions through the Process Admin Console under the Installed Apps tab. For more information, see the How to clean up the Performance Data Warehouse database and the LSW_PERF_DATA_TRANSFER table for IBM Business Process Manager (BPM) document.

Considerations for production environments

In IBM Business Process Manager V8.0.1 Fix Pack 2 and V8.5.0 Fix Pack 1, you can use the perfdwtool.cmd or perfdwtool.sh command to archive and delete data from the Performance Data Warehouse. Using this tool is the best option for removing the excess data from a loop. For more information on this command, see the product documentation.

 

Looping services

You can halt a looping service from the Process Monitor screen. If the service is not tied to an instance, you might have to stop all AppTargets to clear out the looping service. Also consider the following two fixes, which help to stop looping on your system.

  • Feature JR48395, which was introduced in IBM Business Process Manager V8.5.5.0, can detect and stop loops in JavaScript™ blocks. Individual fixes are available on prior V8.x releases.
  • JR51504 is an additional fix which detects when the loop occurs in a method call to Java code. An example is a method in a managed asset which is looping.

Process Monitor and Event Manager documentation

Starting in V8.5.5.0, there is a feature that allows you to monitor the IBM Business Process Manager system with MBeans. See the product documentation for more information. The event manager page and its related settings are also explained in the product documentation.



Product Synonym

BPM

Document Information

Modified date:
08 June 2020

UID

swg21622584