About Exception Handling
Process Manager provides built-in exception handlers you can use to automatically take corrective action when certain exceptions occur, minimizing the human intervention required. You can also define your own exception handlers for certain conditions.
Built-in exception handlers
The built-in exception handlers are:
- Rerun
- Kill
- Alarm
Rerun
The Rerun exception handler reruns the entire work item. Use it in situations where rerunning the work item can fix the problem. The Rerun exception handler can be used with the Underrun, Exit, and Start Failed exceptions. Work items that depend on a work item that is being rerun cannot have their dependency met until that work item has been rerun for the last time. When you select the Rerun exception handler, you can specify the maximum number of times it reruns the work item.
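The rerun limit can be pictured with a small sketch. This is not Process Manager code: `run_with_rerun`, `max_reruns`, and `handled_exit_code` are hypothetical names chosen for illustration, and the work item is modeled as a plain Python callable that returns an exit code.

```python
from typing import Callable, Tuple


def run_with_rerun(
    work_item: Callable[[], int],
    max_reruns: int,
    handled_exit_code: int,
) -> Tuple[int, int]:
    """Run a work item and rerun it while it exits with the handled code.

    Hypothetical helper for illustration only. Returns the final exit code
    and the number of reruns that were used.
    """
    exit_code = work_item()  # initial run
    reruns = 0
    while exit_code == handled_exit_code and reruns < max_reruns:
        reruns += 1
        exit_code = work_item()  # rerun the entire work item
    # Only this final result would be visible to dependent work items.
    return exit_code, reruns


# Usage: a work item that fails twice with exit code 1, then succeeds.
results = iter([1, 1, 0])
final_code, reruns_used = run_with_rerun(lambda: next(results), 3, 1)
```

In this sketch the caller only sees the result of the last run, which mirrors the rule that dependencies cannot be met until the work item has been rerun for the last time.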
Kill
The Kill exception handler kills the work item. Use this exception handler when a work item has overrun its time limits. The Kill exception handler can be used with the Overrun exception, and when you are monitoring for the number of jobs done or exited in a flow or subflow.
If you are running z/OS® mainframe jobs on Windows, you must configure a special queue and submit the jobs to that queue to be able to kill them.
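As a generic illustration of killing a work item that overruns its time limit, the following sketch uses Python's standard `subprocess` module. It does not show how Process Manager kills jobs: `run_with_overrun_kill` and `overrun_seconds` are hypothetical names, and a local child process stands in for a job.

```python
import subprocess
import sys
from typing import List, Tuple


def run_with_overrun_kill(command: List[str], overrun_seconds: float) -> Tuple[str, int]:
    """Run a command and kill it if it runs longer than overrun_seconds.

    Hypothetical helper for illustration only. Returns a status label and
    the exit value reported for the process.
    """
    process = subprocess.Popen(command)
    try:
        exit_code = process.wait(timeout=overrun_seconds)
        return "finished", exit_code
    except subprocess.TimeoutExpired:
        process.kill()              # overrun: apply the Kill handler
        exit_code = process.wait()  # collect the exit value after the kill
        return "killed", exit_code


# Usage: a child process that sleeps far longer than the allowed time.
slow_job = [sys.executable, "-c", "import time; time.sleep(30)"]
status, exit_value = run_with_overrun_kill(slow_job, 0.5)
```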
Alarm
An alarm provides a visual cue that an exception has occurred and either sends an email notification or executes a script. Use an alarm to notify key personnel, such as database administrators, of problems that require attention. An alarm has no effect on the flow itself.
You can use an alarm as an automated exception handler for many types of exceptions.
For other types of exceptions where alarms are not available as exception handlers, you can create an alarm directly in the Flow Editor.
An opened alarm appears in the list of open alarms in the Flow Manager until the history log file containing the alarm is deleted or archived.
Alarms are configured by the Process Manager administrator.
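The two effects of an alarm, a visible entry in the open-alarms list plus a notification action, can be modeled roughly as follows. `Alarm`, `AlarmBoard`, and `open_alarm` are hypothetical names, and the `action` callable merely stands in for sending an email or executing a script.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Alarm:
    """Hypothetical alarm definition: a name plus one notification action."""
    name: str
    action: Callable[[str], None]  # stands in for "send email" or "run script"


@dataclass
class AlarmBoard:
    """Hypothetical stand-in for the list of open alarms."""
    open_alarms: List[str] = field(default_factory=list)

    def open_alarm(self, alarm: Alarm, message: str) -> None:
        self.open_alarms.append(f"{alarm.name}: {message}")  # visual cue
        alarm.action(message)                                # notification
        # Nothing here touches the flow: the alarm only reports.


# Usage: collect notifications in a list instead of sending real email.
sent_messages: List[str] = []
board = AlarmBoard()
board.open_alarm(Alarm("dba_oncall", sent_messages.append), "flow overrun")
```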
Behavior when built-in exception handlers are used
The following tables describe what happens when a built-in exception handler is used with a flow, a subflow, or a job or job array.
Flows
When a Flow Experiences this Exception… | and the Handler Used is… | This Happens… |
---|---|---|
Overrun | Kill | The flow is killed. All incomplete jobs in the flow are killed. The flow status is ‘Killed’. |
Overrun | Alarm | The alarm is opened. The flow continues execution as designed. |
Underrun | Rerun | Flows that have a dependency on the success of this flow may not be triggered, depending on the type of dependency. The flow is recreated with the same flow ID. The flow is rerun from the first job, as many times as required until the execution time exceeds the underrun time specified. |
Underrun | Alarm | The alarm is opened. |
Flow has exit code of n | Rerun | Flows that have a dependency on this flow may not be triggered, depending on the type of dependency. The flow is recreated with the same flow ID. The flow is rerun from the first job, as many times as required until an exit code other than n is reached. |
Flow has exit code of n | Alarm | The alarm is opened. Flows that have a dependency on this flow may not be triggered, depending on the type of dependency. |
n unsuccessful jobs | Kill | The flow is killed. All incomplete jobs in the flow are killed. The flow status is ‘Killed’. |
n unsuccessful jobs | Alarm | The alarm is opened. Flows that have a dependency on this flow may not be triggered, depending on the type of dependency. The flow continues execution as designed. |
Work item has exit code of n | Rerun | Flows that have a dependency on this flow may not be triggered, depending on the type of dependency. The flow is rerun from the first job, as many times as required until the work item has a different exit code. |
Subflows
When a Subflow Experiences this Exception… | and the Handler Used is… | This Happens… |
---|---|---|
Overrun | Kill | The subflow is killed. The flow behaves as designed. |
Overrun | Alarm | The alarm is opened. Both the flow and subflow continue execution as designed. |
Underrun | Rerun | Work items that have a dependency on this subflow may not be triggered, depending on the type of dependency. The subflow is rerun from the first job, as many times as required until the execution time exceeds the underrun time specified. |
Underrun | Alarm | The alarm is opened. The flow continues execution as designed. |
Subflow has exit code of n | Rerun | Work items that have a dependency on this subflow may not be triggered, depending on the type of dependency. The subflow is rerun from the first job, as many times as required until an exit code other than n is reached. |
Subflow has exit code of n | Alarm | The alarm is opened. The flow continues execution as designed. |
n unsuccessful jobs | Kill | The subflow is killed. The flow behaves as designed. |
n unsuccessful jobs | Alarm | The alarm is opened. The flow and subflow continue execution as designed. |
A work item has exit code of n | Rerun | Work items that have a dependency on this flow may not be triggered, depending on the type of dependency. The flow is rerun from the first job, as many times as required until the work item has a different exit code. |
Job or job array
When a Job or Job Array Experiences this Exception… | and the Handler Used is… | This Happens… |
---|---|---|
Overrun | Kill | The job or job array is killed. The flow behaves as designed. The job or job array status is determined by its exit value. |
Overrun | Alarm | The alarm is opened. Both the flow and job or job array continue to execute as designed. |
Underrun | Rerun | Objects that have a dependency on this job or job array may not be triggered, depending on the type of dependency. The job or job array is rerun as many times as required until the execution time exceeds the underrun time specified. |
Underrun | Alarm | The alarm is opened. The flow continues execution as designed. |
An exit code of n | Rerun | The job or job array is rerun as many times as required until it ends successfully. |
An exit code of n | Alarm | The alarm is opened. The flow behaves as designed. |
n unsuccessful jobs | Kill | The job array is killed. The flow behaves as designed. The job array status is determined by its exit value. |
n unsuccessful jobs | Alarm | The alarm is opened. The flow continues execution as designed. |
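The handler and exception pairings in the three tables above can be condensed into a lookup, which may be convenient when checking a flow design. The keys and the `handlers_for` function are illustrative labels, not identifiers used by Process Manager, and the mapping reflects only the combinations listed in these tables.

```python
from typing import Dict, Tuple

# Transcribed from the behavior tables; labels are illustrative only.
HANDLERS: Dict[Tuple[str, str], Tuple[str, ...]] = {
    ("flow", "overrun"): ("Kill", "Alarm"),
    ("flow", "underrun"): ("Rerun", "Alarm"),
    ("flow", "flow exit code n"): ("Rerun", "Alarm"),
    ("flow", "n unsuccessful jobs"): ("Kill", "Alarm"),
    ("flow", "work item exit code n"): ("Rerun",),
    ("subflow", "overrun"): ("Kill", "Alarm"),
    ("subflow", "underrun"): ("Rerun", "Alarm"),
    ("subflow", "subflow exit code n"): ("Rerun", "Alarm"),
    ("subflow", "n unsuccessful jobs"): ("Kill", "Alarm"),
    ("subflow", "work item exit code n"): ("Rerun",),
    ("job or job array", "overrun"): ("Kill", "Alarm"),
    ("job or job array", "underrun"): ("Rerun", "Alarm"),
    ("job or job array", "exit code n"): ("Rerun", "Alarm"),
    ("job or job array", "n unsuccessful jobs"): ("Kill", "Alarm"),
}


def handlers_for(work_item_type: str, exception: str) -> Tuple[str, ...]:
    """Return the handlers the tables list for this pairing, or an empty tuple."""
    return HANDLERS.get((work_item_type, exception), ())


# Usage: which handlers do the tables list for a subflow underrun?
subflow_underrun_handlers = handlers_for("subflow", "underrun")
```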
User-defined exception handlers
In addition to using the built-in exception handlers, you can design your flow definitions to handle exceptions by:
- Running a recovery job
- Triggering another flow
Recovery job
You can use a job dependency in a flow definition to run a job that performs some recovery function when an exception occurs.
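The idea of a recovery job can be sketched as a dependency that is satisfied only when an upstream job fails. This is a toy model, not the Flow Editor's format: `run_flow`, the `"failure"` and `"success"` condition labels, and the job names are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Job = Callable[[], int]  # a job is modeled as a callable returning an exit code


def run_flow(jobs: Dict[str, Job], depends_on: Dict[str, Tuple[str, str]]) -> List[str]:
    """Run jobs in insertion order and return the names of the jobs that ran.

    depends_on maps a job name to (upstream job, "success" or "failure").
    Hypothetical helper for illustration only.
    """
    exit_codes: Dict[str, int] = {}
    ran: List[str] = []
    for name, job in jobs.items():
        dependency = depends_on.get(name)
        if dependency is not None:
            upstream, condition = dependency
            if upstream not in exit_codes:
                continue  # upstream never ran, so the dependency cannot be met
            upstream_failed = exit_codes[upstream] != 0
            if (condition == "failure") != upstream_failed:
                continue  # dependency condition not met
        exit_codes[name] = job()
        ran.append(name)
    return ran


# Usage: "cleanup" is the recovery job and runs only if "load" fails.
jobs = {"load": lambda: 1, "cleanup": lambda: 0, "report": lambda: 0}
depends_on = {"cleanup": ("load", "failure"), "report": ("load", "success")}
jobs_that_ran = run_flow(jobs, depends_on)
```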
Recovery flow
You can create a flow that performs some recovery function for another flow. When you submit the recovery flow, specify the name of the other flow and the exception as the event that triggers the recovery flow.
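Triggering a recovery flow from an event made up of a flow name and an exception can be modeled as a small registry. The class and method names below (`EventTriggers`, `submit_recovery_flow`, `raise_exception`) and the flow name `nightly_load` are assumptions made for this sketch and are not Process Manager commands or APIs.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

RecoveryFlow = Callable[[], str]  # a recovery flow is modeled as a callable


class EventTriggers:
    """Hypothetical registry of recovery flows keyed by (flow name, exception)."""

    def __init__(self) -> None:
        self._triggers: DefaultDict[Tuple[str, str], List[RecoveryFlow]] = defaultdict(list)

    def submit_recovery_flow(self, watched_flow: str, exception: str,
                             recovery_flow: RecoveryFlow) -> None:
        # The watched flow's name and the exception together form the event.
        self._triggers[(watched_flow, exception)].append(recovery_flow)

    def raise_exception(self, flow: str, exception: str) -> List[str]:
        # Run every recovery flow registered for this exact event.
        return [recover() for recover in self._triggers.get((flow, exception), [])]


# Usage: recover from an Overrun in the (hypothetical) flow "nightly_load".
triggers = EventTriggers()
triggers.submit_recovery_flow("nightly_load", "Overrun", lambda: "restore_backup ran")
triggered = triggers.raise_exception("nightly_load", "Overrun")
```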