Cluster Resource Group Exit Program


  Required Parameter Group:


  Include: QSYSINC/QCSTCRG3

For each cluster resource group that has an exit program specified, the exit program is called when various Cluster Resource Services APIs are used or when various cluster events occur. The exit program is called on each active node in the cluster resource group's recovery domain and is passed an Action Code that tells the exit program what function to perform.

An active node in the cluster resource group's recovery domain means that cluster resource services and the job for the particular cluster resource group are running on the node.

The exit program is required for data, application and peer cluster resource groups and is responsible for providing and managing the environment necessary for the resource's resilience.

The exit program is optional for device cluster resource groups because the system manages resilient devices. An exit program may be specified for a device cluster resource group if a user has additional functions to perform during various APIs or cluster events.

The exit program is called from a separate job, which is started with the user profile specified on the Create Cluster Resource Group (QcstCreateClusterResourceGroup) API. For most action codes, Cluster Resource Services waits for the exit program to finish before continuing. No timeout is used. If the exit program goes into a long wait, such as waiting for a response to a message sent to an operator, no other work is started for the affected cluster resource group. If the long wait occurs during failover processing for a node failure, all Cluster Resource Services jobs are affected and no other cluster work is started. Take care in the exit program wherever a long wait is possible.

In general, if the exit program is unsuccessful or ends abnormally, it is called a second time with an action code of Undo. This allows any unfinished activity to be backed out and the original state of the cluster resource group and the resilient resource to be restored. There are exceptions. Some APIs continue even if the exit program is not successful and do not make a second call with an Undo action code. Also, an application cluster resource group exit program is not called with Undo if it fails while processing the Start action code for a Switchover or Failover.

More information about action codes, functions an exit program should perform, and what causes an exit program to be called is presented after the exit program parameters are described.

The exit program is restricted in the Cluster Resource Services APIs and commands it can use. Only the following are allowed:

Also, the exit program must follow these guidelines to run properly in the job Cluster Resource Services starts for it and to handle error conditions correctly.

Note: See Cluster Resource Services Job Structure for additional information about jobs used to call exit programs.

Cluster middleware IBM® Business Partners and available clustering products provide software that replicates data to other nodes in a cluster by using data cluster resource groups. Application cluster resource groups may have dependencies on these data cluster resource groups. An application cluster resource group exit program can be used to coordinate activities with data cluster resource group exit programs that are provided by a high availability business partner (HABP).

Sample source code that can be used as the basis for writing an exit program is shipped in the QUSRTOOL library. See the TCSTAPPEXT and TCSTDTAARA members in the QATTSYSC file for an example written in ILE C.


Authorities and Locks

None.


Required Parameter Group

Success indicator
OUTPUT; BINARY(4)

Indicates to Cluster Resource Services the results of the cluster resource group exit program. The exit program must set this parameter before it ends. If the job running the exit program is cancelled before the exit program ends, the exit program cancel handler should set this parameter. Possible values of this parameter

Some APIs ignore this field. Regardless of the value set by the exit program, these functions continue to completion and do not back out partial results or call the exit program a second time with an Undo action code. For these APIs, the exit program should therefore make every attempt to complete successfully. This field is ignored by the following:

An informational alert message, CPIBB10, is sent if the exit program returns anything other than Successful, has an unhandled exception, or the job running the exit program is cancelled.

See When the Exit Program Ends for additional information about the Success indicator.

Action code
INPUT; BINARY(4)

Identifies the cluster API or event that is being processed and, therefore, the action the exit program should perform. The action codes listed below apply to all cluster resource group types unless otherwise specified. See also the Action code dependent data field in the Field Descriptions below, which further qualifies the action code.

Possible action codes:


Exit program data
INPUT; CHAR(256)

Because this parameter is passed between nodes in the cluster, it can contain anything except pointers. For example, it can be used to provide state information. The owner of the cluster resource group knows the layout of the information contained in this parameter.
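Because the data crosses node boundaries, a flat, pointer-free layout is the natural choice. The following is a minimal sketch in C of one way to do that. The `app_state` layout and the helper names are hypothetical; only the 256-byte size comes from the parameter description above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXIT_PGM_DATA_LEN 256   /* CHAR(256), from the parameter description */

/* Hypothetical layout chosen by the CRG owner: fixed-size fields only,
   no pointers, so the bytes mean the same thing on every node. */
typedef struct {
    char     phase[10];     /* blank-padded state name                   */
    uint32_t sequence;      /* counter maintained by the application     */
    char     journal[20];   /* blank-padded object name, not an address  */
} app_state;

/* Copy the state into the 256-byte buffer and zero the unused tail. */
static void pack_state(const app_state *s, char out[EXIT_PGM_DATA_LEN])
{
    assert(sizeof *s <= EXIT_PGM_DATA_LEN);
    memset(out, 0, EXIT_PGM_DATA_LEN);
    memcpy(out, s, sizeof *s);
}

/* Recover the state from the buffer passed to the exit program. */
static void unpack_state(const char in[EXIT_PGM_DATA_LEN], app_state *s)
{
    memcpy(s, in, sizeof *s);
}
```

Storing object names rather than addresses keeps the data meaningful on whichever node receives it.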

This data comes into existence when the cluster resource group object is created with the Create Cluster Resource Group (QcstCreateClusterResourceGroup) API or the Create Cluster Resource Group (CRTCRG) command. Change this data in the following ways:

See the description of each API in Cluster Resource Group APIs.

Information given to user
INPUT; CHAR(*)

Detailed information for this exit program call. See the EXTP0100 Format, EXTP0101 Format, EXTP0200 Format, or EXTP0201 Format for more information.

Format name
INPUT; CHAR(8)

The format of the information provided in the Information Given To User parameter. If the exit program is called with a second action code such as Undo, the format contains the same data as was passed with the original action code. The format name supported is:


EXTP0100 Format

This format contains information for the cluster event.


EXTP0101 Format

This format contains information for the cluster event.

EXTP0200 Format

This format contains information for the cluster event with additional information about site name and data port IP addresses on each node in the recovery domain.


EXTP0201 Format

This format contains information for the cluster event with additional information about site name and data port IP addresses on each node in the recovery domain.



Field Descriptions

Action code dependent data. For some action codes, additional information is provided to describe the action code. This field is used during:

The possible values are:

Allow active takeover IP address. Allows a takeover IP address to already be active when it is assigned to an application cluster resource group. This field is only valid when configure takeover IP address field is 0x01. Possible values are:

Application id. This is an application identifier for the Peer CRG type. It identifies the application supplying the peer cluster resource group. The recommended format is 'vendor-id.name', where vendor-id is an identifier for the vendor creating the cluster resource group. For example, QIBM.ExamplePeer indicates that the cluster resource group is supplied by IBM for the ExamplePeer application. Using QIBM as the vendor id is not recommended unless the cluster resource group is supplied by IBM. This field only applies to peer cluster resource groups.

Changing node ID. The node in the recovery domain being assigned a new role or status. This field is hexadecimal zeroes if it doesn't apply.

A special value of *LIST is specified for this parameter when more than one node is changed. The special value is left-justified. When *LIST is specified, entries in the recovery domain array and the prior recovery domain array can be compared to determine which nodes have had changes to the node role or membership status.
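The comparison described for *LIST can be sketched as follows. This is an illustration only: `rd_entry` is a hypothetical, simplified stand-in for a recovery domain array entry (the real layout is given by the EXTP formats), and the function name is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified recovery domain entry. */
typedef struct {
    char node_id[8];   /* blank-padded node ID */
    int  role;
    int  status;
} rd_entry;

/* Store in changed[] the index (within cur[]) of every node whose role or
   membership status differs from its entry in prior[]. A node missing from
   prior[] counts as changed. Returns the number of indexes stored. */
static size_t find_changed_nodes(const rd_entry *prior, size_t n_prior,
                                 const rd_entry *cur, size_t n_cur,
                                 size_t *changed)
{
    size_t count = 0;
    assert(cur != NULL && changed != NULL);
    for (size_t i = 0; i < n_cur; i++) {
        const rd_entry *old = NULL;
        for (size_t j = 0; j < n_prior; j++) {
            if (memcmp(prior[j].node_id, cur[i].node_id, 8) == 0) {
                old = &prior[j];
                break;
            }
        }
        if (old == NULL || old->role != cur[i].role
                        || old->status != cur[i].status)
            changed[count++] = i;
    }
    return count;
}
```

Entries are matched by node ID rather than by position, because a role change can reorder the array.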

This field is used during:

Changing node role. The role the node is being assigned. This field is used by the same situations that the Changing node ID field is used. The values are:

Cluster name. The name of the cluster containing the cluster resource group.

Cluster resource group attributes. A bit mask that identifies various cluster resource group attributes. The 64 bits in this field are numbered 0 through 63 starting with the rightmost bit. If a bit is set to '1', the cluster resource group has that attribute. The meaning of each bit is:

This field applies only to application cluster resource groups.
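Testing a bit in a mask numbered from the rightmost bit can be sketched as below. The helper names are hypothetical, and the sketch assumes the 8-byte field is stored with its most significant byte first; check the format tables and the QSYSINC header before relying on that.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble the 8-byte field into a 64-bit value, treating the first byte
   as the most significant (an assumption about how the field is stored). */
static uint64_t mask_from_bytes(const unsigned char field[8])
{
    uint64_t mask = 0;
    for (int i = 0; i < 8; i++)
        mask = (mask << 8) | field[i];
    return mask;
}

/* Bits are numbered 0 through 63 starting with the rightmost bit. */
static int attribute_is_set(uint64_t mask, unsigned bit)
{
    assert(bit < 64);
    return (int)((mask >> bit) & 1u);
}
```

The same test applies to the cluster resource group changes bit mask, which uses the same numbering.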

Cluster resource group changes. A bit mask that identifies the fields in the cluster resource group that are being changed by the Change Cluster Resource Group API. Set to hexadecimal zeroes for all other exit program calls. The 64 bits in this field are numbered 0 through 63 starting with the rightmost bit. If a bit is set to '1', the change represented by the bit is occurring. Even though multiple bits may be set to indicate that several things are being changed, the exit program is called only when the recovery domain is changed. For more information, see the Change Cluster Resource Group (QcstChangeClusterResourceGroup) API or Change Cluster Resource Group (CHGCRG) command. This field is used by the Change and Undo action codes. The meaning of each bit is:

Cluster resource group name. The cluster resource group that is being processed by Cluster Resource Services.

Cluster resource group status. Status of the cluster resource group at the time the exit program is called. Possible values include:

Additional information for cluster resource group status can be found in Cluster Resource Group APIs.

Cluster resource group type. The type of cluster resource group:

Cluster version. The exit program is being called to process the action code at this cluster version. This value determines the cluster's ability to use new functions supported by the cluster. It is set when the cluster is created and can be changed by the Adjust Cluster Version (QcstAdjustClusterVersion) API or Change Cluster Version (CHGCLUVER) command. Note: When the Adjust Cluster Version API is executed, there is a small window of time where the cluster and cluster resource group job may be operating at different cluster versions.

Cluster version modification level. The exit program is being called to process the action code at this modification level. The modification level further identifies the version at which the nodes in the cluster can communicate. It is updated when code changes that impact the version are applied to the system. Note: When the Adjust Cluster Version API is executed, there is a small window of time where the cluster and cluster resource group job may be operating at different cluster version modification levels.

Configuration object array. This array identifies the resilient devices that can be switched from one node to another. This array is present only for a device cluster resource group.

Configuration object name. The name of the auxiliary storage pool device description object which can be switched between the nodes in the recovery domain. For cluster version 8 and prior an auxiliary storage pool device description can be specified in only one cluster resource group in a device domain. For cluster version 9 and later an auxiliary storage pool device description can be specified in more than one cluster resource group in a device domain if the cluster resource group recovery domains do not overlap.
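The version-dependent rule above can be expressed as a small check. This is a sketch under stated assumptions: both cluster resource groups are taken to be in the same device domain, recovery domains are represented as arrays of null-terminated node names, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if any node name appears in both recovery domains. */
static int domains_overlap(const char *const *a, size_t na,
                           const char *const *b, size_t nb)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++)
            if (strcmp(a[i], b[j]) == 0)
                return 1;
    return 0;
}

/* Applies the rule to two CRGs in the same device domain: returns 1 if the
   same auxiliary storage pool device description may appear in both. */
static int asp_device_allowed_in_both(int cluster_version,
                                      const char *const *a, size_t na,
                                      const char *const *b, size_t nb)
{
    if (cluster_version <= 8)
        return 0;                 /* only one CRG per device description */
    return !domains_overlap(a, na, b, nb);
}
```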

Configuration object online. Vary the configuration object on or leave the configuration object varied off when a device is switched from one node to another or when it is failed over to a backup node. Possible values are:

Configuration object online status. The status of the vary on for the configuration object on the new primary node. Possible values are:

Configuration object type. This specifies the type of configuration object specified with configuration object name. Possible values are:

Current node ID. Identifies the node running the exit program.

Data port IP address - IPv4. The IP address associated with the recovery domain node. This is a dotted decimal format field and is a null-terminated string. If the actual address is not an IPv4 address, the special value *IPV6 padded on the right with hex zeros is returned.
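A caller reading this field needs to check for the special value before treating the contents as an address. The sketch below shows one way to do so. The type and function names are hypothetical, and it assumes the field and the string literal use the same character encoding, as they would when the program is compiled for the job's CCSID.

```c
#include <assert.h>
#include <string.h>

typedef enum { ADDR_IPV4, ADDR_NOT_IPV4 } addr_kind;

/* The field is a null-terminated string: either dotted decimal text or
   the special value *IPV6 (the zero padding acts as the terminator). */
static addr_kind classify_ipv4_field(const char *field)
{
    assert(field != NULL);
    return strcmp(field, "*IPV6") == 0 ? ADDR_NOT_IPV4 : ADDR_IPV4;
}
```

When the result is `ADDR_NOT_IPV4`, the address has to be read from the corresponding IPv4 or IPv6 field instead.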

Data port IP address - IPv4 or IPv6. The IP address associated with the recovery domain node. Either an IPv4 or IPv6 address is supported. When data port IP address type field is 0, the address returned is an IPv4 address in dotted decimal format and padded on the right with hex zeros. When data port IP address type field is 1, the address returned is an IPv6 address and padded on the right with hex zeros. The coded character set identifier (CCSID) of the IP address returned will match the CCSID of the exit program job.

Data port IP address array. Array of data port IP addresses in use by the node in the recovery domain entry.

Data port IP address type. Type of IP address that follows in the Data port IP address - IPv4 or IPv6 field. The possible values are:

Data port site name. The site name that the data port address is to be used with. Only allowed if current cluster version is 8 or higher.

Device subtype. A device's subtype. This information is only as current as the last time the cluster resource group object could be updated. If configuration changes have been made on the node which owns the hardware and those changes have not yet been distributed to all nodes in the cluster, this information may be inaccurate. The data cannot be distributed if the configuration was changed on a node which does not have cluster resource services running. Possible values are:

Device type. This specifies the type of device. Possible values are:

Distribute information user queue library name. The name of the library that contains the user queue to receive the distributed information. This field will be set to hexadecimal zeros if no distribute information user queue name was specified when the cluster resource group was created.

Distribute information user queue name. The name of the user queue to receive distributed information from the Distribute Information API. This field will be set to hexadecimal zeros if no distribute information user queue name was specified when the cluster resource group was created.

Failover default action. If a response to the message on the failover message queue is not received within the failover wait time, this field tells clustering what action to take for the failover request. This field applies to all primary-backup model cluster resource groups.

Failover message queue library name. The name of the library that contains the message queue to receive failover messages. This field will be set to hexadecimal zeros if no failover message queue name was specified. This field applies to all primary-backup model cluster resource groups.

Failover message queue name. The name of the message queue to receive messages dealing with failover. This field will be set to hexadecimal zeros if no failover message queue name was specified. This field applies to all primary-backup model cluster resource groups.

Failover wait time. Number of minutes to wait for a reply to the failover message that was enqueued on the failover message queue. This field applies to all primary-backup model cluster resource groups.

Job name. Name of the job associated with a cluster resource group exit program.

Leader node id. This field identifies the name of a recovery domain node that is actively participating in the current protocol for the given cluster resource group. A value of hexadecimal zero means the exit program cannot use this field. This field only applies to a peer cluster resource group.

The leader node id is available for these action codes:

Length of configuration object array entry. This specifies the length of an entry in the configuration object array. This field applies only to device cluster resource groups.

Length of entry in the recovery domain. The length of an entry in the recovery domain array. This field is used if each entry may have a different length.

Length of prior recovery domain array entry. The length of an entry in the prior recovery domain array. For EXTP0100 and EXTP0101 formats this length should be used to navigate to the next prior recovery domain array entry.

Length of recovery domain array entry. The length of an entry in the recovery domain array. For EXTP0100 and EXTP0101 formats this length should be used to navigate to the next recovery domain array entry.

Length of data port IP address array entry. The length of an entry in the data port IP address array. For format EXTP0201 this length should be used to navigate to the next data port IP address array entry.

Length of information given to user. The length of the data passed in the format.

Membership status. The cluster resource group membership status for the current role of a node:

Node ID. A unique string of characters that identifies a node in the recovery domain.

Node role. The role a node is to be assigned at the successful completion of the action code being processed. For primary-backup model cluster resource groups, a node can have one of three roles: primary, backup, or replicate. For peer model cluster resource groups, a node can have one of two roles: peer or replicate. Any number of nodes can be designated as the peer or replicate.

Node role type. Indicates which of the two node roles is being processed:

Number of entries in configuration object array. The number of resilient device entries in the Configuration Object Entry array. This field has a value of 0 for a data or application cluster resource group. This field applies only to device cluster resource groups.

Number of data port IP addresses. The number of data port IP addresses associated with the recovery domain node. This field has a value of 0 for a data or application cluster resource group. This field applies only to device cluster resource groups.

Number of nodes in the prior recovery domain. The number of nodes in the prior recovery domain. This is the number of elements in the Prior Recovery Domain Array. This will be 0 if the Prior Recovery Domain Array is not included. This field is used during:

Number of nodes in the recovery domain array. The number of nodes in the recovery domain. This is the number of elements in the recovery domain array.

Offset to configuration object array. The byte offset from the beginning of the format to the list of resilient devices. This field has a value of 0 for a non-device cluster resource group. This field applies only to device cluster resource groups.

Offset to data port IP address array. The byte offset from the beginning of the format to the list of data port IP addresses for a recovery domain node. This field has a value of 0 for a non-device cluster resource group. This field applies only to device cluster resource groups.

Offset to prior recovery domain array. The byte offset from the beginning of the format to the array of nodes in the prior recovery domain. This will be 0 if the prior recovery domain array is not included. This field is used during:

Offset to recovery domain array. The byte offset from the beginning of the format to the array of nodes in the recovery domain.
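The offset, count, and entry length fields work together when walking an array in the format. The following sketch shows the arithmetic. `rd_locator` is a hypothetical container for the three values; the real field positions come from the format tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for values read from the format. */
typedef struct {
    int32_t offset_to_rd_array;   /* byte offset from start of the format */
    int32_t number_of_nodes;      /* number of entries in the array       */
    int32_t length_of_rd_entry;   /* length of one entry, in bytes        */
} rd_locator;

/* Return a pointer to entry `index`, or NULL if the index is out of range. */
static const unsigned char *rd_entry_at(const unsigned char *format_start,
                                        const rd_locator *loc, int32_t index)
{
    assert(format_start != NULL && loc != NULL);
    if (index < 0 || index >= loc->number_of_nodes)
        return NULL;
    return format_start + loc->offset_to_rd_array
         + (size_t)index * (size_t)loc->length_of_rd_entry;
}
```

Stepping by the supplied entry length, rather than by the size of a locally declared structure, keeps the code correct if entries are longer than the structure it was compiled with. The same pattern applies to the prior recovery domain, configuration object, and data port IP address arrays.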

Original cluster resource group status. The original status of the cluster resource group before it was changed to some pending status while an API is running. For example when the exit program is called for the Start Cluster Resource Group (QcstStartClusterResourceGroup) API, the Cluster resource group status field will contain 550 (Start CRG Pending) while this field will contain 20 (Inactive) or 30 (Indoubt). Possible values include:

Additional information for cluster resource group status can be found in Cluster Resource Group APIs.
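As an illustration of using the two status fields together, the sketch below detects a start request for a cluster resource group that was previously Indoubt. The three numeric values are the ones quoted in the field description above; the constant and function names are hypothetical.

```c
#include <assert.h>

/* Status values quoted in the field description above. */
enum {
    CRG_STATUS_INACTIVE          = 20,
    CRG_STATUS_INDOUBT           = 30,
    CRG_STATUS_START_CRG_PENDING = 550
};

/* Returns 1 when a start is in progress for a CRG that was Indoubt, a case
   where an exit program may want to verify its resources before starting. */
static int start_from_indoubt(int current_status, int original_status)
{
    assert(current_status >= 0 && original_status >= 0);
    return current_status == CRG_STATUS_START_CRG_PENDING
        && original_status == CRG_STATUS_INDOUBT;
}
```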

Preferred node role. The preferred role a node is assigned. See Node role for a more detailed description of the node role.

Prior action code. When a cluster resource group exit program is called with an action code of Undo (15), the action code for the unsuccessful operation is placed in this field. Otherwise, this will be hex zeroes.
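An Undo handler typically branches on the prior action code to decide what to back out. The sketch below illustrates the idea. The values 15 (Undo) and 9 (Failover) appear in this document; the `undo_step` type and the function name are hypothetical.

```c
#include <assert.h>

enum {
    ACTION_FAILOVER = 9,    /* value given in this document */
    ACTION_UNDO     = 15    /* value given in this document */
};

/* Hypothetical categories of back-out work. */
typedef enum { STEP_NONE, STEP_REVERT_FAILOVER, STEP_REVERT_OTHER } undo_step;

/* Choose the back-out work for this call. The prior action code is
   meaningful only when the action code is Undo. */
static undo_step choose_undo_step(int action_code, int prior_action_code)
{
    if (action_code != ACTION_UNDO)
        return STEP_NONE;
    assert(prior_action_code != 0);   /* hex zeroes is not expected on Undo */
    switch (prior_action_code) {
    case ACTION_FAILOVER:
        return STEP_REVERT_FAILOVER;
    default:
        return STEP_REVERT_OTHER;
    }
}
```

A real exit program would add one case per action code it supports, using the values from the action code list.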

Prior recovery domain array. The prior recovery domain array contains the view of the recovery domain before changes were made as a result of the API being used or a cluster event occurring.

For example if a switchover is done, the prior recovery domain array will have the view with the old primary and backup order. The recovery domain array will have the view with the new primary and backup order.

If an event such as a node failure occurs, the prior recovery domain array will have the old membership status for the failing node such as Active while the recovery domain array will have the new status such as Inactive.

In most cases, the prior recovery domain is a view of the current recovery domain. If the Change Cluster Resource Group (QcstChangeClusterResourceGroup) API is being used to change the preferred recovery domain, the prior recovery domain will have a view of the preferred recovery domain.

The prior recovery domain array is available for these action codes:

Recovery domain array. The nodes that are the recovery domain for the cluster resource group. This view of the recovery domain will contain any changes made to the node's membership status or the node's role by the API or cluster event which caused the exit program to be called.

Request handle. Uniquely identifies the API request. It is used to associate responses on the user queue specified in the Results Information parameter. This field will have a null value when the exit program is called with an action code of Failover (9).

Requesting user profile. This is the user profile that initiated the API request.

Reserved. This field is reserved and is set to hexadecimal zeroes.

Server takeover IP address - IPv4. This is a takeover IP address for servers associated with the relational database. This is a dotted decimal field and is a null-terminated string. If the actual address is not an IPv4 address, the special value *IPV6 padded on the right with hex zeros is returned. This field only applies to device cluster resource groups.

Server takeover IP address - IPv4 or IPv6. This is a takeover IP address for servers associated with the relational database name in the device description for an auxiliary storage pool. It is set to hexadecimal zeroes for other cluster resource group types. Either an IPv4 or IPv6 address is supported. When server takeover IP address type field is 0, the address returned is an IPv4 address in dotted decimal format and padded on the right with hex zeros. When server takeover IP address type field is 1, the address returned is an IPv6 address and padded on the right with hex zeros. The coded character set identifier (CCSID) of the IP address returned will match the CCSID of the exit program job. If not specified, or for a secondary and UDFS auxiliary storage pool, this field will contain *NONE left justified and padded with blanks.

Server takeover IP address type. Type of IP address that follows in the Server takeover IP address - IPv4 or IPv6 field. The possible values are:

Site name. The name of the site associated with the recovery domain node. This field only applies to device cluster resource groups.

Takeover IP address - IPv4. This is the floating IP address that is associated with an application. This is a dotted decimal field and is a null-terminated string. If the actual address is not an IPv4 address, the special value *IPV6 padded on the right with hex zeros is returned. This field is used only by application cluster resource groups.

Takeover IP address - IPv4 or IPv6. The floating IP address that is to be associated with the application. This field is only meaningful for an application cluster resource group. It is set to hexadecimal zeroes for other cluster resource group types. Either an IPv4 or IPv6 address is supported. When takeover IP address type field is 0, the address returned is an IPv4 address in dotted decimal format and padded on the right with hex zeros. When takeover IP address type field is 1, the address returned is an IPv6 address and padded on the right with hex zeros. The coded character set identifier (CCSID) of the IP address returned will match the CCSID of the exit program job.

Takeover IP address type. Type of IP address that follows in the Takeover IP address - IPv4 or IPv6 field. The possible values are:


Application Takeover IP Address Management

The takeover IP address is the IP address used to control how clients access the application as the point of access for the application moves from one node to another during switchover or failover. The takeover IP address is started on only one node at a time. That node is the primary node in the cluster resource group's recovery domain. The takeover IP address can be configured by Cluster Resource Services or it can be configured by the user. This attribute is specified on the Create Cluster Resource Group API and is passed to the exit program in the cluster resource group attributes field.

The following table shows which cluster APIs and events configure and manage the takeover IP address. This occurs only for application cluster resource groups. Additional information about the takeover IP address can be found in Cluster Resource Group APIs.

Table 1. Takeover IP Address Management



When the Exit Program Ends

When an exit program is called with an action code, control can return to its caller because it set the success indicator and returned, had an unhandled exception, or the exit program job was cancelled.

Setting the Success Indicator and Returning

The returned value of the success indicator is used by the operating system in different ways depending upon the action code. For most action codes, anything other than Successful results in the exit program being called again with an action code of Undo to back out the actions previously performed. There are two exceptions to this.

One, if an application exit program was called with an action code of Start and sets the success indicator to "Unsuccessful, attempt restart", the exit program is called with Restart. It continues to be called with Restart after each such failure as long as the restart count has not been reached. When the restart count is reached, failover occurs and the application is started on the first active backup node.

The exit program is not called with Restart if it returns the "Unsuccessful, do not attempt restart" indicator, if it sets the success indicator to Successful and returns, or if the cluster resource group is ended with the End Cluster Resource Group (QcstEndClusterResourceGroup) API.

Two, some action codes always proceed regardless of the exit program success indicator and the exit program is not called again with an action code of Undo. These are:

If the exit program returns an unsuccessful indicator from Undo, the cluster resource group's status is set to Indoubt.
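The general rule and the Undo case can be summarized as a small decision function. This is a simplified model for illustration, not the system's implementation: it deliberately leaves out the Start and Restart handling for application cluster resource groups, and all names are hypothetical.

```c
#include <assert.h>

/* What happens after the exit program returns (simplified model). */
typedef enum { NEXT_NOTHING, NEXT_CALL_UNDO, NEXT_STATUS_INDOUBT } follow_up;

/* action_is_undo:    1 if the exit program was processing Undo
   indicator_ignored: 1 if the action code always proceeds regardless of
                      the success indicator
   successful:        1 if the success indicator was Successful          */
static follow_up after_return(int action_is_undo, int indicator_ignored,
                              int successful)
{
    assert(!(action_is_undo && indicator_ignored));
    if (successful)
        return NEXT_NOTHING;
    if (action_is_undo)
        return NEXT_STATUS_INDOUBT;   /* a failed Undo leaves the CRG Indoubt */
    if (indicator_ignored)
        return NEXT_NOTHING;          /* operation proceeds; no Undo call     */
    return NEXT_CALL_UNDO;            /* general rule                         */
}
```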


An Exception Occurs

An unhandled exception is treated the same way as an unsuccessful indicator. The exit program will be called again with either Restart or Undo except for the same action codes listed above where it is not called again with Undo.

If the exit program does not handle an exception while processing Undo, the cluster resource group's status is set to Indoubt.


Job is Cancelled

If the exit program job is cancelled and the exit program was performing the function of any action code other than Undo, Start, or Restart, it is treated as an unsuccessful indicator. The exit program is called with an Undo action code except for those action codes listed above where it is not called again with Undo.

If the exit program was cancelled while performing the function of Undo, the cluster resource group's status is set to Indoubt.

If the exit program was cancelled while performing the function of Start or Restart, the cluster resource group is ended; failover does not occur. It is the responsibility of the exit program cancel handler to also end any other jobs or subsystems it may have started.

An exit program job always has an associated cluster resource group job. It is the associated cluster resource group job that submits the exit program job. If the cluster resource group job is cancelled while an exit program is running, the exit program job is also cancelled. If the cluster resource group job is cancelled, the exit program is called with the End Node action code on the node where the job was cancelled.


Restarting an Application Cluster Resource Group Exit Program

Cluster Resource Services uses a restart count to control how often an active application will be restarted on the primary node before a failover occurs. The restart count is specified on the Create Cluster Resource Group (QcstCreateClusterResourceGroup) API or the Change Cluster Resource Group (QcstChangeClusterResourceGroup) API for application cluster resource groups. If the specified value is 0, the failed application will not be restarted on the primary node but failed over to the first backup. If the specified value is greater than 0, Cluster Resource Services will call the exit program with an action code of Restart after having initially called the exit program with an action code of Start. It will continue to do this for each failure, until the restart count has been reached. The exit program will be called with an action code of Restart if it returns from handling the Start action code in one of these ways:

Once the restart count has been reached, Failover will be attempted in order to start the application on the first active backup node. The restart count is reset only when the exit program is called with a Start action code. This occurs with the Start Cluster Resource Group (QcstStartClusterResourceGroup) API or the Initiate Switchover (QcstInitiateSwitchOver) API or the failover event.


Multiple Action Codes

In most situations, cluster APIs or events result in the exit program being called with a single action code. When the exit program completes successfully, the exit program is not called again for that API or cluster event. There are several situations where successful completion results in the exit program being called twice. This occurs for active application cluster resource groups for the Initiate Switchover API and the failover cluster event. In both cases, the exit program is called on the new primary first with either the Switchover or Failover action code. During this time, the exit program should do any preparation work necessary to start the application but should not yet start the application. When the exit program returns with a successful indicator, it will be called a second time with the Start action code to start the application.

Another situation occurs when a cluster resource group is deleted using either the Delete Cluster Resource Group API or the Delete Cluster Resource Group From Cluster command. The exit program will be called first with the Verification Phase action code and then with the Delete action code. If the verification phase returns with an unsuccessful indicator, the exit program will not be called a second time and the cluster resource group will not be deleted.

Causes of the Failover Event

It is natural to think of the failover event being caused by the most obvious problem: a node fails. The node failure could be due to a hardware problem such as the loss of a processor or an environmental problem such as the loss of electrical power.

A wide variety of other conditions can also cause a failover event when they occur on a node that is in a cluster resource group's recovery domain. For details about causes of failover events and recovery actions from these events, see Managing failover outage events in the Implementing high availability topic collection.

The failover event always calls the exit program so that the exit program is aware a member left the cluster. The exit program is called regardless of the state of the cluster resource group: active, inactive, or indoubt. Also, the exit program is called regardless of which member left the cluster: primary, backup, replicate or peer. The exit program must look at both the state of the cluster resource group and the role of the node that left in order to perform the correct action.
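The need to examine both values can be illustrated with a short C sketch. The policy shown is purely hypothetical: the document does not prescribe which action to take for each combination, and the names crg_status, node_role, fo_plan, and plan_failover are assumptions made for this example.

```c
/* Hypothetical representations of the group state and node role. */
enum crg_status { CRG_ACTIVE, CRG_INACTIVE, CRG_INDOUBT };
enum node_role  { ROLE_PRIMARY, ROLE_BACKUP, ROLE_REPLICATE, ROLE_PEER };

/* Hypothetical outcomes an exit program might choose between. */
enum fo_plan {
    PLAN_PREPARE_TAKEOVER,    /* get ready to become the point of access */
    PLAN_RECORD_MEMBER_LOSS,  /* note that a member left; no takeover */
    PLAN_FLAG_FOR_REVIEW      /* state is uncertain; leave for an operator */
};

/* Example policy only: combine the state of the cluster resource group
   with the role of the node that left the cluster. */
enum fo_plan plan_failover(enum crg_status status, enum node_role left)
{
    if (status == CRG_INDOUBT)
        return PLAN_FLAG_FOR_REVIEW;
    if (status == CRG_ACTIVE && left == ROLE_PRIMARY)
        return PLAN_PREPARE_TAKEOVER;
    return PLAN_RECORD_MEMBER_LOSS;
}
```

Neither input alone is sufficient in this sketch: the loss of a primary leads to a takeover only when the group is active, and an active group needs no takeover when a backup, replicate, or peer leaves.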

Cluster resource groups should fail over in a particular order when a node failure occurs. That order is device cluster resource groups first, then application cluster resource groups, and then data cluster resource groups. Peer cluster resource groups fail over in parallel with the other cluster resource group types.


Partition Processing

A cluster enters a partition state when a failure occurs that cannot conclusively be identified as a node failure. Cluster Resource Services detects that communication with another node or nodes has been lost but cannot determine why. A classic example is the failure of a communication line between the systems.

The exit program is called when a cluster partitions. The membership status for each partitioned node in the recovery domain will be set to Partition. However, this is different for each cluster partition. For example, suppose we have a 2 node cluster with nodes A and B, both nodes are in a cluster resource group's recovery domain, and the cluster partitions. When the exit program on A is called, the recovery domain will indicate that A is active and B is partitioned. When the exit program on B is called, the recovery domain will indicate that B is active and A is partitioned.
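The per-partition view in the two-node example can be sketched in C. This is a simplified model, assuming every node was active before the partition occurred; member_status, view_from, and the partition_of array are hypothetical names and not part of the exit program interface.

```c
#include <stddef.h>

/* Hypothetical membership status values for this sketch. */
enum member_status { MEMBER_ACTIVE, MEMBER_PARTITION };

/* Compute the recovery domain as seen by the node at index 'viewer'.
   partition_of[i] identifies the partition that node i ended up in.
   Nodes in the viewer's partition appear active; all others appear
   partitioned. */
void view_from(size_t viewer, const int *partition_of, size_t n,
               enum member_status *out)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = (partition_of[i] == partition_of[viewer])
                     ? MEMBER_ACTIVE
                     : MEMBER_PARTITION;
    }
}
```

For nodes A (index 0) and B (index 1) in different partitions, the view from A is active, partitioned and the view from B is partitioned, active.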

For primary-backup model cluster resource groups:

For peer model cluster resource groups:


Handling the Undo Action Code

When Cluster Resource Services processes an API or cluster event and an exit program is called, a failure either in the exit program or in Cluster Resource Services after the exit program ends results in an attempt to recover the prior state of the cluster resource group and its resilient resources.

Actions performed by Cluster Resource Services which changed the cluster resource group are backed out. The exit program is called with an action code of Undo so that actions it took can also be backed out.

If the exit program had nothing to do for an action code, its work to handle the Undo is trivial. Merely set the success indicator to Successful and return.

If the exit program has a failure and can back out its actions as part of handling the original action code, it may also have little or nothing to do when called with the Undo action code. Doing this back out as part of the original action code processing may be driven from the procedure which detected the problem, or from an exception handler, or from a cancel handler.

When the exit program handles the original action code successfully but Cluster Resource Services subsequently detects an error that requires the API or cluster event to be backed out, the Undo processing by the exit program becomes more involved. While the exit program is passed the action code it worked on before being called with Undo, there may be other information the exit program will have to obtain in order to successfully perform the back out. Any required back out information will have to be kept where a new job can access it.

The format data passed to the exit program for Undo is exactly the same as was passed for the original action code except for the prior action code field.

A cluster resource group's status is returned to its original value if both the exit program and Cluster Resource Services handle the Undo action code successfully. If Cluster Resource Services is unable to back out changes or the exit program sets the success indicator to anything other than Successful, the status of the cluster resource group is set to Indoubt. When this occurs, someone such as an operator or programmer may have to be involved to determine what errors caused the problem.
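One possible shape for Undo handling is sketched below in C. It is an illustration under assumptions, not the documented interface: the backout_record structure stands in for back out information kept where a new job can reach it (for example, in a file), and the names u_action, u_result, and handle_undo are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical action codes and success indicator values. */
enum u_action { U_START, U_END, U_CHANGE, U_UNDO };
enum u_result { U_SUCCESSFUL, U_UNSUCCESSFUL };

/* Stand-in for back out information saved by the original call in a
   place that outlives the job. */
struct backout_record {
    bool valid;            /* a record was saved and not yet backed out */
    enum u_action action;  /* action code that saved the record */
    int old_value;         /* hypothetical setting to restore */
};

/* Handle Undo given the prior action code passed to the exit program. */
enum u_result handle_undo(enum u_action prior, struct backout_record *rec,
                          int *current_value)
{
    if (!rec->valid)
        return U_SUCCESSFUL;    /* nothing was done, or it was already
                                   backed out by the original call */
    if (rec->action != prior)
        return U_UNSUCCESSFUL;  /* record does not match; in the real
                                   system the group would become Indoubt */
    *current_value = rec->old_value;  /* restore the original setting */
    rec->valid = false;
    return U_SUCCESSFUL;
}
```

Clearing the record after a successful back out keeps a repeated Undo call trivial, which matches the case where the exit program has nothing left to do.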


Reasons an Exit Program is Called

The table below shows the reasons an exit program is called and maps the reason to the Action Code parameter on the cluster resource group exit program. The third and fourth columns of the table give suggestions for the types of things a data or application cluster resource group exit program might do for an action code.

The following Cluster Resource Group APIs or commands do not cause the exit program to be called:

For a device cluster resource group, neither the replication provider nor the application provider needs to supply an exit program. An exit program is optional and is required only if customer-specific activities are needed for resilient devices. Examples of why a customer may wish to provide an exit program include:


Table 2. Reasons an Exit Program Is Called



Action Code Cross Reference

Some action codes are used by more than one API or cluster event. The following table is a cross reference between an action code and the APIs or cluster events that use it. The action code dependent data value is listed in parentheses after each API and cluster event. Those with no specified dependent data value have a value of No Information (0).

Table 3. API and Cluster Event to Action Code Cross Reference



Exit program introduced: V4R4
