Configuring transaction properties for peer recovery

Peer recovery for the transaction service enables servers in a cluster to complete outstanding work for a failed cluster member. Follow the steps in this topic to configure the transaction properties that are required for peer recovery of failed application servers in a cluster.

Before you begin

To enable transaction peer recovery between servers, the participating server members must share a common configuration of resource providers. This means that peer recovery processing can take place only between members of the same server cluster. Although a cluster can contain servers that are at different versions of WebSphere® Application Server, you can enable and configure high availability only if all servers in the cluster are at Version 6 or later.

About this task

[z/OS] Peer recovery of transactions is in addition to the support for Peer restart and recovery, which enables you to restart on a peer system in the sysplex. For more information about configuring peer restart and recovery, see Setting up peer restart and recovery.

Configuring the transaction properties that are required for peer recovery is part of the overall task for configuring a cluster to use high availability support.

Procedure

On z/OS® platforms, configure the Resource Access Control Facility (RACF®) to allow the application servers to call the ATRSRV macro.

The ATRSRV macro allows a server to commit and back out transactions for other servers. This process differs from peer restart and recovery support, where the other server is started on another system. The ATRSRV macro is provided by MVS™ Resource Recovery Services (RRS).

The user ID that the application server controller region runs under must have ALTER access to the MVSADMIN.RRS.COMMANDS.gname.sysname resource in the FACILITY class, where gname is the RRS logging group (usually the SYSPLEX name), and sysname is the system name. To allow access to all logging groups and systems, use wildcards in the resource name, for example MVSADMIN.RRS.COMMANDS.*.
Note: Because the controller region runs as an authorized address space, it implicitly has ALTER access to this resource class, unless the RACF configuration explicitly restricts access. By explicitly allowing access to this resource, you are not relying on the authorized state of the controller region.
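
As an illustration of the resource naming described above, the following minimal Python sketch builds the FACILITY class resource name for one logging group and system, and formats RACF commands that an administrator might review before granting ALTER access. The function names, the example user ID, and the group and system names are hypothetical. The sketch also assumes that the FACILITY class is RACLISTed (hence the refresh command) and that no profile for the resource exists yet; confirm the exact commands against your RACF documentation before using them.

```python
from typing import List


def rrs_facility_resource(gname: str, sysname: str) -> str:
    """Build the resource name for one RRS logging group and one system."""
    return "MVSADMIN.RRS.COMMANDS.{0}.{1}".format(gname, sysname)


def racf_commands(resource: str, user_id: str) -> List[str]:
    """Format (but do not run) RACF commands that grant ALTER access.

    Assumption: the profile is not defined yet and the FACILITY class
    is RACLISTed, so a refresh is needed after the change.
    """
    return [
        "RDEFINE FACILITY {0} UACC(NONE)".format(resource),
        "PERMIT {0} CLASS(FACILITY) ID({1}) ACCESS(ALTER)".format(resource, user_id),
        "SETROPTS RACLIST(FACILITY) REFRESH",
    ]


if __name__ == "__main__":
    # PLEX1, SYS1, and WASCR1 are placeholder values for illustration only.
    resource = rrs_facility_resource("PLEX1", "SYS1")
    for command in racf_commands(resource, "WASCR1"):
        print(command)
```

The sketch only produces text; a security administrator still issues the commands on the z/OS system.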

For more information about the ATRSRV macro and setting the appropriate RACF permissions, see Chapter 8 of MVS Programming: Resource Recovery, SA22-7616-02.
Configure the transaction log directory setting for each server in the cluster.
You can configure the location of the transaction log directory by using either the administrative console or commands. The configuration is stored in the serverindex.xml node-level configuration file.
Each server in the cluster must be able to access the log directories of other servers in the same cluster. For this reason, do not leave this setting unset. If you do not set a directory, the application server assumes a default location within the appropriate profile directory, which might not be accessible to other servers in the cluster.

Each server in the cluster must also have a unique transaction log directory, to avoid attempts by multiple servers to access the same log file. For example, you could use the name of each server as part of the log directory name for that server.
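
One way to satisfy both requirements (shared access and uniqueness) is to derive each log directory from a shared root plus the server name. The following Python sketch shows that naming scheme only; it does not change any WebSphere configuration. The shared root path, the server names, and the "tranlog" subdirectory name are assumptions chosen for illustration.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable


def transaction_log_dirs(shared_root: str, server_names: Iterable[str]) -> Dict[str, str]:
    """Map each server name to its own directory under a shared root.

    Raises ValueError if a server name repeats, because two servers
    must never point at the same transaction log directory.
    """
    dirs: Dict[str, str] = {}
    for name in server_names:
        if name in dirs:
            raise ValueError("duplicate server name: " + name)
        # "tranlog" is a hypothetical subdirectory name.
        dirs[name] = str(PurePosixPath(shared_root) / name / "tranlog")
    return dirs


if __name__ == "__main__":
    # /shared/waslogs is a placeholder for storage that all members can reach.
    for server, path in transaction_log_dirs("/shared/waslogs", ["server1", "server2"]).items():
        print(server, path)
```

You would then enter each resulting path as the transaction log directory for the corresponding server.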

The storage mechanism that hosts the recovery log files, and the means of access to that mechanism (for example, a local area network (LAN)), must support the file-based force operation that the recovery log service uses to force data to disk. For example, IBM® network attached storage (NAS) and shared SCSI drives are suitable, but a simple network share is not.

On IBM i, for example, you can store the logs on another IBM i server by using the NetClient file system (QNTC), which provides access to data on a remote system through the Server Message Block (SMB) protocol.
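
As a rough first check of a candidate log directory, the following Python sketch writes a small probe file and asks the operating system to force it to disk. It uses os.fsync as an analog of the force operation; it is not the recovery log service's own code, and the function name is hypothetical. A successful result shows only that the call is accepted, not that the storage preserves forced data across a failure, so treat it as a smoke test rather than proof of suitability.

```python
import os
import tempfile


def can_force_to_disk(directory: str) -> bool:
    """Return True if a probe file in `directory` can be written and fsynced."""
    try:
        fd, path = tempfile.mkstemp(dir=directory, prefix="force-probe-")
    except OSError:
        # The directory is missing or not writable from this server.
        return False
    try:
        os.write(fd, b"probe")
        os.fsync(fd)  # ask the OS to flush the file's data to the device
        return True
    except OSError:
        return False
    finally:
        # Always clean up the probe file, whether or not the force worked.
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # The current directory stands in for a transaction log directory.
    print(can_force_to_disk("."))
```

Run the check from every cluster member, because each member must be able to reach the log directories of the others.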

In addition, configure the mechanism by which the remote log files are accessed so that it exploits any fault tolerance in the underlying file system. For example, if you use the Network File System (NFS) and hard-mount the remote directory that contains the log files (by using the -o hard option of the NFS mount command), the NFS client retries a failed operation until the NFS server becomes available again.
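
To confirm that a directory is hard-mounted, you can inspect the mount options that the operating system reports. The following Python sketch parses text in the Linux /proc/mounts format and checks whether the NFS mount at a given mount point lists the hard option. The function name, host name, and paths are hypothetical, and the sketch assumes Linux; other platforms report mount options differently.

```python
def is_hard_mounted(mounts_text: str, mount_point: str) -> bool:
    """Return True if `mount_point` is an NFS mount that lists the 'hard' option.

    `mounts_text` is expected in the /proc/mounts layout:
    device, mount point, file system type, options, and two numeric fields.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, path, fstype, options = fields[:4]
        if path == mount_point and fstype.startswith("nfs"):
            return "hard" in options.split(",")
    return False


if __name__ == "__main__":
    # Sample entry with placeholder names; on a real system, read /proc/mounts.
    sample = "nas1:/export/tranlogs /shared/waslogs nfs4 rw,relatime,vers=4.1,hard,proto=tcp 0 0"
    print(is_hard_mounted(sample, "/shared/waslogs"))
```

On a Linux server you could pass the contents of /proc/mounts as mounts_text and the shared log root as mount_point.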

Note: If you have migrated from a previous version of WebSphere Application Server, be aware that previous versions stored the recovery log configuration in the server.xml server-level configuration file. If you run existing scripting that configures the original recovery log settings, or migrate Version 5 application servers to a later version of WebSphere Application Server, the original transaction log directory configuration in the server.xml file is updated. The administrative console detects this condition and prompts you to save the configuration when you view the transaction service panel. This save operation saves the changed configuration to the serverindex.xml file, and resets the older fields to null. Change your existing scripting to target the serverindex.xml file at the earliest opportunity. New scripting should also target the serverindex.xml file.
Enable the high availability function for the cluster by completing the following steps on the cluster configuration panel of the WebSphere Application Server administrative console:
1. In the administrative console, click Servers > Clusters > WebSphere application server clusters > cluster_name.
2. Select the Enable failover of transaction log recovery option.
3. Click OK.
For more information about enabling the high availability function for a cluster, see Server cluster settings.
Decide which kind of transaction peer recovery to use by referring to How to choose between automated and manual transaction peer recovery.
Complete one of the following actions, depending on the configuration that you require.
- If you want to use automated peer recovery, follow the steps in Configuring automated peer recovery for the transaction service.
- If you want to use manual peer recovery, configure a policy for the transaction service, as described in Configuring manual peer recovery for the transaction service.

What to do next

You must also configure the compensation log location. As with the transaction logs, each server must have a unique compensation log directory, and the compensation logs must be accessible to the other servers in the cluster.