Enqueue replication into an IBM Z coupling facility

Enqueue replication is a high-availability concept that is built into SAP systems. It protects SAP enqueue locks during planned or unplanned outages of the SAP enqueue server or of the host on which it is running. Besides the standard SAP enqueue replication, an additional replication mechanism is available for SAP on IBM Z. This topic describes the business continuity aspects of this additional replication mechanism and how it can be integrated into an existing business continuity setup for SAP. Starting with the SAP 7.21 kernel, the SAP enqueue server running on z/OS® can use a second replication mechanism for its enqueue locks. For details on implementation, configuration, and requirements, see SAP Note 1753638: Enqueue Replication into System z Coupling Facility, together with the attached PDF file.

The standard SAP enqueue replication operates in the following way:

  • A replication server is started on a different host than the one where the enqueue server is running.
  • The enqueue server and the replication server communicate via TCP/IP.
  • Enqueue lock information is sent from the enqueue server to the replication server.
  • If the enqueue server fails, it must be restarted on the host where the replication server was running.
  • The restarted enqueue server picks up the replication information and recovers the enqueue lock information.
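The steps above can be illustrated with a small in-memory model. This is only a sketch of the data flow, not SAP code: the class names, the host names, and the sample lock name are hypothetical, and the TCP/IP transfer is replaced by a plain method call. The model assumes that the replica is usable only on the host of the replication server, which is why a restart anywhere else comes up without locks.

```python
# Toy model of TCPIP-based enqueue replication (illustrative, not SAP code).
# All class names, host names, and lock names are hypothetical.


class ReplicationServer:
    """Keeps a copy of the lock table on its own host."""

    def __init__(self, host):
        self.host = host
        self.replica = {}

    def receive(self, lock_name, owner):
        self.replica[lock_name] = owner


class EnqueueServer:
    """Grants locks and forwards each one to the replication server."""

    def __init__(self, host, replication_server):
        if host == replication_server.host:
            raise ValueError("replication server must run on a different host")
        self.host = host
        self.replication_server = replication_server
        self.locks = {}

    def enqueue(self, lock_name, owner):
        self.locks[lock_name] = owner
        # Stands in for the TCP/IP transfer between the two servers.
        self.replication_server.receive(lock_name, owner)


def restart_enqueue_server(new_host, replication_server):
    """Return the lock table that a restarted enqueue server can recover."""
    if new_host == replication_server.host:
        return dict(replication_server.replica)
    # Assumption of this model: the replica is only usable on the
    # replication server's host, so a restart elsewhere starts empty.
    return {}


ers = ReplicationServer(host="LPAR2")
ens = EnqueueServer(host="LPAR1", replication_server=ers)
ens.enqueue("MARA/000123", owner="user_a")

# The enqueue server on LPAR1 fails; compare two restart locations.
recovered = restart_enqueue_server("LPAR2", ers)
lost = restart_enqueue_server("LPAR3", ers)
print("restart on LPAR2:", recovered)
print("restart on LPAR3:", lost)
```

In this model, only the restart on the replication server's host recovers the lock table, which mirrors the placement constraint in the list above.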

In contrast to the standard SAP enqueue replication, the second replication method that is available for SAP on IBM Z no longer requires a replication server. Instead, the enqueue server stores its replication information directly in the cross-system coupling facility (XCF) storage, which is accessible from all LPARs in a z/OS sysplex.

In this publication, this additional replication mechanism is abbreviated as EnqCF replication, whereas the standard replication method is called TCPIP-based replication.

With EnqCF replication, the recovery scenario for an enqueue server failure changes as follows:

  • Replication information is continuously written directly to the coupling facility (CF) by the SAP enqueue server.
  • If the enqueue server fails, it can be restarted on any LPAR in the sysplex.
  • The restarted enqueue server picks up the replication information from the CF and recovers the enqueue lock information.
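The EnqCF recovery scenario can be sketched in the same style. Again, this is a toy model under stated assumptions: the coupling facility is represented by a shared Python dictionary, and the class names, LPAR names, and lock names are hypothetical. It does not reflect how the SAP kernel or XCF actually store the data.

```python
# Toy model of EnqCF replication (illustrative, not SAP or z/OS code).
# All class names, LPAR names, and lock names are hypothetical.


class CouplingFacility:
    """Stand-in for CF storage that every LPAR in the sysplex can reach."""

    def __init__(self):
        self.structure = {}


class CfEnqueueServer:
    """Enqueue server that writes its lock information straight to the CF."""

    def __init__(self, lpar, cf):
        self.lpar = lpar
        self.cf = cf
        # On start, recover whatever lock information is already in the CF.
        self.locks = dict(cf.structure)

    def enqueue(self, lock_name, owner):
        self.locks[lock_name] = owner
        self.cf.structure[lock_name] = owner  # no replication server involved

    def dequeue(self, lock_name):
        self.locks.pop(lock_name, None)
        self.cf.structure.pop(lock_name, None)


cf = CouplingFacility()
first = CfEnqueueServer("LPAR1", cf)
first.enqueue("MARA/000123", owner="user_a")
first.enqueue("VBAK/000777", owner="user_b")
first.dequeue("VBAK/000777")

# The enqueue server on LPAR1 fails. A restart on another LPAR, or on the
# same LPAR, recovers the same lock table from the CF.
restarted_elsewhere = CfEnqueueServer("LPAR3", cf)
restarted_same = CfEnqueueServer("LPAR1", cf)
print(restarted_elsewhere.locks)
print(restarted_same.locks)
```

Because the lock information lives in storage that is shared across the sysplex, the restart location does not affect what can be recovered in this model.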

Table 1. Comparison of EnqCF replication and TCPIP-based replication

Criterion: replication server
  TCPIP-based replication:
    • is an essential part of a high availability setup
    • must be started on a host that is different from the one where the enqueue server is running
  EnqCF replication: not needed
    • no need for an automation policy that ensures the anti-collocation of the enqueue server and the enqueue replication server

Criterion: planned or unplanned failover of replication server
  TCPIP-based replication:
    • replication server must be restarted on a host that is different from the one where the enqueue server is running
    • while the replication server is being restarted, a failure of the enqueue server cannot be recovered
  EnqCF replication: not applicable
    • no need to automate the replication server
    • replication server no longer has any impact on the availability characteristics of the enqueue server

Criterion: planned restart of enqueue server
  TCPIP-based replication:
    • enqueue server must be restarted on the LPAR where the replication server was running to avoid loss of enqueue locks
    • replication server must be moved to a different LPAR
  EnqCF replication: more flexible
    • enqueue server can be restarted on any LPAR in the sysplex, even on the same LPAR
    • SAP Central Services restart is possible without loss of enqueue information even with only one LPAR being active

Criterion: unplanned failover of enqueue server
  TCPIP-based replication:
    • enqueue server must be restarted on the LPAR where the replication server was running to avoid loss of enqueue locks
    • replication server must be moved to a different LPAR (if available)
  EnqCF replication: more flexible
    • enqueue server can be restarted on any LPAR in the sysplex
    • no sophisticated automation policy required to ensure that the enqueue server is started on the correct LPAR
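
The restart rows of Table 1 can be condensed into a single placement rule. The following function is a hedged sketch of that rule only: the function name, the method labels "TCPIP" and "EnqCF", and the LPAR names are assumptions made for illustration and do not correspond to any SAP or automation product interface.

```python
# Placement rule derived from Table 1 (illustrative sketch only).
# The function name, method labels, and LPAR names are hypothetical.


def lpars_preserving_locks(method, active_lpars, replication_lpar=None):
    """Return the LPARs where an enqueue server restart keeps its locks."""
    if method == "EnqCF":
        # Lock information is in the CF, so any active LPAR qualifies.
        return set(active_lpars)
    if method == "TCPIP":
        # Only the LPAR that hosted the replication server qualifies.
        if replication_lpar in active_lpars:
            return {replication_lpar}
        return set()
    raise ValueError(f"unknown replication method: {method}")


sysplex = ["LPAR1", "LPAR2", "LPAR3"]
tcpip_choices = lpars_preserving_locks("TCPIP", sysplex, replication_lpar="LPAR2")
enqcf_choices = lpars_preserving_locks("EnqCF", sysplex)

# With a single active LPAR there is no separate host for a replication
# server, so only EnqCF leaves a restart location that keeps the locks.
single_tcpip = lpars_preserving_locks("TCPIP", ["LPAR1"])
single_enqcf = lpars_preserving_locks("EnqCF", ["LPAR1"])
print(tcpip_choices, enqcf_choices, single_tcpip, single_enqcf)
```

The single-LPAR case corresponds to the table entry stating that an SAP Central Services restart is possible without loss of enqueue information even with only one LPAR being active.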