IBM Support

Should I use coupling facility (CF) duplexing for my DB2 structures?

Question & Answer


Question

Which coupling facility (CF) duplexing options are available for DB2 structures, and when should they be used?

Answer

For group buffer pools (GBPs), the general recommendation is to use duplexing. It is cheap insurance that can save tens of minutes to hours of recovery time in the rare event of a coupling facility (CF) failure. Duplexing the GBPs adds little or no overhead in CPU or transaction response time. GBP duplexing first shipped in DB2 Version 6.

DB2 uses an operating system service called "user managed coupling facility structure duplexing" to duplex GBP structures. To activate duplexing for a GBP structure, the z/OS Coupling Facility Resource Management (CFRM) policy must specify DUPLEX(ALLOWED) or DUPLEX(ENABLED) for that structure. The general recommendation is to use DUPLEX(ENABLED). The MVS SETXCF command can be used to manually activate or deactivate duplexing for a GBP structure.

DB2 provides extensive instrumentation related to duplexing through accounting and statistics reports and through the DISPLAY GROUPBUFFERPOOL command.
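As a rough illustration of the controls described above, a CFRM policy entry for a duplexed GBP might look like the following fragment of administrative data utility (IXCMIAPU) input. The data sharing group name DSNDB0A, the CF names CF01 and CF02, and the INITSIZE and SIZE values are hypothetical placeholders, not sizing guidance; only the DUPLEX(ENABLED) keyword comes from the recommendation above.

```
STRUCTURE NAME(DSNDB0A_GBP0)
          INITSIZE(100000)
          SIZE(200000)
          DUPLEX(ENABLED)
          PREFLIST(CF01,CF02)
```

Under the same assumed names, duplexing for that structure could be started or stopped manually with SETXCF, and its state checked from DB2. The leading hyphen on the DB2 command assumes the default command prefix, and KEEP=OLD is only one of the available choices when stopping duplexing.

```
SETXCF START,REBUILD,DUPLEX,STRNAME=DSNDB0A_GBP0
SETXCF STOP,REBUILD,DUPLEX,STRNAME=DSNDB0A_GBP0,KEEP=OLD
-DISPLAY GROUPBUFFERPOOL(GBP0) GDETAIL
```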

For the SCA and IRLM Lock structures, duplexing is not nearly as critical because, in the event of a CF failure, these structures can be "rebuilt" on the fly from in-memory information. The dynamic rebuild of the lost structure into an alternate CF typically takes less than 10 seconds. Applications and end users should notice no outage, except possibly a transient response time delay.

For highest availability, the general recommendation is to configure the SCA and the Lock structure in a "failure-isolated" CF. "Failure-isolated" means that the CF LPAR runs on a CPC that is isolated from all members of the data sharing group; that is, the CF does not share a CPC with any connected DB2 member. One example of a failure-isolated CF is a standalone CF. If the SCA or Lock structure is not failure-isolated from the DB2 members, then the CPC that contains both the CF and a member is a single point of failure for the DB2 group. If this CPC were to fail, the entire DB2 group would come down, and the lost SCA or Lock information would need to be reconstructed from the logs by using the DB2 Group Restart mechanism.

Because CPC failures are extremely rare, some installations are willing to take the risk of using non-failure-isolated ICFs (Internal Coupling Facilities) for the Lock and SCA structures to avoid the added cost of configuring standalone CFs. Other installations are not willing to take this risk, often citing the fact that CPC failures are more likely to be caused by human error than by hardware or system errors.
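The failure-isolation rule above amounts to a simple check: a CF is failure-isolated only if no connected DB2 member runs on the same CPC. The following is a minimal sketch of that check, not an IBM tool; the `Lpar` type, the function names, and the LPAR and CPC names in the example are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Lpar:
    """An LPAR (a CF, or the z/OS image of a DB2 member) and its host CPC."""
    name: str
    cpc: str


def is_failure_isolated(cf: Lpar, members: Sequence[Lpar]) -> bool:
    """Return True if the CF shares a CPC with no connected DB2 member."""
    return all(member.cpc != cf.cpc for member in members)


def members_sharing_cpc(cf: Lpar, members: Sequence[Lpar]) -> List[str]:
    """Names of members on the CF's CPC; any entry makes that CPC a
    single point of failure for a simplexed Lock or SCA structure."""
    return [member.name for member in members if member.cpc == cf.cpc]


if __name__ == "__main__":
    # Hypothetical two-member group with one ICF and one standalone CF.
    members = [Lpar("DB1A", "CPC1"), Lpar("DB2A", "CPC2")]
    for cf in (Lpar("ICF1", "CPC1"), Lpar("CF01", "CPC3")):
        print(cf.name, is_failure_isolated(cf, members),
              members_sharing_cpc(cf, members))
```

In this example the ICF on CPC1 is not failure-isolated because member DB1A shares its CPC, while the standalone CF on CPC3 is.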

z/OS provides a function called "system managed coupling facility structure duplexing", which can be used to duplex the Lock and SCA structures. Duplexing these structures can provide the following benefits:
  • Allows the Lock and SCA structures to use non-failure-isolated ICFs without compromising availability. ICFs are generally cheaper to configure than standalone CFs.
  • In the event of a CF failure, a duplexed SCA or Lock structure can be recovered somewhat faster than a lost simplexed structure can be dynamically rebuilt. For example, the transient response time delay that some applications or users may notice after a CF failure should typically be shorter if the Lock or SCA structures are duplexed.

There may be noticeable performance overhead in duplexing the Lock structure. The overhead is less noticeable for the SCA because the SCA is updated much less frequently than the Lock structure. Carefully evaluate the performance overhead of duplexing the Lock structure before activating it in production. Our recommendation is to continue to use DB2 (user) managed duplexing for GBPs. The operational controls for "user managed" and "system managed" duplexing are identical, so your operations staff does not need to know which one is in use. System-managed duplexing is available.


Document Information

Modified date:
26 June 2019

UID

swg21023437