Protection of retained locks: failed-persistent connections

Use extreme care when deleting failed-persistent connections to the lock structure. IRLM and XES use these connections to track retained lock information. Arbitrarily deleting a failed-persistent lock structure connection can cause retained locks to be lost and can expose the data to integrity errors.

You should delete failed-persistent lock structure connections only in the following situations:

  • Disaster recovery. All Db2 and IRLM-related failed-persistent connections and structures must be deleted before restarting the data sharing group at the remote site. During the restart, Db2 uses the group restart process to rebuild the retained locks from the logs.
  • A Parallel Sysplex®-wide outage when the lock structure is forced. Failed-persistent connections can be safely forced when all members are down and the lock structure is also forced. During the restart, Db2 uses the group restart process to rebuild the retained locks from the logs.
  • After a hard failure occurs, such as a check-stop or abnormal re-IPL of a z/OS® image that contains an active member and the Db2 or IRLM member has not been restarted.

Important: This information about deleting failed-persistent connections is not relevant for sites running z/OS with APAR OA02620 applied. With this APAR, you cannot delete failed-persistent connections to the lock structure unless you also deallocate the lock structure. Deleting failed-persistent connections without also deallocating the associated structure can result in a loss of coupling facility data. This situation can then cause undetectable losses of data integrity. APAR OA02620 protects your site from data corruption problems that can occur as a result of deleting retained locks. In doing so, the APAR also prevents extended outages that would result from long data recovery operations.
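
The conditions for the Parallel Sysplex-wide outage case can be expressed as a simple pre-check. The following Python sketch is illustrative only: the Member class, the function name, and the member names are assumptions made for this example, and nothing in it queries z/OS, XES, or IRLM. It encodes only the two conditions stated above, namely that all members are down and that the lock structure is forced together with its connections.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str      # hypothetical member name, for example "DB1A"
    active: bool   # True while the Db2 or IRLM member is still running


def blockers_for_forcing_connections(members, structure_will_be_forced):
    """List the reasons why forcing failed-persistent connections is not yet safe.

    Encodes only the sysplex-wide outage rule from the text: every member
    must be down, and the lock structure must be forced as well.
    """
    blockers = [f"member {m.name} is still active" for m in members if m.active]
    if not structure_will_be_forced:
        blockers.append("lock structure would be left allocated")
    return blockers


# Hypothetical group state: DB2A is still running, so forcing is not safe.
group = [Member("DB1A", active=False), Member("DB2A", active=True)]
print(blockers_for_forcing_connections(group, structure_will_be_forced=True))
# prints ['member DB2A is still active']
```

An empty list means that neither of the two stated conditions is violated. It does not replace operator judgment or the checks that z/OS itself performs.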

Do not delete a member's failed-persistent connection just because that member was normally quiesced.

When a member is shut down while holding retained locks, those retained locks are transferred to another member to hold until the original member is restarted. Therefore, although a normally quiesced member does not hold retained locks for itself, it might hold retained locks for another member that was shut down. The following situations can cause a transfer of retained locks:

  • Member failure. Assume that three members, DB1A, DB2A, and DB3A, are running normally. If z/OS incurs a hard failure that takes down DB2A, DB2A's retained locks are transferred to one of the other members; assume that it is DB1A. If DB1A is subsequently shut down normally, you might assume that DB1A holds no locks and that you can safely delete DB1A's failed-persistent connection. In fact, because DB1A is holding the retained locks from DB2A, deleting DB1A's failed-persistent connection deletes DB2A's retained locks and exposes the Db2 data to potential data integrity errors.
  • A lock structure rebuild when one or more members are down and holding retained locks. The rebuild can be caused by either the z/OS SETXCF command or a coupling facility-related failure.
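
The member failure scenario can be traced with a toy model. The following Python sketch is an illustration under stated assumptions, not a description of how IRLM or XES store retained locks: the LockStructure class, its methods, and the lock names are hypothetical. It shows only the bookkeeping consequence described above: the connection of a normally quiesced member can still carry another member's retained locks.

```python
class LockStructure:
    """Toy model: each connection tracks retained locks, keyed by owning member."""

    def __init__(self, members):
        # connection name -> {owning member -> set of retained lock names}
        self.connections = {m: {} for m in members}
        self.active = set(members)

    def hard_failure(self, failed, locks, holder):
        """The failed member's locks become retained and are held by `holder`."""
        self.active.discard(failed)
        self.connections[holder].setdefault(failed, set()).update(locks)

    def normal_shutdown(self, member):
        """A normally quiesced member keeps whatever it holds for other members."""
        self.active.discard(member)

    def delete_connection(self, member):
        """Delete a connection and return the retained locks lost, by owner."""
        return self.connections.pop(member)

    def retained_locks(self, owner):
        """Collect the retained locks still tracked for `owner` on any connection."""
        found = set()
        for held in self.connections.values():
            found |= held.get(owner, set())
        return found


group = LockStructure(["DB1A", "DB2A", "DB3A"])
group.hard_failure("DB2A", {"PAGE-0001", "PAGE-0002"}, holder="DB1A")
group.normal_shutdown("DB1A")

print(sorted(group.retained_locks("DB2A")))  # prints ['PAGE-0001', 'PAGE-0002']
group.delete_connection("DB1A")              # looks harmless: DB1A was quiesced normally
print(sorted(group.retained_locks("DB2A")))  # prints []
```

After the deletion, nothing in the model protects the data that DB2A was updating when it failed, which is the exposure that the text warns about.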

A coupling facility structure rebuild deletes any failed-persistent connections that existed before the rebuild. The retained locks belonging to failed members are re-created and held during the rebuild by one of the active members until the failed members are restarted.
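
The effect of a rebuild on connections and retained locks can be sketched in the same spirit. The following Python function is a simplified illustration: the function name, the state labels, and the rule for choosing the holding member (the first active member in name order) are assumptions made for this example, because the text says only that one of the active members holds the locks.

```python
def rebuild(connections, retained):
    """Toy model of a lock structure rebuild.

    connections: connection name -> "active" or "failed-persistent"
    retained:    failed member -> set of retained lock names
    Returns the connections after the rebuild and the locks held for others.
    """
    survivors = sorted(n for n, state in connections.items() if state == "active")
    if not survivors:
        raise ValueError("this sketch assumes at least one active member")

    # Failed-persistent connections do not exist in the rebuilt structure.
    new_connections = {name: "active" for name in survivors}

    # Assumption: the first active member holds the re-created retained locks.
    holder = survivors[0]
    held_for = {holder: {owner: set(locks) for owner, locks in retained.items()}}
    return new_connections, held_for


before = {"DB1A": "active", "DB2A": "failed-persistent", "DB3A": "active"}
after, held_for = rebuild(before, {"DB2A": {"PAGE-0001"}})
print(after)     # prints {'DB1A': 'active', 'DB3A': 'active'}
print(held_for)  # prints {'DB1A': {'DB2A': {'PAGE-0001'}}}
```

In this model, DB1A ends up holding locks on behalf of DB2A, so a later normal shutdown of DB1A leaves a connection that must not be deleted while DB2A remains down.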