Database accessibility during backward phase of crash recovery or HADR takeover

If a database fails unexpectedly before all transactions (or units of work) are committed, or before the changes associated with committed transactions are fully written to disk, then the database is left in an inconsistent state. When a database restart is performed, these transactions are rectified during a process called crash recovery. During crash recovery, the changes associated with these transactions are first replayed from the transaction logs (we often refer to this as the Forward or Redo phase of crash recovery). Transactions which were not yet committed at the time of the database failure are then rolled back (we often refer to this as the Backward or Undo phase of crash recovery). A TAKEOVER HADR operation can also perform a Backward phase to rollback changes that were not yet committed at the time of the operation.

For releases prior to v11.1.2.2, the database is only accessible after the full completion of crash recovery or HADR Takeover. This is not ideal, since a very large transaction could require a lengthy Backward phase, leaving the database entirely inaccessible until it is completed.

Beginning in v11.1.2.2, the database can be configured to allow connectivity during the Backward phase of crash recovery. This behaviour can be enabled by using the following database registry variable:
db2set DB2_ONLINERECOVERY=YES
The registry variable is read at the beginning of crash recovery, it is not possible to enable or disable while crash recovery is actively running.
The point in time when the database becomes connectable during the Backward phase will depend on the characteristics of the workload. Uncommitted transactions with certain workload operations will be undone during the Backward phase without allowing database access :
  • SQL Data Definition Language (DDL) statements.
  • Operations which modify the database catalogs.
  • Operations against column - organized tables.
  • Workloads that generate some other log - records that are not yet supported .
We refer to this as the Synchronous portion of the Backward phase. Thus, the database will remain inaccessible during the Backward phase until the point in time of the very first of these unsupported operations.
Note: It may be possible that, if the oldest uncommitted transactions contain these unsupported operations, then the majority of the Backward phase will be performed synchronously without database access available

After completion of the Synchronous portion of the Backward phase, database access is allowed. We refer to this as the Asynchronous portion of the Backward phase.

For the duration of the Asynchronous portion of the Backward phase, while database access is allowed :
  1. Any tables, indexes, or objects that are not associated with any uncommitted transactions are fully accessible during the asynchronous Backward phase.
  2. Any tables, indexes, or objects that are associated with an uncommitted transaction are protected by an Exclusive (X) lock if DB2_ONLINERECOVERY_WITH_UR_ACCESS=YES, or a Super Exclusive (Z) lock if DB2_ONLINERECOVERY_WITH_UR_ACCESS=NO. As the Backward phase compensates the uncommitted transactions, this lock will be released when there are no more remaining uncommitted transactions referencing the table/object. Thus, prior to the release of these locks, these tables/objects are only accessible to queries using UR isolation level when DB2_ONLINERECOVERY_WITH_UR_ACCESS=YES, and completely inaccessible when DB2_ONLINERECOVERY_WITH_UR_ACCESS=NO.

    If these tables also contain LONG or LOB fields, or are organized by MDC or ITC, they will always be protected by a Super Exclusive (Z) lock. These objects are never accessible during the Asynchronous Backward phase.

    Note: The undo of these transactions may inherently require a Super Exclusive (Z) lock. For example, undoing utility operations such as LOAD), and that will always supersede the Exclusive (X) lock acquired during the Asynchronous portion when DB2_ONLINERECOVERY_WITH_UR_ACCESS=YES.

The RESTART DATABASE or TAKEOVER HADR command will return at the start of the Asynchronous Backward phase, when the database becomes connectable.

While the crash recovery operation is running, a DEACTIVATE DB operation will return an SQL1495W warning/error. A db2stop operation will return an SQL1025N error. Attempting to force the crash recovery application agent using FORCE APPLICATION 0 will return an SQL0104N error. A db2stop force operation is allowed and will interrupt the crash recovery. A TAKEOVER HADR operation is also allowed during any phase of crash recovery.

If the INDEXREC configuration parameter is configured for index recreation at database restart time, then indexes will be recreated after the completion of the Asynchronous Backward phase.

Note: For pureScale instances, the backward phase of member crash recovery will continue to be performed synchronously without database accessibility on that member, because the database remains connectable on other members. Similarly for group crash recovery, the backward phase will continue to be performed synchronously without database accessibility.

Monitoring Crash Recovery Progress:

The progress of crash recovery can be monitored during any phase using either of the "db2pd -recovery" option; the LIST UTILITIES SHOW DETAIL CLP command; the SNAPUTIL administrative view or SNAP_GET_UTIL table function. During the asynchronous backward phase, the MON_GET_UTILITY table function and the CHANGE HISTORY event monitor can also be used.

$ db2pd -db mydb1 -recovery
Recovery:
Recovery Status		7400010F40000000
Current Log		S0000031.LOG
Current LSN		000000000006114D
Current LRI		000000000000000100000000000026E8000000000006114D
Current LSO		80969655
Job Type		CRASH RECOVERY
Job ID			1
Job Start Time		(1479058883) Sun Nov 13 12:41:23 2016
Job Description		Crash Recovery
Invoker Type		User
Total Phases		2
Current Phase		2
Progress:
Address		PhaseNum	Description		StartTime			CompletedWork				TotalWork
0x078... 	1		Forward			Sun Nov 13 12:41:231290045	4 bytes					12900454 bytes
0x078... 	2		Backward		Sun Nov 13 12:49:27		0 bytes					12900454 bytes

The Forward phase of crash recovery (when transactions are replayed from the transaction logs) is displayed as the first phase, and its progress can be monitored using the Total Work and Completed Work values over time.

The Backward phase of crash recovery (where uncommitted transactions are rolled back or undone) is displayed after the first phase completes. Its progress can also be monitored by observing the Total Work and Completed Work values over time.

For example, if the CompletedWork value is observed 10mins apart, and the difference in the CompletedWorkvalue is 123,456 bytes during this duration of time, then we can easily extrapolate the time to complete the remaining work:
ObservationTime_inMins =	10mins
WorkDone_inBytes =		123,456bytes    
RemainingWork_inBytes = 	TotalWork – CompletedWork;   
TimeToCompletion_inMins = 	(RemaingWork_inBytes / WorkDone_inBytes) * ObservationTime_inMins.
While the Synchronous portion of the Backward phase is in progress, it is not possible to estimate when the Backward phase will enter the Asynchronous portion. When the Synchronous portion completes, an ADM1505I message is printed in the administration notification log ("ADM1505I Crash recovery has completed synchronous processing"), and the following message will be displayed in the db2diag.log:
2016-11-12-50.03.33.674725-300		I269681A541 			LEVEL: Info
PID : 63373472 				TID : 2058 			KTID : 131596723
PROC : db2sysc 							
INSTANCE: db2inst1 			NODE : 000 			DB : MYDB1
APPHDL : 0-7 				APPID: *LOCAL.dsciaraf.161114020311
AUTHID : DB2INST1 			HOSTNAME: hotelaix2
EDUID : 2058 				EDUNAME: db2agent (X)
FUNCTION: Db2 UDB, recovery manager, sqlpresr, probe:3170
DATA #1 : <preformatted>
Crash recovery synchronous phase completed. Next LSN is 000000000006113C.

If the Backward phase of crash recovery performs an Asynchronous portion , then the MON_GET_UTILITY table function will display an entry with a UTILITY_TYPE value of ‘ONLINERECOVERY’ after the database opens up for connectivity .

For example:
$ db2 “ SELECT COORD_MEMBER, APPLICATION_HANDLE AS APPHDL, SUBSTR(APPLICATION_NAME, 1, 30) AS APPNAME, 
SUBSTR(SESSION_AUTH_ID, 1, 10) AS USER, UTILITY_TYPE, UTILITY_INVOKER_TYPE, SUBSTR(UTILITY_DETAIL, 1, 50) AS CMD FROM TABLE(MON_GET_UTILITY(-2)) AS T ”
COORD_MEMBER APPHDL    APPNAME    		USER       UTILITY_TYPE     UTILITY_INVOKER_TYPE 			CMD                                           
------------------------------------------------------------------------------------------------------------------------------------
0                   	21 db2undo_trans     	-		ONLINERECOVERY   	AUTO   				RESTART DATABASE MYDB1
1 record(s) selected.

The CHANGE HISTORY event monitor is also capable of recording the start and end of the asynchronous portion of the Backward phase of crash recovery using an event-control type called ONLINERECOVERY.

Monitoring applications during the Backward phase of crash recovery

After database access is allowed during the Backward phase of crash recovery, monitoring of applications should be done using traditional methods, with the following considerations:

All uncommitted transactions that are part of the Backward phase of crash recovery will have an Application Handle (AppHandl) of zero (0). These transactions will also have a transaction state of 'ABORT' and Tflag2 bit 0x00000001 set.

For example:
$ db2pd -db MYDB1 -transactions
Transactions:
Address AppHandl ... TranHdl ... State 	Tflag 		Tflag2 ... 	LogSpace ...
0x... 	0 ... 		3 ... 	ABORT 	0x00000000 	0x00000001 ... 	780 ...
0x... 	0 ... 		4 ... 	ABORT 	0x00000000 	0x00000001 ... 	9465875 ...
0x... 	7 ... 		5 ... 	READ 	0x00000000 	0x00000000 ... 	0 ...
0x... 	8 ... 		6 ... 	READ 	0x00000000 	0x00000000 ... 	0 ...
Any tables, indexes, or objects that were associated with an uncommitted transaction are protected by an Exclusive (X) or Super Exclusive (Z) lock for the duration of the Backward phase of crash recovery. This means that applications which attempt to access these objects during the Backward phase of crash recovery are subject to lock - wait conditions.
Lock - wait conditions can be monitored using the MON_GET_APPL_LOCKWAIT table function , or the db2pd -wlocks option. For example:
$ db2 "select LOCK_OBJECT_TYPE as TYPE, LOCK_NAME, LOCK_MODE as MODE,
LOCK_MODE_REQUESTED as MODE_REQ, LOCK_STATUS as STATUS,
HLD_APPLICATION_HANDLE as HL
D_APPHANDL, REQ_APPLICATION_HANDLE as REQ_APPHANDL
from TABLE (MON_GET_APPL_LOCKWAIT(NULL, -2))"
TYPE 	LOCK_NAME 			MODE 		MODE_REQ 	STATUS 		HLD_APPHANDL 	REQ_APPHANDL
----------------------------------------------------------------------------------------------------------
TABLE 	02000500000000000000000054 	X	 	IS		 W		 0		 7
1 record(s) selected.

In this example, Application Handle 7 is requesting an Intent Share (IS) mode lock on a Table, which is already held in Exclusive (X) mode by Application Handle 0 (an uncommitted transaction that is part of crash recovery).

Additional lock details can be observed using the MON_FORMAT_LOCK_NAME table function. For example:

$ db2 "SELECT SUBSTR(TABNAME,1,20) AS TABNAME, SUBSTR(VALUE,1,50) AS
VALUE FROM TABLE(MON_FORMAT_LOCK_NAME('02000500000000000000000054')) as LOCK"
NAME 					VALUE
------------------------------------------------------
LOCK_OBJECT_TYPE 			TABLE
TBSP_NAME 				TSPA
TABSCHEMA 				MYSCHEM
A1TABNAME 				MYTABLE3
4 record(s) selected.

In this example, the lock wait condition is associated with a table lock on table MYSCHEMA1.MYTABLE3 in tablespace TSPA.

Additional details about the application in lock - wait state can be obtained using the WLM_GET_WORKLOAD_OCCURRENCE_ACTIVITIES table function:

db2 "SELECT APPLICATION_HANDLE as APPHNDL,
LOCK_WAIT_TIME, SUBSTR(STMT_TEXT,1,45) AS STMT_TEXT
FROM TABLE(WLM_GET_WORKLOAD_OCCURRENCE_ACTIVITIES(NULL,-1)) AS T"
APPHNDL			LOCAL_START_TIME			LOCK_WAIT_TIME 		STMT_TEXT
-----------------------------------------------------------------------------------------
10			2016-11-13-12.55.35.821373 		0 			SELECT APPHNDL,LOCK_WAIT_TIME
7 			2016-11-13-12.53.33.777571 		27 			select count(*) from t2
2 record(s) selected.