Contents


Transaction IDs in DB2

Retrieve a unique identifier for a unit of work from the DB2 log records

Comments

Normally, an application that talks to a database server does not need to be aware of the internals of the database engine with respect to transaction handling. The application issues a series of SQL statements - such as INSERT, UPDATE, DELETE, SELECT, or CREATE TABLE - and at the end of a transaction, a transaction-ending statement is run, either COMMIT or ROLLBACK. A COMMIT statement ends the transaction and tells the database server to make all changes permanent. A ROLLBACK, on the other hand, causes the database server to undo all changes made in the transaction.

The world does not consist of only "normal" applications. Many specialized applications exist, and such applications usually have very specific requirements. For example, an application that is concerned with replicating data from one database system to another, such as DB2 Replication [2], needs to know about transactional information. When replicating data changes, it is usually necessary to replicate all changes made in a single transaction together. Thus, it is necessary to know which changes were made in which transaction and each transaction needs to be identified uniquely.

DB2 UDB does not provide a special register or any other direct means to retrieve the identifier of a transaction, which is needed for internal purposes. In the following sections I describe how you can combine several features of DB2 UDB to come up with a transaction identifier. The general idea is to access the log records, which are written by DB2 to ensure recovery of your valuable data in case of a system crash, power outage or hard drive failure. DB2 stores the internal transaction ID in each log record. Some additional setup-steps are performed to trigger a specific log record to be written so that the correct transaction ID is retrieved.

Accessing the Transaction ID

DB2 UDB comes with a great variety of APIs to manage the database system itself. One of these APIs allows you to access the logs records that are written by the database engine during its operation. Log records are written for INSERT, UPDATE, DELETE, REORGANIZE and many other operations that you can issue to modify the data in the tables of your database or to administer the system itself.

Each log record includes in its header a unique identifier for the transaction (encoded in 6 bytes) [3] that triggered the writing of the record. This identifier is needed to guarantee the durability of all database transactions in case of expected or unexpected interruption of the operation, such as a system failure. DB2 maintains that identifier automatically.

The approach takes advantage of the transaction identifier stored in the log record header. We use the log reader API to access the log records and to extract the transaction ID from there. While doing that, keep the following in mind:

  • A log record must be written in the current transaction for which we want to determine the requested identifier.
  • Given that a single log (possibly split into several files) is used by the database engine for all concurrent transactions, many or all of which could make data modifications and cause log records to be written concurrently, it is necessary to find a log record that really belongs to the current transaction.

Both issues can be addressed together. We use the DB2-builtin function GENERATE_UNIQUE to generate a unique value. That value is inserted into a table, causing a log record to be written. In the next step, all new log records (since before the INSERT) are read until the one log record is found that contains the unique value. That's the log record we seek and the transaction ID is extracted from it. Finally, the insert operation is undone so that the table does not accumulate stale data. The whole processing is illustrated in Figure 1.

Figure 1. Logic to extract the transaction identifier from the database log
Process to capture the transaction ID
Process to capture the transaction ID

Implementing the stored procedure

The complete logic outlined above is encapsulated in a single stored procedure. That way, all applications have a simple and standardized way to retrieve the transaction ID. No direct calls to the DB2 API will be necessary. The transaction ID is returned by the stored procedure as a string, that is, a value of type CHAR(12).

Before you can compile the stored procedure you need to create a table named TA_ID_FORCE_LOGWRITE. This table is accessed inside the procedure. The table itself has a very simple structure consisting of a single column where the unique value generated by the function GENERATE_UNIQUE, is stored. Create the table using the SQL statement shown in Listing 1:

Listing 1. Creating the table TA_ID_FORCE_LOGWRITE
CREATE TABLE ta_id_force_logwrite (
   unique_val VARCHAR(18) FOR BIT DATA NOT NULL,
   CONSTRAINT taid_pk PRIMARY KEY (unique_val)
)@

The procedure is implemented in C++ to access the asynchronous read log API [4], and it uses embedded SQL to perform the necessary SQL operations. A savepoint is used to roll back the INSERT operation that is performed inside the procedure to trigger a log record being written. The SQL statements in the procedure take advantage of the power of SQL supported by DB2 UDB Version 8.2.

Therefore, we generate the unique value, insert it in a table and retrieve it to the stored procedure code all in a single statement. This is done in the section set in italics in the listing below. If you want to use the procedure on an earlier version of DB2, you might have to break this logic into several, independent SQL statements.

The DB2 log is read using the API db2ReadLog. At the beginning, the API is called to determine the current log sequence number (LSN). This step is done to avoid querying any log records that were written prior to the invocation of the procedure. After all, we are only interested in a single log record: the one written by the INSERT operation.

After the INSERT operation, all new log records are retrieved. For each log record, we verify that it is a log record written for an INSERT operation. If it is and if the data that is inserted contains the unique value used during the INSERT statement, then we have found the log record in question and can return the proper transaction identifier.

Before leaving the procedure, we rollback the processsing to the savepoint that was set at the beginning. Thus, no data modifications remain beyond the scope of the procedure.

The complete code for the procedure is shown in Listing 2. The calls to the log API are set in bold and the SQL statements appear in italics in the listing. The other pieces are only concerned with the preparation of the parameters.

Listing 2. Stored procedure code
#include <string.h> // memset(), memcpy(), strncpy()
#include <stdio.h> // sprintf()

#include <sqludf.h>
#include <db2ApiDf.h>

#if defined(__cplusplus)
extern "C"
#endif
int SQL_API_FN getTransactionId(
	SQLUDF_VARCHAR *taId,
	SQLUDF_NULLIND *taId_ind,
	SQLUDF_TRAIL_ARGS)
{
    SQL_API_RC rc = SQL_RC_OK;
    struct sqlca sqlca;
    db2ReadLogInfoStruct logInfo;
    db2ReadLogStruct logData;
    SQLU_LSN startLsn;
    SQLU_LSN endLsn;
    char buffer[64 * 1024] = { '\0' }; // for log record data

    EXEC SQL BEGIN DECLARE SECTION;
        char uniqueVal[13] = { '\0' };
    EXEC SQL END DECLARE SECTION;

    // we assume NULL return
    *taId_ind = -1;

    /*
     * Step 1: Set a savepoint to be able to undo the data modifications
     */
    EXEC SQL SAVEPOINT get_transaction_id ON ROLLBACK RETAIN CURSORS;

    /*
     * Step 2: Query the DB2 Log to get the start LSN
     */
    memset(&sqlca, 0x00, sizeof sqlca);
    memset(&logInfo, 0x00, sizeof logInfo);
    memset(&logData, 0x00, sizeof logData);
    logData.iCallerAction = DB2READLOG_QUERY;
    logData.piStartLSN = NULL;
    logData.piEndLSN = NULL;
    logData.poLogBuffer = NULL;
    logData.iLogBufferSize = 0;
    logData.iFilterOption = DB2READLOG_FILTER_OFF;
    logData.poReadLogInfo = &logInfo;
    rc = db2ReadLog(db2Version810, &logData, &sqlca);
    if (rc < 0) {
        memcpy(SQLUDF_STATE, "38TA0", SQLUDF_SQLSTATE_LEN);
        strncpy(SQLUDF_MSGTX, "Could not query log for last LSN",
                SQLUDF_MSGTEXT_LEN);
        goto exit;
    }
    else if (sqlca.sqlcode) {
        memcpy(SQLUDF_STATE, "38TA1", SQLUDF_SQLSTATE_LEN);
        snprintf(SQLUDF_MSGTX, SQLUDF_MSGTEXT_LEN, "SQL error while "
                "reading log records.  SQLCODE = %d, SQLSTATE=%s",
                sqlca.sqlcode, sqlca.sqlstate);
        goto exit;
    }
    memcpy(&startLsn, &logInfo.nextStartLSN, sizeof startLsn);

    /*
     * Step 3: Force a log record to be written
     *
     * Insert a unique value into our table, which triggers a log record to be
     * written.  The same value is also returned right away so that we can use
     * it to search through the new log records.
     */
    EXEC SQL
	SELECT value
	INTO   :uniqueVal
	FROM   NEW TABLE ( INSERT
                           INTO   ta_id_force_logwrite
                           VALUES ( GENERATE_UNIQUE() ) ) AS t(value);
    if (sqlca.sqlcode) {
        memcpy(SQLUDF_STATE, "38TA2", SQLUDF_SQLSTATE_LEN);
        snprintf(SQLUDF_MSGTX, SQLUDF_MSGTEXT_LEN, "SQL error while "
                "triggering log record.  SQLCODE = %d, SQLSTATE=%s",
                sqlca.sqlcode, sqlca.sqlstate);
        goto exit;
    }

    /*
     * Step 4: Search through the new log records to find our INSERT
     */
    while (true) {
        char *ptr = NULL;
        char *transactionId = NULL;
        sqlint32 recordLength = 0;

        memset(&sqlca, 0x00, sizeof sqlca);
        memset(&logInfo, 0x00, sizeof logInfo);
        memset(&logData, 0x00, sizeof logData);
        memset(&endLsn, 0xFF, sizeof endLsn);
        logData.iCallerAction = DB2READLOG_READ_SINGLE;
        logData.piStartLSN = &startLsn;
        logData.piEndLSN = &endLsn;
        logData.poLogBuffer = buffer;
        logData.iLogBufferSize = sizeof buffer;
        logData.iFilterOption = DB2READLOG_FILTER_OFF;
        logData.poReadLogInfo = &logInfo;
        rc = db2ReadLog(db2Version810, &logData, &sqlca);
        if (rc < 0) {
	    memcpy(SQLUDF_STATE, "38TA3", SQLUDF_SQLSTATE_LEN);
    	    sprintf(SQLUDF_MSGTX, "Could not read log record.  rc = %d",
                    (int)rc);
	    goto exit;
        }
        else if (sqlca.sqlcode == SQLU_RLOG_READ_TO_CURRENT) {
	    memcpy(SQLUDF_STATE, "38TA4", SQLUDF_SQLSTATE_LEN);
    	    strncpy(SQLUDF_MSGTX, "Last log record reached prematurely.",
                    SQLUDF_MSGTEXT_LEN);
	    goto exit;
        }
        else if (sqlca.sqlcode) {
    	    memcpy(SQLUDF_STATE, "38TA5", SQLUDF_SQLSTATE_LEN);
	    snprintf(SQLUDF_MSGTX, SQLUDF_MSGTEXT_LEN, "SQL error while "
                    "reading log records.  SQLCODE = %d, SQLSTATE=%s",
                    sqlca.sqlcode, sqlca.sqlstate);
    	    goto exit;
	}
        if (logInfo.logBytesWritten < 20) {
	    memcpy(SQLUDF_STATE, "38TA6", SQLUDF_SQLSTATE_LEN);
    	    strncpy(SQLUDF_MSGTX, "Log Manager Header of record too small.",
                    SQLUDF_MSGTEXT_LEN);
	    goto exit;
        }
        memcpy(&startLsn, &logInfo.nextStartLSN, sizeof startLsn);

        // the data in the buffer starts with the LSN, followed by the Log
        // Manager Header; skip the LSN
        ptr = buffer;
        ptr += sizeof(SQLU_LSN);

        // get the length of the log record (plus LSN)
        recordLength = *(sqlint32 *)ptr + sizeof(SQLU_LSN);
        ptr += 4;

        // verify that this is a "Normal" log record
        if (*(sqlint16 *)ptr != 0x004E) {
            continue;
        }
        ptr += 2;

        // skip behind the Log Manager Header (to the DMS Log Record Header);
        // (we do not have "Compensation" records here and "Propagatable"
        // doesn't occur either)
        ptr += 2 + // flags
            6; // LSN of previous record in same transaction

        // remember the location of the transaction id
        transactionId = ptr;
        ptr += 6;

        // now we are at the beginning of the DML Log Record Header
    	if (ptr - buffer + 18 + 4 > recordLength) {
            continue;
        }

        // check that the "Function identifier" in the DMS header indicates an
        // "INSERT" log record
        ptr += 1;
        if (*(unsigned char *)ptr != 118) {
            continue;
        }

        // skip to the record data
        ptr += 5 + // remainder of DMS Log Record Header
            2 + // padding
            4 + // RID
            2 + // record Length
            2 + // free space
            2; // record offset

        // the record contains data if the 1st byte of the record header (the
        // record type) is 0x00 or 0x10, or if the bit 0x04 is set
        if (*ptr != 0x00 && *ptr != 0x10 && (*ptr & 0x04) == 0) {
            continue;
        }
        ptr += 4;

        // we reached the record data and the unique value can be found after
        // the record length
        ptr += 1 + // record type
            1 + // reserved
            2 + // length of fixed length data
            4; // RID

        // that's where the unique value should be
        // once we found the unique value, extract the transaction ID and
        // convert it to a string
        if (memcmp(ptr, uniqueVal, 13) == 0) {
            int i = 0;
            char *result = taId;
            for (i = 0; i < 6; i++) {
                sprintf(result, "%02hhx", ptr[i]);
                result += 2;
            }
            *result = '\0';
            *taId_ind = 0;
            break; // found the correct log record
        }
    }

  exit:
    EXEC SQL ROLLBACK TO SAVEPOINT get_transaction_id;
    return SQLZ_DISCONNECT_PROC;
}

The stored procedure can be compiled with the bldrtn script that can be found in the sqllib/samples/c/ directory. That script generates a shared library and that library is copied to the sqllib/function directory. Once this is done you can register the procedure in your database as shown in Listing 3. This is the final step before you can start using the procedure.

Listing 3. Register the procedure in the database
CREATE PROCEDURE getTransactionId ( OUT transactionId CHAR(12) )
   SPECIFIC getTaId
   DYNAMIC RESULT SETS 0
   MODIFIES SQL DATA
   NOT DETERMINISTIC
   NEW SAVEPOINT LEVEL
   LANGUAGE C
   EXTERNAL NAME 'transaction-id!getTransactionId'
   FENCED THREADSAFE
   NO EXTERNAL ACTION
   PARAMETER STYLE SQL
   PROGRAM TYPE SUB
   NO DBINFO@

Testing the procedure

The final step is to verify the proper functioning of the procedure. Listing 4 shows some SQL statements that were executed on the DB2 command line with auto-commit turned off. The very first scenario illustrates the error that you will get if you still have circular logging activated. Following that, you will see the results of various calls to the stored procedure with some data modifications thrown in. Of course, DB2 will only assign a new transaction ID after a COMMIT or ROLLBACK has been executed, and only if the current transaction already caused log records to be written.

Listing 4. Testing the procedure
$ db2 -c- -td@

db2 => CALL getTransactionId(?)@
SQL0443N  Routine "*ACTIONID" (specific name "") has returned an error
SQLSTATE with diagnostic text "SQL error while reading log records.  SQLCODE =
-2651, SQLS".  SQLSTATE=38TA1

db2 => ? sql2651@

SQL2651N The log records associated with the database can not be
          asynchronously read.

Explanation:

The asynchronous read log API was used against a connected
database which does not have LOG RETAIN or USER EXITS ON.  Only
databases which are forward recoverable may have their associated
logs read.

User Response:

Update the database configuration for the database, identified to
the asynchronous read log API, turning LOG RETAIN and/or USER
EXITS ON.

db2 => UPDATE DATABASE CONFIGURATION USING LOGRETAIN ON@
db2 => BACKUP DATABASE sample TO /dev/null@
Backup successful. The timestamp for this backup image is : 20050305214103
db2 => TERMINATE@
$ db2stop force && db2start && db2 -c- -td@
db2 => CONNECT TO sample@
db2 => CALL getTransactionId(?)@

  Value of output parameters
  --------------------------
  Parameter Name  : TRANSACTIONID
  Parameter Value : 200503052054

  Return Status = 0

db2 => COMMIT@
DB20000I  The SQL command completed successfully.
db2 => CALL getTransactionId(?)@

  Value of output parameters
  --------------------------
  Parameter Name  : TRANSACTIONID
  Parameter Value : 200503052054

  Return Status = 0

db2 => CREATE TABLE t ( a INT )@
DB20000I  The SQL command completed successfully.
db2 => CALL getTransactionId(?)@

  Value of output parameters
  --------------------------
  Parameter Name  : TRANSACTIONID
  Parameter Value : 200503052055

  Return Status = 0

db2 => ROLLBACK@
DB20000I  The SQL command completed successfully.
db2 => CALL getTransactionId(?)@

  Value of output parameters
  --------------------------
  Parameter Name  : TRANSACTIONID
  Parameter Value : 200503052056

  Return Status = 0

Conclusion

There are many reasons why a DBA might want to be able to identify the current unit of work, for example, when managing a replication process. Using the technique I've described here, you can take advantage of DB2 UDB's unique capabilities and increase your knowledge of what's going on "under the covers" in your database.


Downloadable resources


Related topics


Comments

Sign in or register to add and subscribe to comments.

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=83944
ArticleTitle=Transaction IDs in DB2
publish-date=05262005