Lock events for DB2 for Linux, UNIX, and Windows, Part 1
Analyzing lockwait situations in DB2 for Linux, UNIX, and Windows
Determine the cause for concurrency issues using the db2pd tool
This content is part # of # in the series: Lock events for DB2 for Linux, UNIX, and Windows, Part 1
This content is part of the series:Lock events for DB2 for Linux, UNIX, and Windows, Part 1
Stay tuned for additional content in this series.
db2pd options for lock monitoring
db2pd is a utility for monitoring and troubleshooting all kinds of DB2 database activity. It is a standalone utility shipped with the DB2 engine starting with the DB2 V8.2 release and was developed to look, feel, and function like the Informix
db2pd is executed from the command line with an optional interactive mode. The utility runs very quickly as it is not acquiring any locks or latches and runs outside of the engine's resources (meaning it even works on a hung engine). A lot of the monitor data provided by
db2pd can also be gathered by snapshot monitoring, but the output format of
db2pd and snapshot monitoring are quite different. This allows the DBA to choose the monitoring alternative that is more suitable for her or his needs. This article focuses on
db2pd's options for lock monitoring. There is a developerWorks article by Sam Poon (see the Related topics section) that provides a broader introduction to
db2pd's monitoring capabilities.
The following diagram shows
db2pd's options for lock monitoring:
db2pd options for lock monitoring
TranHdl: This is a specification of a transaction handle that allows for monitoring only locks held by a specific transaction.
showlocks: This suboption expands the lockname into meaningful explanations. For a row lock, the following information is shown: tablespace ID, table ID, partition ID, page and slot. Tablespace ID and table ID can easily be mapped to the corresponding tablename using a query on the catalog view
Listing 1. Map tablespace ID, table ID to tableschema, tablename
SELECT TABSCHEMA, TABNAME FROM SYSCAT.TABLES WHERE TBSPACEID = tbspaceid AND TABLEID = tableid
wait: Specifying the
db2pdonly shows locks that transactions are currently waiting on, and the locks responsible for the wait situation. This suboption greatly simplifies lockwait analysis because it limits the output to the locks participating in the lockwait situation.
fileoptions are not specific for lock monitoring, but apply to (nearly) all
databaseoption limits monitor data returned by
db2pdto that of a certain database. The
fileoption let's you define a file to which
db2pdoutput is written.
A lockwait analysis scenario
Next, let's start to analyze a sample lockwait situation using the
db2pd options introduced. For this purpose, let's create the DB2
SAMPLE database as follows:
Listing 2. Create
Transaction A is executed by user A to give each manager a 10% bonus depending on the salary of the manager:
Listing 3. Update operation performed by transaction A
UPDATE EMPLOYEE SET BONUS = SALARY * 0.1 WHERE JOB = 'MANAGER'
While transaction A is still running (because user A has not ended the transaction using
ROLLBACK), transaction B is executed by user B to give each employee a 2% salary raise:
Listing 4. Update operation performed by transaction B
UPDATE EMPLOYEE SET SALARY = SALARY * 0.02
As transaction B does not finish, user B calls the DBA to determine the cause of the problem. The DBA in turn calls
db2pd to look for a possible lockwait situation:
Listing 5. Check for lockwait situation
db2pd -db sample -locks wait showlocks Database Partition 0 -- Database SAMPLE -- Active -- Up 3 days 08:33:05 Locks: Address TranHdl Lockname Type Mode Sts Owner Dur 0x050A0240 6 02000600050040010000000052 Row ..X W 2 1 0x050A0DB0 2 02000600050040010000000052 Row ..X G 2 1 HoldCount Att ReleaseFlg 0 0x00 0x40000000 TbspaceID 2 TableID 6 PartitionID 0 Page 320 Slot 5 0 0x00 0x40000000 TbspaceID 2 TableID 6 PartitionID 0 Page 320 Slot 5
db2pd reports a lockwait situation for a row lock occurring on table with ID 6 in tablespace with ID 2. Checking against
SYSCAT.TABLES, the DBA determines that there is indeed a lockwait situation on table
Listing 6. Determine the table involved in the lockwait situation
SELECT TABSCHEMA, TABNAME FROM SYSCAT.TABLES WHERE TBSPACEID = 2 AND TABLEID = 6 TABSCHEMA TABNAME -------------------------------------------------------------------------------- FECHNER EMPLOYEE 1 record(s) selected.
The status column of the
db2pd -locks output (
Sts) shows a "G" for transaction 2 (column
TranHdl). G stands for "granted" and means that the transaction with transaction handle 2 owns the row lock. Additionally, column
Mode indicates that it is an X lock that is held by transaction 2. The waiting transaction (state W = "wait" in column
Sts) is the transaction with handle 6. That transaction is requesting an X lock on the same row as transaction 2. You can see this by looking at the
Owner column (showing transaction 2 as the lock owner) and comparing the
Lockname (identical for both entries in the
db2pd -locks section).
Next, the DBA maps the transaction handles to applications. This is done using another
Listing 7. Map transaction handles to applications
db2pd -db sample -transactions Database Partition 0 -- Database SAMPLE -- Active -- Up 3 days 08:34:47 Transactions: Address AppHandl [nod-index] TranHdl Locks State Tflag Tflag2 0x05141880 30 [000-00030] 2 9 WRITE 0x00000000 0x00000 0x05144880 34 [000-00034] 6 5 WRITE 0x00000000 0x00000
The output from this
db2pd call shows that transaction 2 (column
TranHdl) is executed by application 30 (column
AppHandl) and transaction 6 by application 34, respectively. Both transactions are in the process of writing changes to the database (column
State = WRITE). So the DBA now knows that application 30 is holding the lock on which application 34 is waiting.
To get more information about the applications participating in the lockwait situation,
db2pd is called with the
-agents option. This option prints information about the agents working on behalf of the applications. Note that
-agents is an instance level option, that means specifying a database is not necessary (indeed when a database is specified,
db2pd prints a warning and ignores the database option).
Listing 8. Get information about applications and corresponding agents
db2pd -agents Database Partition 0 -- Active -- Up 3 days 08:35:42 Agents: Current agents: 2 Idle agents: 0 Active coord agents: 2 Active agents total: 2 Pooled coord agents: 0 Pooled agents total: 0 Address AppHandl [nod-index] AgentTid Priority Type State 0x04449BC0 34 [000-00034] 3392 0 Coord Inst-Active 0x04449240 30 [000-00030] 2576 0 Coord Inst-Active ClientPid Userid ClientNm Rowsread Rowswrtn LkTmOt DBName 3916 USER_B db2bp.ex 43 43 NotSet SAMPLE 2524 USER_A db2bp.ex 153 14 NotSet SAMPLE
db2pd -agents output, the DBA can see the IDs of the users working with application 30 and 34 (column
Userid): application 30 is executed by USER_A, and application 34 is executed by USER_B. Such a mapping between application and user ID is only possible if each user has a separate database authorization ID. Generally, it is not possible for applications running on an application server because these applications use connection pooling so that connections are not personalized.
More information about each application is provided by the
Listing 9. Get more information about applications
db2pd -db sample -applications Database Partition 0 -- Database SAMPLE -- Active -- Up 3 days 08:36:14 Applications: Address AppHandl [nod-index] NumAgents CoorTid Status 0x04AF8080 34 [000-00024] 1 3940 Lock-wait 0x03841960 30 [000-00020] 1 2548 UOW-Waiting C-AnchID C-StmtUID L-AnchID L-StmtUID Appid 195 1 0 0 *LOCAL.DB2.061122195637 0 0 60 1 *LOCAL.DB2.061122195609
Status column confirms something the DBA already knows: application 34 is in lock-wait state. But that is nothing new, so the DBA concentrates on columns
L-AnchID/L-StmtUID, respectively. "C" stands for current and "L" stands for last anchor ID/statement UID. These IDs can be used to identify the last SQL statement executed by an application as well as the statement currently executed by an application. To do so,
db2pd with the
-dynamic option is called. This option shows the contents of the database's dynamic statement cache:
Listing 10. Examine contents of dynamic statement cache
db2pd -db sample -dynamic Database Partition 0 -- Database SAMPLE -- Active -- Up 3 days 08:37:39 Dynamic Cache: Current Memory Used 187188 Total Heap Size 1271398 Cache Overflow Flag 0 Number of References 2 Number of Statement Inserts 3 Number of Statement Deletes 0 Number of Variation Inserts 2 Number of Statements 3 Dynamic SQL Statements: Address AnchID StmtUID NumEnv NumVar NumRef NumExe 0x056CEBD0 60 1 1 1 1 1 0x056CE850 180 1 0 0 0 0 0x056CFEA0 195 1 1 1 1 1 Text UPDATE EMPLOYEE SET BONUS = SALARY * 0.1 WHERE JOB = 'MANAGER' SET CURRENT LOCALE LC_CTYPE = 'de_DE' UPDATE EMPLOYEE SET SALARY = SALARY * 0.02 Dynamic SQL Environments: Address AnchID StmtUID EnvID Iso QOpt Blk 0x056CECD0 60 1 1 CS 5 B 0x056D30A0 195 1 1 CS 5 B Dynamic SQL Variations: Address AnchID StmtUID EnvID VarID NumRef Typ 0x056CEEB0 60 1 1 1 1 4 0x056D3220 195 1 1 1 1 4 Lockname 010000000100000001003C0056 01000000010000000100C30056
The mapping between the
-applications output and the
-dynamic output is straightforward:
Application 34 (in lock-wait state) is currently executing the SQL statement identified by current anchor ID 195 and current statement ID 1. In the
Dynamic SQL Statements section of the
db2pd -dynamic output, those IDs can be mapped to the following SQL statement:
Listing 11. SQL statement executed by application 34
UPDATE EMPLOYEE SET SALARY = SALARY * 0.02
The last SQL statement executed by application 30 that is holding the lock was the statement with last anchor ID 60 and last statement ID 1. Those IDs can be mapped to the following SQL statement:
Listing 12. SQL statement executed by application 30
UPDATE EMPLOYEE SET BONUS = SALARY * 0.1 WHERE JOB = 'MANAGER'
Note that the
db2pd -dynamic output contains another interesting piece of information that is normally difficult to find out: The isolation level of a dynamic SQL statement that was executed is shown in column
Iso of the
Dynamic SQL Environments section (UR = Uncommitted Read, CS = Cursor Stability, RS = Read Stability, RR = Repeatable Read).
Let's summarize what the DBA discovered regarding the cause for the hang of user B's application:
- The hang is caused by an exclusive row lock on table
- The transaction owning the lock belongs to an application executed by user A. The transaction of user B is waiting on that lock.
- The two statements colliding are both
UPDATEstatements on table
With that information, the DBA can initiate the necessary steps to resolve the lockwait situation, such as advising user A to end the transaction, or forcing user A's application, among other options. In addition, measures can be taken to avoid such situations in the future, for example configuring the DB2 governor to automatically end transactions that run too long.
In the sample scenario,
db2pd was executed several times one after the other, each time with a single option. In a real-world scenario, that would not be the case. Instead,
db2pd would be called once with all the options introduced above:
Listing 13. Single
db2pd call with all options required to analyze a lockwait situation
db2pd -db sample -locks wait showlocks -transactions -agents -applications -dynamic -file db2pd.out -repeat 15 40
The resulting output consists of the outputs for each option in the order the options appear in the
db2pd call. Also, note the two additional options at the end of the
db2pdoutput should be written to a file. In the sample call, output is written to file
db2pdshould be executed every 15 seconds for 40 times (that means for 10 minutes in intervals of 15 seconds). The output of each execution is appended to the end of the file specified by the
-repeat options are useful to monitor database activity for a certain period of time. For lockwait analysis, those options help to catch lockwait situations that only exist for a short period of time. For example, if the database parameter
LOCKWAIT is set to 20 seconds, a transaction waiting for a lock is rolled back after 20 seconds of wait time. To catch such a lockwait situation, the
db2pd interval must be set to an interval shorter than 20 seconds, such as 15 seconds in the example.
Catching locktimeouts that occur only sporadically
Sometimes lockwait situations lead to locktimeouts that cause transactions to be rolled back. The period of time until a lockwait leads to a locktimeout is specified by the database configuration parameter
LOCKTIMEOUT. One of the biggest problems concerning locktimeout analysis is the fact that it is not known when the next locktimeout situation will occur. For catching deadlocks, you can create an event monitor for deadlocks. Such a deadlock event monitor writes an entry each time a deadlock occurs. But there is no equivalent event monitor for locktimeouts. So until DB2 9®, the only way to catch a locktimeout was by continuous
db2pd or snapshot monitoring (for
-repeat options can be used for continuous lock monitoring as explained earlier).
DB2 9 comes with a new mechanism to collect monitor data when database failures or events occur: the
db2cos script. For catching locktimeout events, we can configure the database to start the
db2cos script each time a locktimeout occurs. Within the
db2pd can be called with the same options as already discussed. Let's walk through a sample scenario demonstrating the use of the
db2cos script to catch locktimeouts.
For this scenario, assume the DBA set the database locktimeout value to 10 seconds:
Listing 14. Update locktimeout setting
UPDATE DB CFG FOR SAMPLE USING LOCKTIMEOUT 10
To start the
db2cos script each time a locktimeout occurs, the DBA calls the
db2pdcfg utility as follows:
Listing 15. Configure
db2cos script invocation using
db2pdcfg -catch locktimeout count=1
-catch option specifies the failure or event that should automatically lead to an invocation of the
db2cos script. For locktimeout events, the string
locktimeout can be specified. As an alternative, you could specify the corresponding SQL error code and reason code for locktimeouts:
Listing 16. Alternative
db2pdcfg call for catching locktimeouts
db2pdcfg -catch 911,68 count=1
Besides certain string values and SQL codes,
db2pdcfg also accepts internal DB2 error codes. So there are quite a lot database failures and events that you can catch this way. Locktimeout events are only one example of using of
The value 1 for the
count suboption indicates that the
db2cos script should only be executed once a locktimeout event occurs.
db2pdcfg confirms the setting of the error catch with the following output:
db2pdcfg's confirmation of error catch setting
Error Catch #1 Sqlcode: 0 ReasonCode: 0 ZRC: -2146435004 ECF: 0 Component ID: 0 LockName: Not Set LockType: Not Set Current Count: 0 Max Count: 1 Bitmap: 0x4A1 Action: Error code catch flag enabled Action: Execute sqllib/db2cos callout script Action: Produce stack trace in db2diag.log
The setting of the error catch is also reported in the
db2diag.log. Instead of opening the
db2diag.log in a text editor, the
db2diag.log file can be filtered using the
db2diag utility (a useful utility for examining the contents of the
Listing 18. Confirmation of error catch setting in
db2diag -g funcname:=pdErrorCatch 2006-12-18-126.96.36.199000+060 I727480H285 LEVEL: Event PID : 4648 TID : 3948 PROC : db2syscs.exe INSTANCE: DB2 NODE : 000 FUNCTION: DB2 UDB, RAS/PD component, pdErrorCatch, probe:30 START : Error catch set for ZRC -2146435004
ZRC -2146435004 is the DB2 internal error code for locktimeouts. These can be checked with the following
Listing 19. Check meaning of DB2 internal error codes using
db2diag -rc -2146435004
db2pdcfg, the database engine has now been configured to call the
db2cos script each time a locktimeout occurs. The
db2cos script gathers all the required monitor information to determine the cause of the locktimeout. To do so, the DBA has to modify the
db2cos script to call
db2pd with the options already known. The
db2cos script can be found in the following subdirectory:
DB2 install directory\BIN\db2cos.bat, for example
Instance owner home/sqllib/bin/db2cos
db2cos.bat script on Microsoft Windows® looks like this:
Listing 20. Contents of default
db2cos.bat on Windows
setlocal :iterargs if %0. == . goto iterdone if /i %0. == INSTANCE. set INSTANCE=%1 if /i %0. == DATABASE. set DATABASE=%1 if /i %0. == TIMESTAMP. set TIMESTAMP=%1 if /i %0. == APPID. set APPID=%1 if /i %0. == PID. set PID=%1 if /i %0. == TID. set TID=%1 if /i %0. == DBPART. set DBPART=%1 if /i %0. == PROBE. set PROBE=%1 if /i %0. == FUNCTION. set FUNCTION=%1 if /i %0. == REASON. set REASON=%1 if /i %0. == DESCRIPTION. set DESCRIPTION=%1 if /i %0. == DiAGPATH. set DIAGPATH=%1 shift goto iterargs :iterdone if %DATABASE%. == . goto no_database db2pd -db %DATABASE% -inst >> %DIAGPATH%\db2cos%PID%%TID%.%DBPART% goto exit :no_database db2pd -inst >> %DIAGPATH%\db2cos%PID%%TID%.%DBPART% :exit
In the case of an event or failure at the database level, the default
db2cos script calls
db2pd with the
-inst options. The DBA replaces the corresponding line with a
db2pd call that collects the monitor data required for locktimeout analysis:
Listing 21. Change
db2cos script to collect data for locktimeout analysis
if %DATABASE%. == . goto no_database db2pd -db %DATABASE% -locks wait -transactions -agents -applications -dynamic >> %DIAGPATH%\db2cos%PID%%TID%.%DBPART% goto exit
db2cos script is prepared and the DBA can wait for the next locktimeout event to occur.
Let's assume the same lock situation, between user A and user B as described before, happens. But this time
LOCKTIMEOUT is set and therefore the transaction of user B is rolled back after 10 seconds (
LOCKTIMEOUT = 10). User B notifies the DBA that his or her transaction was rolled back with SQL error message -911 and reason code 68 (SQL code -911 / reason code 68 = locktimeout). In turn, the DBA checks the monitor data that was gathered by the automatic invocation of the
First, the DBA calls
db2diag with the internal error code for locktimeout to determine the exact time the locktimeout happened:
Listing 22. Check point-in-time for locktimeout event in
db2diag -g data:=-2146435004 2006-12-18-188.8.131.526000+060 I6857H409 LEVEL: Event PID : 2968 TID : 2932 PROC : db2syscs.exe INSTANCE: DB2 NODE : 000 DB : SAMPLE APPHDL : 0-21 APPID: *LOCAL.DB2.061226132544 AUTHID : FECHNER FUNCTION: DB2 UDB, lock manager, sqlplnfd, probe:999 DATA #1 : <preformatted> Caught rc -2146435004. Dumping stack trace.
db2diag.log entry shows that a locktimeout occurred at 2006-12-18-184.108.40.2066000. As the
db2cos script writes its output to a file named
db2cos%PID%%TID%.%DBPART% in the
%DIAGPATH%, the DBA can expect to find a file
db2cos29682932.0 in the instance's diagnostic path:
%DIAGPATH%= instance's diagnostic path = on Windows by default
%PID%= process ID = 2968 (as shown in the
%TID%= thread ID = 2932 (also shown in the
%DBPART%= database partition = 0 (in a non-partitioned database environment)
The contents of that file will resemble the
db2pd monitor output examined step-by-step in the first part of this article and will allow the DBA to identify the cause of the locktimeout.
Having caught the locktimeout, the DBA can disable the
db2cos script invocation by calling
db2pdcfg with the
-catch clear option:
Listing 23. Clear error catch settings using
db2pdcfg -catch clear All error catch flag settings cleared.
This article demonstrates how to use the
db2pd utility for lockwait monitoring. Going through a sample scenario, it shows how a DBA can identify the cause for concurrency problems by examining the output of different
db2pd options. Starting with DB2 9,
db2pd can be used in conjunction with the new
db2cos script to catch locktimeout events as they happen. You learned how to configure automatic invocation of the
db2cos script in the case of locktimeout events. The
db2diag utility was introduced -- a useful tool for checking the contents of the
- See all the articles in the series.
- The article "The db2pd tool" (developerWorks, Apr. 2005) by Sam Poon provides an overview of
db2pd's monitoring capabilities.
- The chapter Monitoring and troubleshooting using db2pd in the DB2 9 for Linux, UNIX, and Windows documentation contains further usage scenarios for DB2 monitoring with
- Besides an introduction to DB2 data concurrency concepts, the tutorial "DB2 9 Fundamentals exam 730 prep, Part 6: Data concurrency" contains a nice overview of the different kinds of locks used by DB2.
- In the article "Understanding locking in DB2 Universal Database" (developerWorks, Nov. 2005) Lorysa Bond introduces the principles behind DB2's locking strategy.
- The article "Lock avoidance in DB2 UDB V8" (developerWorks, Sept. 2005) by Werner Schuetz explains DB2 concurrency basics and shows how DB2's locking behaviour can be influenced by setting special DB2 registry variables.
- The complete DB2 9 for Linux, UNIX, and Windows documentation online in HTML format: DB2 Information Center
- Download a free trial version of DB2 Enterprise 9.
- Now you can use DB2 for free. Download DB2 Express-C, a no-charge version of DB2 Express Edition for the community that offers the same core data features as DB2 Express Edtion and provides a solid base to build and deploy applications.
- Visit the developerWorks resource page for DB2 for Linux, UNIX, and Windows to read articles and tutorials and connect to other resources to expand your DB2 skills.
- Learn about DB2 Express-C, the no-charge version of DB2 Express Edition for the community.
- Download IBM product evaluation versions and get your hands on application development tools and middleware products from DB2, Lotus®, Rational®, Tivoli®, and WebSphere®.