Level: Intermediate Dirk Fechner (fechner@de.ibm.com), IT Services Specialist, IBM Software Group
19 Jul 2007 When several DB2® users access a database concurrently, lockwait situations can cause bad response times. Lockwaits tend to be temporary in nature and thus hard to catch. Nevertheless, when lockwait situations occur, it is the database administrator's responsibility to determine the cause of the lockwait times. This article demonstrates, by example, how to use the db2pd and db2pdcfg utilities for DB2 for Linux®, UNIX®, and Windows® to accomplish that task.
db2pd options for lock monitoring
db2pd is a utility for monitoring and troubleshooting all kinds of DB2 database activity. It is a standalone utility shipped with the DB2 engine starting with the DB2 V8.2 release and was developed to look, feel, and function like the Informix onstat utility. db2pd is executed from the command line with an optional interactive mode. The utility runs very quickly as it is not acquiring any locks or latches and runs outside of the engine's resources (meaning it even works on a hung engine). A lot of the monitor data provided by db2pd can also be gathered by snapshot monitoring, but the output format of db2pd and snapshot monitoring are quite different. This allows the DBA to choose the monitoring alternative that is more suitable for her or his needs. This article focuses on db2pd's options for lock monitoring. There is a developerWorks article by Sam Poon (see the Resources section) that provides a broader introduction to db2pd's monitoring capabilities.
The following diagram shows db2pd's options for lock monitoring:
Figure 1. db2pd options for lock monitoring
-
TranHdl: This is a specification of a transaction handle that allows for monitoring only locks held by a specific transaction.
-
showlocks: This suboption expands the lockname into meaningful explanations. For a row lock, the following information is shown: tablespace ID, table ID, partition ID, page and slot. Tablespace ID and table ID can easily be mapped to the corresponding tablename using a query on the catalog view SYSCAT.TABLES:
Listing 1. Map tablespace ID, table ID to tableschema, tablename
SELECT TABSCHEMA, TABNAME
FROM SYSCAT.TABLES
WHERE TBSPACEID = tbspaceid AND TABLEID = tableid
|
-
wait: Specifying the wait suboption, db2pd only shows locks that transactions are currently waiting on, and the locks responsible for the wait situation. This suboption greatly simplifies lockwait analysis because it limits the output to the locks participating in the lockwait situation.
- The
db2pd database and file options are not specific for lock monitoring, but apply to (nearly) all db2pd calls. The database option limits monitor data returned by db2pd to that of a certain database. The file option let's you define a file to which db2pd output is written.
A lockwait analysis scenario
Next, let's start to analyze a sample lockwait situation using the db2pd options introduced. For this purpose, let's create the DB2 SAMPLE database as follows:
Listing 2. Create SAMPLE database
Transaction A is executed by user A to give each manager a 10% bonus depending on the salary of the manager:
Listing 3. Update operation performed by transaction A
UPDATE EMPLOYEE
SET BONUS = SALARY * 0.1
WHERE JOB = 'MANAGER'
|
While transaction A is still running (because user A has not ended the transaction using COMMIT or ROLLBACK), transaction B is executed by user B to give each employee a 2% salary raise:
Listing 4. Update operation performed by transaction B
UPDATE EMPLOYEE
SET SALARY = SALARY * 0.02
|
As transaction B does not finish, user B calls the DBA to determine the cause of the problem. The DBA in turn calls db2pd to look for a possible lockwait situation:
Listing 5. Check for lockwait situation
db2pd -db sample -locks wait showlocks
Database Partition 0 -- Database SAMPLE -- Active -- Up 3 days 08:33:05
Locks:
Address TranHdl Lockname Type Mode Sts Owner Dur
0x050A0240 6 02000600050040010000000052 Row ..X W 2 1
0x050A0DB0 2 02000600050040010000000052 Row ..X G 2 1
HoldCount Att ReleaseFlg
0 0x00 0x40000000 TbspaceID 2 TableID 6 PartitionID 0 Page 320 Slot 5
0 0x00 0x40000000 TbspaceID 2 TableID 6 PartitionID 0 Page 320 Slot 5
|
db2pd reports a lockwait situation for a row lock occurring on table with ID 6 in tablespace with ID 2. Checking against SYSCAT.TABLES, the DBA determines that there is indeed a lockwait situation on table EMPLOYEE.
Listing 6. Determine the table involved in the lockwait situation
SELECT TABSCHEMA, TABNAME
FROM SYSCAT.TABLES
WHERE TBSPACEID = 2 AND TABLEID = 6
TABSCHEMA TABNAME
--------------------------------------------------------------------------------
FECHNER EMPLOYEE
1 record(s) selected.
|
The status column of the db2pd -locks output (Sts) shows a "G" for transaction 2 (column TranHdl). G stands for "granted" and means that the transaction with transaction handle 2 owns the row lock. Additionally, column Mode indicates that it is an X lock that is held by transaction 2. The waiting transaction (state W = "wait" in column Sts) is the transaction with handle 6. That transaction is requesting an X lock on the same row as transaction 2. You can see this by looking at the Owner column (showing transaction 2 as the lock owner) and comparing the Lockname (identical for both entries in the db2pd -locks section).
Next, the DBA maps the transaction handles to applications. This is done using another db2pd option, -transactions:
Listing 7. Map transaction handles to applications
db2pd -db sample -transactions
Database Partition 0 -- Database SAMPLE -- Active -- Up 3 days 08:34:47
Transactions:
Address AppHandl [nod-index] TranHdl Locks State Tflag Tflag2
0x05141880 30 [000-00030] 2 9 WRITE 0x00000000 0x00000
0x05144880 34 [000-00034] 6 5 WRITE 0x00000000 0x00000
|
The output from this db2pd call shows that transaction 2 (column TranHdl) is executed by application 30 (column AppHandl) and transaction 6 by application 34, respectively. Both transactions are in the process of writing changes to the database (column State = WRITE). So the DBA now knows that application 30 is holding the lock on which application 34 is waiting.
To get more information about the applications participating in the lockwait situation, db2pd is called with the -agents option. This option prints information about the agents working on behalf of the applications. Note that -agents is an instance level option, that means specifying a database is not necessary (indeed when a database is specified, db2pd prints a warning and ignores the database option).
Listing 8. Get information about applications and corresponding agents
db2pd -agents
Database Partition 0 -- Active -- Up 3 days 08:35:42
Agents:
Current agents: 2
Idle agents: 0
Active coord agents: 2
Active agents total: 2
Pooled coord agents: 0
Pooled agents total: 0
Address AppHandl [nod-index] AgentTid Priority Type State
0x04449BC0 34 [000-00034] 3392 0 Coord Inst-Active
0x04449240 30 [000-00030] 2576 0 Coord Inst-Active
ClientPid Userid ClientNm Rowsread Rowswrtn LkTmOt DBName
3916 USER_B db2bp.ex 43 43 NotSet SAMPLE
2524 USER_A db2bp.ex 153 14 NotSet SAMPLE
|
In the db2pd -agents output, the DBA can see the IDs of the users working with application 30 and 34 (column Userid): application 30 is executed by USER_A, and application 34 is executed by USER_B. Such a mapping between application and user ID is only possible if each user has a separate database authorization ID. Generally, it is not possible for applications running on an application server because these applications use connection pooling so that connections are not personalized.
More information about each application is provided by the db2pd option -applications:
Listing 9. Get more information about applications
db2pd -db sample -applications
Database Partition 0 -- Database SAMPLE -- Active -- Up 3 days 08:36:14
Applications:
Address AppHandl [nod-index] NumAgents CoorTid Status
0x04AF8080 34 [000-00024] 1 3940 Lock-wait
0x03841960 30 [000-00020] 1 2548 UOW-Waiting
C-AnchID C-StmtUID L-AnchID L-StmtUID Appid
195 1 0 0 *LOCAL.DB2.061122195637
0 0 60 1 *LOCAL.DB2.061122195609
|
The Status column confirms something the DBA already knows: application 34 is in lock-wait state. But that is nothing new, so the DBA concentrates on columns C-AnchID/C-StmtUID and L-AnchID/L-StmtUID, respectively. "C" stands for current and "L" stands for last anchor ID/statement UID. These IDs can be used to identify the last SQL statement executed by an application as well as the statement currently executed by an application. To do so, db2pd with the -dynamic option is called. This option shows the contents of the database's dynamic statement cache:
Listing 10. Examine contents of dynamic statement cache
db2pd -db sample -dynamic
Database Partition 0 -- Database SAMPLE -- Active -- Up 3 days 08:37:39
Dynamic Cache:
Current Memory Used 187188
Total Heap Size 1271398
Cache Overflow Flag 0
Number of References 2
Number of Statement Inserts 3
Number of Statement Deletes 0
Number of Variation Inserts 2
Number of Statements 3
Dynamic SQL Statements:
Address AnchID StmtUID NumEnv NumVar NumRef NumExe
0x056CEBD0 60 1 1 1 1 1
0x056CE850 180 1 0 0 0 0
0x056CFEA0 195 1 1 1 1 1
Text
UPDATE EMPLOYEE SET BONUS = SALARY * 0.1 WHERE JOB = 'MANAGER'
SET CURRENT LOCALE LC_CTYPE = 'de_DE'
UPDATE EMPLOYEE SET SALARY = SALARY * 0.02
Dynamic SQL Environments:
Address AnchID StmtUID EnvID Iso QOpt Blk
0x056CECD0 60 1 1 CS 5 B
0x056D30A0 195 1 1 CS 5 B
Dynamic SQL Variations:
Address AnchID StmtUID EnvID VarID NumRef Typ
0x056CEEB0 60 1 1 1 1 4
0x056D3220 195 1 1 1 1 4
Lockname
010000000100000001003C0056
01000000010000000100C30056
|
The mapping between the -applications output and the -dynamic output is straightforward:
Application 34 (in lock-wait state) is currently executing the SQL statement identified by current anchor ID 195 and current statement ID 1. In the Dynamic SQL Statements section of the db2pd -dynamic output, those IDs can be mapped to the following SQL statement:
Listing 11. SQL statement executed by application 34
UPDATE EMPLOYEE SET SALARY = SALARY * 0.02
|
The last SQL statement executed by application 30 that is holding the lock was the statement with last anchor ID 60 and last statement ID 1. Those IDs can be mapped to the following SQL statement:
Listing 12. SQL statement executed by application 30
UPDATE EMPLOYEE SET BONUS = SALARY * 0.1 WHERE JOB = 'MANAGER'
|
Note that the db2pd -dynamic output contains another interesting piece of information that is normally difficult to find out: The isolation level of a dynamic SQL statement that was executed is shown in column Iso of the Dynamic SQL Environments section (UR = Uncommitted Read, CS = Cursor Stability, RS = Read Stability, RR = Repeatable Read).
Let's summarize what the DBA discovered regarding the cause for the hang of user B's application:
- The hang is caused by an exclusive row lock on table
EMPLOYEE.
- The transaction owning the lock belongs to an application executed by user A. The transaction of user B is waiting on that lock.
- The two statements colliding are both
UPDATE statements on table EMPLOYEE.
With that information, the DBA can initiate the necessary steps to resolve the lockwait situation, such as advising user A to end the transaction, or forcing user A's application, among other options. In addition, measures can be taken to avoid such situations in the future, for example configuring the DB2 governor to automatically end transactions that run too long.
In the sample scenario, db2pd was executed several times one after the other, each time with a single option. In a real-world scenario, that would not be the case. Instead, db2pd would be called once with all the options introduced above:
Listing 13. Single db2pd call with all options required to analyze a lockwait situation
db2pd -db sample -locks wait showlocks -transactions -agents -applications -dynamic
-file db2pd.out -repeat 15 40
|
The resulting output consists of the outputs for each option in the order the options appear in the db2pd call. Also, note the two additional options at the end of the db2pd call:
-
-file indicates that db2pd output should be written to a file. In the sample call, output is written to file db2pd.out.
-
-repeat indicates that db2pd should be executed every 15 seconds for 40 times (that means for 10 minutes in intervals of 15 seconds). The output of each execution is appended to the end of the file specified by the -file option.
The -file and -repeat options are useful to monitor database activity for a certain period of time. For lockwait analysis, those options help to catch lockwait situations that only exist for a short period of time. For example, if the database parameter LOCKWAIT is set to 20 seconds, a transaction waiting for a lock is rolled back after 20 seconds of wait time. To catch such a lockwait situation, the db2pd interval must be set to an interval shorter than 20 seconds, such as 15 seconds in the example.
Catching locktimeouts that occur only sporadically
Sometimes lockwait situations lead to locktimeouts that cause transactions to be rolled back. The period of time until a lockwait leads to a locktimeout is specified by the database configuration parameter LOCKTIMEOUT. One of the biggest problems concerning locktimeout analysis is the fact that it is not known when the next locktimeout situation will occur. For catching deadlocks, you can create an event monitor for deadlocks. Such a deadlock event monitor writes an entry each time a deadlock occurs. But there is no equivalent event monitor for locktimeouts. So until DB2 9®, the only way to catch a locktimeout was by continuous db2pd or snapshot monitoring (for db2pd, the -file and -repeat options can be used for continuous lock monitoring as explained earlier).
DB2 9 comes with a new mechanism to collect monitor data when database failures or events occur: the db2cos script. For catching locktimeout events, we can configure the database to start the db2cos script each time a locktimeout occurs. Within the db2cos script, db2pd can be called with the same options as already discussed. Let's walk through a sample scenario demonstrating the use of the db2cos script to catch locktimeouts.
For this scenario, assume the DBA set the database locktimeout value to 10 seconds:
Listing 14. Update locktimeout setting
UPDATE DB CFG FOR SAMPLE USING LOCKTIMEOUT 10
|
To start the db2cos script each time a locktimeout occurs, the DBA calls the db2pdcfg utility as follows:
Listing 15. Configure db2cos script invocation using db2pdcfg
db2pdcfg -catch locktimeout count=1
|
The -catch option specifies the failure or event that should automatically lead to an invocation of the db2cos script. For locktimeout events, the string locktimeout can be specified. As an alternative, you could specify the corresponding SQL error code and reason code for locktimeouts:
Listing 16. Alternative db2pdcfg call for catching locktimeouts
db2pdcfg -catch 911,68 count=1
|
Besides certain string values and SQL codes, db2pdcfg also accepts internal DB2 error codes. So there are quite a lot database failures and events that you can catch this way. Locktimeout events are only one example of using of db2pdcfg and db2cos.
The value 1 for the count suboption indicates that the db2cos script should only be executed once a locktimeout event occurs.
db2pdcfg confirms the setting of the error catch with the following output:
Listing 17. db2pdcfg's confirmation of error catch setting
Error Catch #1
Sqlcode: 0
ReasonCode: 0
ZRC: -2146435004
ECF: 0
Component ID: 0
LockName: Not Set
LockType: Not Set
Current Count: 0
Max Count: 1
Bitmap: 0x4A1
Action: Error code catch flag enabled
Action: Execute sqllib/db2cos callout script
Action: Produce stack trace in db2diag.log
|
The setting of the error catch is also reported in the db2diag.log. Instead of opening the db2diag.log in a text editor, the db2diag.log file can be filtered using the db2diag utility (a useful utility for examining the contents of the db2diag.log):
Listing 18. Confirmation of error catch setting in db2diag.log
db2diag -g funcname:=pdErrorCatch
2006-12-18-13.37.25.177000+060 I727480H285 LEVEL: Event
PID : 4648 TID : 3948 PROC : db2syscs.exe
INSTANCE: DB2 NODE : 000
FUNCTION: DB2 UDB, RAS/PD component, pdErrorCatch, probe:30
START : Error catch set for ZRC -2146435004
|
ZRC -2146435004 is the DB2 internal error code for locktimeouts. These can be checked with the following db2diag call:
Listing 19. Check meaning of DB2 internal error codes using db2diag
Using db2pdcfg, the database engine has now been configured to call the db2cos script each time a locktimeout occurs. The db2cos script gathers all the required monitor information to determine the cause of the locktimeout. To do so, the DBA has to modify the db2cos script to call db2pd with the options already known. The db2cos script can be found in the following subdirectory:
- Windows:
DB2 install directory\BIN\db2cos.bat, for example C:\Program Files\IBM\SQLLIB\BIN\db2cos.bat
- UNIX/Linux:
Instance owner home/sqllib/bin/db2cos
The default db2cos.bat script on Microsoft Windows® looks like this:
Listing 20. Contents of default db2cos.bat on Windows
setlocal
:iterargs
if %0. == . goto iterdone
if /i %0. == INSTANCE. set INSTANCE=%1
if /i %0. == DATABASE. set DATABASE=%1
if /i %0. == TIMESTAMP. set TIMESTAMP=%1
if /i %0. == APPID. set APPID=%1
if /i %0. == PID. set PID=%1
if /i %0. == TID. set TID=%1
if /i %0. == DBPART. set DBPART=%1
if /i %0. == PROBE. set PROBE=%1
if /i %0. == FUNCTION. set FUNCTION=%1
if /i %0. == REASON. set REASON=%1
if /i %0. == DESCRIPTION. set DESCRIPTION=%1
if /i %0. == DiAGPATH. set DIAGPATH=%1
shift
goto iterargs
:iterdone
if %DATABASE%. == . goto no_database
db2pd -db %DATABASE% -inst >> %DIAGPATH%\db2cos%PID%%TID%.%DBPART%
goto exit
:no_database
db2pd -inst >> %DIAGPATH%\db2cos%PID%%TID%.%DBPART%
:exit
|
In the case of an event or failure at the database level, the default db2cos script calls db2pd with the -db and -inst options. The DBA replaces the corresponding line with a db2pd call that collects the monitor data required for locktimeout analysis:
Listing 21. Change db2cos script to collect data for locktimeout analysis
if %DATABASE%. == . goto no_database
db2pd -db %DATABASE% -locks wait -transactions -agents -applications -dynamic
>> %DIAGPATH%\db2cos%PID%%TID%.%DBPART%
goto exit
|
Now the db2cos script is prepared and the DBA can wait for the next locktimeout event to occur.
Let's assume the same lock situation, between user A and user B as described before, happens. But this time LOCKTIMEOUT is set and therefore the transaction of user B is rolled back after 10 seconds (LOCKTIMEOUT = 10). User B notifies the DBA that his or her transaction was rolled back with SQL error message -911 and reason code 68 (SQL code -911 / reason code 68 = locktimeout). In turn, the DBA checks the monitor data that was gathered by the automatic invocation of the db2cos script.
First, the DBA calls db2diag with the internal error code for locktimeout to determine the exact time the locktimeout happened:
Listing 22. Check point-in-time for locktimeout event in db2diag.log
db2diag -g data:=-2146435004
2006-12-18-14.27.24.656000+060 I6857H409 LEVEL: Event
PID : 2968 TID : 2932 PROC : db2syscs.exe
INSTANCE: DB2 NODE : 000 DB : SAMPLE
APPHDL : 0-21 APPID: *LOCAL.DB2.061226132544
AUTHID : FECHNER
FUNCTION: DB2 UDB, lock manager, sqlplnfd, probe:999
DATA #1 : <preformatted>
Caught rc -2146435004. Dumping stack trace.
|
The db2diag.log entry shows that a locktimeout occurred at 2006-12-18-14.27.24.656000. As the db2cos script writes its output to a file named db2cos%PID%%TID%.%DBPART% in the %DIAGPATH%, the DBA can expect to find a file db2cos29682932.0 in the instance's diagnostic path:
-
%DIAGPATH% = instance's diagnostic path = on Windows by default C:\Program Files\IBM\SQLLIB\DB2
-
%PID% = process ID = 2968 (as shown in the db2diag.log entry)
-
%TID% = thread ID = 2932 (also shown in the db2diag.log entry)
-
%DBPART% = database partition = 0 (in a non-partitioned database environment)
The contents of that file will resemble the db2pd monitor output examined step-by-step in the first part of this article and will allow the DBA to identify the cause of the locktimeout.
Having caught the locktimeout, the DBA can disable the db2cos script invocation by calling db2pdcfg with the -catch clear option:
Listing 23. Clear error catch settings using db2pdcfg again
db2pdcfg -catch clear
All error catch flag settings cleared.
|
Summary
This article demonstrates how to use the db2pd utility for lockwait monitoring. Going through a sample scenario, it shows how a DBA can identify the cause for concurrency problems by examining the output of different db2pd options. Starting with DB2 9, db2pd can be used in conjunction with the new db2cos script to catch locktimeout events as they happen. You learned how to configure automatic invocation of the db2cos script in the case of locktimeout events. The db2diag utility was introduced -- a useful tool for checking the contents of the db2diag.log.
Resources Learn
- The article "The db2pd tool" (developerWorks, Apr. 2005) by Sam Poon provides an overview of
db2pd's monitoring capabilities.
- The chapter
Monitoring and troubleshooting using db2pd
in the DB2 9 for Linux, UNIX, and Windows documentation contains further usage scenarios for DB2 monitoring with
db2pd.
- Besides an introduction to DB2 data concurrency concepts, the tutorial "DB2 9 Fundamentals exam 730 prep, Part 6: Data concurrency" contains a nice overview of the different kinds of locks used by DB2.
- In the article "Understanding locking in DB2 Universal Database" (developerWorks, Nov. 2005) Lorysa Bond introduces the principles behind DB2's locking strategy.
- The article "Lock avoidance in DB2 UDB V8" (developerWorks, Sept. 2005) by Werner Schuetz explains DB2 concurrency basics and shows how DB2's locking behaviour can be influenced by setting special DB2 registry variables.
- The complete DB2 9 for Linux, UNIX, and Windows documentation online in HTML format: DB2 Information Center
- The DB2 for Linux, UNIX, and Windows technical support site: DB2 Technical Support: From here you can search for APARs, download fixpacks, get DB2 for Linux, UNIX, and Windows documentation in PDF format, etc.
-
Visit the developerWorks resource page for DB2 for Linux, UNIX, and Windows to read articles and tutorials and connect to other resources to expand your DB2 skills.
-
Learn about DB2 Express-C, the no-charge version of DB2 Express Edition for the community.
- Browse the
technology bookstore
for books on these and other technical topics.
Get products and technologies
-
Download a free trial version of DB2 Enterprise 9.
-
Now you can use DB2 for free. Download DB2 Express-C, a no-charge version of DB2 Express Edition for the community that offers the same core data features as DB2 Express Edtion and provides a solid base to build and deploy applications.
- Download
IBM product evaluation versions
and get your hands on application development tools and middleware products from
DB2, Lotus®, Rational®, Tivoli®, and
WebSphere®.
Discuss
About the author  | 
|  |
Dirk Fechner works as an IT Services Specialist for IBM Software Group. His area of expertise is administration of and application development with DB2 UDB on distributed platforms. He has five years of experience with DB2 UDB and is an IBM Certified Advanced DBA and IBM Certified Application Developer. He currently supports administrators, developers, and end-users at DaimlerChrysler on a wide variety of DB2 topics: administrative tasks, application development, and problem determination.
|
Rate this page
|