IBM Support

Large number of record locks held impacts system performance

Troubleshooting


Problem

This document explains how to determine what job(s) are holding a large number of record locks.   

Symptom

System performance is impacted due to the number of record locks active in the partition.  The complaint may be that CPU utilization is high and that all jobs are using more CPU than normal.  The CPU consumption is high, but the CPU usage is spread across all jobs accessing the database.    Job Watcher and PEX TPROF will show the top CPU procedure as rmslDBHLock*.  

Cause

The system has one internal structure that tracks record locks.  This structure is typically referred to as the seize/lock table, but it is not an actual table.  This structure is accessed as jobs obtain and release locks.   The performance of applications accessing the database will decrease as the number of locks in the seize/lock table becomes large.  The resources required for the operating system to track record locks increases as the number of record locks in the partition becomes large.

Environment

All versions of IBM i.

Resolving The Problem

Use one of the two methods below to determine which jobs(s) are holding a large number of record locks:

Method 1: Query the QSYS2/SYSLIMITS  view:  
If commitment control is being used, query the QSYS2/SYSLIMITS view to find the job(s) holding a large number of record locks.

Method 2: Query  the collection services performance data: 
Query the collection services performance data to find the job(s) holding a large number of locks.  This method will work whether or not commitment control is being used. 

An application, that uses commitment control, can more easily hold many more record locks than an application that doesn't. For example, an application without commitment control will read a record, lock it for update, update the record, then read the next record. The record lock only lasts until the next record is read. Commitment control, on the other hand, provides the ability to hold many resources as a single transaction. For example, an application can read, lock, and update 1,000 rows and keep them locked until the application does a COMMIT or ROLLBACK operation. Also, keep in mind that commitment control can be used in both High Level Language programs as well as in SQL. For example, an RPG program could use commitment control and read a group of records, one at a time, perform an update on each, and choose not to issue a COMMIT or ROLLBACK to release the locks until some number of records have been processed. With SQL, a single statement running under commitment control can be the source of many records being locked. For example, UPDATE FILEA SET X=5 WHERE X=8 could update millions of rows. SQL will lock the records as the statement runs. Once the statement completes, the locks will continue to be held until a COMMIT or ROLLBACK operation is issued.

However, one primary example where large number of record locks can be held without the use of commitment control is the SQL MERGE statement.  An SQL MERGE statement will place an update lock on all records in the target table that will be updated or deleted. Even when running with a commit level of *NONE, these locks are obtained and held until the end of the MERGE statement processing. Once all rows affected by the MERGE have been processed, the rows locked in the target table for delete or update operations will be unlocked.  

A large number of locks in the seize/lock table can cause all jobs accessing the database to consume more CPU. A single job or XA transaction holding several hundred thousand record locks can be enough to cause performance issues. And, one or more transactions holding millions of record locks can cause a performance outage. 

The maximum number of records which can be locked to a commit transaction for a job is 500,000,000. The maximum can be set to a lower value via the COMMITMENT_CONTROL_LOCK_LIMIT QAQQINI setting.  

A commitment control transaction holding a large number of record locks can be associated with a job or an XA transaction. The locks may be record update locks due to pending changes or record read locks.

Note: some jobs can be running under commitment control or isolation level of *NONE (*NC) but the level will be raised to *CHG for various reasons. This will cause the locks on the rows to be held until a commit is done.
Method 1: Query the QSYS2/SYSLIMITS  view:  
If commitment control is being used, query the QSYS2/SYSLIMITS view to find the job(s) holding a large number of record locks.

Using QSYS2/SYSLIMITS to get an idea of the number of record locks a job held:
Use the following SQL if the issue is not currently happening.
select
last_change_timestamp,
JOB_NAME,
SBS_NAME,
JOB_STATUS,
AJSTATUS,
CURRENT_VALUE,
MAXVAL,
Sizing_Name
From QSYS2/syslimits
Where limit_ID = '16200'
Order by last_change_timestamp desc
Use the following if the issue is currently happening and you want to look at only active jobs:
select
last_change_timestamp,
JOB_NAME,
SBS_NAME,
JOB_STATUS,
AJSTATUS,
CURRENT_VALUE,
MAXVAL,
Sizing_Name
From QSYS2.syslimits
Where limit_ID = '16200' and
job_name in (
SELECT JOB_NAME 
FROM TABLE(QSYS2.ACTIVE_JOB_INFO()))
Order by last_change_timestamp desc;
Method 2: Query  the collection services performance data: 
Query the collection services performance data to find the job(s) holding a large number of locks.  This method will work whether or not commitment control is being used. 
Using Collection Services data:
1)    Determine if collection services is active:
 -  Run the command CHKPFRCOL.
 -  Place the cursor on the ‘Collection Services active.’ message and press F1.  Note the object from the    message, for example:   The current collection object is QMPGDATA/Q199000002.
- If the message states ‘Collection Services not active.’, then start collection services using the STRPFRCOL command.

2)    Open a Run SQL Scripts session from IBM I ACS
3)    Run the following script, using the collection noted above:
CREATE ALIAS QTEMP/QAPMJOBMI FOR QMPGDATA/QAPMJOBMI (Q199000002);
SELECT INTNUM, DATETIME, JBNAME, JBUSER, JBNBR , JBPSRCDLCK, JBTSRCDLCK FROM qtemp.qapmjobmi ORDER BY JBPSRCDLCK+JBTSRCDLCK desc;
4)    Check the output for jobs holding a large number of ‘Process scoped record locks or Thread scoped record locks’.  Keep in mind that the current performance collection object will typically only go back to midnight.  And, if a job accumulating record locks is held, it will still be holding the record locks and have related overhead.  A held job will only shows in the report for the ‘Interval date time’ when it was actively running. 
Sample output:
image-20220718132614-1
 
Actions to consider:
After determining the jobs holding a large number of record locks, you may make a decision to end the job(s) or let them finish.  Be aware that simply holding the jobs, that have already obtained the large number of record locks, will not reduce the overhead due to the number of record locks already obtained.  If a job is cancelled that is under commitment control, then the job will go into rollback.  The rollback can be cancelled,  if you have a way to recover or restore the file. See this document: How to end a rollback
To prevent the issue from happening again, consider limiting the maximum number of record locks taken per transaction by using the QAQQINI file. See Question number 7 in this document: Commitment Control. Also see Managing Transaction Size.

System health services provides a way to monitor for large commitment control transactions.  You can add a trigger program over the System Limits Table and monitor for the maximum number of rows locked to a unit of work.  An entry will be put out for every 10,000 rows that are locked.  Action can be taken based on the number of rows locked. Notification can be sent through email or a message to sent to QSYSOPR.  Find more information here:  https://www.ibm.com/docs/en/i/7.3?topic=services-system-health
 
Apply the PTF for APAR MA49150 and monitor for warnings that the lock table is nearing capacity.

Keyword: Seize/Lock table

[{"Type":"MASTER","Line of Business":{"code":"LOB57","label":"Power"},"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SWG60","label":"IBM i"},"ARM Category":[{"code":"a8m0z0000000CHoAAM","label":"Performance"}],"ARM Case Number":"","Platform":[{"code":"PF012","label":"IBM i"}],"Version":"All Versions"}]

Historical Number

472147883

Document Information

Modified date:
20 July 2022

UID

nas8N1018804