IBM Support

How to diagnose hang in OPEN()/FCNTL() system call?

Question & Answer


Question

How to diagnose hang in OPEN()/FCNTL() system call?

Answer

It can happen that some process would hang in the open() or fnctl() system call.
This usually has to do with file locking issue. File locking is required
sometimes on Unix files to guarantee the system integrity. It is quite common
for programs to wait until the lock is available. Therefore if the lock holder
does not release the lock on the file then the waiter will hang.

Hangs can happen in the backend (db2sysc) but also in other process (like db2bp).
Looking at a stack trace of the process hanging will show open() or fcntl()
at the top of the stack. For example:

[9000000000318F8]open+0000F8
[9000000000308A0]open64+000040
[900000005D16AF0]sqloopenp+000930
...

fcntl() at 0x900000000037924
sqloflock() at 0x90000002da5efe0
sqloopenp() at 0x90000002da5db04
...

While there might be various reasons for open() or fcntl() to hang a common
one is that some other process is already holding a lock on that same file.
The idea of course is to find out what file it is and what process is holding
the lock on that file.

The steps to solve this locking issue are:

- Find the file we are trying to lock.
- Find who is already holding a lock on that file.


- 1 - Find the file we are trying to lock -

There are various ways to find out the name of the file. One way is to generate
a core file of the hanging process. Analysing the core file will most likely
lead to finding the file name but this requires some assembly skills and a good
understanding of how stack works. So we will not look at this method. However
collecting a core file of the hanging process is a very good idea because
someone else might be able to analyse it for you and recover the file name.


- Using DB2 traces.

Another way, when this happens with DB2 related process is to use the traces.
To do that you need to trace the two interesting calls we can see in the stack
outputs above, that is 'sqloopenp()' and 'sqloflock()'. Imagine that 'db2start'
hangs for a few minutes before completing. Using some tool like 'procstack'
or 'pstack' you collected a few stacks and found out that the process was
always in 'fcntl()'. So you can suspect the process to be hanging waiting for
a lock on a file. One thing you can do is start the traces:

# db2trc on -t -Madd sqloopenp -Madd sqloflock

In this case after a few minutes the hang disappear. At that time you dump
the traces using once again 'db2trc':

# db2trc dump trc.raw
# db2trc stop
# db2trc off
# db2trc flw -t trc.raw trc.flw
# db2trc fmt trc.raw trc.fmt

In the 'trc.flw' file you see the following:

3 0.000613000 sqloopenp entry
4 0.000629000 | sqloflock entry
15 178.851583000 | sqloflock exit
16 178.851595000 sqloopenp exit

So we spent '178' seconds trying to open a file. And by looking at the 'trc.fmt'
output we can find the file name:

3 entry DB2 UDB oper system services sqloopenp cei (1.3.15.842.2)
pid 17046 tid 46912841521984 cpid -1 node -1 sec 0 nsec 613000

bytes 75

Data1 (PD_TYPE_FILE_NAME,27) File name:
/home/dalla/sqllib/db2systm

So here using DB2 traces we could identify the file name. The file we are
waiting to get a lock on is '/home/dalla/sqllib/db2systm'.


- Using 'truss/strace'.

Utilities like 'truss/strace' can be used for gathering system calls performed
by various process. They are powerful but unfortunately very slow because
of the heavy interaction between the tool the Kernel and the target process.
Also soem programs like 'db2start' might have special permissions like 'suid'
bit that might prevent the tool from executing by another user than 'root'.
However this can be used for other programs as well and the approach is similar.
Here we track the 'open()' and 'fcntl()' calls only using a timestamp on a
small program that will attempt to lock the same file:

# strace -f -o strace.out -T -e open,fcntl chkfile3 /home/dalla/sqllib/db2systm

As above, once the target program resumed we can interrupt the trace and
look at the 'strace.out' file to find out the file name:

19650 open("/home/dalla/sqllib/db2systm", O_RDWR) = 3 <0.000029>
19650 fcntl(3, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0 <122.958504>

We waited 122 seconds to get a lock on the file with descriptor 'fd = 3'.
The previous line shows that the descriptor 'fd = 3' is for a file named
'/home/dalla/sqllib/db2systm'.


- Other ways.

Tools like probevue, dtrace, systemtap offer a better solution. They are much
faster than 'strace/truss' or any debugger because they is no interaction with
the target process. Also they can cover all process running on the machine.
If the DB2 traces don't provide you with the answer this choice would be the
preferred one. We will see a quick example below.


- 2 - Find who is already holding a lock on that file -

Once you identified the file that is causing the issue the next step is to
find who might be holding a lock on that file. Note that for files that
are stored on NFS (Networked File System) the process holding the lock might
be running on another machine.


- Using DB2 traces.

If the holder is another DB2 related process the traces would most likely allow
you to find out which one using the same technic as above. However the process
holding the lock might be something not related to DB2 like a security or backup
tool running on the same machine.


- Using 'truss/strace'.

In this case strace or truss would most likely not help. The reason is that
you do not know what process to trace and obviously it wouldn't make sense
to start truss or strace on every process running on the machine.


- Other ways.

We talked above about probevue, dtrace and systemtap. Those tools would be
very handy here. They enable some probes of your choice in the Kernel and this
means that any process hitting those probes will be reported. If you enable
only a few probes, here 'open()' and 'fcntl()' [for locking only] they should
not have impact on the system performance. The only thing is that if you let
them run long enough they might create large outputs, but nothing you cannot
handle using 'pipe' to some program that would smartly keep only the latest
output files.

So let's try to see a simple way to track who might be locking the small file
'/usr2/as_support/dalla/aio1.out'. We will use a small C program to lock that
file and see if we can catch it.

The tools like probevue, dtrace and systemtap are a bit more complex than
truss or strace and require a bit of practice. In this case we will use
simple examples that could be enhanced for better performance in case they
would be used on a production system. But here we simply want to give an
overview of what they can do.


- Systemtap [linux]

The following script, once again, is very simple. If it would to be used
on a production system it should be modified for example to handle fcntl()
probes only for the file of interest.

#! /usr/bin/env stap

/*
* chklock8.st: Track lock manipulations on a given file.
*
* Run as user 'root' using the following command line:
*
* stap -o chkfile8.out chkfile8.st
*
*
* dalla
*/

global myfilename
global inopen
global infcntl
global open_fname
global open_flags
global fcntl_fd
global fcntl_str


/*
* 'myfilename' should be the name of the file we want to
* track locks for.
*/
probe begin
{
myfilename = "/usr2/as_support/dalla/aio1.out";

printf("[info] start tracing %s at %d\n", myfilename, gettimeofday_us());
}


/*
* We consider only the 'open()' calls when they affect the
* given file as locking requires the file to be opened.
*/
probe syscall.open
{
fname = user_string($filename);

if (isinstr(fname, myfilename) != 0) {
inopen[tid()] = 1;
open_fname[tid()] = fname;
open_flags[tid()] = $flags;
}
}


/*
* When we exit the 'open()' call print the info and fd.
*/
probe syscall.open.return
{
if (inopen[tid()] == 1) {
printf("[%s - %d - %d] open(%s, 0x%08x) = %d\n",
execname(), pid(), tid(),
open_fname[tid()], open_flags[tid()], $return);
inopen[tid()] = 0;
}
}


/*
* When we hit fcntl() we check that this is for a lock request.
*/
probe syscall.fcntl
{
fcntl_fd[tid()] = $fd;
fcntl_str[tid()] = _fcntl_cmd_str($cmd);
infcntl[tid()] = 1;
}

probe syscall.fcntl.return
{
if (infcntl[tid()] == 1) {
printf("[%s - %d - %d] fcntl(%d, %s) = %d\n",
execname(), pid(), tid(),
fcntl_fd[tid()],
fcntl_str[tid()], $return);
infcntl[tid()] = 0;
}
}

probe end
{
printf("[info] start tracing %s at %d\n",
myfilename, gettimeofday_us());

delete myfilename;
delete inopen;
delete infcntl;
delete open_fname;
delete open_flags;
delete fcntl_fd;
delete fcntl_str;
exit();
}

Once we hit the lock issue we can interrupt the systemtap script and look
at the output file to see if we can spot any process locking our file
named '/usr2/as_support/dalla/aio1.out':

...
[pcscd - 4326 - 4371] fcntl(7, F_SETFD) = 0
[chkfile3 - 11685 - 11685] open(/usr2/as_support/dalla/aio1.out, 0x00000002) = 3
[chkfile3 - 11685 - 11685] fcntl(3, F_SETLKW) = 0
[pcscd - 4326 - 4371] fcntl(7, F_SETFD) = 0
...

We find it. This same script could be used, slightly modified, to find out
who would hold a lock preventing DB2 from processing further.


- Probevue [aix]

Similar program for AIX now, using probevue. Once again this is a simple
example that should most likely be enhanced it it was to run on a production
system. However for our example it will do:

/*
* chkfile8.pb: Track locking activity on a given file.
*
* Run as user 'root' using the following command line:
*
* probevue -t 10 -e 75 -s 64 -o chkfile8.out chkfile8.pb
*
*
* dalla
*/

int open(char *, int);
int kfcntl(int, int);

__thread int in_open;
__thread char *open_path;
__thread int open_mode;

__thread int in_kfcntl;
__thread int kfcntl_fd;
__thread int kfnctl_cmd;



@@syscall:*:open:entry
{
__auto String fname[256];

fname = get_userstring(__arg1, -1);

if (strstr(fname, "/home/dalla/tmp/shmat.c")) {
thread:in_open = 1;
thread:open_path = __arg1;
thread:open_mode = __arg2;
}
}

@@syscall:*:open:exit
when (thread:in_open == 1)
{
__auto String fname[256];

fname = get_userstring(thread:open_path, -1);

printf("[%s - %ld - %ld] open(%s, 0x%08x) = %d [errno = %d]\n",
__pname, __pid, __tid, fname, thread:open_mode, __rv, __errno);

thread:in_open = 0;
}


/*
* Here we are only interested in F_SETLK64 and F_SETLKW64 (fcntl.h)
*/
@@syscallx:*:kfcntl:entry
{
if ((__arg2 == 12) || (__arg2 == 13)) {
thread:in_kfcntl = 1;
thread:kfcntl_fd = __arg1;
thread:kfcntl_cmd = __arg2;
}
}

@@syscallx:*:kfcntl:exit
when (thread:in_kfcntl == 1)
{
printf("[%s - %ld - %ld] kfcntl(%d, %d) = %d [errno = %d]\n",
__pname, __pid, __tid, thread:kfcntl_fd, thread:kfcntl_cmd, __rv, __errno);

thread:in_kfcntl = 0;
}

Same approach, once we had the hang we can check the output file to detect
who was holding a lock. Note that scripts can be enhanced with timestamps
as well to make things more readable if needed.

[chkfile3 - 15335586 - 32047187] open(/home/dalla/tmp/shmat.c, 0x00000002) = 3 [errno = 0]
[chkfile3 - 15335586 - 32047187] kfcntl(3, 13) = 0 [errno = 0]

[{"Product":{"code":"SSEPGG","label":"Db2 for Linux, UNIX and Windows"},"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Component":"Performance - Hang","Platform":[{"code":"PF002","label":"AIX"},{"code":"PF010","label":"HP-UX"},{"code":"PF016","label":"Linux"},{"code":"PF027","label":"Solaris"}],"Version":"10.1;10.5;9.7","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
16 June 2018

UID

swg21984385