IBM Support

75 ways to demystify DB2 #47: Techtip : In a rare scenario db2fodc clears out hang situation

Technical Blog Post


Abstract

75 ways to demystify DB2 #47: Techtip : In a rare scenario db2fodc clears out hang situation

Body

On Linux platforms, the DB2 FCM threads (db2fcms and db2fcmr) may hang in the select() system call, possibly because the file descriptor limit of select() has been reached. The symptom is hard to confirm from a stack trace of the FCM threads, because the stack generation tools (db2pd -stack, pstack, gstack, gcore) all clear the hang. With APAR IC81639, the db2fcms and db2fcmr threads are able to use epoll() instead of select(), since epoll() does not have the file descriptor limit that select() has. This behavior is not the default; it is enabled through a registry variable, DB2_FCM_FDPOLL.
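
The descriptor limit of select() can be seen from a small script. The following is a minimal sketch in Python, not DB2 code: the helper name select_can_watch is hypothetical, and the sketch relies on Python's select module rejecting descriptor numbers at or above the platform's FD_SETSIZE (commonly 1024 on Linux) before it calls select().

```python
import select
import socket


def select_can_watch(fd):
    """Return True if select() accepts this descriptor number.

    Hypothetical helper for illustration. Python raises ValueError when
    the number is outside the range that an fd_set can represent.
    """
    try:
        # Zero timeout: poll once and return immediately.
        select.select([fd], [], [], 0)
        return True
    except ValueError:
        # Descriptor number is too large for select() on this platform.
        return False


if __name__ == "__main__":
    a, b = socket.socketpair()
    print("low descriptor accepted:", select_can_watch(a.fileno()))
    # 100000 is an arbitrary number chosen to be well above FD_SETSIZE.
    print("high descriptor accepted:", select_can_watch(100000))
    a.close()
    b.close()
```

A process that holds many sockets can be handed descriptor numbers above that range, which is the situation epoll() is meant to avoid.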
 

In essence, select() (the system call, not the SQL SELECT statement) was apparently missing events. select() can be used to detect whether something is available for reading on a socket. In this case something was available, but select() never reported it, so some threads stayed blocked, behaving as if there were nothing to read on the socket. Using epoll() gives better results than select(), and it is also typically faster when many descriptors are monitored. Interrupting the system call (for example with gstack or db2pd) is enough for it to notice that there is something on the socket. In other words, a signal or any other interruption (pstack and gstack included) clears the hang, because the interrupted system call is then able to see an event that it could not see before.
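
To illustrate the kind of readiness check described here, the following is a minimal sketch that waits for a socket to become readable with epoll() on Linux. It is written in Python for brevity and is not how DB2 implements its FCM threads; the function name wait_readable and the timeout values are assumptions made for the example.

```python
import select
import socket


def wait_readable(sock, timeout_s):
    """Return True if sock has data to read within timeout_s seconds.

    Hypothetical helper for illustration; select.epoll is Linux only.
    """
    ep = select.epoll()
    try:
        # Ask to be notified when the socket becomes readable.
        ep.register(sock.fileno(), select.EPOLLIN)
        events = ep.poll(timeout_s)
        return any(
            fd == sock.fileno() and (mask & select.EPOLLIN)
            for fd, mask in events
        )
    finally:
        ep.close()


if __name__ == "__main__":
    reader, writer = socket.socketpair()
    print("before send:", wait_readable(reader, 0.1))
    writer.send(b"x")
    print("after send:", wait_readable(reader, 1.0))
    reader.close()
    writer.close()
```

Unlike select(), epoll() takes descriptors through registration calls rather than a fixed-size bit set, so the descriptor number itself is not bounded by FD_SETSIZE.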

 

 

 


UID

ibm11141018