Technical Blog Post
Abstract
Frequent db2diag.log messages: sqkfChannel::DeliverInboundBuffer
Body
We have recently seen cases where the following message is written frequently to db2diag.log in pureScale environments:
2018-01-10-03.35.00.656437+540 I8443A1349 LEVEL: Error
PID : 18153532 TID : 3193 PROC : db2sysc 1
INSTANCE: db2inst1 NODE : 001
HOSTNAME: host0005
EDUID : 3232 EDUNAME: db2pdbc 1
FUNCTION: DB2 UDB, fast comm manager, sqkfChannel::DeliverInboundBuffer, probe:4717
DATA #1 : String, 51 bytes
Invalid Sequence No. Detected = 2. Expected No. = 1
DATA #2 : signed integer, 2 bytes
0
DATA #3 : unsigned integer, 2 bytes
665
DATA #4 : unsigned integer, 2 bytes
2
CALLSTCK: (Static functions may not be resolved correctly, as they are resolved to the nearest symbol)
[0] 0x090000000AC34B48 DeliverBufferToTargetChannel__19sqkfFastCommManagerFP10sqkfBufferiN2217SQLKF_CHANNEL_PRIP17SQLKF_SESSION_HDLP15sqkfSendConduit + 0x192C
[1] 0x090000000AC35D28 DeliverBufferToTargetChannel__19sqkfFastCommManagerFP10sqkfBufferiN2217SQLKF_CHANNEL_PRIP17SQLKF_SESSION_HDLP15sqkfSendConduit + 0x2B0C
[2] 0x090000000A45ACBC RunEDU__22sqePdbSystemControllerFv + 0x2BFC
[3] 0x090000000A459324 RunEDU__22sqePdbSystemControllerFv + 0x1264
[4] 0x090000000A467E90 RunEDU__22sqePdbSystemControllerFv + 0x334
[5] 0x0900000009B6E724 EDUDriver__9sqzEDUObjFv + 0x3F0
[6] 0x090000000A11A598 sqloEDUEntry + 0x394
[7] 0x090000000097EE10 _pthread_body + 0xF0
[8] 0xFFFFFFFFFFFFFFFC ?unknown + 0xFFFFFFFF
The "sqkfChannel::DeliverInboundBuffer" message means that member 0 received data buffer #2 from member 2 but, for some reason, did not receive buffer #1.
This message is reported by the Fast Communication Manager (FCM) in pureScale environments (and in DPF environments as well), and it indicates a communication error between nodes. All the error means is that FCM was expecting buffer 1 but got buffer 2. By itself it does not report a serious problem; it means that some inter-member communication issue occurred.
After this happens, the request is sent back to the coordinator node to advise it of the issue, and the request is then resubmitted. If there are no other error messages in db2diag.log, the resubmission succeeded and you need not worry about it.
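To make the "expected buffer 1, got buffer 2" check concrete, here is a minimal Python sketch of a receiver that tracks the next expected sequence number per sender. It is only an illustration of the general idea: the class name `InboundChannel` and its `deliver` method are hypothetical, and this is not how FCM is actually implemented inside DB2.

```python
from dataclasses import dataclass, field


@dataclass
class InboundChannel:
    """Toy receiver that tracks the next expected sequence number per sender.

    Hypothetical illustration only; this is not the FCM implementation.
    """
    # Maps sender member number -> next sequence number we expect from it.
    expected: dict = field(default_factory=dict)

    def deliver(self, sender: int, seq_no: int) -> str:
        want = self.expected.get(sender, 1)  # assume numbering starts at 1
        if seq_no != want:
            # Same wording as DATA #1 in the db2diag.log entry.
            return f"Invalid Sequence No. Detected = {seq_no}. Expected No. = {want}"
        self.expected[sender] = want + 1
        return "ok"


channel = InboundChannel()
print(channel.deliver(sender=2, seq_no=2))  # buffer #1 never arrived
print(channel.deliver(sender=2, seq_no=1))  # buffer #1 arrives after resubmission
print(channel.deliver(sender=2, seq_no=2))  # buffer #2 is now accepted
```

In this sketch the first call is rejected because buffer #1 is missing, and the same buffers are accepted once they arrive in order, which mirrors the resubmission described above.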
We have seen this error before during communication time-out errors between members and the CF when the link state was being tested. It can usually be ignored because DB2 rebuilds the problematic link after the error, as it did in most customer cases. We have also seen it when there were network delays or packet drops in the network.
If you see this message continuously in db2diag.log, we suggest checking the network connection between the member 0 and member 2 hosts. Ask your network administrator to test communication between these members for link state problems, network delays, packet drops, or any other connection errors that might surface sporadically and lead to these messages in db2diag.log.
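To judge whether the message is occasional or continuous, it can help to count how often it appears per hour. The following Python sketch does that for text in the record format shown above. The function name `count_by_hour` and the sample text are assumptions made for illustration; in practice you would read the contents of your own db2diag.log into `log_text`.

```python
import re
from collections import Counter

# Each db2diag.log record starts with a timestamp such as
# 2018-01-10-03.35.00.656437+540; group 1 captures the date and hour.
RECORD_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2}-\d{2})\.\d{2}\.\d{2}\.\d{6}[+-]\d+", re.MULTILINE
)
PROBE = "sqkfChannel::DeliverInboundBuffer"


def count_by_hour(log_text: str) -> Counter:
    """Count records that mention PROBE, keyed by 'YYYY-MM-DD-HH'."""
    counts = Counter()
    starts = list(RECORD_START.finditer(log_text))
    for i, match in enumerate(starts):
        # A record runs from its timestamp to the next record's timestamp.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(log_text)
        if PROBE in log_text[match.start():end]:
            counts[match.group(1)] += 1
    return counts


# Shortened sample: the first record is taken from the entry above,
# the second is a made-up placeholder for an unrelated record.
sample = """\
2018-01-10-03.35.00.656437+540 I8443A1349 LEVEL: Error
FUNCTION: DB2 UDB, fast comm manager, sqkfChannel::DeliverInboundBuffer, probe:4717
2018-01-10-04.10.12.000001+540 I9792A500 LEVEL: Info
FUNCTION: (placeholder for an unrelated function)
"""

for hour, n in sorted(count_by_hour(sample).items()):
    print(hour, n)
```

A handful of hits spread over days points to the sporadic case described earlier, while steady counts every hour are a reason to involve the network administrator.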
Also addressed in :
UID
ibm13286083