IBM Support

MQ Configuration Agent Hangs Due to Communication Error

Troubleshooting


Problem

Running View Discrepancies against any object causes the MQ Configuration agent to hang. The hang is the result of a communication error.

Symptom

The non-responsive agent causes the entire configuration product to hang. Configuration requests that require access to WebSphere MQ data may be unresponsive for up to 10 minutes.

Received the following messages from TEP client:




Suggested agent trace options:
KDC_DEBUG=Y
KDE_DEBUG=Y
KBB_RAS1=ERROR (UNIT:KRA STATE) (UNIT:KCF08 ALL) (UNIT:KMCCAGGR ALL)(UNIT:KMCCMIMI ALL) (UNIT:KMCCAGDC ALL) (UNIT:KMCCMQTK ALL)

Sample messages with the above agent tracing:

In the agent log:
(49A94FCF.0000-5:kdepsnd.c,84,"KDEP_Send") Status 1DE0004D=KDE1_STC_INVALIDTRANSPORTCORRELATOR
(49A94FCF.0001-5:kdcc1sr.c,984,"rpc__sar") Endpoint unavailable: "ip.pipe:#<your_ip_addr>[1918]", 120, 131(2), FEB0/249, 1.1.1.1, d6062a
(49A94FCF.0002-5:kdepsnd.c,84,"KDEP_Send") Status 1DE0004D=KDE1_STC_INVALIDTRANSPORTCORRELATOR
(49A94FCF.0003-5:kraarpcm.cpp,497,"evaluateStatus") RPC call Sample for <1144001786,1048809> failed, status = 1c010001
(49A94FCF.0004-6:kcf00agt.cpp,292,"TakeSampleDestructor") Agent Disconnected from Server Unknown.
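
As a rough aid for spotting this signature in a large agent log, the following Python sketch scans log text for the two indicators shown in the sample: the KDE1_STC_INVALIDTRANSPORTCORRELATOR status and an "Endpoint unavailable" entry. It assumes that the leading hexadecimal field of each trace entry is a timestamp in seconds since the UNIX epoch. The function names are hypothetical and are not part of any IBM tooling.

```python
import re
from datetime import datetime, timezone

# Header of a trace entry, e.g. (49A94FCF.0000-5:kdepsnd.c,84,"KDEP_Send")
ENTRY_HEADER = re.compile(
    r'\(([0-9A-Fa-f]{8})\.[0-9A-Fa-f]{4}-[0-9A-Fa-f]+:([^,\s]+),(\d+),"([^"]+)"\)'
)

# Strings taken from the sample agent log in this document.
INDICATORS = ("KDE1_STC_INVALIDTRANSPORTCORRELATOR", "Endpoint unavailable")


def decode_timestamp(hex_seconds):
    """Assumption: the hex field is seconds since the UNIX epoch (UTC)."""
    return datetime.fromtimestamp(int(hex_seconds, 16), tz=timezone.utc)


def find_comm_errors(log_text):
    """Return (timestamp, source file, function, indicator) for matching entries.

    An entry's body runs up to the next entry header, so entries that wrap
    across several lines are still handled.
    """
    hits = []
    headers = list(ENTRY_HEADER.finditer(log_text))
    for i, header in enumerate(headers):
        end = headers[i + 1].start() if i + 1 < len(headers) else len(log_text)
        body = log_text[header.end():end]
        for indicator in INDICATORS:
            if indicator in body:
                hits.append((decode_timestamp(header.group(1)),
                             header.group(2), header.group(4), indicator))
    return hits
```

Passing the contents of the agent log to find_comm_errors returns one tuple per matching entry, which can then be compared by time against the messages in the TEMS log.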

In the TEMS log:

Every 10 minutes:
KCFCM004E Internal Error 11:(FETCH_FAILED)- Function CMConfigAgent::bringOnline - RC 0

And possibly:

KCFCM005E Error from RCA. Function = CMConfigAgent::bringOnline
RC 514 (x'202') Reason 0 (x'0') Name = SYC1::RCACFG
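
When the KCFCM005E message is present, the affected agent can be read from it. The Python sketch below collects the Name value from each KCFCM005E message and counts the KCFCM004E FETCH_FAILED messages. It assumes the messages look like the samples above, that the Name field is what identifies the agent, and that a message may wrap onto a second line. The function names are hypothetical.

```python
import re

# KCFCM005E ... Name = SYC1::RCACFG (the message may span more than one line,
# so the search looks ahead a bounded number of characters across newlines).
KCFCM005E_NAME = re.compile(r"KCFCM005E\b.{0,300}?\bName\s*=\s*(\S+)", re.DOTALL)


def agents_named_in_kcfcm005e(tems_log_text):
    """Return the distinct Name values from KCFCM005E messages, in first-seen order."""
    names = []
    for match in KCFCM005E_NAME.finditer(tems_log_text):
        name = match.group(1)
        if name not in names:
            names.append(name)
    return names


def count_fetch_failures(tems_log_text):
    """Count lines carrying the KCFCM004E FETCH_FAILED message."""
    return sum(
        1 for line in tems_log_text.splitlines()
        if "KCFCM004E" in line and "FETCH_FAILED" in line
    )
```

If agents_named_in_kcfcm005e returns an empty list while count_fetch_failures keeps growing, the agent has to be identified from the CMS trace instead.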


How to determine which agent node is hung (if no KCFCM005E message is present in the TEMS log to identify the problematic agent):
Keep the (UNIT:kpxreq ALL)(UNIT:kpxreqds ALL) trace active on the CMS. When the hang occurs, numerous timeout messages appear in the CMS log:

kpxreqds.cpp,1217,"timeout") Timeout for <11534543> *.KMCPRCA.

Search the CMS log for the request number given in the timeout message (11534543 in this example). The target of this request is the hung agent:
kpxreq.cpp,257,"AddTarget") Adding target <gbw0at42.gothenburg.vcc.::RCACFG> to req *.KMCP

Recycling the identified agent resolves the problem until maintenance can be applied.
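
The CMS log search described above can be sketched in Python as follows. The sketch collects the request numbers from the timeout messages, then looks for AddTarget entries that mention the same number. Because the AddTarget sample above is truncated, it is an assumption that the request number appears in the same entry as the target; if it does not, use related_entries to list every entry for the request and inspect the surrounding lines by hand. All function names are hypothetical.

```python
import re

# Patterns based on the sample CMS log messages in this document.
TIMEOUT = re.compile(r'"timeout"\)\s+Timeout for <(\d+)>')
ADD_TARGET = re.compile(r'"AddTarget"\)\s+Adding target <([^>]+)>')


def timed_out_requests(lines):
    """Return the distinct request numbers from timeout messages, in order."""
    requests = []
    for line in lines:
        match = TIMEOUT.search(line)
        if match and match.group(1) not in requests:
            requests.append(match.group(1))
    return requests


def related_entries(lines, request):
    """Return every line that mentions the request number as a whole number."""
    pattern = re.compile(r"(?<!\d)" + re.escape(request) + r"(?!\d)")
    return [line for line in lines if pattern.search(line)]


def candidate_targets(lines):
    """Map each timed-out request to the targets named in related AddTarget entries.

    Assumption: the AddTarget entry carries the request number on the same line.
    """
    lines = list(lines)
    result = {}
    for request in timed_out_requests(lines):
        targets = []
        for line in related_entries(lines, request):
            match = ADD_TARGET.search(line)
            if match and match.group(1) not in targets:
                targets.append(match.group(1))
        result[request] = targets
    return result
```

A typical use is to read the CMS log into a list of lines and pass it to candidate_targets. A request that times out repeatedly and always maps to the same target points to the agent to recycle.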

Resolving The Problem

Bring the TEMS and agent maintenance up to ITM 6.1.0 Fix Pack 6 IF2 or later.

The agent hang is caused by the communications issue described in APAR IZ03993. The problem does not always occur because it depends on system load and timing, but when it does, it can cause the symptoms described above.

Product: Tivoli OMEGAMON XE for Messaging for Distributed Systems
Component: OMEGAMON XE WEBSPHERE MQ CONFIGURATION
Platform: AIX, HP-UX, Linux, Solaris, Windows
Version: 7.0, 7.0.1, 7.1.0, 7.3.0
Business Unit: Cloud & Data Platform
Line of Business: Automation

Document Information

Modified date:
17 June 2018

UID

swg21381619