Technical Blog Post
Abstract
i5OS agent not connecting to RTEMS
Body
Every i5OS monitoring admin knows the issue tracked by APAR IV76801
(see technote https://www-01.ibm.com/support/docview.wss?uid=swg21966093 for details),
which caused a lot of panic when all the i5OS monitoring agents suddenly stopped working.
By now, all i5OS environments are certainly running the patch or an agent version that includes this fix.
Even so, every time we have to investigate a communication issue between an agent and the RTEMS, that APAR is the first thing
that comes to mind.
There is, however, another interesting scenario in which communication between the i5OS agent and the TEMS fails.
I worked on such a case recently.
The message log (KMSOMLOG) showed only "Tivoli Monitoring: i5/OS Agent startup in progress".
The agent RAS1 log (KA4AGENT), on the other hand, contained error messages like:
(5C46D416.0F32-12:kdcc1sr.c,485,"rpc__sar") Connection failure: "ip.spipe:#*:3660", 1C010001:1DE00045, 0, 5(6), FFFF/2, D140831.1:1.1.1.13, tms_ctbs630fp7:d6305a
(5C46D416.0F33-12:kdcl0cl.c,142,"KDCL0_ClientLookup") status=1c020006, "location server unavailable", ncs/KDC1_STC_SERVER_UNAVAILABLE
(5C46D416.0F34-12:kraarreg.cpp,1794,"LookupAndRegisterWithProxy") Unable to connect to broker at ip.spipe:#*: status=0, "success", ncs/KDC1_STC_OK
(5C46D416.0F35-12:kraarreg.cpp,1814,"LookupAndRegisterWithProxy") Exit: 0x0
(5C46D416.0F36-12:kraarreg.cpp,1466,"ConnectProxyUsingCMSLIST") Unable to find running CMS on CT_CMSLIST <IP.SPIPE:(1.2.3.4);>
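When correlating these entries with the RTEMS side, it can help to split each RAS1 line into its fields and turn the leading hexadecimal value into a readable time. The following is a minimal Python sketch, not an IBM tool. The field layout is inferred only from the sample lines above, and reading the first field as Unix epoch seconds in hexadecimal is my assumption; the function name parse_ras1_line is made up for this example.

```python
import re
from datetime import datetime, timezone

# Layout inferred from the sample lines in this post:
# (HEXTIME.SEQ-THREAD:source_file,line,"function") message
RAS1_LINE = re.compile(
    r'^\((?P<time>[0-9A-Fa-f]{8})\.(?P<seq>[0-9A-Fa-f]{4})-(?P<thread>[0-9A-Fa-f]+):'
    r'(?P<source>[^,]+),(?P<line>\d+),"(?P<function>[^"]*)"\)\s*(?P<message>.*)$'
)


def parse_ras1_line(text):
    """Return a dict of fields for one RAS1 trace line, or None if it does not match."""
    match = RAS1_LINE.match(text.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Assumption: the first field is Unix epoch seconds written in hexadecimal.
    fields["utc"] = datetime.fromtimestamp(int(fields["time"], 16), tz=timezone.utc)
    return fields


sample = (
    '(5C46D416.0F33-12:kdcl0cl.c,142,"KDCL0_ClientLookup") status=1c020006, '
    '"location server unavailable", ncs/KDC1_STC_SERVER_UNAVAILABLE'
)
entry = parse_ras1_line(sample)
print(entry["utc"].isoformat(), entry["function"], entry["message"])
```

Lines that do not follow this layout (headers, continuation lines) simply return None, so the function can be applied to every line of a log file copied off the system.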
By setting:
KDE_DEBUG=Y
KDC_DEBUG=Y
I was able to verify that the SSL handshake and the initial communication were working fine.
The agent sent its XID buffer (search in the log for outbound XID buffer) containing information about itself, and the RTEMS responded with
an XID buffer (search in the log for inbound XID buffer) containing information about the RTEMS system.
So the network communication was working, with no port or firewall problems; this was an application issue.
Looking at the error messages, I noticed that the message "Unable to connect to broker at ip.spipe" did not show a real IP address,
just an asterisk:
(5C46D416.0F34-12:kraarreg.cpp,1794,"LookupAndRegisterWithProxy") Unable to connect to broker at ip.spipe:#*: status=0, "success", ncs/KDC1_STC_OK
Searching further in the agent RAS1 log, I also found these messages:
(5C4AB9E4.082A-1C:kdebhba.c,96,"KDEB_HostByAddr") Status 1DE00007=KDE1_STC_NAMEUNAVAILABLE
(5C4AB9E4.082B-1C:kraarreg.cpp,1803,"LookupAndRegisterWithProxy") No CMS's registered with broker ip.spipe:#1.2.3.4.
Something was wrong with the information returned by the RTEMS about the services registered with the broker.
This can happen if the agent is configured to use partitions when they are not actually required,
or when a wrong partition configuration is used in a NAT environment.
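To check quickly whether a RAS1 log shows the same symptoms, a small script can look for the messages quoted in this post. This is only a sketch under my own assumptions: the patterns are taken from this one case, the labels and the function name find_partition_symptoms are invented for the example, and a match is a hint to review the partition setting, not proof of the cause.

```python
import re

# Patterns taken from the messages quoted in this post (one case only).
SYMPTOMS = {
    "broker address shown as '*'": re.compile(r"Unable to connect to broker at ip\.spipe:#\*"),
    "host name lookup failed": re.compile(r"KDE1_STC_NAMEUNAVAILABLE"),
    "no CMS registered with broker": re.compile(r"No CMS's registered with broker"),
}


def find_partition_symptoms(log_text):
    """Map each symptom label to the list of log lines that show it."""
    hits = {label: [] for label in SYMPTOMS}
    for line in log_text.splitlines():
        for label, pattern in SYMPTOMS.items():
            if pattern.search(line):
                hits[label].append(line.strip())
    return hits


sample_log = """
(5C46D416.0F34-12:kraarreg.cpp,1794,"LookupAndRegisterWithProxy") Unable to connect to broker at ip.spipe:#*: status=0, "success", ncs/KDC1_STC_OK
(5C4AB9E4.082A-1C:kdebhba.c,96,"KDEB_HostByAddr") Status 1DE00007=KDE1_STC_NAMEUNAVAILABLE
(5C4AB9E4.082B-1C:kraarreg.cpp,1803,"LookupAndRegisterWithProxy") No CMS's registered with broker ip.spipe:#1.2.3.4.
"""

hits = find_partition_symptoms(sample_log)
for label, lines in hits.items():
    print(f"{label}: {len(lines)} line(s)")
```

In practice you would pass in the text of the KA4AGENT log after copying it to a workstation instead of the inline sample.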
Looking at the agent configuration, I noticed that it was indeed using a partition name:
KDC_PARTITION="PARTIT1"
This was not a NATted network, so this setting was not needed at all.
We reconfigured the agent to remove PARTIT1 from the KDC_PARTITION parameter.
You can do this either by manually removing the value after KDC_PARTITION= in QAUTOTMP/KMSPARM KBBENV,
or from the agent configuration panels by setting the partition field to *NONE.
We then restarted the agent and verified that it connected and registered successfully with the RTEMS, and that it appeared
online in the TEP afterward.
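If you have many agents to review, the same check can be scripted against a copy of the KBBENV settings. The sketch below is illustrative only. It assumes the member content has already been exported to plain text with one NAME=VALUE pair per line, and it treats an empty value or *NONE as "no partition", which is my assumption about how the setting is stored. The function names are made up for this example.

```python
def read_env_settings(text):
    """Parse NAME=VALUE lines into a dict, ignoring lines without '='."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be quoted, as in KDC_PARTITION="PARTIT1".
        settings[key.strip()] = value.strip().strip('"')
    return settings


def partition_in_use(settings):
    """Return True when KDC_PARTITION carries a real partition name."""
    value = settings.get("KDC_PARTITION", "")
    # Assumption: an empty value or *NONE both mean no partition is configured.
    return value.upper() not in ("", "*NONE")


before = 'KDE_DEBUG=Y\nKDC_PARTITION="PARTIT1"\n'
after = 'KDE_DEBUG=Y\nKDC_PARTITION=\n'

print("before fix:", partition_in_use(read_env_settings(before)))
print("after fix:", partition_in_use(read_env_settings(after)))
```

A True result only tells you that a partition name is set; whether it is actually needed still depends on whether the agent sits behind NAT.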
So, if you are experiencing communication problems with your i5OS agent, besides checking whether you have the
code fix for APAR IV76801, also double-check the agent configuration for an unwanted KDC_PARTITION setting.
Hope it helps.
UID
ibm11085271