IBM Support

Troubleshooting Windows OS Agent Disk Usage Summary view

Technical Blog Post


Abstract

Troubleshooting Windows OS Agent Disk Usage Summary view

Body

The "Disk Usage Summary" view for the Windows OS agent collects data from all the available Windows OS Agent nodes.

If you are experiencing problems with this specific view, for example if it takes more than 10 minutes to complete or if the TEP client hangs when refreshing the workspace that includes this view, this article offers some suggestions to help you understand the root cause and possibly fix it.

--

When a query is executed on several managed nodes, you can experience a performance problem if each agent needs a lot of time to perform the data collection.

In that case the query is too heavy and should be tuned to return only useful data.
However, most of the time this kind of performance problem is caused by one or a few systems that fail to perform the data collection, for instance because the agent is not reachable at all, or it is reachable but frozen or not responding as fast as expected.
This causes a delay on the TEMS side, and therefore on the TEPS and on the TEP client that requested the query output.
In this kind of scenario, the most important task is to identify the system(s) running the affected OS Agents.
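
To illustrate why a single unresponsive agent can dominate the response time of the whole view, here is a minimal sketch in Python. It assumes a simplified model in which the query completes only when the slowest node has answered or a timeout has expired. The node names, the response times and the 600-second timeout are hypothetical values chosen for illustration; they are not taken from any TEMS setting.

```python
def query_completion_time(node_seconds, timeout):
    """Simplified model: the query finishes when the slowest node
    has answered, or when the timeout expires for that node."""
    return max(min(seconds, timeout) for seconds in node_seconds.values())


# Hypothetical response times in seconds; an unreachable agent never answers.
response_times = {
    "Primary:NODE1:NT": 2.0,
    "Primary:NODE2:NT": 3.5,
    "Primary:FAILNODE1:NT": float("inf"),
}

# With the failing node, the whole query waits for the (assumed) timeout.
with_failing = query_completion_time(response_times, timeout=600)

# Without it, the query is as fast as the slowest healthy node.
healthy = {name: s for name, s in response_times.items() if "FAILNODE" not in name}
without_failing = query_completion_time(healthy, timeout=600)

print(with_failing, without_failing)
```

In this model, removing the one failing node brings the completion time down from the timeout to the response time of the slowest healthy agent, which is why finding that node matters more than tuning the query.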

To identify whether one or more OS Agents are having problems that lead to a response delay, we can use the TEMS logs.


Set this trace level in the TEMS configuration file (or set it dynamically using the service console):

ERROR (UNIT:kpxrpcrq ALL)(UNIT:kpxreq ALL)(UNIT:kpxcloc ALL)

 

Restart the TEMS (not needed if the traces have been activated dynamically) and then reproduce the error condition.

If the TEMS fails to contact one or more Windows OS nodes while routing the query request, you may find errors like these in the TEMS logs:

--------------
(4D47AF84.039E-163C:kpxreqi.cpp,144,"RequestImp_destr") RequestImp RES1 Delete handle 2175795485, owner 2167407050, obj@B91ACB0
(4D47AF84.039F-163C:kpxreqds.cpp,482,"Update") Request <2175795485> to node Primary:FAILNODE1:NT now has status 8
(4D47AF84.0399-C4C:kpxreqi.cpp,365,"startFailed") Start Request <2177897180> to Primary:FAILNODE1:NT failed because 210101f8.
(4D47AF84.03A1-C4C:kpxreq.cpp,1573,"Update") Entry
(4D47AF84.03A2-C4C:kpxreq.cpp,1577,"Update") Node Primary:FAILNODE1:NT has changed to status 4
(4D47AF84.03A3-C4C:kpxreq.cpp,1604,"Update") Setting public request error <3004>.
(4D47AF84.03A0-163C:kpxreq.cpp,1573,"Update") Entry
(4D47AF84.03A5-163C:kpxreq.cpp,1577,"Update") Node Primary:FAILNODE1:NT has changed to status 8

--------------

The most important message, the one showing the name of the failing node, is the "startFailed" line: Start Request <2177897180> to Primary:FAILNODE1:NT failed because 210101f8.
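
The trace lines in the excerpt share a common prefix, so they can be split into fields to see when and where each message was logged. The following Python sketch does that. It assumes that the first hexadecimal field is the number of seconds since the Unix epoch, which is not stated in this article, and the function name parse_trace_line is only illustrative.

```python
import re
from datetime import datetime, timezone

# Prefix layout as seen in the excerpt:
# (SECONDS.SEQUENCE-THREAD:source_file,line,"function") message
HEADER_RE = re.compile(
    r'^\((?P<secs>[0-9A-Fa-f]+)\.(?P<seq>[0-9A-Fa-f]+)-(?P<thread>[0-9A-Fa-f]+):'
    r'(?P<source>[^,]+),(?P<line>\d+),"(?P<func>[^"]+)"\)\s*(?P<msg>.*)$'
)


def parse_trace_line(text):
    """Split one trace line into its fields; return None if it does not match."""
    match = HEADER_RE.match(text.strip())
    if match is None:
        return None
    return {
        # Assumption: the first field is hex seconds since the Unix epoch.
        "time": datetime.fromtimestamp(int(match["secs"], 16), tz=timezone.utc),
        "thread": match["thread"],
        "source": match["source"],
        "line": int(match["line"]),
        "function": match["func"],
        "message": match["msg"],
    }


sample = ('(4D47AF84.0399-C4C:kpxreqi.cpp,365,"startFailed") Start Request '
          '<2177897180> to Primary:FAILNODE1:NT failed because 210101f8.')
fields = parse_trace_line(sample)
print(fields["time"].isoformat(), fields["function"], fields["message"])
```

Having the time of each failure makes it easier to match the failing request with the moment the workspace was refreshed in the TEP client.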

So, you can simply search the TEMS logs for the string ":NT failed because" to identify the agent nodes that have not correctly received the requests and that are likely causing the delay or the TEP hang.

There may be a single node or multiple nodes with this error condition.

Once you have identified them, move on to the target nodes to investigate why they are not returning data within the expected response time.
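
When the TEMS log is large or several nodes are involved, the search can be scripted. The following is a minimal Python sketch that counts the "failed because" messages per node. The regular expression is based only on the message format shown in the excerpt above, and the function names failing_nodes and scan_log are illustrative.

```python
import re
from collections import Counter

# Matches the "startFailed" message from the excerpt, for example:
# Start Request <2177897180> to Primary:FAILNODE1:NT failed because 210101f8.
FAILED_RE = re.compile(r"Start Request <\d+> to (\S+:NT) failed because (\w+)")


def failing_nodes(lines):
    """Count how many start requests failed for each Windows OS agent node."""
    counts = Counter()
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def scan_log(path):
    """Read one TEMS log file and return the per-node failure counts."""
    with open(path, errors="replace") as log:
        return failing_nodes(log)


excerpt = [
    '(4D47AF84.0399-C4C:kpxreqi.cpp,365,"startFailed") Start Request '
    '<2177897180> to Primary:FAILNODE1:NT failed because 210101f8.',
    '(4D47AF84.03A1-C4C:kpxreq.cpp,1573,"Update") Entry',
]
for node, hits in failing_nodes(excerpt).most_common():
    print(f"{node}: {hits} failed request(s)")
```

To use it on a real system, call scan_log with the path of each TEMS log file. The nodes with the highest counts are the first candidates to investigate.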

As a double-check, temporarily stop these agents and run the query again.

If you correctly identified the affected nodes, you should see an immediate improvement in the query response time and in the overall performance.

 

Thanks for reading

 



UID

ibm11277020