Technical Blog Post
Historical data is not seen.
We recently had a problem come in where a customer was getting frequent
disk space utilization alerts from some servers.
The other fact given was that summarization and pruning were not working for these servers.
Linking these together, it sounded like the historical data was not being collected from the agents and was therefore filling the disk.
To investigate this, the agent logs were looked at first.
Looking at the OS agent for the machine, errors like below were seen:
(5747637C.0001-12:khdxdacl.cpp,765,"resolveServerAddress") Local lookup requested: ip.pipe:<fully qualified name of machine>
(5747637C.0002-12:khdxdacl.cpp,887,"resolveServerUsingLB") Looking up annotation "<TEMS name of machine>": 18
(5747637C.0003-12:khdxdacl.cpp,887,"resolveServerUsingLB") Looking up annotation "Candle_Warehouse_Proxy": 18
(5747637C.0004-12:khdxdacl.cpp,820,"resolveServerAddress") Warehouse proxy not registered
(5747637C.0005-12:khdxdacl.cpp,627,"routeExportRequest") Export for object <ITM_Audit> (table KRAAUDIT appl KPX) failed in createRouteRequest, Status = 8.
These errors indicate that the agent cannot connect to or find a Warehouse Proxy agent (WPA).
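When there are many agent logs to go through, a small script can flag the two key messages quoted above. The following is only a minimal sketch: the function name scan_agent_log is made up for illustration, and the patterns are taken from the log excerpts in this post, so the wording may differ at other ITM levels.

```python
import re

# Patterns copied from the log excerpts in this post (wording may vary by ITM level).
NOT_REGISTERED = re.compile(r"Warehouse proxy not registered")
EXPORT_FAILED = re.compile(
    r"Export for object <(?P<obj>[^>]+)> .*failed in createRouteRequest"
)


def scan_agent_log(lines):
    """Count 'not registered' messages and collect objects whose export failed.

    scan_agent_log is a hypothetical helper, not part of ITM.
    """
    not_registered = 0
    failed_objects = set()
    for line in lines:
        if NOT_REGISTERED.search(line):
            not_registered += 1
        match = EXPORT_FAILED.search(line)
        if match:
            failed_objects.add(match.group("obj"))
    return not_registered, sorted(failed_objects)
```

The function takes any iterable of lines, so an open log file can be passed to it directly. A non-zero count together with a list of failed objects points at the WPA route problem described here.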
A failure to find a WPA can have a number of different causes, so the next step is to check the WPA logs.
In these logs, check for error messages about createRouteRequest.
The issues to review for these messages are firewalls with closed ports and the configuration of KHD_WAREHOUSE_TEMS_LIST.
There can be one WPA in an environment or several.
If there is more than one WPA, then the parameter KHD_WAREHOUSE_TEMS_LIST sets which TEMS each WPA serves.
This parameter is found in the KHDENV or hd.ini file for the WPA.
For this WPA a list of TEMSes was given, but the names were wrong. The name given has to be the monitoring server instance name.
In the case seen, the name was given as TEMS1_RTEMS when it should have been RTEMS_TEMS1. This meant that the ITM code could not find the instance, so no route could be found to the WPA.
Check that the entries in the KHD_WAREHOUSE_TEMS_LIST parameter are the correct names for the TEMSes; these names are set at installation.
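The name check can be sketched as a small script that reads the parameter from the WPA configuration text and compares each entry with the instance names you know to be correct. This is an illustration only: parse_tems_list and unknown_tems are hypothetical helpers, and the handling of separators (spaces, commas or semicolons), surrounding quotes and comment markers is an assumption that should be checked against the actual file.

```python
import re


def parse_tems_list(config_text):
    """Return the names on the KHD_WAREHOUSE_TEMS_LIST line, or [] if it is absent."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith(("#", "*")):  # assumed comment markers
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "KHD_WAREHOUSE_TEMS_LIST":
            # Assumption: optional quotes, names split by spaces, commas or semicolons.
            cleaned = value.strip().strip("'\"")
            return [name for name in re.split(r"[\s,;]+", cleaned) if name]
    return []


def unknown_tems(configured, known_instances):
    """Entries in the list that match no known monitoring server instance name."""
    known = set(known_instances)
    return [name for name in configured if name not in known]
```

With the names from the case above, a list containing TEMS1_RTEMS checked against a known instance RTEMS_TEMS1 would report TEMS1_RTEMS as unknown, which is exactly the mismatch that stopped the route from being found.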
Once the list is set correctly and the WPA has been restarted, historical information is sent to the warehouse
and the size of the files reduces.
Just as an extra to this....
There have been a number of problems where the process of historical collection was not understood. Understanding it helps when tracing through any problem, so here is a very quick summary of historical data collection:
Historical collection is configured at the TEPS, and the settings are sent to the agents in the form of situations whose names start with UADVISOR_*, which state what is to be collected.
Historical data files are usually kept on the agent, but can be configured to be saved on the TEMS.
For each attribute group there are two files, for example KHDLOADST and KHDLOADST.hdr for one of the WPA collections.
The WPA is the agent that collects the data from the agents and inserts it into the ITM warehouse database.
The data is not removed from the agent (or TEMS) until it has been successfully inserted into the warehouse database.
Therefore, if there are problems getting the data to the warehouse, the files on the agent will continue to grow in size. On the positive side, no data is lost.
The historical data files always contain up to 24 hours of data, so once collection has been running for over 24 hours the file size should stay fairly constant. If the size keeps growing, this can be an indication of an issue with the data being inserted into the warehouse.
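The "should stay fairly constant after 24 hours" rule can be turned into a simple check on file size samples. This is a rough sketch under stated assumptions: still_growing is a made-up helper, the samples are (hours since collection started, size in bytes) pairs gathered however suits your environment, and the 10% tolerance is an arbitrary illustrative value, not an ITM setting.

```python
def still_growing(samples, settle_hours=24.0, tolerance=0.10):
    """Return True if the file keeps growing after the settle period.

    samples: list of (hours_since_collection_start, size_in_bytes), ordered by time.
    settle_hours and tolerance are illustrative defaults, not ITM values.
    """
    settled = [size for hours, size in samples if hours >= settle_hours]
    if len(settled) < 2:
        return False  # not enough samples after the settle period to judge
    baseline, latest = settled[0], settled[-1]
    # Flag growth beyond the tolerance relative to the first settled sample.
    return latest > baseline * (1 + tolerance)
```

A file that levels off after the first day returns False, while one that is several times larger two days later returns True and is worth following up on the WPA side.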
Once the data is in the warehouse, the S&P agent summarizes and prunes it.
The S&P agent is responsible for the tables with suffixes such as _H and _D on the end of the name.
For example, NT_Logical_Disk_D is the Windows agent logical disk data summarized daily.
The WPA only puts data into the default tables, for example NT_Logical_Disk.
Therefore, knowing which tables have up-to-date data in the warehouse database can help to locate where the issue is likely to be.
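That reasoning can be written down as a small decision helper. It is only a sketch: locate_problem is a hypothetical function, the latest timestamps for the detailed table (for example NT_Logical_Disk) and the summarized table (for example NT_Logical_Disk_D) are assumed to have been looked up in the warehouse database beforehand, and the age thresholds are illustrative values that depend on your collection and summarization settings.

```python
from datetime import datetime, timedelta


def locate_problem(raw_latest, summary_latest, now,
                   raw_max_age=timedelta(hours=2),
                   summary_max_age=timedelta(days=2)):
    """Suggest where to look, given the newest row time in each table.

    raw_latest / summary_latest: datetime of the newest row, or None if the table is empty.
    The max-age thresholds are hypothetical defaults.
    """
    if raw_latest is None or now - raw_latest > raw_max_age:
        # The WPA fills the detailed tables, so stale data here points upstream.
        return "check agent and WPA: detailed table is stale"
    if summary_latest is None or now - summary_latest > summary_max_age:
        # Detailed data is arriving, so the S&P agent is the likely suspect.
        return "check S&P agent: summarized table is stale"
    return "both tables look current"
```

In the case described in this post, the detailed tables would have been stale, which points back at the agent-to-WPA route rather than at the S&P agent.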