Technical Blog Post
Debugging the HTTP (HU) agent when it does not connect to the APM dashboard
So you've installed the APM 814 HTTP agent to monitor your Apache IHS server, and it's showing as "Not Running" on the APM dashboard. Here are some steps you can take to get it working again.
1. Check that the Apache server and the HU agent are both up and running:
Go to your $APM_HOME/agent/bin dir and run two commands: ps -aef | grep -i httpd, then cinfo -r. The first command lists all the Apache IHS server processes. If you started the server as root, root owns the parent process and you will see child processes running as apache or nobody. If the HU agent started successfully, cinfo -r lists the agent's pid, owner and start time.
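If the ps listing is long, a small helper can summarise who owns the httpd processes. This is only a sketch: the function name httpd_owners is my own, and it assumes your Apache processes have httpd in their command name.

```shell
# Summarise httpd processes by owning user.
# Reads "user command" pairs on stdin, one process per line.
httpd_owners() {
    awk '$2 ~ /httpd/ { count[$1]++ }
         END { for (u in count) printf "%s %d\n", u, count[u] }' | sort
}

# Usage on a live system (each -o field gets its own option so that the
# empty header applies to that field only):
#   ps -e -o user= -o comm= | httpd_owners
```

With a server started as root you would expect one line for root and one line for apache or nobody.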
2. Check if all the config files are created correctly:
When you install the HU agent, it creates the khu.etc.httpd.conf.httpd.conf and khu_cps.properties files in the $APM_HOME/agent/tmp/khu dir. Make sure these files exist. Do not edit them, as they are generated by default when you install the agent. Note that the name of the conf file depends on the path where your Apache server config resides. In my case, it's under my /etc/httpd/conf dir, hence my generated file is called khu.etc.httpd.conf.httpd.conf. Yours will differ if your Apache config resides elsewhere on your system.
In addition to these two config files, you will also find the hu_dd.properties and hu.environment files in the $APM_HOME/config dir.
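To work out which generated file name to look for on your own system, here is a minimal sketch. It assumes the name is simply khu followed by the Apache config path with every / replaced by a dot. That rule is inferred from the single example above, so treat it as a guess, and note that khu_conf_name is a made-up helper name.

```shell
# Assumption: generated name = "khu" + Apache config path with "/" -> ".".
# Inferred from one example only; verify against your own tmp/khu directory.
khu_conf_name() {
    printf 'khu%s\n' "$(printf '%s' "$1" | tr '/' '.')"
}

# Usage:
#   ls -l "$APM_HOME/agent/tmp/khu/$(khu_conf_name /etc/httpd/conf/httpd.conf)"
#   ls -l "$APM_HOME/agent/tmp/khu/khu_cps.properties"
```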
Also, when you start the HTTP agent, it creates some files in the $APM_HOME/agent/tmp/khu/discovery/<md5-dir> directory. Never edit these files; they are auto-generated when you start the agent. If you don't see these files, restart your agent.
3. Check permissions:
Check the permissions of all the directories and sub-directories leading up to the config files. They should be readable at the very least. If the agent is started as non-root and the config files belong to a different user, check that the agent's user can still read them. Permission problems show up in the agent logs as errno 11 or errno 13. In addition to the config files, check the permissions of the Apache server directories, and make sure the agent can read those directories and sub-directories as well.
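Walking every parent directory by hand gets tedious, so here is a rough sketch that does it for one file. The helper name check_path_access is my own. Run it as the same user that starts the agent, otherwise the result tells you nothing about what the agent can see.

```shell
# Report every component of the path in $1 that the current user cannot use.
# Directories are checked for read + search (r-x), the file itself for read.
check_path_access() {
    target=$1
    rc=0
    dir=$(dirname "$target")
    while :; do
        if [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
            echo "no access to directory: $dir"
            rc=1
        fi
        # Stop once the top of the path has been checked.
        case $dir in
            /|.) break ;;
        esac
        dir=$(dirname "$dir")
    done
    if [ ! -r "$target" ]; then
        echo "cannot read file: $target"
        rc=1
    fi
    return $rc
}

# Usage:
#   check_path_access "$APM_HOME/agent/tmp/khu/khu_cps.properties"
```

No output and a zero return code means every directory on the way down, and the file itself, is accessible to the current user.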
4. Check the Include line in the config file:
After the agent has been installed, you have to manually edit the Apache server config file, usually httpd.conf (make a backup first), and insert a valid Include line towards the bottom of the file. The Include line must point to the generated config file from step 2 (khu.etc.httpd.conf.httpd.conf in my case).
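Here is a quick way to confirm the Include line is present and not commented out. It is a sketch only: has_khu_include is a made-up helper, and the sample line in the comment is assembled from the install path in step 8b and the file name in step 2, so your path may differ.

```shell
# Assumed form of the line (yours may differ):
#   Include "/opt/ibm/apm/agent/tmp/khu/khu.etc.httpd.conf.httpd.conf"

# Succeeds if the Apache config ($1) has an uncommented Include directive
# that mentions the generated khu config file name ($2).
has_khu_include() {
    grep -iE '^[[:space:]]*Include[[:space:]]' "$1" | grep -qF -- "$2"
}

# Usage:
#   has_khu_include /etc/httpd/conf/httpd.conf khu.etc.httpd.conf.httpd.conf \
#       && echo "Include line found" || echo "Include line missing"
```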
5. Do not use an alias in the config file:
It's tempting to change the last line in the khu_cps.properties file in the ../tmp/khu dir to point to an alias for the hostname. Doing so will cause the agent to stop working, so do not change this file. Also, do not change the generated port (the first line in the properties file).
6. Check userid of the Apache server and the agent:
Usually the Apache server is started as root. This parent process then spins off child processes running as apache or nobody. The agent process is usually started as root. Check if these are the userids used. If not, then verify the files/directories used by the Apache server can be read by the agent correctly.
7. Recycle the server and agent in order:
If possible, stop both the agent and the Apache server. Then start the Apache server first (usually apachectl start) and tail the Apache error_log. Then start the HU agent and observe the messages written to the error_log during agent startup. You will see [notice] messages in there. Note: If you're in a production environment, stopping the Apache server may not always be possible, so this step might have to be scheduled for a change window.
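If the error_log is busy, you can filter it down to the notice-level lines after the restart. This is a small sketch with a made-up helper name, recent_notices. It matches the text notice] so that it catches both the Apache 2.2 style [notice] and the Apache 2.4 style, where the module name is included, as in [core:notice].

```shell
# Print the most recent notice-level lines from an Apache error log.
# $1 = path to error_log, $2 = how many lines to show (default 20).
recent_notices() {
    grep -F 'notice]' "$1" | tail -n "${2:-20}"
}

# Usage (the log path is an assumption; use your own ErrorLog location):
#   recent_notices /var/log/httpd/error_log
```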
8a. Check the Apache version being used:
Most versions of Apache server v2.2.x up to v2.4.x are supported with the APM 814 HTTP agent. However, sometimes you will find certain Apache sub-versions like Apache v2.2.31 cannot read certain older modules. If that's the case, then you will most likely need to use the latest fix or contact Support to obtain a future fix. To check the exact Apache version being used, use the apachectl -V command.
8b. Check the Apache modules match what's in the config file:
For example, if you're using Apache server v2.2, verify that the first "LoadModule" line in the generated config file khu.etc.httpd.conf.httpd.conf points to the right library file. See the line below for an example. Note that the 22 in the file name indicates v2.2 of the Apache server. Verify that this library file exists, and that the directories leading up to it are all readable by the agent. If you're using Apache v2.4, the LoadModule line will point to the apache24 library module instead.
LoadModule khu_module "/opt/ibm/apm/agent/lx8266/hu/lib/khuapache22dc_64.so"
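The check above can be scripted roughly as follows. This is a sketch with made-up helper names (khu_module_path and check_khu_module), and it assumes the library path contains no spaces.

```shell
# Print the library path from the first khu_module LoadModule line in $1.
khu_module_path() {
    awk '$1 == "LoadModule" && $2 == "khu_module" {
             gsub(/"/, "", $3); print $3; exit
         }' "$1"
}

# Verify that the library named in the generated config exists and is readable.
check_khu_module() {
    lib=$(khu_module_path "$1")
    if [ -z "$lib" ]; then
        echo "no khu_module LoadModule line in $1"
        return 1
    fi
    if [ -r "$lib" ]; then
        echo "ok: $lib"
    else
        echo "missing or unreadable: $lib"
        return 1
    fi
}

# Usage:
#   check_khu_module "$APM_HOME/agent/tmp/khu/khu.etc.httpd.conf.httpd.conf"
#   apachectl -V    # compare the server version with the 22 or 24 in the name
```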
9. Check that there's only one khuagent process running:
I've seen cases where there are multiple khuagent processes running. Do a ps -aef | grep -i khuagent and verify you only have one khuagent process running. If you see multiple khuagent processes and you're not sure which one is the 'real' one, stop/kill all of them and restart the HU agent. Then verify again that there's only one khuagent process running.
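If you want a number rather than a listing, here is a minimal sketch. The helper count_khuagent is my own name. It is fed command names only, so the counting process itself never shows up in the result the way a plain grep sometimes does.

```shell
# Count the lines on stdin that contain "khuagent" (case-insensitive).
count_khuagent() {
    awk 'tolower($0) ~ /khuagent/ { n++ } END { print n + 0 }'
}

# Usage: feed it command names only, one per line.
#   ps -e -o comm= | count_khuagent
```

Anything other than 1 is worth a closer look: 0 means the agent is not running, and 2 or more means you have duplicates.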
10. Check the Agent logs:
There are 3 log files to look at. First check the hu_ServerConnectionStatus.txt file. The first line in this file should say CONNECTED; otherwise you have networking issues and your agent won't connect to the APM server. Next check the <hostname>_hu_khuagent_*01.log file and look for any agent-related errors in there. If you don't find any errors in these 2 files, look at the activity file, usually called hu_asfActivity_<timestamp>-01.log. This file shows the agent retrieving data and sending it to the server. Search for ROWCOUNT in this file and check that the count is greater than 1. If it's 1, your agent is likely not collecting data.
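The first and third checks can be scripted roughly like this. Both helper names are made up. I'm assuming that a healthy status file has a first line that starts with the word CONNECTED, and I make no assumption about the layout of the ROWCOUNT lines, so the second helper just prints them for you to read.

```shell
# Succeeds if the first line of the status file starts with CONNECTED.
# Assumption: anything else on the first line means the agent is not connected.
agent_connected() {
    first=$(head -n 1 "$1" | tr -d '\r')
    case $first in
        CONNECTED*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Print the last few ROWCOUNT lines from an activity log.
# $1 = activity log, $2 = how many lines to show (default 5).
recent_rowcounts() {
    grep -F 'ROWCOUNT' "$1" | tail -n "${2:-5}"
}

# Usage (run from the agent log directory):
#   agent_connected hu_ServerConnectionStatus.txt && echo connected
#   recent_rowcounts hu_asfActivity_<timestamp>-01.log
```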
11. Put small load on the Apache server:
I've seen cases where the agent does nothing if the load is negligible or zero. So see if you can put some load on the Apache server; the agent then 'wakes up' and displays more data. There's a utility called http-ping.exe (Google it) that you can use to generate a small load on the Apache server. I've used it on my test system, although I'm not recommending it here for a production environment.
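If you'd rather not download a utility, a few requests from curl will do the same job on a test system. To be clear, this is a substitute for http-ping, not the tool mentioned above, and small_load is a made-up helper name. It assumes curl is installed.

```shell
# Send COUNT sequential GET requests to URL, pausing DELAY seconds between
# them, and print the HTTP status code of each response.
small_load() {
    url=$1
    count=${2:-20}
    delay=${3:-1}
    if [ -z "$url" ]; then
        echo "usage: small_load URL [COUNT] [DELAY_SECONDS]" >&2
        return 2
    fi
    i=0
    while [ "$i" -lt "$count" ]; do
        # -s: silent, -o /dev/null: discard the body, -w: print the status code
        curl -s -o /dev/null -w '%{http_code}\n' "$url"
        i=$((i + 1))
        sleep "$delay"
    done
}

# Usage (test systems only):
#   small_load http://localhost/ 30 1
```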
12. Use the latest APM 814 fix for the HTTP agent:
13. Increase Agent Log Level and Tracing:
If all of the above has failed to get your HTTP agent going, you can increase the debug and trace level on the agent as follows. Edit the hu.environment file in your config directory (make a backup first), and change the KBB_RAS1 line to a more verbose setting. You will see the different available tracing options in the file itself. Also, add this line to the bottom of the file: JAVA_TRACE_LEVEL=DEBUG. Save the file, exit and restart your agent. You will now get enhanced logging and tracing.
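The backup and the JAVA_TRACE_LEVEL line can be handled with a small helper like the one below. It's a sketch only, enable_java_trace is a made-up name, and it deliberately leaves the KBB_RAS1 line alone, since the right value depends on the options listed in your own file.

```shell
# Back up an environment file (once) and append JAVA_TRACE_LEVEL=DEBUG
# if the file has no JAVA_TRACE_LEVEL line yet.
enable_java_trace() {
    env_file=$1
    # Keep the first backup so a second run cannot overwrite the original.
    if [ ! -e "$env_file.bak" ]; then
        cp -p "$env_file" "$env_file.bak" || return 1
    fi
    if ! grep -q '^JAVA_TRACE_LEVEL=' "$env_file"; then
        # Add a newline first if the file does not already end with one.
        if [ -n "$(tail -c 1 "$env_file")" ]; then
            echo >> "$env_file"
        fi
        printf 'JAVA_TRACE_LEVEL=DEBUG\n' >> "$env_file"
    fi
}

# Usage:
#   enable_java_trace "$APM_HOME/config/hu.environment"
#   then edit KBB_RAS1 by hand and restart the agent
```

To go back to normal logging later, copy the .bak file over hu.environment and restart the agent.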
14. Reference APM 814 HTTP Agent Configuration documentation pages:
If you would like to review your agent configuration steps, you can check the documentation references here.
If you still need help and would like support from the IBM Support team, please open a new PMR or update your existing PMR, and a support engineer will contact you. If you are opening a new PMR, please upload an enhanced pdcollect output from your agent system and some screenshots from your APM dashboard to the new PMR. If you find this blog useful and/or would like to add your own solution not listed above, please leave us a comment below. Thank you.
ITCAM / APM / ICAM L2 Support team