Technical Blog Post
Performance tuning tips for the DataPower agent.
The DataPower (BN) agent monitors your DataPower appliances/devices. For DP BN agent v220.127.116.11 or higher, the recommended number of monitored DP appliances is 5 to 6 per agent. The agent can monitor more than that, but you will likely run into performance issues. Symptoms include high CPU usage (seen with the top command, topas on AIX, or Task Manager on Windows) and an agent that is slow to respond. Here are some things you can do to get around this:
1) First, check how many DataPower instances/devices each agent is monitoring. If you have more than 5 or 6 instances/devices per agent, consider installing the agent on another system and balancing the load between the two agents.
2) Run ps -aef | grep -i KBN and note what the heap sizes are set to. The default heap sizes in the KBN.sh file are usually set too low. If the system has a large amount of native memory, say 8GB or higher, set the heap sizes to a larger value. To do this, edit the KBN.sh file (in the "bin" dir), change the -Xmx parameter to -Xmx1024m (or -Xmx2048m), and save the file.
3) Check your system resource limits with ulimit -a (look for "open files" or "nofile") and confirm that the value is higher than the default of 1024, say 4096 or higher. To set this, run ulimit -n 4096 as root; the new limit applies to that shell and to processes started from it. Also check the ulimit -m setting and increase it if needed.
4) Check your log and trace settings in the bn.ini file. A higher debug level will impact the performance of the agent. If one is set, go back to a lower (default) setting, at least temporarily.
5) Look at the BN agent logs and see if you get a message like "Data collection wait condition failed with return code 110". This indicates that the agent may be maxed out because it is monitoring the recommended maximum number of instances or more. In this case, you will have to use another BN agent to monitor some of the instances.
6) Start one agent instance at a time, then run free -mh from the command line and note the memory usage. Then start the next instance, run free -mh again, and note the difference. It may be that one of your monitored instances is using a lot of memory and is causing the high CPU. If this is the case, narrow it down to that specific instance and diagnose that device/appliance.
7) Another (long-term) option is to increase native memory on the agent system itself, say from 8GB to 16GB. Consider this if you find the log message described in step 5 and don't want to split the monitored instances across agents on multiple systems.
8) Make the above changes and restart the agent. If you still see high CPU after following these recommendations, note when the high CPU happens and open a new Case with Support to debug. Does the high CPU happen immediately or after some time? Does it happen when starting a particular instance? Put all available info in the new Case.
9) Capture fresh top and ps -aef | grep -i KBN output and attach it to the Case for review.
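The checks in steps 2 and 3 can be scripted so they are easy to repeat before and after a change. What follows is a minimal sketch, not part of the agent: the function names extract_xmx and nofile_ok are made up for illustration, and the 4096 default simply mirrors the value suggested in step 3.

```shell
#!/bin/sh
# Illustrative helpers only; the names and defaults are assumptions, not agent tooling.

# Print every -Xmx option found in the text passed as $1 (for example, ps output).
extract_xmx() {
    # Put each word on its own line, then keep the ones that start with -Xmx.
    printf '%s\n' "$1" | tr ' ' '\n' | grep -- '^-Xmx'
}

# Succeed if the open-files limit $1 is at least the minimum $2 (default 4096).
nofile_ok() {
    limit=$1
    min=${2:-4096}
    # "unlimited" always satisfies the minimum.
    [ "$limit" = "unlimited" ] && return 0
    [ "$limit" -ge "$min" ]
}
```

After sourcing the file, the functions could be used like this: extract_xmx "$(ps -aef | grep -i KBN | grep -v grep)" prints the current heap setting, and nofile_ok "$(ulimit -n)" || echo "open files limit is below 4096" flags a low limit. Run the checks as the same user that starts the agent, since limits are per user and per shell.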
ITCAM / APM / ICAM L2 Support team