The IBM Java HealthCenter is a low-overhead agent shipped with the JVM that provides deep insight into JVM activity for IBM Java 5 and above: http://www.ibm.com/developerworks/java/jdk/tools/healthcenter/. HealthCenter is similar to VisualVM. It can provide information on classloading, JVM arguments, garbage collection, file I/O, locking, native memory, threads, and profiling. The last item, profiling, is my favorite feature of HealthCenter: it is a very low-overhead sampling profiler that can pinpoint high CPU usage or time spent waiting on backend resources. As of Java 5 SR10 and Java 6 SR5, it is rated "for production use," which means that it can be used even on a fully loaded production system (with an overhead of just a few percent): http://publib.boulder.ibm.com/infocenter/hctool/v1r0/topic/com.ibm.java.diagnostics.healthcenter.doc/com.ibm.java.diagnostics.healthcenter.gui/docs/platforms.html. This means WAS >= 184.108.40.206 and WAS >= 220.127.116.11 (http://www-01.ibm.com/support/docview.wss?uid=swg27005002).
You have to restart the JVM with the generic JVM argument -Xhealthcenter to enable it. By default, HealthCenter opens port 1972 (or the next available port incremented from that), and you connect to the agent using the HealthCenter client available in IBM Support Assistant. Of course, this raises a number of issues. First, many customers have firewalls or don't allow JVMs to open additional ports, particularly on production systems. Second, if you're profiling multiple JVMs, it's difficult to manually monitor all of them in real time. Third, it may miss activity that occurs before a client first connects. Finally, there is additional overhead in transferring the data over the network.
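Because the agent may move to the next free port, it can help to check which port it actually took. The helper below is only a sketch: the function name and the upper bound of 1981 are my own assumptions, and it simply filters `netstat -an` style output for listening sockets in that range.

```shell
# Generic JVM argument that enables the agent (from the text above):
#   -Xhealthcenter

# Reads `netstat -an` style output on stdin and prints LISTEN lines for
# ports 1972-1981. The range is an arbitrary assumption for illustration.
hc_candidate_ports() {
  grep LISTEN | grep -E '[.:]19(7[2-9]|8[01])[[:space:]]'
}

# Usage:
#   netstat -an | hc_candidate_ports
```

A match only shows that something is listening on that port; confirm it belongs to the JVM you restarted before pointing the client at it.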
However, HealthCenter does have a headless mode in which it writes the data to local files on the JVM's system; the resulting "HCD" file can be loaded into the HealthCenter client for post analysis. To run in this mode, use the generic JVM argument -Xhealthcenter:level=headless. There are also a few issues in headless mode with the current version of HealthCenter.
First, headless mode starts gathering data immediately. This means that it will gather everything about the startup of the JVM (which you usually don't care about). It also means that if your problem doesn't start until an hour into the run, it will still gather that first hour of data. This doesn't change the impact on the JVM, but it does mean that you need to account for the disk space that this "extra" data requires. The size of the HealthCenter files is proportional to the amount of profiling data that the JVM generates, which is a function of many variables, including application activity and throughput, so there is no simple formula for estimating the required disk space. The best approach is to run a test in a test environment with a similar workload and estimate based on that. As a rough guide, I've run this on a few very large production systems and HealthCenter produced 1 to 5GB per hour.
By default, HealthCenter writes to the current working directory of the JVM, which for WAS is the profile folder (e.g. /usr/IBM/WebSphere/profiles/AppSrv01/). You can change this (for example, to point to a directory with more space) using the generic JVM argument -Dcom.ibm.java.diagnostics.healthcenter.headless.output.directory=$SOMEDIR
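For planning purposes, a back-of-the-envelope check can compare a worst-case estimate against the free space in the output directory. This is only a sketch: the function names are hypothetical, and the rate you pass in should come from your own test run (the 5GB per hour used in the usage note is just the top of the rough range mentioned above).

```shell
# Worst-case disk estimate in GB: hours of collection times GB per hour.
hc_estimate_gb() {  # args: hours gb_per_hour
  echo $(( $1 * $2 ))
}

# Free space, in whole GB, on the filesystem holding the given directory.
hc_free_gb() {  # args: directory
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1048576) }'
}

# Usage, assuming an 8 hour run and /hcdata as the output directory:
#   need=$(hc_estimate_gb 8 5)
#   have=$(hc_free_gb /hcdata)
#   [ "$have" -gt "$need" ] || echo "only ${have}GB free, estimate is ${need}GB"
```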
The second limitation is that the JVM must be fully stopped for the HCD file to be produced. During the run, the JVM writes temporary files named EnvironmentSource$PID, JLASource$PID, MemorySource$PID, MethodDictionarySource$PID, and TRACESubscriberSource$PID. On JVM exit, these are deleted and a healthcenter.hcd file is produced, either in the current working directory or in the com.ibm.java.diagnostics.healthcenter.headless.output.directory directory. The HCD file is what you load into the HealthCenter client. The file is actually just a ZIP file of the aforementioned Source files. If a JVM does not stop gracefully, has to be killed, or crashes, then the HCD file may not be produced. In this case, you can try zipping up the Source files manually, giving the archive an .hcd extension, and loading it in the client.
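The manual recovery step might look like the following. Treat it as a sketch: the function name is hypothetical, it assumes the `zip` command is installed, and it assumes the client expects the Source files at the top level of the archive (hence `-j`, which drops directory paths).

```shell
# Zips the leftover Source files for one JVM process ID into an .hcd file.
hc_make_hcd() {  # args: directory pid output_file
  dir=$1 pid=$2 out=$3
  set --
  for name in EnvironmentSource JLASource MemorySource \
              MethodDictionarySource TRACESubscriberSource; do
    # Collect only the files that actually exist for this PID.
    [ -f "$dir/$name$pid" ] && set -- "$@" "$dir/$name$pid"
  done
  if [ $# -eq 0 ]; then
    echo "no Source files found for PID $pid in $dir" >&2
    return 1
  fi
  zip -j "$out" "$@"
}

# Usage, with a hypothetical profile directory and PID:
#   hc_make_hcd /usr/IBM/WebSphere/profiles/AppSrv01 12345 /tmp/recovered.hcd
```

There is no guarantee the client will accept files from a JVM that crashed mid-write, so this is a best-effort attempt.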
If you try to start the JVM in headless mode and it fails to restart, then the agent is probably too old. If you check the native_std*.log files, you'll probably see a message like the following:
SEVERE: Health Center agent failed to start.
java.lang.IllegalArgumentException: No enum const class com.ibm.java.diagnostics.healthcenter.agent.dataproviders.DataCollectionLevel.HEADLESS
In this case, you'll just need to update the HealthCenter agent libraries. The latest libraries can be downloaded for each platform from: http://publib.boulder.ibm.com/infocenter/hctool/v1r0/topic/com.ibm.java.diagnostics.healthcenter.doc/com.ibm.java.diagnostics.healthcenter.gui/docs/installingagent.html
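To check several servers for this symptom, you could search their native_std*.log files for the enum message shown above. A minimal sketch, with a hypothetical function name and log path:

```shell
# Prints the native_std*.log files in a directory that contain the
# "too old for headless mode" symptom. Uses a fixed-string match.
hc_headless_unsupported() {  # args: log_directory
  grep -lF 'DataCollectionLevel.HEADLESS' "$1"/native_std*.log 2>/dev/null
}

# Usage, with a hypothetical log directory:
#   hc_headless_unsupported /usr/IBM/WebSphere/profiles/AppSrv01/logs/server1
```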
You can check which version of the HealthCenter agent is installed by running:
$ <WAS>/java/bin/java -version -Xhealthcenter:level=headless
...Sep 12, 2011 6:03:53 AM com.ibm.java.diagnostics.healthcenter.agent.mbean.HCLaunchMBean startAgent
INFO: Agent version "18.104.22.16810620"...
Note: When running level=headless combined with -version, you will have to Ctrl+C to kill the program. This is a known limitation of HealthCenter when running -version.
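If you want just the version string, and want to avoid the manual Ctrl+C, something like the following may help. It is a sketch under two assumptions: that the GNU `timeout` command is available on your platform (it often is not on AIX), and that the agent's log line goes to standard error, which is why the usage line redirects it. The function name and $WAS_HOME are hypothetical.

```shell
# Extracts the quoted value from an 'Agent version "..."' log line on stdin.
hc_parse_agent_version() {
  sed -n 's/.*Agent version "\([^"]*\)".*/\1/p'
}

# Usage: kill the java process after 15 seconds instead of pressing Ctrl+C.
#   timeout 15 "$WAS_HOME/java/bin/java" -version -Xhealthcenter:level=headless 2>&1 \
#     | hc_parse_agent_version
```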
Some customers are wary about updating the WAS binaries; however, the only files updated are those of HealthCenter. HealthCenter is shipped with the JVM, but just as an optional feature. If you do not use -Xhealthcenter, then the libraries are not loaded. So although you're updating files within the WAS install, it's equivalent to updating something like the doc directory in the JVM: it's something that's shipped with the JVM, but not normally used, so you're not changing the functionality of the installation or changing anything that had been previously used (obviously, the functionality and dynamics of the JVM change when HealthCenter is enabled). To install the updated libraries:
- Upload the zip file in BINARY mode to <WAS>/java/$FILE.zip
- Change directory to <WAS>/java/ and run the command: ./bin/jar xvf $FILE.zip
- If WAS runs under a non-root user, make sure to chown properly.
You may receive errors unzipping the file because the HealthCenter libraries are "in use." This is because you had just tried to run HealthCenter. First, make sure that the JVM is down. If it still doesn't work, you may need to stop all other JVMs on that node and the node agent. If it still doesn't work, and the JVM is run by a non-root user, then try extracting the zip using the root user, and then chown to the non-root user.
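The install steps above can be wrapped in a small function. This is a sketch rather than a supported procedure: the function name is hypothetical, the zip path must be absolute because the function changes directory, and the optional chown assumes that everything under the java directory should belong to one owner, which you should verify for your install first.

```shell
# Extracts the agent zip into <WAS>/java and optionally fixes ownership.
hc_install_agent() {  # args: java_directory absolute_zip_path [owner]
  [ -d "$1" ] || { echo "no such directory: $1" >&2; return 1; }
  [ -f "$2" ] || { echo "no such file: $2" >&2; return 1; }
  # Use the JVM's own jar tool, run from inside the java directory.
  ( cd "$1" && ./bin/jar xvf "$2" ) || return 1
  # Assumption: a single owner (user or user:group) for the whole directory.
  if [ -n "${3:-}" ]; then
    chown -R "$3" "$1"
  fi
}

# Usage, with hypothetical paths and user:
#   hc_install_agent /usr/IBM/WebSphere/AppServer/java /tmp/agent.zip wasuser:wasgroup
```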
Once you've got the HCD file, start the HealthCenter client in ISA. When you first load the client, it may ask you to connect to a JVM; just click Cancel. Then click File > Open File... and select the HCD file (ensure that it has the extension .hcd or .zip). Then click on Profiling to open the profiling view.
You may notice that some methods do not have names and appear only as hexadecimal addresses. This may be due to an old version of the JVM or the agent. Also, in headless mode, some of this is unavoidable, particularly for methods that are executed very early during startup (e.g. classloading). For normal runs where you don't care about startup, this is usually not an issue. However, if you do see hot methods without names, you can infer what they are by looking at what the methods call and what calls them. For example, in one case there was a hot method without a name, and the call path showed that it was related to security, particularly around creating an LTPA key (i.e., user login).
The Self% is how often a method was sampled at the top of the stack, and the Tree% is how often that method was sampled anywhere in the stack. I usually start by looking at the highest Self% (the default sort in the client). Next, I sort by Tree% and scroll past methods that usually don't do much themselves (for example, a servlet doPost or a servlet filter); the first method after these is usually the one doing most of the activity. This is more of an art than a science.
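To make the two percentages concrete, here is a toy calculation over made-up samples. It is not how HealthCenter computes or stores its data; the input format (one sampled stack per line, top frame first, frames separated by semicolons) and the method names are invented for illustration.

```shell
# Prints Self% and Tree% per method for stacks read from stdin.
hc_self_tree() {
  awk -F';' '
    {
      total++
      self[$1]++                  # top frame counts toward Self%
      split("", seen)             # reset the per-stack set
      for (i = 1; i <= NF; i++) {
        # Count each method at most once per stack toward Tree%.
        if (!($i in seen)) { seen[$i] = 1; tree[$i]++ }
      }
    }
    END {
      for (m in tree)
        printf "%s self=%d%% tree=%d%%\n", m, 100 * self[m] / total, 100 * tree[m] / total
    }
  ' | sort
}

# Usage with four hypothetical samples:
#   printf 'parse;doPost\nparse;doPost\nquery;doPost\ndoPost\n' | hc_self_tree
# prints:
#   doPost self=25% tree=100%
#   parse self=50% tree=50%
#   query self=25% tree=25%
```

Here doPost has the highest Tree% because every request passes through it, but parse has the highest Self%, so parse is where the samples actually land.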
You can also zoom in on a particular time period by dragging a rectangle around a particular time area beneath the profiling view. This is great for comparing the activity at two different times in the run.
The sampling profiler frequently samples the stacks of the CPUs. The methods highlighted in the client are not necessarily those that consume CPU: if the agent samples stacks that are constantly making web service calls, for example, those calls will show up in the client. In many cases this is even better than a pure CPU profiler, since it can pinpoint almost any type of performance issue, not just excessive CPU usage.
In summary, IBM Java Health Center is an extremely powerful tool. On recent versions of WAS (>= 22.214.171.124 and >= 126.96.36.199), its very low-overhead, headless, sampling profiler can be used to determine the root cause of performance issues.