There are two broad categories of profilers: statistical/sampling profilers which sample call stacks and tracing profilers which record method entry/exit. IBM Health Center is an example of a sampling profiler and it is usually the best way to profile. In some cases, a tracing profiler is useful, such as analyzing low volume background work when there is no application activity (for example, to investigate the base footprint of a product like WAS). Even high frequency sampling may not capture enough data.
In our case, we were looking for a tracing profiler on zLinux. We settled on the Rational Agent Controller (RAC), which has a really broad set of supported operating systems: AIX, Linux, Linux s/390 (zLinux), Windows, Solaris, and z/OS. Once you've got the agent controller installed and the JVM instrumented, you can either gather data in headless mode which you load into Rational Application Developer, or start/pause monitoring remotely from RAD.
The RAC comes with a JVMTI profiling agent which has to be attached to the JVM. This profiler has a lot of native components which makes this a bit tricky. First, you'll need to add a generic JVM argument, such as:
Note that the argument has to be specified with double quotes to avoid any issues with the semicolon in the Linux launcher. So if you already had some arguments, such as -Xgcpolicy:gencon, then your final generic JVM arguments would be:
Next, we need to tell Linux how to load native library dependencies for libJPIBootLoader.so. To do this, we need to tell WAS to start with a specific LD_LIBRARY_PATH environment variable. Envars can be set through the Environment Entries option:
Name = LD_LIBRARY_PATH
Value = /opt/IBM/SDP/AgentController/plugins/org.eclipse.tptp.javaprofiler/:/opt/IBM/SDP/AgentController/lib
WAS is smart enough to append the library path you specify to the library path that it needs itself.
In the example above, we use the server=controlled option, which means that the JVM will not start until RAD connects to it. The reason we did this was so that we can control what gets profiled, since we weren't interested in profiling JVM startup. This option is recommended over server=enabled for high volume profiling. Here are the basic steps we followed:
- Start the RAC agent (RAStart.sh) before launching the application server
- Launch the application server (it will immediately enter a wait state)
- Connect to the JVM using RAD
- In some versions of RAD, this will immediately start profiling, in which case you'll probably want to click Pause - the JVM will continue to start but profiling will not be active
- When you're ready, resume the actual profiling and continue as long as necessary
- You'll probably want to select the option in RAD to save the data to a local file for post-analysis in addition to streaming it into RAD itself
There is also the option of using server=standalone which writes the profiling data to a local file and avoids the RAC itself and needing to connect in remotely from RAD. However, I tried this with a vanilla WAS install, and startup took really long and created about 15GB of data. It would have been cumbersome to analyze.
I won't cover all the analysis options from within RAD, as this article does a great job; however, I will highlight a particular approach we took as we were interested in what exactly was taking up the CPU (tracing profilers can also be used to look at usage from a cumulative time perspective, which will capture I/O bottlenecks, etc.).
First, we captured top -b -H -d 1800 -p $PID to gather accumulated CPU time per thread. We captured this at the start of profiling and at the end and took the difference to find the threads that accumulated CPU and sorted by that number. Next, within RAD's Execution Time Analysis, select the Call Tree tab and find these threads. Expand the threads and follow down the largest paths of cumulative time. Note that there may be some rows with very large cumulative times that are probably just the frames of the thread that are "waiting for work," such as a call to getTask or await, and these can be disregarded.
Once you find a high level method of interest (the art of profiling!), right click it and select Show Method Invocation Details. In the third table, "Selected Method Invokes," sort by Cumulative CPU Time, descending (if you don't have this column, you will need to make sure you have this option selected in one of the RAD attach/profiling screens when starting to profile). This will give the accumulated CPU time from a high level. You can then "drill down" further if you'd like to by doing the same procedure with rows from this table.
Note: It appears that this cumulative CPU time in the method invocation details is for the whole tracing profile, not from within the context of the call tree thread stack that you get here from.