Update: ActiveCount has been made a part of the PMI Basic collection (enabled by default) in WAS 126.96.36.199 and 188.8.131.52.
WebSphere Application Server comes with the "Basic" level enabled in the Performance Monitoring Infrastructure (PMI) on every server by default. This usually has an overhead of about 2-3%. The Basic level comes with one statistic for all thread pools: PoolSize (see the "Level" column in the link). This is defined as "the average number of threads in pool." This is a bit confusing: it is... [More]
In a previous post, I showed how to break a heapdump down into retained sets by class as an alternative way to explore what's consuming the heap. I thought it would be worth elaborating on retained sets a bit more, as they are one of the most important concepts in heapdump analysis.
The MAT documentation gives a good definition of retained sets and a useful diagram:
Retained set of X is the set of objects which would be removed by GC when X is garbage collected.
When most people talk about the "size" of a... [More]
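The retained set definition quoted above can be illustrated with a small sketch. This is not how MAT computes retained sets internally; it is a toy model that assumes the heap is a dictionary mapping each object name to the names it references, and that a list of GC roots is given. The object names and the helper functions `reachable` and `retained_set` are hypothetical, introduced only for illustration. The retained set of X is taken to be everything reachable from the roots that stops being reachable once X is removed.

```python
from collections import deque


def reachable(graph, roots, skip=None):
    """Return every object reachable from the roots, never passing through `skip`."""
    seen = set()
    queue = deque(r for r in roots if r != skip)
    while queue:
        obj = queue.popleft()
        if obj in seen:
            continue
        seen.add(obj)
        for ref in graph.get(obj, ()):
            if ref != skip and ref not in seen:
                queue.append(ref)
    return seen


def retained_set(graph, roots, x):
    """Objects that would become unreachable (and so be collected) if x were collected."""
    return reachable(graph, roots) - reachable(graph, roots, skip=x)


# Hypothetical toy heap: A references C and D, but D is also referenced by B.
graph = {
    "root": ["A", "B"],
    "A": ["C", "D"],
    "B": ["D"],
    "C": ["E"],
    "D": [],
    "E": [],
}

if __name__ == "__main__":
    print(sorted(retained_set(graph, ["root"], "A")))
```

In this toy heap, the retained set of A contains A, C, and E, but not D, because D is still kept alive through B.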
The IBM Memory Analyzer Tool (MAT) is my preferred tool for analyzing Java heapdumps and core dumps. Sometimes you'll see a dump where there are no obvious causes of high memory usage in either the dominator tree or the top consumers report, the two places I recommend going to first when investigating memory. This post will show how to do advanced analysis using the histogram, retained sets, and incoming references by class.
Here is a dump retaining 4.2GB of Java heap without any large dominators:
At a recent customer, we improved throughput by 50% simply by restarting with the AIX environment variable MALLOCOPTIONS=multiheap. This only applies to situations where there is heavy, concurrent malloc usage, and in many cases of WAS/Java, this is not the case.
The multiheap option does have costs, particularly increased virtual and physical memory usage. The primary reason is that each heap's free tree is independent, so fragmentation is more likely. There is also some additional metadata overhead.
malloc is often a... [More]
It may be useful to understand what PID is sending a kill signal to a process on AIX. You can use this kernel trace:
Log in as root
# rm -rf /tmp/aixtrace; mkdir /tmp/aixtrace/; cd /tmp/aixtrace/
# trace -C all -a -T 10M -L 20M -n -j 134,139,465,14e,46c -o ./trc
... Reproduce the problem ... e.g. kill -3 7667754
# cp /etc/trcfmt .
# trcnm -a > trace.nm
# LDR_CNTRL=MAXDATA=0x80000000 gensyms > trace.syms
# LDR_CNTRL=MAXDATA=0x80000000 gennames -f > gennames.out
# pstat -i > trace.inode
# ls -al /dev >... [More]
I'm proud to announce a new tool I've been cooking recently: IBM Runtime Diagnostic Code Injection for the Java™ Platform (RDCI). The tool is an extensible command line JAR that uses the Late Attach API to inject Java code into a running JVM, allowing both exploratory and invasive surgery on a Java process. It uses a custom classloader to minimize the "scar" left on the process. Example commands include: Run a full garbage collection, execute a static method, perform a javacore, heapdump or system dump on IBM JVMs, and... [More]
We're pleased to announce the first public version of the WAS Performance Cookbook: https://publib.boulder.ibm.com/httpserv/cookbook/
The WebSphere Application Server Performance Cookbook covers performance tuning for WebSphere Application Server (WAS), although there is also a very strong focus on Java, Operating Systems, and theory which can be applied to other products and environments. The cookbook is designed to be read in a few different ways:
On the go: Readers short on time should skip to the Recipes chapter at... [More]
If you often have to create new application servers, you should consider application server templates:
http://www14.software.ibm.com/webapp/wsbroker/redirect?version=compass&product=was-nd-mp&topic=trun_create_templates
The process is simple: create one application server, add your customizations (JVM arguments, data sources, etc.), then click Templates... > New, and select that server. Starting in WAS 7, you can even edit the template after it has been created. Then, when you create a new application server, either through the... [More]
I'm not sure how this applies to modern computer chips and operating systems, but here is interesting research from Liedtke in 1995 and 1997 showing the overhead of system calls:
For measuring the system-call overhead, getpid, the shortest Linux system call, was examined. To measure its cost under ideal circumstances, it was repeatedly invoked in a tight loop. Table 2 shows the consumed cycles and the time per invocation derived from the cycle numbers. The numbers were obtained using the cycle counter register of the Pentium processor. Linux... [More]
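The tight-loop measurement described in the quoted passage can be approximated with a rough sketch. This is not the method used in the cited research, which read the Pentium cycle counter directly. It assumes Python's `os.getpid` and `time.perf_counter_ns`, and the helper name `time_getpid` is hypothetical. The result includes interpreter and loop overhead, so it only gives an upper bound on the per-call cost.

```python
import os
import time


def time_getpid(iterations=100_000):
    """Return the average wall-clock nanoseconds per os.getpid() call in a tight loop."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        os.getpid()
    elapsed = time.perf_counter_ns() - start
    return elapsed / iterations


if __name__ == "__main__":
    # The figure includes Python interpreter overhead, not just the system call.
    print(f"{time_getpid():.1f} ns per os.getpid() call")
```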
As environments continue to grow, automation becomes more important. On POSIX operating systems, SSH keys may be used to automate running commands, gathering logs, etc. A 30 minute investment to configure SSH keys will save countless hours and mistakes.
Step #1: Generate an "orchestrator" SSH key
Choose one of the machines that will be the orchestrator (or a Linux, Mac, or Windows cygwin machine)
Ensure the SSH key directory exists:
$ cd ~/.ssh/
If this directory does not exist:
$ mkdir ~/.ssh... [More]
I was recently at a customer who believed that they had a Java memory leak. They compared heapdumps and couldn't find anything. They had experienced production OutOfMemoryErrors (OOMs) before (for a different reason), and they were so worried about what they perceived that they increased the maximum heap size to 4GB so that the JVM could handle a day's worth of work, and then they put in a process to restart the JVMs every night.
At first glance at the verbose garbage collection, I agreed with them (loaded in the wonderful Garbage... [More]
If you are using a HotSpot-based JVM, and you are producing HPROF heapdumps that have more than 4GB of object data, and you are using the Memory Analyzer Tool to analyze those heapdumps, then make sure you check the Error Log for any warnings when first loading the dump. We recently discovered that some HotSpot JVMs write an incorrect length field in the HPROF file. MAT will end up reading only part of the heapdump, but other than the warning, there's no sign that you're only looking at a subset of the dump.
You can find more... [More]
Some applications use native libraries (e.g. JNI; .so, .dll, etc.) to perform functions in native code (e.g. C/C++) rather than through Java code. This may involve allocating native memory outside of the Java heap (e.g. malloc, mmap). These libraries have to manage their own memory, and application errors can cause native memory leaks, which can ultimately cause crashes, paging, etc. These are among the most difficult classes of problems, and they are made even more difficult by the fact that native libraries are often... [More]
On the IBM JVM, the various environment variables used to change dump parameters (e.g. IBM_JAVACOREDIR, etc.) have been deprecated in favor of the -Xdump generic JVM command line argument.
Someone asked how to change the directory where all the dump artifacts go using -Xdump. Here it is for *nix; just change /tmp/ in each argument to the desired directory (on Windows, use the Windows path syntax):
-Xdump:java:file=/tmp/javacore.%Y%m%d.%H%M%S.%pid.%seq.txt -Xdump:heap:file=/tmp/heapdump.%Y%m%d.%H%M%S.%pid.%seq.phd... [More]
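As a small sketch of the "change /tmp/ in each argument" step, the following builds the two arguments visible in the excerpt above for an arbitrary *nix directory. It covers only the javacore and heapdump arguments shown; any further arguments in the full post are not reproduced here. The helper name `xdump_args` is hypothetical, and the filename tokens are copied verbatim from the arguments above.

```python
import posixpath

# Filename tokens copied from the -Xdump arguments shown in the post.
TOKENS = "%Y%m%d.%H%M%S.%pid.%seq"


def xdump_args(directory):
    """Build the javacore and heapdump -Xdump arguments for a *nix directory."""
    return [
        "-Xdump:java:file=" + posixpath.join(directory, f"javacore.{TOKENS}.txt"),
        "-Xdump:heap:file=" + posixpath.join(directory, f"heapdump.{TOKENS}.phd"),
    ]


if __name__ == "__main__":
    # /var/dumps is an example directory, not one named in the post.
    print(" ".join(xdump_args("/var/dumps")))
```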
The DynaCache MBean is a scriptable interface to interact with DynaCache caches at runtime:
http://publib.boulder.ibm.com/infocenter/wasinfo/v8r0/index.jsp?topic=%2Fcom.ibm.websphere.javadoc.doc%2Fweb%2FmbeanDocs%2FDynaCache.html
For example, to clear a DynaCache cache instance on a server at runtime, use the following Jython wsadmin code:
AdminControl.invoke(AdminControl.completeObjectName("type=DynaCache,node=MYNODE,process=MYSERVER,*"), "clearCache", "MYCACHEINSTANCE")
The documentation for clearCache says... [More]