In a previous post, I discussed the importance of symbols for native libraries. Not only are they needed, but you may be actively deceived by the guesses of stack walkers if you don't have them. On Linux, although it is recommended to simply compile executables and libraries with symbols and ship them unstripped, even the operating system vendors do not do this (the simple reason is the size of the symbol data). For example, Fedora and Red Hat do not ship binaries unstripped, but instead they separate the symbols into matching debuginfo... [More]
I'm proud to announce a new tool I've been cooking recently: IBM Runtime Diagnostic Code Injection for the Java™ Platform (RDCI). The tool is an extensible command line JAR that uses the Late Attach API to inject Java code into a running JVM, allowing both exploratory and invasive surgery on a Java process. It uses a custom classloader to minimize the "scar" left on the process. Example commands include: Run a full garbage collection, execute a static method, perform a javacore, heapdump or system dump on IBM JVMs, and... [More]
There are two broad categories of profilers: statistical/sampling profilers which sample call stacks and tracing profilers which record method entry/exit. IBM Health Center is an example of a sampling profiler and it is usually the best way to profile. In some cases, a tracing profiler is useful, such as analyzing low volume background work when there is no application activity (for example, to investigate the base footprint of a product like WAS). Even high frequency sampling may not capture enough data.
In our case, we were looking for... [More]
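To make the tracing category concrete, here is a minimal sketch in Python (not Java, and unrelated to IBM Health Center). It uses the standard `sys.setprofile` hook to record every function entry, which is why a tracing profiler can see low volume work that a sampler might miss. The helper name `trace_calls` is hypothetical and exists only for illustration.

```python
import sys
from collections import Counter


def trace_calls(func, *args, **kwargs):
    """Run func under a tracing hook and count entries per function name.

    Hypothetical helper for illustration; not part of any profiler product.
    """
    entries = Counter()

    def hook(frame, event, arg):
        # The "call" event fires on entry to every Python-level function.
        if event == "call":
            entries[frame.f_code.co_name] += 1

    sys.setprofile(hook)
    try:
        result = func(*args, **kwargs)
    finally:
        # Always remove the hook so later code runs untraced.
        sys.setprofile(None)
    return result, entries


def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result, entries = trace_calls(fib, 5)
print(result, entries["fib"])
```

Because the hook runs on every call, the overhead grows with call volume, which is the usual trade-off against sampling.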
Surprisingly, even the latest version of HP-UX does not provide a simple tool (such as "ps") to print the full command line of a running program (no equivalent of Solaris /usr/ucb/ps). The -x parameter of ps only prints the first 1024 characters, which is often insufficient for Java programs: Only a subset of the command line is saved by the kernel; as much of the command line will be displayed as is available... The value of DEFAULT_CMD_LINE_WIDTH should be between 64 and 1020 (... [More]
One common type of OutOfMemoryError (OOM) occurs when an application thread accumulates too much memory. When looking at a PHD, however, you may only see a large object (and potentially uninteresting child objects) as GC roots, so it's hard to figure out the root cause. Even if there are clear package names in the large object, that still may be insufficient information for the developers to figure out the problem. For example, here is a PHD heapdump as seen in the Memory Analyzer Tool and HeapAnalyzer:
In both tools, all we... [More]
In my last post, I wrote about a successful attempt to determine how much memory has been malloc'ed in a coredump using gdb. In this post, I'll describe my unsuccessful investigation into total virtual memory usage in a core.
First, I used the same program as before, which calls malloc with various sizes, and I changed it to sleep at the end. While it was sleeping, I took OS statistics and a core dump.
The first thing I ran was 'ps -o pid,vsz,rss -p 14062':
  PID   VSZ   RSS
14062 44648 42508
VSZ is the total virtual... [More]
The following is my attempt at understanding how much memory has been malloc'ed at the time of a Linux coredump using gdb. Surprisingly, I'm not aware of any built-in way to get this from gdb (or even better, the total virtual memory used by the process). This investigation assumes a recent version of Linux and the default glibc malloc implementation, the latest source code of which is located here: http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=malloc/malloc.c;hb=HEAD. You'll need the debuginfo packages of glibc and glibc-common... [More]
I'll be presenting a WebSphere Technical Exchange on October 16th at 11 AM Eastern. It is free for all to join and includes a remote presentation and a question & answer session. The topic will be AIX Native Memory Problem Determination Techniques and Tools for WebSphere Application Server. Call-in information and slides are available here: http://www-304.ibm.com/support/docview.wss?uid=swg27036053
Some applications use native libraries (e.g. JNI; .so, .dll, etc.) to perform functions in native code (e.g. C/C++) rather than through Java code. This may involve allocating native memory outside of the Java heap (e.g. malloc, mmap). These libraries have to do their own garbage collection, and application errors can cause native memory leaks, which can ultimately cause crashes, paging, etc. These are among the most difficult classes of problems, and they are made even more difficult by the fact that native libraries are often... [More]
Below is a simple wsadmin script that calculates the approximate start time of a set of servers using the UpTime PMI statistic. For example:
$ ./wsadmin.sh -lang jython -username wsadmin -password wsadmin -f uptime.py -server server1
WASX7209I: Connected to process "dmgr" on node localhostCellManager11 using SOAP connector; The type of process is: DeploymentManager
WASX7303I: The following options are passed to the scripting environment and are available as arguments that are stored in the argv variable: "[-server, server1]"
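The underlying arithmetic is simply the current time minus the reported uptime. The following is a minimal standalone sketch in Python 3, not the wsadmin Jython script itself. It assumes the uptime value is expressed in milliseconds, and the function name `approximate_start_time` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def approximate_start_time(uptime_ms, now=None):
    """Return the approximate start time for a given uptime.

    Assumption: uptime_ms is in milliseconds. The result is approximate
    because time passes between reading the statistic and reading the clock.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - timedelta(milliseconds=uptime_ms)


# Example: a server that reports one hour of uptime.
print(approximate_start_time(3_600_000))
```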
It is generally bad practice for an application to call System.gc() or Runtime.gc() (hereafter both are referred to as System.gc(), since the former simply calls the latter). By default, these calls instruct the JVM to perform a full garbage collection, including tenured spaces and a full compaction. These calls may be unnecessary and may increase the proportion of time spent in garbage collection beyond what would have occurred if the garbage collector had been left alone.
The generic JVM arguments -Xdisableexplicitgc (IBM) and... [More]
One simple and very useful indicator of process health and load is its TCP activity. The following script takes a set of ports and summarizes how many TCP sockets are established, opening, and closing for each port. It has been tested on Linux and AIX. Example output:
$ portstats.sh 80 443
PORT   ESTABLISHED  OPENING  CLOSING
80               3        0        0
443             10        0        2
Total           13        0        2
echo "usage:... [More]
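The same summary can be sketched in Python 3 to show the counting logic; this is not the original shell script. It assumes input lines in the style of `netstat -an`, with the local address in the fourth column and the state in the sixth, and it matches on the local port. The grouping of TCP states into "opening" and "closing" and the function name `summarize_ports` are assumptions made for illustration.

```python
# Assumed grouping of TCP states; the original script may group differently.
OPENING_STATES = {"SYN_SENT", "SYN_RECV", "SYN_RCVD"}
CLOSING_STATES = {
    "FIN_WAIT1", "FIN_WAIT2", "FIN_WAIT_1", "FIN_WAIT_2",
    "CLOSE_WAIT", "CLOSING", "LAST_ACK", "TIME_WAIT",
}


def summarize_ports(netstat_lines, ports):
    """Count established, opening, and closing TCP sockets per local port."""
    counts = {p: {"ESTABLISHED": 0, "OPENING": 0, "CLOSING": 0} for p in ports}
    for line in netstat_lines:
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_address, state = fields[3], fields[5]
        # The port follows the last ':' (Linux style) or '.' (AIX style).
        port_text = local_address.replace(":", ".").rsplit(".", 1)[-1]
        if not port_text.isdigit() or int(port_text) not in counts:
            continue
        row = counts[int(port_text)]
        if state == "ESTABLISHED":
            row["ESTABLISHED"] += 1
        elif state in OPENING_STATES:
            row["OPENING"] += 1
        elif state in CLOSING_STATES:
            row["CLOSING"] += 1
    return counts


sample = [
    "tcp 0 0 10.0.0.1:80 10.0.0.2:51234 ESTABLISHED",
    "tcp 0 0 10.0.0.1:443 10.0.0.3:40000 TIME_WAIT",
    "tcp4 0 0 10.0.0.1.443 10.0.0.4.40001 SYN_RCVD",
    "tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN",
]
print(summarize_ports(sample, [80, 443]))
```

Listening sockets and non-TCP lines are ignored, so only connection activity is counted.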
A recent customer was comparing performance between WAS and Tomcat. Tomcat was performing much better. The application used temporary files intensively. After investigating thread dumps, we found that the sampled WAS threads showed much more temporary file I/O activity than the Tomcat threads. Next, we discovered that Tomcat changes Java's default temporary directory to the "temp" subdirectory of the Tomcat installation using the -Djava.io.tmpdir system property.
It turned out that Tomcat happened to be installed on a faster disk than... [More]
A previous post covered an older way of gathering configuration for Visual Configuration Explorer (VCE) using the VCE Headless Runtime, exported from ISA. The newer and preferred approach is to use the IBM Support Assistant Lite data collector, and this will work with WAS 8 (I've also tested this on 7). It does not work on V8.5.
Download the ISA Lite script: https://www14.software.ibm.com/webapp/iwm/web/preLogin.do?source=swg-isalite&S_PKG=wasunixwin
Extract the ISA Lite script into the <WAS> root directory.
A common pattern in problem reports is that an application worked, the environment (WAS, etc.) was upgraded, and then the application stopped working. Many customers then say, "therefore, the product is the root cause." It is easy to show that this is a fallacy (neither necessary nor sufficient) with a real world example: A recent customer upgraded from WAS 6.1 to WAS 7 without changing the application and it started to throw various exceptions. It turned out that the performance improvements in WAS 7 and Java 6 exposed existing... [More]