You can never predict what the next problem to come across your desk will be; I suppose that's half the fun. On Christmas Eve it was “This Java Applet is hanging!”.
Now I know what you're all thinking, but yes, there are applets out there and yes, IBM Java can be used as a plugin in a browser, and yes, the methodology used to fix this problem is the same as you could use to fix “normal” Java application problems. So forget it's an “applet” and just follow the series of blog posts resulting in the actual fix.
The first thing to do with any problem is to assume nothing and get a proper problem description. Luckily, the report of the problem came from my boss's boss, who gives excellent problem descriptions.
What is a “hang”?
The term “hang” is used fairly loosely and generally describes how the user views the problem: in this case the applet didn't respond to any input from the user. But was it “hung” or was it “looping”? In my view the term “hang” means that one or more of the threads are not doing any work: they are waiting for something to happen, and that something never happens, thus they are “hung”. Whereas if one or more (typically one) threads are performing the same action over and over again without being interrupted, I would call that “looping”. The result will be the same to the user in that the application will not be responsive, but the methodology used to diagnose the problem will have subtle differences.
The user was running Linux (as I do), so to determine whether it was “hung” or “looping” we simply issued the “top” command to see whether any threads were using large amounts (90%+) of CPU. They weren't, so we concluded that it was “hung”.
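The same “busy or idle?” question can also be asked from inside a JVM. What follows is only a minimal sketch, not what we ran on the day: “top” looks at the process from the outside, whereas this code has to run in the JVM being examined. The class name HangOrLoop and its method names are made up for illustration; the 90% threshold is the one mentioned above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: samples per-thread CPU time twice and checks for a busy thread.
class HangOrLoop {

    /** CPU time in nanoseconds used by each live thread during the window, keyed by thread id. */
    static Map<Long, Long> cpuDeltaNanos(long windowMillis) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Long, Long> delta = new HashMap<>();
        if (!mx.isThreadCpuTimeSupported()) {
            return delta; // this JVM cannot measure per-thread CPU time
        }
        Map<Long, Long> before = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            before.put(id, mx.getThreadCpuTime(id)); // -1 if disabled or the thread has died
        }
        Thread.sleep(windowMillis);
        for (long id : mx.getAllThreadIds()) {
            Long start = before.get(id);
            long end = mx.getThreadCpuTime(id);
            if (start != null && start >= 0 && end >= 0) {
                delta.put(id, end - start);
            }
        }
        return delta;
    }

    /** True if any thread spent more than busyFraction of the window on the CPU. */
    static boolean anyThreadBusy(Map<Long, Long> deltaNanos, long windowMillis, double busyFraction) {
        long thresholdNanos = (long) (windowMillis * 1_000_000L * busyFraction);
        for (long used : deltaNanos.values()) {
            if (used > thresholdNanos) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        long windowMillis = 1000;
        boolean busy = anyThreadBusy(cpuDeltaNanos(windowMillis), windowMillis, 0.9);
        System.out.println(busy
                ? "A thread is burning CPU: if the application is unresponsive, suspect looping"
                : "No busy thread: if the application is unresponsive, suspect a hang");
    }
}
```

The result only means something once you already know the application is unresponsive; a healthy but idle program will also report no busy thread.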
Remember the motto:
A dump (or core file) is your friend
The best thing to do with a hang is to get a dump of the Java process to determine what the threads are doing or waiting on. There are a number of ways of doing this, but many involve altering the arguments (or command-line options) used to start the Java application. I wasn't sure how to do this for an applet running as a plugin in a browser, so I decided to use Health Center.
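To give a feel for the kind of information a dump provides about waiting threads, here is a rough sketch using the standard java.lang.management API. It is not a substitute for a real dump and it is not how Health Center works: it can only look at the threads of the JVM it runs in. The class name WaitingThreads and the output format are my own invention.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the threads of the current JVM that are waiting or blocked.
class WaitingThreads {

    /** One line per thread that is not runnable: name, state, lock and top stack frame. */
    static List<String> describe() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Only ask for monitor/synchronizer details if this JVM supports them.
        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(),
                mx.isSynchronizerUsageSupported());
        List<String> lines = new ArrayList<>();
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // defensive: skip threads that vanished during the dump
            }
            Thread.State state = info.getThreadState();
            if (state == Thread.State.WAITING
                    || state == Thread.State.TIMED_WAITING
                    || state == Thread.State.BLOCKED) {
                StackTraceElement[] stack = info.getStackTrace();
                String top = stack.length > 0 ? stack[0].toString() : "(no stack)";
                // getLockName() is null when the thread is not waiting on a lock object.
                lines.add(info.getThreadName() + " [" + state + "] on "
                        + info.getLockName() + " at " + top);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe()) {
            System.out.println(line);
        }
    }
}
```

In a genuinely hung application, the interesting lines are the threads that stay in the same state, on the same lock, every time you look.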
Health Center is a monitoring and diagnostic tool written by a team in IBM Hursley. It consists of a client that the user interacts with, which connects to an agent running inside the target application. In this case we hadn't started the agent inside the plugin inside the browser, but that's not a problem: we can connect to the JVM running the hung applet and tell it to start the agent.
First we have to locate where Java is installed on the machine (I'll leave that task up to you for homework!). Then we have to locate one of the jar files, which resides in the jre/lib/ext directory within the Java installation, and run the main class inside that jar. Here is the (slightly tailored) output of the command run on my machine:
brian@mymachine:~$ java -jar /opt/ibm/ibm-java-x86_64-70/jre/lib/ext/healthcenter.jar
A Health Center agent may be attached to one of the following Java Virtual Machines:
/data/eclipse-37//plugins/org.eclipse.equinox.launcher_1.2.0.v20110502.jar -os linux -ws gtk -arch x86_64 -showsplash -launcher /data/eclipse-37/eclipse -name Eclipse --launcher.library /data/eclipse-37//plugins/org.eclipse.equinox.launcher.gtk.linux.x86_64_1.1.100.v20110505/eclipse_1407.so -startup /data/eclipse-37//plugins/org.eclipse.equinox.launcher_1.2.0.v20110502.jar --launcher.overrideVmargs -exitdata 7198016 -clean -console -vm /opt/ibm/ibm-java-x86_64-70/jre/bin/javaw -vmargs -Xmx2G -Xdump:system:events=user -Xshareclasses:name=javashareclasses,cacheDir=/var/cache/java,groupAccess -Djava.library.path=/usr/lib/jni -Dtestroot.dir=/data/junittestdir -jar /data/eclipse-37//plugins/org.eclipse.equinox.launcher_1.2.0.v20110502.jar: ID=10998
sun.plugin2.main.client.PluginMain write_pipe_name=/tmp/.com.sun.deploy.net.socket.17871.5736686908294359243.AF_UNIX: ID=17895
Please select the VM (enter number between 1 and 4) in which to enable the Health Center agent, or blank line to exit.
It responds with a list of JVMs running on this machine which can start their own Health Center agent. Let's have a quick look at what I am running:
1. Of course I'm running Eclipse! (Actually this is "tailored" in that I'm running quite a few instances of Eclipse, but the blog post was getting too big!)
2. This is the JVM actually running this command.
3. This is the PID of the plugin inside Firefox which launches the applet, so although it's the "plugin" it is just the launcher and not the JVM that is actually running the code in question. This is a fairly common thing to happen, not just with applets and browsers; take a look at your Application Server, for instance.
4. This is the actual applet that is running, and where we want to connect Health Center to.
... so we enter the number 4 and it responds with:
Successfully enabled Health Center agent in VM: sun.plugin2.main.client.PluginMain write_pipe_name=/tmp/.com.sun.deploy.net.socket.17871.5736686908294359243.AF_UNIX
Health Center properties used by agent in target VM:
-- listing properties --
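For the curious, the list of attachable JVMs shown earlier can be approximated with the standard Attach API (com.sun.tools.attach). I am assuming, rather than asserting, that this resembles what the healthcenter.jar launcher does; the class name AttachableJvms and the numbered output format are mine. The API ships with a JDK rather than a plain JRE (the jdk.attach module on recent versions, tools.jar on older ones).

```java
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the JVMs on this machine that the Attach API can see.
class AttachableJvms {

    /** One numbered line per visible JVM: its display name (main class or jar plus arguments) and its ID. */
    static List<String> list() {
        List<String> lines = new ArrayList<>();
        int index = 1;
        for (VirtualMachineDescriptor vm : VirtualMachine.list()) {
            lines.add(index++ + ") " + vm.displayName() + ": ID=" + vm.id());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : list()) {
            System.out.println(line);
        }
    }
}
```

Which JVMs appear depends on the user you run as and on how each target JVM was started, so do not be surprised if the list differs from what another tool reports.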
In the next post I'll explain how we connect the Health Center client to this agent and trigger a dump.
Note: Actually, I'm lucky in that my boss's boss is a closet techie and can provide good problem descriptions.