Troubleshooting Java code on AIX: Data collection for AIX core dumps

Do you want to save some time? This article has instructions for troubleshooting Java™ code for the IBM® AIX® operating system. In this article, the IBM Java Support on AIX Technical Team provides short, simple, and complete instructions for collecting an AIX core file and other files for analyzing process exceptions with Java applications running on AIX. You'll also learn how to package and send data to IBM Support.

IBM Java Support on AIX Technical Team, Technical Support, IBM, Software Group

This document was created by previous team members Roger Leuckie, Dawn Patterson, Jason Cheng, and Rajeev Palanki. Now this document is maintained and updated by the IBM ztrans Java Support team. The update is based on ongoing AIX and JVM changes. If you have suggestions or comments, you can contact the team's technical lead at weiluo@us.ibm.com.



25 March 2008 (First published 31 March 2004)


Important notice

This article is provided as a convenience to clients. The information contained in this article is provided "as-is", with no warranty or support. The contents of this article are updated on a "best effort" basis, as time permits.

If you need, you can obtain detailed information on supported documentation and tools using the following references:

Introduction

This article gives instructions on how to collect the data needed to debug a core dump problem. If you complete the steps in this article before contacting the support center, you will already have the data in hand, which expedites a solution. The steps include setting up AIX to enable full core dumps, configuring Java to disable JVM signal handling, and collecting the core file and its associated libraries.

Having the right expectation

There are several reasons why the Java process might abort, including:

  • Memory access problems
  • Improperly compiled native code
  • Invalid configuration
  • Bad code
  • Memory constraints

These might be due to JVM code (JIT), AIX (library), or other native code.

Collecting information might require reproducing the issue several times, so that data can be collected and possible causes eliminated until the issue is identified and resolved. A Javacore file alone might be sufficient, or you might also need to collect AIX core files and other information. If the issue cannot be reproduced in a test lab or development environment, you might need to collect data from a production system. If data can't be collected from a system demonstrating the issue, the support teams might not be able to resolve the issue.

Collecting data in production environments can result in down time for the application, which can impact revenue, stability, and client perception. However, the support teams make every effort to keep the iterations for collecting information to a minimum and try to resolve any issues as soon as possible.

Service notice

Read the AIX Java service page, IBM Developer kits for AIX, Java technology edition (see Resources), for important support information about Java technology on AIX. In particular, note the "End of Service" dates on the main download page, and carefully read the terms and conditions for Java support on AIX below the download table.

Javacore file versus AIX core file

A Javacore file is a text representation of the Java application at an instant in time. The Javacore file is created by the Java process when:

  • A user runs the command kill -3 on the Java process.
  • The process aborts or terminates due to a fatal error/exception.

For specific information on the Javacore file, see the IBM JVM Diagnostic Guide in the Resources section.

For some situations, providing a collection of Javacore files generated from the same process can be helpful for understanding how the application is behaving. When a process is hung or has reached an unstable condition, the process might be unable to create the Javacore file.
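As a minimal sketch of collecting several Javacore files from the same process, the function below sends kill -3 to a PID a few times with a pause between requests. The function name take_javacores and the default count (3) and interval (30 seconds) are illustrative assumptions, not values recommended by the support team.

```shell
#!/bin/sh
# take_javacores PID [COUNT] [INTERVAL_SECONDS]
# Hypothetical helper: requests COUNT Javacore files from one Java process
# by sending signal 3 (SIGQUIT), pausing INTERVAL_SECONDS between requests.
take_javacores() {
    pid=$1
    count=${2:-3}       # assumed default, adjust as needed
    interval=${3:-30}   # assumed default, adjust as needed
    i=1
    while [ "$i" -le "$count" ]; do
        # Stop if the signal cannot be delivered (for example, the PID is gone).
        kill -3 "$pid" || return 1
        echo "javacore $i of $count requested from PID $pid"
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}

# Example (the PID is a placeholder):
#   take_javacores 123456 3 30
```

This only works while JVM signal handling is still enabled; the settings described later in this article prevent kill -3 from producing a Javacore file.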

An AIX core file is a binary representation of the process in memory at some instant in time. From this core file, you can usually get the information that is reported in a Javacore file, plus additional process information that the Javacore file does not report. For this reason, in most cases the support teams request an AIX core file rather than a Javacore file.

Setting up the operating system

By default, the core file is created in the working directory of the process being dumped. The user can change this default location. To display the current location for core files, run the following command:

syscorepath -g

If you're unfamiliar with the commands in this article, see the AIX Documentation library for information and syntax in the Resources section.

When running the commands, replace placeholder text (such as user_id_running_application or java_pid) with the appropriate value, or you'll get errors and unexpected results.

To enable the AIX operating system for generating full or complete core files, follow these steps:

  1. Set the user process limits to unlimited by running this command as the root user:
    chuser fsize=-1 data=-1 core=-1 user_id_running_application

    In this scenario, user_id_running_application is the name of the user running the Java application. For example:
    chuser fsize=-1 data=-1 core=-1 root

    This changes the file size (fsize), core file size (core), and memory size (data) limits for the user to unlimited. These settings remove some resource consumption restraints for the user ID, which carries some risk, so restore the original limits once the information has been collected and the problem is resolved.

    After making the changes, remove references to the ulimit command in the login files, user profiles, and any startup scripts. You must log in again as the user before starting the application so that the new process picks up the change. Without a re-login, the process might not read the new limits, even if the process has been restarted. Also, an application that is started from the /etc/inittab file or by the cron daemon might not read the new limits. In that case, you might have to set the limit with the ulimit command in the process startup script, for example ulimit -d unlimited to change the data limit to unlimited.

    Verify the change by running the ulimit -a command prior to starting the Java application.

  2. Enable full core dumps for the system by running the following command as root user:
    chdev -l sys0 -a fullcore=true

    This change does not require a system reboot. If you are familiar with the SMIT utility, the setting can be changed by running the smitty chgsys command, then setting the value for Enable full CORE dumps to true.

  3. The syscorepath utility, provided with AIX Version 5.2 or higher, can be used to specify a single system-wide directory where the core files of all processes are saved. The directory should have read and write privileges for all users on the system. If a user does not have permission to write in the directory, a core file will not be created. The core files generated in this system-wide directory are given unique names based on the process ID and time, such as core.pid.MMddhhmmss, where pid is the process ID, MM is the month, dd is the day of the month, hh is the hour in 24-hour format, mm is minutes, and ss is seconds. However, we do not recommend this method. The syntax for this command is:
    syscorepath -p alternate_directory

  4. Verify the directory, file ownership, and permissions for the core file by using:
    aclget directory_or_file

    This command verifies that the user running the application has authorization to write to the destination directory. If there is any doubt, run the following command while logged in as the user running the application:
    touch directory/core

    Use the chown or chmod command to modify ownership or permissions, respectively, or run smitty user to modify characteristics of the user account. Modifications to the user account require a re-login as that user.

  5. Ensure there is adequate disk space for saving the core file. The core file can be as large as the size of the process in memory. The RSS (process size) field in the ps command output gives an approximate size for the core file. For example:
    ps avwwg java_pid

    If you need additional space, free space by deleting unwanted or older files, or increase the size of the destination file system.

  6. A set of AIX commands or utilities will be used to collect information. The following AIX filesets must be installed before continuing:
    File                    Fileset
    ------------------------------------------------------------
    /usr/bin/uudecode       bos.net.uucp
    /usr/bin/syscorepath    bos.rte.control
    /usr/sbin/snapcore      bos.rte.serv_aid (also /usr/bin/truss)

    To ensure all filesets are properly installed, run the command:
    lslpp -l fileset_name

    Any missing filesets should be installed from the AIX base installation media and then upgraded to the most current level using IBM Fix Central (see Resources).
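The checks in the steps above can be gathered into a small preflight script. The following is a sketch only: the helper names are hypothetical, the lsattr parsing assumes the attribute value is the second whitespace-separated field of the output, and the disk-space test simply compares two numbers in KB that you supply.

```shell
#!/bin/sh
# Preflight sketch for the setup steps. All helper names are hypothetical.

# report_limit NAME VALUE: say whether a ulimit value is "unlimited".
report_limit() {
    if [ "$2" = "unlimited" ]; then
        echo "$1: ok"
    else
        echo "$1: $2 (expected unlimited)"
    fi
}

# Reads `lsattr -El sys0 -a fullcore` output on stdin and succeeds only if
# the fullcore value is "true" (assumes the value is the second field).
fullcore_enabled() {
    awk '$1 == "fullcore" { found = ($2 == "true") } END { exit !found }'
}

# has_room_for_core RSS_KB FREE_KB: succeed if free space exceeds the RSS.
# RSS is only an approximation of the core file size, so leave a margin.
has_room_for_core() {
    [ "$2" -gt "$1" ]
}

# check_filesets FILESET...: one line per fileset, using lslpp -l.
check_filesets() {
    for fs in "$@"; do
        if lslpp -l "$fs" > /dev/null 2>&1; then
            echo "installed: $fs"
        else
            echo "MISSING: $fs"
        fi
    done
}

# Run as the user that starts the Java application.
report_limit core "$(ulimit -c)"
report_limit fsize "$(ulimit -f)"
report_limit data "$(ulimit -d)"
lsattr -El sys0 -a fullcore 2> /dev/null | fullcore_enabled ||
    echo "fullcore: not enabled (or lsattr unavailable)"
check_filesets bos.net.uucp bos.rte.control bos.rte.serv_aid
```

To use has_room_for_core, pass it the RSS value from the ps output in step 5 and the free space, in KB, of the file system that will receive the core file.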

Disabling Java signal handling

As discussed in the Javacore versus AIX core section, Javacore files are not always the best tool for debugging a hang situation. A binary AIX core file might provide more useful information. To get a good AIX core file, the JVM has to be set up so it does not create the Javacore when it receives a signal that is sent to the process.

If the signal handlers are not disabled, the core file might show the process inside signal handling code as its current state, which can hide the real problem. If the application has signal handlers that handle SIGILL, SIGFPE, SIGBUS, and SIGSEGV, those signal handlers should be disabled. Changes must be made in the environment running the application, before the application is started.

For situations where the application is started by another process, such as WebSphere®, setting these environment variables might affect all Java processes. In these cases, consult that application's documentation for enabling environment settings specific to the application:

  1. Disable JVM signal handling.

    Java142 and Java131:

    These environment variables should be set before the application is restarted.

    export DISABLE_JAVADUMP=true
    export IBM_NOSIGHANDLER=true

    Java5:

    You do not need to set the two previous environment variables for Java5.

    Use this Java command line option:

    -Xdump:system:none -Xdump:tool:exec="/usr/bin/gencore %pid core.%Y%m%d.%H%M%S.dmp",events=gpf

    These options let Java5 run gencore when a gpf event occurs, which produces a useful core file that captures all library activity.

    The application must be restarted for this change to become active. Note that these settings prevent Javacore and heapdump files from being created when you run kill -3, or when any other signal is raised that causes the process to terminate.

  2. Disable application signal handling.

    If the application handles signals such as SIGQUIT, SIGILL, SIGFPE, SIGBUS, SIGABRT, SIGSYS, and SIGSEGV, you might need to disable the application's signal handlers.

    For example, IBM MQ handles SIGILL, SIGFPE, SIGBUS, and SIGSEGV by default. To get a good core file, you might need to disable those signal handlers by setting the following environment variable:

    export MQS_NO_SYNC_SIGNAL_HANDLING=1

    Note: Please check with MQ support to verify the settings, since MQ might change from time to time.

    It is important to know whether your application has its own signal handling, and how to disable it so that a core file can be generated.

  3. Find out where to set the environment variables.

    The environment can be configured in a number of locations, depending on need:

    • /etc/profile
    • /etc/csh.login
    • $HOME/.profile
    • $HOME/.cshrc
    • $HOME/.kshrc
    • The application startup and configuration scripts

    However, we do not recommend adding these settings to the /etc/environment file. Settings made in the application startup and configuration scripts override all others.
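One way to keep these settings local to a single application is to put them in its startup script. The sketch below assumes a hypothetical wrapper script; the function names and the JAVA_CMD variable are illustrative, while the environment variables and -Xdump options are the ones listed in step 1.

```shell
#!/bin/sh
# Hypothetical startup wrapper. Function names and JAVA_CMD are illustrative.

# Java142 and Java131: export the two variables so that only processes
# started from this script inherit them.
set_legacy_jvm_core_env() {
    DISABLE_JAVADUMP=true
    IBM_NOSIGHANDLER=true
    export DISABLE_JAVADUMP IBM_NOSIGHANDLER
}

# Java5: pass the -Xdump options ahead of the application's own arguments.
# The double quotes keep the gencore command line inside one argument.
run_java5_with_gencore() {
    "${JAVA_CMD:-java}" \
        -Xdump:system:none \
        -Xdump:tool:exec="/usr/bin/gencore %pid core.%Y%m%d.%H%M%S.dmp",events=gpf \
        "$@"
}

# Examples (the class name is a placeholder):
#   set_legacy_jvm_core_env && exec java com.example.Main
#   run_java5_with_gencore com.example.Main
```

Because the variables are exported inside the script rather than in /etc/environment or a login profile, other Java processes on the system keep their normal signal handling.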

Collecting data

When a problem situation occurs, the objective is to collect as much information as possible that will either identify the cause of the issue or provide a direction in understanding the cause. The issue could be the operating system (kernel/library), the JVM (or JIT), application Java Native Interface (JNI) native code, or third-party JNI code. This section includes some commands that collect data from each area. Most of the commands here must be executed as the root user.

Collect the operating system settings:

  1. After the core has been created, run the following commands:
    errpt -a > errpt.out
    lslpp -lc > lslpp.out
    instfix -i > instfix.out
    bootinfo -K > bootinfo.out
    lsattr -El sys0 > lsattr.out
    lsps -s > lsps.out
  2. Package the output files by running the following commands:
    tar -cf - *.out | compress -c > sysinfo.tar.Z
  3. Collect the core file and associated libraries.

    If you are unsure where the core file is located, run errpt -a | pg and find the entry beginning with LABEL: CORE_DUMP. The section called CORE FILE NAME in that entry usually shows the location of the core file.

    Java131 and Java14

    mv core core.001
    javaLibsGrabber.sh core.001
    compress core.001

    Java5

    mv core core.001
    java5_install_directory/jre/bin/jextract core.001

    Replace java5_install_directory with the Java5 installation directory. For example:

    /usr/java5/jre/bin/jextract core.001

    Download the javaLibsGrabber.sh utility (see Resources).

    To check that javaLibsGrabber.sh collected the libraries, list the contents of the archive it created:

    uncompress -c < core-libs.tar.Z | tar -tvf -

    If there is an error when you run javaLibsGrabber.sh or jextract, use snapcore to collect the libraries:

    snapcore -d save_directory core.001 fullpath_executable

    For example:

    snapcore -d /tmp/savedir core.001 /usr/java14/jre/bin/java

    This creates an archive (snapcore_pid.pax.Z) in the directory /tmp/savedir.
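Steps 1 and 2 above can be scripted so that one failing command does not stop the rest of the collection. This is a rough sketch: the helper names collect and collect_sysinfo are hypothetical, and the script assumes it is run as root in an otherwise empty working directory so that *.out matches only the files it creates.

```shell
#!/bin/sh
# collect OUTPUT_FILE COMMAND [ARGS...]
# Hypothetical helper: runs a command, saves stdout and stderr to
# OUTPUT_FILE, and reports the result instead of aborting on failure.
collect() {
    out=$1
    shift
    if "$@" > "$out" 2>&1; then
        echo "ok: $out"
    else
        echo "FAILED: $* (see $out)"
    fi
}

# Runs the operating system commands from step 1, then packages the
# output files as in step 2.
collect_sysinfo() {
    collect errpt.out    errpt -a
    collect lslpp.out    lslpp -lc
    collect instfix.out  instfix -i
    collect bootinfo.out bootinfo -K
    collect lsattr.out   lsattr -El sys0
    collect lsps.out     lsps -s
    tar -cf - *.out | compress -c > sysinfo.tar.Z
}

# Example: run after the core file has been created.
#   collect_sysinfo
```

Any command that fails still leaves its error text in the corresponding .out file, which can itself be useful to the support team.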

Packaging and sending data to IBM Support

Use the steps below to package and send data to your IBM support personnel:

  1. TAR (archive) the files using the filename xxxxx.byyy.czzz.#.tar. For example:
    tar -cf xxxxx.byyy.czzz.#.tar sysinfo.tar.Z core.001.Z core-libs.tar.Z \
       optional-files

    or
    tar -cf xxxxx.byyy.czzz.#.tar sysinfo.tar.Z snapcore_pid.pax.Z \
       optional-files

    Note the size of the file. In the file name:
    • xxxxx is the PMR number.
    • yyy is the branch code.
    • zzz is the country code.
    • # is a sequence number or the date that's required to ensure that each file placed on the testcase server is unique.

    Before sending the file, verify that each archive file is valid and that the following is included:

    • sysinfo.tar.Z
    • core-libs.tar.Z or snapcore_pid.pax.Z
    • core.001.Z
  2. The files should be sent to the IBM testcase server using a unique filename for each file uploaded. To ensure timely response from the AIX support teams, it is important that you follow the instructions below. If you have trouble connecting or sending data to testcase servers, check the firewall and proxy settings within your network.
    ftp testcase.boulder.ibm.com
    login: anonymous
    password: user@host.com
    > cd /toibm/aix
    > bin
    > put xxxxx.byyy.czzz.#.tar
    > quit

You must contact IBM to get a PMR number for xxxxx.byyy.czzz by sending an e-mail or calling the IBM Support hotline.
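The naming convention and the archive check from step 1 can be sketched as two small helpers. The function names are hypothetical, and the PMR, branch, and country values in the example are made-up placeholders; use the values IBM Support gives you.

```shell
#!/bin/sh
# testcase_name PMR BRANCH COUNTRY SEQUENCE
# Hypothetical helper: prints a file name in the xxxxx.byyy.czzz.#.tar form.
testcase_name() {
    echo "$1.b$2.c$3.$4.tar"
}

# archive_is_readable FILE
# Succeeds only if tar can list the archive's table of contents.
archive_is_readable() {
    tar -tf "$1" > /dev/null 2>&1
}

# Example with placeholder values, using today's date as the sequence:
#   name=$(testcase_name 12345 678 000 "$(date +%Y%m%d)")
#   tar -cf "$name" sysinfo.tar.Z core.001.Z core-libs.tar.Z
#   archive_is_readable "$name" && ls -l "$name"
```

Listing the archive before the upload catches a truncated or empty tar file early, and ls -l gives the file size noted in step 1.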

Resources

Learn

Get products and technologies

Discuss
