IBM Support

Crash on Linux produces no core or truncated core

Troubleshooting


Problem

This document outlines what needs to be done to ensure that a full core file is produced on Linux if WebSphere Application Server crashes.

Resolving The Problem

 

WebSphere Application Server should produce a system core dump when it crashes, when a dump is triggered manually, and in some OutOfMemory situations.  A complete system core dump is needed to diagnose crashes, some OutOfMemory issues, and occasionally other problems.  Several conditions can cause the core dump to be truncated and unusable.  You may need to have your Linux OS Administrator make the changes described below.

NOTE: There is a different technote that discusses issues where the process does not record a crash event.


1. SET ULIMITS
See Also: Guidelines for Setting Ulimits

The ulimits for core (-c) and fsize (-f) need to be tuned so that the hard and soft limits are set to unlimited. This may require root access to change.
 

Global settings are set in the file /etc/security/limits.conf.

The format for setting each limit is as follows:
<domain> <type> <item> <value>

<domain> controls which users or groups will have these limits

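<type> is either the string "soft" or the string "hard".
A hyphen "-" can be used instead, which sets both the soft and hard limits.

<item> is the limit being set, such as core or fsize

<value> is the value for that limit, such as unlimited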
Example:  wasadmin    soft   core  unlimited

** NOTE: If the application server is associated with a node agent, BOTH the node agent and the application server MUST be restarted to pick up the change.  If the installation has no node agent, only the application server must be restarted.
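
To confirm that the limits took effect, it can help to look at both the current shell and the running process. The following is a minimal sketch, assuming a bash-compatible shell and a Linux /proc file system. The function name show_core_limits is hypothetical and used only for illustration.

```shell
# Print the core and file-size limits that the kernel enforces for a process.
# Usage: show_core_limits PID
show_core_limits() {
    grep -E '^Max (core )?file size' "/proc/$1/limits"
}

# Limits of the current shell: soft (-S) and hard (-H) values
# for core (-c) and fsize (-f).
echo "core  soft=$(ulimit -Sc) hard=$(ulimit -Hc)"
echo "fsize soft=$(ulimit -Sf) hard=$(ulimit -Hf)"

# Limits of this shell as seen by the kernel ($$ is the shell's own PID).
# For WebSphere, substitute the PID of the application server JVM.
show_core_limits "$$"
```

A running process keeps the limits it started with, which is why checking /proc/PID/limits for the JVM after the restart is more reliable than checking a fresh login shell.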

2. DISK SPACE
Check the partitions where WebSphere Application Server resides and make sure there is enough space for the dump to be written. If the core could not be written, an error message usually appears in native_stderr.log.

To check all of your partitions, execute this command (the -k is for kilobytes):

df -k
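
The same check can be scripted for a single directory. This is a rough sketch, not an official sizing method. The function names avail_kb and has_room, the DUMP_DIR location, and the NEEDED_KB figure are all assumptions chosen for illustration. Substitute the directory where dumps are written and a size that reflects your JVM.

```shell
# Print the free space, in kilobytes, on the file system that holds a directory.
# -P forces the POSIX one-line-per-file-system layout so column 4 is reliable.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if the directory's file system has at least the given number of KB free.
# Usage: has_room DIR NEEDED_KB
has_room() {
    [ "$(avail_kb "$1")" -ge "$2" ]
}

# Hypothetical values: adjust both for your installation.
DUMP_DIR="${DUMP_DIR:-/tmp}"
NEEDED_KB="${NEEDED_KB:-4194304}"

if has_room "$DUMP_DIR" "$NEEDED_KB"; then
    echo "${DUMP_DIR}: enough free space ($(avail_kb "$DUMP_DIR") KB)"
else
    echo "${DUMP_DIR}: only $(avail_kb "$DUMP_DIR") KB free, ${NEEDED_KB} KB wanted"
fi
```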



3. CORE PATTERN CONFIGURATION
In some cases (this has been seen with -Xdisableexplicitgc configured), the core_pattern setting contains extra options that may need to be removed.
Examples:
 - /proc/sys/kernel/core_pattern = |/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e
 - /proc/sys/kernel/core_pattern = |/tmp/cores/core.%e.%p.%h.%t
 
We know that settings like this will work:
/proc/sys/kernel/core_pattern = core
** "core" with no other options
To change it, have the Linux OS administrator set kernel.core_pattern in /etc/sysctl.conf, save the file, then run the command: sysctl -p
** With either the ulimit or core_pattern changes, if there are problems, please contact your Linux OS Support.  These are operating system settings, so your Linux OS Support will be able to assist you further.
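
Before changing anything, the current setting can be inspected. The sketch below reads /proc/sys/kernel/core_pattern and reports whether the value starts with a pipe character, as in the examples above, or is a plain file name pattern. The function name classify_core_pattern is hypothetical.

```shell
# Classify a core_pattern value.
# "piped" means the value starts with "|", so the kernel hands the dump
# to a helper program instead of writing the file itself.
classify_core_pattern() {
    case "$1" in
        '|'*) echo "piped" ;;
        *)    echo "plain" ;;
    esac
}

# Show the setting currently in effect on this system.
if [ -r /proc/sys/kernel/core_pattern ]; then
    pattern="$(cat /proc/sys/kernel/core_pattern)"
    echo "core_pattern: ${pattern} ($(classify_core_pattern "$pattern"))"
fi

# The matching /etc/sysctl.conf entry for the known-good value would be:
#   kernel.core_pattern = core
```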
==================================================
** Stop after step 3
Only do steps below if specifically instructed by IBM Support

4. DISABLE SIGNAL HANDLERS
To force the operating system to handle all signals sent to the JVM process, you can disable all JVM signal handlers.

For IBM SDK 5.0 and later, set this JVM argument:
-Xrs

NOTE: On SDK 6.0, to prevent unintentional crashes due to SIGTRAP, clear the shared class cache by executing <WAS_HOME>/bin/clearClassCache.sh



Additional Questions:
What happens if I do not have write permission in the profile's root directory, or the directory I am redirecting javacores, heapdumps, and system core files to?

The files will fail to be written to that directory. Check for an error in native_stderr.log, because the JVM may try to write the dump to an alternate folder (such as /tmp).



Why are core files truncated at 2GB even with all ulimit settings set to unlimited?

There is a limitation on 32-bit processes which can be worked around if you enable large file support.
Using a 64-bit version of WebSphere Application Server also resolves this limitation, although if you run out of disk space the dump can still be truncated.



Can I test my configuration to see if a core can be generated?

Yes.  The preferred way is to generate the core via the Admin Console.  See "Collecting Java dumps and core files using the administrative console"
https://www.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.multiplatform.doc/ae/ttrb_dumpcore.html
Or you can simulate a crash by sending signal 6 (SIGABRT) or signal 11 (SIGSEGV) to the JVM process. This will terminate the process.

kill -6 PID
  or
kill -11 PID


An alternative is to use the gcore command. This produces a core file and keeps the process running.

gcore PID
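
If you want to exercise the operating system settings without touching the JVM, one option is to crash a throwaway process instead. This is only a sketch under assumptions. The function name crash_test is hypothetical, and it should be run as the same user that runs WebSphere Application Server. It tests the limits of the shell it runs in, not those of an already running JVM.

```shell
# Crash a throwaway "sleep" process with signal 11 inside a work directory
# and print the exit status that the shell reports for it.
# Usage: crash_test WORKDIR
crash_test() {
    ( cd "$1" && exec sleep 60 ) >/dev/null 2>&1 &
    pid=$!
    sleep 1                      # give the child time to change directory
    kill -11 "$pid"
    status=0
    wait "$pid" 2>/dev/null || status=$?
    echo "$status"
}

workdir="$(mktemp -d)"
# A process killed by signal 11 is reported as 128 + 11 = 139.
echo "exit status: $(crash_test "$workdir")"
# With core_pattern set to "core", a core file should appear here.
ls -l "$workdir"
```

An empty directory listing suggests that the core limit or the core_pattern setting is still preventing the dump from being written where expected.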


Document Information

Modified date:
30 November 2020

UID

swg21115658