IBM Support

Check and manage core files the way needed by IBM Support

How To


Summary

This technote explains how to verify that an existing process core file is valid, how to configure the system and environment to generate valid and complete process core files, and how to package and upload the process core file and required data to IBM.
Read and follow all of the instructions in this document. Skipping instructions may delay the resolution of the reported situation.

Objective

A process core file is created when certain errors occur during the runtime of a process. Errors such as memory address violations, illegal instructions, bus errors, and user-generated quit signals commonly cause a core dump. The resulting process core file contains a memory image of the terminated process. IBM AIX support can use it to analyze the root cause of the process termination.

This technote provides you with the steps needed to check and provide a core file that is useful for problem determination.

Steps

1. Locating the core

By default, the core file is created in the working directory of the process being core dumped. If required, you can change the default location for core files (for details, see the technote "AIX Core Dump Facility"; its link is provided below).

An error report entry is created for each process dumping a core:

# errpt
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
A924A5FC   0717133118 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED

# errpt -aj A924A5FC
---------------------------------------------------------------------------
LABEL:          CORE_DUMP
IDENTIFIER:     A924A5FC
Date/Time:       Tue Jul 17 13:31:01 2018
...
USER'S PROCESS ID:
              14483744
...
CORE FILE NAME
/tmp/pmr/core
PROGRAM NAME
lauf
...

In this example, the core file is /tmp/pmr/core.
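The lookup above can be scripted. The following is a minimal sketch, assuming the detailed error report prints the path on the line directly after the CORE FILE NAME label, as in the sample output. The function name core_paths_from_errpt is hypothetical.

```shell
# Print the line that follows each "CORE FILE NAME" label in
# detailed errpt output read from standard input.
# Assumption: the path is on the very next line, as in the sample above.
core_paths_from_errpt() {
    awk '
        found { sub(/^[ \t]+/, ""); print; found = 0 }
        /^[ \t]*CORE FILE NAME/ { found = 1 }
    '
}

# Hypothetical usage on AIX:
#   errpt -aj A924A5FC | core_paths_from_errpt
```

If several processes dumped core, the function prints one path per error report entry.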

2. Identifying whether the core is fullcore and/or truncated

Usage of the file command:

# file /tmp/pmr/core
/tmp/pmr/core: AIX core file fulldump 32-bit, lauf

-> this file is fullcore, as it is listed as fulldump

# file /tmp/pmr/core
/tmp/pmr/core: AIX core file 32-bit, lauf

-> this file is not fullcore, as it is not listed as fulldump

The file command does not show if a core file is truncated or not.
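The fulldump check can be automated by matching the text that the file command prints. This is a sketch only, based on the two sample lines above. The function name classify_file_output and its result strings are hypothetical, and it cannot detect truncation any more than the file command itself can.

```shell
# Classify one line of `file` output for a core file.
# Prints "fullcore", "not-fullcore", or "not-a-core".
classify_file_output() {
    case "$1" in
        *"AIX core file fulldump"*) echo "fullcore" ;;
        *"AIX core file"*)          echo "not-fullcore" ;;
        *)                          echo "not-a-core" ;;
    esac
}

# Hypothetical usage on AIX:
#   classify_file_output "$(file /tmp/pmr/core)"
```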

Usage of the dbx command:

# type lauf
lauf is /wlbin/lauf

# dbx /wlbin/lauf /tmp/pmr/core
Type 'help' for help.
warning: The core file is truncated. You may need to increase the ulimit
for file and coredump, or free some space on the filesystem.
[using memory image in /tmp/pmr/core]

-> this core file is truncated

# dbx /wlbin/lauf /tmp/pmr/core
Type 'help' for help.
warning: The core file is not a fullcore. Some info may
not be available.
[using memory image in /tmp/pmr/core]
reading symbolic information ...

-> the core file is not truncated, but also not fullcore.

# dbx /wlbin/lauf /tmp/pmr/core
Type 'help' for help.
[using memory image in /tmp/pmr/core]
reading symbolic information ...

-> this core file is fullcore and not truncated
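The three dbx outcomes above can be told apart by the warning text in the startup messages. The following sketch reads captured dbx output from standard input and relies only on the message strings shown in the samples. The function name classify_dbx_banner is hypothetical, and capturing the output non-interactively (for example, by piping a quit subcommand into dbx) is an assumption, not something this technote prescribes.

```shell
# Classify captured dbx startup output read from standard input.
# Truncation is checked first because a truncated core is generally unusable.
classify_dbx_banner() {
    banner=$(cat)
    case "$banner" in
        *"The core file is truncated"*)      echo "truncated" ;;
        *"The core file is not a fullcore"*) echo "not-fullcore" ;;
        *"[using memory image in"*)          echo "fullcore" ;;
        *)                                   echo "unknown" ;;
    esac
}

# Hypothetical usage on AIX (assumes dbx accepts subcommands on stdin):
#   echo quit | dbx /wlbin/lauf /tmp/pmr/core 2>&1 | classify_dbx_banner
```

A result of "unknown" means that none of the expected messages appeared, for example because dbx could not load the core file at all.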

3. Deciding if a core file is ready to upload to IBM and upload instructions

  * Do not upload truncated core files unless IBM Support explicitly asks for them. The data is generally not usable.

  * Always try to upload fullcore core files.

If the problem shows up regularly, enable fullcore and wait for the problem to show up again. If possible, trigger the core dump. Then upload the fullcore core file.

If the problem does not show up regularly and you have no fullcore core file, you may upload the core file you have. Enable fullcore anyway, so that a fullcore core file is available if the problem occurs again.

As soon as you have identified a valid core file, use the snapcore command to collect all necessary data:

# snapcore -r

This will remove old snapcore command output from the /tmp/snapcore directory.

# snapcore /<path>/core

The snapcore output will be placed in /tmp/snapcore. The name of the snapcore file looks like snapcore_<pidnumber>.pax.

Rename the file to <casenumber>.snapcore_<pidnumber>.pax and upload to IBM.

For upload details see link "How to upload data to IBM support" provided below.
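The renaming step above can be sketched as a small helper that builds the new file name. The function name prefixed_snapcore_name and the case number TS001234567 are hypothetical placeholders.

```shell
# Build the upload name <casenumber>.<original file name> in the same directory.
# $1: case number, $2: path to the snapcore .pax file
prefixed_snapcore_name() {
    case_number=$1
    pax_path=$2
    dir=$(dirname "$pax_path")
    base=$(basename "$pax_path")
    echo "$dir/$case_number.$base"
}

# Hypothetical usage:
#   f=/tmp/snapcore/snapcore_14483744.pax
#   mv "$f" "$(prefixed_snapcore_name TS001234567 "$f")"
```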

Note: In addition to the core file, the libraries used by the process are required. If you do not follow these instructions and do not provide the libraries, support cannot debug the core file. This delays the identification of the issue and the resolution of the situation.

4. Enabling fullcore on the system

Verify the current state with the following command:

# lsattr -El sys0 -a fullcore -F "attribute value"

An output of true means that fullcore is enabled.
An output of false means that fullcore is disabled.

Run the following command as root user to enable fullcore:

# chdev -l sys0 -a fullcore=true

This change does not require a system reboot, and it persists across reboots.

This setting can also be changed by running the smitty chgsys command and setting the value of Enable full CORE dumps to true.

5. Disk space required to create a core file

Run the following command to check the free space on the file system where the core file will be stored:

# df -k <directory_of_core>

The core file can be as large as the size of the process in memory. The RSS (process size) field of the ps command output gives the approximate size of the core file (in 1 KB units).

# ps vw <pid>
e.g.:
# ps vw 7602670
      PID    TTY STAT  TIME PGIN  SIZE   RSS   LIM  TSIZ   TRS %CPU %MEM COMMAND
  7602670      - A     2:07   24  1328  1552 32768   448   224  0.0  0.0 /usr/bin/topasrec  -L -s 300
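The space check can be sketched as follows: read the RSS value from the ps output and compare it with the free space reported by df. The function names and the 20% headroom are hypothetical choices, not IBM guidance. The sketch assumes RSS is the seventh column of the second line, as in the sample above.

```shell
# Print the RSS column (7th field, in KB) from `ps vw <pid>` output on stdin.
rss_kb_from_ps_v() {
    awk 'NR == 2 { print $7 }'
}

# Succeed if free_kb covers rss_kb plus 20% headroom (hypothetical margin).
# $1: RSS in KB, $2: free space in KB
has_room_for_core() {
    rss_kb=$1
    free_kb=$2
    needed_kb=$(( $rss_kb + $rss_kb / 5 ))
    [ "$free_kb" -ge "$needed_kb" ]
}

# Hypothetical usage on AIX (assumes "Free" is the 3rd column of df -k):
#   rss=$(ps vw 7602670 | rss_kb_from_ps_v)
#   free=$(df -k /tmp/pmr | awk 'NR == 2 { print $3 }')
#   has_room_for_core "$rss" "$free" || echo "not enough space for a core file"
```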

6. Verifying the directory, file ownership, and permissions for the core file

e.g.:
# ls -ld /tmp/pmr
drwxrwxr-x    2 wlu04     adgr4          256 Jul 17 13:55 /tmp/pmr

The owner of the process needs write permission for that directory.

In this example, the process owner must be user wlu04 or a member of group adgr4. Use the chmod or chown command to modify the permissions or ownership.
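Instead of reading the permission bits by hand, the write test can be run directly as the user that owns the process. This is a minimal sketch; the function name can_write_core_here is hypothetical. Note that the test reflects the user running it, so a check run as root does not say anything about another user.

```shell
# Succeed if the given path is a directory that the current user can write to.
can_write_core_here() {
    dir=$1
    [ -d "$dir" ] && [ -w "$dir" ]
}

# Hypothetical usage, run as the process owner (for example, wlu04):
#   can_write_core_here /tmp/pmr || echo "cannot write core files to /tmp/pmr"
```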

7. Ensure that ulimit -c (core) and ulimit -f (file) are set to unlimited

To check the current settings run the following commands:

# ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         131072
stack(kbytes)        32768
memory(kbytes)       32768
coredump(blocks)     2097151
nofiles(descriptors) 2000
threads(per process) unlimited
processes(per user)  unlimited
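To look at only the two limits that matter for core files, the individual values can be queried and compared against unlimited. The following is a sketch; the function name report_limit and its message format are hypothetical.

```shell
# Report whether a limit value equals "unlimited".
# $1: label for the limit, $2: value as printed by ulimit
report_limit() {
    name=$1
    value=$2
    if [ "$value" = "unlimited" ]; then
        echo "$name: ok"
    else
        echo "$name: limited to $value blocks"
    fi
}

# Check the limits of the current session.
report_limit coredump "$(ulimit -c)"
report_limit file "$(ulimit -f)"
```

With the sample settings above, the coredump line would report a limit of 2097151 blocks, and the file line would report ok.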

To set the core and file size limits to unlimited for the current session only, run the following commands:

# ulimit -c unlimited
# ulimit -f unlimited

To set "ulimit -c" and "ulimit -f" permanently to unlimited for a specific user, run the following command as the root user:

# chuser fsize=-1 core=-1 <user>

For example:
# chuser fsize=-1 core=-1 root

You can also set the values in the /etc/security/limits file by modifying the stanzas for the respective users, or change the default value for all users in the "default:" stanza.

Note: Changing the limit in /etc/security/limits does not affect those processes that are currently running. A new login reflects the changes.

If necessary, revert the settings to their original values when the investigation is complete.


Document Information

Modified date:
12 November 2019

UID

ibm10716719