Core dump

The Topology Services daemon generates a core dump. The dump contains the information normally saved in a core dump: the user-space data segments of the Topology Services daemon.

The core dump refers to a particular instance of the Topology Services daemon on the local node. Other nodes may have a similar core dump. The core dump file is located in the run_dir directory. The core dump file is approximately 7 MB to 10 MB in size.
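
Because a dump needs several megabytes in the run_dir directory, it can help to confirm that the file system has room before a dump is written. The following is a minimal sketch, not part of the product: the function name has_room_for_core is hypothetical, and the 10240 KB default is only the upper size estimate given above.

```shell
# has_room_for_core DIR [NEEDED_KB]
# Return 0 if the file system holding DIR has at least NEEDED_KB free.
# NEEDED_KB defaults to 10240 (10 MB), an assumption taken from the
# approximate dump size quoted in the text.
has_room_for_core() {
    dir=$1
    need_kb=${2:-10240}
    # df -Pk prints one POSIX-format line per file system in 1024-byte
    # units; field 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$need_kb" ]
}
```

For example, `has_room_for_core "$run_dir" || echo "not enough space for a core dump"`, where run_dir holds the path of the daemon's run directory.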

The Topology Services daemon creates the dump automatically when it invokes an assert() statement or when it receives a segmentation violation signal for accessing its data incorrectly. Force Topology Services to generate a dump only under the direction of the IBM® Support Center, as the daemon has an internal check to protect against getting hung. (See the TS_THREAD_STUCK_ER error entry in Table 1.) When directed to do so, you can create the dump manually by issuing the following command:

kill -6 pid_of_daemon

You can obtain the pid_of_daemon by issuing the following command:

The dump remains valid as long as the executable file /opt/rsct/bin/hatsd is not replaced. The system keeps only the last three core file instances. Copy the core dumps and the executable to a safe place.
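
Since only the last three core files are kept and a dump is usable only with the executable that produced it, both can be copied together. The following is a minimal sketch under stated assumptions: the function name save_dump_files is hypothetical, and the core* file name pattern is an assumption about how the dumps are named in run_dir.

```shell
# save_dump_files RUN_DIR DEST_DIR [EXECUTABLE]
# Copy every regular file in RUN_DIR whose name starts with "core"
# (an assumed naming pattern) and the daemon executable into DEST_DIR.
save_dump_files() {
    run_dir=$1
    dest_dir=$2
    exe=${3:-/opt/rsct/bin/hatsd}
    mkdir -p "$dest_dir" || return 1
    # Keep the executable next to the dumps: a dump can be analyzed
    # only against the binary that created it.
    cp -p "$exe" "$dest_dir/" || return 1
    for f in "$run_dir"/core*; do
        # Skip the unexpanded pattern when no core file exists.
        [ -f "$f" ] || continue
        cp -p "$f" "$dest_dir/" || return 1
    done
}
```

For example, `save_dump_files "$run_dir" /var/tmp/hatsd_dumps`, where the destination directory is an arbitrary choice for illustration.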

Table 1 describes how to analyze core dumps. These procedures vary by node type, as follows:

Table 1. Dump analysis of Linux and AIX nodes

On Linux® nodes:

To analyze the core dump, issue the command:

gdb /opt/rsct/bin/hatsd core_file

On AIX® nodes:

To analyze the core dump, issue the command:

dbx /opt/rsct/bin/hatsd core_file

Good results are similar to the following:

Type 'help' for help.
reading symbolic information ...
[using memory image in core]

IOT/Abort trap in evt._pthread_ksleep [/usr/lib/libpthreads.a] at 0xd02323e0 ($t6)
0xd02323e0 (_pthread_ksleep+0x9c) 80410014 lwz r2,0x14(r1)

Some of the error results are:
  1. The following output means that the current executable file is not the one that created the core dump.
    Type 'help' for help.
    Core file program (hatsd) does not match current program (core ignored)
    reading symbolic information ...
    (dbx)
  2. The following output means that the core file is incomplete due to lack of disk space.
    Type 'help' for help.
    warning: The core file is truncated. You may need to increase the ulimit
    for file and coredump, or free some space on the filesystem.
    reading symbolic information ...
    [using memory image in core]

    IOT/Abort trap in evt._pthread_ksleep [/usr/lib/libpthreads.a] at 0xd02323e0
    0xd02323e0 (_pthread_ksleep+0x9c) 80410014 lwz r2,0x14(r1)
    (dbx)
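
The per-platform choice in Table 1 can be expressed as a small helper for scripts that run on both node types. This is a sketch only: the function name analyze_cmd is hypothetical, and it prints the command line rather than starting the debugger, so the result can be reviewed first.

```shell
# analyze_cmd OS_NAME CORE_FILE
# Print the debugger command from Table 1 for the given operating
# system name, as reported by `uname -s`.
analyze_cmd() {
    case $1 in
        Linux) echo "gdb /opt/rsct/bin/hatsd $2" ;;
        AIX)   echo "dbx /opt/rsct/bin/hatsd $2" ;;
        *)     echo "unsupported platform: $1" >&2; return 1 ;;
    esac
}
```

For example, `analyze_cmd "$(uname -s)" core_file` prints the gdb command on a Linux node and the dbx command on an AIX node.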