You can generate a core dump file for your px-runtime image when a DataStage job fails. A core dump records the state
of a job and the contents of its working memory at the moment the job stops unexpectedly.
About this task
This procedure describes how to generate a core dump and get a core stack trace for failing
processes in Red Hat OpenShift pods.
Procedure
- Set the kernel.core_pattern parameter to enable core dump generation:
- Log in to a Kubernetes worker node as the root user with the following command.
oc debug node/<node-name>
- Check the current kernel.core_pattern with the following command.
cat /proc/sys/kernel/core_pattern
- Set the kernel.core_pattern on the worker node that hosts the DataStage pod.
sysctl -w kernel.core_pattern=/var/tmp/core-%e-sig%s-user%u-group%g-pid%p-time%t
- Repeat the steps for the worker nodes that host the pods.
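To make the pattern above easier to read, the following sketch shows the file name it would produce for one crash. The specifier meanings follow the Linux core(5) manual page: %e is the executable name, %s the signal number, %u the real UID, %g the real GID, %p the PID, and %t the dump time in seconds since the epoch. The function name `expand_core_pattern` and all sample values (osh, signal 11, UID 1000, and so on) are hypothetical and only illustrate the substitution; the kernel performs the real expansion.

```shell
#!/bin/sh
# Hypothetical helper: imitates how the kernel expands core_pattern specifiers.
# Arguments: pattern, executable name, signal, UID, GID, PID, epoch time.
expand_core_pattern() {
    printf '%s\n' "$1" | sed \
        -e "s|%e|$2|g" \
        -e "s|%s|$3|g" \
        -e "s|%u|$4|g" \
        -e "s|%g|$5|g" \
        -e "s|%p|$6|g" \
        -e "s|%t|$7|g"
}

# Sample values are made up for illustration.
expand_core_pattern '/var/tmp/core-%e-sig%s-user%u-group%g-pid%p-time%t' \
    osh 11 1000 0 1234 1700000000
# prints /var/tmp/core-osh-sig11-user1000-group0-pid1234-time1700000000
```

Knowing the expanded name in advance makes it easier to find the right file later when several processes have dumped core into the same directory.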
- In your DataStage job, go to the Job Properties > Environment Variables section and set the following
environment variables:
APT_NO_PM_SIGNAL_HANDLERS to 1
APT_DUMP_SCORE to TRUE
APT_PM_SHOW_PIDS to TRUE
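The three settings above could be double-checked with a small shell function like the one below. This is only a sketch: the name `check_apt_debug_env` is hypothetical, and it assumes the variables are exported into the environment of the shell where the function runs, which this procedure does not guarantee.

```shell
#!/bin/sh
# Hypothetical helper: reports any of the three variables whose value
# differs from the value that this procedure asks for.
check_apt_debug_env() {
    status=0
    for pair in APT_NO_PM_SIGNAL_HANDLERS=1 APT_DUMP_SCORE=TRUE APT_PM_SHOW_PIDS=TRUE; do
        name=${pair%%=*}          # text before the first '='
        want=${pair#*=}           # text after the first '='
        eval "have=\${$name:-}"   # current value, empty if unset
        if [ "$have" != "$want" ]; then
            echo "$name is '$have', expected '$want'"
            status=1
        fi
    done
    return "$status"
}

# Usage: check_apt_debug_env && echo "all three variables are set"
```

The function returns a non-zero status when at least one variable does not match, so it can be used in a conditional.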
- Install the gdb debugger program in the pods:
- Open a shell on the pod where you want to run the gdb debugger. Here,
ds-px-default-ibm-datastage-px-runtime-58d4458bf5-nb68s is an example name of the
px-runtime instance.
oc rsh ds-px-default-ibm-datastage-px-runtime-58d4458bf5-nb68s
- Open the public repository and download the getstacks.tar.gz file.
sh-5.1$ cd /tmp
sh-5.1$ curl -LJO https://github.com/IBM/DataStage/raw/refs/heads/main/utils/getstacks/getstacks.tar.gz
- Decompress the downloaded utility tar file.
sh-5.1$ tar xvfz getstacks.tar.gz
gdb-8.2.1-x86_64.tar.gz
getstacksre.sh
getstackscpd.sh
- Decompress the gdb binaries and run the gdb program.
sh-5.1$ tar xvfz gdb-8.2.1-x86_64.tar.gz
gdb
gstack
sh-5.1$ ./gdb
After you generate a core dump, you can collect the core stack trace.
- Collect the core stack trace:
- Log in to the pod.
- Locate the core file.
ls <dir-in-the-core_pattern>
- Start the gdb program and collect the stack trace.
For example, /opt/ibm/PXService/Server/PXEngine/bin/osh is the executable application that created the core file, and core.1234 is the core dump file of that application at the time of the crash.
/tmp/gdb /opt/ibm/PXService/Server/PXEngine/bin/osh core.1234
- In gdb, run the following commands one at a time.
bt
thread apply all bt
info sharedlibrary
quit
- Repeat the steps for all DataStage pods.
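When the same trace has to be collected on several pods, the interactive gdb session can be replaced by a non-interactive run. The sketch below uses gdb's `-batch` and `-ex` options to run the same three commands and save the output to a file. The function name `collect_stack_trace`, the `GDB` override variable, and the `.stacks.txt` output suffix are hypothetical; the default gdb path /tmp/gdb is taken from the example in this procedure.

```shell
#!/bin/sh
# Hypothetical wrapper: runs bt, thread apply all bt, and info sharedlibrary
# in gdb batch mode and writes the result to a text file.
# Arguments: executable path, core file path, optional output file.
collect_stack_trace() {
    exe=$1
    core=$2
    out=${3:-"$core.stacks.txt"}
    # GDB can point to another gdb binary; /tmp/gdb is the default.
    "${GDB:-/tmp/gdb}" -batch \
        -ex 'bt' \
        -ex 'thread apply all bt' \
        -ex 'info sharedlibrary' \
        "$exe" "$core" > "$out" 2>&1
}

# Usage, with the example paths from this procedure:
#   collect_stack_trace /opt/ibm/PXService/Server/PXEngine/bin/osh core.1234
```

Batch mode exits after the last command, so no explicit quit is needed, and the saved file can be copied off the pod for later analysis.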