Generating a core dump file for px-runtime image

You can generate a core dump file for your px-runtime image when a DataStage job fails. A core dump records the state of the job process and the contents of its working memory at the moment the job stops unexpectedly.

About this task

This procedure describes how to generate a core dump and get a core stack trace for failing processes in Red Hat OpenShift pods.

Procedure

  1. Set the kernel.core_pattern parameter to enable core dump generation:
    1. Log in to a Kubernetes worker node as the root user with the following command.
      oc debug node/<node-name>
    2. Check the current kernel.core_pattern with the following command.
      cat /proc/sys/kernel/core_pattern
    3. Set the core_pattern on the worker node so that it applies to the DataStage pods that run there.
      sysctl -w kernel.core_pattern=/var/tmp/core-%e-sig%s-user%u-group%g-pid%p-time%t
    4. Repeat these steps on each worker node that hosts the pods.
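To preview the file name that this core_pattern produces, the specifiers can be expanded by hand. The following is a minimal sketch and not part of the product: expand_core_pattern is a hypothetical helper, and the sample values (osh, signal 11, and so on) are assumptions for illustration. Per the core(5) manual page, %e is the executable name, %s the signal number, %u and %g the real user and group IDs, %p the process ID, and %t the dump time in seconds since the epoch. The sketch handles only these six specifiers.

```shell
# Hypothetical helper (not shipped with DataStage):
#   expand_core_pattern PATTERN EXE SIGNAL UID GID PID TIME
# Substitutes the six specifiers used in this procedure; others are left as-is.
expand_core_pattern() {
  printf '%s\n' "$1" | sed \
    -e "s|%e|$2|g" \
    -e "s|%s|$3|g" \
    -e "s|%u|$4|g" \
    -e "s|%g|$5|g" \
    -e "s|%p|$6|g" \
    -e "s|%t|$7|g"
}

# Sample values are assumptions: osh crashed with signal 11 as UID 1000, GID 0, PID 1234.
expand_core_pattern '/var/tmp/core-%e-sig%s-user%u-group%g-pid%p-time%t' \
  osh 11 1000 0 1234 1700000000
# prints /var/tmp/core-osh-sig11-user1000-group0-pid1234-time1700000000
```

A name of this shape tells you at a glance which executable failed, with which signal, and when.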
  2. In your DataStage job, go to the Job Properties > Environment Variables section and set the following environment variables:
    • APT_NO_PM_SIGNAL_HANDLERS to 1
    • APT_DUMP_SCORE to TRUE
    • APT_PM_SHOW_PIDS to TRUE
  3. Install the gdb debugger program in the pods:
    1. Open a shell on the pod where you want to run the gdb debugger. In this example, ds-px-default-ibm-datastage-px-runtime-58d4458bf5-nb68s is the name of the px-runtime instance.
      oc rsh ds-px-default-ibm-datastage-px-runtime-58d4458bf5-nb68s
    2. Download the getstacks.tar.gz file from the public repository.
      sh-5.1$ cd /tmp
      sh-5.1$ curl -LJO https://github.com/IBM/DataStage/raw/refs/heads/main/utils/getstacks/getstacks.tar.gz
    3. Decompress the downloaded utility tar file.
      sh-5.1$ tar xvfz getstacks.tar.gz
      gdb-8.2.1-x86_64.tar.gz
      getstacksre.sh
      getstackscpd.sh
    4. Decompress the gdb binaries and run the gdb program.
      sh-5.1$ tar xvfz gdb-8.2.1-x86_64.tar.gz
      gdb
      gstack
      sh-5.1$ ./gdb
    After you generate a core dump, you can collect the core stack trace.
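Before relying on the debugger, it can help to confirm that the extraction produced executable binaries. The following is a minimal sketch, assuming the files were extracted into /tmp as in the preceding steps; check_gdb_tools is a hypothetical helper name, not a utility from the getstacks archive.

```shell
# Hypothetical helper: check_gdb_tools [DIR]
# Reports whether gdb and gstack exist and are executable in DIR (default: /tmp).
# Returns 0 when both are present, 1 otherwise.
check_gdb_tools() {
  dir=${1:-/tmp}
  rc=0
  for tool in gdb gstack; do
    if [ -x "$dir/$tool" ]; then
      echo "ok: $dir/$tool"
    else
      echo "missing: $dir/$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Usage inside the pod (assumed location):
#   check_gdb_tools /tmp
```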
  4. Collect the core stack trace:
    1. Log in to the pod.
    2. Locate the core file.
      ls <dir-in-the-core_pattern>
    3. Start the gdb program and collect the stack trace.
      For example, /opt/ibm/PXService/Server/PXEngine/bin/osh is the executable that produced the core file, and core.1234 is the core dump file that was written when the application crashed.
      /tmp/gdb /opt/ibm/PXService/Server/PXEngine/bin/osh core.1234
    4. In gdb, run the following commands one at a time.
      bt
      thread apply all bt
      info sharedlibrary
      quit
    5. Repeat the steps for all DataStage pods.
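When the same trace must be collected from several pods, the gdb commands in step 4 can also be run non-interactively. The following is a minimal sketch that uses gdb's standard -batch and -ex options. The collect_core_stack function, the GDB variable, and the default output file name are hypothetical names introduced for illustration. The default debugger path, /tmp/gdb, assumes the extraction location used in step 3.

```shell
# Hypothetical helper: collect_core_stack EXECUTABLE CORE_FILE [OUTPUT_FILE]
# Runs the same gdb commands as the interactive session and saves the output.
# Set GDB to override the debugger path (assumed default: /tmp/gdb).
collect_core_stack() {
  exe=$1
  core=$2
  out=${3:-$core.stack.txt}
  # -batch exits after the -ex commands finish, so no explicit quit is needed.
  "${GDB:-/tmp/gdb}" -batch \
    -ex 'bt' \
    -ex 'thread apply all bt' \
    -ex 'info sharedlibrary' \
    "$exe" "$core" > "$out" 2>&1
}

# Usage inside the pod, with the example names from step 4:
#   collect_core_stack /opt/ibm/PXService/Server/PXEngine/bin/osh core.1234
```

Writing the output to a file next to the core dump keeps the trace available after you close the shell on the pod.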