IBM Support

Using DTrace on Solaris to trace system calls and find stack traces

General Page

DTrace is a comprehensive dynamic tracing facility that is built into Oracle Solaris. Administrators and developers can use DTrace on live production systems to examine the behaviour of both user programs and the operating system itself.
Using DTrace, you can explore your system to understand how it works, track down performance problems across many layers of software, or locate the cause of aberrant behaviour.

DTrace is also available on MacOS, Oracle Linux, and as a customised version on Windows.
This document describes an example of using DTrace on Solaris to trace system calls and find the associated stack traces.
Here is an example of a real-world problem: a server was collapsing due to a large amount of system calls made by the VM process. Applications were idle and the number of generated system calls was very high. 

The truss utility was run to trace open() syscalls:


truss -f -topen -p <<pid>

Here is a sample of the truss output:

41363/637: open("/proc/10017/psinfo", O_RDONLY) = 244
41363/637: fstat(244, 0xFFFFFFFE763FC8B0) = 0
41363/637: fstat(244, 0xFFFFFFFE763FC780) = 0
41363/637: ioctl(244, TCGETA, 0xFFFFFFFE763FC7EC) Err#25 ENOTTY41363/637: read(244, "02\0\0\0\0\0\001\0\0 ' !".., 512) = 416
41363/637: lseek(244, 0, SEEK_CUR) = 41641363/637: close(244) = 0
41363/637: open("/proc/10630/psinfo", O_RDONLY) = 244
41363/637: fstat(244, 0xFFFFFFFE763FC8B0) = 0
41363/637: fstat(244, 0xFFFFFFFE763FC780) = 0
41363/637: ioctl(244, TCGETA, 0xFFFFFFFE763FC7EC) Err#25 ENOTTY
41363/637: read(244, "02\0\0\0\0\0\001\0\0 )86".., 512) = 416
41363/637: lseek(244, 0, SEEK_CUR) = 41641363/637: close(244) = 0
41363/637: open("/proc/9056/psinfo", O_RDONLY) = 244
41363/637: fstat(244, 0xFFFFFFFE763FC8B0) = 0
41363/637: fstat(244, 0xFFFFFFFE763FC780) = 0
41363/637: ioctl(244, TCGETA, 0xFFFFFFFE763FC7EC) Err#25 ENOTTY
41363/637: read(244, "02\0\0\0\0\0\002\0\0 # `".., 512) = 416
41363/40: lwp_cond_wait(0x10053E148, 0x10053E130, 0xFFFFFFFF18EFF840, 0) Err#62 ETIME41363/637: lseek(244, 0, SEEK_CUR) = 416
41363/637: close(244) = 0
41363/637: open("/proc/626/psinfo", O_RDONLY) = 244

...

The output showed an excessive number of open() calls on the psinfo pseudo-files. The next action is to find the responsible stack trace for the open() calls, but truss cannot provide this information. We cannot use the -Xdump or -Xtrace options either, because these options don't exist in HotSpot, which is the VM on Solaris. So we used Dtrace, which is a built-in Solaris utility that you can use to print a stack trace for a specific system call.

DTrace has many useful features, one of which is filters. We created the following DTrace script file to filter open() calls by fetching and printing (using jstack) stack traces only when the target file includes the text "psinfo":


//File name  jstackforopencalls.d

    syscall::open:entry
                               
   /pid==$target && strstr(copyinstr(arg0), "psinfo")!=NULL/
      {
       wt=walltimestamp;

       printf("%Y.%09d: %d/%d: Opened %s at\n", wt,wt % 1000000000, pid,tid,copyinstr(arg0));

       jstack(100,100)
       
      }

You use the dtrace command to run the above script code in the file. For example:

dtrace -s jstackopencall.d -p <pid> -o <outputfile>

Here is a sample of the output: 

2 2705 open:entry 2019 May 14 13:27:58.726353553: 29299/209: Opened /proc/55960/psinfo at


libc.so.1`__open+0x8
libc.so.1`_endopen+0xa8
libc.so.1`fopen+0x1c
libProcessCPU64.so`
readSingleProcFromStatusFile+0x4c
libProcessCPU64.so`IRL_getProcessCPU+0x338
libProcessCPU64.so`getProcessCPULast+0x198 libProcessCPU64.so`Java_com_ibm_ws_xd_pmi_processcpu_ProcessCPU_nativeGetProcessCPULast___3J_3I+0xd8 0xffffffff6b3758e8 0xffffffff6f639d88 ....
...


The output shows that the root of the open sys calls for psinfo is the object "libProcessCPU64.so`Java_com_ibm_ws_xd_pmi_processcpu_ProcessCPU_nativeGetProcessCPULast___3J_3I+0xd8". 

This example shows how to trace system calls using DTrace on Solaris, then find the corresponding stack trace for the syscall. You can try other useful DTrace features by modifying the script.

[{"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSNVBF","label":"Runtimes for Java Technology"},"ARM Category":[],"Platform":[{"code":"PF027","label":"Solaris"}],"Version":"All Versions","Edition":"","Line of Business":{"code":"LOB36","label":"IBM Automation"}}]

Document Information

Modified date:
22 April 2020

UID

ibm16186603