SystemTap SIGUSR2 tracing
By wburos. Version 1: Oct 09, 2008 11:02. Current version: May 05, 2009 10:21.


 {tip:title=For discussions or questions...}
 To start a discussion or get a question answered, consider posting on the [Linux for Power Architecture forum|http://www.ibm.com/developerworks/forums/forum.jspa?forumID=375].
  
 Additional Linux on Power performance information is available on the [Performance page|http://www.ibm.com/developerworks/wikis/display/LinuxP/Performance].
 {tip}
  
 *Contents*
 {toc:minLevel=2}
  
 \\
 h2. An example problem - who's throwing signals?
  
 A customer reported significant performance problems with their Java implementation on a new software base level. Performance analysis teams were pulled into the discussions because initial indications showed that CPU utilization had soared to 100%.
  
 During problem determination, the technical team eventually found that a large number of unexpected SIGUSR2 signals were being generated, which caused the Java engine to do a lot of unnecessary work, in particular repeated garbage collection. In essence, the system was stuck in garbage collection mode.
  
 In the course of the root cause discussions, concerns were raised that a new rogue process, or even the hardware system itself, was generating the SIGUSR2 signals. So before hardware technicians were called in, the team decided to use pre-canned SystemTap scripts to determine who might be generating the SIGUSR2 signals. SystemTap allows the end user to dynamically "hook into" code running in the system and produce readable output. The script itself is shown later on this page.
  
 So we went to the SystemTap Examples page at http://sourceware.org/systemtap/examples/ and searched for "signal". We found an example SystemTap script named sig_by_pid.stp at http://sourceware.org/systemtap/examples/process/sig_by_pid.stp
  
 We downloaded the script and ran it with SystemTap while the problem was occurring. The resulting output showed:
  
 {noformat}
  
 # stap sig_by_pid.stp
 Collecting data... Type Ctrl-C to exit and display results
 SPID     SENDER           RPID  RECEIVER         SIGNAME          COUNT
 29561    java             29561 java             SIGUSR2          412
 21656    java             21656 java             61               230
 29561    java             29561 java             SIGTRAP          58
 19348    java             19348 java             SIGTRAP          54
 19348    java             19348 java             61               53
 12571    java             12571 java             SIGTRAP          51
 10078    java             10078 java             SIGTRAP          51
 9172     java             9172  java             SIGTRAP          45
 18140    java             18140 java             SIGTRAP          45
 22725    java             22725 java             SIGTRAP          43
 10078    java             10078 java             61               41
 21580    java             21580 java             SIGTRAP          41
 9610     java             9610  java             SIGTRAP          40
 5825     java             5825  java             SIGTRAP          40
 12571    java             12571 java             61               39
 14925    java             14925 java             SIGTRAP          39
 27406    java             27406 java             SIGTRAP          39
 6117     java             6117  java             SIGTRAP          39
 {noformat}
  
 With this result, the teams were able to easily see that one of the Java processes (pid #29561) was sending itself a lot of SIGUSR2 signals. This was the clue they were looking for, and the Java team went on to find the defect in their code.
  
 The nice part is that SystemTap is easy to use and non-intrusive. Many example scripts are available to try, and the scripting language is easy to modify if the technical teams need slight variants of the examples provided.
  
\\
 h2. Considerations for setting up a system for SystemTap
  
 SystemTap comes with both RHEL 5.2 and SLES 10 SP2.
  
 SystemTap depends on kernel-debuginfo, so you need to install the kernel-debuginfo package that matches the running kernel before you can execute SystemTap scripts. It is a big rpm package, but it does not add overhead to a running system.
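  
 One way to check the setup is a small smoke-test script. The sketch below is not part of the SystemTap examples; the file name smoke.stp and the 10-second timeout are arbitrary choices made for illustration. It uses the same signal.send probe as the sig_by_pid.stp example and exits after the first signal it sees. If stap fails while resolving the probe, a missing or mismatched kernel-debuginfo package is a likely first thing to check.
  
 {noformat}
 #! /usr/bin/env stap

 # Hypothetical smoke test (smoke.stp), not taken from the SystemTap examples.
 # Report the first signal seen on the system, then exit.
 probe signal.send
 {
   printf("%s (pid %d) sent %s to pid %d\n",
          execname(), pid(), sig_name, sig_pid)
   exit()
 }

 # Give up after 10 seconds if no signal was observed.
 probe timer.s(10)
 {
   println("No signal observed in 10 seconds")
   exit()
 }
 {noformat}
  
 Run it as root with "stap smoke.stp", in the same way as the example script.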
  
  
 \\
 h2. What does the sig_by_pid.stp script do?
  
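 The script is shown below. It counts every signal sent on the system by sender pid, receiver pid and signal, and prints the counts in descending order when it exits.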
  
 {noformat}
 #! /usr/bin/env stap
  
 # Copyright (C) 2006 IBM Corp.
 #
 # This file is part of systemtap, and is free software. You can
 # redistribute it and/or modify it under the terms of the GNU General
 # Public License (GPL); either version 2, or (at your option) any
 # later version.
  
 #
 # Print signal counts by process IDs in descending order.
 #
  
 global sigcnt, pid2name, sig2name
  
 probe begin {
   print("Collecting data... Type Ctrl-C to exit and display results\n")
 }

 probe signal.send
 {
   snd_pid = pid()
   rcv_pid = sig_pid

   sigcnt[snd_pid, rcv_pid, sig]++

   if (!(snd_pid in pid2name)) pid2name[snd_pid] = execname()
   if (!(rcv_pid in pid2name)) pid2name[rcv_pid] = pid_name
   if (!(sig in sig2name)) sig2name[sig] = sig_name
 }

 probe end
 {
   printf("%-8s %-16s %-5s %-16s %-16s %s\n",
          "SPID", "SENDER", "RPID", "RECEIVER", "SIGNAME", "COUNT")

   foreach ([snd_pid, rcv_pid, sig_num] in sigcnt-) {
     printf("%-8d %-16s %-5d %-16s %-16s %d\n",
            snd_pid, pid2name[snd_pid], rcv_pid, pid2name[rcv_pid],
            sig2name[sig_num], sigcnt[snd_pid, rcv_pid, sig_num])
   }
 }
 {noformat}
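  
 As an illustration of how little has to change for a variant, here is a minimal sketch that counts only SIGUSR2 signals. It is our own adaptation of sig_by_pid.stp, not one of the published examples, and it assumes the sig_name variable of the signal.send probe reports the name as "SIGUSR2", as in the output shown earlier.
  
 {noformat}
 #! /usr/bin/env stap

 # Sketch: variant of sig_by_pid.stp that counts only SIGUSR2.
 # Not part of the SystemTap examples.

 global sigcnt, pid2name

 probe begin {
   print("Collecting SIGUSR2 data... Type Ctrl-C to exit and display results\n")
 }

 probe signal.send
 {
   # Skip every signal other than SIGUSR2.
   if (sig_name != "SIGUSR2") next

   snd_pid = pid()
   rcv_pid = sig_pid

   sigcnt[snd_pid, rcv_pid]++

   if (!(snd_pid in pid2name)) pid2name[snd_pid] = execname()
   if (!(rcv_pid in pid2name)) pid2name[rcv_pid] = pid_name
 }

 probe end
 {
   printf("%-8s %-16s %-5s %-16s %s\n",
          "SPID", "SENDER", "RPID", "RECEIVER", "COUNT")

   # Print sender/receiver pairs in descending order of count.
   foreach ([snd_pid, rcv_pid] in sigcnt-) {
     printf("%-8d %-16s %-5d %-16s %d\n",
            snd_pid, pid2name[snd_pid], rcv_pid, pid2name[rcv_pid],
            sigcnt[snd_pid, rcv_pid])
   }
 }
 {noformat}
  
 Filtering inside the probe keeps the result table short on a busy system, at the cost of hiding other signals that might also be relevant.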
  

 