bgrossman
bgrossman
10 Posts

Pinned topic determine CPU utilization of a process

‏2014-07-08T19:10:45Z |

Hi,

 

I'm trying to monitor for any process (individual PID) using more than a threshold amount of CPU. topas is showing me:

 

Topas Monitor for host:    wmosdsci    Interval: 2      Tue Jul  8 15:06:11 2014
 
                                DATA  TEXT  PAGE               PGFAULTS
USER        PID    PPID PRI NI   RES   RES SPACE    TIME CPU%  I/O  OTH COMMAND
sciadmin18546774       1  60 20 50.2M 64.0K  241M 149656:50 54.8    0    0 BIBusTKS
 

 

But at the same time, "ps aux | grep 18546774" shows this:

sciadmin 18546774 12.4  1.0 246724 51520      - A      Mar 26 149658:00 /apps/ibm/cogn

 

So topas shows the PID using 54.8% of CPU while ps shows 12.4%. I understand that these are instantaneous values, but over time they consistently show a large difference.

Is there a way to get the topas information without all the graphical stuff, or is there a command that shows the actual CPU utilization (as a % of currently allocated CPU, or as physical cores used by that PID) as text that I can use in my monitoring script?

  • puvichakravarthy
    puvichakravarthy
    55 Posts

    Re: determine CPU utilization of a process

    ‏2014-07-09T05:27:37Z  

    Kindly refer to the following link for the APIs that return process-specific metrics.

    https://www.ibm.com/developerworks/community/wikis/home?lang=en#/wiki/Power%20Systems/page/Programming%20CPU%20Utilization

    Also, the utilization shown against any process in topas is for that particular interval.

  • bgrossman
    bgrossman
    10 Posts

    Re: determine CPU utilization of a process

    ‏2014-07-09T11:59:02Z  

    Thank you for the quick reply. I was hoping that I wouldn't have to start compiling C code. I'm simply trying to put a script together to find any processes on any of our LPARs that are using more than 1/4 of a core (the threshold can be higher or lower).

  • puvichakravarthy
    puvichakravarthy
    55 Posts

    Re: determine CPU utilization of a process

    ‏2014-07-10T06:30:34Z  

    I believe ps shows the utilization of the process since the process was created, rather than over the interval you are monitoring. That still needs to be confirmed, though.
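
    To illustrate the difference between a lifetime average and an interval value, here is a minimal Perl sketch. It assumes you can sample a process's cumulative CPU time twice (for example, from the TIME column that topas and ps print) and that you know how many seconds passed between the samples. The helper names time_to_seconds and interval_cpu_pct are hypothetical, the accepted time formats are an assumption, and the sample values are illustrative rather than measured.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Convert a cumulative CPU time string to seconds.
# Assumed formats: "mm:ss", "hh:mm:ss" or "dd-hh:mm:ss".
sub time_to_seconds {
    my ($t) = @_;
    my $days = 0;
    if ( $t =~ s/^(\d+)-// ) { $days = $1; }
    my $secs = 0;
    # Each further colon-separated part is 60 times finer than the last
    $secs = $secs * 60 + $_ for split( /:/, $t );
    return $days * 86400 + $secs;
}

# Percent of one CPU used between two samples taken $elapsed seconds apart.
sub interval_cpu_pct {
    my ( $time_before, $time_after, $elapsed ) = @_;
    return if $elapsed <= 0;    # no interval, no answer
    my $used = time_to_seconds($time_after) - time_to_seconds($time_before);
    return 100 * $used / $elapsed;
}

# Illustrative values only: 33 CPU-seconds consumed in a 60-second window
printf "%.1f%% of one CPU\n",
    interval_cpu_pct( "149656:50", "149657:23", 60 );
```

    A figure computed this way reflects only the sampling window, which is why it can differ a lot from an average taken over the whole life of the process.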

  • bgrossman
    bgrossman
    10 Posts

    Re: determine CPU utilization of a process

    ‏2014-07-10T11:04:16Z  

    Yes, that is my understanding too. I have taken a new approach to this, and am using my live nmon files. I am pulling the TOP lines and the LPAR lines from the last interval in the nmon file, and calculating the number of cores in use by each of the top processes.

     

    # of cores used by a process = (%CPU / 100) * PHYSC / Virt-Procs

    The calculation is not precise, but it is close enough for monitoring: it lets me find any processes on my frame that are using more than a threshold.
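
    A minimal sketch of this estimate in Perl, treating %CPU as a percentage (hence the division by 100). The function name cores_used is hypothetical. The 54.8 is the topas figure from the first post, while the PHYSC value of 2.0 and the 4 virtual processors are made-up inputs for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Estimate the physical cores used by one process.
#   $cpu_pct : %CPU reported for the process (a percentage)
#   $physc   : physical cores consumed by the LPAR in the interval
#   $virt    : number of virtual processors in the LPAR
sub cores_used {
    my ( $cpu_pct, $physc, $virt ) = @_;
    return 0 unless $virt && $virt > 0;    # avoid dividing by zero
    return ( $cpu_pct / 100 ) * $physc / $virt;
}

# Illustrative inputs: 54.8% CPU, 2.0 physical cores, 4 virtual processors
my $cores     = cores_used( 54.8, 2.0, 4 );
my $threshold = 0.25;                       # a quarter of a core

printf "%.3f cores\n", $cores;
print "over threshold\n" if $cores > $threshold;
```

    With these inputs the estimate is 0.274 cores, so a quarter-core threshold would flag the process.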

  • Paresh B
    Paresh B
    3 Posts

    Re: determine CPU utilization of a process

    ‏2015-04-21T13:04:36Z  

    Hi bgrossman,

    I want to do the same as you did: I want to pull out the CPU utilization and the physical memory and paging space utilization of a particular process in my AIX 6.1 environment.

    I am not very familiar with what data the nmon files contain or how I can pull it out.

    Can you please help me? How can I pull this from the nmon files, or is there another way?

     

     

    Updated on 2015-04-21T13:26:23Z by Paresh B
  • bgrossman
    bgrossman
    10 Posts

    Re: determine CPU utilization of a process

    ‏2015-04-21T15:06:08Z  

    Hi John,

    Here's what I did, in Perl. You may want to modify the alert method (I use sendmail to send an e-mail). The script requires you to include the "-T" option when you start your nmon process. If you are confident with your C programming, you might want to use puvichakravarthy's recommendation above for the process APIs.

    As for Memory and Paging space, you can use "svmon -P $pid" to get that information (you'll need to parse out the information that you are looking for).

    Good Luck,
    Ben

     

    #!/bin/perl
     
    ##############################################################
    ##### this script is not supported, and no warranty is given.
    ##### use this script at your own risk, author is not
    ##### responsible for any damage caused by use of this script.
    ##############################################################
     
    ##### We assume that there is always an NMON process running and
    ##### it was started with the "-T" option (TOP processes).
     
    ##### @EMAILS is an array of e-mail addresses to send alerts to.
    ##### IMPORTANT: you MUST make sure that the "@" in each e-mail
    ##### address is preceded by a backslash "\", as in "\@".  If you
    ##### forget the backslash, you will never get the e-mail (Perl
    ##### will think that your domain name is actually an array).
     
    @EMAILS = ( "john\@mydomain.com",
                             "paul\@mydomain.com",
                             "george\@mydomain.com",
                             "ringo\@mydomain.com" );
     
    ##### set threshold based on % of a physical CPU used, 0.1999
    ##### will cause an alert to be sent anytime an individual
    ##### pid is using >= 0.2 cores
     
    $threshold = 0.1999;
     
    ##### set this to the path where your raw NMON file is kept
     
    $nmon_path = "/home/nmon_files";
    $nmon_start = "/usr/local/scripts/start_nmon.ksh";
     
    ##### get the name of this host
     
    $host = `hostname`;
    chomp ($host);
     
    ##### get the name of the current nmon file
     
    $file = `ls -tr $nmon_path/*nmon | tail -1`;
    chomp ($file);
     
    unless ( $file )
    {
            print "There is no nmon file, starting NMON now\n";
            $output = `$nmon_start`;
            print "$output\n";
     
            ##### nothing to analyze yet, so stop here ("next" is
            ##### only valid inside a loop)
            exit;
    }
    else
    {
            print " $file\n";
    }
     
    ##### extract the last interval number from the current nmon file
     
    $interval = `grep "^ZZZZ,T" $file | tail -1`;
    $interval = (split(/,/,$interval))[1];
     
    print " $interval\n";
     
    ##### extract the most current number of physical cores assigned
    ##### to this LPAR, this number is constantly changing. This
    ##### will be the number of physical CPU's assigned to the LPAR
    ##### when measured in the latest NMON interval.
     
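    $physc = `grep "^LPAR,$interval," $file`;
    ($physc,$virt) = (split(/,/,$physc))[2,3];
     
    ##### stop if the LPAR line is missing, otherwise the division
    ##### by $virt further down would fail
    die "no LPAR line for interval $interval in $file\n" unless ( $virt > 0 );
     
    print " physical CPUs: $physc   virtual procs: $virt\n";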
     
    ##### extract the PID, the % of cpu assigned to the LPAR that
    ##### this PID is using, and the command information from the
    ##### first four TOP lines in this nmon interval
     
    @TOPS = `grep "^TOP,[0-9]*,$interval," $file | head -4`;
     
    foreach $top_line (@TOPS)
    {
            chomp ($top_line);
     
            ($pid,$cpu,$cmd) = (split(/,/,$top_line))[1,3,13];
     
            $cpu = int ( $cpu / $virt * 100 ) / 100;
     
            ##### now calculate the actual number of physical cores
            ##### that this PID is using
     
            $actual_cpu = ( $cpu  / 100 ) *  $physc;
     
            print "         $host,$pid,$cmd,$cpu,$physc,$actual_cpu \n";
     
            if ( $actual_cpu > $threshold )
            {
                    ##### if this pid is using more than the threshold,
                    ##### push it onto the @ALERTS array, later we will
                    ##### use this array to send the alerts.  You can
                    ##### alternately trigger an alert now.
     
                    print "ALERT: $host,$pid,$cmd,$cpu,$physc,$actual_cpu \n";
                    push (@ALERTS,"$host,$pid,$cmd,$cpu,$physc,$actual_cpu");
            }
    }
     
    if ( @ALERTS )
    {
            ##### stop if sendmail cannot be started
            open (MAIL,"| /usr/sbin/sendmail -t") or die "cannot run sendmail: $!\n";
            print MAIL "From: \"AIX CPU Hogs\" <aix_cpu_hogs>\n";
     
            foreach $email (@EMAILS)
            {
                    print MAIL "To: $email\n";
            }
     
            print MAIL "Subject: AIX Processes Hogging CPU\n";
            print MAIL "Content-Type: text/html; charset=ISO-8859-1\n\n";
            print MAIL "<BODY><CENTER><TABLE BORDER=1>\n";
            print MAIL "<TR><TH>Host</TH><TH>PID</TH><TH>Command</TH><TH>CPU Utilization</TH></TR>\n";
     
            foreach $alert (@ALERTS)
            {
                    ($host,$pid,$cmd,$cpu,$physc,$actual_cpu) = split(/,/,$alert);
                    print MAIL "<TR><TH>$host</TH><TD>$pid</TD><TD>$cmd</TD><TD>using $cpu\% of";
                    print MAIL " $physc cores = $actual_cpu cores</TD></TR>\n";
            }
     
            ##### close the tags opened above, then close the pipe so
            ##### that sendmail delivers the message
            print MAIL "</TABLE></CENTER></BODY>\n";
            close(MAIL);
    }
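
    The script above depends on three nmon record types (ZZZZ, LPAR and TOP) and on fixed field positions within them. As a rough sketch of the same extraction done in pure Perl instead of grep, the function below reads those records from a list of lines. The field positions are the ones the script uses (PID in TOP field 1, %CPU in field 3, command in field 13, physical CPU and virtual processors in LPAR fields 2 and 3). The name last_interval_records and the sample lines are hypothetical and for illustration only; a real nmon file may differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the last interval tag, the LPAR values for that interval,
# and the list of TOP records for that interval.
sub last_interval_records {
    my (@lines) = @_;
    my ( $interval, %lpar, %top );
    foreach my $line (@lines) {
        chomp $line;
        my @f = split /,/, $line;
        next unless @f > 1;
        if ( $f[0] eq 'ZZZZ' ) {
            $interval = $f[1];    # later ZZZZ lines overwrite earlier ones
        }
        elsif ( $f[0] eq 'LPAR' && $f[1] =~ /^T\d+$/ ) {
            $lpar{ $f[1] } = { physc => $f[2], virt => $f[3] };
        }
        elsif ( $f[0] eq 'TOP' && defined $f[2] && $f[2] =~ /^T\d+$/ ) {
            push @{ $top{ $f[2] } },
                { pid => $f[1], cpu => $f[3], cmd => $f[13] };
        }
    }
    return ( undef, undef, [] ) unless defined $interval;
    return ( $interval, $lpar{$interval}, $top{$interval} || [] );
}

# Made-up sample lines laid out the way the script expects them
my @sample = (
    "ZZZZ,T0001,15:06:11,08-JUL-2014",
    "LPAR,T0001,2.00,4,8,16,1.00",
    "TOP,18546774,T0001,54.80,50.00,4.80,12,246724,64,51520,0,1,0,BIBusTKS,Unclassified",
);

my ( $interval, $lpar, $tops ) = last_interval_records(@sample);
if ( $lpar && $lpar->{virt} > 0 ) {
    foreach my $t (@$tops) {
        my $cores = ( $t->{cpu} / 100 ) * $lpar->{physc} / $lpar->{virt};
        printf "%s pid %s (%s): %.3f cores\n",
            $interval, $t->{pid}, $t->{cmd}, $cores;
    }
}
```

    Reading the file once and keying the records by interval avoids running grep several times over a large nmon file, at the cost of holding the parsed records in memory.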
     
    Updated on 2015-04-21T15:18:43Z by bgrossman
  • Paresh B
    Paresh B
    3 Posts

    Re: determine CPU utilization of a process

    ‏2015-04-22T10:53:01Z  

    Thanks bgrossman, I will try this.