 | Level: Introductory Nam Keung (namkeung@us.ibm.com), Senior Technical Consultant, IBM
20 Jan 2004 This is the third in our series on AIX 5.2 performance tools updates. This article discusses the new AIX 5.2 tprof command and the Cpu Utilization Reporting Tool (CURT). tprof samples CPU utilization and accounts for time in threads and program modules using the system clock interrupts, which occur at every 100th of a second. CURT reads an AIX trace sequentially and processs the trace hooks, doing bookkeeping along the way to produce a CPU utilization report.
Introduction
This document assumes you're familiar with the trace command. You are encouraged to review the usage of AIX trace in sources such as InfoExplorer, man pages, or the AIX Performance Monitoring and Tuning Guide, SC23-2365.
TPROF
The tprof command is a standard UNIX performance tool and a versatile profiler that provides a detailed profile of CPU usage by every process ID and name. It further profiles at the application level, routine level, and even to the source statement level. It also provides both a global view and a detailed view. The tprof command can profile kernel extensions, stripped executable programs, and stripped libraries. It will do subroutine level profiling for most executable programs on which the stripnm command will produce a symbol table.
tprof profiles only CPU activity; it does not profile other system resources, such as memory or disks. It uses the system trace facility. Only one user at a time can execute the trace facility. Therefore, only one tprof command can be executing at one time.
The tprof command has been completely rewritten for AIX 5.2; it is much faster and provides more function. This command is a useful tool for anyone with a Java, C, C++, or FORTRAN program that might be CPU-bound and who wants to know which sections of the program are most heavily using the CPU.
tprof can charge CPU time to object files, processes, threads, subroutines (user mode, kernel mode, and shared library), and even to source lines of programs or individual instructions. Charging CPU time to subroutines is called profiling, and charging CPU time to source program lines is called microprofiling.
For subroutine-level profiling, the tprof command can be run without modifying executable programs (no recompilation with special compiler flags is necessary). This is still true if the executables have been stripped, unless the back tables have also been removed. However, recompilation is required to get a microprofile, unless a listing file is already available.
To perform microprofiling on a program, either the program should be compiled with -g and the source files should be accessible to tprof, or the program should be compiled with -qlist and either both the object listing files and the source files, or just the object listing files, should be accessible to tprof. To take full advantage of tprof microprofiling capabilities, it is best to provide both the .lst and the source file.
The tprof command resides in /usr/bin and is part of the bos.perf.tools fileset, which is installable from the AIX base installation media.
The new tprof enhancements are:
- Support for multiple program profiling in one pass, including microprofiling.
- Full support for threads. Reports can include a list of threads, all threads within one or more processes, or the system.
- New optional listing file annotation (instruction-level annotation).
- New optional detailed profiling report (address-level report).
- New front-end options to collect trace and name mapping information.
- Fully functional re-postprocessing mode supporting online and offline data collection modes transparently.
- New symbol mapping file format replacing gennames format.
- Increased speed of various other usability improvements.
AIX 5.2 tprof new syntax
All of the input and report files used by tprof are named RootString.suffix, where RootString is either specified with -r or is the program name specified with -x.
- If the -A flag is specified with the -r flag,
tprof runs in automated offline mode, where it generates the following files in addition to tprof report files:
- RootString.trc: raw trace file
- RootString.syms: symbol file
-
tprof generates a report file named RootString.prof that holds the process, thread, object file, and subroutine level profiling report.
- RootString.prof: profile report
- If -m is specified,
tprof generates microprofiling reports. The reports use the following naming convention: RootString.source.mprof, where source is the base name of a source file.
- RootString.sourcefilename.mprof: microprofiling report for source file name
- If neither -A flag nor -x are specified,
tprof runs either in manual offline or in post-processing mode.
- RootString.ctrc: cooked trace file (binary __trc_rpt2 equivalent)
- RootString.csyms: cooked symbol file (ascii file with only symbols needed)
- The default RootString is the program name when -x is used.
Specify search paths for object files (-S) and for source files (-M). The default is $PATH for both.
This section shows the tprof differences between AIX 5.1 and AIX 5.2. See the Commands Reference, Volume 5 for the tprof command syntax and description of flags.
AIX 5.1
Usage: tprof [ -m ] [ -v ] { -k [ -e ] | -s | -j Java Class | -p Program | -t Process_ID | [ -x Command ] | [ -i Trace_File [ -n Gennames_File ] [ -C { all | List } ] ]}
| -s | Profile shared libraries | | -k | Profile kernel | | -e | Profile kernel extensions if kernel profiling specified | | -v | Verbose mode, creates and leaves files | | -m | Enables microprofiling | | -p program | Profile 'program' | | -j JavaClass | Profile 'JavaClass' | | -t process_id | 'Process' profile | | -i file | Input trace file | | -n file | gen_names file | | -C all |list | All cpu's or a list of cpu's | | -J ID | Trace hook ID, default is 234 | | -x command | Execute 'command' |
Notes for the above information:
- -x command must be the last option issued on the command line.
- -i and -x may not be specified at the same time.
- -i must be specified if -C is used.
- -n should only be used with the -i flag.
- -p and -j may not be specified at the same time.
Aix 5.2
Usage: tprof [ -c ] [ -C { all | CPUList } ] [ -d ] -D ] [ -e ] [ -F ] [ -j ] [ -k ] [ -l ] [ -m ObjectsList ] [ -M SourcePathList ] [ -p ProcessList ] [ -P { all | PIDsList } ] [ -s ] [ -S SearchPathList ] [ -t ] [ -T BufferSize ] [ -u ] [ -v ] [ -V VerboseFileName ] [ -z ] { { -r RootString } | { [ -A { all | CPUList }] [ -r RootString ] -x Program } }
| -c | Turns on generation of cooked files | | -d | Turns on deferred tracing mode, defers data collection until trcon is called | | -D | Turns on detailed profiling which displays CPU usage by instruction offset under each subroutine | | -F | Overwrites cooked files if they exist. If used without the -x flag, this forces the manual offline mode. | | -l | Enables long names reporting. By default tprof truncates the subroutine, program, and source file names if they do not fit into the available space in the profiling report. This flag disables truncation. | | -T | Specifies the trace buffer size | | -u | Enables user mode profiling | | -s | Enables shared library profiling | | -z | Enables compatibility mode with the previous version of tprof
| | -t | Enables thread level profiling | | -v | Eanbles verbose mode |
To enable micro profiling with –m option:
- To enable microprofiling of programs, user mode profiling (-u) must be turned on.
- To enable microprofiling of shared libraries, shared library profiling (-s) must be turned on.
- To enable microprofiling of kernel extensions, kernel extension profiling (-e) must be turned on.
AIX 5.2 tprof format changes
For tprof output,
- with the summary format, tick counts are replaced by percentages
- for routine reports, Ticks, Address and Bytes columns are missing
- -z turns on the old format.
For the symbol file format:
-
gennames output was used in AIX 5.1, which includes mix of decimal and hexadecimal numbers
/unix | 00000000| 00000000| 00429ad8| | |initialize
Symbols from /unix
low.s | | file | | | |
.low | 0|unamex| | | |.text
pin_obj_start | 0|extern| | | |.text
start | 36|extern| | | |.text
.svc1 | 4128|extern| | | |.text
nonpriv_page | 12288|extern| | | |.text
g_copyr | 12288|extern| | | |.text
|
-
gensyms output is used in AIX 5.2, all in hexadecimal, format close to nm
Kernel: 00000000 00429ad8 00000000 /unix
Src: low.s
00000000 t .low
00000000 T pin_obj_start
00000024 T start
00001020 T .svc1
00003000 T nonpriv_page
00003000 T g_copyr
|
- default AIX 5.2 stripnm format is gensyms format
- -z turns on old gennames format
AIX 5.2 tprof profile output samples
Below is a summary by process name.
Process Freq Total Kernel User Shared Other
======= ==== ===== ====== ==== ====== =====
./vloop 2 94.35 0.56 93.79 0.00 0.00
/usr/bin/tprof 2 2.26 0.56 0.00 1.69 0.00
/usr/bin/sh 2 1.69 1.69 0.00 0.00 0.00
/usr/bin/trcstop 1 1.69 1.13 0.00 0.56 0.00
======= ==== ===== ====== ==== ====== =====
Total 7 100.00 3.95 93.79 2.26 0.00
Process PID TID Total Kernel User Shared Other
======= === === ===== ====== ==== ====== =====
./vloop 12184 19963 93.22 0.00 93.22 0.00 0.00
/usr/bin/trcstop 13518 18563 1.69 1.13 0.00 0.56 0.00
/usr/bin/tprof 12872 17999 1.69 0.00 0.00 1.69 0.00
/usr/bin/sh 13518 18563 1.13 1.13 0.00 0.00 0.00
./vloop 13516 18561 1.13 0.56 0.56 0.00 0.00
/usr/bin/tprof 13516 18561 0.56 0.56 0.00 0.00 0.00
/usr/bin/sh 13516 18561 0.56 0.56 0.00 0.00 0.00
======= === === ===== ====== ==== ====== =====
Total 100.00 3.95 93.79 2.26 0.00
Total Samples = 177 Total Elapsed Time = 1.77s
|
For user routines profile (-u):
Profile: ./vloop
Total % For All Processes (./vloop) = 26.88
Subroutine % Source
========== ====== ======
.doit 16.68 vloop.c
100003F0 16.68 <----+
.main 10.20 vloop.c | turned on by -D
10000210 10.20 <----+
|
Profile: /usr/bin/ksh
Total % For All Processes (/usr/bin/ksh) = 1.01
Subroutine % Source
========== ====== ======
.sh_wcstombs 0.50 /usr/bin/ksh
100051CC 0.25
100051BC 0.25
.sh_eval 0.25 /usr/bin/ksh
1001D0E0 0.25
.wctomb 0.25 glink.s
1000577C 0.25
|
Profile: /usr/bin/sh
Total % For All Processes (/usr/bin/sh) = 0.25
Subroutine % Source
========== ====== ======
.path_open 0.25 /usr/bin/sh
100152C0 0.25
|
For kernel routines profile (-k):
Total % For All Processes (KERNEL) = 62.31
Subroutine % Source
========== ====== ======
.waitproc 19.10 rnel/proc/dispatch.c
.v_zpage_ppc 7.04 zpage_ppc.s
.h_cede 6.53 hcalls.s
.v_copypage_ppc_splt 5.78 cpage_ppc_splt.s
.h_cede_native 5.03 nel/ml/POWER/stubs.
|
For kernel extensions summary (-e):
Total % For All Processes (KEX) = 3.27
Kernel Ext %
========== ======
/etc/drivers/scdiskpin[scdiskpin32] 1.01
/usr/lib/drivers/pci/s_scsiddpin[s_scsiddpin32] 0.75
/usr/lib/drivers/pci/cstokdd[cstokdd32] 0.50
/usr/lib/drivers/hd_pin[hd_pin32] 0.50
|
For each kernel extension profile (-e):
Total % For All Processes (/etc/drivers/scdiskpin[scdiskpin32]) = 1.01
Subroutine % Source
========== ====== ======
.scdisk_strategy 0.25 diskpin[scdiskpin32]
._ptrgl 0.25 ptrgl.s
.scdisk_coalesce 0.25 diskpin[scdiskpin32]
.scdisk_build_cmd 0.25 diskpin[scdiskpin32]
|
For shared libraries summary (-s):
Total % For All Processes (SH-LIBs) = 6.28
Shared Object %
============= ======
/usr/lib/libc.a[shr.o] 4.27
/usr/lib/libpthreads.a[shr_xpg5.o] 0.75
/usr/lib/libi18n.a[shr.o] 0.50
/usr/lib/libtrace.a[shr.o] 0.25
/usr/lib/liblvm.a[shr.o] 0.25
/usr/lib/libperfstat.a[shr.o] 0.25
|
For shared library profile (-s):
Profile: /usr/lib/libc.a[shr.o]
Total % For All Processes (/usr/lib/libc.a[shr.o]) = 4.27
Subroutine % Source
========== ====== ======
.malloc_y 1.01 /lib/libc/malloc_y.c
.nlist64 0.50 libc/POWER/nlist64.c
._doprnt 0.25 cs/lib/libc/doprnt.c
.malloc_y_heap 0.25 /lib/libc/malloc_y.c
.free_y_heap 0.25 /lib/libc/malloc_y.c
.wcscpy 0.25 cs/lib/libc/wcscpy.c
.free_y 0.25 /lib/libc/malloc_y.c
.wctomb 0.25 cs/lib/libc/wctomb.c
.fseeko64_unlocked 0.25 s/lib/libc/fseek64.c
.fread 0.25 ccs/lib/libc/fread.c
|
For thread report (-t):
Total % For wait[516] thread 517 (KERNEL) = 31.16
Subroutine % Source
========== ====== ======
.waitproc 19.10 rnel/proc/dispatch.c
.h_cede 6.53 hcalls.s
.h_cede_native 5.03 nel/ml/POWER/stubs.c
.disable_ints 0.25 i_able.s
.i_softmod 0.25 os/kernel/ios/intr.c
Total % For wait[516] thread 517 (KEX) = 0.25
Kernel Ext %
========== ======
/usr/lib/drivers/pci/cstokdd[cstokdd32] 0.25
Profile: /usr/lib/drivers/pci/cstokdd[cstokdd32]
Total % For wait[516] thread
517 (/usr/lib/drivers/pci/cstokdd[cstokdd32]) = 0.25
Subroutine % Source
========== ====== ======
.rx_handler 0.25 i/cstokdd[cstokdd32]
|
Microprofile
In AIX 5.1 you must use the -m flag to enable microprofiling along with the -p and -x options. To invoke tprof, use tprof -m -p <executable> -x <executable>
In AIX 5.2, you must use the additional flag -u to enable microprofiling, as follows:
tprof –m <executable> -u –sex <executable>.
For example,
$ tprof –m ./vloop –u –x ./vloop
Hot Line Profile of vloop.c
Line % PID
57 1.50 ALL
58 0.50 ALL
Total % for .main = 2.00
Line % Source
57 1.50 for (i = 0; i < NUMBERS; i++) {
58 0.50 Numbers[i] = i;
59 - }
60 -
61 - Arg1 = argv[1];
62 - Arg2 = argv[2];
63 -
64 - pthread_attr_init(&attr);
65 - pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_UNDETACHED);
66 - pthread_create(&threadid[0], &attr, worker, NULL);
67 - pthread_create(&threadid[1], &attr, worker, NULL);
68 -
69 - pthread_join(threadid[0], (void *) &rval);
70 - pthread_join(threadid[1], (void *) &rval);
71 - }
|
Modes
In AIX 5.2, tprof can run in the following four modes:
- Realtime
- Collect data while executing a program.
tprof –uske –x program generates the file program.prof with all sections (summary, use, shared libraries, kernel, and extension). Please check the tprof profile output example.
- Automated offline
- Start AIX trace in the background and logs trace data into a file; after trace collection is done,
tprof uses the trace data to produce tprof output.
tprof –A –x program generates the output program.trc, program.syms, and program.prof (with only the summary section). The program.syms contains name mapping information for all the loaded programs, libraries, and kernel extension. The program.trc is simply a trace output file.
program.syms sample output:
Symlib/AIX Version 5.1.1.1 -- Jan 24 2003 08:14:36 - gensyms format
Using load addresses for text symbols
Kernel: 00000000 006771c4 00000000 /unix
Kernel_data: 006771c8 01f95ec8 00000000
Kernel_validated: 1
Src: low.s
00000000 t .low
00000000 T pin_obj_start
00000024 T start
00001020 T .svc1
00003000 T nonpriv_page
00003000 T g_copyr
00003100 T .mulh
00003180 T .mull
00003200 T .divss
00003280 T .divus
00003300 T .quoss
00003380 T .quous
00003400 T ._clear_lock
00003410 T ._clear_lock_mem
00003420 T ._check_lock
….
|
- Manual offline
- Is used for the posted processing trace and symbol file. It reads already generated AIX trace data and produces
tprof output. For tprof –r rootstring –juske the input is rootstring.trc (from trace) and rootstring.syms (from gensyms). The output is rootstring.prof with all sections.
- Post-processing
-
tprof can post-process data already collected to produce different reports, such as:
- Generate new report from cooked files:
tprof –c –x cwhet
- Initial data cooking options:
- Online with cooking:
tprof –c –x cwhet
- Off line with cooking:
tprof –c –A –x cwhet
- Manual off line with cooking:
tprof –A –x cwhet; tprof –r cwhet -t
- Postprocessing to AIX 5.1 old format:
tprof –z -r cwhet -u
- Postprocessing with more details:
tprof –r cwhet –uskeD. This generates all subsections in the global report. Notice the address-level breakdown.
Sample:
Subroutine % Source
========== ====== ======
.fread_unlocked 0.04 ccs/lib/libc/fread.c
D01E5F18 0.01
D01E5EA8 0.01
D01E5E3C 0.01
.free_y 0.03 /lib/libc/malloc_y.c
D01E85D4 0.01
D01E8470 0.01
.malloc_y 0.03 /lib/libc/malloc_y.c
D01E9D00 0.03
.malloc_y_heap 0.03 /lib/libc/malloc_y.c
D01E959C 0.01
D01E9578 0.01
.expand_catname 0.01 s/lib/libc/catopen.c
D0206F74 0.01
.load_locale 0.01 lib/libc/setlocale.c
D02098A0 0.01
.fseeko 0.01 ccs/lib/libc/fseek.c
D020E180 0.01
.getwc_no_orientation 0.01 ccs/lib/libc/getwc.c
D0246F84 0.01
|
- Postprocessing with process and thread options:
tprof –r cwhet -u –P all
 |
CURT
CURT is a trace post processing tool that summarizes system utilization. It provides detailed CPU usage information on system, processor, processes, and kernel threads based on a trace and a name file. The input to the tool is a binary AIX 5 system trace file. The output consists of CPU and elapsed time reporting on processes and threads, and time spent in first and second level interrupt handlers.
CURT is contained in the bos.perf.tools fileset. The tool allocates all the time for each CPU in a trace to one of six states:
- Application - user space execution (includes kprocs other than wait)
- Kernel - system call (SVC) execution
- Flih - first-level interrupt handler execution
- Slih - second-level interrupt handler execution
- Dispatch - dispatch logic execution
- Wait - waitproc execution.
The syntax for the CURT command is:
curt -i inputfile [-o outputfile] [-n gennamesfile] [-m trcnmfile] [-a pidnamefile] [-f timestamp] [-l timestamp] [-ehpstP]
A CURT report contains the following sections:
- _ curt and AIX trace information
- The time and date when this particular
curt command was run, including the syntax of the curt command line that produced the report.
Run on Mon Sep 30 22:29:36 2003
Command line was:
curt -i trace.tr -n gennames.out -o /tmp/curt.out -ts
----
AIX trace file name = trace.tr
AIX trace file size = 4373708
AIX trace file created = Wed Jan 17 20:49:32 2001
Command used to gather AIX trace was:
trace -n -C all -d -L 10000000 -T 10000000 -afo trace.raw |
- _ System summary
- This section prints out the times and percentage of total time the system (all CPUs) spent in various states. The times are application (user) mode time, kernel (superviser) mode time, time spent in Flihs, time spent in Slihs, time spent running the dispatcher code, time spent in the dispatcher while dispatching the idle process, total busy time, and total idle (waitproc) time.
System Summary
--------------
processing percent percent
total time total time busy time
(msec) (incl. idle) (excl. idle) processing category
=========== =========== =========== ===================
4958.47 24.70 92.56 APPLICATION
322.27 1.61 6.02 SYSCALL
8.39 0.04 0.16 KPROC (not including IDLE)
53.86 0.27 1.01 FLIH
0.34 0.00 0.01 SLIH
21.91 0.11 0.41 DISPATCH (all procs. incl. IDLE)
9.24 0.05 0.17 IDLE DISPATCH (only IDLE proc.)
----------- ---------- -------
5356.86 26.69 100.00 CPU(s) busy time
14714.30 73.31 IDLE
----------- ----------
20071.15 TOTAL
Avg. Thread Affinity = 0.98
|
- _ Per-processor summary
- Follows the System Summary, and is essentially the same information but broken down on a processor-by-processor basis.
Processor Summary processor number 0
---------------------------------------
processing percent percent
total time total time busy time
(msec) (incl. idle) (excl. idle) processing category
=========== =========== =========== ===================
1397.79 27.82 92.54 APPLICATION
90.96 1.81 6.02 SYSCALL
0.59 0.01 0.04 KPROC (not including IDLE)
15.25 0.30 1.01 FLIH
0.34 0.01 0.02 SLIH
5.48 0.11 0.36 DISPATCH (all procs. incl. IDLE)
2.28 0.05 0.15 IDLE DISPATCH (only IDLE proc.)
----------- ---------- -------
1510.41 30.07 100.00 CPU(s) busy time
3513.25 69.93 IDLE
----------- ----------
5023.66 TOTAL
Avg. Thread Affinity = 0.98
Total number of process dispatches = 4833
Total number of idle dispatches = 5017 |
- _ Application summary
- Generates information about the amount of CPU time spent in application and system call (syscall) mode, expressed in milliseconds and as a percentage by thread, process, and process type. Also included are the number of threads per process and per process type.
Application Summary (by Tid)
----------------------------
-- processing total (msec) -- -- percent of total processing time --
combined application syscall combined application syscall name (Pid Tid)
======== =========== ======= ======== =========== =========================
354.9568 354.9568 0.0000 1.7542 1.7542 0.0000 (21218 20017)
301.8943 0.0000 301.8943 1.4920 0.0000 1.4920 whoisd(20798 21699)
267.2369 4.6100 262.6270 1.3207 0.0228 1.2979 whoisd(21228 20027)
267.0183 6.7290 260.2893 1.3196 0.0333 1.2864 whoisd(17862 18561)
266.8688 4.1496 262.7192 1.3189 0.0205 1.2984 whoisd(17864 18563)
266.5486 6.0502 260.4984 1.3173 0.0299 1.2874 whoisd(18928 24287)
265.8774 6.6241 259.2533 1.3140 0.0327 1.2813 whoisd(21230 20029)
265.7922 4.8246 260.9677 1.3136 0.0238 1.2897 whoisd(21222 20021)
265.7272 3.9613 261.7660 1.3132 0.0196 1.2937 whoisd(17860 18559)
...
Application Summary (by Pid)
----------------------------
-- processing total (msec) -- -- percent of total processing time --
combined application syscall combined application syscall name (Pid)(Thread Count)
======== =========== ======= ======== =========== ======= ===================
354.9568 354.9568 0.0000 1.7542 1.7542 0.0000 (21218)(1)
301.8943 0.0000 301.8943 1.4920 0.0000 1.4920 whoisd(20798)(1)
...
|
- _ kproc summary
- Generates information about the amount of CPU time spent executing each kernel process, including the idle process, expressed in milliseconds and as a percentage of the total CPU time. The kproc Summary (by TID) shows an output of all kernel process threads that were running on the system during the time of trace collection and their CPU consumption.
Kproc Summary (by Tid)
-----------------------
-- processing total (msec) -- -- percent of total time -- name (Pid Tid Type)
combined kernel operation combined kernel operation
======== ======== ========== ======== ======== =============================
48.6499 48.6499 0.0000 0.2404 0.2404 0.0000 kproc(1032 1033 W)
29.7313 29.7313 0.0000 0.1471 0.1471 0.0000 kproc(516 517 W)
20.5039 20.5039 0.0000 0.1013 0.1013 0.0000 kproc(774 775 W)
15.9160 15.9160 0.0000 0.0787 0.0787 0.0000 kproc(1290 1291 W)
0.0285 0.0285 0.0000 0.0177 0.0177 0.0000 netm(45078 45079 -)
|
- _ System calls summary
- Generates completed system calls that includes the name and address of the system call, the number of times the system call was executed, and the total CPU time expressed in milliseconds and as a percentage with average, minimum, and maximum time the system call was running.
System Calls Summary
--------------------
Count Total Time % sys Avg Time Min Time Max Time SVC (Address)
(msec) time (msec) (msec) (msec)
======== =========== ====== ======== ======== ======== ================
40 9470.9112 46.81% 236.7728 0.1803 297.1308 _exit(13cfd8)
43 45.5119 0.22% 1.0584 0.5038 1.2273 kfork(13bc0c)
1553 16.2352 0.08% 0.0105 0.0063 0.1183 ._disclaim64(15b7ea8)
116 9.2561 0.05% 0.0798 0.0052 0.3418 kwrite(1b8088)
442 6.6149 0.03% 0.0150 0.0019 0.0723 kread(1b80ac)
3 3.4469 0.02% 1.1490 1.1259 1.1667 getprocs(22b0d8)
3 2.8527 0.01% 0.9509 0.4882 1.2370 execve(13bc14)
41 2.6903 0.01% 0.0656 0.0263 0.1269 ._disclaim64(15b8334)
42 2.2641 0.01% 0.0539 0.0306 0.0905 connext(13d71c)
168 2.0774 0.01% 0.0124 0.0004 0.0876 close(1b801c)
...
|
- _ First level interrupt handler (FLIH) summary
- Lists all first level interrupt handlers that were called during the monitoring period, as shown in the following sample:
Global Flih Summary
-------------------
Count Total Time Avg Time Min Time Max Time Flih Type
(msec) (msec) (msec) (msec)
====== =========== =========== =========== =========== =========
32 0.6630 0.0207 0.0070 0.0265 5(IO_INTR)
578 13.1338 0.0227 0.0022 0.0982 32(QUEUED_INTR)
2082 30.4754 0.0146 0.0015 0.2881 31(DECR_INTR)
129 0.2067 0.0016 0.0007 0.0064 4(INSTR_PG_FLT)
9565 9483.9594 0.9915 0.0037 194.2942 3(DATA_ACC_PG_FLT)
Per CPU Flih Summary
--------------------
CPU Number 0:
Count Total Time Avg Time Min Time Max Time Flih Type
(msec) (msec) (msec) (msec)
====== =========== =========== =========== =========== =========
8 0.1526 0.0191 0.0099 0.0220 5(IO_INTR)
33 0.0492 0.0015 0.0007 0.0040 4(INSTR_PG_FLT)
127 2.9236 0.0230 0.0022 0.0973 32(QUEUED_INTR)
530 9.4741 0.0179 0.0070 0.0493 31(DECR_INTR)
2289 2460.5887 1.0750 0.0037 194.2942 3(DATA_ACC_PG_FLT)
|
- _ Second level interrupt handler (SLIH) summary
- The SLIH Summary lists all second level interrupt handlers that were called during the monitoring period, as shown in the following sample:
Global Slih Summary
-------------------
Count Total Time Avg Time Min Time Max Time Slih Name(Address)
(msec) (msec) (msec) (msec)
====== =========== =========== =========== =========== =================
1 0.0278 0.0278 0.0278 0.0278 s_scsiddpin(1464210)
31 3.8027 0.1227 0.0527 0.2039 phxentdd(1585cb4)
Per CPU Slih Summary
--------------------
CPU Number 0:
Count Total Time Avg Time Min Time Max Time Slih Name(Address)
(msec) (msec) (msec) (msec)
====== =========== =========== =========== =========== =================
1 0.0278 0.0278 0.0278 0.0278 s_scsiddpin(1464210)
7 0.7376 0.1054 0.0527 0.1897 phxentdd(1585cb4)
|
- _ Detailed process information, if -p is specified
- Gives detailed information about each process found in the trace. The -p flag process information includes the process ID and name, and a count and list of the TIDs belonging to the process. The total application and system call time for all the threads of the process is given. Includes summary reports of all completed and pending system calls for the threads of the process.
Process Details for Pid: 129190
Process Name: router
7 Tids for this Pid: 245889 245631 244599 82843 78701 75347 28941
Total Application Time (ms): 124.023749
Total System Call Time (ms): 8.948695
Process System Call Data
Count Total Time %sys Avg Time Min Time Max Time SVC (Address)
(msec) time (msec) (msec) (msec)
======== =========== ====== ======== ======== ======== ================
93 3.6829 0.05% 0.0396 0.0060 0.3077 kread(19731c)
23 2.2395 0.03% 0.0974 0.0090 0.4537 kwrite(1972f8)
30 0.8885 0.01% 0.0296 0.0073 0.0460 select(208c5c)
1 0.5933 0.01% 0.5933 0.5933 0.5933 fsync(1972a4)
106 0.4902 0.01% 0.0046 0.0035 0.0105 klseek(19737c)
13 0.3285 0.00% 0.0253 0.0130 0.0387 semctl(2089e0)
6 0.2513 0.00% 0.0419 0.0238 0.0650 semop(2089c8)
|
- _ Detailed thread information, if -t is specified
- A detailed report on thread status, including the amount of time the thread was in application and kernel mode, what system calls the thread made, processor affinity, the number of times the thread was dispatched, and to what CPU it was dispatched. The report generated with the -t flag includes a detailed report on thread status with the amount of time the thread was in application and kernel mode, what system calls the thread made, processor affinity, the number of times the thread was dispatched, and to what CPU it was dispatched.
Report for Thread Id: 48841 (hex bec9) Pid: 143984 (kex 23270)
Process Name: oracle
---------------------
Total Application Time (ms): 70.324465
Total Kernel Time (ms): 53.014910
Thread System Call Data
Count Total Time Avg Time Min Time Max Time SVC (Address)
(msec) (msec) (msec) (msec)
======== =========== =========== =========== =========== ================
69 34.0819 0.4939 0.1666 1.2762 kwrite(169ff8)
77 12.0026 0.1559 0.0474 0.2889 kread(16a01c)
510 4.9743 0.0098 0.0029 0.0467 times(f1e14)
73 1.2045 0.0165 0.0105 0.0306 select(1d1704)
68 0.6000 0.0088 0.0023 0.0445 lseek(16a094)
12 0.1516 0.0126 0.0071 0.0241 getrusage(f1be0)
|
The default report created by the curt command is:
# curt -i trace_report -m trace.nm -n gennames.out -o curt.out
The next section covers gathering the trace_report as input file.
Collecting a trace for curt
An example for collecting a trace for curt would be:
- Start the trace daemon with:
trace -a -d -f -C all -T Size -L Size -J curt -o trace_file |
The -a option runs the trace daemon asynchronously.
The -d option defers data collection until trcon is called.
The -f option causes the data collection to stop as soon as the in-memory buffer is filled.
If this is an SMP system, use the -C all option, which creates a set of trace buffers for every CPU. Otherwise you risk overfilling a single set of trace buffers, or at least force the CPUs to contend for access to the shared buffers.
The -T and -L options override the default trace buffer and trace log file sizes. By default, all trace events are collected, which can quickly fill up the trace buffer (and trace log file) with data that is not needed. Use the -J curt option to capture the events processed by curt.
- Start and stop trace data collection using:
The trcon command starts the trace data collection; the program is run, after which the trace data collection is stopped with trcoff. You always trace a single contiguous time-interval for curt.
- Collect the kernel names with:
or
The output of trcnm is typically much smaller than the output of gennames, since it maps only the kernel names and not the kernel extensions or shared libraries. curt doesn't use the extra information and will run somewhat faster without it.
- Stop trace with:
- Recombine the individual trace files (if the -C all option has been used) using:
trcrpt -r -C all trace_file > trace_report |
Resources
About the author  | |  | Nam Keung is a senior programmer for IBM Corporation in Austin, TX. He has worked in the area of AIX ISDN communication, AIX SOM/DSOM development, AIX Multimedia development, NT Clustering technology and Java performance. His current assignment involves helping ISVs in porting, deploying applications, performance tuning, and education for the pSeries platform. He has been a programmer with IBM since 1987. Nam can be reached at namkeung@us.ibm.com. |
Rate this page
|  |