IBM®
Skip to main content
    Country/region [select]      Terms of use
 
 
    
     Home      Products      Services & solutions      Support & downloads      My account     
 
developerworks > My developerWorks >  Dashboard > AIX > ... > Performance Monitoring Tips and Techniques > AIXV53filemon
developerWorks
Log In   View a printable version of the current page.
Overview Connect Spaces Forums Wikis
AIXV53filemon
Added by OneSkyWalker, last edited by OneSkyWalker on Jan 07, 2009  (view change)
Labels: 
(None)

How to use AIX V5.3 filemon to determine where I/O requests originate

This web page is meant to supplement, not replace, other documentation which has been published:

The filemon command can produce reports on the following:

Physical volume (hdisk) activity Read and write activity to AIX hdisks during filemon data collection, including rates, block sizes, response times, sequentiality of I/O read and write accesses, throughput, and hdisk utilization.

Most of these metrics are also reported by nmon.

Some of these metrics are reported by the iostat command. Others can be estimated from iostat output, but those estimates are inaccurate in various ways. See AIX V5.3 iostat awk script for more information.

All AIX tools measure response time as seen by a System p server. If AIX does not process interrupts promptly, the response time it reports can be considerably longer than the response time a disk storage subsystem is delivering. AIX lrud activity can sometimes delay interrupt processing:
# vmo -h lru_poll_interval
Help for tunable lru_poll_interval:
Determines the interval, in milliseconds, at which LRU page replacement
daemons poll for off-level interrupts.  Default: 10 milliseconds.
Possible values: 0 through 60,000 (1 minute).
A LRU page replacement daemon blocks low priority interrupts while running
on a CPU. If this option is enabled, LRU page replacement daemons will
process pending interrupts at the designated interval. On a heavily
loaded system with large amounts of I/O, enabling this option can improve
I/O throughput since I/O interrupts don't have to wait for LRU page
replacement daemons to finish their processing.
Logical volume activity Read and write activity to AIX logical volumes during filemon data collection, including rates, block sizes, response times, sequentiality of I/O read and write accesses, throughput, and logical volume utilization.

Please note that logical volume utilization does not mean much, since I/O operations to LVs do not block one another and an LV which is spread across many hdisks can be 100% busy when the hdisks on which the LV resides are not particularly busy.
Virtual memory segment activity Read and write activity to AIX virtual memory segments, including rates, block sizes, response times, and sequentiality of I/O read and write accesses.

Please note that all AIX JFS and JFS2 files (except those opened with Direct or Concurrent I/O (DIO or CIO)) are cached in real memory. Reads occur when a process makes a read file system call or when a process accesses a page in a file mapped to memory when the page is not resident in memory. Writes occur when a process makes a write file system call or when the AIX LRU daemon steals a dirty file page to make room in real memory for other pages.

If a segment for a given volume:inode does not appear in this report, it is because nothing was transferred between memory and disk for the segment during filemon data collection.
File activity Read and write activity to files generated by read, write, open, and lseek system calls during filemon data collection.

Please note that read and write requests can be satisfied from cached pages in real memory, so system calls might not result in physical I/O to hdisks.

Please note that lio_listio system calls (requesting asynchronous I/O) are not monitored by filemon, so the file activity report does not include file activity generated using asynchronous I/O (AIO) requests. And filemon does not monitor page faults, so the file activity report does not include file activity caused when a process accesses or updates a page in a file mapped to memory. The Virtual memory segment activity report does report on activity generated in these ways.

The remainder of this web page will focus on the filemon Most Active Files report. See below for a discussion of that report.

How to collect a filemon report

Because filemon uses AIX trace for data collection, root access is required.

Here is an example of use of filemon which will capture 10 seconds of trace information for a workload which is already running:

#!/bin/ksh
filemon -d -T 5242880 -O all -u -o /tmp/filemon.out
sleep 1					# Give filemon a chance to get AIX trace started
ps -ef >/tmp/psef.before.out		# Get list of processes just before trace starts
( sleep 5 ; ps -ef >/tmp/psef.during.out ) &	# Get a list of processed during the trace
trcon					# Start trace
sleep 10					# Capture 10 seconds of trace which filemon then analyzes
trcstop					# Stop trace so filemon can write report to /tmp/filemon.out
ps -ef >/tmp/psef.after.out	# Get list of processes just before trace ends

Notice that a list of processes is captured before, during, and after the trace. With a little experience using filemon in a particular environment, it might be possible to eliminate two of the three captures. But, as we shall see, it is very important to capture information about each process to help identify files in the filemon report.

Here is another example which captures trace information for a workload launched within the shell script:

#!/bin/ksh
filemon -d -T 5242880 -O all -u -o /tmp/filemon.out
sleep 1                                 # Give filemon a chance to get the trace started
( sleep 1 ; ps -ef >/tmp/psef.out ) &   # Make sure ps -ef runs after cat and dd (below) get started
trcon
cat /home/pittman/fmontest/unix_mp | dd of=/dev/null
trcstop

When the second example above is run, we should see something like:

# ./getfmon

Run trcon command to signal start of trace.

Run trcstop command to signal end of trace.
Tue Nov 25 10:48:07 2008
System: AIX 5.3 Node: repoman Machine: 006032934C00
23740+1 records in.
23740+1 records out.
# [filemon: Reporting started]
[filemon: Reporting completed]

[filemon: 1.211 secs in measured interval]

Here is /tmp/filemon.out produced by the second example above:

Tue Nov 25 10:48:07 2008
System: AIX 5.3 Node: repoman Machine: 006032934C00

Cpu utilization:  72.6%
Cpu allocation:  100.0%

Most Active Files
------------------------------------------------------------------------
  #MBs  #opns   #rds   #wrs  file                     volume:inode
------------------------------------------------------------------------
   4.1      0      0   1056  pid=30996_fd=1
   4.1      1   1056      0  unix_mp                  /dev/lv00:102403
   4.1      0   8382      0  pid=35890_fd=0
   4.1      1      0   8381  null
   0.4      0    111      0  pid=37152_fd=0
   0.3      1     78      0  unix                     /dev/hd2:2100
   0.1      0   8382      0  pid=15290_fd=6
   0.1      0      0   8382  pid=35890_fd=7
   0.0      0      0   8384  pid=15290_fd=5
   0.0      0   8383      0  pid=35890_fd=4
   0.0      4      8      0  ksh.cat                  /dev/hd2:180977
   0.0     12      7      0  passwd                   /dev/hd4:4153
   0.0      3      4      0  login.cfg                /dev/hd4:24
   0.0      1      3      0  methods.cfg              /dev/hd2:8519
   0.0      1      0      2  psef.out                 /dev/hd3:154
   0.0      1      1      0  current_cluster          /dev/hd9var:184
   0.0      1      1      0  nodedef.cfg              /dev/hd9var:185
   0.0      1      1      0  ps.cat                   /dev/hd2:181119
   0.0      0      2      0  pid=17048_fd=7
   0.0      0      0      1  pid=17048_fd=8

Most Active Segments
------------------------------------------------------------------------
  #MBs  #rpgs  #wpgs    segid  segtype                  volume:inode
------------------------------------------------------------------------
   4.2   1064      0    1f8eb  persistent               /dev/lv00:102403
   0.0      3      0     2f61  persistent               /dev/hd2:6969
   0.0      3      0   23405a  system
   0.0      0      1    1e0af  system
   0.0      1      0    2c655  persistent               /dev/lv00:108544
   0.0      1      0     b981  .inodes                  /dev/lv00:3

Most Active Logical Volumes
------------------------------------------------------------------------
  util  #rblk  #wblk   KB/s  volume                   description
------------------------------------------------------------------------
  0.20   8552      0 3531.3  /dev/lv00                /home/pittman
  0.01     24      0    9.9  /dev/hd2                 /usr
  0.01      0      8    3.3  /dev/hd8                 jfslog

Most Active Physical Volumes
------------------------------------------------------------------------
  util  #rblk  #wblk   KB/s  volume                   description
------------------------------------------------------------------------
  0.17   8552      0 3531.3  /dev/hdisk8              N/A
  0.02     24      8   13.2  /dev/hdisk0              N/A




------------------------------------------------------------------------
Detailed File Stats
------------------------------------------------------------------------

FILE: pid=30996_fd=1
opens:			0
total bytes xfrd:	4325376
writes:			1056	(0 errs)
  write sizes (bytes):	avg  4096.0 min    4096 max    4096 sdev     0.0
  write times (msec):	avg   0.908 min   0.049 max   2.371 sdev   0.301

FILE: /home/pittman/fmontest/unix_mp  volume: /dev/lv00 (/home/pittman)  inode: 102403
opens:			1
total bytes xfrd:	4325376
reads:			1056	(0 errs)
  read sizes (bytes):	avg  4096.0 min    4096 max    4096 sdev     0.0
  read times (msec):	avg   0.189 min   0.066 max  17.443 sdev   0.624

FILE: pid=35890_fd=0
opens:			0
total bytes xfrd:	4291584
reads:			8382	(0 errs)
  read sizes (bytes):	avg   512.0 min     512 max     512 sdev     0.0
  read times (msec):	avg   0.037 min   0.016 max  24.924 sdev   0.272

FILE: /dev/null
opens:			1
total bytes xfrd:	4291072
writes:			8381	(0 errs)
  write sizes (bytes):	avg   512.0 min     512 max     512 sdev     0.0
  write times (msec):	avg   0.019 min   0.013 max   0.455 sdev   0.012

FILE: pid=37152_fd=0
opens:			0
total bytes xfrd:	454656
reads:			111	(0 errs)
  read sizes (bytes):	avg  4096.0 min    4096 max    4096 sdev     0.0
  read times (msec):	avg   0.098 min   0.083 max   0.369 sdev   0.030

FILE: /unix  volume: /dev/hd2 (/usr)  inode: 2100
opens:			1
total bytes xfrd:	319488
reads:			78	(0 errs)
  read sizes (bytes):	avg  4096.0 min    4096 max    4096 sdev     0.0
  read times (msec):	avg   0.058 min   0.038 max   0.173 sdev   0.018
lseeks:			153

FILE: pid=15290_fd=6
opens:			0
total bytes xfrd:	67052
reads:			8382	(0 errs)
  read sizes (bytes):	avg     8.0 min       4 max       8 sdev     0.0
  read times (msec):	avg   0.090 min   0.013 max  25.305 sdev   0.280

FILE: pid=35890_fd=7
opens:			0
total bytes xfrd:	67052
writes:			8382	(0 errs)
  write sizes (bytes):	avg     8.0 min       4 max       8 sdev     0.0
  write times (msec):	avg   0.030 min   0.012 max   0.514 sdev   0.018

FILE: pid=15290_fd=5
opens:			0
total bytes xfrd:	33536
writes:			8384	(0 errs)
  write sizes (bytes):	avg     4.0 min       4 max       4 sdev     0.0
  write times (msec):	avg   0.017 min   0.012 max   0.477 sdev   0.010

FILE: pid=35890_fd=4
opens:			0
total bytes xfrd:	33532
reads:			8383	(0 errs)
  read sizes (bytes):	avg     4.0 min       4 max       4 sdev     0.0
  read times (msec):	avg   0.020 min   0.012 max   0.540 sdev   0.027

FILE: /usr/lib/nls/msg/en_US/ksh.cat  volume: /dev/hd2 (/usr)  inode: 180977
opens:			4
total bytes xfrd:	32768
reads:			8	(0 errs)
  read sizes (bytes):	avg  4096.0 min    4096 max    4096 sdev     0.0
  read times (msec):	avg   0.076 min   0.028 max   0.133 sdev   0.044
lseeks:			20

FILE: /etc/passwd  volume: /dev/hd4 (/)  inode: 4153
opens:			12
total bytes xfrd:	28672
reads:			7	(0 errs)
  read sizes (bytes):	avg  4096.0 min    4096 max    4096 sdev     0.0
  read times (msec):	avg   0.062 min   0.037 max   0.087 sdev   0.018
lseeks:			57

FILE: /etc/security/login.cfg  volume: /dev/hd4 (/)  inode: 24
opens:			3
total bytes xfrd:	16384
reads:			4	(0 errs)
  read sizes (bytes):	avg  4096.0 min    4096 max    4096 sdev     0.0
  read times (msec):	avg   0.059 min   0.031 max   0.120 sdev   0.035
lseeks:			132

FILE: /usr/lib/security/methods.cfg  volume: /dev/hd2 (/usr)  inode: 8519
opens:			1
total bytes xfrd:	12288
reads:			3	(0 errs)
  read sizes (bytes):	avg  4096.0 min    4096 max    4096 sdev     0.0
  read times (msec):	avg   0.036 min   0.018 max   0.071 sdev   0.024
lseeks:			43

FILE: /tmp/psef.out  volume: /dev/hd3 (/tmp)  inode: 154
opens:			1
total bytes xfrd:	6777
writes:			2	(0 errs)
  write sizes (bytes):	avg  3388.5 min    2697 max    4080 sdev   691.5
  write times (msec):	avg   0.437 min   0.425 max   0.449 sdev   0.012

FILE: /var/ct/cfg/current_cluster  volume: /dev/hd9var (/var)  inode: 184
opens:			1
total bytes xfrd:	4096
reads:			1	(0 errs)
  read sizes (bytes):	avg  4096.0 min    4096 max    4096 sdev     0.0
  read times (msec):	avg   0.064 min   0.064 max   0.064 sdev   0.000

FILE: /var/ct/1613927244/cfg/nodedef.cfg  volume: /dev/hd9var (/var)  inode: 185
opens:			1
total bytes xfrd:	4096
reads:			1	(0 errs)
  read sizes (bytes):	avg  4096.0 min    4096 max    4096 sdev     0.0
  read times (msec):	avg   0.034 min   0.034 max   0.034 sdev   0.000

FILE: /usr/lib/nls/msg/en_US/ps.cat  volume: /dev/hd2 (/usr)  inode: 181119
opens:			1
total bytes xfrd:	4096
reads:			1	(0 errs)
  read sizes (bytes):	avg  4096.0 min    4096 max    4096 sdev     0.0
  read times (msec):	avg   0.124 min   0.124 max   0.124 sdev   0.000
lseeks:			11

FILE: pid=17048_fd=7
opens:			0
total bytes xfrd:	512
reads:			2	(1 errs)
  read sizes (bytes):	avg   256.0 min     256 max     256 sdev     0.0
  read times (msec):	avg   0.019 min   0.016 max   0.022 sdev   0.003

FILE: pid=17048_fd=8
opens:			0
total bytes xfrd:	4
writes:			1	(0 errs)
  write sizes (bytes):	avg     4.0 min       4 max       4 sdev     0.0
  write times (msec):	avg   0.022 min   0.022 max   0.022 sdev   0.000

------------------------------------------------------------------------
Detailed VM Segment Stats   (4096 byte pages)
------------------------------------------------------------------------

SEGMENT: 1f8eb  segtype: persistent  volume: /dev/lv00  inode: 102403
segment flags:		pers 
reads:			1064	(0 errs)
  read times (msec):	avg   1.681 min   0.747 max  10.939 sdev   1.093
  read sequences: 	1
  read seq. lengths:	avg  1064.0 min    1064 max    1064 sdev     0.0

SEGMENT: 2f61  segtype: persistent  volume: /dev/hd2  inode: 6969
segment flags:		pers 
reads:			3	(0 errs)
  read times (msec):	avg   5.785 min   2.313 max  10.778 sdev   3.619
  read sequences: 	3
  read seq. lengths:	avg     1.0 min       1 max       1 sdev     0.0

SEGMENT: 23405a  segtype: system
segment flags:		pers jnld sys 
reads:			3	(0 errs)
  read times (msec):	avg   5.156 min   0.432 max  10.826 sdev   4.296
  read sequences: 	1
  read seq. lengths:	avg     3.0 min       3 max       3 sdev     0.0

SEGMENT: 1e0af  segtype: system
segment flags:		pers sys 
writes:			1	(0 errs)
  write times (msec):	avg   7.459 min   7.459 max   7.459 sdev   0.000
  write sequences: 	1
  write seq. lengths:	avg     1.0 min       1 max       1 sdev     0.0

SEGMENT: 2c655  segtype: persistent  volume: /dev/lv00  inode: 108544
segment flags:		pers jnld 
reads:			1	(0 errs)
  read times (msec):	avg   2.569 min   2.569 max   2.569 sdev   0.000
  read sequences: 	1
  read seq. lengths:	avg     1.0 min       1 max       1 sdev     0.0

SEGMENT: b981  segtype: .inodes  volume: /dev/lv00  inode: 3
segment flags:		pers jnld sys 
reads:			1	(0 errs)
  read times (msec):	avg  10.596 min  10.596 max  10.596 sdev   0.000
  read sequences: 	1
  read seq. lengths:	avg     1.0 min       1 max       1 sdev     0.0

------------------------------------------------------------------------
Detailed Logical Volume Stats   (512 byte blocks)
------------------------------------------------------------------------

VOLUME: /dev/lv00  description: /home/pittman
reads:			173	(0 errs)
  read sizes (blks): 	avg    49.4 min       8 max      64 sdev    19.6
  read times (msec):	avg   1.707 min   0.405 max  10.798 sdev   1.551
  read sequences: 	9
  read seq. lengths:	avg   950.2 min       8 max    7200 sdev  2228.1
seeks:			9	(5.2%)
  seek dist (blks):	init 868352,
			avg  7935.0 min       8 max   35104 sdev 11330.8
time to next req(msec): avg   6.951 min   0.009 max  25.293 sdev   3.863
throughput:		3531.3 KB/sec
utilization:		0.20

VOLUME: /dev/hd2  description: /usr
reads:			3	(0 errs)
  read sizes (blks): 	avg     8.0 min       8 max       8 sdev     0.0
  read times (msec):	avg   5.733 min   2.274 max  10.719 sdev   3.613
  read sequences: 	3
  read seq. lengths:	avg     8.0 min       8 max       8 sdev     0.0
seeks:			3	(100.0%)
  seek dist (blks):	init 1028896,
			avg    16.0 min      16 max      16 sdev     0.0
time to next req(msec): avg 229.083 min   7.685 max 667.521 sdev 310.028
throughput:		9.9 KB/sec
utilization:		0.01

VOLUME: /dev/hd8  description: jfslog
writes:			1	(0 errs)
  write sizes (blks): 	avg     8.0 min       8 max       8 sdev     0.0
  write times (msec):	avg   7.402 min   7.402 max   7.402 sdev   0.000
  write sequences: 	1
  write seq. lengths:	avg     8.0 min       8 max       8 sdev     0.0
seeks:			1	(100.0%)
  seek dist (blks):	init  22584
time to next req(msec): avg 989.175 min 989.175 max 989.175 sdev   0.000
throughput:		3.3 KB/sec
utilization:		0.01

------------------------------------------------------------------------
Detailed Physical Volume Stats   (512 byte blocks)
------------------------------------------------------------------------

VOLUME: /dev/hdisk8  description: N/A
reads:			141	(0 errs)
  read sizes (blks): 	avg    60.7 min       8 max      64 sdev    12.5
  read times (msec):	avg   1.533 min   0.299 max  10.690 sdev   1.642
  read sequences: 	9
  read seq. lengths:	avg   950.2 min       8 max    7200 sdev  2228.1
seeks:			9	(6.4%)
  seek dist (blks):	init 4477184,
			avg  7935.0 min       8 max   35104 sdev 11330.8
  seek dist (%tot blks):init 25.18929,
			avg 0.04464 min 0.00005 max 0.19750 sdev 0.06375
time to next req(msec): avg   8.590 min   0.726 max  25.367 sdev   2.107
throughput:		3531.3 KB/sec
utilization:		0.17

VOLUME: /dev/hdisk0  description: N/A
reads:			3	(0 errs)
  read sizes (blks): 	avg     8.0 min       8 max       8 sdev     0.0
  read times (msec):	avg   5.544 min   2.124 max  10.503 sdev   3.590
  read sequences: 	3
  read seq. lengths:	avg     8.0 min       8 max       8 sdev     0.0
writes:			1	(0 errs)
  write sizes (blks): 	avg     8.0 min       8 max       8 sdev     0.0
  write times (msec):	avg   7.198 min   7.198 max   7.198 sdev   0.000
  write sequences: 	1
  write seq. lengths:	avg     8.0 min       8 max       8 sdev     0.0
seeks:			4	(100.0%)
  seek dist (blks):	init 8373280,
			avg 411904.0 min      16 max 1235680 sdev 582497.6
  seek dist (%tot blks):init 47.10929,
			avg 2.31743 min 0.00009 max 6.95211 sdev 3.27722
time to next req(msec): avg 247.311 min   7.680 max 667.589 sdev 270.369
throughput:		13.2 KB/sec
utilization:		0.02

It is also possible to capture a trace file, format it, and then run filemon against the formatted trace. See below for more information.

How to interpret the Most Active Files report

Let's focus on the four busiest files:

Most Active Files
------------------------------------------------------------------------
  #MBs  #opns   #rds   #wrs  file                     volume:inode
------------------------------------------------------------------------
   4.1      0      0   1056  pid=30996_fd=1
   4.1      1   1056      0  unix_mp                  /dev/lv00:102403
   4.1      0   8382      0  pid=35890_fd=0
   4.1      1      0   8381  null

The Most active files article describes the columes in the report as follows:

#MBs Total number of megabytes transferred for the file. Rows are sorted by this field in decreasing order.
#opns Number of open system calls for the file.
#rds Number of read system calls for the file.
#wrs Number of write system calls for the file.
file File name (full path name is in detailed report).
volume:inode The logical volume that the file resides in and the i-node number of the file in the associated file system.

Please examine the #opns column and note that information is provided in the volume:inode column only when the trace captures at least one open system call (#opns), although volume:inode is not provided for null since it isn't a real file. If no open system calls are captured, filemon does not get (and can not report) a file's inode number and the file column will show only the ID of the process which is accessing the file and the file descriptor number through which I/Os are being driven.

4.1 megabytes were processed in all four of these files while trace information was being captured.

The first file is pid=30996_fd=1. Since we know stdin is fd=0, stdout is fd=1, and stderr is fd=2 (see the Standard input, standard output, and standard error files article), we realize pid=30996_fd=1 is not a file at all, but rather stdout for process 30996:

# grep 30996 /tmp/psef.out
    root 30996 45522  21 10:48:08  pts/3  0:00 cat /home/pittman/fmontest/unix_mp
#

Process 30996 is an AIX cat command. Given the test case used in the example shell script, that is no surprise. Writes to stdout frequently are written to a terminal or to stdin of another process, so they often don't result in any writes to hdisks. But stdout can, of course, be redirected to a file.

The second file is unix_mp, but what directory is the file in? The Detailed File Stats report has the file's full path name. Files appear in the same order in the Detailed File Stats report as they appear in the Most active files report.

Or the volume:inode information (/dev/lv00:102403) can be used to get the answer from the running system:

# ncheck -i 102403 /dev/lv00
/dev/lv00:
102403  /fmontest/unix_mp
# lsfs /dev/lv00
Name            Nodename   Mount Pt               VFS   Size    Options    Auto Accounting
/dev/lv00       --         /home/pittman          jfs   20971520 --         yes  no
#

The full path to file unix_mp is /home/pittman/fmontest/unix_mp. Again, no surprise given the test case used in the example shell script. Reads from this file will result in I/O to hdisks only if the file pages are not already cached in memory. To determine which AIX processes are currently accessing the file, use:

fuser -fu /home/pittman/fmontest/unix_mp

The third file is pid=35890_fd=0, which we recognize as stdin for process 35890:

# grep 35890 /tmp/psef.out
    root 15290 35890  69 10:48:08  pts/3  0:00 dd of=/dev/null
    root 35890 45522  53 10:48:08  pts/3  0:00 dd of=/dev/null
#

Process 35890 is an AIX dd command. Again, no surprise given the test case used in the example shell script. Reads from stdin are frequently satisfied by stdout of another process or stdin can be redirected to come from a file, so reads from stdin often don't result in any reads from hdisks. Hmmm, it looks like the dd forks a child and we are seeing stdin on the parent process, not the child.

The fourth most active file is null. Even though #opns is not zero for the file, no volume:inode information is provided. The Detailed File Stats report shows that the fourth file is /dev/null.

Suppose we capture a filemon trace for a few seconds on a running system. How likely are we to capture an open request for each file that is in use? Probably not very likely. If we know the application very well, we can probably guess what files are being accessed based on the process executable and the file descriptor number. But suppose we don't know the application very well. What then?

How about using truss to trace an application on a test system, invoking the application with the same flags used when the application was started on the running system, as captured in /tmp/psef.out?

cat /home/pittman/fmontest/unix_mp | truss -f -t open -o /tmp/truss.out dd of=/dev/null

In /tmp/truss.out, we find:

30800: open("/", O_RDONLY)                              = 3
30800: close(3)                                 = 0
30800: open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP) = 3
30800: close(5)                                 = 0
                                        = 0
30800: close(6)                                 = 0
29644:  close(4)                                        = 0
29644:  close(5)                                        = 0
29644:  close(6)                                        = 0
29644:  close(0)                                        = 0
29644:  close(3)                                        = 0
30800: close(7)                                 = 0
30800: close(4)                                 = 0
30800: close(0)                                 = 0
30800: close(3)                                 = 0
30800: open("/usr/lib/nls/msg/en_US/dd.cat", O_RDONLY)  = 0

The file descriptor follows the = sign on the far right of each line. Had we seen pid=35890_fd=3 rather than null in /tmp/filemon.out, the contents of /tmp/truss.out show that the AIX dd command gets fd=3 when it opens the file /dev/null specified by the of= parameter.

Running filemon against an AIX V5.3 trace file

It is certainly possible to capture a trace file, format it, and then run filemon against the formatted trace. To capture a trace while collecting only those trace hooks required for filemon, specify -J filemon on the trace command, as described in the Debug and Performance Tracing article. (Any event group can be specified on the trace -J flag. Use the command trcevgrp -l to list all event groups.)

Here is an example in a shell script:

#!/bin/ksh
dir="/tmp/fmontrc"
mkdir $dir
((trcbufsz=5*1024*1024))
((trclogsz=trcbufsz*10))
cmd="trace -ad -J filemon -T $trcbufsz -L $trclogsz -C all -o $dir/trace.raw"
echo $cmd
$cmd
( sleep 1 ; ps -ef >/tmp/psef.out ) &
trcon
cat /home/pittman/test/unix_mp | dd of=/dev/null
trcstop
# Produce formatted trace report
cmd="trcrpt -o $dir/trace.rpt -C all $dir/trace.raw"
echo $cmd
$cmd
# Capture gennames output in file
$cmd="gennames -f"
echo $cmd
$cmd >$dir/gennames.out
# Produce filemon report
cmd="filemon -i $dir/trace.raw -n $dir/gennames.out -O all -u -o /tmp/filemon.out"
echo $cmd
$cmd

Please note that if the trace will be captured on one server and formatted on another, the output of trcnm and the /etc/trcfmt file must be collected from the server on which the trace is captured, as described in the Formatting a report on a different system article.

Issues, and how to handle them

Note

If the AIX trcon command generates the message "/dev/systrctl: There is a request to a device or address that does not exist." when used to start a trace following a filemon -d, it is because the filemon command has not gotten the trace started when trcon is invoked. Try sleeping for a second after filemon and before trcon.

Note
As described in the filemon Command article, while monitoring I/O-intensive applications, the filemon command may not be able to consume trace events as fast as they are produced in real time. When that happens, the message:
nnnnnnn events were lost.  Reported data may have inconsistencies or errors.

will appear in near the top of the filemon report, indicating how many trace events were lost while trace buffers were full. The accuracy of the report will be diminished to some unknown degree. One way to prevent overflow is to monitor fewer levels of the file and I/O subsystems: the number of trace events generated is proportional to the number of levels monitored. Additionally, trace buffer size can be increased using the -T option, to accommodate larger bursts of trace events before overflow. But remember that increasing the trace buffer size will result in more pinned memory, and therefore may effect I/O and paging behavior. And as stated in the trace Daemon article:

When trace uses memory from the kernel heap which is the case for the -B option (32-bit kernel only), this memory remains part of kernel memory until the next reboot of the system. Thus, care should be taken when using large buffers.

and:

Also note that trace buffers use pinned memory, which means they are not pageable. Therefore, the larger the trace buffers, the less physical memory is available to applications.

Unless the -b or -B flags are specified, the system attempts to allocate the buffer space from the kernel heap. If this request can not be satisfied, the system then attempts to allocate the buffers as separate segments.

Please note that the first caveat applies only when AIX V5.3 is running a 32-bit kernel (pretty rare nowadays). The command bootinfo -K will display 64 or 32 to indicate which kernel is currently being run.

Note

Please note that while filemon reports 1.211 secs in measured interval in console output, that information does not appear in /tmp/filemon.out. It can, however, be calculated from information regarding a busy hdisk in the Detailed Physical Volume Stats report. For example, using /dev/hdisk8, multiply reads (141) times average read block size (60.7) and divide by throughput (3531.3) and blocks/KB (2) to get number of seconds analyzed by filemon: 141*60.7/(3531.3*2) = 1.212 seconds. If the hdisk handles writes as well as reads, those writes must be included with reads in the calculation.

Note

AIX V5.3 filemon might report hdisk utilizations which are too high. That is, hdisk utilizations displayed by filemon may be incorrect, if so, and significantly higher than hdisk utilizations displayed by iostat. The problem is described by APAR IY66667 (DISK UTILIZATION VALUES GIVEN BY FILEMON ARE NOT ACCURATE). Before depending on hdisk utilization reported by filemon, confirm that the fix for APAR IY66667 is installed:

# instfix -aivk IY66667
IY66667 Abstract: Disk utilization values given by filemon are not accurate

IY66667 Symptom Text:
 The filemon command gives inaccurate disk utilisation values
 that doesn't approximate the tm_act in the iostat command.


----------------------------
    Fileset bos.perf.tools:5.3.0.20 is applied on the system.
    All filesets for IY66667 were found.
#

Please note that hdisk utilizations reported by filemon and by iostat will never be exactly the same. The iostat command's estimates of hdisk utilization are derived from kernel counters updated by AIX as it samples hdisk busy status. The iostat estimates are not as accurate as hdisk utilizations reported by (the fixed version of) filemon.

AIX V5.3 iostat awk script

The following awk script can be used (with, for example, iostat 1 | awk -f resptimes.awk) to add I/O response time and block size to iostat output. With this method, you can see response times immediately and averaged over the one-second reporting interval specified on the iostat command in the example.

/^Disks/ { print $0," Resptime Blksize" ; next }
/^hdisk/ {
    hdisk=$1
    tmact=$2
    kbps=$3
    tps=$4
    kbread=$5
    kbwrtn=$6
    if ( tps == 0.0 ) {print $0 ; next }
    else
    {
      # Compute response time & block size
      resptime=10*tmact/tps
      blksize=kbps/tps
      # Display updated iostat output
      printf "%-14.14s %5.1f     %5.1f   %7.1f   %8u  %8u %7.1f %7.1f\n",\
             hdisk,tmact,kbps,tps,kbread,kbwrtn,resptime,blksize
    }
  }
!/^hdisk/ {print $0

But please note that the response time estimate produced by the awk script is flawed in two ways:

  1. The response time estimate calculation assumes that there is no I/O overlap (that is, no two I/O operations are active simultaneously). As I/O load increases, some I/Os will inevitably overlap and the response time estimate derived from iostat will be too small.
  2. The response time estimate includes both read and write response times. If one is willing to make the bold assumption that read and write block sizes are the same, it is possible to use the iostat Kbps, Kb_read, and Kb_wrtn metrics to derive separate estimates of average read response time and average write response time. But that assumption is perhaps too bold (too likely to be wrong).

And please note that the iostat extended disk report includes service time and queue time metrics for each hdisk. (Response time = queue time + service time). Use the -D flag on the AIX iostat command to request an extended disk report. See the Disk/Tape Utilization Report section of the AIX iostat command article for documentation of metrics displayed in an extended disk report.

( According to the AIX "From Strength to Strength" flyer, iostat disk performance metrics were introduced in the AIX V5.3 update which became available on 15-April-05. According to the AIX V5.3 Fix packs web site, AIX V5.3 Technology Level 2 was released in May 2005. So it seems likely that the iostat -D flag is available only in AIX V5.3 TL2 or above.)

In any case, please remember that hdisk response times measured on a server include latencies external to the disk storage subsystem (eg, the time taken to transit the SAN), so average response time delivered, measured, and reported by the disk storage subsystem will be lower than the average response time seen, measured, and reported by a server. And see the comment above regarding the LRU daemon, which can adversely affect I/O response times seen by a System p server.

The contents of this web page solely reflect the personal views of the authors and do not necessarily represent the views, positions, strategies or opinions of IBM or IBM management. Please use the Add Comment link at the bottom of the page to provide feedback. Note: Until you sign in (using the link in the upper right corner of this web page), you will not see the Add Comment link and you can not add a comment. If you do not already have an IBM ID, use the Register Now link on the sign in page to obtain one. Registration is quick and easy.


 
    About IBM Privacy Contact