Diagnostic tools (Linux and UNIX)

This section describes essential commands for troubleshooting and performance monitoring on Linux® and UNIX platforms. The system commands described here are often preinstalled with the operating system. However, confirm that they exist in your database system environments, and work with your system administrator to install any that are missing.
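One way to confirm that the commands are present is sketched below. The function name missing_commands is hypothetical (it is not part of Db2 or the operating system), and the sketch relies only on the POSIX command -v built-in:

```shell
# Print the name of every listed command that cannot be found on PATH.
# missing_commands is an illustrative helper name, not a standard utility.
missing_commands() {
    for cmd in "$@"; do
        # command -v succeeds only if the shell can resolve the name.
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
        fi
    done
}

# Example usage (prints nothing if every tool is installed):
#   missing_commands df vmstat iostat netstat pstack
```

Any name that is printed would need to be installed before you rely on it during problem determination.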

For details on any of these commands, run man followed by the command name on the command line (for example, man vmstat). Use these commands to gather and process data that can help identify the cause of a problem with your system. After the data is collected, it can be examined by someone who is familiar with the problem, or provided to IBM Software Support if requested.

Troubleshooting commands (AIX)

The following AIX® system commands are useful for Db2® troubleshooting:

errpt
The errpt command reports system errors such as hardware errors and network failures.
  • For an overview that shows one line per error, use errpt
  • For a more detailed view that shows one page for each error, use errpt -a
  • To show only errors with the error identifier "1581762B", use errpt -a -j 1581762B
  • To find out if you ran out of paging space in the past, use errpt | grep SYSVMM
  • To find out if there are token ring card or disk problems, check the errpt output for the phrases "disk" and "tr0"
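The phrase checks in the list above can be combined into a small filter. This is a minimal sketch: the function name errpt_matches is hypothetical, and it assumes only that errpt writes one line per error to standard output, as described above:

```shell
# Read one-line-per-error errpt output on stdin and print the lines that
# contain any of the given phrases (case-insensitive).
# errpt_matches is an illustrative helper name, not an AIX command.
errpt_matches() {
    # Join the arguments into an alternation such as "disk|tr0".
    pattern=$(printf '%s|' "$@")
    pattern=${pattern%|}
    grep -i -E "$pattern"
}

# Example usage on an AIX system:
#   errpt | errpt_matches disk tr0
#   errpt | errpt_matches SYSVMM
```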
lsps
The lsps -a command displays the characteristics and current usage of all paging spaces.
lsattr
The lsattr command displays device attributes, including various operating system parameters. For example, to find out the amount of real memory on the database partition, look for the realmem attribute in the output of the following command:
lsattr -l sys0 -E 
xmperf
For AIX systems using Motif, this command starts a graphical monitor that collects and displays system-related performance data. The monitor displays three-dimensional diagrams for each database partition in a single window, and is good for high-level monitoring. However, if activity is low, the output from this monitor is of limited value.
spmon
If you are using system partitioning as part of the Parallel System Support Program (PSSP), you might need to check if the SP Switch is running on all workstations. To view the status of all database partitions, use one of the following commands from the control workstation:
  • spmon -d for ASCII output
  • spmon -g for a graphical user interface
Alternatively, use the command netstat -i from a database partition workstation to see if the switch is down. If the switch is down, there is an asterisk (*) beside the database partition name. For example:
css0* 65520 <Link>0.0.0.0.0.0 
The asterisk does not display if the switch is up.
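The asterisk check can be scripted. The following sketch assumes that the marked name is the first column of the netstat -i output, as in the example line above; the function name down_interfaces is hypothetical:

```shell
# Read `netstat -i` output on stdin and print each name in the first
# column that is flagged with a trailing asterisk (reported as down).
# down_interfaces is an illustrative helper name.
down_interfaces() {
    awk '$1 ~ /\*$/ { sub(/\*$/, "", $1); print $1 }' | sort -u
}

# Example usage:
#   netstat -i | down_interfaces
```

If the function prints nothing, no entry in the output carried the asterisk marker.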

Troubleshooting commands (Linux and UNIX)

The following system commands are for all Linux and UNIX systems, including AIX, unless otherwise noted.

df
The df command lets you see if file systems are full.
  • To see how much free space is in all mounted file systems, use df
  • To see how much free space is in all file systems with names containing "dev", use df | grep dev
  • To see how much free space is in your home file system, use df /home
  • To see how much free space is in the /tmp file system, use df /tmp
  • To see if there is enough free space on the machine, check the output from the following commands: df /usr , df /var , df /tmp , and df /home
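The free-space checks above can be automated with a threshold. This sketch assumes the POSIX output format of df -P, in which the fifth column is the percentage used and the sixth is the mount point, and it assumes mount points without spaces. The function name full_filesystems and the default threshold of 90 are illustrative choices, not recommended values:

```shell
# Read `df -P` output on stdin and print the mount point and percentage
# used for every file system at or above the given threshold (default 90).
# full_filesystems is an illustrative helper name.
full_filesystems() {
    threshold=${1:-90}
    awk -v limit="$threshold" 'NR > 1 {
        use = $5
        sub(/%$/, "", use)          # strip the trailing percent sign
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Example usage:
#   df -P /usr /var /tmp /home | full_filesystems 90
```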
truss
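The truss command is useful for tracing system calls in one or more processes. It is available on AIX and other UNIX systems; on Linux, the strace command provides similar functionality.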
pstack
The pstack command displays stack traceback information. In RHEL and SUSE environments, pstack is included in the gdb package. In SUSE environments, pstack is a symlink to gstack.

Performance monitoring tools

The following tools are available for monitoring the performance of your system.

vmstat
This command is useful for determining whether something is suspended or is just taking a long time. You can monitor the paging rate in the page in (pi) and page out (po) columns. Other important columns are the amount of allocated virtual storage (avm) and free virtual storage (fre).
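To pick out the samples in which paging occurred, a filter such as the following can be used. It is a sketch that locates the columns by their header labels rather than by position; the function name paging_samples is hypothetical. The default labels pi and po match the columns named above. On Linux, vmstat labels the corresponding swap columns si and so, so those names would be passed as arguments instead:

```shell
# Read vmstat output on stdin and print the paging counters for every
# sample in which either counter is nonzero.
# $1, $2: header labels of the page-in and page-out columns (default pi, po).
# paging_samples is an illustrative helper name.
paging_samples() {
    awk -v incol="${1:-pi}" -v outcol="${2:-po}" '
    {
        # Header rows: remember the positions of the two labelled columns.
        found = 0
        for (i = 1; i <= NF; i++) {
            if ($i == incol)  { ic = i; found = 1 }
            if ($i == outcol) { oc = i; found = 1 }
        }
        if (found) next
    }
    # Data rows: report only samples that show paging activity.
    ic && oc && $ic ~ /^[0-9]+$/ && $oc ~ /^[0-9]+$/ {
        if ($ic + $oc > 0) print incol "=" $ic, outcol "=" $oc
    }'
}

# Example usage (sample every 5 seconds, 12 times):
#   vmstat 5 12 | paging_samples
#   vmstat 5 12 | paging_samples si so    # Linux column labels
```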
iostat
This command is useful for monitoring I/O activity. You can use the read and write rates to estimate the amount of time required for certain SQL operations (if they are the only activity on the system).
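The estimate is a simple division: the amount of data to be read or written, divided by the observed throughput. The sketch below illustrates the arithmetic; the function name estimate_seconds and the units (MB and MB per second) are assumptions made for the example, and the throughput value must be taken by hand from the iostat output:

```shell
# Estimate how long an operation takes when it must transfer a known
# amount of data at an observed rate.
# $1: amount of data in MB
# $2: observed throughput in MB per second (read from iostat output)
# estimate_seconds is an illustrative helper name.
estimate_seconds() {
    awk -v size="$1" -v rate="$2" 'BEGIN {
        if (rate <= 0) { print "rate must be positive" > "/dev/stderr"; exit 1 }
        printf "%.1f\n", size / rate
    }'
}

# Example usage: a 10240 MB scan at 200 MB per second
#   estimate_seconds 10240 200
```

The result is only a rough lower bound, because it ignores CPU time, caching, and concurrent activity on the system.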
netstat
This command reports the network traffic on each database partition and the number of error packets encountered. It is useful for isolating network problems.