Gathering performance data
When customers ask about performance problems, a standard set of performance information is generally needed to complete a first pass at problem determination. See this page for an example of gathering information from a Power system running RHEL 5.1.
Red Hat provides a new system information gathering tool called "sosreport" (son of sysreport).
sosreport
snap
snap is an easy-to-use tool that captures the basic system information to provide to support teams. The tarball it generates is small enough to attach to an email.
Linux Performance Tools
- Profiling tools
- Tracing tools
- Monitoring tools
- Other tools
Profiling Tools
Code profiling tools collect information about the code executing on the system. The system is periodically interrupted so the information can be collected. The information is then used to analyze the performance of the code. Code profiling data may identify "hot spots" in code. The hot spots can then be analyzed further for performance defects.
For customers using supported distros, OProfile is the recommended tool. It ships with the system and is easy to run.
- Code profilers
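The periodic-interrupt idea described above can be illustrated with a minimal sketch in Python. This is not how OProfile is implemented (OProfile samples system-wide from the kernel). It only mimics the principle inside a single process, assuming a Unix system where `SIGPROF` and `signal.setitimer` are available. The names `profile`, `hot_loop` and `workload` are hypothetical and chosen for illustration.

```python
import collections
import signal


def profile(func, interval=0.001):
    """Run func() and sample the executing function every `interval` s of CPU time."""
    samples = collections.Counter()

    def on_tick(signum, frame):
        # The interrupted frame identifies the code that was running.
        if frame is not None:
            samples[frame.f_code.co_name] += 1

    previous = signal.signal(signal.SIGPROF, on_tick)
    # ITIMER_PROF counts CPU time used by the process and raises SIGPROF.
    signal.setitimer(signal.ITIMER_PROF, interval, interval)
    try:
        func()
    finally:
        signal.setitimer(signal.ITIMER_PROF, 0, 0)  # stop sampling
        signal.signal(signal.SIGPROF, previous)
    return samples


def hot_loop():
    # Deliberately expensive: this should show up as the "hot spot".
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


def workload():
    hot_loop()
    return sum(range(1000))


if __name__ == "__main__":
    # Print a small "hit list" of the most frequently sampled functions.
    for name, count in profile(workload).most_common(5):
        print(f"{count:6d}  {name}")
```

The function with the most samples is where the process spent most of its CPU time, which is the sense in which a profile points at hot spots.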
Tracing Tools
An Event Based Trace Facility collects information about events that occur on the system such as scheduling dispatches, interrupts, and I/O. Trace points are inserted in the kernel code to record the events to a trace buffer. User level tools are provided to view the trace events in a time sequenced fashion. The events can be analyzed to gain a better understanding of the dynamics of the system.
- Linux Trace Toolkit (http://www.opersys.com/LTT/)
- A suite of tools designed to trace program execution and extract profile information,
- such as processor utilization and allocation information for a certain period of time
- strace
- strace is a system call tracer
- A debugging tool that prints a trace of all system calls made by a process or program.
- The program to be traced need not be recompiled, so strace can be used on binaries for which there is no source.
- In the simplest case, strace runs the specified command until it exits.
- It intercepts and records the system calls which are called by a process and the signals which are received by a process.
- The name of each system call, its arguments and its return value are printed on standard error or to the file specified with the -o option.
- Each line in the trace contains the system call name, followed by its arguments in parentheses and its return value
- Performance Inspector (http://perfinsp.sourceforge.net/)
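Because each strace line has the shape "name(arguments) = return value", a trace saved with the -o option can be post-processed with a short script. The following is a minimal sketch in Python, assuming the plain default output format (no PID prefixes, timestamps or unfinished/resumed calls). The names `parse_line`, `count_syscalls` and the sample trace are hypothetical and for illustration only.

```python
import collections
import re

# name(args) = result, where result may carry an errno and message.
SYSCALL_LINE = re.compile(r"^(?P<name>\w+)\((?P<args>.*)\)\s+=\s+(?P<result>.+)$")

SyscallRecord = collections.namedtuple("SyscallRecord", "name args result")


def parse_line(line):
    """Return a SyscallRecord for a system call line, or None for other lines."""
    match = SYSCALL_LINE.match(line.strip())
    if match is None:
        return None  # e.g. signal lines or the "+++ exited ..." trailer
    return SyscallRecord(match.group("name"), match.group("args"), match.group("result"))


def count_syscalls(lines):
    """Count how often each system call name appears in the trace."""
    counts = collections.Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[record.name] += 1
    return counts


# Made-up trace lines in the usual strace layout.
SAMPLE = """\
execve("/bin/true", ["true"], 0x7ffd /* 20 vars */) = 0
open("/etc/missing", O_RDONLY) = -1 ENOENT (No such file or directory)
write(1, "hi", 2) = 2
write(1, "ok", 2) = 2
+++ exited with 0 +++
"""

if __name__ == "__main__":
    for name, count in count_syscalls(SAMPLE.splitlines()).most_common():
        print(f"{count:4d}  {name}")
```

A count like this is only a first look at which calls dominate a trace. For timing information, strace's own options are likely a better starting point than a custom parser.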
Resource Monitoring Tools
- Linux provides facilities to monitor the utilization of memory resources under the /proc filesystem
- /proc/meminfo and /proc/slabinfo; these two files capture the state of physical memory
- vmstat - virtual memory statistics
- top - process statistics
- netstat - network statistics
- sysstat - sar, iostat, mpstat (http://perso.wanadoo.fr/sebastien.godard/)
- sysstat now ships with the primary distros
- Lockmeter
- instruments the spin locks in a multiprocessor Linux kernel
- used to identify which portions of the kernel code are responsible for causing lock contention.
- Lockmeter allows the following statistics to be measured for each spin lock:
- The fraction of the time that the lock is busy.
- The fraction of accesses that resulted in a conflict.
- The average and maximum amount of time that the lock is held.
- The average and maximum amount of time spent spinning for the lock.
- LPAR CPU statistics and documentation tool: LPAR2RRD
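As a small illustration of the /proc interface mentioned in the list above, the following Python sketch parses text in the /proc/meminfo layout ("Key: value kB") into a dictionary. The function names and the sample text are hypothetical. The "used fraction" shown is a deliberately crude figure that ignores buffers and page cache, so treat it as an assumption rather than a recommended metric.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or not fields:
            continue  # skip blank or malformed lines
        # Most values are in kB; a few (e.g. HugePages_Total) are plain counts.
        info[key.strip()] = int(fields[0])
    return info


def used_fraction(info):
    """Crude estimate: share of MemTotal that is not MemFree."""
    return (info["MemTotal"] - info["MemFree"]) / info["MemTotal"]


def read_meminfo(path="/proc/meminfo"):
    with open(path) as handle:
        return parse_meminfo(handle.read())


# Made-up sample in the same layout as the real file.
SAMPLE = """\
MemTotal:        8000000 kB
MemFree:         2000000 kB
HugePages_Total:       0
"""

if __name__ == "__main__":
    # Fall back to the sample on systems without /proc/meminfo.
    if os.path.exists("/proc/meminfo"):
        info = read_meminfo()
    else:
        info = parse_meminfo(SAMPLE)
    print(f"MemTotal: {info['MemTotal']} kB, used fraction: {used_fraction(info):.2f}")
```

Tools such as vmstat and top read the same kernel-provided data, so a script like this is mainly useful for logging a few fields over time.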

Standard Linux Performance Tools (System, I/O, etc.)
Standard Linux Performance Tools (Network)
In-depth materials
OProfile
http://oprofile.sourceforge.net/
Overview
OProfile is a system-wide profiler for Linux systems, capable of profiling all running code at low overhead. OProfile is released under the GNU GPL.
It consists of a kernel driver and a daemon for collecting sample data, and several post-profiling tools for turning data into information.
OProfile can leverage the hardware performance counters of the CPU to enable profiling of a wide variety of interesting statistics, which can also be used for basic time-spent profiling. All code is profiled: hardware and software interrupt handlers, kernel modules, the kernel, shared libraries, and applications.
Features
- Unobtrusive
- No special recompilations, wrapper libraries or the like are necessary. Even debug symbols (-g option to gcc) are not necessary unless you want to produce annotated source. No kernel patch is needed - just insert the module.
- System-wide profiling
- All code running on the system is profiled, enabling analysis of system performance.
- Performance counter support
- Enables collection of various low-level data, and association with particular sections of code.
- Call-graph support
- With a recent 2.6 kernel, OProfile can provide gprof-style call-graph profiling data.
- Low overhead
- OProfile has a typical overhead of 1-8%, dependent on sampling frequency and workload.
- Post-profile analysis
- Profile data can be produced on the function-level or instruction-level detail. Source trees annotated with profile information can be created. A hit list of applications and functions that take the most time across the whole system can be produced.
System support
OProfile works across a range of CPUs, including the Intel range, AMD's Athlon and AMD64 processor ranges, the Alpha, and more. OProfile will work against almost any 2.2, 2.4 or 2.6 kernel, and works on both UP and SMP systems from desktops to the scariest NUMAQ boxes.
Munin
http://munin.projects.linpro.no
Overview
From the home page:
Munin the monitoring tool surveys all your computers and remembers what it saw. It presents all the information in graphs through a web interface. Its emphasis is on plug and play capabilities. After completing a installation a high number of monitoring plugins will be playing with no more effort.
Using Munin you can easily monitor the performance of your computers, networks, SANs, applications, weather measurements and whatever comes to mind. It makes it easy to determine "what's different today" when a performance problem crops up. It makes it easy to see how you're doing capacity-wise on any resources.
Munin uses the excellent RRDTool (written by Tobi Oetiker) and the framework is written in Perl, while plugins may be written in any language. Munin has a master/node architecture in which the master connects to all the nodes at regular intervals and asks them for data. It then stores the data in RRD files, and (if needed) updates the graphs. One of the main goals has been ease of creating new plugins (graphs).
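To give a feel for what a plugin looks like, here is a minimal sketch in Python. It assumes Munin's usual plugin convention: when called with the argument `config` the plugin prints graph metadata, and when called without arguments it prints one `field.value` line per data series. The field name `load`, the graph titles and the function names are illustrative choices, not taken from any shipped plugin.

```python
import os
import sys


def one_minute_load():
    # os.getloadavg() returns the 1, 5 and 15 minute load averages (Unix only).
    return os.getloadavg()[0]


def plugin_output(argv, read_load=one_minute_load):
    """Return the text a node would hand back to the master for this plugin."""
    if len(argv) > 1 and argv[1] == "config":
        # Describes the graph; requested when the plugin is run with "config".
        return "\n".join([
            "graph_title Load average (example)",
            "graph_vlabel load",
            "graph_category system",
            "load.label 1 minute load",
        ])
    # Normal run: report the current value for each field.
    return f"load.value {read_load():.2f}"


if __name__ == "__main__":
    print(plugin_output(sys.argv))
```

Passing the reader function in as a parameter is only a convenience for trying the sketch with a fixed value. A real plugin would normally be a small standalone executable placed in the node's plugin directory.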
Features
- Support for most common UNIX-like systems, Linux and AIX included
- Easy installation on Linux; packages are available for most Linux distributions, and native packages are included in Debian and Fedora.
- At installation time, the system is probed for suitable plugins, and they are automatically set up
- Web GUI frontend showing trends for the last day, week, month and year.
- Actively supported as an Open Source project; all code is available as Free Software
- Power specific: plugins are available that show real (physical) usage for an LPAR