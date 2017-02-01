Analyzing performance using perf annotate
Perf is a powerful performance analysis tool. It can be seen as a combination of two
components: a userspace tool and the kernel infrastructure. The userspace tool is also
included in the Linux™ kernel repository under the tools/perf/ path. Perf
provides several functions through different subcommands, built on the infrastructure
that the kernel provides. These subcommands can be listed by running the
perf --help command. You can instrument CPU performance counters,
tracepoints, kprobes, and uprobes (dynamic tracing) by using this tool.
In this article, we focus on the annotate feature that is provided by the userspace Perf tool. The following section describes the annotate feature in general. Later, annotation across architectures is described, which lets you record profile information on, say, IBM® PowerPC® and then report and annotate it on your notebook or x86 system.
Prerequisites
You might have already installed the Perf tool on your system. If not, you can install it
by using the yum install perf command on Fedora/RHEL or the
apt-get install linux-tools-common command on Ubuntu.
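Before installing anything, it can help to check whether a perf binary is already on your PATH. The following is a minimal sketch; have_tool is a hypothetical helper name introduced here for illustration, not part of Perf.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the named tool is found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool perf; then
    echo "perf found at $(command -v perf)"
else
    echo "perf not found; install it with your distribution's package manager"
fi
```

The check only tells you that a binary exists. Whether it matches your running kernel is a separate question that this sketch does not answer.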
Annotate with perf
Perf annotate offers the ability to map recorded profile information to the actual
functions and instructions in the object code. You can use the code browsing capability to
follow the code execution alongside profiling information. It allows you to browse code by
using the text-based user interface (TUI) of the perf report, perf top, and
perf annotate commands.
Record
The perf record command records the cycles event by default; use the
perf list command to list all events that are supported on your system. You
might have to use the sudo command for many of these commands.
$ perf record -a
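A system-wide recording runs until you interrupt it, so it is often convenient to name the event explicitly and bound the run with a short sleep. The sketch below only builds and prints such a command so that you can review it before running it. The helper name build_record_cmd, the cycles event, and the five-second duration are assumptions chosen for illustration; -e, -a, and -o are standard perf record options.

```shell
#!/bin/sh
# Hypothetical helper: builds (but does not run) a system-wide perf record
# command for one event, bounded by a sleep of the given length.
build_record_cmd() {
    event="$1"      # event name as shown by `perf list`, for example cycles
    seconds="$2"    # how long to sample, in seconds
    output="$3"     # output file; perf.data is the default name perf uses
    printf 'perf record -e %s -a -o %s -- sleep %s\n' "$event" "$output" "$seconds"
}

# Print the command for review; run it by hand afterwards
# (prefix it with sudo if perf reports a permission error).
build_record_cmd cycles 5 perf.data
```

Printing the command instead of executing it keeps the sketch safe to try on any machine, including one without Perf installed.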
Report
You can view the result by using the
perf report command.
$ perf report
Figure 1. Perf report
Annotate
Pressing 'a' on any symbol, for example snooze_loop(), displays the
assembly instructions of that function along with the source code. If you do not see the source,
the debuginfo package for the kernel or the userspace binary might be missing and needs to be
installed.
Figure 2. Annotate particular function
Numbers on the left side of the bar indicate the percentage of total samples that are
recorded against that particular instruction. For example, 40% of the samples of
snooze_loop() were recorded on the beq 90 instruction. Perf also
shows these numbers in different colors based on how hot the instruction is.
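The same per-instruction percentages can also be read outside the TUI, because perf annotate accepts a --stdio option that prints plain text. The sketch below filters such text for instructions at or above a threshold. The sample lines are made up to resemble that output (the exact column layout can differ between Perf versions), and hot_instructions is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads annotate-style text on stdin and prints the
# lines whose first column is a percentage >= the threshold given as $1.
hot_instructions() {
    awk -v min="$1" '
        # Keep only lines that start with a plain number such as "40.00"
        $1 ~ /^[0-9]+(\.[0-9]+)?$/ && $1 + 0 >= min + 0 { print }
    '
}

# Made-up sample resembling `perf annotate --stdio` output.
sample='    0.00 :   c000000000a1b2c0:   nop
   40.00 :   c000000000a1b2c4:   beq     90
    5.25 :   c000000000a1b2c8:   ld      r9,0(r10)
         :   c000000000a1b2cc:   blr'

# Show instructions that account for at least 5% of the samples.
printf '%s\n' "$sample" | hot_instructions 5
```

With real data, the input would come from a command such as perf annotate --stdio snooze_loop piped into the helper.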
Branch instructions display an arrow to the branch target. Pressing Enter on the branch instruction jumps to that target location.
Similarly, a right arrow is displayed for call instructions, and a left arrow is displayed for return instructions. Pressing Enter on the call instruction displays disassembled output of the target function. Pressing Enter on the return instruction gets you back to the caller's disassembled output. Also, you can press 'q' to go back one step.
Figure 3. Annotate call instruction
Select the
bl arch_local_irq_restore+0x8 line and press Enter. You will see
disassembly of
arch_local_irq_restore().
Figure 4. Jump to target function
Annotate help
Different options to change or manipulate annotate output are available in help. Press 'h' to open help.
Figure 5. Annotate help
Press the 's' key to toggle between displaying and hiding the source code. Some of the
examples above do not show the source because they were captured with the toggle set to
hide it. Similarly, press the 'o' key to display the actual
objdump output. Press the 'J' key to display a number before each
instruction that is the target of a branch. The number indicates how many
branch instructions target that particular instruction.
Annotate with perf annotate
You can also use the
perf annotate
command to annotate a symbol.
$ perf annotate smp_call_function_single
Live annotate with perf top
You can also annotate by using the perf top command. Run the
perf top command and press 'a' on any symbol
that you want to annotate. The annotated data is updated dynamically at a fixed interval.
Cross-arch annotate
Perf also supports annotation across architectures from kernel v4.10-rc1 onwards. That is, you can record on, say, PowerPC and annotate on an x86 system. For example:
1. Record on PowerPC by running the following commands.
$ perf record -a    # Generate perf.data
$ perf archive      # Generate perf.data.tar.bz2
Copy perf.data, perf.data.tar.bz2, and vmlinux with the debug information (on a Fedora system, /usr/lib/debug/lib/modules/<kernel_ver>/vmlinux) to the target x86 system (your notebook, for instance). In the following example, the copied files are renamed to include the text powerpc.
2. Report/annotate on the x86 system.
$ yum install binutils-powerpc64le-linux-gnu.x86_64    # Install cross-tools
$ tar xvf perf.data.powerpc.tar.bz2 -C ~/.debug
$ perf report -i perf.data.powerpc --vmlinux vmlinux.powerpc
Annotate any symbol by pressing 'a' on it.
Cross-architecture annotation is enabled only for kernel symbols. Also, you must use the
--source option to annotate with source.
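The cross-architecture steps depend on three files being present on the x86 system, so a quick check before running perf report can save a confusing error. The following is a minimal sketch that assumes the file names used in the example above (perf.data.powerpc, perf.data.powerpc.tar.bz2, and vmlinux.powerpc). The helper name check_cross_arch_files is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report which of the copied files are missing.
# $1 is the directory holding the files, $2 is the name suffix (powerpc here).
# Returns 0 when all three files exist, 1 otherwise.
check_cross_arch_files() {
    dir="$1"
    suffix="$2"
    missing=0
    for f in "perf.data.$suffix" "perf.data.$suffix.tar.bz2" "vmlinux.$suffix"; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Demonstration in a scratch directory with the archive deliberately absent.
demo_dir="$(mktemp -d)"
touch "$demo_dir/perf.data.powerpc" "$demo_dir/vmlinux.powerpc"
check_cross_arch_files "$demo_dir" powerpc || echo "copy the missing files first"
rm -rf "$demo_dir"
```

In practice, you would call the helper with the directory where you copied the files, for example check_cross_arch_files . powerpc, and continue with the tar and perf report steps only when it succeeds.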