This three-part series focuses on the various aspects of Central Processing Unit (CPU) performance and monitoring. The first installment of the series provides an overview of how to efficiently monitor your CPU, discusses the methodology for performance tuning, and gives considerations that can impact performance, either positively or negatively. Though the first part of the series goes through some commands, the second installment focuses much more on the detail of actual CPU systems monitoring and analyzing trends and results. The third installment focuses on proactively controlling thread usage and other ways to tune your CPU to maximize performance. Throughout this series, I'll also expound on various best practices of AIX® CPU performance tuning and monitoring.
This article covers threads, processes, and CPU binding. It also discusses how to use several of the tools illustrated in prior installments to make changes to your systems. The most important commands used to tune the CPU scheduler and the various methods of binding threads that are available on AIX Version 5.3 are also covered.
A junior administrator might consider process management nothing more than
monitoring active processes and possibly killing runaway or zombie processes.
You'll find out that there is a lot more to process management than using the
kill command, or even nice. The fundamental question
that needs to be answered before moving forward is how processes relate to
threads. The answer is surprisingly simple. The process is the actual entity that
AIX uses to control the use of system resources, while the threads control the
actual time consumption, as each kernel thread is a single sequential flow of
control. Each process is made up of one or more threads. Controlling thread usage
is where you can make a difference. To do this, you need to understand the tools
that allow you to work with threads to improve your CPU performance, which is the
scope of this final part of the series.
In this section, I discuss the tools and commands that are available to help you monitor and analyze thread usage. While AIX Version 4 introduced the usage of threads to control processor time consumption, it was in AIX 5L™ where system management tools really evolved to help you monitor and analyze the thread usage. One such tool is procmon, which was introduced in AIX Version 5.3.
Procmon displays a dynamically updated list of processes that lets you gather information about what is running on your system. Where it really stands out compared to other monitoring tools is that it also lets you run commands to facilitate process and thread management. Some of the critical information that it gathers with respect to performance tuning includes:
- The actual amount of CPU time the process is using
- The amount of memory and I/O that the process is using
- The nice values of the process and their priorities
You can even kill jobs and renice them on the fly. Figure 1
gives a nice graphical representation of overall performance. To launch the
Performance Workbench Platform, use: # perfwb.
Figure 1. Procmon partition performance tab
There is also a process table view, which can actually show you a list of threads in a sorted table. You just select Show threads metrics (see Figure 2).
Figure 2. Procmon processes tab
Other menus allow you to either kill processes or
renice them (see
Figure 3).
Figure 3. Procmon processes tab
So what exactly is nice? The nice command
allows you to adjust the priority of a given process. The default nice value of
every process is 20. Using the renice command (either through Procmon
or from the command line) causes the system to assign either a higher or a lower
priority to a running process. When you do this, you actually change the
priority value of the process's threads (which has a base value of 40) by changing the nice value of
the process. A lower priority value means a more favored thread.
When you use the -l flag with ps, you will see your
nice information (see Listing 1).
Listing 1. nice information
# ps -l
     F S UID   PID  PPID C PRI NI  ADDR  SZ WCHAN   TTY TIME CMD
200001 A   0 12972 45770 0  60 20  dea6 764       pts/1 0:00 ksh
200001 A   0 33816 12972 3  61 20 36168 440       pts/1 0:00 ps
240001 A 207 45770 40374 0  60 20 258ec 744       pts/1 0:00 ksh
Let's start a new ksh with nice, changing the priority of the process:
# nice --10 ksh (see Listing 2).
When you look at the process table again, you'll see that the priority of this
process has changed from its default of 60 to 50, as has the priority of the
child process (ps) that was forked from it.
Listing 2. A new ksh using nice
# ps -l
     F S UID   PID  PPID C PRI NI  ADDR  SZ WCHAN   TTY TIME CMD
200001 A   0 12972 45770 0  60 20  dea6 764       pts/1 0:00 ksh
200001 A   0 17246 12972 0  50 10 68a1f 748       pts/1 0:00 ksh
200001 A   0 18450 17246 1  50 10 51bb1 380       pts/1 0:00 ps
240001 A 207 45770 40374 0  60 20 258ec 744       pts/1 0:00 ksh
You can also use the renice command (illustrated
previously with Procmon in Figure 3) to dynamically reassign a
priority to a running process.
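The PRI and NI columns in Listings 1 and 2 follow a simple pattern. The Python sketch below is a simplified model, not the actual AIX scheduler code. It assumes that the priority value is the base value of 40, plus the nice value, plus a CPU penalty of C * sched_R / 32 truncated to an integer, with sched_R at its default of 16 (this parameter is covered later in the article). The function name and the truncation are assumptions made for illustration.

```python
BASE_PRIORITY = 40  # base priority value of a user thread


def thread_priority(nice, recent_cpu, sched_r=16):
    """Simplified model: base + nice + CPU penalty (assumed formula)."""
    # Integer truncation of the penalty is an assumption of this sketch.
    cpu_penalty = recent_cpu * sched_r // 32
    return BASE_PRIORITY + nice + cpu_penalty


# (NI, C) pairs taken from the ps -l output in Listings 1 and 2
print(thread_priority(20, 0))  # ksh at the default nice value -> 60
print(thread_priority(20, 3))  # ps in Listing 1 -> 61
print(thread_priority(10, 0))  # ksh started with nice --10 -> 50
print(thread_priority(10, 1))  # ps forked from the niced ksh -> 50
```

Under these assumptions, the model reproduces the PRI column for every row in both listings, including the ps process whose recent CPU usage (C) of 3 raised its priority value from 60 to 61.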
Back to ps. If you want a more granular look at your threads, use the
-mo THREAD flags (see Listing 3).
Listing 3. Using the -mo flag for a more granular look at your threads
# ps -mo THREAD
    USER   PID  PPID   TID ST CP PRI SC WCHAN      F    TT BND COMMAND
    root 12800 45770     -  A  0  60  1     - 200001 pts/1   - -ksh
       -     -     - 56759  S  0  60  1     -  10400     -   - -
    root 44648 12800     -  A  1  60  1     - 200001 pts/1   - ps -mo THREAD
       -     -     - 64905  R  1  60  1     -      0     -   - -
kmilberg 45770 40374     -  A  0  60  1     - 240001 pts/1   - -ksh
       -     -     - 54005  S  0  60  1     -  10400     -   - -
Though most administrators usually use ps only when doing ps -ef, if you play around a bit more with its
features, you will see that there is a lot more to ps than meets the eye.
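If you want to post-process the thread view from Listing 3, the following Python sketch shows one way to group kernel thread IDs under their owning process. It works on a copy of the listing's text and relies on the layout shown there: process rows carry a PID and a "-" in the TID column, while thread rows carry a "-" in the PID column and a TID. The function name is hypothetical.

```python
SAMPLE = """\
USER PID PPID TID ST CP PRI SC WCHAN F TT BND COMMAND
root 12800 45770 - A 0 60 1 - 200001 pts/1 - -ksh
- - - 56759 S 0 60 1 - 10400 - - -
root 44648 12800 - A 1 60 1 - 200001 pts/1 - ps -mo THREAD
- - - 64905 R 1 60 1 - 0 - - -
kmilberg 45770 40374 - A 0 60 1 - 240001 pts/1 - -ksh
- - - 54005 S 0 60 1 - 10400 - - -
"""


def threads_by_process(ps_output):
    """Map each PID to the list of kernel thread IDs (TIDs) listed under it."""
    result = {}
    current_pid = None
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or malformed lines
        pid, tid = fields[1], fields[3]
        if pid != "-":
            # Process row: starts a new group
            current_pid = int(pid)
            result[current_pid] = []
        elif current_pid is not None and tid != "-":
            # Thread row: belongs to the most recent process row
            result[current_pid].append(int(tid))
    return result


for pid, tids in threads_by_process(SAMPLE).items():
    print(pid, tids)
```

Each process in the listing has a single kernel thread, so every PID maps to a one-element list. A multithreaded process would show several thread rows under one process row.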
Changing the priority of threads
Now that you know how to change the priority of processes, how do you do this with threads? This section shows how you can change some of the CPU scheduling parameters, which are used to calculate the priority value for each thread. You do this by using schedo (schedtune in AIX Version 5.2 and earlier).
First, let's make sure you have the filesets (see Listing 4).
Listing 4. Checking for the filesets
# lslpp -lI bos.perf.tune
  Fileset                      Level  State      Description
  ----------------------------------------------------------------------------
Path: /usr/lib/objrepos
  bos.perf.tune             5.2.0.10  COMMITTED  Performance Tuning Support
Path: /etc/objrepos
  bos.perf.tune             5.2.0.10  COMMITTED  Performance Tuning Support
Now let's report back all the CPU parameters, as shown in Listing 5.
Listing 5. Reporting back all the CPU parameters
# schedo -a
%usDelta = 100
affinity_lim = 7
big_tick_size = 1
fixed_pri_global = 0
force_grq = 0
idle_migration_barrier = 4
maxspin = 16384
pacefork = 10
sched_D = 16
sched_R = 16
timeslice = 1
v_exempt_secs = 2
v_min_process = 2
v_repage_hi = 0
v_repage_proc = 4
v_sec_wait = 1
Start with fixed_pri_global. The default setting
is 0. When a CPU is ready to dispatch a thread, the global run queue is checked
before any of the others. When the thread completes its time slice on the CPU, it gets
put back on the run queue of that CPU. This helps maintain processor affinity (I'll get to this in
a little bit). To improve overall thread performance, there is an environment
variable called RT_GRQ that you can set to ON. This places the
thread on the global run queue. All fixed priority threads will be placed on the
global run queue if you change the default from 0 to 1. You do this by:
# schedo -o fixed_pri_global=1.
Let's get back to threads. The actual priority of a user process varies over
time, depending on the amount of CPU time that the process has used most
recently. The parameters that you need to look at are sched_R
and sched_D. The values for both are expressed in units of 1/32,
and each has a default value of 16. Further, when a thread is created, its CPU
usage value is zero. The more time that it spends on the CPU, the more the usage increments.
Essentially, the scheduler ages the usage value using the following formula:
CPU usage = CPU usage*(D/32).
In this instance, if the D parameter is set to 32, the
thread usage does not decrease. The default value (16) allows the usage to
decrease over time, which improves the thread's priority and gives it more time on the CPU.
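To make the aging formula concrete, here is a small Python sketch that applies CPU usage = CPU usage*(D/32) repeatedly. It is only arithmetic on the formula above, not scheduler code. The function names and the starting usage of 64 are made up for illustration, and how often the real scheduler applies the step is not modeled.

```python
def age_cpu_usage(usage, sched_d=16):
    """Apply one aging step: usage * (D / 32)."""
    return usage * sched_d / 32


def usage_after(usage, steps, sched_d=16):
    """Apply the aging step the given number of times."""
    for _ in range(steps):
        usage = age_cpu_usage(usage, sched_d)
    return usage


print(usage_after(64, 3))              # default D=16 halves usage each step -> 8.0
print(usage_after(64, 3, sched_d=32))  # D=32 never forgives past usage -> 64.0
```

With the default of 16, each step halves the recorded usage, so a thread that stops consuming CPU quickly regains a favorable priority. With D set to 32, the recorded usage never shrinks.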
Each CPU has a dedicated run queue. A run queue is a list of runnable threads, sorted by thread priority value. There are 256 thread priorities (zero to 255). There is also an additional global run queue where new threads are placed.
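As a rough illustration of a run queue sorted by priority value, the following Python sketch models a single queue with a heap. It is a toy data structure, not how the AIX kernel implements its queues. It assumes that the lowest priority value is dispatched first and that threads of equal priority are served in arrival order. The class and method names are hypothetical.

```python
import heapq
import itertools


class RunQueue:
    """Toy run queue: the lowest priority value is dispatched first."""

    def __init__(self):
        self._heap = []
        # Counter used as a tie-breaker so equal priorities stay in FIFO order
        self._order = itertools.count()

    def enqueue(self, priority, thread_id):
        if not 0 <= priority <= 255:
            raise ValueError("priority must be between 0 and 255")
        heapq.heappush(self._heap, (priority, next(self._order), thread_id))

    def dispatch(self):
        """Remove and return the most favored runnable thread."""
        _priority, _seq, thread_id = heapq.heappop(self._heap)
        return thread_id


queue = RunQueue()
queue.enqueue(60, "ksh")
queue.enqueue(50, "niced-ksh")
queue.enqueue(60, "ps")
print([queue.dispatch() for _ in range(3)])  # ['niced-ksh', 'ksh', 'ps']
```

The thread with priority value 50 is dispatched ahead of the two threads at 60, even though it was queued second.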
Schedo is more commonly used to change the length of the scheduler time slice. To
change the time slice, use the
schedo -o timeslice=value option. Increasing the time
slice might improve system throughput, due to reduced context switching. Before
changing this, make sure you run vmstat enough to determine that there really is a
considerable amount of context switching going on.
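To get a feel for what a larger time slice means, the Python sketch below converts a timeslice value into milliseconds and into an upper bound on time-slice expirations per second on one CPU. It assumes that timeslice counts whole clock ticks and that one clock tick is 10 ms. Both are assumptions that you should check against the schedo documentation for your AIX level. The function names are hypothetical.

```python
CLOCK_TICK_MS = 10  # assumed length of one clock tick, in milliseconds


def timeslice_ms(timeslice):
    """Length of one time slice in milliseconds (assumes whole ticks)."""
    if timeslice < 1:
        raise ValueError("this sketch only handles timeslice values >= 1")
    return timeslice * CLOCK_TICK_MS


def max_expirations_per_second(timeslice):
    """Upper bound on time-slice expirations per second on one CPU."""
    return 1000 // timeslice_ms(timeslice)


for value in (1, 2, 5):
    print(value, timeslice_ms(value), max_expirations_per_second(value))
```

Under these assumptions, raising timeslice from 1 to 5 cuts the ceiling on slice-driven switches per CPU from 100 to 20 per second. Threads that block or get preempted still switch earlier, so the real context switch rate comes from vmstat, not from this bound.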
In this section, I introduce the topic of CPU binding, which restricts a
process to run on a specific processor. The term for this is processor
affinity. Processor affinity has many purposes, some of which are even used during
debugging. For example, you can bind threads to a given processor to find the root
cause of a hanging program. It is generally used when trying to spread around the
wealth of your system, in an SMP box, for example. The command that you use is the
bindprocessor command. Assuming that simultaneous
multithreading (SMT) is enabled (it is by default), each and every hardware thread
of the physical processor is listed as a separate processor when running the
bindprocessor command. On POWER5 chips, there
are two hardware threads on each processor. With shared processor
logical partitions (LPARs), using this command binds to virtual CPUs, so you must
be very careful because it could cause problems for applications that are
predisposed to run on a specific CPU. Let's first check to see if SMT is enabled
(see Listing 6).
Listing 6. Checking to see if SMT is enabled
# smtctl
SMT is currently enabled.
Listing 7 shows the output of a two-way box with SMT
enabled.
Listing 7. Output of a two-way box with SMT enabled
# bindprocessor -q
The available processors are: 0 1 2 3
If you want to bind a process to a particular CPU, it's as simple as this:
# bindprocessor 12741 2
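When SMT is enabled, it helps to know which logical processors share a physical processor before you pick a bind target. The Python sketch below groups the logical processor numbers from Listing 7. It assumes that the hardware threads of one physical processor receive consecutive logical numbers, which you should confirm on your own system with the smtctl output. The function names are hypothetical.

```python
def physical_processor(logical_cpu, threads_per_processor=2):
    """Index of the physical processor behind a logical CPU (assumed layout)."""
    return logical_cpu // threads_per_processor


def group_by_physical(logical_cpus, threads_per_processor=2):
    """Group logical CPU numbers by the physical processor they share."""
    groups = {}
    for cpu in logical_cpus:
        key = physical_processor(cpu, threads_per_processor)
        groups.setdefault(key, []).append(cpu)
    return groups


# Logical processors reported by bindprocessor -q in Listing 7
print(group_by_physical([0, 1, 2, 3]))  # {0: [0, 1], 1: [2, 3]}
```

Under this assumption, binding process 12741 to logical processor 2 places it on the first hardware thread of the second physical processor, which it shares with logical processor 3.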
Processor affinity also occurs naturally. When a thread is running on a CPU and gets interrupted, it usually gets placed back on the same CPU because the processor's cache might still have lines belonging to the thread. If it were to get dispatched to a different CPU, it might have to get information from RAM, which would slow down the processing time dramatically.
You can also bind threads using subroutines, though I would be very cautious when attempting to do so. Binding a process binds all of its kernel threads to a processor, which forces these threads to run on that specific processor until they are unbound.
Another important command used in programming is gprof. The
gprof command produces an execution profile of your
compiled programs, whether written in C, Pascal, FORTRAN, or even COBOL.
gprof reports on the flow of control through all the
subroutines of your program and provides you with the amount of CPU time consumed
by each subroutine. This is very useful when troubleshooting how processes consume
CPU resources. The profile data is taken from the call graph profile
file (gmon.out by default). So what's different in AIX Version 5.3? AIX
Version 5.3 adds profiling support for threads and allows
the profile output files to have a user-specified name. By setting special
environment variables, you control these options and
the type of profiling data that is collected.
In this article, I've discussed the importance of controlling thread usage and
CPU binding. You've looked at the key tools and utilities used to analyze threads
and administer your processes. Further, you've tuned your kernel using
schedo, learned about processor affinity, and
figured out how to
bind processes to CPUs. This three-part series on CPU monitoring first introduced the overall
concepts of tuning, then went into monitoring and data collection, and concluded
with systems tuning and administration. While most of you might be more familiar
with tuning memory subsystems, I hope this series illustrated the importance of
CPU monitoring and tuning.
Ken Milberg is a Technology Writer and Site Expert for techtarget.com and provides Linux technical information and support at searchopensource.com. He is also a writer and technical editor for IBM Systems Magazine, Open Edition. Ken holds a bachelor's degree in computer and information science and a master's degree in technology management from the University of Maryland. He is the founder and group leader of the NY Metro POWER-AIX/Linux Users Group. Through the years, he has worked for both large and small organizations and has held diverse positions from CIO to Senior AIX Engineer. Today, he works for Future Tech, a Long Island-based IBM business partner. Ken is a PMI certified Project Management Professional (PMP), an IBM Certified Advanced Technical Expert (CATE, IBM System p5 2006), and a Solaris Certified Network Administrator (SCNA). You can contact him at kmilberg@gmail.com.