This three-part series focuses on the various aspects of memory management and tuning on IBM System p™ servers running AIX® 7. Part 1 provides an overview of memory on AIX 7, including a discussion of virtual memory and the Virtual Memory Manager (VMM). It also drills down into the tuning parameters for the paging and organization of your virtual and physical memory within AIX Version 7. Part 2 focuses on the detail of actual memory subsystem monitoring and discusses how to analyze the results. Part 3 focuses specifically on swap space and how best to tune your VMM settings to provide for optimum swap space configuration and performance.
Just what is swap (paging) space? It all starts with the VMM. The VMM uses swap (paging) space as a holding bin for the memory of processes that are not actively using RAM. Because of its purpose, paging space is a critical component of overall system performance. As an administrator, you need to know how to monitor and tune your paging parameters. The paging space itself is a special logical volume that stores information that is not currently being accessed. You must make sure that your system has adequate paging space. If the paging space is too small, entire processes can be lost and the system can crash when the space fills up. Although it is important to reiterate that paging is a normal part of VMM, it is even more important that you understand how the kernel brings a process into RAM, because too much paging definitely hinders performance. AIX 7, through tight integration of the kernel and VMM, makes use of a methodology called demand paging. In fact, most of the kernel itself resides in virtual memory, which helps free up segments for other processes. We'll dig deeper into how this works and discuss some of the tools you need to manage and tune your paging space.
You will find that the tuning you do depends on the type of system you have. For example, systems that run an Oracle Online Transaction Processing (OLTP) type of database usually have specific recommendations on how much swap space to configure and how to tune the paging parameters. As discussed in previous installments of the series (see Resources), you cannot tune your paging settings effectively unless you know what is going on in the host system. You need to understand the tools to use, know how best to analyze the data that you capture, and familiarize yourself with best practices for implementing your paging space. In our experience, the number one cause of a system crash is running out of paging space. If you read this article carefully and follow its recommendations, this should never happen to you. Obviously, you never want your system to crash but, if it does, you don't want the cause to be anything you did as the systems administrator.
In this section, we provide an overview of how AIX 7 handles paging, define swapping and paging, and drill down into the different modes of paging space allocation. These concepts help you understand subsequent sections on monitoring, configuring, and tuning.
Most administrators think of paging as something that is onerous. Paging is actually a very normal part of what AIX 7 does, due to the tight integration of its kernel with the VMM and its implementation of demand paging. The way demand paging works is that the kernel only loads a few pages at a time into real memory. When the CPU is ready for another page, it looks at the RAM. If it cannot find it there, a page fault occurs, and this signals the kernel to bring more pages into RAM from disk.
One advantage of demand paging is that the paging space does not have to be particularly large, because data is constantly being shuffled between paging space and RAM. On older UNIX® systems, paging space was pre-allocated on disk, whether it was used or not. This meant that disk space was set aside that might never be used. Demand paging, in essence, avoids allocating disk space for no purpose. Swapping of processes is also kept to a minimum, because many more jobs can be held in RAM when only parts of each process (pages) are stored there.
What about swapping? Though the terms are often used interchangeably, there is a subtle difference between paging and swapping. As discussed, with paging only parts of a process are moved back and forth between disk and RAM. When swapping occurs, entire processes are moved back and forth. For this to happen, AIX 7 suspends the entire process before moving it to paging space. The process can continue only after it is swapped back into RAM at a later time. This is not good, and you should do everything you can to prevent swapping from occurring, because it can lead to another condition called thrashing (we'll get into this more later).
As a UNIX administrator, you are probably already aware of some of the concepts of paging and swapping. AIX 7 provides three different modes of paging space allocation: deferred page space allocation, late page space allocation, and early page space allocation. The default policy of AIX 7 is deferred page space allocation. This works by making sure that the allocation of paging space is delayed until the time that it is necessary to page out the page, which ensures that there is no wasted paging space. In fact, when you have a large amount of RAM, you might actually never even use any of your paging space (see Listing 1).
Listing 1. Ensuring that there is no wasted paging space
# lsps -a
Page Space  Physical Volume  Volume Group  Size   %Used  Active  Auto  Type  Chksum
hd6         hdisk0           rootvg        768MB  3      yes     yes   lv    0
Only three percent of paging space is used in Listing 1. Note as well that the checksum on the paging space is disabled, as shown by the 0 under Chksum. Checksums can help to improve the reliability of the paging space. You can change the checksum using the chps command, or when creating new paging spaces using the mkps command.
Let's view how AIX 7 is currently handling paging space allocation (see Listing 2).
Listing 2. Checking how AIX 7 is handling paging space allocation
# vmo -o defps
Listing 2 illustrates that the default method, deferred page space allocation, is being used. To disable this policy, set the defps parameter to 0. This switches the system to the late paging space allocation policy. Late paging space allocation causes paging disk blocks not to be allocated until the corresponding pages in RAM are touched. This method is usually intended for environments where optimum performance is more important than reliability. In this scenario, a program can fail because of a lack of memory. What about early page space allocation? This policy is usually used if you want to make certain that processes will not be killed because of low paging conditions. Early page space allocation pre-allocates paging space, which puts it at the opposite end of the spectrum from late paging space allocation. It is used in environments where reliability rules. To turn it on, set the PSALLOC environment variable to early (PSALLOC=early).
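The three policies can be told apart from two inputs: the defps tunable and the PSALLOC variable in a process's environment. The helper below is a minimal sketch of that decision, not an AIX-supplied tool. The function name paging_policy is hypothetical, and the sketch assumes that vmo prints the tunable in the form "defps = 1".

```shell
#!/bin/sh
# Hypothetical helper: report which paging space allocation policy applies.
# $1 = one line of `vmo -o defps` output (assumed form: "defps = 1")
# $2 = value of PSALLOC in the process environment (may be empty)
paging_policy() {
    # The tunable's value is the last whitespace-separated field.
    defps_value=$(printf '%s\n' "$1" | awk '{ print $NF }')
    if [ "$2" = "early" ]; then
        echo "early"        # PSALLOC=early pre-allocates paging space
    elif [ "$defps_value" = "0" ]; then
        echo "late"         # defps=0 turns deferred allocation off
    else
        echo "deferred"     # the default policy
    fi
}

# On an AIX host you might call it like this:
#   paging_policy "$(vmo -o defps)" "$PSALLOC"
```

Because PSALLOC is read from the environment of each process, early allocation can be chosen for a single critical application while the rest of the system keeps the deferred default.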
Garbage collection of paging space
AIX 7 also supports garbage collection of paging space, which means that disk space used as paging blocks can be forcibly recovered in the event of your machine running out of paging space for a particular workload and set of applications. Garbage collection can be used to help eke out even more memory from your system and help improve the performance in workloads with a range of applications being used concurrently.
This allows you to configure less paging space than you might otherwise need, because pages can be recovered. Garbage collection works in a number of configurable ways through the deferred page space allocation policy. The default method is to perform the collection after the pages have been read back into memory. In this case, the pages are stored on disk and also exist in memory, but are not deleted from the disk, so that if a page has to be written out again (with no changes), there is no performance hit.
Two key parameters, npsrpgmin and npsrpgmax, set the number of free paging space blocks at which re-pagein garbage collection starts and at which it stops, respectively. Two further parameters, rpgclean and rpgcontrol, specify how the garbage collection should operate.
The rpgclean parameter defines whether garbage collection is started only on pages that are modified (the default, 0), or also on pages that are simply read from the paging space (1). The latter setting may increase the incidence of garbage collection and make more pages available, but may impact performance.
The rpgcontrol parameter supports three values and controls when garbage collection is performed with reference to the npsrpgmin and npsrpgmax limits. The default value of 2 specifies that garbage collection occurs irrespective of the limits. A value of 1 indicates that read accesses are processed. A value of 0 disables freeing of paging space disk blocks.
All of these settings can be configured using the vmo tool.
An alternative to garbage collection when pages are read in and written out is for the kernel process psgc to scrub memory, identifying pages that have been written to the paging space but that now reside in memory and have not been written out again. This frees the corresponding blocks on the paging device so that they can be used by other applications.
Page scrubbing is a more intensive process than the pagein scrubbing, but has the benefit that it can free pages that have been created on disk but which are never actually written to the paging space.
Like pagein scrubbing, the operation is configured with vmo. The npsscrubmin and npsscrubmax parameters specify the number of free paging space blocks at which scrubbing starts and at which it stops.
The scrub parameter enables or disables scrubbing (default is disabled). The scrubclean parameter enables or disables the scrubbing of pages that are allocated but which have not been modified.
For both types of page scrubbing, on systems where the paging space is regularly low, you may want to configure a more aggressive scrubbing schedule to make more pages available for all the applications and workload that you need to support.
To better understand your paging requirements, let's first look at how to monitor your page space.
Monitoring and configuring paging space
In this section, we'll show you how to monitor the paging space on your system. We'll also discuss the various commands used for configuring paging space and other tools that help you work with paging space as a systems administrator.
The simplest way of determining the amount of paging space used on your system is by running the lsps command (see Listing 3).
Listing 3. Running the lsps command
# lsps -s
Total Paging Space Percent Used
768MB 3%
You looked earlier at the -a flag. We prefer the -s flag, because the -a flag shows only paging space that is actually in use, while the -s flag gives you a summary of all paging space allocated, including space allocated using early page space allocation. Of course, the two differ only if the default method of paging allocation has been turned off.
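The summary output lends itself to a simple scripted check. The following is a minimal sketch with two hypothetical helpers, paging_pct_used and paging_over_threshold. It assumes the two-line layout of Listing 3, and the 25 percent threshold is only the rough rule of thumb used later in this article.

```shell
#!/bin/sh
# Hypothetical helper: read `lsps -s` output on stdin and print the
# percent-used figure as a bare integer.
paging_pct_used() {
    awk 'NR == 2 { sub(/%/, "", $2); print $2 }'
}

# Hypothetical helper: succeed (exit 0) when usage is at or above a threshold.
# $1 = percent used, $2 = threshold in percent
paging_over_threshold() {
    [ "$1" -ge "$2" ]
}

# Sample input copied from Listing 3; on AIX, pipe `lsps -s` in instead.
sample='Total Paging Space   Percent Used
      768MB               3%'

pct=$(printf '%s\n' "$sample" | paging_pct_used)
if paging_over_threshold "$pct" 25; then
    echo "paging space at ${pct}%: investigate"
else
    echo "paging space at ${pct}%: ok"
fi
```

A check like this could be run from cron so that you hear about rising utilization long before the space fills up.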
Next on the plate is vmstat. Part 2 of this series discussed vmstat, one of our favorite VMM monitoring tools, in great detail. We find that it is the quickest way to determine what is going on in your system. If there is a lot of paging and thrashing going on, you will find it here.
Let's look at some output shown in Listing 4.
Listing 4. Using vmstat
# vmstat 1 5
System Configuration: lcpu=2 mem=4096MB
kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa pc ec
1 0 419284 18634 0 0 0 0 0 0 231 20152 529 6 62 32 0 0.92 368.5
1 0 418879 19038 0 0 0 0 0 0 231 18725 500 6 63 32 0 0.89 357.1
1 0 419644 18274 0 0 0 0 0 0 251 19784 526 6 63 31 0 0.91 365.3
1 0 419970 17948 0 0 0 0 0 0 153 19319 514 6 62 32 0 0.91 365.1
1 0 419575 18343 0 0 0 0 0 0 142 19617 535 6 62 32 0 0.95 378.8
The columns most meaningful for your purposes here are:
- avm: This column represents the amount of active virtual memory (in 4k pages) you are using, not including file pages.
- fre: This column represents the size of your memory free list. In most cases, we don't worry when this is small, as AIX 7 loves using every last drop of memory and does not return it as fast as you might like. This setting is determined by the minfree parameter of the vmo command. At the end of the day, the paging information is more important.
- pi: This column represents the pages paged in from the paging space.
- po: This column represents the pages paged out to the paging space.
As you can see in Listing 4, there is essentially no paging going on in the system.
Listing 5 shows an example of a system that is probably thrashing.
Listing 5. Possible thrashing system
# vmstat 2 3
System Configuration: lcpu=4 mem=4096MB
kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa pc ec
1 0 421128 36893 0 57 129 29 43 0 257 19123 581 6 63 32 0 0.94 374.3
1 0 422808 11079 0 49 384 91 199 0 358 18182 496 13 63 24 0 0.95 381.8
1 0 420776 37703 0 45 117 0 0 0 192 19392 550 6 63 31 0 0.95 381.6
1 0 421073 37406 0 0 0 0 0 0 157 20342 502 6 61 33 0 0.94 377.2
1 0 421119 37353 0 0 0 0 0 0 131 18633 507 6 62 32 0 0.88 352.8
1 0 420748 37717 0 0 0 0 0 0 154 19737 534 6 62 32 0 0.92 369.1
How can you tell this? First of all, look at the pi and po columns. The nonzero values in the first three samples signify that pages are being moved back and forth between disk and RAM. On a thrashing system you should also see a bottleneck, with abnormally high numbers of blocked processes and wait times. The free list is also lower than it should be. In looking at the minfree value with the vmo command, you determined that the number was 120, which means that the free list should not fall below the 120 mark. Ordinarily, we would say it is not a problem when your free list is low but, in this case, it is below where it should be. When this occurs, it usually signifies that thrashing is going on in your system. A classic sign of thrashing is when the operating system attempts to release resources by first warning processes to release paging space and then killing entire processes. By tuning vmo parameters, you can help set the thresholds at which this starts. You can also look at memory usage with either topas or nmon.
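Spotting sustained paging by eye gets tedious over long captures, so a small filter can help. This is a sketch only: count_paging_intervals is a hypothetical name, and the field positions (pi in field 6, po in field 7) assume the column layout shown in Listings 4 and 5.

```shell
#!/bin/sh
# Hypothetical helper: read vmstat output on stdin and count the sample
# intervals in which pages moved both in from (pi, field 6) and out to
# (po, field 7) paging space. Header lines are skipped because their
# first field is not a number.
count_paging_intervals() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 7 && $6 > 0 && $7 > 0 { n++ } END { print n + 0 }'
}

# First rows of Listing 5, used here as sample input.
sample='kthr memory page faults cpu
r b avm fre re pi po fr sr cy in sy cs us sy id wa pc ec
1 0 421128 36893 0 57 129 29 43 0 257 19123 581 6 63 32 0 0.94 374.3
1 0 422808 11079 0 49 384 91 199 0 358 18182 496 13 63 24 0 0.95 381.8
1 0 420776 37703 0 45 117 0 0 0 192 19392 550 6 63 31 0 0.95 381.6
1 0 421073 37406 0 0 0 0 0 0 157 20342 502 6 61 33 0 0.94 377.2'

busy=$(printf '%s\n' "$sample" | count_paging_intervals)
echo "intervals with paging in both directions: $busy"
```

A handful of such intervals is normal. What should get your attention is a count that stays close to the total number of samples.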
What about maintaining the size of your paging space? You do this with the swap command in AIX 7 (see Listing 6) or by using the individual mkps and chps commands, which create and change paging spaces in units of logical partitions.
Listing 6. Using the swap command
# swap -l
device maj,min total free
/dev/hd6 10, 2 768MB 751MB
This tells you that you have one swap partition defined. You'll also notice that only 17MB (768MB total minus 751MB free) are actually being used. Listing 7 shows what happens if your paging space utilization is too high.
Listing 7. Running out of paging space
# lsps -a
Page Space Physical Volume Volume Group Size %Used Active Auto Type Chksum
hd6 hdisk0 rootvg 768MB 73 yes yes lv 0
In this case, your paging space is starting to get dangerously low. It is possible that your system has been up for a very long time. If you are running a database such as Oracle, virtual memory does not get released until you recycle your database. Let's see how long your system has been up (see Listing 8).
Listing 8. Using the uptime command
# uptime
11:58AM up 9 days, 15:50, 23 users, load average: 0.00, 0.03, 0.04
As shown in Listing 8, the system has been up for only nine days. If the paging space utilization has increased to 73 percent in such a short amount of time, you should consider adding more paging space. If you have plenty of space on your system, we would add another partition.
One best practice to keep in mind is to keep your paging spaces at the same size. In this case, we would add another 4GB of paging space to your rootvg volume group. You can do this with the System Management Interface Tool (SMIT), using the smit mkps and smit swapon fast paths to create and activate the paging space. Alternatively, you can use the swapon and swapoff commands from the command line. If you can, place paging areas on the disks that are least used. Also try not to allocate more than one paging logical volume for each physical disk. Though some administrators don't mind putting paging space on external storage, we personally don't like that practice. If you do this and the external storage is not available on a reboot, your system might crash (depending upon the amount of space allocated to paging). If you can, spread your paging spaces across multiple disks and, of course, make sure they are online by using the lsps -a command.
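Because mkps takes its size in logical partitions rather than megabytes, a little arithmetic is needed first. The sketch below only prints the command it would run. The mkps_command function is hypothetical, and the 128MB partition size and the hdisk1 disk name are assumptions for illustration (check the PP SIZE reported by lsvg rootvg on your own system).

```shell
#!/bin/sh
# Hypothetical dry-run helper: print (not run) an mkps command that creates
# a paging space of the requested size.
# $1 = desired size in MB, $2 = physical partition size of the volume group
#      in MB, $3 = volume group, $4 = physical volume
mkps_command() {
    # mkps sizes paging space in logical partitions, so round up.
    lps=$(( ($1 + $2 - 1) / $2 ))
    # -a: activate at every restart, -n: activate now, -s: size in LPs
    echo "mkps -a -n -s $lps $3 $4"
}

# 4GB in rootvg, assuming a 128MB partition size and a free disk hdisk1.
mkps_command 4096 128 rootvg hdisk1
```

Printing the command rather than running it gives you a chance to review the disk and size before anything changes on the system.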
How much paging space do you need on your system? What is the rule of thumb? First, start with the folks who own your application. The DB2® or Oracle teams should be able to tell you how much paging space needs to be allocated on your system from a database perspective. If you are a small shop, you'll have to do the research on your own. Be careful, though. Database administrators usually like to request the highest number of everything and might instruct you to configure twice as much paging space as RAM, although with modern systems supporting many GB of RAM this rule is less practical. Often what you need is paging space for the event of running out of main RAM, rather than a permanent copy of the pages in both RAM and on disk. Monitor your system frequently after going live. If you see that you are never really approaching 50 percent of paging space utilization, don't add the space. A more sensible rule is to configure the paging space to be half the size of RAM plus 4GB, with an upper limit of 32GB. In systems with more than 32GB of RAM, or on systems where you are using LPARs and WPARs to help split your workload, you can be significantly more selective and specific about your memory requirements.
As a rough rule, you should monitor space with the lsps -a command and not worry unless the utilization is over 25 percent on the system. Adding space that you won't use just wastes disk space and can in some cases have a negative impact on performance as the OS tries to cope with the extra paging space.
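The sizing rule of half the RAM plus 4GB, capped at 32GB, is easy to turn into a quick calculation. This is a sketch with a hypothetical function name, suggest_paging_gb, and it works in whole gigabytes, so odd RAM sizes round down.

```shell
#!/bin/sh
# Hypothetical helper: suggest a paging space size in GB from RAM in GB,
# using the rule from the text: half of RAM plus 4GB, capped at 32GB.
suggest_paging_gb() {
    size=$(( $1 / 2 + 4 ))
    if [ "$size" -gt 32 ]; then
        size=32
    fi
    echo "$size"
}

suggest_paging_gb 4     # the 4096MB system from the listings: prints 6
suggest_paging_gb 16    # prints 12
suggest_paging_gb 128   # prints 32 (the cap applies)
```

Treat the result as a starting point only. The monitoring advice above still decides whether the space is really needed.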
We are often asked how you can tell whether a process is using paging space. Take a look at svmon, as shown in Listing 9.
Listing 9. Using svmon
# l488pp065_pub[/tmp] > svmon -P 7209170
---------------------------------------------------------------------------
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
7209170 sshd 25978 11972 0 25837 N N N
PageSize Inuse Pin Pgsp Virtual
s 4 KB 234 4 0 93
m 64 KB 1609 748 0 1609
After you identify the PID, svmon lets you drill down to the level of an individual process. Here the Pgsp column shows that the sshd process is using no paging space. This can help you determine whether tuning needs to be done to your application or to your operating system to help stop the paging. See the man page for svmon, as this AIX memory-specific utility has many other uses.
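If you want to pull the paging space figure for a process out of this report in a script, a sketch like the following may help. The process_pgsp_pages function is hypothetical and assumes the report layout of Listing 9, where Pgsp is the fifth column of the line that follows the "Pid Command" header.

```shell
#!/bin/sh
# Hypothetical helper: read `svmon -P <pid>` output on stdin and print the
# Pgsp value (field 5 of the line after the "Pid Command ..." header).
process_pgsp_pages() {
    awk '$1 == "Pid" && $5 == "Pgsp" { getline; print $5; exit }'
}

# Sample input copied from Listing 9; on AIX, pipe svmon output in instead.
sample='---------------------------------------------------------------------------
     Pid Command          Inuse      Pin     Pgsp  Virtual 64-bit Mthrd  16MB
 7209170 sshd             25978    11972        0    25837      N     N     N'

pages=$(printf '%s\n' "$sample" | process_pgsp_pages)
echo "paging space pages in use: $pages"
```

Running the same extraction over each PID of an application gives a quick picture of which of its processes hold paging space.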
In this section, we use vmo to tune paging parameters
that can significantly reduce the amount of paging on your systems. We also discuss
thresholds to change and parameters that can influence your overall scanning
overhead.
So what can you tune on VMM to cut down on paging? In the first installment of the series (see Resources), we discussed the minperm and maxperm parameters in great detail, and we'll summarize some of the most important concepts here. Tuning vmo settings allows you to favor either working or persistent storage. You want it to favor working storage. The way to prevent AIX 7 from paging working storage, and to utilize the caching from your database, is to set maxperm to a high value (greater than 80) and to make sure that lru_file_repage is set to 0. The lru_file_repage parameter indicates whether or not the VMM re-page counts should be considered and what type of memory it should steal. The default setting is 1, so you need to change it to 0. This is done using the vmo command. When you set the parameter to 0, it tells the VMM that you prefer that it steal only file pages rather than computational pages. This is what you want to do. You also need to set the minperm, maxperm, and maxclient parameters, as shown in Listing 10 below.
Listing 10. Setting the minperm, maxperm and maxclient parameters
vmo -p -o minperm%=5
vmo -p -o maxperm%=90
vmo -p -o maxclient%=90
In prior AIX versions, you would tune strict_maxperm and strict_maxclient from their default values. With AIX Version 5.3, changing the lru_file_repage parameter is a far more effective way of tuning, as you would prefer that AIX 7 file caching not be used at all. Now let's briefly summarize minfree and maxfree. If the number of pages on your free list falls below the minfree parameter, VMM starts to steal pages until the free list has at least the number of pages in the maxfree parameter. The default settings in AIX Version 5.3 usually seem to work (see Listing 11).
Listing 11. Default settings for maxfree and minfree
# vmo -a | grep free
maxfree = 1088
minfree = 960
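The interplay of minfree and maxfree can be sketched as a small calculation. This is a deliberately simplified model of the behavior described above, not the actual VMM algorithm, and pages_to_steal is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: how many pages must VMM steal?
# $1 = current free list size, $2 = minfree, $3 = maxfree
pages_to_steal() {
    if [ "$1" -lt "$2" ]; then
        echo $(( $3 - $1 ))   # refill the free list up to maxfree
    else
        echo 0                # free list is at or above minfree
    fi
}

# Values from Listing 11: minfree = 960, maxfree = 1088
pages_to_steal 900 960 1088    # prints 188
pages_to_steal 1000 960 1088   # prints 0
```

The gap between the two values is what matters: stealing begins below minfree but continues until maxfree is reached, so the page stealer works in bursts rather than running constantly.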
Let's discuss tuning page space thresholds. As stated earlier, when your paging space starts becoming very low, AIX 7 starts to warn offending processes and then kills them. What thresholds can you change here to influence this activity? They are npswarn, npskill, and nokilluid. The npswarn parameter is the threshold that is used to signal the processes when space is getting low. The npskill parameter is the threshold where AIX 7 starts killing processes. If your policy is early page space allocation, it will not kill the process. If you recall, we discussed earlier that this was the most reliable method of paging. The nokilluid parameter is an important threshold because, if this is set to 1, it makes certain that processes owned by root will not be killed, even when the npskill threshold is reached.
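The two thresholds divide the paging space state into three zones, which the sketch below makes explicit. The paging_state function is hypothetical, the numbers passed to it are made up for illustration, and the comparison against free paging space pages is our reading of how the thresholds are applied.

```shell
#!/bin/sh
# Hypothetical helper: classify the paging space state.
# $1 = free paging space pages, $2 = npswarn, $3 = npskill
paging_state() {
    if [ "$1" -lt "$3" ]; then
        echo "kill"     # below npskill: AIX starts killing processes
    elif [ "$1" -lt "$2" ]; then
        echo "warn"     # below npswarn: processes are signaled
    else
        echo "ok"
    fi
}

# Threshold values here are illustrative only; read the real ones on AIX
# with: vmo -o npswarn -o npskill
paging_state 20000 6144 1536    # prints ok
paging_state 4000 6144 1536     # prints warn
paging_state 1000 6144 1536     # prints kill
```

Keeping npswarn well above npskill gives well-behaved applications time to react to the warning before anything is killed.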
Further, when a process cannot be forked because of a paging space issue, the scheduler retries the fork up to five times, delaying 10 clock ticks before each retry. You can use the schedo command to change the length of this delay. The parameter used for this is pacefork. Another important parameter you can look at is lrubucket. Tuning this can reduce the scanning overhead. Because the page replacement algorithm is always looking for free frames while it scans, on systems with a lot of memory the number of frames to scan can be significant. Increasing the value decreases the number of buckets that need to be scanned. This can help performance.
Listing 12 uses the vmo command with the -a option to display the values for lrubucket.
Listing 12. Displaying the value for lrubucket
# vmo -a | grep lru
lru_file_repage = 1
lru_poll_interval = 0
lrubucket = 131072 (this is in 4 KB frames)
To increase the default value from 512MB to 1GB, use
# vmo -o lrubucket=262144.
And that's how you can significantly reduce paging on your AIX 7 system using
vmo.
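Because lrubucket is expressed in 4 KB frames, it helps to convert between frames and megabytes before picking a value. The two functions below are hypothetical convenience helpers that do nothing more than that arithmetic.

```shell
#!/bin/sh
# Hypothetical helper: convert an lrubucket value (4 KB frames) to MB.
frames_to_mb() {
    echo $(( $1 * 4 / 1024 ))
}

# Inverse: frames needed for a bucket of a given size in MB.
mb_to_frames() {
    echo $(( $1 * 1024 / 4 ))
}

frames_to_mb 131072   # the default shown in Listing 12: prints 512
mb_to_frames 1024     # 1GB bucket: prints 262144
```

This confirms the figures used above: the default of 131072 frames is 512MB, and a 1GB bucket corresponds to 262144 frames.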
Part 3 of this series looked at some of the tools that are available to you in capturing data for swap analysis. You used some system administration commands to display and configure swap on your system, and learned about paging and swapping and the various methods of paging that are available on AIX 7. You also reviewed some best practices when configuring paging space on your systems. Finally, you studied specific methods of tuning your VMM specific to handle paging and swapping. Parts 1 and 2 of this series went over the VMM in great detail and covered troubleshooting memory bottlenecks. You used various tools to help you monitor your systems for both short-term analysis and long-term trending. You also learned all about the general tuning methodology and the importance of monitoring systems prior to bottlenecks occurring. This enables you to establish a baseline while your system is healthy so that you can practice some of the methods discussed in this series, which include tuning your memory subsystems. Just make sure you test them on your development or test environments prior to deploying any changes to production.
Resources
Learn
- AIX memory affinity support: Learn about AIX memory from the IBM System p™ and AIX InfoCenter.
- IBM Redbooks: See how Database Performance Tuning on AIX is designed to help system designers, system administrators, and database administrators design, size, implement, maintain, monitor, and tune a Relational Database Management System (RDBMS) for optimal performance on AIX.
- "Power to the People" (developerWorks, May 2004): Read this article for a history of chip making at IBM.
- "Processor Affinity on AIX" (developerWorks, November 2006): Using process affinity settings to bind or unbind threads can help you find the root cause of troublesome hang or deadlock problems. Read this article to learn how to use processor affinity to restrict a process and run it only on a specified CPU.
- "CPU Monitoring and Tuning" (March, 2002): Read this article to learn how standard AIX tools can help you determine CPU bottlenecks.
- Operating System and Device Management: This document from IBM provides users and system administrators with complete information that can affect your selection of options when performing such tasks as backing up and restoring the system, managing physical and logical storage, and sizing appropriate paging space.
- "nmon performance: A free tool to analyze AIX and Linux® performance" (developerWorks, February 2006): This free tool gives you a huge amount of information all on one screen.
- "nmon analyser: A free tool to produce AIX performance reports" (developerWorks, April 2006): Read this article to learn how to produce a wealth of report-ready graphs from nmon output.
- The AIX 7.1 Information Center is your source for technical information about the AIX operating system.
- The IBM AIX Version 6.1 Differences Guide can be a useful resource for understanding changes in AIX 6.1.
- The IBM AIX Version 7.1 Differences Guide can be a useful resource for understanding changes in AIX 7.1.
- Popular content: See what AIX and UNIX content your peers find interesting.
- AIX and UNIX: The AIX and UNIX developerWorks zone provides a wealth of information relating to all aspects of AIX systems administration and expanding your UNIX skills.
- New to AIX and UNIX?: Visit the New to AIX and UNIX page to learn more about AIX and UNIX.
- Search the AIX and UNIX library by topic: System administration, Application development, Performance, Porting, Security, Tips, Tools and utilities, Java™ technology, Linux, Open source
- Safari bookstore: Visit this e-reference library to find specific technical resources.
- developerWorks technical events and webcasts: Stay current with developerWorks technical events and webcasts.
- Podcasts: Tune in and catch up with IBM technical experts.
- Future Tech: Visit Future Tech's site to learn more about their latest offerings.
Get products and technologies
- IBM trial software: Build your next development project with software for download directly from developerWorks.
Discuss
- Participate in the developerWorks blogs and get involved in the developerWorks community.
- AIX 7 Open Beta: This forum is for technical discussions supporting the AIX 7 Open Beta Program.
- Follow developerWorks on Twitter.
- Get involved in the My developerWorks community.
- Participate in the AIX and UNIX® forums: AIX Forum, AIX Forum for developers, Cluster Systems Management, Performance Tools Forum, Virtualization Forum, More AIX and UNIX Forums
Martin Brown has been a professional writer for over eight years. He is the author of numerous books and articles across a range of topics. His expertise spans myriad development languages and platforms - Perl, Python, Java, JavaScript, Basic, Pascal, Modula-2, C, C++, Rebol, Gawk, Shellscript, Windows, Solaris, Linux, BeOS, Mac OS/X and more - as well as web programming, systems management and integration. Martin is a regular contributor to ServerWatch.com, LinuxToday.com, IBM developerWorks and a regular blogger at Computerworld, The Apple Blog, and other sites. He is also a Subject Matter Expert (SME) for Microsoft. He can be contacted through his website at http://www.mcslp.com.
Ken Milberg is a technology writer and site expert for techtarget.com and provides Linux technical information and support at searchopensource.com. He is also a writer and technical editor for IBM Systems Magazine, Open Edition. Ken holds a bachelor's degree in computer and information science and a master's degree in technology management from the University of Maryland. He is the founder and group leader of the NY Metro POWER-AIX/Linux Users Group. Through the years, he has worked for both large and small organizations and has held diverse positions from CIO to Senior AIX Engineer. Today, he works for Future Tech, a Long Island-based IBM Business Partner. Ken is a PMI certified Project Management Professional (PMP), an IBM Certified Advanced Technical Expert (CATE, IBM System p5 2006), and a Solaris Certified Network Administrator (SCNA). You can contact him at kmilberg@gmail.com.



