This three-part series (see Resources) on the AIX® disk and I/O subsystem focuses on the challenges of optimizing disk I/O performance. While disk tuning is arguably less exciting than CPU or memory tuning, it is a crucial component in optimizing server performance. In fact, partly because disk I/O is your weakest subsystem link, there is more you can do to improve disk I/O performance than any other subsystem.
Unlike the tuning of other subsystems, disk I/O tuning should start during the architectural phase of building your systems. While there are I/O equivalents of the virtual memory tuning parameters (ioo and lvmo), the best way to increase disk I/O performance is to configure your systems properly rather than to tune parameters. Unlike virtual memory tuning, restructuring your logical volumes after they have been created and put into use is complex, so you usually get only one chance to do this right. This article discusses the ways that you can configure your logical volumes and where to place them on the physical disk, and it also addresses the tools used to monitor your logical volumes. Most of these tools are not meant for long-term trending; they are AIX-specific tools that show how the logical volumes are configured and whether they have been optimized for your environment.
Part 1 (see Resources) of this series introduced iostat, but only for viewing asynchronous I/O servers. Part 2 uses iostat to monitor your disks and shows how it can help you quickly determine your I/O bottleneck. Although iostat is a generic UNIX® utility that was not developed specifically for AIX, it is very useful for quickly determining what is going on in your system. The more specific AIX logical volume commands let you drill down deeper into your logical volumes to analyze what your problems are, if any. It's important that you clearly understand what you're looking for before using these tools. This article describes the tools and shows you how to analyze their output.
Logical volume and disk placement overview
This section defines the Logical Volume Manager (LVM) and introduces some of its features. Let's drill down into logical volume concepts, examine how they relate to improving disk I/O utilization, and talk about logical volume placement as it relates to the physical disk, by defining and discussing both intra-policy and inter-policy disk practices.
Conceptually, the logical volume layer sits between the application and physical layers. In the context of disk I/O, the application layer is the file system or raw logical volumes. The physical layer consists of the actual disk. LVM is an AIX disk management system that maps the data between logical and physical storage. This allows data to reside on multiple physical platters and to be managed and analyzed using specialized LVM commands. LVM controls all the physical disk resources on your system and helps provide a logical view of your storage subsystem. Because it sits between the application layer and the physical layer, it is arguably the most important of all the layers. Even your physical volumes themselves are part of the logical layer, as the physical layer only encompasses the actual disks, device drivers, and any arrays that you might have already configured. Figure 1 illustrates the concepts and shows how tightly the logical I/O components are integrated with the physical disk and the application layer.
Figure 1. Logical volume diagram
Let's now quickly introduce the elements that are part of LVM, from the bottom up. Each of the drives is named as a physical volume. One or more physical volumes make up a volume group. Within the volume groups, logical volumes are defined. LVM enables the data of a logical volume to reside on multiple physical drives, even though those drives are configured in a single volume group. Each logical volume consists of one or more logical partitions. Each of the logical partitions has a physical partition that correlates to it. Here is where you can have multiple copies of the physical partitions for purposes such as disk mirroring.
Let's take a quick look at how logical volume creation correlates with physical volumes. Figure 2 illustrates the actual storage position on the physical disk platter.
Figure 2. Actual storage position on the physical disk platter
As a general rule, data that is written toward the center of the disk has faster seek times than data written on the outer edge. This has to do with the density of data: because data is more dense toward the center, there is less movement of the head. The inner edge usually has the slowest seek times. As a best practice, the more I/O-intensive applications should be placed closer to the center of the physical volumes. Note that there are exceptions to this. Disks hold more data per track on the outer edge of the disk than at the center. For that reason, logical volumes that are accessed sequentially should be placed on the edge for better performance. The same holds true for logical volumes that have Mirror Write Consistency Check (MWCC) turned on, because the MWCC sector is on the edge of the disk and not at the center. Where a logical volume is placed within a disk is governed by its intra-disk policy.
Let's discuss another important concept referred to as the inter-disk policy of logical volumes. The inter-disk policy defines the number of disks on which the physical partitions of a logical volume reside. The general rule is that the minimum policy provides the greatest reliability and availability, and the maximum policy improves performance. Simply put, the more drives that data is spread across, the better the performance. Some other best practices include allocating I/O-intensive logical volumes to separate physical volumes, defining the logical volumes to the maximum size you need, and placing logical volumes that are frequently used close together. This is why it is so important to know your data prior to configuring your systems, so that you can create policies that make sense from the start.
You can define your policies when creating the logical volumes themselves, using either the mklv command or its smit fastpath: # mklv or # smitty mklv.
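As a minimal sketch of how the two policies map onto mklv flags: on AIX, -a selects the intra-disk position (e, m, c, im, or ie) and -e selects the inter-disk range (m for minimum, x for maximum). The helper name build_mklv is hypothetical, and the volume names, partition count, and disks are borrowed from the listings later in this article purely for illustration. The function only prints the command instead of running it, so you can review the result first.

```shell
#!/bin/sh
# Hypothetical helper: build (but do not run) an mklv command line
# from an intra-disk and an inter-disk policy name.
# Usage: build_mklv LVNAME VGNAME NUM_LPS INTRA INTER [PV ...]
build_mklv() {
    [ $# -ge 5 ] || { echo "usage: build_mklv LV VG LPS INTRA INTER [PV ...]" >&2; return 1; }
    lv=$1; vg=$2; lps=$3; intra=$4; inter=$5
    shift 5
    pvs="$*"

    # Intra-disk policy: where on each disk the partitions should go.
    case $intra in
        edge)         a=e  ;;
        middle)       a=m  ;;
        center)       a=c  ;;
        inner-middle) a=im ;;
        inner-edge)   a=ie ;;
        *) echo "unknown intra-disk policy: $intra" >&2; return 1 ;;
    esac

    # Inter-disk policy: how many disks the partitions are spread across.
    case $inter in
        minimum) e=m ;;
        maximum) e=x ;;
        *) echo "unknown inter-disk policy: $inter" >&2; return 1 ;;
    esac

    echo "mklv -y $lv -t jfs -a $a -e $e $vg $lps${pvs:+ $pvs}"
}

# Example: a performance-oriented layout (center placement, maximum spread).
build_mklv data2lv data2vg 128 center maximum hdisk2 hdisk3
```

Printing the command first is a deliberate choice here: because a logical volume's layout is hard to change once it is in use, it is worth reading the generated command before running it as root.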
Monitoring logical volumes and analyzing results
This section provides instructions on how to monitor your logical volumes and analyze the results. Various commands are introduced along with the purposes for which they are used, and you examine the output as well.
A ticket has just been opened with the service desk about slow performance on a database server. You suspect that there might be an I/O issue, so you start with iostat. If you recall, this command was introduced in the first installment of the series (see Resources), though only for the purposes of viewing asynchronous I/O servers. Now, let's look at iostat in more detail. iostat, the I/O equivalent of vmstat for virtual memory, is arguably the most effective way to get a first look at what is happening with your I/O subsystem.
Listing 1. Using iostat
# iostat 1
System configuration: lcpu=4 disk=4
tty:      tin      tout    avg-cpu:  % user   % sys   % idle   % iowait
          0.0     392.0                5.2     5.5     88.3       1.1
Disks:      % tm_act     Kbps      tps     Kb_read     Kb_wrtn
hdisk1         0.5       19.5      1.4    53437739    21482563
hdisk0         0.7       29.7      3.0    93086751    21482563
hdisk4         1.7      278.2      6.2   238584732   832883320
hdisk3         2.1      294.3      8.0   300653060   832883320
What are you seeing here and what does this all mean?
- % tm_act: Reports the percentage of time that the physical disk was active, that is, the total time spent on disk requests.
- Kbps: Reports the amount of data transferred (read or written) to the drive in kilobytes per second.
- tps: Reports the number of transfers per second issued to the physical disk.
- Kb_read: Reports the total data (kilobytes) read from the physical volumes during your measured interval.
- Kb_wrtn: Reports the total data (kilobytes) written to the physical volumes during your measured interval.
You need to watch % tm_act very carefully, because when its utilization exceeds roughly 60 to 70 percent, it usually indicates that processes are starting to wait for I/O. This might be your first clue of impending I/O problems. Moving data to less busy drives can obviously help ease this burden. Generally speaking, the more drives that your data hits, the better. Just like anything else, too much of a good thing can also be bad, as you have to make sure you don't have too many drives hitting any one adapter. One way to determine if an adapter is saturated is to sum the Kbps amounts for all disks attached to one adapter. The total should be below the disk adapter throughput rating, ideally less than 70 percent of that rating.
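The two checks described above can be sketched with awk over iostat disk lines. This is only an illustration: the function names busy_disks and adapter_load are hypothetical, the adapter rating of 320000 KB/s is a made-up figure (take the real one from your adapter's documentation), and the sketch assumes every disk line fed to adapter_load belongs to the same adapter.

```shell
#!/bin/sh
# busy_disks LIMIT: print disks whose % tm_act (column 2) exceeds LIMIT.
busy_disks() {
    awk -v limit="$1" '$1 ~ /^hdisk/ && $2 + 0 > limit + 0 { print $1 }'
}

# adapter_load RATING_KBPS: sum Kbps (column 3) over all disk lines on stdin
# and compare the total with 70 percent of the adapter rating.
adapter_load() {
    awk -v rating="$1" '
        $1 ~ /^hdisk/ { total += $3 }
        END {
            limit = rating * 70 / 100
            status = (total < limit) ? "ok" : "saturated"
            printf "total=%.1f limit=%.1f status=%s\n", total, limit, status
        }'
}

# Disk lines copied from Listing 1; the rating is a hypothetical value.
adapter_load 320000 <<'EOF'
hdisk1 0.5 19.5 1.4 53437739 21482563
hdisk0 0.7 29.7 3.0 93086751 21482563
hdisk4 1.7 278.2 6.2 238584732 832883320
hdisk3 2.1 294.3 8.0 300653060 832883320
EOF
```

On a live system you would pipe a single iostat interval into these functions rather than pasting lines, and skip the first report, which shows totals since boot.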
Using the -a flag (see Listing 2) helps you drill down further to examine adapter utilization.
Listing 2. Using iostat with the -a flag
# iostat -a
Adapter:                    Kbps      tps    Kb_read    Kb_wrtn
scsi0                        0.0      0.0          0          0
Paths/Disk:     % tm_act    Kbps      tps    Kb_read    Kb_wrtn
hdisk1_Path0      37.0      89.0      0.0          0          0
hdisk0_Path0      67.0      47.0      0.0          0          0
hdisk4_Path0       0.0       0.0      0.0          0          0
hdisk3_Path0       0.0       0.0      0.0          0          0
Adapter:                    Kbps      tps    Kb_read    Kb_wrtn
ide0                         0.0      0.0          0          0
Paths/Disk:     % tm_act    Kbps      tps    Kb_read    Kb_wrtn
cd0                0.0       0.0      0.0          0          0
Clearly, there are no bottlenecks here. Using the -d flag allows you to drill down to one specific disk (see Listing 3).
Listing 3. Using iostat with the -d flag
# iostat -d hdisk1 1
System configuration: lcpu=4 disk=5
Disks:      % tm_act     Kbps      tps     Kb_read     Kb_wrtn
hdisk1         0.5       19.4      1.4    53437743    21490480
hdisk1         5.0       78.0     23.6        3633        3564
hdisk1         0.0        0.0      0.0           0           0
hdisk1         0.0        0.0      0.0           0           0
hdisk1         0.0        0.0      0.0           0           0
hdisk1         0.0        0.0      0.0           0           0
Let's look at some specific AIX LVM commands. You examined disk placement earlier and the importance of architecting your systems correctly from the beginning. Unfortunately, you don't always have that option. As system administrators, you sometimes inherit systems that must be fixed. Let's look at the layout of the logical volumes on disks to determine if you need to change definitions or re-arrange your data.
Let's look first at a volume group and find the logical volumes that are a part
of it. lsvg is the command that provides volume group
information (see Listing 4).
Listing 4. Using lsvg
# lsvg -l data2vg
data2vg:
LV NAME        TYPE      LPs   PPs   PVs   LV STATE      MOUNT POINT
data2lv        jfs       128   256   2     open/syncd    /data2
loglv00        jfslog    1     2     2     open/syncd    N/A
appdatalv      jfs       128   256   2     open/syncd    /appdata
Now, let's use lslv, which provides specific data on logical volumes (see Listing 5).
Listing 5. Using lslv
# lslv data2lv
LOGICAL VOLUME:     data2lv                VOLUME GROUP:   data2vg
LV IDENTIFIER:      0003a0ec00004c00000000fb076f3f41.1 PERMISSION:     read/write
VG STATE:           active/complete        LV STATE:       opened/syncd
TYPE:               jfs                    WRITE VERIFY:   off
MAX LPs:            512                    PP SIZE:        64 megabyte(s)
COPIES:             2                      SCHED POLICY:   parallel
LPs:                128                    PPs:            256
STALE PPs:          0                      BB POLICY:      relocatable
INTER-POLICY:       minimum                RELOCATABLE:    yes
INTRA-POLICY:       center                 UPPER BOUND:    32
MOUNT POINT:        /data                  LABEL:          /data
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?:     NO
This view provides a detailed description of your logical volume attributes. What do you have here? The intra-policy is center, which is normally the best policy to have for I/O-intensive logical volumes. As you recall from the earlier discussion, there are exceptions to this rule. Unfortunately, you've just hit one of them. Because Mirror Write Consistency (MWC) is on, the volume would have been better served if it were placed on the edge. Let's look at its inter-policy. The inter-policy is minimum, which is usually the best policy to have if availability is more important than performance. Further, there are twice as many physical partitions as logical partitions, which signifies that the logical volume is mirrored. In this case, you were told that raw performance was the most important objective, so the logical volume was not configured to match how it is actually being used. Further, if you are mirroring your system and using an external storage array, this would be even worse, as you're already providing mirroring at the hardware layer, which is more effective than using AIX mirroring.
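The mirroring check in that paragraph is simple arithmetic: the number of copies is PPs divided by LPs. As a small sketch, assuming the column layout that lsvg -l prints in Listing 4 (LPs in column 3, PPs in column 4) and using the hypothetical function name lv_copies:

```shell
#!/bin/sh
# lv_copies: read `lsvg -l` output on stdin and print copies = PPs / LPs
# for every line whose third column is a positive number (this skips the
# volume group name line and the column header).
lv_copies() {
    awk '$3 ~ /^[0-9]+$/ && $3 > 0 { printf "%s copies=%d\n", $1, $4 / $3 }'
}

# Sample input copied from Listing 4.
lv_copies <<'EOF'
data2vg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
data2lv jfs 128 256 2 open/syncd /data2
loglv00 jfslog 1 2 2 open/syncd N/A
appdatalv jfs 128 256 2 open/syncd /appdata
EOF
```

A result of 2 for data2lv matches the COPIES: 2 field in the lslv output, so either view can be used to spot mirroring.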
Let's drill down even further in Listing 6.
Listing 6. lslv with the -l flag
# lslv -l data2lv
data2lv:/data2
PV          COPIES         IN BAND    DISTRIBUTION
hdisk2      128:000:000    100%       000:108:020:000:000
hdisk3      128:000:000    100%       000:108:020:000:000
The -l flag of lslv lists all the physical volumes associated with the logical volume and the distribution of the logical volume on each of them. The IN BAND column reports the percentage of the logical volume's physical partitions on that disk that were allocated within the region requested by its intra-disk policy; here it shows 100 percent. The DISTRIBUTION column shows the actual number of physical partitions within each region of the physical volume. From here, you can detail its intra-disk policy. The order of these fields is as follows:
- Outer edge
- Outer middle
- Center
- Inner middle
- Inner edge
The report shows that most of the data is in the outer middle and some is at the center.
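To make the DISTRIBUTION field easier to read, the five colon-separated counts can be labeled with the region names that lspv -p prints (outer edge, outer middle, center, inner middle, inner edge). This is a rough sketch that assumes the four-column layout shown in Listing 6; the function name lv_regions is hypothetical.

```shell
#!/bin/sh
# lv_regions: read `lslv -l` output on stdin and label the five
# DISTRIBUTION counts (column 4) for each physical volume.
lv_regions() {
    awk '
        $1 ~ /^hdisk/ {
            n = split($4, pp, ":")
            if (n != 5) next          # not a distribution field; skip the line
            printf "%s outer_edge=%d outer_middle=%d center=%d inner_middle=%d inner_edge=%d\n",
                   $1, pp[1], pp[2], pp[3], pp[4], pp[5]
        }'
}

# Sample input copied from Listing 6.
lv_regions <<'EOF'
data2lv:/data2
PV COPIES IN BAND DISTRIBUTION
hdisk2 128:000:000 100% 000:108:020:000:000
hdisk3 128:000:000 100% 000:108:020:000:000
EOF
```

For hdisk2 this labels 108 partitions as outer middle and 20 as center, which is the same reading given in the text.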
Let's keep going and find out which logical volumes are associated with the one
physical volume. This is done with the lspv command
(see Listing 7).
Listing 7. Using the lspv command
# lspv -l hdisk2
hdisk2:
LV NAME        LPs   PPs   DISTRIBUTION          MOUNT POINT
loglv01        1     1     01..00..00..00..00    N/A
data2lv        128   128   00..108..20..00..00   /data2
appdatalv      128   128   00..00..88..40..00    /appdata
Now you can actually identify which of the logical volumes on this disk are geared up for maximum performance.
You can drill down even further to get more specific (see Listing 8).
Listing 8. lspv with the -p flag
# lspv -p hdisk2
hdisk2:
PP RANGE   STATE   REGION         LV ID            TYPE      MOUNT POINT
1-108      free    outer edge
109-109    used    outer edge     loglv00          jfslog    N/A
110-217    used    outer middle   data2lv          jfs       /data2
218-237    used    center         appdatalv        jfs       /appdata
238-325    used    center         testdatalv       jfs       /testdata
326-365    used    inner middle   stagingdatalv    jfs       /staging
366-433    free    inner middle
434-542    free    inner edge
This view tells you what is free on the physical volume, what has been used, and which partitions are used where. This is a nice view.
One of the best tools for looking at LVM usage is lvmstat (see Listing 9).
Listing 9. Using lvmstat
# lvmstat -v data2vg
0516-1309 lvmstat: Statistics collection is not enabled for this logical device.
Use -e option to enable.
As the output shows, statistics collection is not enabled by default, so you need to enable it before running the tool, using # lvmstat -v data2vg -e. The following command then takes a snapshot of LVM information every second for 10 intervals:
# lvmstat -v data2vg 1 10
This view shows the most utilized logical volumes on your system since you started the data collection tool. This is very helpful when drilling down to the logical volume layer when tuning your systems (see Listing 10).
Listing 10. lvmstat with the -v flag
# lvmstat -v data2vg
Logical Volume      iocnt     Kb_read    Kb_wrtn     Kbps
appdatalv          306653    47493022     383822    103.2
loglv00                34           0       3340      2.8
data2lv               453      234543     234343     89.3
What are you looking at here?
- iocnt: Reports the number of read and write requests.
- Kb_read: Reports the total data (kilobytes) read during your measured interval.
- Kb_wrtn: Reports the total data (kilobytes) written during your measured interval.
- Kbps: Reports the amount of data transferred in kilobytes per second.
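When a volume group holds many logical volumes, it can help to rank them by iocnt. A minimal sketch, assuming the five-column layout of Listing 10 and using the hypothetical function name busiest_lvs:

```shell
#!/bin/sh
# busiest_lvs: read `lvmstat -v` output on stdin and print logical volume
# names ordered from the highest iocnt (column 2) to the lowest.
busiest_lvs() {
    awk '$2 ~ /^[0-9]+$/ { print $2, $1 }' |   # keep data rows only: "iocnt name"
        sort -rn |                             # numeric sort, largest first
        awk '{ print $2 }'
}

# Sample input copied from Listing 10.
busiest_lvs <<'EOF'
Logical Volume iocnt Kb_read Kb_wrtn Kbps
appdatalv 306653 47493022 383822 103.2
loglv00 34 0 3340 2.8
data2lv 453 234543 234343 89.3
EOF
```

With the sample data, appdatalv comes first, followed by data2lv and then loglv00.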
Look at the man pages for all the commands discussed before you start to add them to your repertoire.
This section goes over a specific logical volume tuning command. The lvmo command is used to set and display your pbuf tuning parameters. It is also used to display blocked I/O statistics. lvmo was first introduced in AIX Version 5.3. It's important to note that lvmo changes only the LVM pbuf tunables that are dedicated to specific volume groups. The ioo utility is still the only way to manage pbufs on a system-wide basis. This is because prior to AIX Version 5.3, the pbuf pool was a system-wide resource. Starting with AIX Version 5.3, LVM manages one pbuf pool for each volume group. What is a pbuf? A pbuf is best defined as a pinned memory buffer. LVM uses these pbufs to control pending disk I/O operations.
Let's display your lvmo tunables for the data2vg
volume group (see Listing 11).
Listing 11. Displaying lvmo tunables
# lvmo -v data2vg -a
vgname = data2vg
pv_pbuf_count = 1024
total_vg_pbufs = 1024
max_vg_pbuf_count = 8192
pervg_blocked_io_count = 7455
global_pbuf_count = 1024
global_blocked_io_count = 7455
What are the tunables here?
- pv_pbuf_count: Reports the number of pbufs added when a physical volume is added to the volume group.
- max_vg_pbuf_count: Reports the maximum number of pbufs that can be allocated for a volume group.
- global_pbuf_count: Reports the number of pbufs that are added when a physical volume is added to any volume group.
Let's increase the pbuf count for this volume group:
# lvmo -v data2vg -o pv_pbuf_count=2048
Quite honestly, I usually stay away from lvmo and use
ioo. I'm more used to tuning the global parameters.
It's important to note that if you increase the pbuf value too much, you can
actually see a degradation in performance.
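Before raising a pbuf value, it is worth checking whether the blocked I/O count is actually still growing. The sketch below compares two lvmo snapshots. It is an illustration only: the function name lvmo_value is hypothetical, the second snapshot's value is made up, and the field name is passed as a parameter so that it can match whatever your lvmo prints.

```shell
#!/bin/sh
# lvmo_value NAME: read `lvmo -a` output ("name = value" lines) on stdin
# and print the value of the tunable or counter called NAME.
lvmo_value() {
    awk -v key="$1" '$1 == key && $2 == "=" { print $3 }'
}

# Two snapshots taken some time apart. The first value comes from
# Listing 11; the second is a hypothetical later reading.
first='vgname = data2vg
pv_pbuf_count = 1024
pervg_blocked_io_count = 7455'
second='vgname = data2vg
pv_pbuf_count = 1024
pervg_blocked_io_count = 7512'

before=$(printf '%s\n' "$first"  | lvmo_value pervg_blocked_io_count)
after=$(printf '%s\n' "$second" | lvmo_value pervg_blocked_io_count)
echo "blocked I/Os added between samples: $((after - before))"
```

On a live system the snapshots would come from # lvmo -v data2vg -a run at two points in time. A count that stays flat suggests the current pbuf setting is adequate, which matters given the warning above about increasing it too far.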
This article focused on logical volumes and how they relate to the disk I/O subsystem. It defined logical volumes at a high level and illustrated how they relate to the application and physical layers. It also defined and discussed some best practices for inter-disk and intra-disk policies as they relate to creating and maintaining logical volumes. You looked at ways to monitor I/O usage for your logical volumes, and you analyzed the data that was captured from the commands that were used to help determine what your problems were. Finally, you tuned your logical volumes by determining and increasing the number of pbufs used in a specific volume group. Part 3 of this series focuses on the application layer as you move on to file systems, using various commands to monitor and tune your file systems and disk I/O subsystems.
Learn
- Use RSS
feed to request notification for the upcoming articles in this series:
- Optimizing AIX 5L™ performance: Tuning disk performance
- Optimizing AIX 5L performance: Tuning your memory settings
- Optimizing AIX 5L performance: Monitoring your CPU
- Check out other parts in each series:
- Optimizing AIX 5L performance: Tuning disk performance
- Optimizing AIX 5L performance: Tuning your memory settings
- Optimizing AIX 5L performance: Monitoring your CPU
- "Storage Management in AIX 5.3"
(Shiv Dutta, developerWorks, April 2005): This article focuses on some of the
features that have been introduced in AIX 5L Version 5.3 to enhance the scope,
functionality, and performance of the Logical Volume Manager (LVM) and Enhanced
Journal File System (JFS2).
- For information on developing a logical volume
strategy, see the
Choosing an Inter-Disk Allocation Policy for Your System
section of the book,
Storage Management Concepts: Operating System and Devices,
from the National Center for Supercomputing Applications.
-
Improving database
performance with AIX concurrent I/O:
Read this white paper for more information on how to improve database performance.
-
IBM Redbooks:
Database Performance Tuning on AIX is designed to help system designers,
system administrators, and database administrators design, size, implement,
maintain, monitor, and tune a Relational Database Management System (RDBMS) for
optimal performance on AIX.
-
Power
Architecture: High-Performance Architecture with a History:
Read this white paper.
- "Power to the People; A history of chip making at IBM"
(developerWorks, December 2005): This article covers the IBM power architecture.
- "Processor Affinity on AIX"
(developerWorks, November 2006): Using process affinity settings to bind or unbind
threads can help you find the root cause of troublesome hang or deadlock problems.
Read this article to learn how to use processor affinity to restrict a process and
run it only on a specified central processing unit (CPU).
- "CPU Monitoring
and Tuning"
(March, 2002): Learn how standard AIX tools can help you determine CPU
bottlenecks.
-
IBM
Redbooks:
AIX 5L Practical Performance Tools and Tuning Guide is a comprehensive guide about
performance monitoring and tuning tools that are provided with AIX 5L Version 5.3.
- "AIX 5L Version 5.3: What's in it for you?"
(Shiv Dutta, developerWorks, June 2005): Learn what features you can benefit from
in AIX 5L Version 5.3.
-
IBM Redbooks:
The AIX 5L Differences Guide Version 5.3 Edition focuses on the differences
introduced in AIX 5L Version 5.3 when compared to AIX 5L Version 5.2.
-
Operating System and Device Management:
This document from IBM provides users and system administrators with complete
information that can affect your selection of options when performing such tasks
as backing up and restoring the system, managing physical and logical storage, and
sizing appropriate paging space.
-
IBM Redbooks:
For help in obtaining IBM certification for AIX 5L and the eServer®
pSeries®, read IBM Certification Study Guide for eServer p5 and pSeries
Administration and Support for AIX 5L Version 5.3.
-
Popular content:
See what AIX and UNIX content your peers find interesting.
-
AIX and
UNIX:
The AIX and UNIX developerWorks zone provides a wealth of information relating to
all aspects of AIX systems administration and expanding your UNIX skills.
-
New to AIX and UNIX?:
Visit the New to AIX and UNIX page to learn more about AIX and UNIX.
-
AIX 5L Wiki:
Discover a collaborative environment for technical information related to AIX.
- Search the AIX and UNIX library by topic:
- System administration
- Application development
- Performance
- Porting
- Security
- Tips
- Tools and utilities
- Java™ technology
- Linux®
- Open source
-
Safari bookstore:
Visit this e-reference library to find specific technical resources.
-
developerWorks technical events and webcasts:
Stay current with developerWorks technical events and webcasts.
-
Podcasts: Tune in and
catch up with IBM technical experts.
-
Future Tech: Visit Future Tech's site to learn
more about their latest offerings.
Get products and technologies
- You can download the
nmon analyzer from
here.
-
IBM trial software:
Build your next development project with software for download directly from
developerWorks.
Discuss
- Participate in the
developerWorks blogs
and get involved in the developerWorks community.
- Participate in the AIX and UNIX forums:
- AIX 5L—technical forum
- AIX for Developers Forum
- Cluster Systems Management
- IBM Support Assistant
- Performance Tools—technical
- Virtualization—technical
- More AIX and UNIX forums
Ken Milberg is a Technology Writer and Site Expert for techtarget.com and provides Linux technical information and support at searchopensource.com. He is also a writer and technical editor for IBM Systems Magazine, Open Edition. Ken holds a bachelor's degree in computer and information science and a master's degree in technology management from the University of Maryland. He is the founder and group leader of the NY Metro POWER-AIX/Linux Users Group. Through the years, he has worked for both large and small organizations and has held diverse positions from CIO to Senior AIX Engineer. Today, he works for Future Tech, a Long Island-based IBM business partner. Ken is a PMI certified Project Management Professional (PMP), an IBM Certified Advanced Technical Expert (CATE, IBM System p5 2006), and a Solaris Certified Network Administrator (SCNA). You can contact him at kmilberg@gmail.com.