Level: Introductory
Shiv Dutta, Technical Consultant, IBM
07 Nov 2002
An alternative I/O technique called Direct I/O can give your AIX applications improved performance. This article discusses the benefits of Direct I/O and tells how to implement it.
Introduction
The alternative I/O technique called Direct I/O was first introduced in AIX 4.3 and has been available in all later releases of AIX, including AIX 5L. It bypasses the Virtual Memory Manager (VMM) altogether and transfers data directly between the disk and the user's buffer. You may find improved performance of your applications when you implement this technique for file handling.
In the following discussion, any reference to JFS implies a reference to both JFS and JFS2. JFS (Journaled File System) is native to the POWER-based platform. Although JFS2 (also known as Enhanced Journaled File System) is not native to the POWER-based platform, it is available on POWER. Both JFS and JFS2, as used in AIX, exploit database journaling techniques to maintain their structural consistency. This prevents damage to the file system when the system is halted abnormally.
Overview
Normally, when an I/O request to a JFS file is invoked, the I/O goes from the application buffer to the Virtual Memory Manager (VMM) and then from the Virtual Memory Manager to the JFS. When the application makes a request for a file read, if the file page is not in memory, the JFS reads the data from the disk into the file buffer cache, then copies the data from the file buffer cache to the user's buffer. On the other hand, when the application makes a request for a write, the data is copied from the user's buffer into the file buffer cache. The actual writes to disk are done later if the write requests cannot be accommodated immediately.
This type of caching policy can be extremely effective in improving performance of JFS I/O when the cache hit rate is high. It would fully exploit the read-ahead and write-behind features of JFS. This would allow file writes to be asynchronous so that the applications can continue to process without having to wait for I/O requests to complete. On the other hand, if the applications have poor cache hit rates or if they do very large I/Os, such caching policy may not be of much benefit.
If you know that certain files have poor cache-utilization characteristics, then you could open those files as Direct I/O files. Doing this most likely will lead to improved performance of your application.
Direct I/O for files is functionally equivalent to raw I/O for devices, and using Direct I/O does not affect raw I/O performance. Raw I/O is still slightly faster than Direct I/O, but Direct I/O provides the benefits of a JFS along with enhanced performance.
Enabling your applications to use Direct I/O
At the programming level, Direct I/O access to a file is enabled by passing the O_DIRECT flag to the open function. This flag is defined in fcntl.h. Applications must be compiled with _ALL_SOURCE enabled to have the definition of O_DIRECT available.
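As a minimal sketch of the programming-level approach, the helper below requests Direct I/O when opening a file. The name open_direct is hypothetical, and the _GNU_SOURCE define is an assumption added only so the same sketch builds on Linux/glibc; on AIX, _ALL_SOURCE is the macro the text calls for.

```c
/* _ALL_SOURCE exposes O_DIRECT on AIX, as described in the text.
 * _GNU_SOURCE is an assumption so the sketch also builds on Linux/glibc. */
#define _ALL_SOURCE
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open `path` with Direct I/O requested.
 * `flags` carries the usual access mode (O_RDONLY, O_RDWR, ...).
 * The 0644 mode is only consulted if the caller also passes O_CREAT.
 * Returns a file descriptor, or -1 with errno set. */
int open_direct(const char *path, int flags)
{
    int fd = open(path, flags | O_DIRECT, 0644);
    if (fd == -1)
        perror("open with O_DIRECT");
    return fd;
}
```

A caller would use the returned descriptor with read() and write() as usual and release it with close(). Whether the I/O is actually handled as Direct I/O still depends on the alignment rules and the access-mode conflicts described later in this article.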
At the user level, starting with AIX 5.1D, Direct I/O is enabled using the "dio" option on the mount command, for example: mount -o dio /xyz, where /xyz is a filesystem. This works for both JFS and JFS2 filesystems. A filesystem mounted with the "dio" option will have all I/O treated as Direct I/O as long as the alignment requirements are met. The I/O should be aligned to page (4K byte) boundaries, and its length should be a multiple of the page size. If the I/O doesn't meet those requirements, it will go through kernel buffers, and the buffers will be flushed after the I/O completes. This results in poor performance. Therefore, you should use the "dio" option on the mount command only if all applications running against the files in the filesystem are well behaved in this respect.
Once Direct I/O is implemented, it's easy to verify if it's working: Mount a filesystem with the dio option and record the number of memory pages used. Repeat the process with the filesystem mounted without the dio option. Notice that for the Direct I/O implemented filesystem, memory pages will NOT be used to cache pages, hence the vmtune parameter 'numperm' would not grow as in the case of normal I/O.
Rules for Direct I/O at the API level
- There are very strict rules for Direct I/O at the API level. Buffers for the I/O requests need to be 4K byte aligned, and the I/O lengths must be in multiples of 4K bytes. Failure on either at the API level will bypass Direct I/O. Normally, databases naturally obey these rules as they are true of raw logical I/O, too.
- Direct I/O does not bypass i-node locking. If i-node locking is a problem because of writes, it will likely continue to be a problem with Direct I/O.
- Direct I/O is unbuffered, so writes are synchronous. If the application does lots of writes which are buffered without Direct I/O, it may run very slowly with Direct I/O.
- Direct I/O is unbuffered, so there is no read-ahead. If the application is doing a lot of sequential reads and taking advantage of the filesystem making them into bigger physical I/Os, Direct I/O may be slower.
- Direct I/O does not coalesce contiguous I/O's. This would be a possible issue for applications using aio, listio, or readv/writev.
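The alignment rule in the first item can be checked in code before a request is issued. The following is a sketch under stated assumptions: the helper names (dio_request_ok, dio_round_up, dio_alloc) are hypothetical, posix_memalign() is assumed to be available on the target system, and the file-offset check is an assumption drawn from the requirement that the I/O be aligned to page boundaries.

```c
#define _XOPEN_SOURCE 600   /* for posix_memalign() */
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

#define DIO_ALIGN 4096      /* 4K byte page size stated in the rules */

/* Returns 1 if buffer address, length, and file offset are all
 * 4K aligned (and the length is non-zero), otherwise 0. */
int dio_request_ok(const void *buf, size_t len, off_t offset)
{
    return ((uintptr_t)buf % DIO_ALIGN == 0) &&
           (len > 0) && (len % DIO_ALIGN == 0) &&
           (offset % DIO_ALIGN == 0);
}

/* Rounds a length up to the next multiple of 4K bytes. */
size_t dio_round_up(size_t len)
{
    return (len + DIO_ALIGN - 1) / DIO_ALIGN * DIO_ALIGN;
}

/* Allocates a 4K-aligned buffer large enough for `len` bytes.
 * Returns NULL on failure; release the buffer with free(). */
void *dio_alloc(size_t len)
{
    void *buf = NULL;
    if (posix_memalign(&buf, DIO_ALIGN, dio_round_up(len)) != 0)
        return NULL;
    return buf;
}
```

An application could call dio_request_ok() in a debug build before each read or write to catch requests that would otherwise silently fall back to buffered I/O.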
Benefits of Direct I/O
Direct I/O is only supported for program working storage, that is, local persistent files. The main benefit of Direct I/O is the reduction in CPU cycles needed for file reads and writes. This results from not having to copy data from the VMM file cache to the user buffer, as happens with normal cached I/O. With normal cached I/O, if the cache hit rate is low, most read requests go to the disk anyway. As mentioned before, these are the ideal situations in which applications benefit from Direct I/O. However, where the cache hit rate is high under normal caching, applications would see reduced CPU utilization with Direct I/O but would not be able to take advantage of the read-ahead algorithms available under the normal cache policy. Writes are faster with normal cached I/O in most cases. But if a file is opened with O_SYNC or O_DSYNC, the writes have to go to disk. In these cases, applications benefit from Direct I/O because the overhead of the data copy is eliminated.
Another benefit of Direct I/O is that it doesn't allow applications to compromise the effectiveness of caching of other files. When a file is read or written, the file competes for space in the file cache, and this could cause other file data to be pushed out of the cache. If you know that certain files have poor cache-utilization characteristics, then only those files could be opened with O_DIRECT.
Performance of Direct I/O reads
Even though the use of Direct I/O has the potential to reduce the need of CPU cycles for application execution, ironically it leads to longer elapsed times in many cases. This is especially true for a series of small I/O requests.
Direct I/O reads from the disk are synchronous, and this can result in poor performance if the data was likely to be in memory under the normal caching policy. Direct I/O bypasses the VMM read-ahead algorithms because the I/Os would not go through the VMM. The read-ahead algorithm can be quite useful for sequential access to files because the VMM can initiate disk requests and have the pages already be resident in memory before the application has requested the pages. Applications can compensate for the loss of this read-ahead feature by using one of the following methods:
- Issue larger read requests.
- Issue asynchronous Direct I/O read-ahead by the use of multiple threads.
- Use the asynchronous I/O facilities such as aio_read() or lio_listio().
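The first of these methods, issuing larger read requests, might look like the sketch below. The function name read_large_chunks and the chunk_pages parameter are hypothetical, and posix_memalign() and pread() are assumed to be available. The descriptor is expected to have been opened by the caller, for example with O_DIRECT.

```c
#define _XOPEN_SOURCE 600   /* for posix_memalign() and pread() */
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define DIO_PAGE 4096       /* 4K byte page size */

/* Reads the file behind `fd` from offset 0 using requests of
 * `chunk_pages` pages each, so every request uses a 4K-aligned
 * buffer, length, and offset. Returns the total number of bytes
 * read, or -1 on error. */
long long read_large_chunks(int fd, size_t chunk_pages)
{
    void *buf = NULL;
    size_t chunk = chunk_pages * DIO_PAGE;
    long long total = 0;
    off_t offset = 0;

    if (chunk_pages == 0 || posix_memalign(&buf, DIO_PAGE, chunk) != 0)
        return -1;

    for (;;) {
        ssize_t n = pread(fd, buf, chunk, offset);
        if (n < 0) {
            free(buf);
            return -1;
        }
        /* The application would process the n bytes in buf here. */
        total += n;
        if ((size_t)n < chunk)
            break;          /* short read: end of file reached */
        offset += (off_t)chunk;
    }

    free(buf);
    return total;
}
```

A larger chunk_pages value means fewer, bigger physical requests, which is the effect read-ahead would otherwise provide. The best size depends on the workload and hardware and would need to be measured.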
Performance of Direct I/O writes
Direct I/O writes bypass the VMM and go directly to the disk, so there can be a significant performance penalty; in normal cached I/O, the writes can go to memory and then get flushed to disk later. Because Direct I/O writes are not copied into memory, a sync operation does not have to flush these pages to disk, which reduces the amount of work the syncd daemon has to perform.
Performance example
In the following example, performance is measured on an RS/6000 server running AIX 4.3.1. KBPS is the throughput in kilobytes per second, and %CPU is CPU usage in percent.
Listing 1. Performance example
# of 2.2 GB SSA Disks          1       2       4       6       8
# of PCI SSA Adapters          1       1       1       1       1

Sequential read throughput, using normal I/O
KBPS                        7108   14170   18725   18519   17892
%CPU                        23.9    56.1    92.1    97.0    98.3

Sequential read throughput, using Direct I/O
KBPS                        7098   14150   22035   27588   30062
%CPU                         4.4     9.1    22.0    39.2    54.4

Sequential read throughput, using raw I/O
KBPS                        7258   14499   28504   30946   32165
%CPU                         1.6     3.2    10.0    20.9    24.5
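
One way to read Listing 1 is to divide throughput by CPU usage. The sketch below does this for the 8-disk column. The metric (KB/s per percentage point of CPU) and all names in the code are illustrative assumptions, not something the measurements themselves define.

```c
#include <stdio.h>

/* One measurement taken from the 8-disk column of Listing 1. */
struct sample {
    const char *mode;
    double kbps;
    double cpu_percent;
};

static const struct sample eight_disks[] = {
    { "normal I/O", 17892.0, 98.3 },
    { "Direct I/O", 30062.0, 54.4 },
    { "raw I/O",    32165.0, 24.5 },
};

/* KB/s delivered per percentage point of CPU
 * (an illustrative metric, not part of the original listing). */
double kbps_per_cpu_percent(const struct sample *s)
{
    return s->kbps / s->cpu_percent;
}

/* Prints the metric for each access mode. */
void print_efficiency(void)
{
    size_t i;
    for (i = 0; i < sizeof eight_disks / sizeof eight_disks[0]; i++)
        printf("%-10s %8.1f KB/s per %%CPU\n",
               eight_disks[i].mode,
               kbps_per_cpu_percent(&eight_disks[i]));
}
```

With these figures, normal I/O delivers roughly 182 KB/s per CPU percentage point, Direct I/O roughly 553, and raw I/O roughly 1313, which is consistent with the CPU savings described earlier.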
Conflicting file access modes
In order to avoid consistency issues between programs that use Direct I/O and programs that use normal cached I/O, Direct I/O is by default used in an exclusive use mode. If there are multiple opens of a file and some of them are direct and others are not, the file will stay in its normal cached access mode. Only when the file is open exclusively by Direct I/O programs will the file be placed in Direct I/O mode.
Similarly, if the file is mapped into virtual memory via the shmat() or mmap() system calls, then the file will stay in normal cached mode.
The JFS or JFS2 will attempt to move the file into Direct I/O mode any time the last conflicting, non-direct access is eliminated (by the close(), munmap(), or shmdt() subroutines). Changing the file from normal mode to Direct I/O mode can be rather expensive, since it requires writing all modified pages to disk and removing all the file's pages from memory.
Candidates for Direct I/O
I/O-intensive applications that don't benefit much from the normal caching policy are likely to see improved performance when Direct I/O is implemented.
Programs that are typically CPU-limited and perform lots of disk I/O are good candidates for Direct I/O. Programs that perform large sequential I/Os are good candidates as well. Applications that do numerous small I/Os will typically see less performance benefit, since Direct I/O is unable to exploit the read-ahead or write-behind algorithms available under the normal caching policy. Applications that benefit from striping are also good candidates.
About the author
Shiv Dutta is a technical consultant for IBM eServer group where he assists independent software vendors with the enablement of their applications on pSeries servers. Shiv has considerable experience as a software developer, system administrator and an instructor. He provides AIX support in the areas of system administration, problem determination, performance tuning and sizing guides. Shiv has worked with AIX from its inception. He holds a Ph.D. in Physics from Ohio University and can be reached at sdutta@us.ibm.com.