Benefits of direct I/O
The primary benefit of direct I/O is to reduce CPU utilization for file reads and writes by eliminating the copy from the cache to the user buffer.
Direct I/O can also benefit file data that has a very poor cache hit rate: if the hit rate is low, most read requests have to go to the disk anyway. Direct I/O can likewise benefit applications that must use synchronous writes, because those writes have to go to disk in any case. In both cases, CPU usage is reduced because the data copy is eliminated.
A second benefit of direct I/O is that it allows applications to avoid diluting the effectiveness of caching of other files. Anytime a file is read or written, that file competes for space in the cache. This situation may cause other file data to be pushed out of the cache. If the newly cached data has very poor reuse characteristics, the effectiveness of the cache can be reduced. Direct I/O gives applications the ability to identify files where the normal caching policies are ineffective, thus releasing more cache space for files where the policies are effective.
Performance costs of direct I/O
Although direct I/O can reduce CPU usage, using it typically results in the process taking longer to complete, especially for relatively small requests. This penalty is caused by the fundamental differences between normal cached I/O and direct I/O.
Direct I/O reads
Every direct I/O read causes a synchronous read from disk, unlike the normal cached I/O policy, where reads may be satisfied from the cache. This can result in very poor performance if the data would likely have been in memory under the normal caching policy.
Direct I/O also bypasses the normal JFS or JFS2 read-ahead algorithms. These algorithms can be extremely effective for sequential access to files by issuing larger and larger read requests and by overlapping reads of future blocks with application processing.
Applications can compensate for the loss of JFS or JFS2 read-ahead by issuing larger read requests. At a minimum, direct I/O readers should issue read requests of at least 128 K to match the JFS or JFS2 read-ahead characteristics.
Applications can also simulate JFS or JFS2 read-ahead by issuing asynchronous direct I/O read-ahead either by use of multiple threads or by using the aio_read subroutine.
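As a hedged sketch of that approach, the following C program queues two 128 K reads with the POSIX aio_read interface and processes each chunk while the other is still in flight. The file name, the 4 K buffer alignment (a common dio_align value, per the table later in this section), and the abbreviated error handling are all illustrative assumptions.

```c
/* Sketch: two overlapping 128 K direct I/O reads using POSIX AIO.
 * File name, alignment value, and error handling are illustrative. */
#define _ALL_SOURCE              /* exposes O_DIRECT in <fcntl.h> */
#include <aio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK (128 * 1024)       /* matches the 128 K guidance above */

int main(void)
{
    int fd = open("datafile", O_RDONLY | O_DIRECT);  /* hypothetical path */
    if (fd < 0)
        return 1;

    /* Buffers must satisfy the file system's dio_align requirement;
     * 4 K is assumed here (see the alignment table below). */
    void *buf[2];
    if (posix_memalign(&buf[0], 4096, CHUNK) != 0 ||
        posix_memalign(&buf[1], 4096, CHUNK) != 0)
        return 1;

    struct aiocb cb[2];
    memset(cb, 0, sizeof cb);
    for (int i = 0; i < 2; i++) {
        cb[i].aio_fildes = fd;
        cb[i].aio_buf    = buf[i];
        cb[i].aio_nbytes = CHUNK;
        cb[i].aio_offset = (off_t)i * CHUNK;
        aio_read(&cb[i]);        /* queue both reads before processing */
    }

    for (int i = 0; i < 2; i++) {
        const struct aiocb *const list[1] = { &cb[i] };
        aio_suspend(list, 1, NULL);          /* wait for chunk i */
        ssize_t n = aio_return(&cb[i]);      /* bytes read, or -1 */
        /* ... process buf[i][0..n) while the other read proceeds ... */
        (void)n;
    }

    close(fd);
    return 0;
}
```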
Direct I/O writes
Every direct I/O write causes a synchronous write to disk, unlike the normal cached I/O policy, where the data is merely copied into memory and then written to disk later. This fundamental difference can cause a significant performance penalty for applications that are converted to use direct I/O.
Conflicting file access modes
To avoid consistency issues between programs that use direct I/O and programs that use normal cached I/O, direct I/O is an exclusive use mode. If there are multiple opens of a file and some of them are direct and others are not, the file will stay in its normal cached access mode. Only when the file is open exclusively by direct I/O programs will the file be placed in direct I/O mode.
Similarly, if the file is mapped into virtual memory through the shmat or mmap system calls, the file will stay in normal cached mode.
The JFS or JFS2 will attempt to move the file into direct I/O mode anytime the last conflicting or non-direct access is eliminated (either by the close, munmap, or shmdt subroutines). Changing the file from normal mode to direct I/O mode can be rather expensive because it requires writing all modified pages to disk and removing all the file's pages from memory.
Enabling applications to use direct I/O
Applications enable direct I/O access to a file by passing the O_DIRECT flag to the open subroutine. This flag is defined in the fcntl.h file. Applications must be compiled with _ALL_SOURCE enabled to see the definition of O_DIRECT.
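A minimal sketch follows; the helper name and access mode are placeholders.

```c
/* Sketch: enable direct I/O at open time. _ALL_SOURCE must be
 * defined before <fcntl.h> so that O_DIRECT is visible. */
#define _ALL_SOURCE
#include <fcntl.h>

int open_for_direct_io(const char *path)   /* hypothetical helper */
{
    return open(path, O_RDWR | O_DIRECT);
}
```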
Offset/Length/Address alignment requirements of the target buffer
For direct I/O to work efficiently, the request should be suitably conditioned. Applications can query the offset, length, and address alignment requirements by using the finfo and ffinfo subroutines. When the FI_DIOCAP command is used, the finfo and ffinfo subroutines return information in the diocapbuf structure, as described in the sys/finfo.h file. This structure contains the following fields (a query sketch follows the list):
- dio_offset: Recommended offset alignment for direct I/O writes to files in this file system
- dio_max: Recommended maximum write length for direct I/O writes to files in this file system
- dio_min: Recommended minimum write length for direct I/O writes to files in this file system
- dio_align: Recommended buffer alignment for direct I/O writes to files in this file system
Failure to meet these requirements may cause file reads and writes to use the normal cached model and may cause direct I/O to be disabled for the file. Different file systems may have different requirements, as the following table illustrates.
| File system format | dio_offset | dio_max | dio_min | dio_align |
|---|---|---|---|---|
| JFS fixed, 4 K blk | 4 K | 2 MB | 4 K | 4 K |
| JFS fragmented | 4 K | 2 MB | 4 K | 4 K |
| JFS compressed | n/a | n/a | n/a | n/a |
| JFS big file | 128 K | 2 MB | 128 K | 4 K |
| JFS2 | 4 K | 4 GB | 4 K | 4 K |
Direct I/O limitations
Direct I/O is not supported for files in a compressed-file file system. For these files, the O_DIRECT flag is ignored on open and the files are accessed with the normal cached I/O methods.
Direct I/O and data I/O integrity completion
Although direct I/O writes are done synchronously, they do not provide synchronized I/O data integrity completion, as defined by POSIX. Applications that need this feature should use O_DSYNC in addition to O_DIRECT. O_DSYNC guarantees that all of the data and enough of the metadata (for example, indirect blocks) have been written to stable storage to be able to retrieve the data after a system crash. O_DIRECT only writes the data; it does not write the metadata.
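As a sketch, the two flags are simply combined at open time; the helper name and access mode below are placeholders.

```c
/* Sketch: direct I/O with synchronized I/O data integrity
 * completion. O_DSYNC ensures the data and required metadata
 * reach stable storage before the write returns. */
#define _ALL_SOURCE
#include <fcntl.h>

int open_for_durable_direct_io(const char *path)   /* hypothetical helper */
{
    return open(path, O_WRONLY | O_DIRECT | O_DSYNC);
}
```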