Disk I/O pacing

Disk-I/O pacing is intended to prevent programs that generate very large amounts of output from saturating the system's I/O facilities and causing the response times of less-demanding programs to deteriorate.

Disk-I/O pacing enforces per-segment, or per-file, high- and low-water marks on the sum of all pending I/Os. When a process tries to write to a file whose pending writes have already reached the high-water mark, the process is put to sleep until enough I/Os have completed to bring the number of pending writes down to or below the low-water mark. The logic of I/O-request handling does not change; the output from high-volume processes is only slowed down somewhat.
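The sleep and wake-up behavior described above can be sketched with a small toy model. This is an illustration only, not AIX code: the function name simulate_pacing, the tick-based timing, and the write_rate and drain_rate parameters are all assumptions introduced for the example.

```python
def simulate_pacing(maxpout, minpout, write_rate, drain_rate, ticks):
    """Toy model of per-file I/O pacing (illustrative, not AIX internals).

    Each tick the disk completes up to `drain_rate` pending pages, then the
    writer tries to queue `write_rate` pages one at a time.
    Returns a list of (pending_pages, writer_sleeping), one entry per tick.
    """
    pacing_on = not (maxpout == 0 and minpout == 0)  # both zero disables pacing
    pending = 0
    sleeping = False
    history = []
    for _ in range(ticks):
        # Some pending writes complete.
        pending = max(0, pending - drain_rate)
        # A sleeping writer wakes only at or below the low-water mark.
        if sleeping and pending <= minpout:
            sleeping = False
        if not sleeping:
            for _ in range(write_rate):
                # A write attempt at the high-water mark puts the writer to sleep.
                if pacing_on and pending >= maxpout:
                    sleeping = True
                    break
                pending += 1
        history.append((pending, sleeping))
    return history


if __name__ == "__main__":
    for tick, (pending, sleeping) in enumerate(simulate_pacing(8, 4, 4, 1, 10)):
        print(tick, pending, "sleeping" if sleeping else "running")
```

In this model the pending count never exceeds the high-water mark when pacing is on, and it grows without bound when both marks are 0, which matches the intent described in the text.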

You can set the high- and low-water marks system-wide with the SMIT tool by selecting System Environments -> Change / Show Characteristics of Operating System (smitty chgsys) and then entering the number of pages for the high- and low-water marks. You can also set them for individual file systems by using the maxpout and minpout mount options.

The maxpout parameter specifies the number of pages that can be scheduled in the I/O state to a file before the threads are suspended. The minpout parameter specifies the minimum number of scheduled pages at which the threads are woken up from the suspended state. The default value for maxpout is 8193, and the default value for minpout is 4096. To disable I/O pacing, set both parameters to 0.

Changes to the system-wide values of the maxpout and minpout parameters take effect immediately without rebooting the system. Values that are set for a file system with the maxpout and minpout mount options override the system-wide settings for that file system. You can exclude a file system from system-wide I/O pacing by mounting the file system with the values of the maxpout and minpout parameters explicitly set to 0. The following command is an example:
mount -o minpout=0,maxpout=0 /<file system>
Tuning the maxpout and minpout parameters might prevent any thread that is doing sequential writes to a file from dominating system resources.
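The relationship between the system-wide values and the per-file-system mount options can be sketched as follows. The function names and the tuple layout are assumptions made for illustration, and the sketch assumes that both mount options are always given together, as in the command above.

```python
# (maxpout, minpout) defaults quoted in the text.
SYSTEM_DEFAULT = (8193, 4096)


def effective_pacing(system_wide, fs_override=None):
    """Return the (maxpout, minpout) pair that applies to one file system.

    fs_override is the pair given with the mount options, or None when the
    file system was mounted without them (hypothetical representation).
    """
    return system_wide if fs_override is None else fs_override


def pacing_enabled(pair):
    """Pacing is off only when both values are 0."""
    maxpout, minpout = pair
    return not (maxpout == 0 and minpout == 0)


if __name__ == "__main__":
    print(effective_pacing(SYSTEM_DEFAULT))          # no mount options given
    print(effective_pacing(SYSTEM_DEFAULT, (0, 0)))  # file system excluded
```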

The following table demonstrates the response time of a session of the vi editor on an IBM® eServer™ pSeries model 7039-651, configured as a 4-way system with a 1.7 GHz processor, with various values for the maxpout and the minpout parameters while writing to disk:

Value for maxpout  Value for minpout  dd block size (10 GB)  write (sec)  Throughput (MB/sec)  vi comments
0                  0                  10000                  201          49.8                 after dd completed
33                 24                 10000                  420          23.8                 no delay
65                 32                 10000                  291          34.4                 no delay
129                32                 10000                  312          32.1                 no delay
129                64                 10000                  266          37.6                 no delay
257                32                 10000                  316          31.6                 no delay
257                64                 10000                  341          29.3                 no delay
257                128                10000                  223          44.8                 no delay
513                32                 10000                  240          41.7                 no delay
513                64                 10000                  237          42.2                 no delay
513                128                10000                  220          45.5                 no delay
513                256                10000                  206          48.5                 no delay
513                384                10000                  206          48.5                 3-6 seconds
769                512                10000                  203          49.3                 15-40 seconds, can be longer
769                640                10000                  207          48.3                 less than 3 seconds
1025               32                 10000                  224          44.6                 no delay
1025               64                 10000                  214          46.7                 no delay
1025               128                10000                  209          47.8                 less than 1 second
1025               256                10000                  204          49.0                 less than 1 second
1025               384                10000                  203          49.3                 3 seconds
1025               512                10000                  203          49.3                 25-40 seconds, can be longer
1025               640                10000                  202          49.5                 7-20 seconds, can be longer
1025               768                10000                  202          49.5                 15-95 seconds, can be longer
1025               896                10000                  209          47.8                 3-10 seconds
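The throughput figures in the table are consistent with dividing the amount written by the elapsed write time. The sketch below reproduces that arithmetic, assuming that the 10 GB written corresponds to 10,000 MB, which the figures suggest but the table does not state. The function names are illustrative.

```python
def throughput_mb_per_sec(elapsed_sec, total_mb=10000):
    """Throughput for writing total_mb in elapsed_sec, rounded as in the table.

    total_mb=10000 is an assumption inferred from the table figures.
    """
    return round(total_mb / elapsed_sec, 1)


def slowdown_pct(elapsed_sec, baseline_sec=201):
    """Extra write time, in percent, relative to the unpaced run (201 seconds)."""
    return round((elapsed_sec - baseline_sec) / baseline_sec * 100, 1)


if __name__ == "__main__":
    # maxpout=513, minpout=256: last row in the table with no vi delay.
    print(throughput_mb_per_sec(206), slowdown_pct(206))
```

Comparing rows this way shows the trade-off the table is illustrating: settings with no vi delay cost some write throughput compared with the unpaced run.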

The best range for the maxpout and minpout parameters depends on the CPU speed and the I/O system. I/O pacing works well if the value of the maxpout parameter is equal to or greater than the value of the j2_nPagesPerWriteBehindCluster parameter. For example, if the value of the maxpout parameter is 64 and the value of the minpout parameter is 32, there are at most 64 pages in I/O state, which is 2 I/Os at the default cluster size of 32 pages, before the next write blocks.
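The arithmetic in this example can be written out as a short sketch. It assumes that each write-behind I/O carries exactly one cluster of j2_nPagesPerWriteBehindCluster pages; the function names are hypothetical.

```python
def ios_before_blocking(maxpout, pages_per_cluster=32):
    """Full write-behind clusters that fit under the high-water mark.

    Assumes one I/O per cluster of pages_per_cluster pages (default 32,
    the j2_nPagesPerWriteBehindCluster default quoted in the text).
    """
    if pages_per_cluster <= 0:
        raise ValueError("pages_per_cluster must be positive")
    return maxpout // pages_per_cluster


def maxpout_covers_cluster(maxpout, pages_per_cluster=32):
    """True when maxpout is at least one cluster, as the text recommends."""
    return maxpout >= pages_per_cluster


if __name__ == "__main__":
    print(ios_before_blocking(64), maxpout_covers_cluster(64))
```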

The default tuning parameters are as follows:

Parameter                       Default Value
j2_nPagesPerWriteBehindCluster  32
j2_nBufferPerPagerDevice        512

For Enhanced JFS, you can use the ioo -o j2_nPagesPerWriteBehindCluster command to specify the number of pages to be scheduled at one time. The default number of pages for an Enhanced JFS cluster is 32, which implies a default cluster size of 128 KB (32 pages of 4 KB each). You can use the ioo -o j2_nBufferPerPagerDevice command to specify the number of file system bufstructs. The default value is 512. For the value to take effect, the file system must be remounted.
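The 128 KB figure follows from the cluster size in pages. A minimal sketch of the conversion, assuming 4 KB pages (the page size that makes 32 pages equal 128 KB); the names are illustrative:

```python
PAGE_SIZE_KB = 4  # assumption: 4 KB pages, consistent with 32 pages = 128 KB


def cluster_size_kb(pages_per_cluster=32, page_size_kb=PAGE_SIZE_KB):
    """Size of one write-behind cluster in KB."""
    return pages_per_cluster * page_size_kb


if __name__ == "__main__":
    print(cluster_size_kb())
```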

For Enhanced JFS, you can use the mount -o remount command to change the maxpout and minpout values of an already mounted file system.