GPFS use with Oracle
When GPFS is used with Oracle, take the following configuration and tuning considerations into account.
- When setting up your LUNs, create the NSDs so that each NSD maps one-to-one to a LUN that is a single RAID device.
- For file systems that hold large Oracle databases, set the GPFS file system block size to a larger value by using the -B option of the mmcrfs command:
  - 512 KB is generally suggested.
  - 256 KB is suggested when there is activity other than Oracle using the file system and many small files exist that are not in the database.
  - 1 MB is suggested for file systems that are 100 TB or larger in size.
- Set the GPFS worker threads through the mmchconfig worker1Threads command to allow the maximum parallelism of the Oracle AIO threads:
  - Adjust the GPFS prefetch threads accordingly through the mmchconfig prefetchThreads command. The maximum value of prefetchThreads plus worker1Threads plus nsdMaxWorkerThreads is 8192 on all 64-bit platforms.
  - When requiring GPFS sequential I/O, set the prefetch threads in the range 50 - 100 (the default is 72).
Note: These changes through the mmchconfig command take effect upon restart of the GPFS daemon.
- The number of AIX AIO kprocs to create is approximately the same as the GPFS worker1Threads setting.
- The AIX AIO maxservers setting is the number of kprocs PER CPU. It is suggested to set it slightly larger than the value of worker1Threads divided by the number of CPUs. For example, if worker1Threads is set to 500 on a 32-way SMP, set maxservers to 20.
- Set the Oracle database block size equal to the LUN segment size or a multiple of the LUN pdisk segment size.
- Set the Oracle read-ahead value to prefetch one or two full GPFS blocks. For example, if your GPFS block size is 512 KB and the Oracle block size is 16 KB, set the read-ahead to either 32 or 64 blocks.
- Do not use the dio option on the mount command as this forces DIO when accessing all files. Oracle automatically uses DIO to open database files on GPFS.
- When running Oracle RAC 10g, it is suggested that you increase the value for OPROCD_DEFAULT_MARGIN to at least 500 to avoid possible random restarts of nodes.
In the control script for the Oracle CSS daemon, which is located in /etc/init.cssd, the value for OPROCD_DEFAULT_MARGIN is set to 500 (milliseconds) on all UNIX derivatives except for AIX. For AIX, this value is set to 100. From a GPFS perspective, even 500 milliseconds might be too low in situations where node failover might take up to a minute or two to resolve. However, if during node failure the surviving node is already doing direct I/O to the oprocd control file, it should already have the necessary tokens and indirect block cached, and therefore should not have to wait during failover.
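
The sizing arithmetic in the considerations above can be sketched in a few lines of Python. This is only an illustration, not an IBM or Oracle tool. The function names are hypothetical, and the 1.25 headroom factor is an assumption chosen so that the 500-thread, 32-way example yields 20. Checking the 100 TB rule before the small-file rule is also an assumption, because the guidance does not say which applies when both conditions hold.

```python
import math

# Limit on prefetchThreads + worker1Threads + nsdMaxWorkerThreads
# on 64-bit platforms, as stated in the guidance above.
MAX_THREAD_SUM = 8192


def suggested_block_size_kb(fs_size_tb, mixed_small_files=False):
    """Return the suggested GPFS block size in KB.

    Assumption: the 100 TB rule is checked first; the source does not
    rank the rules against each other.
    """
    if fs_size_tb >= 100:
        return 1024  # 1 MB for file systems of 100 TB or larger
    if mixed_small_files:
        return 256   # non-Oracle activity with many small files
    return 512       # general suggestion


def thread_settings_ok(prefetch_threads, worker1_threads, nsd_max_worker_threads):
    """Check the combined thread count against the 8192 limit."""
    total = prefetch_threads + worker1_threads + nsd_max_worker_threads
    return total <= MAX_THREAD_SUM


def suggested_maxservers(worker1_threads, cpus, headroom=1.25):
    """Return an AIX AIO maxservers value (kprocs per CPU).

    The value is slightly larger than worker1Threads / CPUs.
    Assumption: "slightly larger" is modeled as a 25% headroom factor.
    """
    return math.ceil(worker1_threads / cpus * headroom)


def read_ahead_blocks(gpfs_block_kb, oracle_block_kb, gpfs_blocks=1):
    """Return the number of Oracle blocks that cover one or two GPFS blocks."""
    if gpfs_blocks not in (1, 2):
        raise ValueError("prefetch one or two full GPFS blocks")
    if gpfs_block_kb % oracle_block_kb != 0:
        raise ValueError("GPFS block size must be a multiple of the Oracle block size")
    return gpfs_blocks * gpfs_block_kb // oracle_block_kb


if __name__ == "__main__":
    print("block size (KB):", suggested_block_size_kb(10))
    print("maxservers:", suggested_maxservers(500, 32))
    print("read-ahead blocks:", read_ahead_blocks(512, 16), "or",
          read_ahead_blocks(512, 16, gpfs_blocks=2))
```

The results are starting points only. The values still have to be applied through mmcrfs, mmchconfig, and the AIX AIO settings, and the mmchconfig changes take effect only after the GPFS daemon restarts.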