Partial file caching

With partial file caching, the cache fetches only the blocks that are read rather than the entire file, which uses network bandwidth and local disk space more efficiently. This is useful when an application does not need to read the whole file. Partial file caching is enabled on an IBM Spectrum Scale™ block boundary.

Partial file caching is controlled by the afmPrefetchThreshold parameter, which can be updated by using the mmchfileset command. The default value of this parameter is 0, which means complete file caching: after any three blocks of a file have been read by the cache, all remaining blocks are fetched and the file is marked as cached. This is useful for sequentially accessed files that are read in their entirety, such as image files, home directories, and development environments.
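The following is a minimal sketch of how the mmchfileset invocation might be assembled from a script. It assumes the usual `-p Attribute=Value` form that mmchfileset uses for AFM attributes; the device name `fs1` and the fileset name `cache1` are placeholders, and the helper function is hypothetical. Check the mmchfileset command reference for your release before running the resulting command.

```python
from typing import List


def build_prefetch_threshold_command(device: str, fileset: str, threshold: int) -> List[str]:
    """Return the argument list that sets afmPrefetchThreshold on a fileset.

    This helper only builds the command; it does not run it.
    """
    # 0 is the default (complete file caching) and 100 disables full file prefetching.
    if not 0 <= threshold <= 100:
        raise ValueError("afmPrefetchThreshold must be between 0 and 100")
    return ["mmchfileset", device, fileset, "-p", f"afmPrefetchThreshold={threshold}"]


# "fs1" and "cache1" are placeholder names, not values from this documentation.
command = build_prefetch_threshold_command("fs1", "cache1", 50)
print(" ".join(command))
```

On a cluster node, the list can be passed to `subprocess.run(command, check=True)` so that a failed update raises an error instead of passing silently.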

Valid afmPrefetchThreshold values range from 0 to 100. A value from 1 to 99 specifies the percentage of the file size that must be cached before the rest of the data blocks are automatically fetched into the cache. A large value is suitable for a file that is accessed only partially.

An afmPrefetchThreshold value of 100 disables full file prefetching. With this value, only the data blocks that the application reads are cached. This is useful for large random-access files that are either too big to fit in the cache or never expected to be read in their entirety. When all data blocks are available in the cache, the file is marked as cached.
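The threshold rules described above can be summarized in a small model. This is an illustrative sketch only, not the AFM implementation: the function name is hypothetical, progress is counted in whole blocks, and the use of "greater than or equal to" at the threshold is an assumption.

```python
def should_fetch_whole_file(threshold: int, blocks_cached: int, total_blocks: int) -> bool:
    """Model whether the remaining blocks of a file would be prefetched."""
    if not 0 <= threshold <= 100:
        raise ValueError("afmPrefetchThreshold must be between 0 and 100")
    if blocks_cached >= total_blocks:
        return False  # Every block is already cached; nothing is left to fetch.
    if threshold == 0:
        # Default: complete file caching starts after any three blocks are read.
        return blocks_cached >= 3
    if threshold == 100:
        # Full file prefetching is disabled; only blocks that are read are cached.
        return False
    # Otherwise compare the cached percentage with the threshold (assumed inclusive).
    return blocks_cached * 100 >= threshold * total_blocks


print(should_fetch_whole_file(0, 3, 1000))
print(should_fetch_whole_file(50, 400, 1000))
print(should_fetch_whole_file(50, 500, 1000))
print(should_fetch_whole_file(100, 999, 1000))
```

In this model, a threshold of 50 leaves a 1000-block file partially cached until 500 blocks are present, and a threshold of 100 never triggers a full fetch no matter how many blocks have been read.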

For sparse files, the percentage for prefetching is calculated as the ratio of the size of the data blocks allocated in the cache to the total size of the data blocks allocated at home. Holes in the home file are not considered in the calculation.
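As a worked illustration of the sparse-file rule, the sketch below computes the percentage from allocated sizes only. The function name and the sample sizes are hypothetical, and treating a home file with no allocated data as fully cached is an assumption of this sketch.

```python
def cached_percentage(cached_allocated_bytes: int, home_allocated_bytes: int) -> float:
    """Return the cached share of a file, ignoring holes in the home file."""
    if home_allocated_bytes == 0:
        # Assumption: a home file that consists only of holes has nothing to fetch.
        return 100.0
    return 100.0 * cached_allocated_bytes / home_allocated_bytes


MIB = 1024 * 1024

# Hypothetical sparse file: 1024 MiB logical size, but only 256 MiB allocated at home.
logical_size = 1024 * MIB
home_allocated = 256 * MIB
cache_allocated = 64 * MIB

# The calculation uses allocated data, so the result is 25%, not 64/1024 = 6.25%.
print(cached_percentage(cache_allocated, home_allocated))
```

Because holes are excluded, a sparse file can reach a given afmPrefetchThreshold after far fewer bytes have been read than its logical size suggests.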

Writes on partially cached files

If a write is queued on a partially cached file, the file is first cached completely, and only then is the write queued on the file. Appending to a partially cached file does not cache the whole file. In the LU mode alone, a write (insert or append) on a partially cached file caches the whole file even if the prefetch threshold is set on the fileset.

Note: Because partial file caching is not backward compatible, all nodes must be on GPFS™ 3.5.0.11 or later.