Prefetch
Prefetch fetches the file metadata (inode information) and data from home before an application requests the contents.
Prefetch is a feature that allows fetching the contents of a file into the cache before actual reads.
Prefetching files before an application starts can reduce the network delay when the application requests a file. Prefetch can also be used to proactively manage WAN traffic patterns by moving files over the WAN during a period of low WAN usage.
You can use prefetch to do the following:
- Populate metadata
- Populate data
- View prefetch statistics
mmafmctl Device prefetch -j FilesetName [-s LocalWorkDirectory]
[--retry-failed-file-list|--enable-failed-file-list]
[--directory LocalDirectoryPath]|
{--list-file ListFile | --home-list-file HomeListFile} [--policy] |--home-inode-file PolicyListFile ]
[--home-fs-path HomeFileSystemPath][--metadata-only]
For more information on the command, see mmafmctl command. If no options are given for prefetch, the statistics of the last prefetch command run on the fileset are displayed.
--metadata-only - Prefetches only the metadata and not the actual data. This is useful in migration scenarios. This option requires the list of files whose metadata is to be populated, so it must be combined with a list file option. The list file can contain one of the following:
- Files with fully qualified names from cache.
- Files with fully qualified names from home.
- List of files from home generated using policy. The file must not be edited.
--enable-failed-file-list - Turns on generation of a list of the files that failed during the prefetch operation at the gateway node. The list is saved as .afm/.prefetchedfailed.list under the fileset. Failures that occur during processing (before queuing) are not logged in .afm/.prefetchedfailed.list. If you observe any errors during processing, you might need to correct the errors and rerun prefetch.
--directory LocalDirectoryPath - Specifies the path to the local directory from which you want to prefetch files. A list of all files in this directory and all its sub-directories is generated and queued for prefetch.
--home-list-file HomeListFile - The specified file contains a list of files from home that need to be pre-populated, one file per line. All files must have fully qualified path names. If the files to be prefetched have filenames with special characters, use a policy to generate the list file, and then hand-edit the policy-generated file to remove all entries except the filenames. As of version 4.2.1, this option is deprecated. The --list-file option can handle this.
--home-inode-file PolicyListFile - The specified file contains the list of files from home that need to be pre-populated in the cache, and is generated using policy. This file must not be hand-edited. This option is deprecated. The --list-file option can handle this.
--home-fs-path HomeFileSystemPath - Specifies the full path to the fileset at the home cluster and can be used in conjunction with --list-file. You must use this option when, with the NSD protocol, the mount point on the gateway nodes of the afmTarget filesets does not match the mount point on the home cluster. For example, the home filesystem is mounted on the home cluster at /gpfs/homefs1 and on the cache, using the NSD protocol, at /gpfs/remotefs1.
For example, mmafmctl gpfs1 prefetch -j cache1 --list-file /tmp/list.allfiles --home-fs-path /gpfs/remotefs1.
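The options described above can also be assembled by a script. The following is a minimal Python sketch, not part of the product: the helper name build_prefetch_command is hypothetical, it uses only flags that appear in this section, and it only builds the argument list without running mmafmctl.

```python
from typing import List, Optional


def build_prefetch_command(
    device: str,
    fileset: str,
    list_file: Optional[str] = None,
    directory: Optional[str] = None,
    home_fs_path: Optional[str] = None,
    metadata_only: bool = False,
) -> List[str]:
    """Build (but do not run) an mmafmctl prefetch argument list.

    Hypothetical helper: only flags named in this section are emitted.
    """
    if metadata_only and list_file is None:
        # --metadata-only has to be combined with a list file option.
        raise ValueError("--metadata-only requires a list file")
    cmd = ["mmafmctl", device, "prefetch", "-j", fileset]
    if directory is not None:
        cmd += ["--directory", directory]
    if list_file is not None:
        cmd += ["--list-file", list_file]
    if home_fs_path is not None:
        cmd += ["--home-fs-path", home_fs_path]
    if metadata_only:
        cmd.append("--metadata-only")
    return cmd
```

Calling build_prefetch_command("fs1", "ro") with no further options yields the form of the command that displays the statistics of the last prefetch. The returned list could be passed to subprocess.run on a node where mmafmctl is installed.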
Prefetch is an asynchronous process, and the fileset can be used while prefetch is in progress. Prefetch completion can be monitored by using the afmPrepopEnd callback event or by running the mmafmctl Device prefetch command with no options.
Prefetch pulls the complete file contents from home (unless the --metadata-only flag is used), so the file is designated as cached when it is completely prefetched. Prefetch of partially cached files caches the complete file.
Prefetch can be run in parallel on multiple filesets, although only one prefetch job can run on a fileset at a time.
If a file is in the process of getting prefetched, it is not evicted.
If parallel data transfer is configured, all gateways participate in the prefetch process.
If the filesystem unmounts during prefetch on the gateway, prefetch needs to be issued again.
Prefetch can be triggered on inactive filesets.
Directories are also prefetched to the cache if specified in the prefetch file. If you specify a directory in the prefetch file and if that directory is empty, the empty directory is prefetched to cache. If the directory contains files or sub-directories, you must specify the names of the files or sub-directories which you want to prefetch. If you do not specify names of individual files or sub-directories inside a directory, that directory is prefetched without its contents.
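Because a directory named in the prefetch file is prefetched without its contents, every file and sub-directory that should be cached has to be listed individually. The following minimal Python sketch shows one way to generate such a list. The function names are hypothetical, and the sketch assumes that the directory tree is readable from the node where the script runs.

```python
import os
from typing import List


def expand_directory(root: str) -> List[str]:
    """Return root plus every sub-directory and file below it, as absolute paths."""
    root = os.path.abspath(root)
    entries = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # in-place sort also makes the traversal order deterministic
        for name in dirnames:
            entries.append(os.path.join(dirpath, name))
        for name in sorted(filenames):
            entries.append(os.path.join(dirpath, name))
    return entries


def write_list_file(entries: List[str], list_path: str) -> None:
    """Write one fully qualified path per line."""
    with open(list_path, "w", encoding="utf-8") as out:
        for entry in entries:
            out.write(entry + "\n")
```

This plain one-path-per-line format is not suitable for filenames with special characters. For those, generate the list with a policy as described for the list file options.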
If you run the prefetch command with data or metadata options, statistics such as queued files, total files, failed files, and total data (in bytes) are displayed, as in the following example of command and system output:
mmafmctl: Performing prefetching of fileset: <fileset>
Queued  (Total)  Failed  TotalData (approx in Bytes)
0       (56324)  0       0
5       (56324)  2       1353559
56322   (56324)  2       14119335
Wed Oct 1 13:59:22.780 2014: [I] AFM: Prefetch recovery started for the file system gpfs1 fileset iw1.
mmafmctl: Performing prefetching of fileset: iw1
Wed Oct 1 13:59:23 EDT 2014: mmafmctl: [I] Performing prefetching of fileset: iw1
Wed Oct 1 14:00:59.986 2014: [I] AFM: Starting 'queue' operation for fileset 'iw1' in filesystem '/dev/gpfs1'.
Wed Oct 1 14:00:59.987 2014: [I] Command: tspcache /dev/gpfs1 1 iw1 0 257 4294967295 0 0 1393371
Wed Oct 1 14:01:17.912 2014: [I] Command: successful tspcache /dev/gpfs1 1 iw1 0 257 4294967295 0 0 1393371
Wed Oct 1 14:01:17.946 2014: [I] AFM: Prefetch recovery completed for the filesystem gpfs1 fileset iw1. error 0
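The progress lines in the command output above follow a regular pattern (queued count, total in parentheses, failed count, approximate bytes), so a script can extract them. The following Python sketch is illustrative only. The names PrefetchProgress, parse_progress_line, and last_progress are hypothetical, and the line layout is assumed from the sample output in this section rather than from a documented format.

```python
import re
from typing import NamedTuple, Optional


class PrefetchProgress(NamedTuple):
    queued: int
    total: int
    failed: int
    total_bytes: int


# Matches lines such as "56322 (56324) 2 14119335" (layout assumed from the sample).
_PROGRESS_RE = re.compile(r"^\s*(\d+)\s+\((\d+)\)\s+(\d+)\s+(\d+)\s*$")


def parse_progress_line(line: str) -> Optional[PrefetchProgress]:
    """Return the parsed counters, or None if the line is not a progress line."""
    match = _PROGRESS_RE.match(line)
    if match is None:
        return None
    return PrefetchProgress(*(int(group) for group in match.groups()))


def last_progress(output: str) -> Optional[PrefetchProgress]:
    """Return the final progress line found in the command output, if any."""
    progress = None
    for line in output.splitlines():
        parsed = parse_progress_line(line)
        if parsed is not None:
            progress = parsed
    return progress
```

A nonzero failed count in the last progress line is a reasonable trigger for checking .afm/.prefetchedfailed.list, provided that --enable-failed-file-list was used.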
- Metadata population using prefetch:
# mmafmctl fs1 getstate -j ro
Fileset Name  Fileset Target                     Cache State  Gateway Node  Queue Length  Queue numExec
------------  --------------                     -----------  ------------  ------------  -------------
ro            nfs://c26c3apv1/gpfs/homefs1/dir3  Active       c26c2apv2     0             7
List Policy:
RULE EXTERNAL LIST 'List' RULE 'List' LIST 'List' WHERE PATH_NAME LIKE '%'
Run the policy at home:
mmapplypolicy /gpfs/homefs1/dir3 -P px -f px.res -L 1 -N mount -I defer
Policy creates a file which should be manually edited to retain only the file names. Thereafter this file is used at the cache to populate metadata.
# mmafmctl fs1 prefetch -j ro --metadata-only --list-file=px.res.list.List
mmafmctl: Performing prefetching of fileset: ro
Queued  (Total)  Failed  TotalData (approx in Bytes)
0       (2)      0       0
100     (116)    5       1368093971
116     (116)    5       1368093971
prefetch successfully queued at the gateway
Prefetch end can be monitored by using this event:
Thu May 21 06:49:34.748 2015: [I] Calling User Exit Script prepop: event afmPrepopEnd, Async command prepop.sh.
The statistics of the last prefetch command can be viewed by running the following command:
# mmafmctl fs1 prefetch -j ro
Fileset Name  Async Read (Pending)  Async Read (Failed)  Async Read (Already Cached)  Async Read (Total)  Async Read (Data in Bytes)
------------  --------------------  -------------------  ---------------------------  ------------------  --------------------------
ro            0                     1                    0                            7                   0
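The manual edit that retains only the file names could be scripted. The following Python sketch assumes that each record in the policy-generated file has the layout seen in the policy output samples in this section, that is, leading numeric fields, then " -- ", then the path. The function name extract_paths is hypothetical, and the sketch does not decode names that were written with an ESCAPE clause.

```python
from typing import Iterable, List


def extract_paths(policy_lines: Iterable[str]) -> List[str]:
    """Keep only the path that follows the first ' -- ' separator on each line."""
    paths = []
    for line in policy_lines:
        line = line.rstrip("\n")
        # Assumed record layout: "<numeric fields> -- <path>".
        _, separator, path = line.partition(" -- ")
        if separator and path:
            paths.append(path)
    return paths
```

Lines that do not contain the separator are skipped, so the result should be checked against the expected number of files before it is used as a list file.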
- Prefetch of data by giving list of files from home:
# cat /listfile1
/gpfs/homefs1/dir3/file1
/gpfs/homefs1/dir3/dir1/file1
# mmafmctl fs1 prefetch -j ro --list-file=/listfile1
mmafmctl: Performing prefetching of fileset: ro
Queued  (Total)  Failed  TotalData (approx in Bytes)
0       (2)      0       0
2       (2)      0       1368093971
# mmafmctl fs1 prefetch -j ro
Fileset Name  Async Read (Pending)  Async Read (Failed)  Async Read (Already Cached)  Async Read (Total)  Async Read (Data in Bytes)
------------  --------------------  -------------------  ---------------------------  ------------------  --------------------------
ro            0                     0                    0                            2                   122880
- Prefetch of data using list file that is generated using policy at home:
The inode file is created using the following policy at home, and must be used as is, without hand-editing.
List Policy:
RULE EXTERNAL LIST 'List' RULE 'List' LIST 'List' WHERE PATH_NAME LIKE '%'
For files with special characters, path names must be encoded with ESCAPE %.
RULE EXTERNAL LIST 'List' ESCAPE '%' RULE 'List' LIST 'List' WHERE PATH_NAME LIKE '%'
Run the policy at home:
# mmapplypolicy /gpfs/homefs1/dir3 -P px -f px.res -L 1 -N mount -I defer
# cat /lfile2
113289 65538 0 -- /gpfs/homefs1/dir3/file2
113292 65538 0 -- /gpfs/homefs1/dir3/dir1/file2
# mmafmctl fs1 prefetch -j ro --list-file=/lfile2
mmafmctl: Performing prefetching of fileset: ro
Queued  (Total)  Failed  TotalData (approx in Bytes)
0       (2)      0       0
2       (2)      0       1368093971
- Prefetch using --home-fs-path option for a target with NSD protocol:
# mmafmctl fs1 getstate -j ro2
Fileset Name  Fileset Target               Cache State  Gateway Node  Queue Length  Queue numExec
------------  --------------               -----------  ------------  ------------  -------------
ro2           gpfs:///gpfs/remotefs1/dir3  Active       c26c4apv1     0             7
# cat /lfile2
113289 65538 0 -- /gpfs/homefs1/dir3/file2
113292 65538 0 -- /gpfs/homefs1/dir3/dir1/file2
# mmafmctl fs1 prefetch -j ro2 --list-file=/lfile2 --home-fs-path=/gpfs/homefs1/dir3
mmafmctl: Performing prefetching of fileset: ro2
Queued  (Total)  Failed  TotalData (approx in Bytes)
0       (2)      0       0
2       (2)      0       113292
# mmafmctl fs1 prefetch -j ro2
Fileset Name  Async Read (Pending)  Async Read (Failed)  Async Read (Already Cached)  Async Read (Total)  Async Read (Data in Bytes)
------------  --------------------  -------------------  ---------------------------  ------------------  --------------------------
ro2           0                     0                    0                            2                   122880
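The statistics that the prefetch command prints when it is run with no options can be read by a script, for example to check whether any reads are still pending. The following Python sketch is illustrative only. The function name parse_prefetch_stats and the dictionary keys are hypothetical, and the sketch assumes the six-column layout shown in the examples of this section and a fileset name without whitespace.

```python
from typing import Dict, Optional

# Hypothetical key names for the five numeric "Async Read" columns, in display order.
_FIELDS = ("pending", "failed", "already_cached", "total", "data_bytes")


def parse_prefetch_stats(output: str) -> Optional[Dict[str, object]]:
    """Return the last data row of the statistics table, or None if none is found."""
    for line in reversed(output.splitlines()):
        parts = line.split()
        # A data row is a fileset name followed by five non-negative integers.
        if len(parts) == 6 and all(part.isdigit() for part in parts[1:]):
            stats: Dict[str, object] = {"fileset": parts[0]}
            stats.update(zip(_FIELDS, (int(part) for part in parts[1:])))
            return stats
    return None
```

In a monitoring loop, a pending value of 0 could be treated as a hint that no queued reads remain. The afmPrepopEnd callback event remains the event-driven way to detect completion.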