Parallel data transfers
Parallel data transfer improves the AFM data transfer performance.
To help the primary gateway exchange large files with the home cluster, a cache cluster can be configured to use all the gateways that are defined in the cluster. When NFS is used for AFM data transfers, multiple NFS servers are required at the home cluster, and all of them must export the home path with the same parameters.
In a cache cluster that uses NFS for AFM data transfer, each gateway node can be mapped to a specific NFS server at home. A map replaces the NFS server name in the afmTarget parameter. An export server map defines more than one NFS server and maps those NFS servers to specific AFM gateways. A map can be changed without modifying the afmTarget parameter of a fileset, but the change takes effect only after a fileset relink or a file system remount. Use the mmafmconfig command to define, display, delete, and update mappings.
To set up parallel data transfers:
- Define a mapping.
- Use the mapping as the afmTarget parameter for one or more filesets.
- Update the parallel read and write thresholds and chunk sizes, as required.
The following example shows a mapping for an NFS target. Four cache gateway nodes, hs22n18, hs22n19, hs22n20, and hs22n21, are mapped to two home NFS servers, js22n01 and js22n02 (192.168.200.11 and 192.168.200.12), and filesets are then created by using these mappings.
Define the mapping:
# mmafmconfig add mapping1 --export-map js22n01/hs22n18,js22n02/hs22n19
mmafmconfig: Command successfully completed
mmafmconfig: Propagating the cluster configuration data to all affected nodes. This is an asynchronous process.
The syntax followed here is:
mmafmconfig {add | update} MapName --export-map ExportServerMap
# mmafmconfig add mapping2 --export-map js22n02/hs22n20,js22n01/hs22n21
mmafmconfig: Command successfully completed
mmafmconfig: Propagating the cluster configuration data to all affected nodes. This is an asynchronous process.
# mmafmconfig show
Map name: mapping1
Export server map: 192.168.200.12/hs22n19.gpfs.net,192.168.200.11/hs22n18.gpfs.net
Map name: mapping2
Export server map: 192.168.200.11/hs22n20.gpfs.net,192.168.200.12/hs22n21.gpfs.net
Create filesets by using these mappings:
# mmcrfileset gpfs1 sw1 --inode-space new -p afmMode=sw,afmTarget=nfs://mapping1/gpfs/gpfs2/swhome
# mmcrfileset gpfs1 ro1 --inode-space new -p afmMode=ro,afmTarget=nfs://mapping2/gpfs/gpfs2/swhome
The syntax followed here is:
mmcrfileset <FS> <fset_name> -p afmMode=<AFM Mode>,afmTarget=<protocol>://<Mapping>/<remoteFS_Path>/<Target> --inode-space new
Parallel reads and writes take effect only for files that are larger than the parallel threshold, which is set by using the afmParallelWriteThreshold and afmParallelReadThreshold parameters. This applies to all types of files except for reads on sparse files and on files with partial file caching enabled. Those reads are served only by the primary gateway, without splitting.
Use the afmParallelWriteChunkSize and afmParallelReadChunkSize parameters to configure the size of each chunk.
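The interaction between a threshold and a chunk size can be illustrated with a little arithmetic. The following bash function is only a sketch of a simplified model, not the AFM implementation. It assumes that a file at or below the threshold is transferred in one piece and that a larger file is divided into chunks of the configured size. The function name chunk_count and the units (file size and chunk size in bytes, threshold in MiB) are assumptions made for this illustration; check the parameter descriptions for your release before relying on them.

```bash
#!/usr/bin/env bash
# Simplified model (assumption): a transfer is split only when the file is
# larger than the threshold, and then into ceil(size / chunk) pieces.
# Assumed units: file size and chunk size in bytes, threshold in MiB.

# Usage: chunk_count FILE_BYTES THRESHOLD_MIB CHUNK_BYTES
chunk_count() {
    local file_bytes=$1 threshold_mib=$2 chunk_bytes=$3
    local threshold_bytes=$(( threshold_mib * 1024 * 1024 ))

    if (( file_bytes <= threshold_bytes )); then
        # At or below the threshold: no split, a single transfer.
        echo 1
    else
        # Integer ceiling division.
        echo $(( (file_bytes + chunk_bytes - 1) / chunk_bytes ))
    fi
}

# Example: 3 GiB file, 1024 MiB threshold, 128 MiB chunks.
chunk_count $(( 3 * 1024 * 1024 * 1024 )) 1024 $(( 128 * 1024 * 1024 ))   # prints 24
```

Under this model, a 512 MiB file with the same settings is not split at all, which is why lowering the threshold matters more than lowering the chunk size when most files are of moderate size.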
- While using the native NSD protocol, if a fileset is created without any mapping, all gateway nodes are used for parallel data transfer.
- While using the NFS protocol, if more than one gateway node is mapped to the same NFS server, only one of them performs a read task. However, a write task is split among all the gateway nodes.
- One gateway node cannot be mapped to more than one NFS server.
- Changes in the active mapping take effect after a fileset relink or a file system remount.
- If a mapping is not specified or does not match, parallel data transfer is not used and data is transferred by the normal data transfer function.
- Gateway designation can be removed from a node only if that node is not defined in any mapping.
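The rule that one gateway node cannot be mapped to more than one NFS server can be checked before a map string is passed to mmafmconfig. The following bash function is a minimal sketch of such a pre-check. The name validate_export_map is hypothetical and not part of any product command. It assumes the server/gateway[,server/gateway...] form used in the example above and does not verify that the nodes exist or are designated as gateways.

```bash
#!/usr/bin/env bash
# Hypothetical helper: checks an export server map string locally.
# Returns 0 if every entry has the form server/gateway and no gateway node
# appears more than once; otherwise prints a message to stderr and returns 1.
validate_export_map() {
    local map=$1 entry server gateway
    local seen=" "            # space-delimited list of gateways seen so far
    local -a entries

    if [[ -z $map ]]; then
        echo "empty map" >&2
        return 1
    fi

    IFS=',' read -r -a entries <<< "$map"
    for entry in "${entries[@]}"; do
        server=${entry%%/*}   # text before the first slash
        gateway=${entry#*/}   # text after the first slash
        # Require exactly one slash with a non-empty name on each side.
        if [[ $entry != */* || -z $server || -z $gateway || $gateway == */* ]]; then
            echo "malformed entry: '$entry'" >&2
            return 1
        fi
        # A gateway may be listed only once, so it maps to a single server.
        if [[ $seen == *" $gateway "* ]]; then
            echo "gateway $gateway is mapped more than once" >&2
            return 1
        fi
        seen+="$gateway "
    done
    return 0
}
```

For example, validate_export_map "js22n01/hs22n18,js22n02/hs22n19" succeeds, while "js22n01/hs22n18,js22n02/hs22n18" is rejected because hs22n18 appears twice. Two gateways that share one NFS server are accepted, in line with the note on read and write tasks above.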
This feature can be combined with the Parallel data transfer using multiple remote mounts feature to obtain better data transfer performance between an AFM cache and an AFM home. Both features use the same AFM gateway node mapping that is defined by using the mmafmconfig command. The features are independent of each other, and you can enable either or both depending on what suits the workload.