Parallel data transfers

Parallel data transfer improves the AFM data transfer performance.

Large files are transferred from a cache cluster to a home cluster through the primary gateway. To speed up these transfers, you can configure the cache cluster to use all gateways that are defined in the cluster. When NFS is used for AFM data transfers, multiple NFS servers are recommended on the home cluster. Gateway-to-NFS server mappings, which are configured by using the mmafmconfig command, then allow transfers to run in parallel. Multiple gateways can share a single NFS server, which provides limited parallelism benefits such as split writes. However, a one-to-one mapping between a gateway and an NFS server delivers the best throughput. All NFS servers on the home cluster must export the home path by using the same parameters.

In a cache cluster that uses NFS for AFM data transfer, you can map each gateway to a specific NFS server in the home cluster. This mapping replaces the NFS server name specified in the afmTarget parameter. An export server map allows you to define multiple NFS servers and associate them with specific AFM gateway nodes. The mapping can be modified without changing the afmTarget parameter of a fileset. However, a fileset relink or a file system remount is required for the updated mapping to take effect.

Use the mmafmconfig command to define, display, update, and delete gateway-to-NFS server mappings.
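As a rough sketch of how a mapping change and the required fileset relink might be scripted: the script below only prints the commands by default, so it can be reviewed before anything is run. The fileset name, junction path, and new export map are hypothetical values, and the update subcommand is assumed to take the same arguments as add; check the command reference for your release before use.

```shell
#!/bin/sh
# Dry-run wrapper: prints each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical values, for illustration only.
FS=gpfs1
FILESET=sw1
JUNCTION=/gpfs/gpfs1/sw1                      # assumed junction path
MAP=mapping1
NEW_MAP=js22n01/hs22n19,js22n02/hs22n18       # assumed new server/gateway pairs

update_and_relink() {
    # Assumption: "update" accepts the same arguments as "add".
    run mmafmconfig update "$MAP" --export-map "$NEW_MAP"
    run mmafmconfig show
    # The updated map takes effect only after a relink (or a remount).
    run mmunlinkfileset "$FS" "$FILESET"
    run mmlinkfileset "$FS" "$FILESET" -J "$JUNCTION"
}

update_and_relink
```

Setting DRY_RUN=0 would run the commands instead of printing them. That should be done only on a cluster where these names are valid.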

To define multiple NFS servers for an afmTarget parameter and use parallel data transfers:
  1. Define a mapping by using the mmafmconfig command.
  2. Use the mapping as the afmTarget parameter for one or more filesets.
  3. Update the parallel read and write thresholds and chunk sizes as required.

Creating AFM filesets by using NFS gateway mapping

The following example demonstrates how to configure gateway node mapping for an NFS target and then create AFM filesets by using the defined mappings.

  • Four AFM cache gateway nodes are available:
    • hs22n18
    • hs22n19
    • hs22n20
    • hs22n21
  • Two home NFS servers are used:
    • js22n01 (192.168.200.11)
    • js22n02 (192.168.200.12)

Gateway nodes are mapped to specific NFS servers to enable controlled parallel data transfer.
  1. Define gateway node mappings. Create mappings that associate cache gateway nodes with home NFS servers.
    1. Create mapping1.
      mmafmconfig add mapping1 --export-map js22n01/hs22n18,js22n02/hs22n19
      A sample output is as follows:
      mmafmconfig: Command successfully completed
      mmafmconfig: Propagating the cluster configuration data to all affected nodes.
      This is an asynchronous process.
    2. Create mapping2.
      mmafmconfig add mapping2 --export-map js22n02/hs22n20,js22n01/hs22n21
      A sample output is as follows:
      mmafmconfig: Command successfully completed
      mmafmconfig: Propagating the cluster configuration data to all affected nodes.
      This is an asynchronous process.
  2. Verify the mapping configuration.
    mmafmconfig show
    A sample output is as follows:
    Map name:             mapping1
    Export server map:    192.168.200.12/hs22n19.gpfs.net,
                          192.168.200.11/hs22n18.gpfs.net
    
    Map name:             mapping2
    Export server map:    192.168.200.12/hs22n20.gpfs.net,
                          192.168.200.11/hs22n21.gpfs.net
  3. Create AFM filesets by using the mappings.
    1. Create a single-writer (SW) fileset.
      mmcrfileset gpfs1 sw1 --inode-space new \
      -p afmMode=sw,afmTarget=nfs://mapping1/gpfs/gpfs2/swhome
    2. Create a read-only (RO) fileset.
      mmcrfileset gpfs1 ro1 --inode-space new \
      -p afmMode=ro,afmTarget=nfs://mapping2/gpfs/gpfs2/swhome

Parallel read and write

Parallel reads and writes are effective only for files larger than the configured parallel threshold. The thresholds are defined by the following parameters:
  • afmParallelWriteThreshold
  • afmParallelReadThreshold
Parallel transfer applies to all files except in the following cases:
  • Reads on sparse files
  • Files with partial file caching enabled

In these cases, read operations are handled only by the primary gateway and are not split into chunks.

The size of each transfer chunk is configured by using the following parameters:

  • afmParallelWriteChunkSize
  • afmParallelReadChunkSize

These parameters control how data is split for parallel transfers across participating gateways.
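The interaction between a threshold and a chunk size can be illustrated with some simple arithmetic. This is only a sketch: the numbers are hypothetical, every size is expressed in the same unit (MiB) for simplicity, and how AFM schedules chunks across gateways is internal and not modeled here. Check the parameter reference for the units that each setting actually uses.

```shell
#!/bin/sh
# Illustrative arithmetic only; not AFM's actual scheduling logic.
# Assumption of this sketch: all sizes are in MiB.

# uses_parallel FILE_SIZE THRESHOLD -> prints "yes" or "no"
# Parallel transfer is considered only for files larger than the threshold.
uses_parallel() {
    if [ "$1" -gt "$2" ]; then echo yes; else echo no; fi
}

# chunk_count FILE_SIZE CHUNK_SIZE -> number of chunks, rounded up
chunk_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

THRESHOLD=1024   # hypothetical threshold value
CHUNK=128        # hypothetical chunk size

for size in 512 1024 5000; do
    if [ "$(uses_parallel "$size" "$THRESHOLD")" = yes ]; then
        echo "$size MiB: parallel transfer, $(chunk_count "$size" "$CHUNK") chunks"
    else
        echo "$size MiB: normal (non-parallel) transfer"
    fi
done
```

A file that is exactly the threshold size is treated as non-parallel here, because the text states that only files larger than the threshold qualify.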

AFM gateway node mapping

The following rules apply to AFM gateway node mapping:
NSD protocol
When the NSD protocol is used and a fileset is created without a map, all gateway nodes participate in parallel data transfer.
NFS protocol
When the NFS protocol is used and multiple gateway nodes are mapped to the same NFS server:
  • Only one gateway node performs read operations.
  • Write operations are distributed across all mapped gateway nodes.
Gateway node mapping limitation
A gateway node can be mapped to only one NFS server.
Effect of map changes
Mapping changes take effect only after one of the following actions:
  • The fileset is relinked.
  • The file system is remounted.
Missing or mismatched mapping
If a map is not specified, or if the mapping does not match, parallel data transfer is not used. Normal data transfer is used instead.
Removing gateway designation
The gateway role can be removed from a node only if the node is not defined in any mapping configuration.
Note: If an AFM home cluster contains mixed architectures (x86 and PPC), parallel data transfer works only for nodes that belong to a single architecture. The architecture that initiates the data transfer first determines which node group is used.
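
Some of these rules can be checked before a map is defined. The following sketch uses two hypothetical helper functions that are not part of any product CLI. They inspect an export map string in the server/gateway,server/gateway form used earlier. The first rejects a map in which one gateway is paired with more than one entry, and the second lists NFS servers that are shared by several gateways, where only one gateway would perform reads. It checks a single map string only; whether the one-server limit also applies across separate maps is not covered here.

```shell
#!/bin/sh
# Hypothetical pre-flight checks on an export map string.

# check_export_map MAP -> returns 0 if every gateway appears once, 1 otherwise.
check_export_map() {
    dup=$(printf '%s\n' "$1" | tr ',' '\n' | cut -d/ -f2 | sort | uniq -d)
    if [ -n "$dup" ]; then
        echo "gateway mapped more than once: $dup" >&2
        return 1
    fi
    return 0
}

# shared_servers MAP -> prints NFS servers that several gateways map to.
shared_servers() {
    printf '%s\n' "$1" | tr ',' '\n' | cut -d/ -f1 | sort | uniq -d
}

check_export_map "js22n01/hs22n18,js22n02/hs22n19" && echo "map is valid"
check_export_map "js22n01/hs22n18,js22n02/hs22n18" || echo "map rejected"
shared_servers "js22n01/hs22n18,js22n01/hs22n19"
```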

Parallel data transfer and multiple remote mounts

Parallel data transfer can be combined with the multiple remote mounts feature to improve data transfer performance between an AFM cache cluster and an AFM home cluster.

  • Both features use the same AFM gateway node mapping defined by the mmafmconfig command.
  • The features operate independently.
  • You can enable either or both features depending on workload requirements.