Data locality based copy

Synopsis

localityCopy Device {-s {[Fileset]:Snapshot | srcDir | filePath}}
                    {-t targetDir} [-l | -b] [-f] [-r] [-v]
                    [-a | -N {Node[,Node...] | NodeFile | NodeClass}]

Parameters

Device
The device name of the file system to which the disks belong. File system names need not be fully-qualified. fs0 is as acceptable as /dev/fs0. This must be the first parameter.
-s {[Fileset]:Snapshot | srcDir | filePath}
Snapshot is the snapshot name. If :Snapshot is specified, the global snapshot named Snapshot on Device is used. If more than one snapshot matches :Snapshot or Snapshot, the command fails. For a fileset snapshot, ensure that the fileset is linked. srcDir is the source directory to copy. The directory must exist in Device. If the directory is the junction path of a fileset, the fileset must be linked before you run the script. filePath is the path of the file to copy.
Note: srcDir and filePath must be absolute paths.
-t targetDir
Specifies the target directory to which the files from the snapshot or the directory will be copied. targetDir must be absolute and must exist on the node that is running the command.
-l
Considers only the locality if more than one node is involved. This might make some nodes busier than others. If active application jobs on the cluster need network bandwidth, option -l makes the data copy consume as little network bandwidth as possible. When multiple nodes are specified with option -N, option -l might run the copy on only a limited number of those nodes, so the data copy can take longer to finish.
-b
Considers the locality if more than one node is involved and distributes the copy tasks among all involved nodes.
Note: The copy tasks are distributed at the file level (one file per copy task). Options -l and -b are mutually exclusive. If neither -l nor -b is specified, -l is the default.
-f
If the file to be copied already exists under targetDir, it is overwritten if option -f is specified. Otherwise, the file is skipped.
-r
When option -s srcDir is specified, option -r copies the files recursively. For -s [Fileset]:Snapshot, option -r is always in effect.
-v
Displays verbose information.
-a
All nodes in the cluster are involved in copying tasks.
-N {Node[,Node...] | NodeFile | NodeClass}
Specifies the set of nodes to be involved in copying tasks. If option -N is not specified, -a is the default.
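
The parameters above can be combined as in the following sketch. It only prints the command lines (a dry run) so that it can be tried without a cluster. The names fs0, snap1, fset1, node1, node2, /gpfs/fs0/data, and /mnt/backup are hypothetical placeholders, the dry_run helper is not part of the script, and the position of -v at the end of the command line is an assumption.

```shell
#!/bin/sh
# Dry-run helper (illustrative, not part of localityCopy):
# prints the command line instead of executing it.
dry_run() {
    echo "$@"
}

# Global snapshot "snap1" of fs0. With no -N, all nodes take part (-a),
# and with neither -l nor -b, -l is the default.
dry_run localityCopy fs0 -s :snap1 -t /mnt/backup

# Snapshot "snap1" of fileset "fset1": distribute copy tasks across two
# named nodes (-b, -N) and overwrite files that already exist (-f).
dry_run localityCopy fs0 -s fset1:snap1 -t /mnt/backup -b -f -N node1,node2

# Directory source, copied recursively (-r) with verbose output (-v).
dry_run localityCopy fs0 -s /gpfs/fs0/data -t /mnt/backup -r -v
```

To run a command for real, invoke localityCopy directly with the same arguments on a node where targetDir exists.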

Notes

  1. If the file system mount point contains special characters other than +, -, and _, it is not supported by this script.
  2. If a file path contains a special character, such as a blank or a line break, the file is not copied and a warning is displayed.
  3. When option -a or -N is specified and the -t targetDir is on external NFS or another IBM Storage Scale file system, that file system must be mounted.
  4. Only regular data files are copied. Links and special files are not copied.
  5. If a file is not copied, the file is displayed and the copy is not retried in the same invocation.
  6. You must specify option -s with a snapshot. For a directory, the file list is not rescanned to detect newly created files or subdirectories.
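
Some of these restrictions can be checked before a copy is started. The following is a minimal pre-flight sketch, not part of the script itself. The function names are illustrative, and the checks only approximate the rules above (absolute paths, blanks or line breaks in paths, and non-regular files).

```shell
#!/bin/sh
# Pre-flight sketch. Function names are illustrative, not part of localityCopy.

# A literal line break, used in the patterns below.
NL='
'

# Succeeds only if the path is absolute (srcDir, filePath, targetDir).
is_abs_path() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Succeeds if the path contains a blank or a line break (Note 2).
has_blank_or_newline() {
    case "$1" in
        *" "*|*"$NL"*) return 0 ;;
        *)             return 1 ;;
    esac
}

# Prints entries under a source directory that would likely not be copied.
list_skipped() {
    # Links and special files (Note 4). Directories are not listed.
    find "$1" ! -type f ! -type d
    # Regular files whose path contains a blank or a line break (Note 2).
    find "$1" -type f \( -path '* *' -o -path "*$NL*" \)
}
```

For example, running list_skipped on a source directory before the copy gives a rough preview of the files that the warnings in Note 2 and the restriction in Note 4 would affect.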