Data specification file format
If you need to transfer a large number of files as a job data requirement, use a data specification file to provide a list of files that are required by the job.
Use the following rules to create a data specification file:
- The first line of the file that is not blank or a comment must be the string #@dataspec.
- Blank lines and commented lines (beginning with #) preceding and following the #@dataspec string are ignored.
- The host_name:file_path pair specifies the location of the required data file.
- The host name must be a full host name. If no host name is specified, the submission host is used.
- IP addresses are not supported in lieu of host names.
- The file_path must be an absolute path (not a relative path) on the host.
- The file_path can contain only alphanumeric characters (A-Z, a-z, and 0-9) and the following special characters: period (.), underscore (_), and dash (-). Spaces and other special characters are not supported except when a wildcard is used. The path to the data specification file itself must also conform to this convention, except for wildcard characters: all paths resolved due to a wildcard are interpreted as files to transmit.
- Symbolic links that are specified directly (rather than found by resolving a wildcard character) are not permitted.
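The rules above can be sketched as a small pre-submission check. This is an illustrative sketch only, not part of LSF: the function names (parse_entry, parse_dataspec), the sample host names, and the sample paths are all hypothetical. It assumes that entries may appear only after the #@dataspec line, and it does not handle wildcard paths or check for symbolic links.

```python
import re

# Absolute path using only A-Z, a-z, 0-9, period, underscore, dash, and slash.
ALLOWED_PATH = re.compile(r"/[A-Za-z0-9._/-]*")
# Dotted-quad form, used to reject IPv4 addresses given in place of a host name.
IPV4 = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def parse_entry(entry, submission_host):
    """Split a [host_name:]file_path entry and check it against the rules."""
    host, sep, path = entry.partition(":")
    if not sep:
        # No host given: the submission host is used.
        host, path = submission_host, entry
    if not host:
        raise ValueError(f"empty host name: {entry!r}")
    if IPV4.fullmatch(host):
        raise ValueError(f"IP addresses are not supported: {entry!r}")
    if not ALLOWED_PATH.fullmatch(path):
        raise ValueError(f"path is not absolute or has unsupported characters: {entry!r}")
    return host, path


def parse_dataspec(text, submission_host):
    """Return (host, path) pairs from the text of a data specification file."""
    seen_marker = False
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines are ignored
        if line == "#@dataspec":
            seen_marker = True
            continue
        if line.startswith("#"):
            continue  # comment lines are ignored
        if not seen_marker:
            # Assumption of this sketch: entries must follow the marker.
            raise ValueError("entry found before the #@dataspec line")
        entries.append(parse_entry(line, submission_host))
    if not seen_marker:
        raise ValueError("missing #@dataspec line")
    return entries


# Hypothetical file contents; host names and paths are made up.
SAMPLE = """\
# input files for the nightly run
#@dataspec
hostA.example.com:/proj/std/model.bin
/home/user1/inputs/config-01.txt
"""
```

Calling parse_dataspec(SAMPLE, "submit.example.com") returns the two entries, with the second one assigned to the submission host because it names no host.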
The following are rules for wildcard characters in the file paths that are used in the data specification file:
- Ending the file path with a slash character (/) transfers all of the files in the directory and all of its subdirectories.
- Ending the file path with a slash and an asterisk (/*) transfers all files in the immediate directory, without recursion into subdirectories.
- The asterisk (*) wildcard is only permitted after a slash (/) at the end of the file path. When you use the asterisk character at the end of the path, the data requirements must be in quotation marks.
- If the data requirement is accessible from the submission host, bsub checks to determine whether it is a directory. If it is a directory on a remote host, bsub rejects the job.
- The %I wildcard resolves to an array index in a job array submission and can be used anywhere in the path. If the %I wildcard appears in the data specification file, but the job is not an array, it is interpreted as 0.
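The wildcard rules above can be illustrated with a short sketch. It is not LSF code: the function names (resolve_index, classify_path), the mode labels, and the sample paths are hypothetical, and the sketch only classifies a path string without touching the file system.

```python
def resolve_index(path, array_index=None):
    """Replace every %I with the array index, or with 0 for a non-array job."""
    index = 0 if array_index is None else array_index
    return path.replace("%I", str(index))


def classify_path(path):
    """Return (base_path, mode), where mode is 'file', 'directory', or 'recursive'.

    'directory' means the files directly inside base_path (trailing /*).
    'recursive' means base_path and all of its subdirectories (trailing /).
    """
    # The asterisk is only permitted as the last character, right after a slash.
    if "*" in path[:-1] or (path.endswith("*") and not path.endswith("/*")):
        raise ValueError(f"asterisk is only permitted as a trailing /*: {path!r}")
    if path.endswith("/*"):
        return path[:-2] or "/", "directory"
    if path.endswith("/"):
        return path[:-1] or "/", "recursive"
    return path, "file"
```

For example, classify_path(resolve_index("/data/run%I/", 3)) yields ("/data/run3", "recursive"), while resolve_index("/data/run%I/in.dat") with no array index yields "/data/run0/in.dat".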