bstage out

Stages out data files for jobs with data requirements. The bstage out command copies files or folders from the job current working directory to the data management cache (or creates symbolic links to them in the cache), then requests a transfer job to copy the file or folder to the destination host.

Synopsis

bstage out -src file_path/file_name [-dst [host_name:]path[/file_name]] [-link]
bstage out -src folder_path/ [-dst [host_name:]path[/file_name]] [-link]
bstage out -src file_path/file_name -tag tag_name [-link | -g user_group_name]

Description

Copy or symbolically link a file or folder from the job current working directory to the data manager staging cache.

The bstage out command uses the value of the LSB_DATA_CACHE_TOP environment variable to find the staging area.

By default (if you specify the -src option, but not the -tag option), a transfer job is submitted by the LSF data manager to stage the file out from the staging area to the remote destination specified in the -dst option. If you specify the -tag option, the file or folder is copied only to the tag folder in the staging area and no transfer job is submitted.

Note: With the -src option, the transfer job that is submitted to LSF runs asynchronously even after the command returns. When you are staging out different files or folders, use a different destination for each one since the order in which the files are transferred is not guaranteed.
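
The advice in the note can be sketched as a dry run. The helper below is hypothetical (it is not part of LSF) and only prints one bstage out command per file, each with its own destination, by using the -src and -dst options from the synopsis. The host name hostB, the destination root, and the file names are assumptions made for illustration.

```sh
#!/bin/sh
# Dry-run sketch: print one "bstage out" command per source file instead of
# running it. Each file gets a distinct destination under the same root, so
# the outcome does not depend on the order of the asynchronous transfers.
print_stage_out_cmds() {
    dest_root="$1"   # hypothetical destination, for example hostB:/scratch/user1
    shift
    for src in "$@"; do
        printf 'bstage out -src %s -dst %s/%s\n' "$src" "$dest_root" "$src"
    done
}

print_stage_out_cmds hostB:/scratch/user1 a.dat b.dat
```

Inside a job, the printed lines could be reviewed first and then run as real commands.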

Options

-src file_path/file_name

Required. Path to the file to be copied from the job execution environment. Relative paths are resolved relative to the job current working directory.

If the -tag option is not specified, the path must be to a file. LSF contacts the LSF data manager to determine whether the file exists in the STAGING_AREA/stgout directory for the job. If the file does not exist, it is first copied to the job stgout directory in the cache. The LSF data manager then asks LSF to submit a transfer job that stages the file out to the destination.

If the path contains symbolic links, LSF data manager uses the symbolic link name and copies the contents of the file to the staging area. For example, if the path is /tmp/linkdir1/outfile1, and linkdir1 is a symbolic link to /home/user1, the contents of the file outfile1 are copied to the cache in the appropriate stage out or tag folder under the relative path tmp/linkdir1/outfile1/, not tmp/home/user1/outfile1/.

-src folder_path/

Required. Path to the folder to be copied from the job execution environment. Relative paths are resolved relative to the job current working directory.

The asterisk (*) wildcard character -src folder_path/* is not supported for the bstage out command.

LSF contacts the LSF data manager to determine whether the folder exists in the STAGING_AREA/stgout directory for the job. If the folder does not exist, it is first copied to the job stgout directory in the cache. The LSF data manager then asks LSF to submit a transfer job that stages the folder out to the destination.

If the path contains symbolic links, LSF data manager uses the symbolic link name and copies the contents of the folder to the staging area. For example, if the path is /tmp/linkdir1/, and linkdir1 is a symbolic link to /home/user1, the contents of the folder linkdir1 are copied to the cache in the appropriate stage out or tag folder under the relative path tmp/linkdir1/, not tmp/home/user1/outfile1/.

-dst [host_name:]path[/file_name]

Path to the final destination of the transfer job that copies the file out of the staging area. If you do not specify the -dst option, the submission host and the directory that is specified by the LSB_OUTDIR environment variable are assumed to be the root, and the path that is provided to the -src option is appended to this root. The default host_name is the submission host.

The following table shows the mapping of the -dst argument and ultimate destination sent to the transfer tool command to stage out the job:

Command                        Transfer job destination
-dst not specified             $LSB_SUB_HOST:$LSB_OUTDIR
-dst relative_path             $LSB_SUB_HOST:$LSB_OUTDIR/relative_path
-dst absolute_path             $LSB_SUB_HOST:absolute_path
-dst host_name:absolute_path   host_name:absolute_path

The argument to the -dst option accepts both relative and absolute paths, both with and without host names, but all of these arguments are converted into an absolute host_name:path pair according to the table. You cannot use path descriptors that contain special names ~/, ./, and ../.
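
The mapping can be illustrated with a small shell function. This is a sketch of the documented table only, not the code that LSF runs. The function name resolve_dst is hypothetical, and two behaviors are assumptions: the special names are rejected anywhere in the path, and a host name combined with a relative path is refused because the table does not list that case.

```sh
#!/bin/sh
# Hypothetical helper that mirrors the -dst mapping table.
# Reads LSB_SUB_HOST and LSB_OUTDIR from the environment.
resolve_dst() {
    dst="$1"
    host="$LSB_SUB_HOST"
    path="$dst"
    explicit_host=no
    # Treat text before the first colon as a host name, unless there is no
    # colon, the prefix is empty, or the prefix contains a slash.
    case "${dst%%:*}" in
        "$dst"|""|*/*) ;;
        *) host="${dst%%:*}"; path="${dst#*:}"; explicit_host=yes ;;
    esac
    # Reject the special names ~/, ./, and ../ (assumed: anywhere in the path).
    case "/$path" in
        */'~'/*|*/./*|*/../*)
            echo "unsupported special name in: $dst" >&2
            return 1 ;;
    esac
    case "$path" in
        /*) echo "$host:$path" ;;
        *)
            if [ "$explicit_host" = yes ]; then
                echo "host name with a relative path is not in the table: $dst" >&2
                return 1
            fi
            if [ -z "$path" ]; then
                echo "$host:$LSB_OUTDIR"
            else
                echo "$host:$LSB_OUTDIR/$path"
            fi ;;
    esac
}
```

For example, with LSB_SUB_HOST=hostA and LSB_OUTDIR=/home/user1/out, calling resolve_dst results/run1 prints hostA:/home/user1/out/results/run1.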

If the folders in the destination location do not exist, the success or failure of the transfer job depends on the transfer tool configured. For example, the default tool (the scp command) does not create destination folders, but other tools (such as the rsync command) do.

-link

Create symbolic links from the requested source files to the staging area cache location instead of copying them. Use the -link option to avoid unnecessary file copying between the execution host and the staging area. The staging area must be directly mounted on the job execution host to create the link.
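
A job script might decide at run time whether to pass the -link option. The sketch below uses a hypothetical helper, link_flag, that prints -link only when the directory named by LSB_DATA_CACHE_TOP is visible on the current host. A visible directory is only a rough proxy for a direct mount and is an assumption of this sketch, not a check that LSF documents.

```sh
#!/bin/sh
# Hypothetical helper: print "-link" when the staging area directory is
# visible on this host, and print nothing otherwise.
link_flag() {
    if [ -n "${LSB_DATA_CACHE_TOP:-}" ] && [ -d "$LSB_DATA_CACHE_TOP" ]; then
        echo "-link"
    fi
}

# Possible use inside a job (file and destination are placeholders):
#   bstage out -src out.dat -dst /scratch/user1/out.dat $(link_flag)
```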

Important: You must ensure that the source file is not cleaned up at the end of the job. Otherwise, the symbolic link becomes stale before the file is staged out or used by a subsequent job.

-tag tag_name

Copy the file to the staging area tag directory associated with tag_name. LSF creates the directory if necessary. The LSF data manager does not submit a job to transfer out the file, or create a record for it in the cache. You cannot use the -tag option with the -dst or -link option.

LSF data manager associates the required files to an arbitrary name you choose, and the LSF data manager reports the existence of that tag if you query it with the bdata tags command.

Valid tag names can contain only alphanumeric characters ([A-Z|a-z|0-9]), and a period (.), underscore (_), and dash (-). The tag name cannot contain the special operating system names for parent directory (../), current directory (./), or user home directory (~/). Tag names cannot contain spaces. Tag names cannot begin with a dash (-).
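
These naming rules can be checked in a script before bstage out is called. The function below is a hypothetical helper, not an LSF command. It assumes that an empty name is invalid (the rules above do not say) and that the character ranges are evaluated in the C locale.

```sh
#!/bin/sh
# Hypothetical helper: return 0 if the tag name follows the documented rules.
valid_tag_name() {
    case "$1" in
        ""|-*) return 1 ;;                   # empty (assumed invalid) or leading dash
        *[!A-Za-z0-9._-]*) return 1 ;;       # any character outside the allowed set
    esac
    return 0
}
```

Because slashes, tildes, and spaces are outside the allowed character set, names that contain ../, ./, ~/, or spaces fail the second pattern without a separate check.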

Use the bdata tags clean command to remove tags.

You must be the owner of the tag folder to copy files into it.

Important: You are responsible for the name space of your tags. LSF does not check whether the tag is valid. Use strings like the job ID, array index, and cluster name as part of your tag names to make sure that your tag is unique.
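
One way to follow this advice is to build the tag name from the cluster name, job ID, and array index. In the sketch below, make_tag and the CLUSTER_NAME variable are hypothetical. LSB_JOBID and LSB_JOBINDEX are the job ID and array index variables in the job environment, and the fallback values are placeholders for running the sketch outside a job.

```sh
#!/bin/sh
# Hypothetical helper: join cluster name, job ID, and array index into a tag
# name that uses only characters allowed in tag names.
make_tag() {
    # $1 = cluster name, $2 = job ID, $3 = array index
    printf '%s_%s_%s\n' "$1" "$2" "$3"
}

tag=$(make_tag "${CLUSTER_NAME:-cluster1}" "${LSB_JOBID:-0}" "${LSB_JOBINDEX:-0}")
echo "$tag"

# Possible use inside a job (results.dat is a placeholder):
#   bstage out -src results.dat -tag "$tag"
```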

-g user_group_name

By default, when a job stages out files to a tag, the tag directory is accessible only by the user who submitted the job. When the CACHE_ACCESS_CONTROL = Y parameter is configured in the lsf.datamanager file, the -g option changes the group that is associated with the tag. The user_group_name argument specifies the user group name to be associated with the tag. The permissions on the tag directory and its contents are set so that the specified group can access the files. You can also use the following command to change the group that is associated with a tag:

bdata chgrp -g group_name -tag tag_name