cache

Queries the LSF data management cache.

Options

File-based query

bdata cache [-w | -l] [-u all | -u user_name] [-g all | -g user_group_name] [-dmd cluster_name] "[host_name:/]abs_file_path"
bdata cache [-w | -l] [-u all | -u user_name] [-g all | -g user_group_name] [-dmd cluster_name] "[host_name:/]abs_folder_path/[*]"

Job-based query

bdata cache [-dmd cluster_name] [-w | -l] job_ID[@cluster_name]

Description

File-based and folder-based query

Use the bdata cache abs_file_path or bdata cache "abs_folder_path/[*]" command to determine whether the files or folders that are required for your job are already staged in to the cache.

LSF data manager administrators can see information about staged-in files in the cache for all users (with the -u all option) or for a specific user (with the -u user_name option). The CACHE_PERMISSIONS parameter in the lsf.datamanager file determines which cache is accessible to non-administrator users.
CACHE_PERMISSIONS=user
Each user has a cache in the staging area. Ordinary users can request information only about their own cached files. user is the default.
CACHE_PERMISSIONS=all
The staging area is a single cache. All users can see all files in the cache.
CACHE_PERMISSIONS=group
Each UNIX group has a cache in the staging area. By default, only users that belong to the same primary group can see the files for their group.

If CACHE_PERMISSIONS=group is specified, the -g option shows the cached files that belong to the specified user group.
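
The visibility rules above can be summarized as a small decision function. This is only an illustrative model of the documented default behavior, not LSF code: the function name, its parameters, and the idea of passing the owner and group explicitly are assumptions made for the sketch.

```python
def can_query_cache(mode: str,
                    requester: str,
                    requester_primary_group: str,
                    owner: str,
                    owner_group: str,
                    is_admin: bool = False) -> bool:
    """Model which cached files a user may query (default behavior only).

    mode is the CACHE_PERMISSIONS value: "user", "all", or "group".
    All other names are hypothetical and exist only for this sketch.
    """
    # Administrators can see files for all users; with "all" the staging
    # area is a single cache that every user can see.
    if is_admin or mode == "all":
        return True
    if mode == "user":
        # Ordinary users can query only their own cached files.
        return requester == owner
    if mode == "group":
        # By default, only members of the same primary group can see the files.
        return requester_primary_group == owner_group
    raise ValueError(f"unknown CACHE_PERMISSIONS value: {mode!r}")
```

For example, with CACHE_PERMISSIONS=group, a user whose primary group is design1 can query files cached for design1 but not files cached for another group.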

If you specify a host name (host_name:abs_file_path or host_name:abs_folder_path/), the bdata cache command shows the files or folders that are staged in from the specified host. The path must match the bjobs -data command output exactly.

If a host name is not specified, the bdata cache command shows files that are staged in from the current local host.
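
When the query is issued from a script, the argument list can be assembled from the pieces in the synopsis. The following is a minimal sketch, assuming a hypothetical helper named build_file_query; it only builds the arguments and does not run the command.

```python
from typing import List, Optional


def build_file_query(path: str,
                     host: Optional[str] = None,
                     user: Optional[str] = None,
                     group: Optional[str] = None,
                     long_format: bool = False) -> List[str]:
    """Assemble the argument list for a file-based or folder-based query."""
    if not path.startswith("/"):
        raise ValueError("bdata cache expects an absolute path")
    args = ["bdata", "cache"]
    if long_format:
        args.append("-l")
    if user is not None:
        args += ["-u", user]    # "all" or a user name
    if group is not None:
        args += ["-g", group]   # "all" or a user group name
    # host_name:abs_path selects files that are staged in from that host;
    # without a host name, the current local host is assumed.
    target = f"{host}:{path}" if host else path
    args.append(target)
    return args
```

Because the result is a list rather than a single string, it can be passed to subprocess.run without a shell, so a trailing asterisk in a folder path is not expanded before bdata sees it.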

Job-based query

Use the bdata cache job_ID command to show files that are referenced by the specified job ID. If a cluster name (with the @cluster_name option) is not specified with the job ID, the current cluster name is assumed.
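
The job_ID[@cluster_name] form and its default can be illustrated with a short helper. This is a sketch only: split_job_ref is a hypothetical name, the current cluster is passed in by the caller, and job IDs are assumed to be plain integers.

```python
from typing import Tuple


def split_job_ref(job_ref: str, current_cluster: str) -> Tuple[int, str]:
    """Split job_ID[@cluster_name]; fall back to the current cluster."""
    job_id, sep, cluster = job_ref.partition("@")
    if not job_id.isdigit():
        raise ValueError(f"not a numeric job ID: {job_ref!r}")
    if sep and not cluster:
        raise ValueError(f"missing cluster name after '@': {job_ref!r}")
    # No "@cluster_name" part means the current cluster is assumed.
    return int(job_id), cluster if sep else current_cluster
```

With current_cluster set to cluster1, the references 84044 and 84044@cluster1 resolve to the same job.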

Cache cleanup for input and output file records

You can use file-based query to see input file records until LSF data manager cleans up the job record and input files. After the job is finished and the grace period that is specified by the CACHE_INPUT_GRACE_PERIOD parameter in the lsf.datamanager file expires, LSF data manager cleans up the job record and input files, and the files can no longer be queried.

You can use job-based query to see input file records only until those jobs finish (DONE or EXIT status).

You can query output file records until both of the following conditions are met:
  • All of the output file records associated with the job have TRANSFERRED or ERROR status.
  • The grace period that is specified by the CACHE_OUTPUT_GRACE_PERIOD parameter has expired for all files.

If both output and input job records exist, you can query the cache until all of these conditions are met.
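
The output-record conditions can be expressed as a predicate. The sketch below is an illustration of the rule as described here, not the data manager's implementation; OutputRecord and its fields are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable

# Statuses after which an output record no longer blocks cleanup.
FINAL_OUTPUT_STATUSES = {"TRANSFERRED", "ERROR"}


@dataclass
class OutputRecord:
    status: str          # for example STAGING, TRANSFERRED, or ERROR
    grace_expired: bool  # True once CACHE_OUTPUT_GRACE_PERIOD has passed


def output_records_queryable(records: Iterable[OutputRecord]) -> bool:
    """Return True while the job's output records can still be queried."""
    records = list(records)
    all_final = all(r.status in FINAL_OUTPUT_STATUSES for r in records)
    all_expired = all(r.grace_expired for r in records)
    # Records are cleaned up only when both conditions hold for every file.
    return not (all_final and all_expired)
```

A single file that is still STAGING, or a single unexpired grace period, keeps every output record of the job queryable.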

Output: Default format

By default, the following information is shown for each file:
HASH
The hash key of the particular copy of the file.
STATUS
The status of the file.
NEW
LSF data manager received a requirement for the file, but a transfer job is not submitted for it yet.
STAGING
For input files, the file is requested but is not yet in the cache. For output files, the file is in the cache and is either waiting to be transferred out or is being transferred out.
TRANSFERRED
For input files, the file is in the cache. For output files, the transfer job for the file is complete.
ERROR
Output file transfer failed.
UNKNOWN
During recovery, previously transferred files might briefly show a status of UNKNOWN while LSF data manager recovers its state.
LINKED
If LSF data manager can directly access the required file in the cache, no transfer job is needed and the file is not copied into the cache. LSF data manager creates a symbolic link from the cache to the required file. The LINKED status shows that the file was symbolically linked.
REF_JOB
For file-based query only. List of job IDs of jobs that request the file. REF_JOB is not displayed for job-based query.
XFER_JOB
The job ID of the transfer job. If LSF data manager can directly access the required file in the cache, no transfer job is needed and the file is not copied into the cache. A dash (-) indicates that no transfer job is associated with the file.
GRACE
After files are no longer needed by any job, unused input and output files in the data manager cache are cleaned up after a configurable grace period (CACHE_INPUT_GRACE_PERIOD and CACHE_OUTPUT_GRACE_PERIOD parameters in lsf.datamanager). GRACE shows the remaining hours and minutes of the grace period.
  • Input file records enter grace period after file transfer is complete (STATUS is TRANSFERRED), and the list of jobs for REF_JOB becomes empty. After the grace period expires, the files are cleaned up and can no longer be queried by file name. The default input grace period is 1440 minutes (one day).
  • Output file records enter grace period immediately after their status becomes TRANSFERRED. However, the files and job records are not cleaned up until the grace periods expire for all stage-out requirements that are associated with the same job. Output files can be queried by file name until the grace period expires for all output file records associated with the job. The default output grace period is 180 minutes (3 hours). Files that are uploaded to the cache with the bstage out -tag command must be cleaned manually with the bdata tags clean command.
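
The remaining grace time follows from the time at which a record entered its grace period and the configured period. The following sketch uses the default values quoted above; the function name and the (hours, minutes) return shape are assumptions, and the exact text that bdata prints in the GRACE column is not reproduced here.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

DEFAULT_INPUT_GRACE_MIN = 1440   # default CACHE_INPUT_GRACE_PERIOD (one day)
DEFAULT_OUTPUT_GRACE_MIN = 180   # default CACHE_OUTPUT_GRACE_PERIOD (3 hours)


def remaining_grace(entered: datetime,
                    now: datetime,
                    grace_minutes: int) -> Optional[Tuple[int, int]]:
    """Return (hours, minutes) left in the grace period, or None if expired."""
    left = entered + timedelta(minutes=grace_minutes) - now
    if left <= timedelta(0):
        return None
    total_minutes = int(left.total_seconds() // 60)
    return divmod(total_minutes, 60)
```

For a record that entered its grace period 90 minutes ago, the default input period leaves 22 hours and 30 minutes, and the default output period leaves 1 hour and 30 minutes.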

Output: long format

In a long format display, the following additional information is displayed:
PERMISSION
Access permissions for the file, which are defined by the CACHE_PERMISSIONS parameter in lsf.datamanager.

When CACHE_PERMISSIONS=all, the PERMISSION field shows all.

When the CACHE_ACCESS_CONTROL=Y parameter is configured in lsf.datamanager, the PERMISSION field shows the user group and the file permissions.

SIZE
The size of the file, shown in one of the following units:
  • nnn B if file size is less than 1 KB
  • nnn[.n] KB if file size is less than 1 MB
  • nnn[.n] MB if file size is less than 1 GB
  • nnn[.n] GB if file size is 1 GB or larger
  • nnn[.n] EB if file size is 1 EB or larger
MODIFIED
The last modified time of the file, as it was at job submission time or at the time of the stage out request.
CACHE_LOCATION
The full location of the file in the cache, as mounted on the data manager hosts.
bdata cache -l hostA:/home/user1/job.sh
--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/job.sh

PERMISSION          user:user1
HASH                7fb71a04569b51c851122553e2c728c5
SIZE                5 MB
STATUS              TRANSFERRED
REF_JOB             1435@cluster1
XFER_JOB            1906@cluster2 FINISHED Mon Aug 18 09:05:25 2014
GRACE               -
MODIFIED            Thu Aug 14 17:01:57 2014

CACHE_LOCATION:
/scratch/user1/staging/stgin/user/user1/hostA/home/user1/job.sh/e2cc059b47c094544791664a51489c8c

Examples: query by file or folder

The file is in the cache with one shared copy that is cached for different jobs:

bdata cache hostA:/home/user1/transfer_tool.sh
--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/transfer_tool.sh
HASH     STATUS   REF_JOB        XFER_JOB       GRACE
ab7dc9*  STAGING  2947@cluster1  2949@cluster1  - 
                  2952@cluster1
                  2954@cluster1
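
The default format is a whitespace-separated table in which additional REF_JOB entries continue on indented lines. A minimal parsing sketch follows, assuming that no column value contains whitespace; parse_default_rows is a hypothetical helper based only on the sample output shown here.

```python
from typing import Dict, List


def parse_default_rows(lines: List[str]) -> List[Dict[str, List[str]]]:
    """Parse default-format tables into one dict per file record."""
    header: List[str] = []
    rows: List[Dict[str, List[str]]] = []
    for raw in lines:
        if not raw.strip():
            continue
        if set(raw.strip()) == {"-"}:
            header = []  # a separator rule starts a new record
        elif raw.startswith("HASH"):
            header = raw.split()
        elif raw[0].isspace():
            # Indented line: another job that references the same file.
            if rows and "REF_JOB" in rows[-1]:
                rows[-1]["REF_JOB"].extend(raw.split())
        elif header:
            rows.append({name: [value]
                         for name, value in zip(header, raw.split())})
    return rows
```

Applied to the output above, the single row carries the hash ab7dc9*, the status STAGING, and three REF_JOB entries.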
The following job requests a file that is owned by the user group design1:
bsub -data /home/user1/data/file1.txt -datagrp design1 sleep 9999
Job <11297> is submitted to default queue <normal>.
Use the -g option to query files that belong to the specified group:
bdata cache -g design1 /home/user1/data/file1.txt
--------------------------------------------------------------------------------
INPUT:
hosta:/home/user1/data/file1.txt

HASH    STATUS       REF_JOB                XFER_JOB               GRACE      
fbea85* LINKED       11297@cluster1         -                      -          
Note: The status of the file /home/user1/data/file1.txt is LINKED, and XFER_JOB is shown as a dash (-). LSF data manager can directly access the required file, so it created a symbolic link from the cache to the file instead of copying the file into the cache. No transfer job is needed, which is why the XFER_JOB column shows a dash.
When the input path ends in a slash (/), information about folders that were recursively copied into the cache is displayed.
bdata cache "/home/user1/folder1/" -l
--------------------------------------------------------------------------------
INPUT:
hb05b10:/home/user1/folder1/

PERMISSION          group:lsf    rwx------    [manual]
HASH                eb72d80f6deeeaf51e7f2913451bb9da
SIZE                4 KB
STATUS              TRANSFERRED
REF_JOB             44843@lsf913
XFER_JOB            44844@lsf913 FINISHED Wed May 10 14:51:47 2017
GRACE               -
MODIFIED            Tue May  9 09:53:32 2017

CACHE_LOCATION:
/home/user1/scratch/staging/stgin/all/hosta/home/user1/folder1//eb72d80f6deeeaf51e7f2913451bb9da
When the input path ends in a slash and an asterisk (/*), only the top-level folder was requested.
bdata cache "/home/user1/folder1/*" -l
--------------------------------------------------------------------------------
INPUT:
hb05b10:/home/user1/folder1/*

PERMISSION          group:lsf    rwx------    [manual]
HASH                f63ec6dc849a7390fc622620d88129f3
SIZE                4 KB
STATUS              TRANSFERRED
REF_JOB             44845@lsf913
XFER_JOB            44846@lsf913 FINISHED Wed May 10 14:51:48 2017
GRACE               -
MODIFIED            Tue May  9 09:53:32 2017

CACHE_LOCATION:
/home/user1/scratch/staging/stgin/all/hb05b10/home/user1/folder1/*/f63ec6dc849a7390fc622620d88129f3

When you use the asterisk character (*) at the end of the path, the data requirements string must be in quotation marks.
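
The quotation marks keep the shell from expanding the asterisk before bdata receives the path. When a command line is generated from a script, the quoting can be applied automatically, as in the following sketch. The helper name is hypothetical; shlex.join is part of the Python standard library (Python 3.8 or later) and emits single quotes, which prevent shell expansion in the same way as the double quotes used in the examples.

```python
import shlex


def shell_command_for(path: str) -> str:
    """Return a shell-safe bdata cache command line for the given path."""
    # shlex.join quotes any argument that contains shell metacharacters,
    # such as the trailing asterisk of a top-level folder request.
    return shlex.join(["bdata", "cache", path])


print(shell_command_for("/home/user1/folder1/*"))
# prints: bdata cache '/home/user1/folder1/*'
```

A path without special characters, such as /home/user1/folder1/, is left unquoted.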

Examples: query by job

The job requests two input files. During job execution, file data2 is copied to two other locations. Files being staged out are listed as OUTPUT and show their destinations:

bdata cache 84044
Job <84044@cluster1> has the following file records in LSF data manager:
--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/data2

HASH    STATUS       XFER_JOB               GRACE
68990b* TRANSFERRED  84045@cluster1         -

--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/data3

HASH    STATUS       XFER_JOB               GRACE
e2fff4* TRANSFERRED  84056@cluster1         -

--------------------------------------------------------------------------------
OUTPUT:
hostB:/home/user1/data2
TO:
hostA:/scratch/user1/workspace

HASH    STATUS       XFER_JOB               GRACE
68990b* TRANSFERRED  84091@cluster1         -

Examples: query a single file, long output

bdata cache -l hostA:/home/user1/testDATA/rt/rt1/ada2 -u user1
--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/testDATA/rt/rt1/ada2

PERMISSION          user:user1
HASH                7fb71a05130b51c673953948e2c397c5
SIZE                50 MB
STATUS              TRANSFERRED
REF_JOB             1435@cluster1
XFER_JOB            1906@cluster1 FINISHED Mon Aug 18 09:05:25 2014
GRACE               -
MODIFIED            Wed Apr 30 14:41:22 2014

CACHE_LOCATION:
/data/cache/stgin/user/user1/hostA/home/user1/testDATA/rt/rt1/ada2/7fb71a05130b51c673953948e2c397c5
The following example uses the -g option to query a file that belongs to the user group design1:
bdata cache -l -g design1 /home/user1/data/file1.txt 
--------------------------------------------------------------------------------
INPUT:
hosta:/home/user1/data/file1.txt

PERMISSION          group:design1
HASH                fbea858bdf6ddefc6c7f44dc6a08f1a6
SIZE                4 B
STATUS              LINKED
REF_JOB             11297@cluster1
XFER_JOB            -
GRACE               -
MODIFIED            Thu Oct  9 07:54:19 2014

CACHE_LOCATION:
/scratch/data/user1/staging1/stgin/group/design1/hosta/home/user1/data/file1.txt/fbea858bdf6ddefc6c7f44dc6a08f1a6

Examples: query multiple files for the same job, long output

bdata cache -l 1909
Job <1909@cluster1> has the following file records in LSF data manager:
--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/testDATA/status.1

PERMISSION          user:user1
HASH                0f9267a79de4bb2f9143b61ab741afda
SIZE                290 B
STATUS              TRANSFERRED
XFER_JOB            1908@cluster1 FINISHED Mon Aug 18 10:01:51 2014
GRACE               -
MODIFIED            Thu Jul  3 10:50:53 2014

CACHE_LOCATION:
/data/cache/stgin/user/user1/hostB/home/user1/testDATA/status.1/0f9267a79de4bb2f9143b61ab741afda

--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/testDATA/status.2

PERMISSION          user:user1
HASH                2b992669d4ce96902cd639dda190a586
SIZE                0 B
STATUS              TRANSFERRED
XFER_JOB            1910@cluster1 FINISHED Mon Aug 18 10:02:27 2014
GRACE               -
MODIFIED            Thu Jul  3 10:49:36 2014

CACHE_LOCATION:
/data/cache/stgin/user/user1/hostB/home/user1/testDATA/status.2/2b992669d4ce96902cd639dda190a586

--------------------------------------------------------------------------------
OUTPUT:
hostA:/home/user1/testDATA/status.1
TO:
hostA:/scratch/user1/data/out
HASH                0f9267a79de4bb2f9143b61ab741afda
SIZE                290 B
STATUS              STAGING
XFER_JOB            1911@cluster1
GRACE               -
MODIFIED            Thu Jul  3 10:50:53 2014

CACHE_LOCATION:
/data/cluster1cache/stgout/cluster1/hostA/1909/home/user1/testDATA/status.1/0f9267a79de4bb2f9143b61ab741afda

--------------------------------------------------------------------------------
OUTPUT:
hostA:/home/user1/testDATA/status.2
TO:
hostA:/scratch/user1/data/out

HASH                2b992669d4ce96902cd639dda190a586
SIZE                0 B
STATUS              STAGING
XFER_JOB            1912@cluster1
GRACE               -
MODIFIED            Thu Jul  3 10:49:36 2014

CACHE_LOCATION:
/data/cache/stgin/user/user1/hostB/home/user1/testDATA/status.2/2b992669d4ce96902cd639dda190a586

Examples: query the same file for multiple jobs, long output

bdata cache -l /home/user1/testDATA/status.1
--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/testDATA/status.1

PERMISSION          user:user1
HASH                0f9267a79de4bb2f9143b61ab741afda
SIZE                290 B
STATUS              TRANSFERRED
REF_JOB             1909@cluster1
                    1913@cluster1
XFER_JOB            1908@cluster1 FINISHED Mon Aug 18 10:01:51 2014
GRACE               -
MODIFIED            Thu Jul  3 10:50:53 2014

CACHE_LOCATION:
/data/cluster1cache/stgin/user/user1/hostA/home/user1/testDATA/status.1/0f9267a79de4bb2f9143b61ab741afda

Examples: query by cluster with the -dmd option

The following example queries all instances of the file /newshare/scal/user1/data_files/seqdata.0 on host hostA for cluster cluster1.
bdata cache -dmd cluster1 hostA:/newshare/scal/user1/data_files/seqdata.0
--------------------------------------------------------------------------------
INPUT:
hostA:/newshare/scal/user1/data_files/seqdata.0

HASH    STATUS       REF_JOB                XFER_JOB               GRACE
6e91e3* TRANSFERRED  15@cluster1            5@cluster1             -
                     16@cluster1

The following example queries data requirements for job 15 on cluster cluster1.

bdata cache -dmd cluster1 15
Job <15@cluster1> has the following file records in LSF data manager:
--------------------------------------------------------------------------------
OUTPUT:
hostA:/newshare/scal/user1/data_files/15/seqdata.1
TO:
hostB:/newshare/scal/user1/data_files/seqdata.15

HASH    STATUS       XFER_JOB               GRACE
e21557* STAGING      5@cluster1            -

Examples: file query when user group cache access is enabled

When the CACHE_ACCESS_CONTROL=Y parameter is configured in lsf.datamanager, the bdata cache -l command shows the user group and the file permissions.
bdata cache -l 1152
Job <1152@dm1> has the following file records in LSF data manager:
--------------------------------------------------------------------------------
INPUT:
hostA:/newshare/scal/user1/data_files/15/seqdata.1
PERMISSION          group:pcl   rwxr-x---   [manual]
HASH                b7202f200c0240a66493f81f0e2e8875
SIZE                1 KB
STATUS              TRANSFERRED
XFER_JOB            1153@dm1 FINISHED Tue Nov 17 10:42:25 2015
GRACE               -
MODIFIED            Tue Apr 19 15:59:28 2015 

CACHE_LOCATION:
/data/cluster1cache/stgin/user/user1/hostA/home/user1/testDATA/seqdata.1/b7202f200c0240a66493f81f0e2e8875