cache
Queries the LSF data management cache.
Options
File-based query
bdata cache [-w | -l] [-u all | -u user_name] [-g all | -g user_group_name] [-dmd cluster_name] "[host_name:/]abs_file_path"
Job-based query
bdata cache [-dmd cluster_name] [-w | -l] job_ID[@cluster_name]
Description
- File-based and folder-based query
-
Use the bdata cache abs_file_path or bdata cache "abs_folder_path/[*]" command to determine whether the files or folders that are required for your job are already staged in to the cache.
LSF data manager administrators can see information about staged-in files in the cache for all users (with the -u all option) or for a specific user (with the -u user_name option). The CACHE_PERMISSIONS parameter in the lsf.datamanager file determines which cache is accessible to non-administrator users.
- CACHE_PERMISSIONS=user
- Each user has a cache in the staging area. Ordinary users can request information only about their own cached files. user is the default.
- CACHE_PERMISSIONS=all
- The staging area is a single cache. All users can see all files in the cache.
- CACHE_PERMISSIONS=group
- Each UNIX group has a cache in the staging area. By default, only users that belong to the same primary group can see the files for their group.
If CACHE_PERMISSIONS=group is specified, the -g option shows the cached files that belong to the specified user group.
If you specify a host name (host_name:abs_file_path or host_name:abs_folder_path/), the bdata cache command shows the files or folders that are staged in from the specified host. The path must match the bjobs -data command output exactly.
If a host name is not specified, the bdata cache command shows files that are staged in from the current local host.
- Job-based query
-
Use the bdata cache job_ID command to show files that are referenced by the specified job ID. If a cluster name (with the @cluster_name option) is not specified with the job ID, the current cluster name is assumed.
- Cache cleanup for input and output file records
-
You can use a file-based query to see input file records until LSF data manager cleans up the job record and input files. After the job finishes and the grace period that is specified by the CACHE_INPUT_GRACE_PERIOD parameter in the lsf.datamanager file expires, LSF data manager cleans up the job record, and the input files can no longer be queried.
You can use a job-based query to see input file records only until those jobs finish (DONE or EXIT status).
You can query output file records until both of the following events occur:
- All of the output file records associated with the job have TRANSFERRED or ERROR status.
- The grace period that is specified by the CACHE_OUTPUT_GRACE_PERIOD parameter expires for all files.
If both output and input job records exist, you can query the cache until all of these conditions are met.
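The two conditions for output file records can be sketched as a small predicate. This is only an illustration of the rule as described above, not LSF code: the OutputRecord class, its grace_expired field, and the function name are hypothetical, and treating an empty record list as "nothing to query" is an assumption.

```python
from dataclasses import dataclass

# Statuses after which an output file record no longer blocks cleanup.
FINAL_OUTPUT_STATUSES = {"TRANSFERRED", "ERROR"}


@dataclass
class OutputRecord:
    """Hypothetical in-memory view of one output file record."""
    status: str          # STATUS value as reported by bdata cache
    grace_expired: bool  # True once CACHE_OUTPUT_GRACE_PERIOD has elapsed


def output_records_queryable(records):
    """Return True while a job's output file records can still be queried."""
    if not records:
        return False  # assumption: no records means nothing to query
    all_final = all(r.status in FINAL_OUTPUT_STATUSES for r in records)
    all_expired = all(r.grace_expired for r in records)
    # Cleanup happens only when BOTH conditions hold for every file.
    return not (all_final and all_expired)


# One file is still staging out, so the records remain queryable.
still_staging = [OutputRecord("TRANSFERRED", True), OutputRecord("STAGING", False)]
print(output_records_queryable(still_staging))
```

A single record that is still STAGING, or whose grace period has not expired, keeps every output record of the job queryable.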
Output: Default format
- HASH
- The hash key of the particular copy of the file.
- STATUS
- The status of the file.
- NEW
- LSF data manager received a requirement for the file, but a transfer job is not submitted for it yet.
- STAGING
- For input files, the file is requested but is not yet in the cache. For output files, the file is in the cache and is either waiting to be transferred out or is being transferred out.
- TRANSFERRED
- For input files, the file is in the cache. For output files, the transfer job for the file is complete.
- ERROR
- Output file transfer failed.
- UNKNOWN
- During recovery, previously transferred files might briefly show UNKNOWN status while LSF data manager recovers its state.
- LINKED
- If LSF data manager can directly access the required file in the cache, no transfer job is needed and the file is not copied into the cache. LSF data manager creates a symbolic link from the cache to the required file. The LINKED status shows that the file was symbolically linked.
- REF_JOB
- For file-based query only. List of job IDs of jobs that request the file. REF_JOB is not displayed for job-based query.
- XFER_JOB
- The job ID of the transfer job. If LSF data manager can directly access the required file in the cache, no transfer job is needed and the file is not copied into the cache. A dash (-) indicates that no transfer job is associated with the file.
- GRACE
- After files are no longer needed by any job, unused input and output files in the data manager cache are cleaned up after a configurable grace period (CACHE_INPUT_GRACE_PERIOD and CACHE_OUTPUT_GRACE_PERIOD parameters in lsf.datamanager). GRACE shows the remaining hours and minutes of the grace period.
- Input file records enter grace period after file transfer is complete (STATUS is TRANSFERRED), and the list of jobs for REF_JOB becomes empty. After the grace period expires, the files are cleaned up and can no longer be queried by file name. The default input grace period is 1440 minutes (one day).
- Output file records enter grace period immediately after their status becomes TRANSFERRED. However, the files and job records are not cleaned up until the grace periods expire for all stage-out requirements that are associated with the same job. Output files can be queried by file name until the grace period expires for all output file records associated with the job. The default output grace period is 180 minutes (3 hours). Files that are uploaded to the cache with the bstage out -tag command must be cleaned manually with the bdata tags clean command.
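For scripting, the default-format columns can be read into dictionaries. The following is a minimal sketch, assuming the whitespace-separated layout shown in the examples in this topic, where additional REF_JOB values appear alone on continuation lines. The function name is hypothetical, and the sketch assumes that no field contains embedded whitespace, which might not hold for a GRACE value other than a dash.

```python
def parse_default_table(lines):
    """Parse default-format bdata cache output into a list of row dicts."""
    header = None
    rows = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].startswith("---"):
            header = None  # a separator line starts a new record block
            continue
        if tokens[0] == "HASH":
            header = tokens  # column names for the rows that follow
            continue
        if header is None:
            continue  # INPUT:/OUTPUT:/TO: labels and path lines
        if len(tokens) == len(header):
            row = dict(zip(header, tokens))
            if "REF_JOB" in row:
                row["REF_JOB"] = [row["REF_JOB"]]
            rows.append(row)
        elif len(tokens) == 1 and "@" in tokens[0] and rows and "REF_JOB" in rows[-1]:
            # Continuation line: one more job that references the file.
            rows[-1]["REF_JOB"].append(tokens[0])
    return rows


SAMPLE = """\
--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/transfer_tool.sh
HASH STATUS REF_JOB XFER_JOB GRACE
ab7dc9* STAGING 2947@cluster1 2949@cluster1 -
2952@cluster1
2954@cluster1
"""

rows = parse_default_table(SAMPLE.splitlines())
print(rows[0]["STATUS"], rows[0]["REF_JOB"])
```

For a job-based query the header has no REF_JOB column, so each row is returned with the HASH, STATUS, XFER_JOB, and GRACE keys only.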
Output: long format
- PERMISSION
- Access permissions for the file, which are defined by the CACHE_PERMISSIONS parameter in lsf.datamanager.
When CACHE_PERMISSIONS=all, the PERMISSION field shows all.
When the CACHE_ACCESS_CONTROL=Y parameter is configured in lsf.datamanager, the PERMISSION field shows the user group and the file permissions.
- SIZE
-
Units for file size.
- nnn B if file size is less than 1 KB
- nnn[.n] KB if file size is less than 1 MB
- nnn[.n] MB if file size is less than 1 GB
- nnn[.n] GB if file size is 1 GB or larger
- nnn[.n] EB if file size is 1 EB or larger
- MODIFIED
- The last modified time of the file, as it was at job submission time or at the time of the stage out request.
- CACHE_LOCATION
- The full location of the file in the cache, as mounted on the data manager hosts.
bdata cache -l hostA:/home/user1/job.sh
--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/job.sh
PERMISSION user:user1
HASH 7fb71a04569b51c851122553e2c728c5
SIZE 5 MB
STATUS TRANSFERRED
REF_JOB 1435@cluster1
XFER_JOB 1906@cluster2 FINISHED Mon Aug 18 09:05:25 2014
GRACE -
MODIFIED Thu Aug 14 17:01:57 2014
CACHE_LOCATION:
/scratch/user1/staging/stgin/user/user1/hostA/home/user1/job.sh/e2cc059b47c094544791664a51489c8c
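The long format lends itself to a key-value reading. The sketch below assumes the layout of the example above: one field per line, a label such as INPUT: or CACHE_LOCATION: alone on a line with its value on the next line, and extra REF_JOB values on continuation lines. The function name and the KIND, SOURCE, and DESTINATION keys are hypothetical names chosen for illustration, not fields that bdata cache prints.

```python
FIELDS = {"PERMISSION", "HASH", "SIZE", "STATUS", "REF_JOB",
          "XFER_JOB", "GRACE", "MODIFIED"}
# Labels that stand alone on a line; the value follows on the next line.
LABELS = {"INPUT:": "SOURCE", "OUTPUT:": "SOURCE",
          "TO:": "DESTINATION", "CACHE_LOCATION:": "CACHE_LOCATION"}


def parse_long_record(lines):
    """Parse one long-format record block into a dict."""
    record = {}
    pending = None   # key whose value is expected on the next line
    last_key = None  # last field seen, used for REF_JOB continuation lines
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("---"):
            continue
        if line in LABELS:
            if line in ("INPUT:", "OUTPUT:"):
                record["KIND"] = line.rstrip(":")
            pending = LABELS[line]
            continue
        if pending is not None:
            record[pending] = line
            pending, last_key = None, None
            continue
        key, _, value = line.partition(" ")
        if key in FIELDS:
            value = value.strip()
            record[key] = [value] if key == "REF_JOB" else value
            last_key = key
        elif last_key == "REF_JOB":
            record["REF_JOB"].append(line)
    return record


SAMPLE = """\
INPUT:
hostA:/home/user1/job.sh
PERMISSION user:user1
HASH 7fb71a04569b51c851122553e2c728c5
SIZE 5 MB
STATUS TRANSFERRED
REF_JOB 1435@cluster1
XFER_JOB 1906@cluster2 FINISHED Mon Aug 18 09:05:25 2014
GRACE -
MODIFIED Thu Aug 14 17:01:57 2014
CACHE_LOCATION:
/scratch/user1/staging/stgin/user/user1/hostA/home/user1/job.sh/e2cc059b47c094544791664a51489c8c
"""

record = parse_long_record(SAMPLE.splitlines())
print(record["STATUS"], record["SIZE"])
```

The parser handles a single record; output that contains several blocks would first need to be split on the separator lines.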
Examples: query by file or folder
The file is in the cache with one shared copy that is cached for different jobs:
bdata cache hostA:/home/user1/transfer_tool.sh
--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/transfer_tool.sh
HASH STATUS REF_JOB XFER_JOB GRACE
ab7dc9* STAGING 2947@cluster1 2949@cluster1 -
2952@cluster1
2954@cluster1
bsub -data /home/user1/data/file1.txt -datagrp design1 sleep 9999
Job <11297> is submitted to default queue <normal>.
bdata cache -g design1 /home/user1/data/file1.txt
--------------------------------------------------------------------------------
INPUT:
hosta:/home/user1/data/file1.txt
HASH STATUS REF_JOB XFER_JOB GRACE
fbea85* LINKED 11297@cluster1 - -
bdata cache "/home/user1/folder1/" -l
--------------------------------------------------------------------------------
INPUT:
hb05b10:/home/user1/folder1/
PERMISSION group:lsf rwx------ [manual]
HASH eb72d80f6deeeaf51e7f2913451bb9da
SIZE 4 KB
STATUS TRANSFERRED
REF_JOB 44843@lsf913
XFER_JOB 44844@lsf913 FINISHED Wed May 10 14:51:47 2017
GRACE -
MODIFIED Tue May 9 09:53:32 2017
CACHE_LOCATION:
/home/user1/scratch/staging/stgin/all/hosta/home/user1/folder1//eb72d80f6deeeaf51e7f2913451bb9da
bdata cache "/home/user1/folder1/*" -l
--------------------------------------------------------------------------------
INPUT:
hb05b10:/home/user1/folder1/*
PERMISSION group:lsf rwx------ [manual]
HASH f63ec6dc849a7390fc622620d88129f3
SIZE 4 KB
STATUS TRANSFERRED
REF_JOB 44845@lsf913
XFER_JOB 44846@lsf913 FINISHED Wed May 10 14:51:48 2017
GRACE -
MODIFIED Tue May 9 09:53:32 2017
CACHE_LOCATION:
/home/user1/scratch/staging/stgin/all/hb05b10/home/user1/folder1/*/f63ec6dc849a7390fc622620d88129f3
When you use the asterisk character (*) at the end of the path, the data requirements string must be in quotation marks.
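When a script assembles the command, passing the arguments as a list avoids shell expansion of the asterisk. The sketch below builds a file-based query from the options in the synopsis. The function and parameter names are hypothetical, the -w option is left out for brevity, and the command is only printed, not run.

```python
import shlex


def build_file_query(path, host=None, long_format=False, user=None,
                     group=None, cluster=None):
    """Build an argument list for a file-based bdata cache query."""
    argv = ["bdata", "cache"]
    if long_format:
        argv.append("-l")
    if user:
        argv += ["-u", user]        # "all" or a user name
    if group:
        argv += ["-g", group]       # "all" or a user group name
    if cluster:
        argv += ["-dmd", cluster]
    # Prefix the host name only when one is given.
    argv.append(f"{host}:{path}" if host else path)
    return argv


argv = build_file_query("/home/user1/folder1/*", long_format=True)
# shlex.join quotes the path so that the shell does not expand the asterisk.
print(shlex.join(argv))  # bdata cache -l '/home/user1/folder1/*'
```

The list can be handed to subprocess.run as is, in which case no shell is involved and no quoting is needed.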
Examples: query by job
The job requests two input files. During job execution, file data2 is copied to two other locations. Files being staged out are listed as OUTPUT and show their destinations:
bdata cache 84044
Job <84044@cluster1> has the following file records in LSF data manager:
--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/data2
HASH STATUS XFER_JOB GRACE
68990b* TRANSFERRED 84045@cluster1 -
--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/data3
HASH STATUS XFER_JOB GRACE
e2fff4* TRANSFERRED 84056@cluster1 -
--------------------------------------------------------------------------------
OUTPUT:
hostB:/home/user1/data2
TO:
hostA:/scratch/user1/workspace
HASH STATUS XFER_JOB GRACE
68990b* TRANSFERRED 84091@cluster1 -
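The job_ID[@cluster_name] form used in job-based queries, and in the XFER_JOB column, can be split as follows. This is a minimal sketch: the function name is hypothetical, the default cluster is supplied by the caller, and the job ID is assumed to be a plain number.

```python
def split_job_ref(job_ref, default_cluster):
    """Split 'job_ID[@cluster_name]' into (job_id, cluster_name)."""
    job_id, sep, cluster = job_ref.partition("@")
    if not job_id.isdigit():
        # Assumption: only plain numeric job IDs are handled here.
        raise ValueError(f"not a numeric job ID: {job_ref!r}")
    # Without a cluster suffix, the current cluster is assumed.
    return int(job_id), cluster if sep and cluster else default_cluster


print(split_job_ref("84044", "cluster1"))          # (84044, 'cluster1')
print(split_job_ref("84091@cluster1", "cluster2"))  # (84091, 'cluster1')
```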
Examples: query a single file, long output
bdata cache -l hostA:/home/user1/testDATA/rt/rt1/ada2 -u user1
--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/testDATA/rt/rt1/ada2
PERMISSION user:user1
HASH 7fb71a05130b51c673953948e2c397c5
SIZE 50 MB
STATUS TRANSFERRED
REF_JOB 1435@cluster1
XFER_JOB 1906@cluster1 FINISHED Mon Aug 18 09:05:25 2014
GRACE -
MODIFIED Wed Apr 30 14:41:22 2014
CACHE_LOCATION:
/data/cache/stgin/user/user1/hostA/home/user1/testDATA/rt/rt1/ada2/7fb71a05130b51c673953948e2c397c5
bdata cache -l -g design1 /home/user1/data/file1.txt
--------------------------------------------------------------------------------
INPUT:
hosta:/home/user1/data/file1.txt
PERMISSION group:design1
HASH fbea858bdf6ddefc6c7f44dc6a08f1a6
SIZE 4 B
STATUS LINKED
REF_JOB 11297@cluster1
XFER_JOB -
GRACE -
MODIFIED Thu Oct 9 07:54:19 2014
CACHE_LOCATION:
/scratch/data/user1/staging1/stgin/group/design1/hosta/home/user1/data/file1.txt/fbea858bdf6ddefc6c7f44dc6a08f1a6
Examples: query multiple files for the same job, long output
bdata cache -l 1909
Job <1909@cluster1> has the following file records in LSF data manager:
--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/testDATA/status.1
PERMISSION user:user1
HASH 0f9267a79de4bb2f9143b61ab741afda
SIZE 290 B
STATUS TRANSFERRED
XFER_JOB 1908@cluster1 FINISHED Mon Aug 18 10:01:51 2014
GRACE -
MODIFIED Thu Jul 3 10:50:53 2014
CACHE_LOCATION:
/data/cache/stgin/user/user1/hostB/home/user1/testDATA/status.1/0f9267a79de4bb2f9143b61ab741afda
--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/testDATA/status.2
PERMISSION user:user1
HASH 2b992669d4ce96902cd639dda190a586
SIZE 0 B
STATUS TRANSFERRED
XFER_JOB 1910@cluster1 FINISHED Mon Aug 18 10:02:27 2014
GRACE -
MODIFIED Thu Jul 3 10:49:36 2014
CACHE_LOCATION:
/data/cache/stgin/user/user1/hostB/home/user1/testDATA/status.2/2b992669d4ce96902cd639dda190a586
--------------------------------------------------------------------------------
OUTPUT:
hostA:/home/user1/testDATA/status.1
TO:
hostA:/scratch/user1/data/out
HASH 0f9267a79de4bb2f9143b61ab741afda
SIZE 290 B
STATUS STAGING
XFER_JOB 1911@cluster1
GRACE -
MODIFIED Thu Jul 3 10:50:53 2014
CACHE_LOCATION:
/data/cluster1cache/stgout/cluster1/hostA/1909/home/user1/testDATA/status.1/0f9267a79de4bb2f9143b61ab741afda
--------------------------------------------------------------------------------
OUTPUT:
hostA:/home/user1/testDATA/status.2
TO:
hostA:/scratch/user1/data/out
HASH 2b992669d4ce96902cd639dda190a586
SIZE 0 B
STATUS STAGING
XFER_JOB 1912@cluster1
GRACE -
MODIFIED Thu Jul 3 10:49:36 2014
CACHE_LOCATION:
/data/cache/stgin/user/user1/hostB/home/user1/testDATA/status.2/2b992669d4ce96902cd639dda190a586
Examples: query the same file for multiple jobs, long output
bdata cache -l /home/user1/testDATA/status.1
--------------------------------------------------------------------------------
INPUT:
hostA:/home/user1/testDATA/status.1
PERMISSION user:user1
HASH 0f9267a79de4bb2f9143b61ab741afda
SIZE 290 B
STATUS TRANSFERRED
REF_JOB 1909@cluster1
1913@cluster1
XFER_JOB 1908@cluster1 FINISHED Mon Aug 18 10:01:51 2014
GRACE -
MODIFIED Thu Jul 3 10:50:53 2014
CACHE_LOCATION:
/data/cluster1cache/stgin/user/user1/hostA/home/user1/testDATA/status.1/0f9267a79de4bb2f9143b61ab741afda
Examples: query by cluster with the -dmd option
bdata cache -dmd cluster1 hostA:/newshare/scal/user1/data_files/seqdata.0
--------------------------------------------------------------------------------
INPUT:
hostA:/newshare/scal/user1/data_files/seqdata.0
HASH STATUS REF_JOB XFER_JOB GRACE
6e91e3* TRANSFERRED 15@cluster1 5@cluster1 -
16@cluster1
The following example queries data requirements for job 15 on cluster cluster1.
bdata cache -dmd cluster1 15
Job <15@cluster1> has the following file records in LSF data manager:
--------------------------------------------------------------------------------
OUTPUT:
hostA:/newshare/scal/user1/data_files/15/seqdata.1
TO:
hostB:/newshare/scal/user1/data_files/seqdata.15
HASH STATUS XFER_JOB GRACE
e21557* STAGING 5@cluster1 -
Examples: file query when user group cache access is enabled
bdata cache -l 1152
Job <1152@dm1> has the following file records in LSF data manager:
--------------------------------------------------------------------------------
INPUT:
hostA:/newshare/scal/user1/data_files/15/seqdata.1
PERMISSION group:pcl rwxr-x--- [manual]
HASH b7202f200c0240a66493f81f0e2e8875
SIZE 1 KB
STATUS TRANSFERRED
XFER_JOB 1153@dm1 FINISHED Tue Nov 17 10:42:25 2015
GRACE -
MODIFIED Tue Apr 19 15:59:28 2015
CACHE_LOCATION:
/data/cluster1cache/stgin/user/user1/hostA/home/user1/testDATA/seqdata.1/b7202f200c0240a66493f81f0e2e8875