File attributes in SQL expressions

SQL expressions can include file attributes that specify certain clauses.

The following file attributes can be used in SQL expressions specified with the WHERE, WEIGHT, and SHOW clauses:
ACCESS_TIME
Specifies an SQL time stamp value for the date and time that the file was last accessed (POSIX atime). See EXPIRATION_TIME.
BLOCKSIZE
Specifies the size, in bytes, of each block of the file.
CHANGE_TIME
Specifies an SQL time stamp value for the date and time that the file metadata was last changed (POSIX ctime).
CLONE_DEPTH
Specifies the depth of the clone tree for the file.
CLONE_IS_PARENT
Specifies whether the file is a clone parent.
CLONE_PARENT_FILESETID
Specifies the fileset ID of the clone parent. The fileset ID is available only if CLONE_PARENT_IS_SNAP is a nonzero value.
CLONE_PARENT_INODE
Specifies the inode number of the clone parent, or NULL if it is not a file clone.
CLONE_PARENT_IS_SNAP
Specifies whether the clone parent is in a snapshot.
CLONE_PARENT_SNAP_ID
Specifies the snapshot ID of the clone parent. The snapshot ID is available only if CLONE_PARENT_IS_SNAP is a nonzero value.
CREATION_TIME
Specifies an SQL time stamp value that is assigned when a file is created.
DEVICE_ID
Specifies the ID of the device that contains the directory entry.
DIRECTORY_HASH
Can be used to group files within the same directory.

DIRECTORY_HASH is a function that maps every PATH_NAME to a number. All files within the same directory are mapped to the same number and deeper paths are assigned to larger numbers.

DIRECTORY_HASH uses the following functions:
CountSubstr(BigString,LittleString)
Counts and returns the number of occurrences of LittleString in BigString.
HashToFloat(StringValue)
Is a hash function that returns a quasi-random floating point number ≥ 0 and < 1, whose value depends on a string value. Although the result might appear random, HashToFloat(StringValue) always returns the same floating point value for a particular string value.
The following rule lists the directory hash values for three directories:
RULE 'y' LIST 'xl' SHOW(DIRECTORY_HASH)
LIST 'xl' /abc/tdir/randy1 SHOW(+3.49449638091027E+000)
LIST 'xl' /abc/tdir/ax     SHOW(+3.49449638091027E+000)
LIST 'xl' /abc/tdir/mmPolicy.8368.765871DF/mm_tmp/PWL.12 SHOW(+5.21282524359412E+000)
LIST 'xl' /abc/tdir/mmPolicy.31559.1E018912/mm_tmp/PWL.3 SHOW(+5.10672733094543E+000)
LIST 'xl' /abc/tdir/mmPolicy.31559.1E018912/mm_tmp/PWL.2 SHOW(+5.10672733094543E+000)

The following rule causes files within the same directory to be grouped and processed together during deletion. Grouping the files can improve the performance of GPFS™ directory-locking and caching.

RULE 'purge' DELETE WEIGHT(DIRECTORY_HASH) WHERE (deletion-criteria)
EXPIRATION_TIME
Specifies the expiration time of the file, expressed as an SQL time-stamp value. If the expiration time of a file is not set, its expiration time is SQL NULL. You can detect such files by checking for "EXPIRATION_TIME IS NULL".
Remember the following points:
  • EXPIRATION_TIME is tracked independently from ACCESS_TIME and both values are maintained for immutable files.
  • Expiration time and indefinite retention are independent attributes. You can change the value of either one without affecting the value of the other.
FILE_HEAT
Specifies the heat of the file based on the file access time and access size. For more information, see /usr/lpp/mmfs/samples/ilm/README.
FILE_SIZE
Specifies the current size or length of the file, in bytes.
FILESET_NAME
Specifies the fileset where the path name for the files is located, or is to be created.
Note: Using the FOR FILESET clause has the same effect and is more efficient to evaluate.
GENERATION
Specifies a number that is incremented whenever an INODE number is reused.
GROUP_ID
Specifies the numeric group ID of the file's group.
INODE
Specifies the file's inode number.
KB_ALLOCATED
Specifies the number of kilobytes of disk space allocated for the file data.
MODE
Displays the type and permission bits of a file as a 10-character string. The string has the same format as the first 10 characters of the output from the UNIX ls -l command. For example, -rwxr-xr-x is the MODE string of a file that can be read or executed by its owner, its group, or any user, but written only by its owner.
The first character of the MODE attributes displays the file type. The following values are supported:
d
Directory.
l
Link.
c
Character device.
b
Block device.
p
Pipe.
s
Socket.
?
Some other attribute, or unknown.
MISC_ATTRIBUTES
Specifies various miscellaneous file attributes. The value is a string of characters that are defined as follows:
a
The file is appendOnly.
A
Archive.
c
The file is selected to be compressed.
D
Directory. To match all directories, you can use %D% as a wildcard character.
F
Regular data file.
I
Some data blocks might be ill-placed.
J
Some data blocks might be ill-replicated.
K
Some data blocks might be ill-compressed.
L
Symbolic link.
M
Co-managed.
2
Data blocks are replicated.
o
Offline.
O
Other (not F, D, nor L). For example, a device or named pipe.
P
AFM
R
Read-only.
u
AFM cached-complete flag.
U
The file is trunc-managed.
V
Read-managed.
W
Write-managed.
X
Immutability.
Y
Indefinite retention.
MODIFICATION_SNAPID
Specifies the integer ID of the snapshot after which the file was last changed. The value is normally derived with the SNAP_ID() built-in function that assigns integer values to GPFS snapshot names. This attribute allows policy rules to select files that are modified after a snapshot image is taken.
MODIFICATION_TIME
Specifies an SQL time stamp value for the date and time that the file data was last modified (POSIX mtime).
NAME
Specifies the name of a file.
NLINK
Specifies the number of hard links to the file.
PATH_NAME
Specifies a path for the file; the path includes the name of the file.
POOL_NAME
Specifies the storage pool where the file data is located.
Note: Using the FROM POOL clause has the same effect and is often preferable.
RDEVICE_ID
Specifies the device type for a device.
USER_ID
Specifies the numeric user ID of the owner of the file.
Notes:
  1. When file attributes are referenced in initial placement rules, only the following attributes are valid: FILESET_NAME, GROUP_ID, NAME, and USER_ID. The placement rules, like all rules with a clause, might also reference the current date and current time and use them to control matching.
  2. When file attributes are used for restoring files, the attributes correspond to the attributes at the time of the backup, not to the current restored file.
  3. For SQL expressions, if you want to show any of these attribute fields as strings (for example, FILE_HEAT), use SHOW('[FILE_HEAT]') rather than SHOW('FILE_HEAT'), as the latter is expanded.
  4. All date attributes are evaluated in Coordinated Universal Time (a time standard abbreviated as UTC).