Policy rules: Examples and tips

Before you write and apply policies, consider the following advice.

It is a good idea to test your policy rules by running the mmapplypolicy command with the -I test option and the -L 3 or higher option. Testing helps you understand which files are selected as candidates and which candidates are chosen to be processed.

Do not apply a policy to an entire file system of important files unless you are confident that the rules correctly express your intentions. To test your rules, choose a directory that contains a small number of files, some of which you expect to be selected by your SQL policy rules and some of which you expect to be skipped.

Then enter a command like the following one:
mmapplypolicy /TestSubdirectory  -L 6  -I test
The output shows which files are scanned and which match rules or no rules. If a problem is not apparent, you can add a SHOW() clause to your rule to see the values of file attributes or SQL expressions. To see multiple values, enter a command like the following one:
SHOW('x1=' || varchar(Expression1) || ' x2='  || varchar(Expression2) || ... )
where ExpressionX is the SQL variable or expression of function that you suspect or do not understand. Beware that if any expression evaluates to SQL NULL, then the entire show clause is NULL, by the rules of SQL. One way to show null vs. non-null values is to define a macro and call it as in the following example:
define(DISPLAY_NULL,[CASE WHEN ($1) IS NULL THEN '_NULL_' ELSE varchar($1) END])

rule list a SHOW( 'x1=' || DISPLAY_NULL(xattr('user.marc')) || ' and  x2=' || DISPLAY_NULL(xattr('user.eric')))
Note: For examples and more information on the -L flag, see The mmapplypolicy -L command.
The following examples illustrate some useful policy rule techniques:
  1. Delete files from the storage pool that is named pool_1 that were not accessed in the last 30 days and that are named like temporary files or appear in any directory that is named tmp:
    RULE 'del1' DELETE FROM POOL 'pool_1'
       WHERE (DAYS(CURRENT_TIMESTAMP) – DAYS(ACCESS_TIME) > 30)
            AND (lower(NAME) LIKE '%.tmp' OR PATH_NAME LIKE '%/tmp/%')
  2. Use the SQL LIKE predicate to test file names and path names:
    RULE '*/_*'  DELETE WHERE PATH_NAME LIKE '%/x_%' ESCAPE 'x'
    RULE '*XYZ*' DELETE WHERE NAME LIKE '%XYZ%'
    RULE '12_45' DELETE WHERE NAME LIKE '12x_45' ESCAPE 'x'
    RULE '12%45' DELETE WHERE NAME LIKE '12x%45' ESCAPE 'x'
    RULE '12?45' DELETE WHERE NAME LIKE '12_45'
    RULE '12*45' DELETE WHERE NAME LIKE '12%45'
    RULE '*_*'   DELETE WHERE NAME LIKE '%x_%' ESCAPE 'x'
    Where:
    • A percent % wildcard in the name represents zero or more characters.
    • An underscore (_) wildcard in the name represents 1 byte.

    Use the optional ESCAPE clause to establish an escape character, when you need to match '_' or '%' exactly.

  3. Use the SQL UPPER and LOWER functions to ignore case when testing names:
    RULE 'UPPER'  DELETE WHERE upper(PATH_NAME) LIKE '%/TMP/OLD/%' 
    RULE 'lower'  DELETE WHERE lower(PATH_NAME) LIKE '%/tmp/old/%' 
  4. Use the SQL SUBSTR or SUBSTRING functions to test a substring of a name:
    RULE 's1' DELETE WHERE SUBSTRING(NAME FROM 1 FOR 5)='XXXX-'
    RULE 's2' DELETE WHERE SUBSTR(NAME,1,5)='YYYY-'
  5. Use the SQL SUBSTR and LENGTH functions to test the suffix of a name:
    RULE 'sfx' DELETE WHERE SUBSTR(NAME,LENGTH(NAME)-3)='.tmp'
  6. Use a WHEN clause to restrict rule applicability to a particular day of the week:
    RULE 'D_SUN' WHEN (DayOfWeek(CURRENT_DATE)=1) /* Sunday */
      DELETE WHERE PATH_NAME LIKE '%/tmp/%'

    CURRENT_DATE is an SQL built in operand that returns the date portion of the CURRENT_TIMESTAMP value.

  7. Use the SQL IN operator to test several possibilities:
    RULE 'D_WEEKEND' WHEN (DayOfWeek(CURRENT_DATE) IN (7,1)) /*  Saturday or Sunday */
      DELETE WHERE PATH_NAME LIKE '%/tmp/%'

    For information on how to use a macro processor such as m4 to make reading and writing policy rules easier, see Using macro processing utilities with policy rules.

  8. Use a FILESET clause to restrict the rule to files within particular filesets:
    RULE 'fsrule1' MIGRATE TO POOL 'pool_2'
             FOR FILESET('root','fset1')

    In this example there is no FROM POOL clause, so regardless of their current storage pool placement, all files from the named filesets are subject to migration to storage pool pool_2.

    Note: To have the migrate rule applied to snapshot files, you must specify the mmapplypolicy fs -S snap1 option, where snap1 is the name of the snapshot where the files reside.
  9. Use an EXCLUDE rule to exclude a set of files from all subsequent rules:
    RULE 'Xsuper' EXCLUDE WHERE USER_ID=0
    RULE 'mpg'    DELETE  WHERE lower(NAME) LIKE '%.mpg' AND FILE_SIZE>20123456
    Notes:
    1. Specify the EXCLUDE rule before rules that might match the files that are being excluded.
    2. You cannot define a list and what to exclude from the list in a single rule. You must define two LIST statements, one specifying which files are in the list and one specifying what to exclude from the list. For example, to exclude files that contain the word test from the LIST rule allfiles, define the following rules:
      RULE EXTERNAL LIST 'allfiles' EXEC '/u/brownap/policy/CHE/exec.list'
      
      RULE 'exclude_allfiles' LIST 'allfiles' EXCLUDE where name like '%test%'
      
      RULE 'all' LIST 'allfiles' SHOW('misc_attr ='|| MISC_ATTRIBUTES || HEX(MISC_ATTRIBUTES)) \
        where name like '%'
  10. Use the SQL NOT operator with keywords, along with AND and OR:
    RULE 'D_WEEKDAY' WHEN (DayOfWeek(CURRENT_DATE) NOT IN (7,1)) /*  a weekday */
      DELETE WHERE (PATH_NAME LIKE '%/tmp/%' OR NAME LIKE '%.tmp')  
              AND (KB_ALLOCATED > 9999 AND NOT USER_ID=0)
  11. To migrate only snapshot files that for which data blocks are allocated, use the following rule:
    RULE "migrate snap data" MIGRATE FROM POOL X TO POOL Y WHERE KB_ALLOCATED > 0
  12. Use a REPLICATE clause to increase the availability of selected files:
    RULE 'R2' MIGRATE FROM POOL 'ypooly' TO POOL 'ypooly'
      REPLICATE(2) WHERE USER_ID=0

    Before you increase the data replication factor for any file, the file system must be configured to support data replication.

  13. The difference of two SQL Timestamp values can be compared to an SQL Interval value:
    rule 'a' migrate to pool 'A' where CURRENT_TIMESTAMP - MODIFICATION_TIME > INTERVAL  '10' DAYS
    rule 'b' migrate to pool 'B' where CURRENT_TIMESTAMP - MODIFICATION_TIME > INTERVAL  '10' HOURS
    rule 'c' migrate to pool 'C' where CURRENT_TIMESTAMP - MODIFICATION_TIME > INTERVAL  '10' MINUTES
    rule 'd' migrate to pool 'D' where CURRENT_TIMESTAMP - MODIFICATION_TIME > INTERVAL  '10' SECONDS

    For the best precision, use the INTERVAL...SECONDS construct.

  14. By carefully assigning both weights and thresholds, the administrator can formally express rules like this:

    If the storage pool named pool_X has an occupancy percentage greater than 90% now, bring the occupancy percentage of storage pool that is named pool_X down to 80% by migrating files that are three months or older to the storage pool that is named pool_ZZ. But, if you can find enough year-old files to bring the occupancy percentage down to 50%, do that also.

    RULE 'year-old' MIGRATE FROM POOL 'pool_X'
       THRESHOLD(90,50) WEIGHT(weight_expression) 
       TO POOL 'pool_ZZ'
       WHERE  DAYS(CURRENT_TIMESTAMP) – DAYS(ACCESS_TIME) > 365
    
    RULE '3month-old' MIGRATE FROM POOL 'pool_X'
       THRESHOLD(90,80) WEIGHT(weight_expression) 
       TO POOL 'pool_ZZ'
       WHERE  DAYS(CURRENT_TIMESTAMP) – DAYS(ACCESS_TIME) > 90
    More information about weights is available in the next example.
    A goal of this mmapplypolicy job is to reduce the occupancy percentage of the FROM POOL to the low occupancy percentage specified on the THRESHOLD clause, if possible. The mmapplypolicy job does not migrate or delete more files than are necessary to produce this occupancy percentage. The task consists of these steps:
    1. Each candidate file is assigned a weight.
    2. All candidate files are sorted by weight.
    3. The highest weight files are chosen to MIGRATE or DELETE until the low occupancy percentage is achieved, or there are no more candidates.
    The administrator who writes the rules must ensure that the computed weights are as intended, and that the comparisons are meaningful. This is similar to the IBM Spectrum Protect™ convention, where the weighting function for each file is determined by the equation:
    X * access_age + Y * file_size
    where:
    • access_age is DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)
    • file_size is FILE_SIZE or KB_ALLOCATED
    • X and Y are weight factors that are chosen by the system administrator.
  15. The WEIGHT clause can be used to express ideas like this (stated informally):
    IF access_age > 365 days THEN weight = 100000 + access_age
    ELSE IF access_age < 30 days THEN weight = 0
    ELSE weight= KB_ALLOCATED
    This rule means:
    • Give a very large weight bias to any file older than a year.
    • Force the weight of any file younger than 30 days to 0.
    • Assign weights to all other files according to the number of kilobytes occupied.
    The following codeblock shows the formal SQL syntax:
    CASE
     WHEN DAYS(CURRENT_TIMESTAMP) – DAYS(ACCESS_TIME) > 365
      THEN 100000 + DAYS(CURRENT_TIMESTAMP) – DAYS(ACCESS_TIME)
     WHEN DAYS(CURRENT_TIMESTAMP) – DAYS(ACCESS_TIME) < 30
      THEN 0
     ELSE
      KB_ALLOCATED
    END
  16. The SHOW clause has no effect in matching files but can be used to define additional attributes to be exported with the candidate file lists. It can be used for any purpose but is primarily used to support file aggregation.

    To support aggregation, you can use the SHOW clause to output an aggregation value for each file that is selected by a rule. You can then output those values to a file list and input that list to an external program that groups the files into aggregates.

  17. If you have a large number of filesets against which to test, use the FILESET_NAME variable as shown in the following example:
    RULE 'x' SET POOL 'gold' WHERE FILESET_NAME LIKE 'xyz.%.xyz' 
    However, if you are testing against just a few filesets, you can use the FOR FILESET(’xyz1’, ’xyz2’) form instead.
  18. You can convert a time interval value to a number of seconds with the SQL cast syntax, as in the following example:
    define([toSeconds],[(($1) SECONDS(12,6))])
    
    define([toUnixSeconds],[toSeconds($1 - '1970-1-1@0:00')])
    
    RULE external list b
    RULE list b SHOW('sinceNow=' toSeconds(current_timestamp-modification_time) )
    RULE external list c
    RULE list c SHOW('sinceUnixEpoch=' toUnixSeconds(modification_time) )
    The following method is also supported:
    define(access_age_in_days,( INTEGER(( (CURRENT_TIMESTAMP - ACCESS_TIME) SECONDS)) /(24*3600.0) ) )
    
    RULE external list w exec ''
    RULE list w weight(access_age_in_days) show(access_age_in_days)
  19. You can create a policy that lists the files that are created, accessed, or modified later than a specified timestamp. The timestamp must be converted from native format to UTC format. The following example policy lists all the files that were created after the timestamp 2017-02-21 04:56 IST:
    cat policy
    RULE 'filesRule' LIST 'files'
            SHOW(varchar(kb_allocated)  || '  ' || varchar(file_size)) 
            WHERE (CREATION_TIME > TIMESTAMP('LAST_CREATE'))
    To implement this policy, enter the following commands. The third line converts the time stamp to UTC format.
    $LC='2017-02-21 04:56 IST'
    echo $LC
    $LCU=$(date +%Y-%m-%d" "%H:%M  -d "$LC" -u)
    echo $LCU
    $mmapplypolicy gpfs0 -P policy -I defer -f /tmp -M LAST_CREATE="$LCU"
    The /tmp/list.files file contains the list of selected files.

    You can modify the SHOW clause to list any file attribute. For more information, see File attributes in SQL expressions.