Topic
  • 5 replies
  • Latest Post - ‏2014-06-03T12:15:40Z by marc_of_GPFS
db808
86 Posts

Can "WHERE REGEX(xxx,yyy)" clause be used on "SET POOL" policy rule?

2014-05-05T20:44:26Z

Can the "WHERE REGEX(xxx,yyy)" clause be used on "SET POOL" policy rule?

All the examples of WHERE clauses on the SET POOL rule in the GPFS manual use LOWER(NAME) LIKE 'xxx' clauses, where 'xxx' is an example wildcard string.

All the examples of a WHERE REGEX(xxx,yyy) clause are on LIST, MIGRATE, or other types of rules; there are no explicit examples on SET POOL rules.

Can we use REGEX, which allows more complex pattern matching than LIKE, on the SET POOL rule?

Can we use any arbitrary GPFS SQL expression on SET POOL, provided it uses only attributes that are available at file create time?  I understand that we cannot use expressions based on file size, since the file size is not yet known at file create time.

Thanks for your help.

Dave B

  • dlmcnabb
    1012 Posts

    Re: Can "WHERE REGEX(xxx,yyy)" clause be used on "SET POOL" policy rule?

    ‏2014-05-07T22:24:22Z  

    I would guess that you can use REGEX on the file base name. However, placement rules cannot do anything with path names, since the path name is not available to them; it is available only in a migration policy.

  • db808
    86 Posts

    Re: Can "WHERE REGEX(xxx,yyy)" clause be used on "SET POOL" policy rule?

    ‏2014-05-08T15:32:28Z  

    Hi Dan,

    Thanks for the quick reply, but I believe that you are NOT correct.  The file's name IS explicitly available at file create time, and as such can be used in a SET POOL placement policy.

    There are many examples in the GPFS manuals such as:

    RULE 'mp4_files' SET POOL 'LARGE_FILE_POOL' WHERE LOWER(NAME) LIKE '%.mp4'

    which is a file placement policy based on the file name ... the file extension in this example.

    The point I was making is that the LIKE qualifier has a restricted set of matching functionality, whereas REGEX has a much richer set of pattern-matching functionality and could provide value if the desired pattern could not be handled by the simpler LIKE.

    The manual section on the REGEX clause does not specify which rule types REGEX can (or cannot) be used with, and none of the examples illustrate REGEX with SET POOL.

    The syntax that I would like to explore is something similar to:

    RULE 'other_files' SET POOL 'XXXX' WHERE REGEX(LOWER(NAME),'complex_pattern')

    In my specific case, I found that the NOT clause could be used with LIKE, and the resulting functionality solved the problem.  I was trying to pattern match the case of a file name with NO extension (no period in its name).  I originally thought that I would need a REGEX, since LIKE alone would not handle the condition.  However, the clause NOT LIKE '%.%' yields the desired pattern match.
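
    The no-extension case can be checked outside GPFS. The Python sketch below is only an illustration: it emulates NOT LIKE '%.%' with a substring test and compares it against the regular expression ^[^.]*$, which is an assumed REGEX counterpart and is not taken from the manual. The sample file names are made up, and Python's re module stands in for whatever regular-expression engine GPFS uses.

```python
import re

# Assumed REGEX counterpart of NOT LIKE '%.%':
# a name that consists only of non-period characters.
NO_EXTENSION = re.compile(r"^[^.]*$")


def not_like_any_period(name: str) -> bool:
    """Emulate NAME NOT LIKE '%.%': true when the name contains no period."""
    return "." not in name


def regex_no_extension(name: str) -> bool:
    """True when the assumed regular expression matches the name."""
    return NO_EXTENSION.search(name) is not None


# Hypothetical file names, chosen to cover names with and without periods.
samples = ["README", "Makefile", "movie.mp4", ".profile", "archive.tar.gz", "trailing."]
for sample in samples:
    # Both predicates should agree on every sample.
    assert not_like_any_period(sample) == regex_no_extension(sample)
```

    Both predicates accept README and Makefile and reject every name that contains a period, including a hidden file such as .profile.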

    My immediate need for REGEX with SET POOL has waned, but I can see the need in the future.

    Dave B

  • dlmcnabb
    1012 Posts

    Re: Can "WHERE REGEX(xxx,yyy)" clause be used on "SET POOL" policy rule?

    ‏2014-05-08T16:20:29Z  

    I was only trying to make the point that at file creation time only the NAME of the file can be tested, not the full PATHNAME. I don't know how REGEX fits into the matching of the NAME.

  • marc_of_GPFS
    33 Posts

    Re: Can "WHERE REGEX(xxx,yyy)" clause be used on "SET POOL" policy rule?

    ‏2014-06-02T15:46:23Z  

    Within the GPFS/SQL policy language, RegEx() is a built-in SQL function returning a Boolean that can be used as, or within, any SQL expression.

    More specifically, to address your question: wherever you might have used (X LIKE y), you may use RegEx(X, y_prime) instead, where y_prime is a regular expression equivalent to the LIKE pattern y.

    However, if some nodes in your cluster are running a "backlevel" version of GPFS, or some of your file systems are not enabled for the latest release features, then you will find that you can NOT store a policy file that uses RegEx() onto such a file system (mmchpolicy). If we let you do that, a backlevel node attempting to mount that file system would suffer an SQL error at mount time, because backlevel nodes simply do not "know" how to interpret the RegEx() function.
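
    The substitution of RegEx(X, y_prime) for (X LIKE y) can be sketched in Python. This is a minimal illustration, not GPFS code: like_to_regex and like are hypothetical helper names, only the % and _ wildcards are handled (no ESCAPE clause), and Python's re module is assumed to behave like the GPFS engine for patterns this simple.

```python
import re


def like_to_regex(like_pattern: str) -> str:
    """Translate a LIKE pattern into an anchored regular expression.

    % becomes .* (any run of characters), _ becomes . (one character),
    and every other character is escaped so it matches literally.
    """
    parts = []
    for ch in like_pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return "^" + "".join(parts) + "$"


def like(value: str, like_pattern: str) -> bool:
    """Emulate `value LIKE like_pattern` through the translated expression."""
    return re.search(like_to_regex(like_pattern), value, re.DOTALL) is not None


# y = '%.mp4' from the thread's placement rule; y_prime is its translation.
y_prime = like_to_regex("%.mp4")
assert like("movie.mp4", "%.mp4")
assert not like("movie.mp3", "%.mp4")
```

    The translation anchors both ends because LIKE must match the whole string, whereas the example pattern in this thread ('\.avi$' and so on) relies on the regular expression being allowed to match anywhere in the name.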

  • marc_of_GPFS
    33 Posts

    Re: Can "WHERE REGEX(xxx,yyy)" clause be used on "SET POOL" policy rule?

    ‏2014-06-03T12:15:40Z  

    An example (from another, longer post on a different thread):

    rule 'several-file-types-with-regex' set pool 'B'  where  RegEx(lower(name),['\.avi$|\.mp[34]$|\.[mj]pe?g$'])

    is preferable to the equivalent 7(!) LIKE predicates:

    rule 'several-file-types-with-or-like' set pool 'B' where lower(name) like '%.avi' OR lower(name) like '%.mp3' OR lower(name) like '%.mp4'
              OR lower(name) like '%.mpeg' OR lower(name) like '%.jpeg' OR lower(name) like '%.mpg' OR lower(name) like '%.jpg'
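
    The claimed equivalence between the single regular expression and the seven LIKE predicates can be sanity-checked with a short Python sketch. Treat it as an illustration under assumptions: Python's re module is used in place of the GPFS engine, each LIKE '%.ext' predicate is emulated with a suffix test, and the file names are invented for the check.

```python
import re

# The pattern from the RegEx() rule above, without the [' '] quoting.
PATTERN = re.compile(r"\.avi$|\.mp[34]$|\.[mj]pe?g$")

# One suffix per LIKE '%.ext' predicate in the second rule.
SUFFIXES = (".avi", ".mp3", ".mp4", ".mpeg", ".jpeg", ".mpg", ".jpg")


def matches_regex(name: str) -> bool:
    """Emulate RegEx(lower(name), pattern)."""
    return PATTERN.search(name.lower()) is not None


def matches_likes(name: str) -> bool:
    """Emulate the seven OR-ed lower(name) LIKE '%.ext' predicates."""
    return name.lower().endswith(SUFFIXES)


# Hypothetical names: matching, non-matching, and near-miss cases.
samples = [
    "clip.AVI", "song.mp3", "video.mp4", "film.mpeg", "photo.jpeg",
    "a.mpg", "scan.JPG", "notes.txt", "mp4", "video.mp5", "x.jpg.bak",
]
for sample in samples:
    # The two formulations should select exactly the same names.
    assert matches_regex(sample) == matches_likes(sample)
```

    The near-miss names (mp4 with no period, video.mp5, x.jpg.bak) are rejected by both formulations, which is the behaviour the anchoring with $ is meant to give.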