Topic
  • 6 replies
  • Latest Post - ‏2014-06-17T20:29:24Z by marc_of_GPFS
joe_usr
7 Posts

Pinned topic A policy to delete the oldest files?

‏2014-05-22T21:50:21Z |

I need suggestions for creating a RULE that, once the high water mark is hit, deletes the oldest files until the low water mark is reached.

RULE 'delete_oldest_files' DELETE FROM POOL 'my_pool' THRESHOLD (90,80) ...

Any suggestions?

Updated on 2014-06-17T19:40:54Z by joe_usr
  • marc_of_GPFS
    35 Posts

    Re: A policy to delete the oldest files?

    ‏2014-06-02T15:28:47Z  

    The rule you gave is a starting point, but you almost certainly want to add a WHERE clause and perhaps some other qualifiers,
    otherwise you will DELETE files you want to keep!

    Please review the GPFS Advanced Admin Guide, Chapter 2. Information Lifecycle Management for GPFS as well as the mmapplypolicy command documentation in the Admin Guide.  

    Always TEST your policy rules against a test subdirectory with some files that you expect to match your rules and some that do not match your rules.
    Also use the -I test option.  Even then, preferably test on a "dummy" file system that you can afford to completely "hose up"!  

    IOW - RTFineM - and come back here with any specific questions, or with suggestions on how to improve the documentation.

  • joe_usr
    7 Posts

    Re: A policy to delete the oldest files?

    ‏2014-06-02T16:19:01Z  

    Yes.  I need assistance constructing a WHERE clause that'll delete the oldest files in succession until the low water mark is reached.

  • joe_usr
    7 Posts

    Re: A policy to delete the oldest files?

    ‏2014-06-02T16:23:20Z  

    Yes.  I need assistance constructing a WHERE clause that'll delete the oldest files in succession until the low water mark is reached.

  • marc_of_GPFS
    35 Posts

    Re: A policy to delete the oldest files?

    ‏2014-06-02T17:22:54Z  
    • joe_usr
    • ‏2014-06-02T16:23:20Z

    The THRESHOLD clause governs that "low water mark".  Period.  BUT ....

    Use WEIGHT(CURRENT_TIMESTAMP-ACCESS_TIME) to DELETE "older" files first, or MODIFICATION_TIME or CREATION_TIME if that is your criterion for "age".

    BUT BUT BUT I again caution that you must decide and specify which files are "okay" to delete, for example:
     

    WHERE PATH_NAME LIKE '/tmp/%' OR NAME LIKE '%.tmp'   /* this rule only applies to files under the /tmp directory or files the user has named "*.tmp" */
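
    Putting the THRESHOLD, WEIGHT, and WHERE pieces above together, a complete rule might look roughly like the sketch below. The rule name is made up for illustration, the pool name and thresholds are taken from the original question, and the WHERE clause is only the example filter from this post, not a recommendation for any real file system.

    ```sql
    /* Sketch only: when pool 'my_pool' exceeds 90% occupancy, delete      */
    /* matching files, least recently accessed first, until occupancy      */
    /* drops to 80%. The WHERE clause limits which files are candidates.   */
    RULE 'delete_oldest_tmp' DELETE
      FROM POOL 'my_pool'
      THRESHOLD (90,80)
      WEIGHT (CURRENT_TIMESTAMP - ACCESS_TIME)
      WHERE PATH_NAME LIKE '/tmp/%' OR NAME LIKE '%.tmp'
    ```

    A dry run with the -I test option of mmapplypolicy against a test directory, with the rule saved in a policy file passed via -P, reports which files would be chosen without deleting anything.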

    I urge you to at least slog through the examples in chapter 2 of the Advanced Admin Guide.

    Also, work with an experienced programmer who knows at least a little SQL.

    Then test, test, and retest on a "throwaway" file system BEFORE even thinking of applying this to a production system!!!

    DELETE is serious stuff - it may be impossible to recover files you DELETE by mistake.

    IOW

    rule 'rm -rf' DELETE   /* is pretty much equivalent to the superuser running the dreaded classic `rm -rf /` command! */

     
  • joe_usr
    7 Posts

    Re: A policy to delete the oldest files?

    ‏2014-06-17T19:44:23Z  

    Yes, getting closer!  This almost does the trick...

    RULE 'myRule'
    DELETE
    FROM POOL 'myPool'
    THRESHOLD (90,80)
    WEIGHT (CURRENT_TIMESTAMP - CREATION_TIME)

    However, directories are skipped (NO RULE APPLIES).

    I'd like to delete old directories, too.

  • marc_of_GPFS
    35 Posts

    Re: A policy to delete the oldest files?

    ‏2014-06-17T20:29:24Z  
    • joe_usr
    • ‏2014-06-17T19:44:23Z

    Deleting directories is trickier, because a simple rmdir command only works against empty directories.

    CAUTION: you probably do NOT want to remove files or directories just because they were created a long time ago. Usually you want ACCESS_TIME, and even then, the fact that a file has not been read for a long time does not necessarily mean you no longer need it.

    I believe the current GPFS releases support

    RULE 'd' LIST 'd' DIRECTORIES_PLUS  

    but not 

    RULE 'dx' DELETE DIRECTORIES_PLUS.

    So if you want to use policy today to delete directories you're going to have to use the LIST rule with appropriate qualifiers + a little script - see the GPFS samples/ilm directory.

    samples/ilm/tsprm is interesting - but may be overkill for your needs.

     

    Also, that "sample" was written before we added DIRECTORY_HASH, which gives you an easier way to sort files into an order that is appropriate for deleting directories.
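
    As a rough illustration of the LIST-plus-script approach, the rules below sketch how candidate directories and files might be listed in an order suited to deletion. The list name, rule name, path filter, and 30-day cutoff are all hypothetical, and the empty EXEC string assumes the list is only written out for a separate script to process rather than acted on by the policy engine itself.

    ```sql
    /* Sketch only: names, path, and the 30-day cutoff are assumptions.     */
    /* An external list with an empty EXEC string just records candidates.  */
    RULE EXTERNAL LIST 'old_entries' EXEC ''

    /* DIRECTORIES_PLUS makes directories candidates as well as files.      */
    /* WEIGHT(DIRECTORY_HASH) groups entries by directory, deeper paths     */
    /* first, so contents are listed before the directory that holds them.  */
    RULE 'list_old_entries' LIST 'old_entries' DIRECTORIES_PLUS
      WEIGHT (DIRECTORY_HASH)
      WHERE PATH_NAME LIKE '/tmp/%'
        AND (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 30
    ```

    The script that consumes the list would still have to remove files before attempting rmdir on each directory, and should be tried on a throwaway file system first.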

    Updated on 2014-06-18T12:03:10Z by marc_of_GPFS