Topic
  • 11 replies
  • Latest Post - ‏2012-12-12T10:46:23Z by Tucks
Tucks
Tucks
78 Posts

Pinned topic GPFS sorting affects speed of HSM recall - is it possible to intervene?

‏2012-12-10T10:31:04Z |
Hello

I've discovered a performance issue with our HSM workflow.

When we hit an individual stgpool THRESHOLD callback we generate a list of candidates to migrate to TSM LTO5 via an oldest first policy.

E.G.

RULE 'yadayada' MIGRATE FROM POOL 'mypool' THRESHOLD(95,85,70) WEIGHT((DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME))) TO POOL 'hsmpool' WHERE some matching rules..

The above creates a candidate list which is sorted according to WEIGHT((DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)))
Candidates are supplied to TSM in -B sized batches in the order of the sort.

To recall we generate a file list via:

find /path/to/dir -type f -print > filelist
dsmrecall -Detail -FI=filelist

However, on recall, tape shoe-shining occurs because the filelist does not reflect the order in which the files were migrated to tape, i.e. the sort order generated by GPFS.

Quandary:

1. Can we post-process the sort of GPFS candidates to perform a further sort -n (we have sequences of images) before the batches are passed to dsmmigrate?

2. Alternatively, we could sort our find /path/to/dir results to match the GPFS sort. How do we guarantee a post-sort on the find would match the sorted order of the GPFS supplied candidate list?

3. I have a gut feeling (i.e. I am still looking into this) that this may be compounded by the fact that dsmmigrate is fed in batches of -B size, and different nodes could supply these batches out of order. Perhaps defer is the way to go?

The crux of it is that in order to have efficient streaming off tape with no/minimal shoe-shining, the files must be dsmrecalled in the order they were written to tape.
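
For question 1, a minimal sketch of what a numeric post-sort of a candidate list could look like. It assumes the candidates are available as plain pathnames and that the frame number is embedded in the file name. `natural_key` and `sort_candidates` are illustrative names, not part of GPFS or TSM, and the example paths are made up.

```python
import re

_DIGIT_RUNS = re.compile(r"(\d+)")


def natural_key(path):
    """Build a sort key in which digit runs compare as integers.

    re.split with a capturing group always alternates text, digits, text, ...
    so odd positions hold the digit runs and even positions hold plain text.
    """
    parts = _DIGIT_RUNS.split(path)
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]


def sort_candidates(paths):
    """Return the paths ordered so that frame 9 comes before frame 10."""
    return sorted(paths, key=natural_key)


if __name__ == "__main__":
    # Hypothetical image-sequence names, for illustration only.
    frames = ["/proj/shot/img.10.dpx", "/proj/shot/img.9.dpx", "/proj/shot/img.1.dpx"]
    for name in sort_candidates(frames):
        print(name)
```

This only fixes the ordering inside one list. It does not by itself guarantee that the batches reach dsmmigrate in that order.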

Thanks.
Updated on 2012-12-12T10:46:23Z by Tucks
  • Tucks
    Tucks
    78 Posts

    Re: GPFS sorting affects speed of HSM recall - is it possible to intervene?

    ‏2012-12-10T13:59:15Z  
    So.. investigations continue...

    Looking at the supplied mmpolicyExec-hsm.script, I saw it had a RECALL option.

    Therefore:
    
    RULE 'default' EXTERNAL POOL 'hsmpool' EXEC '/var/mmfs/etc/mmpolicyExec-hsm.script' OPTS 'RECALL -v'
    RULE 'mypool recall' MIGRATE FROM POOL 'hsmpool' WEIGHT(DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) TO POOL 'mypool' WHERE PATH_NAME LIKE 'topdir%' AND MISC_ATTRIBUTES LIKE '%V%'
    

    Then:
    
    mmapplypolicy /path/to/top/dir -P policyfile -M topdir=/path/to/top/dir -I yes
    


    Presumption being that the weighted file list will be the same order as the one that was used to migrate the files to tape.

    However - what if the files under the /path/to/top/dir are in different storage pools?
    You'd need to know which pool the file should be migrated back to in advance.

    You could write a migration rule for the files in /path/to/top/dir for each known storage pool.
    However, that would invoke one dsmrecall for every storage pool.
    I can't see that being at all efficient in terms of optimum linear retrieval.

    Can anyone offer any improvements on this? Even wild suggestions to investigate are appreciated.
  • SystemAdmin
    SystemAdmin
    2092 Posts

    Re: GPFS sorting affects speed of HSM recall - is it possible to intervene?

    ‏2012-12-10T16:58:19Z  
    As you may know, I'm the guy who wrote most of mmapplypolicy and still maintain/enhance it.

    I understand you want this to work well together, but let's separate the concerns and questions:

    a) I believe that when you specify FROM POOL 'some external pool' TO POOL 'some gpfs pool' then any files that are ultimately chosen by this rule
    are first tagged as belonging to 'some gpfs pool' and then HSM recalled. So you shouldn't have to worry about moving the data twice.
    (Please verify this!)

    b) The WEIGHT(DAYS ...) clause causes the "oldest" files to be considered first for migration. In your example, I don't see a THRESHOLD clause, so if you were indeed wanting every file matching the WHERE clause to be recalled, and didn't particularly care which were processed first, you could leave out the WEIGHT clause...

    c) BUT it seems you have a (good!) idea to try to order the files so that TSM can avoid thrashing tape mounts and such...
    However, here's where my knowledge is weak... I don't know if there is any reliable way to predict which files should be lumped together into TSM recall requests, based solely on POSIX file attributes. In fact I rather doubt it! In theory, when you present one dsmrecall command with a large filelist to TSM, it optimizes the recall for that filelist... So maybe you want to increase the -B parameter to mmapplypolicy?

    d) OTOH, if you could compute or discover an attribute that corresponds to something like TapeVolumeOnWhichIAmHSMed, it would make sense to sort on that.

    e) If sort by WEIGHT() does not do quite what you want, you may want to investigate the `-I prepare` and `-r` options of mmapplypolicy.

    f) If you would like to speed up mmapplypolicy by eliminating or short circuiting the strict sort by WEIGHT, please try the `--choice-algorithm fast` option.
  • sberman
    sberman
    61 Posts

    Re: GPFS sorting affects speed of HSM recall - is it possible to intervene?

    ‏2012-12-10T20:17:17Z  
    In addition to all the great answers Marc just gave, I will throw in a couple of points that pertain to what I know about HSM behavior in TSM. The recall of files in a list does frequently cause this behavior you describe as "shoe shining" the tape. We have explored with TSM development the idea that a list of files submitted to TSM could be re-ordered in such a way that it would minimize the tape thrash. I believe investigation of such a feature is still ongoing. If you had reason to believe that the migration order is also the ideal recall order you could save the list generated in the past as Marc suggested with the -I and -r flags to mmapplypolicy and refer back to them when generating new recall requests.
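
A minimal sketch of the "save the list and refer back to it" idea. It assumes the migration-time candidate list has already been reduced to plain pathnames in the order they were handed to dsmmigrate. The list files that mmapplypolicy writes carry more than the pathname per record, so some parsing would be needed first. `order_by_saved_list` is a hypothetical helper, not an existing tool.

```python
def order_by_saved_list(recall_paths, saved_order):
    """Reorder recall_paths to follow the order recorded at migration time.

    Paths that do not appear in saved_order keep their relative input order
    and are placed after all known paths (sorted() is stable).
    """
    position = {}
    for index, path in enumerate(saved_order):
        # Keep the first position if a path was recorded more than once.
        position.setdefault(path, index)
    unknown = len(saved_order)
    return sorted(recall_paths, key=lambda path: position.get(path, unknown))


if __name__ == "__main__":
    # Made-up paths: the order in which files were (hypothetically) migrated.
    saved = ["/fs/a/img.3", "/fs/a/img.1", "/fs/a/img.2"]
    wanted = ["/fs/a/img.1", "/fs/a/img.2", "/fs/a/new", "/fs/a/img.3"]
    print(order_by_saved_list(wanted, saved))
```

Whether the migration order really is the best recall order is exactly the open question here, so this is only useful if testing confirms it.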
  • SystemAdmin
    SystemAdmin
    2092 Posts

    Re: GPFS sorting affects speed of HSM recall - is it possible to intervene?

    ‏2012-12-10T20:48:24Z  
    It would be nice if TSM/HSM set an extended attribute for each migrated file, that would tell you to which tape volume it copied the file and the approximate position on the tape.
    I think TSM/HSM does set some EAs, but perhaps they are not documented?
  • Tucks
    Tucks
    78 Posts

    Re: GPFS sorting affects speed of HSM recall - is it possible to intervene?

    ‏2012-12-10T22:47:30Z  
    Re:

    a) So you're saying that because the external pool migration (in effect a recall in this instance) just calls dsmmigrate then in theory you should be able to migrate from an hsmpool to any (known?) pool name and the file should arrive back in its original pool?

    b) The THRESHOLD was in the original migrate to tape policy, but was not required for a recall. My theory is that if the migrate to tape rule generated the files to tape in a set order, then a similar policy - minus the threshold should generate the same list in the same order for a recall. Hence TSM should optimize the given candidates identically each time. I'm not sure this works in practice. I need to perform more extensive testing.

    c) Tape thrashing is the bane of all HSM workflows. I know they've done some work on the optimization in TSM 6.3+, but I'm currently running 6.2. I can say that I did an unsorted list of 200,000 files versus a sort -n of 200,000 files and the results were very disparaging. I'm not convinced TSM particularly performed any intelligent optimization. Still recalling several days later. The disadvantage of increasing -B is the 'hogging' of tape drives.

    d) I think this would take too long to compute from the user end. A call out to db2 / TSM API would be required per file. IIRC the computation of that is quite slow.

    e) We have upwards of 200m migrated files. That's a lot of logs to keep around. I think b) is a better methodology, if it is reproducible. That said - I've not calculated what would occur on recall, reconciliation, change of HSM policy etc.

    I think we need to speak to the TSM dev team at length regarding the fine detail of the dsmmigrate and dsmrecall engines.
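
One thing worth checking when testing the theory in b): the weight in the rule is a difference of whole-day counts, so files last accessed on the same day get the same weight, and the weight alone cannot fix their relative order. The sketch below only reproduces that arithmetic. It assumes a whole-day count measured from the Unix epoch in UTC (the starting point cancels out in the difference), and it says nothing about how mmapplypolicy itself orders ties.

```python
from collections import defaultdict
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400


def days(moment):
    """Whole days since the Unix epoch (UTC) for a timezone-aware datetime."""
    return int(moment.timestamp() // SECONDS_PER_DAY)


def weight(now, access_time):
    """Mirror of DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME) from the rule."""
    return days(now) - days(access_time)


def tied_groups(now, access_times):
    """Map each weight shared by more than one path to the paths that share it."""
    groups = defaultdict(list)
    for path, accessed in access_times.items():
        groups[weight(now, accessed)].append(path)
    return {w: sorted(paths) for w, paths in groups.items() if len(paths) > 1}


if __name__ == "__main__":
    utc = timezone.utc
    now = datetime(2012, 12, 10, 12, 0, tzinfo=utc)
    # Made-up access times: two files touched on the same day, one a day earlier.
    sample = {
        "frame.0001": datetime(2012, 12, 1, 8, 0, tzinfo=utc),
        "frame.0002": datetime(2012, 12, 1, 23, 0, tzinfo=utc),
        "frame.0003": datetime(2012, 11, 30, 23, 59, tzinfo=utc),
    }
    print(tied_groups(now, sample))
```

With image sequences written in bulk, most frames of a sequence are likely to share a weight, so a secondary sort key may be needed to make the order reproducible.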
  • Tucks
    Tucks
    78 Posts

    Re: GPFS sorting affects speed of HSM recall - is it possible to intervene?

    ‏2012-12-10T22:59:38Z  
    I also think TSM has some EAs.
    Can you co-ordinate with Dean H. of Almaden?
    There's a couple we wanted to add/query for offline file support.

    • tape vol
    • tsm stgpool name (backup, archive, hsm pools?)
    • block

    If these were exposed (perhaps all the EAs?) to the GPFS policy engine that would be great.

    
    RULE 'blah' DO SOMETHING WHERE AND IS IN TSM_STGPOOL_NAME 'mytest'
    


    Thinking further ahead, things like that might help to integrate LTFS, GPFS and TSM.
    Perhaps ping me off-list?

    Back in the office Wed.
    Will test the theories then.
  • Tucks
    Tucks
    78 Posts

    Re: GPFS sorting affects speed of HSM recall - is it possible to intervene?

    ‏2012-12-10T23:03:15Z  
    It's late. Correction to a): ... the external pool migration (in effect a recall in this instance) just calls dsmrecall ..
  • SystemAdmin
    SystemAdmin
    2092 Posts

    Re: GPFS sorting affects speed of HSM recall - is it possible to intervene?

    ‏2012-12-10T23:49:53Z  
    a) When you migrate from an external pool to a gpfs pool 'abc', first mmapplypolicy marks the inode to say data goes into 'abc', if not already so marked.
    This is equivalent to mmchattr -P 'abc' AND then it calls your script, which is typically a wrapper for dsmrecall.

    EAs) All EAs can be accessed from within policy rules via the XATTR() and SETXATTR() functions.
    You are root, so be careful with SetXAttr()! RTFM! IIRC, SetXattr() was added in release 3.5.
  • SystemAdmin
    SystemAdmin
    2092 Posts

    Re: GPFS sorting affects speed of HSM recall - is it possible to intervene?

    ‏2012-12-11T14:07:23Z  
    Hi,

    Since version 6.3, TSM for Space Management provides the function "tape optimized recall". The tape optimized recall processing performs the recall in two steps:

    1. Analyze the input file list and order it. Create output file lists: one per tape (all files in optimized order), one for non-tape media, and one collection file that holds meta information about the other file lists (number of files to be recalled, number of bytes to be recalled). When running in prepare mode, only step 1 is processed.
    2. Recall the data. If the recall was not started in prepare mode, all tapes are recalled in optimized file order. If the recall was started in prepare mode, you can now take the individual tape-cartridge-specific file lists and recall them. Or you can modify the collection file (for example, change the order of the tape cartridges to recall the most important data first) and use the collection file as input for the recall.

    The tape optimized recall doesn't integrate transparent recalls in the current recall queue. The transparent recall to a tape-cartridge that is already in use will wait until the tape optimized recall to this cartridge has finished.

    Greetings, Dominic.
    p.s. See also TSM Information Center: http://pic.dhe.ibm.com/infocenter/tsminfo/v6r4/index.jsp?topic=%2Fcom.ibm.itsm.client.doc%2Ft_protect_wf.html
    search for "optimized tape recall"
  • SystemAdmin
    SystemAdmin
    2092 Posts

    Re: GPFS sorting affects speed of HSM recall - is it possible to intervene?

    ‏2012-12-11T17:31:25Z  
    Thanks for the explanation and pointers, Dominic!

    Tucks: Just to get a sense of the magnitude of the problem, about how many files might a single mmapplypolicy job recall? 10,000? 100,000? 1 million, 10 million, 100 million?

    Dominic: Is the dsmrecall command capable of reading an input filelist of 100 million pathnames and splitting it into optimized-for-tape-recall sublists?

    Tucks: Assuming that is practical from a TSM point of view, you may like to consider:

    a) using mmapplypolicy to build a (potentially) very large file list of files to recall.
    b) using dsmrecall with -preview to plan the optimized recalls from tape.
    c) invoking dsmrecall to do the recalls, preferably as multiple parallel calls from multiple nodes.
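
For step c), a minimal sketch of one way to spread per-tape file lists over several nodes so that each cartridge is handled by exactly one node. It assumes the planning step has produced one list per tape and that the number of files in each list is known. The tape labels, node names and `assign_tape_lists` are all made up for illustration, and the sketch does not call dsmrecall itself.

```python
import heapq


def assign_tape_lists(tape_file_counts, nodes):
    """Greedy balancing: give the largest remaining tape list to the
    node that currently has the fewest files assigned.

    tape_file_counts maps a tape label to the number of files on its list.
    Returns a dict mapping each node to the tape labels it should recall.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    plan = {node: [] for node in nodes}
    # Heap of (files assigned so far, node name); the smallest load pops first.
    load_heap = [(0, node) for node in nodes]
    heapq.heapify(load_heap)
    # Largest lists first; the label breaks ties so the result is deterministic.
    by_size = sorted(tape_file_counts.items(), key=lambda item: (-item[1], item[0]))
    for tape, count in by_size:
        load, node = heapq.heappop(load_heap)
        plan[node].append(tape)
        heapq.heappush(load_heap, (load + count, node))
    return plan


if __name__ == "__main__":
    # Hypothetical per-tape file counts and node names.
    counts = {"T00001": 500, "T00002": 300, "T00003": 200, "T00004": 100}
    print(assign_tape_lists(counts, ["node1", "node2"]))
```

Balancing by file count is a simplification. Balancing by bytes to be recalled may be closer to the actual drive time.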
  • Tucks
    Tucks
    78 Posts

    Re: GPFS sorting affects speed of HSM recall - is it possible to intervene?

    ‏2012-12-12T10:46:23Z  
    Will be plugging away at this again today.
    However - noticed a slight flaw in the way migrations / recalls occur between GPFS and TSM, so slight tangent which I thought I'd raise as it is relevant to this workflow.

    According to the docs preservelastaccessdate=YES should be set when using HSM. This makes a lot of sense and works well.

    However, because this option is set, when a file is processed via dsmrecall rather than directly accessed from an application, the access date is not updated.
    This means that if:
    • the filesystem is near its threshold
    • and dsmrecall pushes it over that threshold
    • and you select candidates with a WEIGHT(ACCESS_AGE) strategy

    then the recalled data can be the first (or among the first) to be considered for migration back to tape.
    I.E. I spent 2 days doing a recall and as of last night it's all migrated again. Fun!

    Ping-pong.

    So either:
    • the supplied example migration script should be updated to touch every file it recalls after dsmrecall has completed
    • dsmrecall should ignore the preservelastaccessdate option
    • TSM should have an extra option updatelastaccessdateonrecall=YES
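
A minimal sketch of the first option, assuming it runs only after dsmrecall has completed for the listed files and that updating the access time alone is enough to move them out of the oldest-first window. It sets the access time to the current time and leaves the modification time untouched. `bump_access_time` and `bump_all` are hypothetical helpers, not part of the supplied script.

```python
import os
import time


def bump_access_time(path, now_ns=None):
    """Set the access time of path to now_ns, keeping its modification time."""
    if now_ns is None:
        now_ns = time.time_ns()
    mtime_ns = os.stat(path).st_mtime_ns
    os.utime(path, ns=(now_ns, mtime_ns))


def bump_all(paths):
    """Update the access time of every path; return the paths that failed."""
    failed = []
    for path in paths:
        try:
            bump_access_time(path)
        except OSError:
            # For example: the file vanished or permission was denied.
            failed.append(path)
    return failed


if __name__ == "__main__":
    import sys

    # Usage: python bump_atime.py filelist   (one pathname per line)
    with open(sys.argv[1]) as handle:
        recalled = [line.rstrip("\n") for line in handle if line.strip()]
    for path in bump_all(recalled):
        print("could not update:", path, file=sys.stderr)
```

Only the access time is changed so that anything keyed on the modification time, such as incremental backup, is not disturbed by the recall.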