Topic
  • 4 replies
  • Latest Post - ‏2014-06-02T16:22:16Z by marc_of_GPFS
oester
171 Posts

FILE_HEAT - does it actually do what it says?

‏2014-03-31T12:43:14Z |

I've been running some tests using File Heat and policy migration, and it doesn't seem to behave the way the documentation says it does. First, I scanned my file systems using a simple policy to dump out file sizes and heat values:

define(DISPLAY_NULL,[CASE WHEN ($1) IS NULL THEN '_NULL_' ELSE varchar($1) END])
 
rule fh1 external list 'fh' exec ''
rule fh2 list 'fh' weight(FILE_HEAT) show( DISPLAY_NULL(FILE_HEAT) || '|' || varchar(file_size) )

Using this, I could (supposedly) easily see the most active files and how big they are. What this told me was that most of the files in the file system had a heat value of "0" every day, and that an SSD tier of 100GB would easily hold most of the active files in a 30TB file system. This seemed strange, but I'll go with it. I then created an SSD tier and ran a policy like this:

rule grpdef GROUP POOL gpool IS ssd LIMIT(90) THEN Plevel1
rule repack MIGRATE FROM POOL gpool TO POOL gpool WEIGHT(FILE_HEAT)

Here Plevel1 is my default data pool. This seems to run as expected, and I see files being moved to the ssd pool. However, the bulk of the IO still seems to be going to the disk (Plevel1) pool. If I look at the policy that migrates the data, I see this at execution time:

rule repack MIGRATE FROM POOL gpool TO POOL gpool WEIGHT(computeFileHeat(CURRENT_TIMESTAMP-ACCESS_TIME,xattr('gpfs.FileHeat'),KB_ALLOCATED))

Which leads me to believe that the FILE_HEAT movement is based on size as well as activity - can anyone confirm this? There is nothing comprehensive in the documentation that describes *exactly* how it all works. 

Is anyone successfully using FILE_HEAT to migrate data between pools?

  • fleers
    26 Posts

    Re: FILE_HEAT - does it actually do what it says?

    ‏2014-05-07T20:24:50Z  

    Hi oester - did you get any replies to this query off-list?  I'd be interested to hear experiences of other users of FILE_HEAT as I'm about to embark on some experimentation of my own.

     

     

  • oester
    171 Posts

    Re: FILE_HEAT - does it actually do what it says?

    ‏2014-05-07T20:32:42Z  

    No, I didn't - I thought someone from the IBM side would (should) respond. This is very powerful and poorly documented, IMHO. Since this posting, I've done about two months' worth of experimentation with a 30TB file system and a 2.4TB SSD tier. I'm placing new files on the SSD tier, then doing a nightly rebalance based on file heat values, leaving 30% of the SSD tier free for new files. I'm catching about 90% of the overall file system traffic in the SSD tier now, using this scenario. I also want to experiment with GPFS local cache on the NSD server in GPFS 4.1.
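    The nightly rebalance described above can be pictured as a greedy fill: sort files by heat, hottest first, and keep adding them to the SSD tier until a capacity limit (here 70%, leaving 30% free) is reached. The sketch below is only a rough illustration under that assumption. It is not how GPFS implements GROUP POOL or MIGRATE, the function name is hypothetical, and skipping files that do not fit (rather than stopping at the first one) is a simplification chosen for the example.

```python
def pick_files_for_fast_pool(files, pool_capacity_bytes, limit_percent=70):
    """Greedy sketch: choose the hottest files that fit under the pool limit.

    files: iterable of (path, heat, size_bytes) tuples.
    Returns (list of chosen paths, bytes used).
    Assumption: files that do not fit are skipped, cooler ones may still fit.
    """
    budget = pool_capacity_bytes * limit_percent / 100.0
    chosen, used = [], 0
    # Hottest files first, mirroring a WEIGHT(FILE_HEAT) ordering.
    for path, heat, size in sorted(files, key=lambda f: f[1], reverse=True):
        if used + size > budget:
            continue  # would exceed the limit, leave it on the slower pool
        chosen.append(path)
        used += size
    return chosen, used


if __name__ == "__main__":
    demo = [("a", 9.0, 60), ("b", 5.0, 30), ("c", 1.0, 10), ("d", 0.0, 5)]
    print(pick_files_for_fast_pool(demo, pool_capacity_bytes=100))
```

    With a large number of cold files (heat 0), a model like this also shows why a small fast tier can end up holding most of the active data.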

    Lots of other thoughts to share as well - too many for here. I will be talking at IBM Edge in a few weeks on this subject (GPFS, Tiering, and Flash/SSD) if you want to catch me then. Otherwise you can contact me directly: robert.oesterlin@nunace.com

     

    Bob Oesterlin

    Sr Storage Engineer, Nuance Communications, HPC Grid

  • dlmcnabb
    1012 Posts

    Re: FILE_HEAT - does it actually do what it says?

    ‏2014-05-07T22:32:55Z  

    I cannot answer about file heat. (Hardly anyone knows how it works. So you could become the expert!)

    GPFS local cache is a client-only function. It will do nothing on NSD servers, which know nothing about the files being used, just the blocks being accessed.

    The only caching on NSD servers is done by the GNR/GSS subsystem code since there can only be a single node handling all the traffic for a GNR vdisk at any one time.

  • marc_of_GPFS
    35 Posts

    Re: FILE_HEAT - does it actually do what it says?

    ‏2014-06-02T16:22:16Z  

    Okay, I am one of the IBM guys who knows a little something about FILE_HEAT and GROUP POOL. If you have specific questions, post them on this board and we will try to answer them.

    In the meantime, I advise the following. BEFORE deploying a set of policy rules that exploit these relatively new features, you have to get FILE_HEAT working and get some feel that it does work and how it works:

    1) Turn on FILE_HEAT tracking with the mmchconfig command. Refer to the GPFS Advanced Admin Guide (RTFineM), particularly the section with the heading "Tracking file access temperature within a storage pool". Start with the recommended values:

      mmchconfig fileheatperiodminutes=1440,fileheatlosspercent=10

    2) Experiment! First create a little policy rules file just as Bob (who I believe used to be an IBMer) suggested:

    define(DISPLAY_NULL,[CASE WHEN ($1) IS NULL THEN '_NULL_' ELSE varchar($1) END])
    rule fh2 list 'fh' weight(FILE_HEAT) show( DISPLAY_NULL(FILE_HEAT) || ' | ' || varchar(file_size) )

     Next create a few test files in a test directory:

      mkdir /gpfs/whatever/test; cd /gpfs/whatever/test

       dd if=/dev/urandom of=g bs=1G count=1 

    (urandom may be slow on your system, but I like to use that, to be sure nothing in the path between the command line and the disk is "optimizing" for a run of zeros or such.)

      cat g g g g g >5g

      cat g g g g g >5f

    And so forth to make several multi-gig files with "random" contents.

    Then "heat" up one or more of those files by reading them repeatedly:

     cat 5f 5f 5f 5f 5f 5f >/dev/null

     Run an mmapplypolicy report to see how hot your files have become.

      mmapplypolicy /gpfs/whatever/test -P /mypolicies/test-file-heat.policy -I test -L 2

    Notice that FILE_HEAT is normalized by file size, so that a 1 gig file read 5 times will be just about as "hot" as a 5 gig file read 5 times recently.

    The file heat is also averaged over the entire file, so even if the first few megs are "hot" the whole file may still be cool on average.
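    A toy model can make these two points concrete. The sketch below is an assumption for illustration only, not the actual computeFileHeat formula, which is not spelled out in this thread. It treats heat as bytes read divided by file size, and it assumes the fileheatlosspercent loss is applied as a smooth exponential decay over each fileheatperiodminutes window. The function names are hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def heat_from_reads(bytes_read, file_size_bytes):
    """Size-normalized heat: one full read of the file adds about 1.0.

    Assumption: heat is simply bytes read divided by file size.
    """
    if file_size_bytes <= 0:
        return 0.0
    return bytes_read / file_size_bytes


def decayed_heat(heat, elapsed_minutes, period_minutes=1440, loss_percent=10):
    """Heat left after elapsed_minutes.

    Assumption: loss_percent is lost per period_minutes, applied as a
    continuous exponential decay rather than in discrete steps.
    """
    keep_per_period = 1.0 - loss_percent / 100.0
    return heat * keep_per_period ** (elapsed_minutes / period_minutes)


if __name__ == "__main__":
    # A 1 GiB file read 5 times and a 5 GiB file read 5 times look equally hot.
    print(heat_from_reads(5 * GIB, 1 * GIB), heat_from_reads(25 * GIB, 5 * GIB))
    # Re-reading only the first 4 MiB of a 5 GiB file ten times barely registers.
    print(heat_from_reads(10 * 4 * MIB, 5 * GIB))
    # One day later, with the settings from step 1, 10% of the heat is gone.
    print(decayed_heat(5.0, elapsed_minutes=1440))
```

    Under this model, a file that is read partially but often can stay cooler than a file that was read in full a few times, which is worth keeping in mind when reading the report.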

    Well I just did that whilst writing this and here is some of the output:

    WEIGHT(7.928872) LIST 'fh' /abc/test/g SHOW(+7.92887155173308E+000 | 789617810)
    WEIGHT(4.974842) LIST 'fh' /abc/test/5f SHOW(+4.97484152756019E+000 | 3948089050)
    WEIGHT(0.000000) LIST 'fh' /abc/test/5g SHOW(_NULL_ | 3948089050)
     

    Hmmm... notice that g has 7.9 units of heat and is roughly 1.6 times as warm as 5f, because we read g 10 times a while ago but 5f only 5 times and somewhat more recently.

    Also 5g has no heat yet, because we only wrote it out once...
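
    Report lines like the ones above are easy to post-process, for example to rank files by heat and to add up how much space the files with any heat occupy. The following is a minimal sketch, assuming the lines look exactly like the three sample lines shown (WEIGHT(...) LIST 'fh' path SHOW(heat | size)). The regular expression and the helper names are hypothetical and may need adjusting for other mmapplypolicy output or for unusual path names.

```python
import re
from typing import NamedTuple, Optional


class HeatRecord(NamedTuple):
    path: str
    heat: Optional[float]  # None when the policy printed _NULL_
    size_bytes: int


# Assumed line shape, taken from the sample output in this thread:
# WEIGHT(7.928872) LIST 'fh' /abc/test/g SHOW(+7.92887155173308E+000 | 789617810)
LINE_RE = re.compile(
    r"WEIGHT\([^)]*\)\s+LIST\s+'[^']*'\s+(?P<path>.+?)\s+"
    r"SHOW\((?P<heat>[^\s|]+)\s*\|\s*(?P<size>\d+)\)\s*$"
)


def parse_line(line):
    """Return a HeatRecord, or None if the line is not a list record."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    heat_text = m.group("heat")
    heat = None if heat_text == "_NULL_" else float(heat_text)
    return HeatRecord(m.group("path"), heat, int(m.group("size")))


def rank_by_heat(lines):
    """Parse all matching lines and sort them hottest first (NULL counts as 0)."""
    records = [r for r in map(parse_line, lines) if r is not None]
    return sorted(records, key=lambda r: r.heat or 0.0, reverse=True)


def hot_bytes(records):
    """Total size of files that have a non-zero, non-NULL heat."""
    return sum(r.size_bytes for r in records if r.heat)


if __name__ == "__main__":
    sample = [
        "WEIGHT(7.928872) LIST 'fh' /abc/test/g SHOW(+7.92887155173308E+000 | 789617810)",
        "WEIGHT(4.974842) LIST 'fh' /abc/test/5f SHOW(+4.97484152756019E+000 | 3948089050)",
        "WEIGHT(0.000000) LIST 'fh' /abc/test/5g SHOW(_NULL_ | 3948089050)",
    ]
    ranked = rank_by_heat(sample)
    for rec in ranked:
        print(rec)
    print("bytes with heat:", hot_bytes(ranked))
```

    The total from hot_bytes gives a first estimate of how large a fast tier would need to be to hold everything that has been read recently.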