botemout

Pinned topic Policy rule to migrate files to slower disk questions

‏2012-10-05T01:10:09Z |
Greetings,

I have a large collection of older RAID arrays which support a feature the vendor calls automaid, i.e., disks spin down when not used. We have many files which are infrequently accessed, and I'd like to put them in storage pools built from these disks with this feature enabled. It will look like this:
  • a filesystem named FS is created using a large-ish number of SSD LUNs for the system storage pool, marked metadataOnly
  • each RAID array's LUNs are added to its own storage pool as dataOnly
  • some of these arrays will be set to spin down when not used
  • a policy rule will be written which will migrate unused data to the spin-down storage pool

I think this is all pretty standard. What I'm not clear about is how the filesystem changes when a migration occurs. For instance, let's say the filesystem tree looks like:
/gpfs/fs/raid1
/gpfs/fs/raid2
/gpfs/fs/raid3_automaid
Each of raid1, raid2 and raid3_automaid is a fileset (linked with mmlinkfileset) that corresponds to a storage pool whose NSDs come from the same RAID array.

The reason I want each fileset to represent a separate storage pool is that I want to limit the impact on the filesystem as a whole of losing any one storage array. This data will not be replicated or backed up. The LUNs will be RAID 6, and I trust that the controllers won't corrupt the data and that we won't lose more than 2 drives per array at once. Yes, I'm being trusting, but by definition these data could be reacquired, inconvenient though that would be.

So, I run my policy rule, which traverses raid1 and raid2 looking for files that haven't been accessed in more than 3 months; any files found are migrated to raid3_automaid. Let's say before the run I had a file:
/gpfs/fs/raid1/not_used_file.dat
which gets migrated. Will an 'ls /gpfs/fs/raid1/not_used_file.dat' still find the file? Or will it disappear from there and only be found at /gpfs/fs/raid3_automaid?

The answer I want to hear is that its name (pointer/inode) will stay in raid1 but the actual data will move to raid3*. (Perhaps it would be a good idea for me to not link the raid3* storage pool so that users never traverse into it?)

Thanks much
JR
  • SystemAdmin

    Re: Policy rule to migrate files to slower disk questions

    ‏2012-10-05T14:01:38Z  
    Generally speaking:

    The MIGRATE rule of mmapplypolicy never changes the pathname of a file. Disk pools and filesets are "orthogonal" constructs in GPFS.
    HOWEVER, you can use the SET POOL rule of policy to assign the data of all the files within a particular fileset to a particular pool,
    and later change that assignment for "old files" by running mmapplypolicy with an appropriate MIGRATE rule.
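
    A minimal sketch of what such rules might look like, assuming storage pools named raid1, raid2 and raid3_automaid and filesets named raid1 and raid2 (names borrowed from the question; the rule names and the 90-day threshold are illustrative, not anything the original poster has shown):

    ```
    /* Placement: data of new files in each fileset goes to that array's pool. */
    /* Pool and fileset names are assumptions taken from the question.         */
    RULE 'place_raid1' SET POOL 'raid1' FOR FILESET ('raid1')
    RULE 'place_raid2' SET POOL 'raid2' FOR FILESET ('raid2')
    RULE 'default'     SET POOL 'raid1'

    /* Migration: move data not accessed for roughly 3 months to the spin-down pool. */
    RULE 'old_raid1' MIGRATE FROM POOL 'raid1' TO POOL 'raid3_automaid'
      WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 90
    RULE 'old_raid2' MIGRATE FROM POOL 'raid2' TO POOL 'raid3_automaid'
      WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 90
    ```

    Placement (SET POOL) rules take effect once installed with mmchpolicy, while MIGRATE rules are evaluated when mmapplypolicy is run. Running mmapplypolicy with `-I test` first should show which files would be chosen without moving any data. After a real run, `mmlsattr -L` on a migrated file should report the new storage pool name while the pathname stays the same.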

    In your example, it seems to me that there is no need to define a fileset or directory called raid3_automaid.

    If you would please show us your policy rules and `mmlsfileset xxx -L` perhaps I could say more to your question.
  • SystemAdmin

    Re: Policy rule to migrate files to slower disk questions

    ‏2012-10-05T21:56:20Z  
    As Marc said, filesets and pools are orthogonal. That is, a file's pool assignment doesn't necessarily have any bearing on its path, so running mmapplypolicy won't cause any path changes.

    The fileset structure you're suggesting could be used to implement a poor man's pool migration: if filesetA is tied (through placement policy) to poolA, and filesetB is tied to poolB, then "mv filesetA/file1 filesetB" would not only change the file's path, but also migrate its blocks to poolB. Had there been no proper way to do block migration in GPFS, you'd need something along those lines. However, a proper mechanism does exist (mmapplypolicy), so you don't need multiple filesets for what you're trying to accomplish. Note that since all metadata stays in the system pool, a catastrophic failure of a disk in one of the data storage pools will only affect the availability of file data residing in that pool. You'll be able to use mmfileid to compile a list of affected files, so there's no need to rely on file path segregation.

    yuri