Topic
  • 4 replies
  • Latest Post - ‏2012-12-03T21:24:05Z by HajoEhlers
ttavasc
5 Posts

Pinned topic GPFS Cache Performance

‏2012-08-23T20:27:50Z |
Hello all,

I'm wondering if there is a mechanism within GPFS v3.4 that can be used to monitor its cache performance, for example what percentage of requests are satisfied from cache as opposed to physical storage. While we have run GPFS for a number of years on both AIX and Linux, this is the first time we've encountered poor application performance that required tuning within GPFS. In this case a Java application was using a native OS file stat to retrieve the size attribute of a particular file, and each call was taking 300-400ms. We ultimately resolved the issue by increasing the maxFilesToCache value, which also increased maxStatCache. This reduced the retrieval time to about 2-3ms.
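For anyone who wants to reproduce this kind of measurement outside the application, the following is a minimal sketch that times stat calls over a set of files. It is not a GPFS tool: the function names `time_stat_calls` and `summarize` are made up for illustration, and it only uses Python's standard `os.stat` and `time.perf_counter`. It assumes you pass it paths that live on the GPFS file system you care about.

```python
import os
import statistics
import time


def time_stat_calls(paths):
    """Return one os.stat() latency per path, in milliseconds.

    Hypothetical helper for illustration; it measures wall-clock time
    around the same kind of metadata lookup a file-size query performs.
    """
    latencies_ms = []
    for path in paths:
        start = time.perf_counter()
        os.stat(path)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def summarize(latencies_ms):
    """Reduce a list of latencies to min / median / max."""
    return {
        "min_ms": min(latencies_ms),
        "median_ms": statistics.median(latencies_ms),
        "max_ms": max(latencies_ms),
    }


if __name__ == "__main__":
    import sys

    # Usage: python stat_timing.py /gpfs/some/file1 /gpfs/some/file2 ...
    if len(sys.argv) > 1:
        print(summarize(time_stat_calls(sys.argv[1:])))
```

Repeating the stat on a single file mostly measures the cached case after the first call, so timing many distinct files is likely to be more representative of the behaviour described above.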

At the moment I'm most interested in understanding what tool(s) might be available, rather than the specifics of the poor performance mentioned above. I've been reviewing chapter 6 of the Advanced Administration Guide, which details mmpmon, and we pulled down and experimented with the gpfs_perf.pl script found on the GPFS Wiki site. However, I don't see anything that specifically addresses GPFS cache performance, and since the GPFS cache clearly played a role in the above example, I'd like to know if we can monitor this more proactively. I am in the process of signing up for a formal GPFS tuning class, but given the schedule that is still a few months down the road.

I appreciate any feedback or pointers provided.

-Todd
  • FelipeKnop
    25 Posts

    Re: GPFS Cache Performance

    ‏2012-08-27T02:36:21Z  
    There does not seem to exist any supported mechanism to retrieve how many requests could be satisfied by either the pagepool or the "stat cache".

    I'll investigate more, in case I missed something.

    Felipe
  • ttavasc
    5 Posts

    Re: GPFS Cache Performance

    ‏2012-08-27T22:46:31Z  
    Thanks Felipe. If I have time this week I am thinking about opening a PMR with IBM Support to get their feedback. This particular application is also posing some metadata performance issues. It uses multiple copies of a repository with a multi-level hash directory structure of 256 directories x 256 directories. This structure seems to be causing a lot of grief for our backup software: nightly scans take upwards of 2 or 3 hours before the software actually gets to the point where it's writing data to the backup device.

    -Todd
  • SystemAdmin
    2092 Posts

    Re: GPFS Cache Performance

    ‏2012-12-03T16:33:00Z  
    Perhaps the SNMP monitoring will help. According to https://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp?topic=%2Fcom.ibm.cluster.gpfs.v3r50-3.gpfs200.doc%2Fbl1adv_filesystemperf.htm , there are OIDs for stat cache hits / stat cache misses, # of reads (i.e. physical reads), and # of caches (i.e. cache hits).
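If hit and miss counters like these can be polled, a hit ratio follows from simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, it does not query SNMP itself, and it assumes the counters are cumulative, so that the ratio for an interval is taken from the difference between two samples.

```python
def hit_ratio(hits, misses):
    """Fraction of lookups served from cache, or None if there were no lookups."""
    total = hits + misses
    if total == 0:
        return None
    return hits / total


def interval_hit_ratio(prev, curr):
    """Hit ratio between two samples of (hits, misses) counters.

    Assumes the counters only ever increase. A decrease is treated as a
    counter reset (for example after a daemon restart) and yields None.
    """
    delta_hits = curr[0] - prev[0]
    delta_misses = curr[1] - prev[1]
    if delta_hits < 0 or delta_misses < 0:
        return None
    return hit_ratio(delta_hits, delta_misses)
```

For example, samples of (100, 50) followed by (190, 60) mean 90 hits and 10 misses in the interval, i.e. a ratio of 0.9. Tracking the per-interval value rather than the lifetime total makes a sudden drop in cache effectiveness easier to notice.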
  • HajoEhlers
    253 Posts

    Re: GPFS Cache Performance

    ‏2012-12-03T21:24:05Z  
    I quite like 'mmdiag':

    
    mmdiag --stats
    
    for cache statistics. (It helped solve an issue where a user had very large directories, >= 100k files.)

    
    mmdiag --iohist
    
    gives a list of up to 512 data blocks (r/w, data, metadata, inode). If you see inode lookups all the time, maybe something strange is going on.

    cheers
    Hajo