Down In The Dumps?
MartinPacker
That’s such a horrible pun I must’ve used it before. If so, sorry (but not very).
This post follows on from Enigma And Variations Of A Memory Kind in a way. In that post I mentioned DUMPSRV, in almost a throwaway fashion: I happened to notice the memory usage in SMF 30 by DUMPSRV grew at just the point free memory took a dip.
This post takes that idea and extends it a little - and I think it might be something you want in your everyday reporting.
I ran a query against data from all the customer’s LPARs in one plex, using SMF 30 data for DUMPSRV: I pulled out the hours when DUMPSRV CPU was more than 0.1% of a processor, printing the memory used, blocks transferred (think “I/O traffic”) and CPU. This highlighted that quite a lot of dumping happened across the LPARs, sometimes simultaneously on several systems. It made me think that “dump containment” is quite a big issue for this customer.
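To make the filter concrete, here is a minimal Python sketch of that query. It is not the code I actually ran: the `HourlyRecord` layout, its field names and the helper functions are all hypothetical, and it assumes the SMF 30 data has already been summarized to one row per address space per system per hour.

```python
from dataclasses import dataclass

SECONDS_PER_HOUR = 3600.0
CPU_THRESHOLD = 0.001  # 0.1% of one processor


@dataclass
class HourlyRecord:
    """One hour of summarized SMF 30 data for one address space (hypothetical layout)."""
    system: str
    hour: str
    jobname: str
    cpu_seconds: float
    memory_mb: float
    blocks_transferred: int


def dumpsrv_busy_hours(records, threshold=CPU_THRESHOLD):
    """Return (record, cpu_fraction) pairs for hours where DUMPSRV exceeded the CPU threshold."""
    busy = []
    for rec in records:
        if rec.jobname != "DUMPSRV":
            continue
        # Fraction of a single processor used across the hour.
        cpu_fraction = rec.cpu_seconds / SECONDS_PER_HOUR
        if cpu_fraction > threshold:
            busy.append((rec, cpu_fraction))
    return busy


def simultaneous_hours(busy):
    """Map each hour to the systems that were dumping, keeping hours with more than one system."""
    by_hour = {}
    for rec, _ in busy:
        by_hour.setdefault(rec.hour, set()).add(rec.system)
    return {hour: sorted(systems) for hour, systems in by_hour.items() if len(systems) > 1}


def print_report(busy):
    """Print memory used, blocks transferred and CPU for each busy hour."""
    for rec, cpu_fraction in busy:
        print(f"{rec.system} {rec.hour} mem={rec.memory_mb:.0f}MB "
              f"blocks={rec.blocks_transferred} cpu={cpu_fraction:.2%}")
```

The second helper is there because simultaneous dumping on several systems was the thing that stood out; grouping the busy hours by hour makes that visible at a glance.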
There are some issues with this approach:
One thing that is worth working into the reporting is what happened to free memory at that point. If it was driven into the ground that’s a sign you need to take dumping seriously.
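As a rough illustration of working free memory into the report, the sketch below annotates each dump hour with the lowest free memory seen in that hour. Everything here is an assumption made for illustration: the function name, the `(system, hour)` keys, the idea that you already have a per-hour minimum free memory figure from your own reporting, and the 1024 MB floor, which is an arbitrary placeholder rather than a recommendation.

```python
def annotate_with_free_memory(dump_hours, min_free_mb, floor_mb=1024.0):
    """Attach the hour's minimum free memory to each (system, hour) where DUMPSRV was busy.

    dump_hours  -- iterable of (system, hour) tuples (hypothetical input)
    min_free_mb -- dict mapping (system, hour) to minimum free memory in MB
    floor_mb    -- illustrative threshold below which memory counts as "driven low"
    """
    report = []
    for system, hour in dump_hours:
        free = min_free_mb.get((system, hour))
        if free is None:
            status = "no free-memory data"
        elif free < floor_mb:
            status = "free memory driven low"
        else:
            status = "ok"
        report.append((system, hour, free, status))
    return report


if __name__ == "__main__":
    # Made-up numbers, purely to show the shape of the output.
    hours = [("SYSA", "09"), ("SYSB", "09")]
    free = {("SYSA", "09"): 300.0, ("SYSB", "09"): 6000.0}
    for row in annotate_with_free_memory(hours, free):
        print(row)
```

Any hour that comes back flagged as driven low is the one to look at first.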
As with all such things, it’s a matter of priority as to whether I write an “RDUMPSRV” REXX EXEC to detect this sort of thing. It wouldn’t take long.
More to the point, I worked up this post from the one-liner in the other one because I think it’s a technique worth thinking about. If you’re a Performance person it might not be obvious, but you really do want to know how prevalent dumping is, and about substantial dump occurrences in particular. And if dumping does happen you’ll certainly want to be prepared to handle it in ways I’ve mentioned before - such as adequate memory, good paging subsystem design or, notably, zFlash.