DB2 Data Sharing and XCF Job Name - Revisited
MartinPacker
It's been almost four years since I wrote DB2 Data Sharing and XCF Job Name. It mostly stands the test of time but there are a couple of things I want to bring up.
I was in the DB2 Development lab a couple of days ago, talking with a couple of developer friends about DB2 Data Sharing and XCF. They know DB2 Data Sharing and IRLM much better than I do but XCF not so much. (It's probable that XCF Development have a complementary set of knowledge.)
So this conversation provided a fresh set of data as well as a chance to rehearse the contents of that blog post again.
The first thing to note is that I was inaccurate in one regard. In 2009 I'd only seen data from installations where the XCF group name for IRLM was "DXRabcd", where "abcd" is the DB2 Data Sharing group name, so I made the poor assumption this was always the case. In this fresh set of data the IRLM XCF group name is "DXRGROUP", which has nothing to do with the Data Sharing group name. A DB2 Data Sharing group name can be up to 8 characters long, and an XCF group name is itself limited to 8 characters, so "DXRgrpname" couldn't work as a general convention.
(And if you think the terms "XCF group name" and "DB2 Data Sharing group name" are confusingly similar, I'm inclined to agree.)
But all is not lost as the field that started it all - R742MJOB - contains the IRLM address space name. IRLM address space names are quite easy to find - in SMF Type 30 - because the program name is always "DXRRLM00". (You might have several within the same z/OS image.) So the method I outlined for finding the IRLM XCF group name - and monitoring its performance - still stands, with this minor tweak: start from the IRLM address space name rather than from an assumed "DXR" group name prefix.
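The matching described above can be sketched in a few lines of Python. This is only an illustration, assuming the SMF Type 30 and SMF 74-2 records have already been parsed into dictionaries. The keys "system", "job_name", "program_name" and "xcf_group" are hypothetical names of my own choosing ("job_name" on the XCF side stands in for R742MJOB), and the sample job names are made up.

```python
from collections import defaultdict

IRLM_PROGRAM = "DXRRLM00"  # IRLM program name, as seen in SMF Type 30


def irlm_job_names(smf30_records):
    """Return the (system, job name) pairs whose program name is the IRLM one."""
    return {
        (rec["system"], rec["job_name"])
        for rec in smf30_records
        if rec["program_name"] == IRLM_PROGRAM
    }


def irlm_xcf_groups(smf30_records, xcf_member_records):
    """Map each XCF group name to the IRLM address spaces that are members of it."""
    irlm_jobs = irlm_job_names(smf30_records)
    groups = defaultdict(set)
    for member in xcf_member_records:
        key = (member["system"], member["job_name"])  # job_name ~ R742MJOB
        if key in irlm_jobs:
            groups[member["xcf_group"]].add(key)
    return dict(groups)


# Made-up sample data: two IRLMs, one unrelated address space.
smf30 = [
    {"system": "SYSA", "job_name": "DB1AIRLM", "program_name": "DXRRLM00"},
    {"system": "SYSB", "job_name": "DB2AIRLM", "program_name": "DXRRLM00"},
    {"system": "SYSA", "job_name": "SOMEJOB", "program_name": "SOMEPGM"},
]
members = [
    {"system": "SYSA", "job_name": "DB1AIRLM", "xcf_group": "DXRGROUP"},
    {"system": "SYSB", "job_name": "DB2AIRLM", "xcf_group": "DXRGROUP"},
    {"system": "SYSA", "job_name": "SOMEJOB", "xcf_group": "OTHERGRP"},
]

print(irlm_xcf_groups(smf30, members))
```

Keying on system as well as job name is a choice I made for the sketch, because you might have several IRLMs in the same z/OS image and the same job name could in principle appear on more than one system.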
The other thing the conversation did was to reinforce something I've been gradually sensitised to:
Keep track of how DB2 and IRLM address space CPU behaves over time.
Here I'm talking about not just the IRLM address space for a subsystem but also DBM1, MSTR and DIST. The conversation started with a customer seeing spikes in IRLM CPU. As we had only a few data points it was impossible to do what I like to do: plot stuff by time of day over several days. If I've worked with your data you'll know I do this to establish patterns.
So are these spikes regular, or at least vaguely regular? Or are they something specific going wrong? (The notion of "going wrong" is interesting, too.) If you have spikes in IRLM CPU in the Batch Window maybe it's because some jobs are driving a lot of locking activity. (And so it would be with e.g. DBM1.)
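One rough way to answer the "regular or not" question, short of plotting, is to count on how many distinct days each hour of the day shows a spike. The sketch below assumes you already have (timestamp, CPU seconds) samples for one address space. The function name, the "more than twice the median" definition of a spike and the sample numbers are all my own assumptions, not anything DB2 or RMF defines.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def spike_profile(samples, factor=2.0):
    """Count, per hour of day, the number of distinct days with a CPU spike.

    samples: iterable of (datetime, cpu_seconds) pairs.
    A sample counts as a spike if it exceeds factor * the overall median.
    """
    samples = list(samples)
    baseline = median(cpu for _, cpu in samples)
    days_by_hour = defaultdict(set)
    for timestamp, cpu in samples:
        if cpu > factor * baseline:
            days_by_hour[timestamp.hour].add(timestamp.date())
    return {hour: len(days) for hour, days in sorted(days_by_hour.items())}


# Made-up data: three days of hourly samples, flat at 10 CPU seconds,
# with a spike at 02:00 every night and a one-off spike at 14:00 on day 2.
sample_data = []
for day in (1, 2, 3):
    for hour in range(24):
        cpu = 10.0
        if hour == 2:
            cpu = 50.0
        if hour == 14 and day == 2:
            cpu = 40.0
        sample_data.append((datetime(2013, 6, day, hour), cpu))

print(spike_profile(sample_data))
```

An hour that spikes on every day in the data looks like a Batch Window pattern. An hour that spikes on only one day is more likely to be something specific going on.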
What would be interesting would be to see whether spikes in IRLM CPU coincide with spikes in traffic for these two XCF groups - DXR and IXCLO. (Or indeed whether they don't.) It's important to note that much IRLM activity goes nowhere near XCF or indeed the LOCK1 Coupling Facility structure.
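Here is a minimal sketch of how that comparison might be done, assuming IRLM CPU and XCF group traffic have been summarised into the same reporting intervals. The dictionaries keyed by interval, the function names and the "twice the median" spike threshold are hypothetical choices of mine, and the numbers are invented.

```python
from statistics import median


def spike_intervals(series, factor=2.0):
    """Return the set of intervals whose value exceeds factor * the median."""
    baseline = median(series.values())
    return {interval for interval, value in series.items() if value > factor * baseline}


def compare_spikes(irlm_cpu, xcf_traffic, factor=2.0):
    """Classify spiking intervals: both series, IRLM CPU only, or XCF traffic only."""
    cpu_spikes = spike_intervals(irlm_cpu, factor)
    xcf_spikes = spike_intervals(xcf_traffic, factor)
    return {
        "both": sorted(cpu_spikes & xcf_spikes),
        "cpu_only": sorted(cpu_spikes - xcf_spikes),
        "xcf_only": sorted(xcf_spikes - cpu_spikes),
    }


# Invented 15-minute interval data: CPU seconds and XCF signal counts.
irlm_cpu = {"01:00": 5, "01:15": 5, "01:30": 30, "01:45": 6, "02:00": 25, "02:15": 5}
xcf_traffic = {"01:00": 100, "01:15": 110, "01:30": 900, "01:45": 105, "02:00": 120, "02:15": 95}

print(compare_spikes(irlm_cpu, xcf_traffic))
```

Intervals in "cpu_only" would be the interesting ones for the point made above: IRLM working hard without a matching rise in XCF traffic.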
But we didn't get to do that, which is a pity. Still, I learn from every situation, and seeing lots of them is my good fortune.