A Record Of Sorts
MartinPacker
When looking at a batch job1 I like to see how the data flows through the various steps.
The first step - some 23 years ago - was to look at the Life Of A Data Set (“LOADS” for short).2
With LOADS - for VSAM and non-VSAM data sets - you can see who reads and writes the data set. You can also see the EXCP count. More on that in a bit, but suffice it to say the EXCP count might be enough to tell you if the data set was written or read in its entirety.
Why Record Counts Matter
Probably just out of curiosity.
Actually, really not…
I just said I can detect readers and writers and I used the words “in its entirety”. But I think it useful to go deeper. Here are two - off the top of my head - reasons to want record counts:
Estimating Record Counts
I just used the word “estimating”. Under some circumstances we can do better than estimating, as we’ll see.
One of the reports our “Job Dossier” code produces is called “Job Data Set”. Basically a list of steps and the data sets each step accesses.3
For data sets accessed by QSAM we can estimate the number of records in the data set by examining the LRECL, the block size, and the EXCP count. But there are lots of problems with this:
Still, where applicable it’s a good start.
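To make the arithmetic concrete, here's a minimal sketch of that estimate. It assumes each EXCP transfers roughly one block - an over-simplification, which is part of why this is only an estimate - and it only works for fixed-block (FB) data sets, where records per block is simply block size divided by LRECL. The function name and interface are mine, not from the Job Dossier code.

```python
def estimate_qsam_records(excp_count, blksize, lrecl, recfm="FB"):
    """Rough record-count estimate for a QSAM data set.

    Assumes roughly one block transferred per EXCP - chained
    scheduling, label processing, and multi-block channel
    programs all distort this, so treat the result as a guess.
    """
    if recfm != "FB" or lrecl == 0:
        return None  # VB (and friends) can't be estimated this way
    records_per_block = blksize // lrecl
    return excp_count * records_per_block
```

For example, 1,000 EXCPs against an FB data set with BLKSIZE=27920 and LRECL=80 gives 349 records per block, so an estimate of 349,000 records.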
But we can do better:
So in a very simple case - a single sort invocation in a step - we can use these record counts to estimate the number of records in the SORTIN and SORTOUT data sets. And we can find the SORTIN data set represented by an SMF 14 record and the SORTOUT data set by an SMF 15 record.
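A sketch of that single-sort case, assuming we've already parsed the step's records into simple dictionaries. The field names (`records_in`, `records_out`, `dd`, and so on) are invented placeholders, not real SMF field names - the point is just the shape of the mapping: the SMF 16 input count lands on the SMF 14 (read) data set for SORTIN, and the output count on the SMF 15 (written) data set for SORTOUT.

```python
def attach_sort_counts(step):
    """For a step with exactly one sort invocation, copy the
    DFSORT (SMF 16) record counts onto the SORTIN data set
    (seen as an SMF 14 record) and the SORTOUT data set
    (seen as an SMF 15 record)."""
    smf16 = step["smf16"]  # the single sort invocation's counts
    for ds in step["datasets"]:
        if ds["smf_type"] == 14 and ds["dd"] == "SORTIN":
            ds["records"] = smf16["records_in"]
        elif ds["smf_type"] == 15 and ds["dd"] == "SORTOUT":
            ds["records"] = smf16["records_out"]
    return step
```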
Record Counts And SQL Statements
Several times in a recent batch study the SMF 101 SQL counts have borne some relation to record counts. Consider the following (very realistic) scenario:
The sort step reads a data set (SORTIN DD) and writes one (SORTOUT DD). The DB2 step reads the same data set and does something with DB2 data based on the records read.
For example, in one job step the Singleton Select count matches the input record count.
So we can glean that the selects are record-driven - just with SMF.
By the way, we match SMF 101 records with SMF 30-4 Step End records by timestamp comparison and Correlation ID matching, which I describe in gory detail in Finding The DB2 Accounting Trace Records For an IMS Batch Job Step. Ignore the “IMS” bit if you like; the preamble is the more general bit.
What My Code Does Today
We map all this, of course.
My first toe in the water is very limited:
For the “single sort in a step with one input data set and one output data set” case I use the SMF 16 record counts as the data set sizes. These overwrite any EXCP / block size / LRECL estimate for FB data sets - as they're more accurate.
The really nice thing is it gives me an accurate estimate for VB data sets, which I didn’t have before.
This is quite a long list of potential extensions - but each one is fiddly. Some will get done; some possibly won't.
All I know is our code’s ability to estimate record counts took a leap forward, and that is proving useful straightaway. And writing this has helped me sort my thoughts out, as has explaining it to a couple of friends (with a stake in this). And I haven’t even begun to talk about VSAM yet…