A Slice Of PI
MartinPacker
A rather obscure pun, but I hope it’ll make sense. Not “Pi” by the way but “PI”. Though this post contains arithmetic, it’s not a mathematics post.
To be honest, I never knew how we calculated Performance Index (or PI) for Percentile goals before. But now I do, so I’m sharing it with you. Plus a couple of observations, too.
To be even more honest, when I say “I never knew how we calculated Performance Index” I should say “I never knew how we should calculate Performance Index” - as I’ve just corrected it.
(This post follows on, after 6 years, from WLM Response Time Distribution Reporting With RMF. More on that later.)
Before we go any further I have to give a little background information.
What Are Percentile Goals?
When Workload Manager (WLM) manages transactions explicitly, two kinds of goals become available: Average response time goals and Percentile response time goals.
By the way, for WLM to manage transactions at all requires cooperation / exploitation by middleware or at any rate a work manager. Examples include:
Average response time goals say something like “the average response time for this service class period should be 0.5 seconds”.
A Percentile response time goal might be “90% of all transactions in this service class period should finish in 300 milliseconds”.
What Is Performance Index - Or PI?
Performance Index (PI) is a measure of goal attainment.
The point about PI is that it is a metric for goal attainment that is neutral with regard to workload type.
PI is, of course, used to drive WLM’s algorithms. But I regard it as just the first metric. Others, such as WLM’s ability to help a service class period, are important too.
How Do We Calculate PI For Percentile Goals?
The calculation for goal attainment for Average response time goals is straightforward: Sum up the response times for each transaction and divide by the number of transactions ending. Comparing that average with the goal value gives the attainment.
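Here is a minimal Python sketch of the Average case. It assumes the usual convention that PI for a response time goal is the achieved value divided by the goal value, which this post doesn't spell out. The function name `average_goal_pi` and its parameters are hypothetical, purely for illustration.

```python
def average_goal_pi(response_times, goal_seconds):
    """Return the PI for an Average response time goal, or None if no work ended.

    Assumption: PI = achieved average response time / goal response time.
    """
    if not response_times:
        return None  # no transaction endings, so nothing to measure
    # Sum the response times and divide by the number of transactions ending.
    average = sum(response_times) / len(response_times)
    return average / goal_seconds


# Two transactions averaging 0.5s against a 0.5s goal: the goal is exactly met.
pi = average_goal_pi([0.25, 0.75], goal_seconds=0.5)
print(pi)
```

A PI below 1 would mean the goal is being beaten and a PI above 1 would mean it is being missed.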
The calculation for Percentile goals is more complex.
For any kind of transaction-based goal, at transaction ending WLM uses the transaction’s response time to assign it to one of 14 buckets. So WLM is counting transaction endings in these buckets.
The buckets have the following boundaries:
The bolded values are of special significance, as we shall see.
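To make the bucketing concrete, here is a small Python sketch. The upper limits in it are an assumption on my part: they are the percent-of-goal values RMF usually reports (50% up to 400%, with a final open-ended bucket), and they agree with the examples in this post (Bucket 4 ending at 80% of goal, Bucket 7 at 110%). Treating each upper limit as inclusive is also an assumption, and `bucket_number` is a hypothetical name.

```python
from bisect import bisect_left
from collections import Counter

# Assumed upper limits for Buckets 1 to 13, as a percentage of the goal.
# Bucket 14 is open-ended: everything above 400% of goal.
BUCKET_UPPER_LIMITS_PCT = [50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 200, 400]


def bucket_number(response_seconds, goal_seconds):
    """Return the 1-based bucket a transaction ending would be counted in."""
    percent_of_goal = 100.0 * response_seconds / goal_seconds
    # bisect_left finds the first limit >= percent_of_goal (inclusive upper limits).
    return bisect_left(BUCKET_UPPER_LIMITS_PCT, percent_of_goal) + 1


# Count transaction endings per bucket for a 0.2 second goal.
endings = [0.05, 0.15, 0.15, 0.21, 1.0]
counts = Counter(bucket_number(t, goal_seconds=0.2) for t in endings)
print(sorted(counts.items()))
```

Only the counts survive this step; the individual response times are not needed afterwards.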
Suppose we have a goal of “85% to complete within 0.2 seconds”. WLM knows how many transactions completed in each bucket and how many overall.
Suppose 1000 transactions completed. 85% of 1000 is 850 transactions.
Starting with Bucket 1, WLM tallies up the transaction endings until it meets 850 transactions. The upper limit of the bucket in which that happens is what determines the PI.
Suppose Buckets 1 to 3 tally up to 800 transactions and Bucket 4 contains 100 transactions. So Buckets 1 to 3 don’t meet 850 but Buckets 1 to 4 do.
Bucket 4’s upper limit is 80% of goal. So the PI is 80%/100 or 0.8.
Suppose it took Buckets 1 to 7 to reach or exceed 850. Then Bucket 7’s upper limit would be 110% and the PI would be 110%/100 or 1.1.
The code I inherited didn’t do this calculation. But now it does.
Actually the calculation is not quite that simple: If we’ve tallied up Buckets 1 to 13 and still haven’t reached that 850 number, we set the PI to 4.0 (which makes sense).
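The tallying just described can be sketched in a few lines of Python. This is not the code I mentioned above, just an illustration. The bucket upper limits are assumed to be the usual RMF percent-of-goal values, which match the 80% and 110% examples here, and `percentile_goal_pi` is a hypothetical name.

```python
# Assumed upper limits for Buckets 1 to 13, as a percentage of the goal.
# Bucket 14 (above 400% of goal) has no upper limit.
BUCKET_UPPER_LIMITS_PCT = [50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 200, 400]


def percentile_goal_pi(bucket_counts, goal_percent):
    """Return the PI for a Percentile goal from 14 bucket counts.

    bucket_counts: transaction endings in Buckets 1 to 14, in order.
    goal_percent:  the percentile in the goal, e.g. 85 for "85% within 0.2s".
    """
    if len(bucket_counts) != 14:
        raise ValueError("expected counts for exactly 14 buckets")
    total = sum(bucket_counts)
    if total == 0:
        return None  # no transaction endings in the interval
    needed = total * goal_percent / 100  # e.g. 85% of 1000 is 850
    running = 0
    # Tally from Bucket 1 upwards until the needed number is reached.
    for upper_limit_pct, count in zip(BUCKET_UPPER_LIMITS_PCT, bucket_counts):
        running += count
        if running >= needed:
            return upper_limit_pct / 100
    # Buckets 1 to 13 were not enough, so the PI is set to 4.0.
    return 4.0


# Buckets 1 to 3 tally to 800 and Bucket 4 holds 100, as in the first example.
first = [500, 200, 100, 100, 50, 50, 0, 0, 0, 0, 0, 0, 0, 0]
print(percentile_goal_pi(first, goal_percent=85))
```

Note that the response time value in the goal (the 0.2 seconds) doesn't appear in this function at all. It was only needed earlier, when each transaction ending was assigned to a bucket.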
From the above description of how PI is calculated for percentile goals, we can observe a few things:
Revisiting That Old Blog Post
This seems as good a place as any to follow up on WLM Response Time Distribution Reporting With RMF.
I made some refinements to the graph I showed there:
Here is a modern case of a CICS transaction service class.
Here are some observations:
By the way, WLM doesn’t have complete control over the response time achieved for a transaction. And that’s particularly relevant here.
This transaction goal service class is served by two region goal service classes. Both of these show almost no “Delay For X” samples. What they do have is lots of “Using I/O” and “Using CPU” samples.
So, to improve transaction response time it’s probably necessary to try:
Neither of these is something WLM can do.