And Latency Once More
MartinPacker
This is about the third time I’ve written about this, and it probably won’t be the last. 
I was presenting to customers about the Coupling Facility Path Latency statistics I’ve previously spoken of when one of them told me of the following incident. I’m sure he won’t mind me sharing it with you, so long as I don’t identify the source.
The customer has two zEC12 machines, with Internal Coupling Facilities (ICFs) in each, and with z/OS LPARs in each machine, using Infiniband links and Internal Coupling links to these ICFs. 
The customer believed they had two groups of four Infiniband paths between one z/OS image and a remote CF. These groups of paths take routes said to be 5km and 8km long.
One day they looked at an RMF Coupling Facility Activity postprocessor report and saw the path data, which is new with OA37826 and CFLEVEL 18. That was a nice surprise.
What wasn’t a nice surprise was the report indicating three paths at 8km and five paths at 5km. This was not what they expected.
Their initial suspicion was that the routing was wrong and the instrumentation right. But it proved otherwise.
So, the upshot was that the adapter card was reporting the incorrect distance. The card has, fairly obviously, been replaced and everything is fine now.
There’s no suggestion there was anything else wrong with the card, but it’s good it was replaced. An interesting question is whether incorrect latency measurements could cause poor routing decisions, but I certainly can’t comment on that publicly.
Another question I can’t answer is whether the latency measurement suddenly went bad; all we know is that when the customer looked at the Coupling Facility Activity report for the first time it had the wrong number in it.
While I don’t propose to write reporting that assumes dynamically changing CF Path Latency values, I do think it’s worthwhile to look occasionally at this data. I always do when I get customer data - and most customers have OA37826 applied and are at CFLEVEL 18 or higher.
So please do look at this every so often, including right now, as a useful verification exercise.
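If you want to make that check a little less ad hoc, here is a minimal sketch in Python. It assumes you have transcribed by hand, one entry per path, the distance the report shows; it does not read RMF data itself. The names expected_km, reported_km and path_count_mismatches are my own inventions for illustration, and the numbers are the ones from the incident above.

```python
from collections import Counter

# Hypothetical, hand-transcribed values: one entry per path, giving the
# distance in km. Nothing here parses an actual RMF report.
expected_km = [5, 5, 5, 5, 8, 8, 8, 8]  # what the customer believed they had
reported_km = [5, 5, 5, 5, 5, 8, 8, 8]  # what the report indicated


def path_count_mismatches(expected, reported):
    """Return {distance: (expected_count, reported_count)} where the counts differ."""
    exp, rep = Counter(expected), Counter(reported)
    return {
        km: (exp[km], rep[km])
        for km in sorted(set(exp) | set(rep))
        if exp[km] != rep[km]
    }


mismatches = path_count_mismatches(expected_km, reported_km)
for km, (want, got) in mismatches.items():
    print(f"{km}km: expected {want} paths, report shows {got}")
```

With the incident's numbers this flags both the 5km and the 8km group. Note that a mismatch only tells you the report and your expectation disagree; as this story shows, it can't tell you which of the two is wrong.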
I’m now keeping a list of my blog posts on Coupling Facility links in a separate file. Here’s what it looks like so far: