I was on a call with a customer who had asked for help understanding MQ performance of their system.
This was to understand the impact of synchronous and asynchronous Coupling Facility (CF) requests and why measurements on a day to day basis had different response times and with different transaction profiles.
The blog entry here will help explain the difference between asynchronous and synchronous CF requests.
The MQ SMF data can provide very deep levels of insight to your MQ applications, particularly with regards to shared queue messaging.
Depending on your requirements, more detail can be extracted using SupportPac MP1B via the "detail" attribute.
The level of detail ranges from: "basic monitoring of the cost" to "Give me more detail on the breakdown of synchronous and asynchronous elapsed times".
The following offers an example of information available as the detail level increases for an MQPUT of messages to a shared queue:
At the bottom of this post is some real data from my test system.
Basic monitoring of the elapsed time
For many people this level of detail is sufficient.
MQPUT Count 1000 MQPUT average elapsed time 25 uS
Where uS is microSeconds.
Basic monitoring of elapsed time including CF.
Specify DETAIL 10 in MQSMF.
This level provides a high level view of the impact of the CF on the MQ API.
MQPUT Count 1000 MQPUT average elapsed time 25 uS CF time/verb 10 uS
On average 10 of the 25 microseconds was spent doing CF requests.
How much of the CF work is synchronous?
The previously referenced blog entry gives some examples of why the CF may respond synchronously or asynchronously, as well as the impact of these different responses.
Specify DETAIL 11 in MQSMF. This is the minimum level of detail to required in order to return the breakdown of synchronous to asynchronous requests. In this section some of the rows have been omitted to make it easier to follow
The time spent doing CF requests can be broken down by synchronous and asynchronous requests.
Asynchronous requests may offer a tuning opportunity, for example using a local CF rather than a remote CF.
MQPUT Count 1000 MQPUT average elapsed time 25 uS CF time/verb 10 uS CF Avg Sync elapsed time/verb 9 uS CF Avg Async elapsed time/verb 1 uS
At this level of detail, the CF was spending on average 10 microseconds per request. Of this, 9 microseconds was spent in synchronous requests and 1 microsecond in asynchronous requests.
These averages are calculated from the total time spent doing synchronous or asynchronous requests divided by the total number of MQPUT requests.
There may have been only one asynchronous request - taking 1000 microseconds, or two asynchronous requests taking 500 microseconds each, which averages out to 1 microsecond per MQPUT.
This gives you an idea of the proportion of time spent on average in synchronous and asynchronous requests for an MQPUT, even though an "average" MQPUT doesn't exist.
Unless you are interested in potential tuning opportunities, this may be enough detail.
To determine how many requests actually factored into the average times, a further level of detail is required.
Give me more detail on the breakdown of synchronous and asynchronous elapsed times
Some API requests have to issue more than one request to the CF for each MQ API request.
Specify DETAIL 11 in MQSMF.
MQPUT Count 1000 MQPUT average elapsed time 25 uS CF time/verb 10 uS CF Avg Sync elapsed time/verb 9 uS CF Sync number of request 1500 CF Avg Sync CF response time 6 uS CF Avg Async elapsed time/verb 1 uS CF Async number of request 10 CF Avg ASync CF response time 100 uS
In the example, there were 1000 MQPUT requests, which resulted in 1500 synchronous requests to the CF. (These are made up numbers for illustration only)
The total time spent performing the synchronous requests was 9000 microseconds.
The average time in the CF for each MQPUT is: 9000 microseconds / 1000 MQPUTs, i.e. 9 microseconds for each MQPUT.
The average time in the CF for each CF request is: 9000 microseconds / 1500 CF requests, i.e. 6 microseconds for each CF request.
Similar calculations can be performed for the asynchronous requests.
The CF logic within z/OS determines whether the request should be synchronous or asynchronous, and this decision can change based on many factors including CF response times.
Even if the CF requests could be 100% synchronous, the CF will periodically issue an asynchronous request to get the response time, and this is true for the reverse i.e. when requests are predominantly asynchronous, periodic synchronous requests will occur.
It is usual to get a mixture of synchronous and asynchronous requests, but the factors such as message size, location of the CF and responsiveness of the CF can all play a part in influencing how much is synchronous.
These numbers when monitored over a day-to-day or even an hour-to-hour basis may show different response times or even different ratio of synchronous to asynchronous ratio of calls for a similar workload.
The data can be used to CF variability based on workload or just general load.
Some real data for queue APP1Q1
Open Count 2 APP1Q1 Open Avg elapsed time 1002 uS APP1Q1 CF time/verb 944 uS CF Avg Sync elapsed time/verb 534 us CF Sync number of request 3 CF Avg Sync CF response time 356 us CF Avg Async elapsed time/verb 410 us CF Async number of request 2 CF Avg ASync CF response time 410 us
We can see for 2 opens, there were a total of 3 Sync + 2 Async requests.
The Sync response time 356 uSeconds is much higher than I would expect in production - but this was a test system.
Most of the elapsed time ( 944 out of 1002) was spent in the CF.
This CF access can be eliminated- see Saving CPU by having a job that does nothing!
Get count 1000 APP1Q1 Get avg elapsed time 2012 uS APP1Q1 CF time/verb 1937 uS CF Avg Sync elapsed time/verb 73 us CF Sync number of request 460 CF Avg Sync CF response time 159 us CF Avg Async elapsed time/verb 1863 us CF Async number of request 1540 CF Avg ASync CF response time 1210 us Most of the elapsed time( 1937 out of 2012) was spent in the CF. Of the 1937, 73 was spent doing Sync requests, and 1853 was spent doing Async requests. The number of Sync + Async = number of gets The average response time of the Synch requests was 159 uS. Put count 1000 APP1Q1 Put avg elapsed time 1064 uS APP1Q1 Put + put1 valid count 1000 APP1Q1 CF time/verb 990 uS CF Avg Sync elapsed time/verb 23 us CF Sync number of request 94 CF Avg Sync CF response time 245 us CF Avg Async elapsed time/verb 967 us CF Async number of request 906 CF Avg ASync CF response time 1067 us
The analysis is left to the reader.