Interpreting message rate information

The message rate information gathered is presented in multi-line message CNZZ043I:

The message rate distribution graph shows the percentage of time at a given message rate on the Y-axis and instantaneous message rates (in messages/second) on the X-axis. The X-axis scale is logarithmic: each character position represents a rate that is a factor of 2 greater than the position to its left. Tick marks (+) are provided at 8X intervals, that is, every three character positions.
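The X-axis scale can be sketched in a few lines of Python. The function name rate_at_column and the column numbering are illustrative assumptions, not part of Message Flood Automation; the only facts taken from the description are the factor of 2 per character position and the factor of 8 per tick mark.

```python
def rate_at_column(column: int, column_of_one: int) -> float:
    """Message rate (messages/second) represented by a character column.

    Assumption: column_of_one is the column labelled 1 on the X-axis, and
    every column to the right doubles the rate.
    """
    return 2.0 ** (column - column_of_one)


# Tick marks are three columns apart, so consecutive ticks differ by 2**3 = 8.
for offset in (0, 3, 6, 9):
    print(offset, rate_at_column(offset, 0))
```

With this numbering, a column 9 positions to the right of the 1 label corresponds to 512 messages/second, and a column 10 positions to the left corresponds to 1 message every 1024 seconds.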

Each vertical bar of asterisks in the graph is rightward cumulative; that is, each bar represents not only the fraction of time at its own rate, but also the fraction of time at every lesser rate. (A bar's own contribution to the time at a given message rate is therefore the difference between its height and the height of its immediate leftward neighbor.)
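As a minimal sketch of that subtraction, the following Python function turns cumulative bar heights into per-bar contributions. The function name and the sample heights are hypothetical and are not read from any real CNZZ043I output.

```python
def own_contributions(cumulative_heights):
    """Convert rightward-cumulative bar heights (percent) to per-bar shares.

    Each share is the bar's height minus the height of its left neighbor;
    the leftmost bar is compared against zero.
    """
    shares = []
    previous = 0.0
    for height in cumulative_heights:
        shares.append(height - previous)
        previous = height
    return shares


# Hypothetical heights, listed from the leftmost bar to the rightmost bar.
print(own_contributions([8.0, 16.0, 32.0, 60.0, 84.0, 100.0]))
```

Because the rightmost bar reaches 100%, the individual shares always add up to 100%.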

A vertical line (|) indicates the most common message rate.

The graph should have a characteristic S shape, because relatively few messages occur at very low message rates (the bottom left of the S curve) and very few messages occur at very high message rates (the top right of the S curve).

CNZZ043I     MSGFLD Message Rates
                 Instantaneous Message Rates
            515 messages in       492 seconds     1.046 msg/sec
% of time at msg rate              112 messages w/most common rate
     100.000%|           | ***********
      96.000%|           |************
      92.000%|           |************
      88.000%|           |************
      84.000%|           *************
      80.000%|           *************
      76.000%|           *************
      72.000%|           *************
      68.000%|           *************
      64.000%|           *************
      60.000%|          **************
      56.000%|          **************
      52.000%|          **************
      48.000%|          **************
      44.000%|          **************
      40.000%|          **************
      36.000%|          **************
      32.000%|         ***************
      28.000%|         ***************
      24.000%|         ***************
      20.000%|         ***************
      16.000%|        ****************
      12.000%|        ****************
       8.000%|      ******************
       4.000%|      ******************
            0+--+--+>-+--|--+--+---+-<+-------------------
             0           1  8 64  1K 8K  messages/second

             Suggested threshold for 95% is     2
             Suggested threshold for 96% is     3
             Suggested threshold for 97% is     3
             Suggested threshold for 98% is     4
             Suggested threshold for 99% is     6

This example was produced using a testcase that issued messages with an exponential distribution of inter-arrival times and a mean inter-arrival time of 0.5 seconds. The vertical bar indicates that the most common message rate is 1 message/second. The average message rate is only slightly more than 1 message/second, a rate that has been determined by IBM® human factor studies to be the maximum rate at which messages should be presented on any one console.

On the X-axis, the minimum and maximum message rates recorded are indicated (by the > and < symbols respectively) on either side of the most common message rate. The percentage of messages occurring at the maximum message rate is usually quite small and may not be visible unless the resolution of the graph is improved by increasing the number of message lines in the graph.

The graph presents instantaneous message rates that are determined from the inter-arrival times of the messages. Small inter-arrival times result in high instantaneous message rates; large inter-arrival times result in low instantaneous message rates. A high message rate on the graph does not necessarily imply that multiple, consecutive messages were issued at that rate. It is quite possible (as in the example) for a high message rate to be indicated without Message Flood Automation being triggered. (It is multiple, consecutive, high message rate messages that trigger Message Flood Automation.)
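The relationship between inter-arrival times and instantaneous rates can be illustrated with a short Python sketch. It assumes that the instantaneous rate is simply the reciprocal of the gap between two consecutive messages; the function name and the timestamps are hypothetical.

```python
def instantaneous_rates(timestamps):
    """Return 1/gap for each pair of consecutive message timestamps (seconds).

    Assumption: the instantaneous rate of a message is the reciprocal of the
    time since the previous message.
    """
    rates = []
    for earlier, later in zip(timestamps, timestamps[1:]):
        gap = later - earlier
        if gap > 0:  # skip identical timestamps to avoid dividing by zero
            rates.append(1.0 / gap)
    return rates


# Hypothetical arrival times: one short gap between otherwise slow messages.
print(instantaneous_rates([0.0, 2.0, 2.125, 6.125]))
```

In this sample, a single gap of 0.125 seconds yields one instantaneous rate of 8 messages/second, even though the neighboring rates are well below 1 message/second. A single high value of this kind appears on the graph without indicating a run of consecutive messages at that rate.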

The suggested threshold values represent the message rates that are not exceeded some fraction of the time. In the example, a message rate of 4 messages/second is not exceeded 98% of the time; a message rate of 6 messages/second is not exceeded 99% of the time. You can use the suggested threshold values to set an appropriate REGULAR MSGTHRESH value.
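One way to picture a "rate that is not exceeded some fraction of the time" is a time-weighted percentile over the inter-arrival gaps. The following Python sketch is an illustration under stated assumptions only: the function name, the weighting of each gap by its own duration, and the rounding up to a whole number are assumptions, and the sketch is not a description of how Message Flood Automation actually computes its suggestions. The floor of 1 reflects the lowest value that can be specified.

```python
import math


def suggested_threshold(gaps, fraction):
    """Smallest whole rate not exceeded for `fraction` of the observed time.

    gaps: inter-arrival times in seconds (all greater than zero).
    Assumption: each gap contributes its own duration at rate 1/gap.
    """
    total_time = sum(gaps)
    elapsed = 0.0
    # Longest gaps first, that is, lowest instantaneous rates first.
    for gap in sorted(gaps, reverse=True):
        elapsed += gap
        if elapsed / total_time >= fraction:
            return max(1, math.ceil(1.0 / gap))
    # Fallback for rounding effects: use the highest observed rate.
    return max(1, math.ceil(1.0 / min(gaps)))


# Hypothetical gaps covering 10 seconds in total.
sample_gaps = [4.0, 4.0, 1.0, 0.5, 0.25, 0.25]
for fraction in (0.85, 0.93, 0.99):
    print(fraction, suggested_threshold(sample_gaps, fraction))
```

In the sample, the two 4-second gaps and the 1-second gap account for 90% of the time, so any fraction up to that point yields a threshold of 1; only the short gaps at the end push the threshold higher.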

Look at a more interesting graph:

CNZZ043I     MSGFLD Message Rates
                 Instantaneous Message Rates
          34299 messages in     78111 seconds     0.439 msg/sec
% of time at msg rate             5993 messages w/most common rate
     100.000%|         **********************
      96.000%|      *************************
      92.000%|     **************************
      88.000%|     **************************
      84.000%|    ***************************
      80.000%|    ***************************
      76.000%|    ***************************
      72.000%|   ****************************
      68.000%|   ****************************
      64.000%|   ****************************
      60.000%|   ****************************
      56.000%|   ****************************
      52.000%|   ****************************
      48.000%|  *****************************
      44.000%|  *****************************
      40.000%|  *****************************
      36.000%|  *****************************
      32.000%|  *****************************
      28.000%|  *****************************
      24.000%|  *****************************
      20.000%| ******************************
      16.000%| ******************************
      12.000%| ******************************
       8.000%| ******************************
       4.000%| ******************************
            0+->+--+--+--+--+--+--|+--+-----<-------------
             0           1  8 64  1K 8K  messages/second

             Suggested threshold for 95% is     1
             Suggested threshold for 96% is     1
             Suggested threshold for 97% is     1
             Suggested threshold for 98% is     1
             Suggested threshold for 99% is     1

This graph looks very different from the previous one. The first reaction of most people is to look at the average message rate of 0.439 messages per second and the fact that the most commonly occurring message has a rate of 512 messages per second and wonder how these two statistics can be reconciled.

It is important to understand what an average can tell you and what it cannot. What an average can tell you is that (in this case) 34299 messages occurred during the 78111 second interval that was monitored. What the average message rate cannot tell you is how those messages were distributed within the monitoring interval. If the messages were distributed uniformly within the monitoring interval, the time between messages would be the same, but a quick look at the graph shows that this is not the case: some messages occurred at an instantaneous message rate of 1 message every 1024 seconds (at the left edge of the graph) and some occurred at an instantaneous message rate of 262144 messages per second (at the right edge of the graph). And there were the 5993 "most commonly occurring" messages that occurred at a rate of 512 messages per second.

The answer to this riddle is that one or more message "spikes" occurred at some point in the monitoring interval, and those spikes produced at least 5993 messages at a rate of 512 messages per second. Why doesn't this very high message rate affect the overall average message rate? Because this very high message rate only occurred for 11.7 seconds (5993/512), which represents only 0.015% of the time within the interval of 78111 seconds.
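The arithmetic behind these figures can be checked with a few lines of Python. The numbers are taken from the example output; the variable names are illustrative only.

```python
messages = 34299          # total messages in the monitoring interval
interval_seconds = 78111  # length of the monitoring interval
spike_messages = 5993     # messages at the most common rate
spike_rate = 512          # messages/second at the most common rate

average_rate = messages / interval_seconds
spike_seconds = spike_messages / spike_rate
spike_share = spike_seconds / interval_seconds

print(f"average rate:   {average_rate:.3f} msg/sec")
print(f"spike duration: {spike_seconds:.1f} seconds")
print(f"share of time:  {spike_share:.3%}")
```

The spike accounts for more than a sixth of the messages but only a tiny fraction of the elapsed time, which is why it barely moves the overall average.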

The very broad "top" to the graph is indicative of a very small number of messages that occurred with very high instantaneous message rates. However, these messages occur for such brief periods of time that they have almost no effect on the overall message rate. The very broad "base" of the graph is indicative of a very non-uniform distribution of messages within the monitoring interval.

The "suggested thresholds" are all one because one is the lowest value that can be specified.