Metrics published on the system topics

Metrics are categorized into classes, and sub-categorized into types. There are various metrics published under each metric class and type.

Index

CPU (platform central processing units)
DISK (platform persistent data stores)
STATMQI (API usage statistics)
STATQ (API per-queue usage statistics)
STATAPP (per-application usage statistics)
NHAREPLICA (per-instance Native HA statistics)

[Windows] [Linux] See Monitoring system resource usage by using the amqsrua command for information on how you collect data for the options listed, with the exception of NHAREPLICA.

You can also use the ALTER QMGR command to monitor STATMQI and STATQ at the queue manager level, or the local queue attribute STATQ to monitor individual queues; see ALTER QUEUES for this option.
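As an illustration of how the classes and types listed in this topic map onto an amqsrua invocation, the following Python sketch builds the argument list for one class and type. It is a minimal sketch, not a definitive tool: the -m, -c, -t, and -o flags are assumed from the amqsrua sample's usage text rather than taken from this topic, and the helper name amqsrua_args, the queue manager name QM1, and the queue name QUEUE1 are hypothetical.

```python
# Classes and types named in this topic that amqsrua can report on.
# NHAREPLICA is deliberately left out, as the text above excludes it.
AMQSRUA_CLASSES = {
    "CPU": ["SystemSummary", "QMgrSummary"],
    "DISK": ["SystemSummary", "QMgrSummary", "Log"],
    "STATMQI": ["CONNDISC", "OPENCLOSE", "INQSET", "PUT", "GET",
                "SYNCPOINT", "SUBSCRIBE", "PUBLISH"],
    "STATQ": ["GENERAL", "OPENCLOSE", "INQSET", "PUT", "GET", "EXTENDED"],
    "STATAPP": ["INSTANCE"],
}


def amqsrua_args(qmgr, metric_class, metric_type, obj=None):
    """Build an argument list for amqsrua (flags are an assumption)."""
    if metric_class not in AMQSRUA_CLASSES:
        raise ValueError(f"class not available through amqsrua: {metric_class}")
    if metric_type not in AMQSRUA_CLASSES[metric_class]:
        raise ValueError(f"unknown type {metric_type} for class {metric_class}")
    args = ["amqsrua", "-m", qmgr, "-c", metric_class, "-t", metric_type]
    if obj is not None:
        # Assumed: -o names the object (for example a queue) for per-object classes.
        args += ["-o", obj]
    return args


if __name__ == "__main__":
    print(" ".join(amqsrua_args("QM1", "CPU", "SystemSummary")))
    print(" ".join(amqsrua_args("QM1", "STATQ", "PUT", obj="QUEUE1")))
```

The resulting list can be passed to subprocess.run on a system where the sample is installed; the sketch itself only builds the arguments and does not call the program.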

CPU (platform central processing units)

Introduction

Where statistics refer to the current interval, this is the interval defined in the published message by the MQIAMO64_MONITOR_INTERVAL parameter.

Statistics are usually published every 10 seconds (the published interval) as long as there is at least one active subscriber, but the precise interval should always be taken from the message.

Important: Unless otherwise specified, metrics are absolute values for the point in time when they were captured.
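Because the precise interval should be taken from the message, a per-second rate (the "Y/sec" shown for some metrics) can be derived from an interval count and the interval carried in MQIAMO64_MONITOR_INTERVAL. The following Python sketch shows the arithmetic only. It assumes that the interval is expressed in microseconds, which is not stated in this topic, and the function name per_second_rate is hypothetical.

```python
USEC_PER_SECOND = 1_000_000


def per_second_rate(count, interval_usec):
    """Convert a count for one published interval into a per-second rate.

    interval_usec is the value of MQIAMO64_MONITOR_INTERVAL from the same
    message (assumed here to be in microseconds).
    """
    if interval_usec <= 0:
        raise ValueError("interval must be positive")
    return count * USEC_PER_SECOND / interval_usec


if __name__ == "__main__":
    # 500 calls counted over a nominal 10 second interval.
    print(per_second_rate(500, 10 * USEC_PER_SECOND))
```

Using the interval from the message, rather than a hard-coded 10 seconds, keeps the rate correct when a publication arrives early or late.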

SystemSummary (CPU performance - platform wide)

User CPU time percentage X%

The average percentage of time (taken over the last 10 second interval) used by the CPU when it was in non-privileged code.

System CPU time percentage X%

The average percentage of time (taken over the last 10 second interval) used by the CPU when it was in privileged code.

CPU load - one minute average X

The one minute load average. "Load average" is an industry-wide term, but the exact value reported might differ across platforms.

CPU load - five minute average X

The five minute load average. "Load average" is an industry-wide term, but the exact value reported might differ across platforms.

CPU load - fifteen minute average X

The fifteen minute load average. "Load average" is an industry-wide term, but the exact value reported might differ across platforms.

CPU system summary

RAM free percentage X%

RAM total bytes XMB

QMgrSummary (CPU performance - running queue manager)

User CPU time - percentage estimate for queue manager X%

The average percentage of time (taken over the last 10 second interval) used by the CPU when this queue manager's processes were in non-privileged code.

System CPU time - percentage estimate for queue manager X%

The average percentage of time (taken over the last 10 second interval) used by the CPU when this queue manager's processes were in privileged code.

RAM total bytes - estimate for queue manager XMB

This is an approximation of the memory used by the queue manager.

DISK (platform persistent data stores)

The SystemSummary and QMgrSummary metrics are absolute values at the time of capture. See the Introduction for details of the published interval.

SystemSummary (disk usage - platform wide)

MQ errors file system - bytes in use XMB

MQ errors file system - free space X%

MQ FDC file count X

MQ trace file system - bytes in use XMB

MQ trace file system - free space X%

QMgrSummary (disk usage - running queue managers)

Queue Manager file system - bytes in use XMB

Queue Manager file system - free space X%

Log (disk usage - queue manager recovery log)

Log - bytes in use X

Log - bytes max X

The maximum number of bytes that can be written to the log if all the primary and secondary extents were full. This is less than the size of the log file system.

Log file system - bytes in use X

Log file system - bytes max X

Log file system - free space X%

Log - disk written log sequence number X

The LSN written and forced to disk as a 64-bit number.

Log - physical bytes written for the current interval X

See the Introduction for the definition of current interval.

Log - logical bytes written for the current interval X

Log - write latency X uSec

A rolling average that represents the time that a single write to disk takes.

Where LogWriteIntegrity=TripleWrite, the physical number of bytes written to disk is greater than the logical bytes written.

Log - write size X

Also a rolling average.

Log - occupied by extents waiting to be archived X

Only published when logtype=linear and LogManagement=archive. See Log stanza of the qm.ini file for more information.

Log - space in MB required for media recovery X

Only published when logtype=linear.

Log - space in MB occupied by reusable extents X

Only published when logtype=linear and LogManagement=automatic. See Log stanza of the qm.ini file for more information.

Log - current primary space in use X%

Log file space in use as a percentage of primary logs. This value can be more than 100%.

Log - workload primary space utilization X%

The percentage log file space in use as a rolling average over recent history.

Log - quorum log sequence number X

The LSN that has been replicated between a quorum of instances in the HA group as a 64-bit number (only returned if the queue manager is configured for Native HA).

Log - slowest write since restart

The highest latency individual log write since the queue manager was started (in microseconds).

Log - timestamp of slowest write

The UTC timestamp when the highest latency individual log write occurred (expressed as microseconds since the epoch - 1970-01-01T00:00:00Z).
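The "Log - timestamp of slowest write" value is a count of microseconds since the epoch, so it usually needs converting before it can be compared with other log records. The following Python sketch shows one way to do the conversion with the standard library. The function name slowest_write_time and the sample value are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# The epoch named in the metric description: 1970-01-01T00:00:00Z.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def slowest_write_time(usec_since_epoch):
    """Convert microseconds since the epoch into an aware UTC datetime."""
    # Adding a timedelta keeps full microsecond precision, which a
    # float-based conversion could lose for large values.
    return EPOCH + timedelta(microseconds=usec_since_epoch)


if __name__ == "__main__":
    sample = 1_700_000_000_000_000  # hypothetical metric value
    print(slowest_write_time(sample).isoformat())
```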

STATMQI (API usage statistics)

All API usage statistics reflect occurrences and/or percentages for the published interval. See the Introduction for the definition of published interval.

The statistics include counts of failed MQI calls, but not every failed MQI call appears in these statistics, and failures are not recorded for every type of MQI call. This is because many of the reasons that MQI calls fail are diagnosed before the call reaches the internals of the queue manager, where the statistics are recorded.

An example of this is MQRC_HCONN_ERROR returned to a client application. If a client application passes a bad hconn, the MQ client diagnoses that error and returns MQRC_HCONN_ERROR without passing the MQI call onto the queue manager. Hence, the failed MQI call never appears in the statistics recorded by the queue manager.

Statistics of failed MQI calls are useful because they enable customers to troubleshoot poorly-written applications that generate unnecessary failed MQI calls, thereby impacting performance. Some examples of failure reasons for various MQI calls that are recorded in the statistics:

MQCONN/MQCONNX/MQOPEN returns 2035 MQRC_NOT_AUTHORIZED when diagnosed by the queue manager, not the client. For example, running amqsput as nobody.
MQPUT/MQPUT1 returns 2053 MQRC_Q_FULL because MAXDEPTH has been exceeded.
MQGET returns 2033 MQRC_NO_MSG_AVAILABLE when browsing or destructively getting from an empty queue.
MQSUBRQ returns 2437 MQRC_NO_RETAINED_MSG because there is no retained message.

CONNDISC (MQCONN and MQDISC): MQCONN/MQCONNX count X; Failed MQCONN/MQCONNX count X; Concurrent connections - high water mark X; MQDISC count X
OPENCLOSE (MQOPEN and MQCLOSE): MQOPEN count X Y/sec; Failed MQOPEN count X; MQCLOSE count X Y/sec; Failed MQCLOSE count X
INQSET (MQINQ and MQSET): MQINQ count X; Failed MQINQ count X; MQSET count X; Failed MQSET count X
PUT (MQPUT): Interval total MQPUT/MQPUT1 count X; Interval total MQPUT/MQPUT1 byte count X Y/sec; Non-persistent message MQPUT count X; Persistent message MQPUT count X; Failed MQPUT count X; Non-persistent message MQPUT1 count X; Persistent message MQPUT1 count X; Failed MQPUT1 count X; Put non-persistent messages - byte count X Y/sec; Put persistent messages - byte count X; MQSTAT count X
GET (MQGET): Interval total destructive get - count X; Interval total destructive get - byte count X Y/sec; Non-persistent message destructive get - count X; Persistent message destructive get - count X; Failed MQGET - count X; Got non-persistent messages - byte count X Y/sec; Got persistent messages - byte count X; Non-persistent message browse - count X; Persistent message browse - count X; Failed browse count X; Non-persistent message browse - byte count X Y/sec; Persistent message browse - byte count X; Expired message count X; Purged queue count X; MQCB count X; Failed MQCB count X; MQCTL count X
SYNCPOINT (commit and rollback): Commit count X; Rollback count X
SUBSCRIBE (subscribe): Create durable subscription count X; Alter durable subscription count X; Resume durable subscription count X; Create non-durable subscription count X; Failed create/alter/resume subscription count X; Delete durable subscription count X; Delete non-durable subscription count X; Subscription delete failure count X; MQSUBRQ count X; Failed MQSUBRQ count X; Durable subscriber - high water mark X; Durable subscriber - low water mark X; Non-durable subscriber - high water mark X; Non-durable subscriber - low water mark X
PUBLISH (publish): Topic MQPUT/MQPUT1 interval total X; Interval total topic bytes put X Y/sec; Published to subscribers - message count X; Published to subscribers - byte count X; Non-persistent - topic MQPUT/MQPUT1 count X; Persistent - topic MQPUT/MQPUT1 count X; Failed topic MQPUT/MQPUT1 count X
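When the failed-call counters are used for troubleshooting, a first step is often to pick out the failure counters that are non-zero for an interval. The following Python sketch does this over a plain dictionary of metric names and values. It assumes the metrics have already been decoded from the published message into such a dictionary, which this topic does not describe. The function name nonzero_failures and the sample values are hypothetical.

```python
# Words that mark a failure counter in the metric names listed above,
# for example "Failed MQOPEN count" or "Subscription delete failure count".
FAILURE_WORDS = ("failed", "fails", "failure")


def nonzero_failures(metrics):
    """Return only the failure counters whose value is greater than zero."""
    result = {}
    for name, value in metrics.items():
        lowered = name.lower()
        if value > 0 and any(word in lowered for word in FAILURE_WORDS):
            result[name] = value
    return result


if __name__ == "__main__":
    interval_metrics = {  # hypothetical values for one published interval
        "MQOPEN count": 120,
        "Failed MQOPEN count": 3,
        "Failed MQPUT count": 0,
        "Subscription delete failure count": 1,
    }
    for name, value in nonzero_failures(interval_metrics).items():
        print(f"{name}: {value}")
```

A non-zero counter only indicates failures that reached the queue manager; as explained earlier, failures diagnosed in the client are not counted.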

STATQ (API per-queue usage statistics)

GENERAL (General)

messages expired X (moved from GET for IBM® MQ 9.3.0 and later CD versions)

queue purged count X (moved from GET for IBM MQ 9.3.0 and later CD versions)

average queue time X uSec (moved from GET for IBM MQ 9.3.0 and later CD versions)

Queue depth X (moved from GET for IBM MQ 9.3.0 and later CD versions)

open input count

Number of queue handles open at the end of the interval for input (MQGET). This value is the same as the IPPROCS value reported by DISPLAY QLOCAL / QSTATUS.

open output count

Number of queue handles open at the end of the interval for output (MQPUT). This value is the same as the OPPROCS value reported by DISPLAY QLOCAL / QSTATUS.

open browse count

Number of queue handles open at the end of the interval that included the MQOO_BROWSE option on their call to MQOPEN. Note: these handles are also included in the open input count.

open publish count

Number of queue handles open at the end of the interval that were opened by queue manager processes to put messages to subscriptions that specified this queue as their destination. Note: these handles are also included in the open output count.

OPENCLOSE (MQOPEN and MQCLOSE)

MQOPEN count X

MQCLOSE count X

INQSET (MQINQ and MQSET)

MQINQ count X

MQSET count X

PUT (MQPUT and MQPUT1)

MQPUT/MQPUT1 count X

MQPUT byte count X

MQPUT non-persistent message count X

MQPUT persistent message count X

rolled back MQPUT count X

MQPUT1 non-persistent message count X

MQPUT1 persistent message count X

non-persistent byte count X

persistent byte count X

lock contention X%

The percentage of attempts to lock the queue that resulted in waiting for another process to release the lock first. Decreasing lock contention is likely to increase the maximum throughput of your system because taking a lock that is not currently locked is more efficient than waiting for a lock to be released.

queue avoided puts X%

If a message is put to a queue when there is a waiting getter, the message might not need to be queued because it can be passed to the getter immediately. Such a message is said to have avoided the queue, and "queue avoided puts" is the count of such messages. Increasing queue avoidance is likely to increase the maximum throughput of your system because it avoids the cost of putting the message onto the queue and getting it off again.

queue avoided bytes X%

If a message is put to a queue when there is a waiting getter, the message might not need to be queued because it can be passed to the getter immediately. Such a message is said to have avoided the queue, and "queue avoided bytes" is the count of such bytes. Increasing queue avoidance is likely to increase the maximum throughput of your system because it avoids the cost of putting the message onto the queue and getting it off again.

GET (MQGET)

MQGET count X

MQGET byte count X

destructive MQGET non-persistent message count X

destructive MQGET persistent message count X

rolled back MQGET count X

destructive MQGET non-persistent byte count X

destructive MQGET persistent byte count X

MQGET browse non-persistent message count X

MQGET browse persistent message count X

MQGET browse non-persistent byte count X

MQGET browse persistent byte count X

messages expired X (moved to GENERAL from IBM MQ 9.3)

queue purged count X (moved to GENERAL from IBM MQ 9.3)

average queue time X uSec (moved to GENERAL from IBM MQ 9.3)

Queue depth X (moved to GENERAL from IBM MQ 9.3)

destructive MQGET fails X

destructive MQGET fails with MQRC_NO_MSG_AVAILABLE X

destructive MQGET fails with MQRC_TRUNCATED_MSG_FAILED X

MQGET browse fails X

MQGET browse fails with MQRC_NO_MSG_AVAILABLE X

MQGET browse fails with MQRC_TRUNCATED_MSG_FAILED X

EXTENDED

msg search count

Number of MQGETs where the queue manager searched for a message (this is every MQGET that was not satisfied by directly passing an MQPUT to a waiting getter - see "queue avoided puts").

msg not found count

Number of MQGETs where the queue manager failed to find a message.

msg examine count

Number of (matching and unmatching) messages examined by searches. Reasons for match failures are described in the following statistics:

intran get skipped count

Messages examined but skipped because they were locked by an uncommitted MQGET transaction.

put skipped count

Messages examined but skipped because they had been put in a transaction that had not been committed.

selection mismatch count

Messages that were checked and did not match properties required by selector.

correlid mismatch short count

Messages examined because of MQMO_MATCH_CORREL_ID and skipped because quick CorrelId hash did not match requested Id.

correlid mismatch long count

Messages examined and matching quick CorrelId hash check, but failing full CorrelId comparison.

msgid mismatch count

Messages examined because of MQMO_MATCH_MSG_ID and skipped because MsgId did not match requested Id.

load msg dtl count

Messages or message headers that needed to be loaded from the Q file to check for a match.
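The EXTENDED counters can be combined to give a rough picture of how much work each message search involves. The following Python sketch divides "msg examine count" and "msg not found count" by "msg search count" for one interval. These ratios are an illustration based on the definitions above, not metrics that the queue manager publishes, and the function name search_summary and the sample values are hypothetical.

```python
def search_summary(search_count, not_found_count, examine_count):
    """Derive per-search ratios from the EXTENDED counters for one interval.

    Returns None when no searches took place, because the ratios are
    undefined in that case.
    """
    if search_count == 0:
        return None
    return {
        # Average number of messages examined for each search.
        "examined_per_search": examine_count / search_count,
        # Fraction of searches that found no message.
        "not_found_fraction": not_found_count / search_count,
    }


if __name__ == "__main__":
    # Hypothetical interval: 200 searches, 50 found nothing, 1000 messages examined.
    print(search_summary(200, 50, 1000))
```

A high number of messages examined per search, together with large mismatch counts, might suggest that getters are scanning deep queues by CorrelId, MsgId, or selector.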

STATAPP (per-application usage statistics)

INSTANCE (instance statistics): Instance count X absolute; Movable instance count X absolute; Instance shortfall count X absolute; Instances started X interval; Initiated outbound instance moves X interval; Completed outbound instance moves X interval; Instances ended during reconnect X interval; Instances ended X interval

NHAREPLICA (per-instance Native HA statistics)

REPLICATION (replication statistics)

Average network round-trip time X uSec

Synchronous log bytes sent X

Catch-up log bytes sent X

Synchronous compressed log bytes sent X

Catch-up compressed log bytes sent X

Synchronous uncompressed log bytes sent X

Catch-up uncompressed log bytes sent X

Synchronous log data average compression time X uSec

Catch-up log data average compression time X uSec

Synchronous log bytes decompressed

Catch-up log bytes decompressed

Synchronous log data average decompression time X uSec

Catch-up log data average decompression time X uSec

Log write average acknowledgment latency X uSec

Log write average acknowledgment size X

Backlog bytes X

Backlog average bytes X

Acknowledged log sequence number X

The LSN the instance has acknowledged as a 64-bit number.

Log file system - bytes in use X

The number of bytes in use by the log file system on the instance.

Log file system - free space X%

The amount of free space.

Queue Manager file system - bytes in use X MB

The number of MB in use by the queue manager file system.

Queue Manager file system - free space X%

The amount of free space.

MQ FDC file count X

The number of FDCs present on the instance.

RECOVERY (recovery group statistics)

Average network round-trip time X uSec

Compressed log bytes sent

Log data average compression time X uSec

Log bytes decompressed

Log data average decompression time X uSec

Log bytes sent X

The number of bytes sent to the group in this interval.

Backlog bytes X

The number of bytes the group is behind.

Backlog average bytes X

The short-term rolling average number of bytes the group is behind.

Rebase count X

The number of times the group has been rebased.

Recovery LSN X

The LSN the group could recover from as a 64-bit number.