Comparison between broker accounting statistics data and CandleMonitor node statistics data

The WebSphere® Message Broker Monitoring Agent provides two types of statistical data to monitor message flows in a broker:

  • Broker accounting and statistics data
  • CandleMonitor node statistics data

Use the information in this topic to choose the data type that best suits your environment.

Broker accounting and statistics data

When statistics are collected for the broker and XML format is specified as the output destination, the WebSphere Message Broker Monitoring Agent reports on accounting and statistics data that is produced by the broker. The agent automatically subscribes to the broker to receive this type of data.

To collect accounting and statistics data, you only need to run the mqsichangeflowstats command to enable statistics collection at the broker. After that, the data is displayed in the Tivoli® Enterprise Portal workspaces. With the mqsichangeflowstats command, you can specify whether to do the following things:

  • Collect statistics for a specific message flow
  • Collect statistics for all message flows
  • Collect thread related statistics
  • Collect node related statistics (including terminals for the nodes)

All these data are supported by the WebSphere Message Broker Monitoring Agent.
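The options listed above can be sketched as a small helper that assembles the mqsichangeflowstats argument list without running it. This is only an illustration: the function name and its parameters are hypothetical, and the flags (-a, -s, -e, -g, -f, -j, -c, -t, -n, -o) are given as they are commonly documented for WebSphere Message Broker, so check them against the command reference for your broker version before use.

```python
from typing import List, Optional


def build_flowstats_command(
    broker: str,
    collection: str = "archive",            # "archive" (-a) or "snapshot" (-s)
    execution_group: Optional[str] = None,  # None means all execution groups (-g)
    message_flow: Optional[str] = None,     # None means all message flows (-j)
    thread_data: bool = False,              # thread related statistics (-t basic)
    node_data: str = "none",                # "none", "basic", or "advanced" (adds terminals)
    control: str = "active",                # "active" or "inactive"
) -> List[str]:
    """Assemble (but do not run) an mqsichangeflowstats command line."""
    if collection not in ("archive", "snapshot"):
        raise ValueError("collection must be 'archive' or 'snapshot'")
    if node_data not in ("none", "basic", "advanced"):
        raise ValueError("node_data must be 'none', 'basic', or 'advanced'")
    if control not in ("active", "inactive"):
        raise ValueError("control must be 'active' or 'inactive'")
    # Assumption: a single named flow only makes sense inside a named execution group.
    if execution_group is None and message_flow is not None:
        raise ValueError("name an execution group when naming a message flow")

    argv = ["mqsichangeflowstats", broker, "-a" if collection == "archive" else "-s"]
    argv += ["-e", execution_group] if execution_group else ["-g"]
    argv += ["-f", message_flow] if message_flow else ["-j"]
    argv += ["-c", control]
    if control == "active":
        # XML output is the format that the monitoring agent subscribes to.
        argv += ["-t", "basic" if thread_data else "none", "-n", node_data, "-o", "xml"]
    return argv


# Archive statistics for one flow, with thread, node, and terminal detail.
print(" ".join(build_flowstats_command(
    "BRK1", execution_group="default", message_flow="OrderFlow",
    thread_data=True, node_data="advanced")))
```

The broker, execution group, and flow names in the usage line are placeholders. Returning a list rather than a string keeps the helper safe to pass to a process launcher without shell quoting issues.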

Tip: You can issue the mqsichangeflowstats command to the broker-managed systems at any time by using the predefined QI Change Flow Stats command. For more information about how to issue this Take Action command for Tivoli Enterprise Portal, see Sending a Take Action command and WMB Change Flow Stats/QI Change Flow Stats command.

There are two types of data collection: snapshot and archive. Although they are used for different purposes, both types support the same levels of detail about message flows, threads, nodes, and terminals.

Archive data is intended for longer-term accounting and statistics. This is the type of data that you collect continuously for general monitoring of message flows. Archive data is collected at an interval that you can configure. The minimum interval is 10 minutes, and the default is 60 minutes. To change this interval, use the mqsichangebroker command with the -v parameter or the QI Change Broker command from Tivoli Enterprise Portal. The broker must be stopped when the command is issued.
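As a minimal sketch of the interval rule above, the following helper rejects values below the 10-minute minimum before assembling the mqsichangebroker -v argument list. The function and constant names are hypothetical, the command is only built and not executed, and the broker must still be stopped before the real command is issued.

```python
from typing import List

MIN_ARCHIVE_INTERVAL_MINUTES = 10      # minimum stated in this topic
DEFAULT_ARCHIVE_INTERVAL_MINUTES = 60  # default stated in this topic


def build_archive_interval_command(
    broker: str, minutes: int = DEFAULT_ARCHIVE_INTERVAL_MINUTES
) -> List[str]:
    """Assemble (but do not run) mqsichangebroker with a new archive interval."""
    if minutes < MIN_ARCHIVE_INTERVAL_MINUTES:
        raise ValueError(
            f"archive interval must be at least {MIN_ARCHIVE_INTERVAL_MINUTES} minutes"
        )
    return ["mqsichangebroker", broker, "-v", str(minutes)]


print(build_archive_interval_command("BRK1", 30))
```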

Snapshot data is the type that you collect for a short period of time when you are troubleshooting a problem in one or more message flows. Snapshot data is collected every 20 seconds, and you cannot change this interval. Collecting snapshot data can affect the performance of the broker, so enable snapshot data collection only when you need it and disable it when you are finished. You can enable or disable snapshot data collection with the QI Change Flow Stats command from Tivoli Enterprise Portal.

There are four levels of data: message flow, thread, node, and terminal. The attributes for the accounting and statistics data vary by level. Message flow level data includes elapsed and CPU timings, input and output message counts and sizes, and various special or error counts. Thread level data reports on the threads that process message flows and includes CPU and elapsed timings, and message size and rate. Node level data reports on elapsed and CPU timings for each node in a message flow. Terminal level data presents counts of invocations of the various terminals for nodes in the message flow.
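One use of the terminal level counts is to notice that a failure terminal was invoked during an interval, which means that a message flowed down a failure path. The following sketch shows that check over a list of terminal records. The record layout (the flow, node, terminal, and invocations keys) is invented for illustration and is not the format in which the broker or the agent reports the data.

```python
from typing import Dict, List, Union

TerminalRow = Dict[str, Union[str, int]]


def failure_terminal_hits(rows: List[TerminalRow]) -> List[TerminalRow]:
    """Return the rows for failure terminals that were invoked at least once."""
    return [
        row for row in rows
        if str(row["terminal"]).lower() == "failure" and int(row["invocations"]) > 0
    ]


# Hypothetical terminal level data for one collection interval.
interval_rows: List[TerminalRow] = [
    {"flow": "OrderFlow", "node": "MQInput", "terminal": "out", "invocations": 120},
    {"flow": "OrderFlow", "node": "MQInput", "terminal": "failure", "invocations": 0},
    {"flow": "OrderFlow", "node": "Compute", "terminal": "failure", "invocations": 3},
]

for hit in failure_terminal_hits(interval_rows):
    print(f"{hit['flow']}/{hit['node']}: {hit['invocations']} failure invocation(s)")
```

A check like this shows only that a failure occurred and where. It carries no information about the message itself.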

Situations can be targeted at the most current interval data to detect problems automatically. A set of workspaces is available to display the following accounting and statistics data:

  • Data collected during the most current interval
  • Data collected during several most recent intervals for trending
  • Historical data (when historical data collection is enabled)

With historical data collection, you can track the archive statistics for accounting purposes without purchasing or developing a second application.

If you intend to use the accounting origin support to organize your data, you must configure the participating message flows to provide the appropriate origin identifier. As described in the WebSphere Message Broker documentation, this involves coding an ESQL statement in a Compute, Database, or Filter node that sets the value that you want. In addition, you must specify the -b basic parameter on the mqsichangeflowstats command that you use to start data collection.

CandleMonitor node statistics data

CandleMonitor node statistics data are produced by the CandleMonitor nodes that are placed in message flows. A CandleMonitor node can be placed multiple times within a message flow, depending on the amount of monitoring that you want. To use the CandleMonitor node for monitoring, you must do the following things:

  1. Make the CandleMonitor node available in the Message Broker Toolkit.
  2. Modify the message flow.
  3. Redeploy the message flow to the broker.

For typical monitoring, it is sufficient to place one CandleMonitor node at the beginning of the flow (after the input node) and one CandleMonitor node at the end of the flow (before the output node). Only message flows with at least one CandleMonitor node are represented in the reported data.

You can also configure the CandleMonitor node to produce user-defined message flow events for situation detection and for display of problems in Tivoli Enterprise Portal. For example, a node can report an event when a message flows down a failure path, and the CandleMonitor node can automatically report exceptions that are propagated from any node as message flow events. Configuration parameters that control which types of nodes are active can also be used to stop certain types of nodes from reporting data, which scales back data collection. You can therefore place many nodes during message flow development (for example, to monitor sub-flows at a finer granularity) and leave them in place, inactive, when you move to production.

The CandleMonitor node implementation is provided at the broker by the kqipnode.lil file, which must be made available to the broker before you deploy a message flow that contains the node. When the broker initializes the kqipnode.lil file, a shared memory area is set up for recording the statistics. When a message flows through the CandleMonitor node, the node itself does very little processing of data. Instead, the WebSphere Message Broker Monitoring Agent reads the shared memory and does all the calculation and summarization of data, so the CandleMonitor node has little impact on the message flow. By default, the agent does this work at a 1-minute interval. The agent also picks up from shared memory any message flow events that are posted by CandleMonitor nodes, every 15 seconds by default. You can use the agent configuration parameters to modify these intervals.

The same basic set of statistics, which gives elapsed timings, input and output message counts, and queue timings, is available at the following summarization levels:

  • CandleMonitor node
  • Sub-flow
  • Message flow
  • Execution group
  • Broker

The base data that are collected by the agent are reported in Monitor Node Base Statistics workspace/CandleMonitor Node Statistics workspace. Data in Sub-Flow Statistics workspace are summarized for each subflow that is delineated by the CandleMonitor nodes of subflow type. Data in Monitor Node Message Flow Statistics workspace/Message Flow Statistics workspace are summarized for each message flow with at least one CandleMonitor node of input type. The statistics are summarized also at the execution group level in Monitor Node Execution Group Statistics workspace/Execution Group Statistics workspace and at the broker level in Monitor Node Broker Statistics workspace/Broker Statistics workspace.

You can also use the WMB Create User Statistics/QI Create User Statistics command to create what are collectively called user statistics. These are the same statistics at the same levels, except that you collect the data when you want by issuing a Take Action command named WMB Sample User Statistics/QI Sample User Statistics. Issue one WMB Sample User Statistics/QI Sample User Statistics command when you want the interval to begin, and issue another when you want the interval to end. This type of collection is useful, for example, to gather statistics for a particular set of messages as they flow.

Cumulative statistics with Overall attribute names are maintained internally at the CandleMonitor node level. These statistics are simultaneously reset to 0 for all CandleMonitor nodes when a deploy operation to a broker involves any message flow containing a CandleMonitor node, when any CandleMonitor node detects an overflow condition for statistics, or when the Reset Statistics Take Action command has been issued.

A reset of statistics includes all statistics that are maintained for the broker to preserve the integrity of summarized statistics. The CandleMonitor node produces an Event Log message when a reset occurs because of a numeric overflow condition, and the monitoring agent logs a message when a reset is detected.

At the time of a reset of statistics, workspaces show that the Overall statistics have started over from zero.

Historical workspaces display data from before the reset combined with data following the reset for the interval in which the reset occurred. This ensures that no historical data is lost. In subsequent intervals, the historical workspaces display the Overall values as having started over from zero.
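If you export Overall values and compute per-interval differences yourself, the reset behavior described above needs to be taken into account. The following is a rough sketch under that assumption, with a hypothetical function name and sample values. It treats any drop in a cumulative value as a reset to zero and counts only the activity recorded after the reset. It does not reproduce the way the agent combines pre-reset and post-reset data in its historical workspaces.

```python
from typing import Iterable, List


def interval_deltas(overall_samples: Iterable[int]) -> List[int]:
    """Turn successive cumulative (Overall) samples into per-interval deltas."""
    deltas: List[int] = []
    previous = None
    for current in overall_samples:
        if previous is not None:
            if current >= previous:
                deltas.append(current - previous)
            else:
                # The counter went backwards, so assume it was reset to zero
                # during this interval and count what accumulated afterwards.
                deltas.append(current)
        previous = current
    return deltas


# Hypothetical Overall message counts; a reset happens before the third sample.
print(interval_deltas([100, 140, 15, 40]))  # prints [40, 15, 25]
```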

Table 1. Comparison between broker accounting statistics and CandleMonitor node statistics
Each category lists the similarities between the two data types, followed by the differences for broker accounting and statistics and for CandleMonitor node statistics.

Category: Data attributes

Similarities: Both types have the following attributes:

  • Elapsed Timings
  • Message Counts
  • Message Sizes
  • Byte Rates

Differences, broker accounting and statistics:

  • Special error counts at the message flow level
  • CPU Timings
  • Invocation counts for nodes and terminals

Differences, CandleMonitor node statistics:

  • Queue Time
  • Message Flow Events

Category: Data levels

Similarities: Both types have the message flow level.

Differences, broker accounting and statistics:

  • Thread
  • Node
  • Terminal

Data of all levels are available with archive and snapshot accounting.

Differences, CandleMonitor node statistics:

  • Broker
  • Execution group
  • Sub-flow
  • CandleMonitor node

Data of all levels are available with regular and user-defined statistics.

Category: Detection of message flowing on a failure path

Similarities: (none)

Differences, broker accounting and statistics: Accounting and statistics data at the terminal level can be used in a situation to detect that a failure terminal has an invocation count that is greater than zero, which means that a message flowed to a failure path. However, no information about the message is available.

Differences, CandleMonitor node statistics: A CandleMonitor node can be positioned along a failure path in a message flow with the event attribute set to an event message. This message flow event can be displayed in Tivoli Enterprise Portal and can be detected by situations. The message ID and correlation ID are among the message data available.

Category: Collection interval

Similarities: The interval can be changed for some or all of the data collection.

Differences, broker accounting and statistics: The archive interval has a minimum of 10 minutes and a default of 60 minutes. The snapshot interval is fixed at 20 seconds and cannot be changed.

Differences, CandleMonitor node statistics: The default interval is 1 minute for statistics (which can be set to less than 1 minute) and 15 seconds for message flow events. The intervals are set in the kqi.xml agent configuration file with the defaultStatisticsInterval and defaultFlowEventInterval parameters.

Category: Performance

Similarities: (none)

Differences, broker accounting and statistics: The accounting and statistics collection interval can affect broker performance. Use snapshot accounting only for problem determination, not for regular monitoring.

Differences, CandleMonitor node statistics: The CandleMonitor node has little impact on a message flow. The interval does not affect broker performance.

Category: Agent installation

Similarities: (none)

Differences, broker accounting and statistics: Accounting and statistics data require no extra installation steps.

Differences, CandleMonitor node statistics: The CandleMonitor node requires root authority for installation of the kqipnode.lil file at the broker on UNIX systems. On z/OS® systems, additional steps are required to integrate the node into the broker environment. You must make the CandleMonitor node available in the Message Broker Toolkit separately.

Category: Configuration

Similarities: Configuration parameters are provided in the kqi.xml configuration file. The WebSphere Message Broker Monitoring Agent must be restarted for a parameter change to take effect.

Differences, broker accounting and statistics: Accounting and statistics data are configured by the mqsichangeflowstats command, which is also available as a Take Action command from the Tivoli Enterprise Portal interface. The configuration can be done dynamically.

Differences, CandleMonitor node statistics: Primary configuration of the CandleMonitor node is done within the Message Broker Toolkit, where message flows are implemented with the node. The kqipnode.cfg configuration file is available at the broker. The broker must be restarted for a configuration change to take effect.

Category: Reason most often chosen

Similarities: Both types provide the data that are required to determine the status of a message flow.

Differences, broker accounting and statistics: Accounting and statistics data are chosen because they are easier to configure than CandleMonitor node statistics data. Accounting and statistics data also have CPU timings, error counts, and node invocation counts.

Differences, CandleMonitor node statistics: CandleMonitor node statistics data are chosen because of better performance, the ability to refresh monitoring data at a shorter interval, queue times, and the timely reporting of message flow events.