IBM WebSphere® Information Integrator Q Replication is designed for replicating large volumes of data at very low latency. In a mission-critical environment with replication, you want an easy way to see the overall system status and get immediate notifications when something goes wrong. This is where IBM Tivoli Monitoring comes into play. It provides a place where all business-relevant applications report their current status, allowing administrators to see the overall health of their systems.
Since there is no dedicated Q Replication agent for Tivoli Monitoring, this article shows how to access Q Replication monitoring information, how to bring this data into the Tivoli Platform, and how to use alerts and situations to receive notifications when critical events occur.
Figure 1. Apply_Monitor attribute group in the Tivoli Enterprise Portal Client
Figure 1 shows the Tivoli Enterprise Portal Client and graphs that display live data from Q Replication. The graphs are updated automatically so that the latest information is always available. Using the Tivoli Monitoring "Historical Data Collection" feature, it is also possible to show data from any point back in time.
IBM WebSphere Information Integrator Q Replication
Q Replication mainly consists of two programs:
- Q Capture
- Q Apply
The capture program monitors changes on source tables and converts committed transactional data into messages. These messages are sent to the target location through WebSphere MQ message queues, where they are read from the queues and converted back into transactional data by the apply program. The transactions are then applied to the target tables with a highly parallelized method that preserves the integrity of the data. Figure 2 shows a sample configuration for Q Replication.
Figure 2. Sample configuration for Q Replication
To learn more about Q Replication and SQL replication see "Introduction to Replication and Event Publishing" (GC18-7567-00), available at the DB2 UDB, DB2 Connect and DB2 Information Integrator Version 8 product manuals page.
IBM Tivoli Monitoring
The IBM Tivoli Monitoring platform is a suite of products that monitor and manage system and network applications on a variety of platforms. These products keep track of the availability and performance of all parts of your enterprise from one or more designated workstations, and provide reports you can use to track trends and troubleshoot problems.
The Monitoring platform consists of the following products:
- The Tivoli Enterprise Portal Client (1): A Java-based user interface for viewing and monitoring your enterprise.
- The Tivoli Enterprise Portal Server (2): Software components for the client that retrieve, manipulate, and analyze data from the agent programs monitoring your enterprise applications.
- The Tivoli Enterprise Monitoring Server (3): Acts as a collection and control point for alerts received from the agents. It also collects performance and availability data from the agents and passes it on to the portal server.
- Tivoli Enterprise Monitoring Agents (4): These are installed on the systems or subsystems you want to monitor. These agents collect and distribute data to a monitoring server. For example, Tivoli Monitoring for Databases contains such agents for Database Products.
Figure 3. IBM Tivoli Monitoring data flow
This article describes how to set up the Tivoli Universal Agent for Q Replication. It is assumed that WebSphere Information Integrator and Tivoli Monitoring are already installed. For information about setting up these products, see the documents in the Related references section below. The following product versions are assumed:
- IBM Tivoli Monitoring Version 6.1.0
- IBM Tivoli Universal Agent Version 6.1.1
- IBM WebSphere Information Integrator Version 8.2
For more details on planning, configuring, and administering a Q Replication environment, refer to the DB2 Information Center and "Replication and Event Publishing Guide and Reference" (SC18-7568-00), available at the DB2 UDB, DB2 Connect and DB2 Information Integrator Version 8 product manuals page.
Accessing Q Replication status information
Each capture and apply program has a corresponding set of control tables. These tables are stored at the same location as the Q Apply and Q Capture programs, and contain configuration values plus the status information you want to access. You can find these tables in the schemas you defined as the capture and apply schemas during your replication setup.
Figure 4 shows the monitor control table from the Q Apply program as an example.
Figure 4. Q Apply status table
Since each of the Q Apply and Q Capture programs potentially runs on a different system and the control tables are local to the programs, Tivoli Monitoring needs to connect to each of the systems separately. The following section provides detailed steps on how to do this.
If you want a full list of all Q Replication control tables, and what data they contain, refer to the corresponding chapters in the Replication and Event Publishing Guide and Reference.
Set up the Tivoli Universal Agent
For detailed instructions on how to install the Universal Agent refer to Installing monitoring agents and the documentation from the product CD-ROM. For additional information, refer to the Tivoli Universal Agent User's Guide.
After successful installation of the Universal Agent, follow the steps below to set up and configure a new instance of the agent that uses the Open Database Connectivity (ODBC) data provider to collect monitoring information.
- Open Manage Tivoli Enterprise Monitoring Services.
- Right-click on the Universal Agent with Task/Subsystem Primary, and select create instance....
- On the screen that opens, enter the name for the new Universal Agent instance, for example, Q_REPL_AGENT.
- Right-click on the new Universal Agent Instance you just created, and select configure using defaults.
- Click Yes when asked whether you want to "update the file KUMENV_Q_REPL_AGENT prior to configuration of the universal agent."
- In the text editor that opens, change the following setting from:
*-----------------------------------------------------------------*
* UA Startup automatic start DP options *
* (ASFS,APIS,FILE,SOCK,HTTP,SNMP,POST,WBEM,ODBC) *
*-----------------------------------------------------------------*
KUMA_STARTUP_DP=ASFS
to:
*-----------------------------------------------------------------*
* UA Startup automatic start DP options *
* (ASFS,APIS,FILE,SOCK,HTTP,SNMP,POST,WBEM,ODBC) *
*-----------------------------------------------------------------*
KUMA_STARTUP_DP=ODBC
- After closing the text editor, you are asked if you now want to configure the agent. Click Yes.
- To check if the configuration was successful, see if the red exclamation mark on the left of the new agent's name has changed to a green circle.
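If you configure several agent instances, the one-line change to the KUMENV file can also be scripted. The following is a minimal sketch, not part of the product: it assumes you have the contents of the KUMENV file as a string, and the function name set_startup_dp is made up for this example. Only the KUMA_STARTUP_DP setting itself comes from the steps above.

```python
import re


def set_startup_dp(env_text: str, provider: str = "ODBC") -> str:
    """Return env_text with the KUMA_STARTUP_DP setting changed to `provider`.

    Comment lines (which start with '*') are left untouched because the
    pattern only matches lines that begin with the setting name.
    """
    pattern = re.compile(r"^KUMA_STARTUP_DP=.*$", re.MULTILINE)
    new_line = f"KUMA_STARTUP_DP={provider}"
    if pattern.search(env_text):
        return pattern.sub(new_line, env_text)
    # The setting is missing: append it on its own line.
    separator = "" if env_text.endswith("\n") or not env_text else "\n"
    return env_text + separator + new_line + "\n"


if __name__ == "__main__":
    sample = "* UA Startup automatic start DP options *\nKUMA_STARTUP_DP=ASFS\n"
    print(set_startup_dp(sample))
```

The function works on text rather than on a file path, so you can review the result before writing it back to the instance's KUMENV file.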
Next, create a meta file for the Universal Agent. This file describes the data you want to monitor for the Tivoli platform and also contains configuration information like datasource name and location for the Universal Agent.
For a quick start, you can download the sample meta file found in the Download section and continue with the section Meta-file adaption below. Save
the sample meta file you downloaded under
The Universal Agent ships with a command line tool to generate a valid meta file from an existing ODBC datasource. However, the generated file needs some additional work to adapt it to your environment.
Before you can start and use the Universal Agent tool to generate the meta file, you need to set up ODBC data source names (DSN) for the databases where the Q Replication control tables are stored. To create a DSN on a Windows system, go to Administrative Tools, Data Sources (ODBC), create a system DSN using the DB2 ODBC driver, and select the appropriate database that contains the Q Replication control tables for the replication program you want to monitor.
After you have created the DSNs, use the KUMPCON tool located at IBM\ITM\TMAITM6. Start it in a console window with KUMPCON GENERATE YourDataSourceName user=YourUserID pswd=YourPassword, then follow the instructions on the screen.
When asked to pattern match on particular table names, answer yes, and use "IBMQREP" as the pattern to match. This way the KUMPCON tool only generates meta data for tables starting with IBMQREP, as all the Q Replication control tables do.
Figure 5 shows sample output from the KUMPCON tool.
Figure 5. KUMPCON sample output
Tivoli Agent meta files are plain text files and can be edited with any text editor. After running KUMPCON GENERATE, you can find the generated file in the folder IBM\ITM\TMAITM6\metafiles. Since it was generated automatically, it is in a very raw form and needs further tweaks. Here is a list of things you might want to change:
- Data Types: Tivoli Monitoring uses its own type system so that any data from outside must be described with a Tivoli type. The (simplified) syntax for an attribute definition in the meta file is as follows:
attribute-name attribute-type maximum-size [KEY] [ATOMIC] [@help text]
You can find a list of Tivoli attribute types in Table 1.
You need to modify the generated meta file because KUMPCON GENERATE tends to use very restrictive types and generates more D (DisplayString) and N (DisplayNumeric) typed attributes than necessary. Those attributes cannot be used for bar, pie, or gauge charts in the Tivoli Enterprise Portal Client. These chart types only accept pure numeric attributes (such as types G, C, A, or #). Character and date attributes can only be displayed in a table. Also, pure numeric types are much more flexible in situation definitions. For example, a rule like if x > 10 is not possible when attribute x has a character (Display*) type.
As a general rule, you should always use a numeric type for attributes you know to contain only numeric values. In this case, where the attribute values come from DB2, you can check the column type from the table definition.
Table 1. Tivoli data types
|Type|Description|
|S|Switch. Boolean 0 or 1.|
|G|Gauge. Positive or negative integer.|
|C|Counter. Positive integer.|
|A|Average. Data to be averaged over all collections.|
|D|DisplayString. Series of characters.|
|N|DisplayNumeric. Series of numeric characters.|
|T|Time. The format is CYYMMDDHHMMSSmmm (where C=1 for the 21st century).|
|#|Delta value. Presents the value of the attribute as the difference between samples. For example, if the value for sample 1 is 100 and for sample 2 is 120, the delta is 20.|
|%|Percentage of change. Presents the value of the attribute as the difference between samples expressed as a percentage. For example, if ReceiveCount is defined as % data type, and the value for sample 1 is 100 and for sample 2 is 120, the percentage of change is 20.|
|?|Rate of change. Presents the value of the attribute as the delta per second between samples. For example, if ReceiveCount is defined as ? data type, the value for the first sample is 100, the value for the second sample is 120, and the elapsed time between samples 1 and 2 is 5 seconds, the rate of change is 4 per second.|
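The rule of thumb above (numeric column, numeric attribute type) can be written down as a small helper that you apply while reviewing a generated meta file. This is only a sketch under stated assumptions: the mapping is a starting point chosen for this example, not an official Tivoli or DB2 mapping, and the function name suggest_tivoli_type is hypothetical. It proposes G (Gauge) for DB2 integer columns and falls back to D (DisplayString) for everything else. Whether C, A, or # fits better than G depends on what the column means.

```python
# DB2 integer column types that can safely be described with a numeric
# Tivoli type. The set is an assumption made for this sketch.
INTEGER_TYPES = {"SMALLINT", "INTEGER", "INT", "BIGINT"}


def suggest_tivoli_type(db2_type: str) -> str:
    """Suggest a Tivoli attribute type code for a DB2 column type.

    Returns "G" (Gauge) for integer columns and "D" (DisplayString)
    otherwise. Length or precision in parentheses is ignored.
    """
    base = db2_type.strip().upper().split("(")[0].strip()
    return "G" if base in INTEGER_TYPES else "D"


if __name__ == "__main__":
    for column_type in ("INTEGER", "VARCHAR(48)", "TIMESTAMP"):
        print(column_type, "->", suggest_tivoli_type(column_type))
```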
Attribute properties: An attribute can have none, one, or both of the following properties:
Key: Indicates that an attribute is a key attribute. Tivoli Monitoring uses key attributes to determine whether multiple events have the same cause. Key attributes help correlate data rows with identical keyed attribute values. When the Universal Agent receives data rows for keyed attribute data, it checks to see if it already has a data row with matching values for the keyed attributes. If so, the new data row replaces the existing one. Note: Up to five key attributes per attribute group are allowed.
Atomic: Indicates that an attribute is atomic. Atomizing an attribute means that a separate event is generated for each row for which a situation on that attribute evaluates to true. For example, if the situation definition is IF mem_usage > 100, the atomized attribute mem_usage raises an event for every process that has allocated more than 100MB of memory. If more than one process fulfills that condition, each process raises a separate event. A non-atomized version of the mem_usage attribute behaves differently. Only one event is raised, even if more than one process fulfills the condition at the same time.
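The difference between atomized and non-atomized attributes can be illustrated with a toy model. The sketch below only mimics the behavior described above. It is not how Tivoli Monitoring is implemented, and the row layout (a process name plus a mem_usage value) and the function name raise_events are invented for the example.

```python
def raise_events(rows, predicate, atomic):
    """Return the events a situation would raise for one sample.

    rows      -- list of dicts, one per monitored item
    predicate -- function that returns True when the situation fires for a row
    atomic    -- True: one event per matching row; False: at most one event
    """
    matches = [row for row in rows if predicate(row)]
    if not matches:
        return []
    if atomic:
        return [{"source": row["process"]} for row in matches]
    return [{"source": "any"}]


if __name__ == "__main__":
    sample = [
        {"process": "p1", "mem_usage": 150},
        {"process": "p2", "mem_usage": 50},
        {"process": "p3", "mem_usage": 200},
    ]

    def over_limit(row):
        return row["mem_usage"] > 100

    print(raise_events(sample, over_limit, atomic=True))   # one event each for p1 and p3
    print(raise_events(sample, over_limit, atomic=False))  # a single event
```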
It is generally a good idea for ODBC meta files to use keyed tables because it prevents the same retrieved rows from being added multiple times whenever the SQL SELECT statement is executed, and most ODBC tables have one or more indexed columns which logically correspond to key attributes in the meta file.
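To see why keyed attributes prevent duplicate rows, consider the following sketch. It models the replace-on-matching-key behavior described above with a plain dictionary. This is an illustration only, not the Universal Agent's actual data structure. The column names come from the Apply_Monitor example in this article, while the function name merge_keyed and the sample values are made up.

```python
def merge_keyed(cache, new_rows, key_columns):
    """Merge freshly selected rows into a cache keyed on the key columns.

    A row whose key values already exist in the cache replaces the old
    row instead of being added a second time.
    """
    for row in new_rows:
        key = tuple(row[column] for column in key_columns)
        cache[key] = row
    return cache


if __name__ == "__main__":
    cache = {}
    first_poll = [
        {"MONITOR_TIME": "10:00:00", "RECVQ": "Q1", "QDEPTH": 3},
        {"MONITOR_TIME": "10:00:20", "RECVQ": "Q1", "QDEPTH": 5},
    ]
    # The second poll returns one row already seen and one new row.
    second_poll = [
        {"MONITOR_TIME": "10:00:20", "RECVQ": "Q1", "QDEPTH": 5},
        {"MONITOR_TIME": "10:00:40", "RECVQ": "Q1", "QDEPTH": 2},
    ]
    merge_keyed(cache, first_poll, ["MONITOR_TIME", "RECVQ"])
    merge_keyed(cache, second_poll, ["MONITOR_TIME", "RECVQ"])
    print(len(cache))  # three distinct rows, not four
```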
Note: To stay flexible in how the data is displayed in the Tivoli Enterprise Portal Client, use the SQL clause in the attribute group definition only to select the latest data, and filter which attributes to display in the chart definition at the Portal Client. Also, make sure that you understand the structure and the data of the control tables before creating graphs in the Portal Client. For example, the IBMQREP_APPLYMON table (and therefore the Apply_Monitor attribute group from the sample meta file) contains data for each queue browser thread of the apply program (one per queue the apply program listens to). You should use the Enterprise Portal's filter and grouping features to create a separate graph per receive queue.
To learn how to create charts in the Tivoli Enterprise Portal Client, refer to the Tivoli Enterprise Portal User Guide. For detailed information about Q Replication control tables, refer to Replication and Event Publishing Guide and Reference.
Data sampling method: You define the data sampling method, along with other attribute group properties, in the //NAME statement. Here is its full syntax:
//NAME attribute-group-name sample-method [time-to-live] [AddSourceName] [AddTimeStamp] [Interval=] [SkipNonNumeric=Y/N] [@help text]
Sample-method can be one of the four below:
- P: Polled data becomes available periodically and only the latest set of values is available for situation monitoring and reporting.
- S: Sampled data behaves in the same way as polled data except that more than one set of attribute data values may be available for use.
- K: Keyed data behaves in the same way as sampled data, but allows you to correlate events. Up to five attributes in each group can be designated as key attributes.
- E: Event data occurs unpredictably and is reported as it becomes available.
For ODBC data, K (keyed sampling) is the most appropriate sampling method, since almost every table has at least one key column and the DBMS ensures its semantics.
Note: You should set the Interval= property on each attribute group in the meta file to the same value as the monitor_interval parameter from the corresponding Q Apply and Q Capture programs. You get this value by issuing asnqccmd capture_server=YourDB2Alias capture_schema=YourCaptureSchema qryparms and asnqacmd apply_server=YourDB2Alias apply_schema=YourApplySchema qryparms in a command line window.
You might want to change application and attribute group names as well. Also, delete all attribute groups and attributes you don't need. This saves network traffic and ensures proper response times for the Tivoli Enterprise Portal Client.
Listing 1. Apply_Monitor Attribute Group
//NAME Apply_Monitor K 300 Interval=20
//SOURCE ODBC DB2_TARGET user=***** pswd=*****
//SQL SELECT * FROM APPLY221222.IBMQREP_APPLYMON order by monitor_time DESC fetch first 100 rows only
//ATTRIBUTES
MONITOR_TIME D 28 KEY ATOMIC
RECVQ D 48
QSTART_TIME D 28
CURRENT_MEMORY C 999999
QDEPTH C 999999
END2END_LATENCY C 999999
QLATENCY C 999999
APPLY_LATENCY C 999999
TRANS_APPLIED C 999999
ROWS_APPLIED C 999999
TRANS_SERIALIZED C 999999
...
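If you maintain attribute groups for several control tables, you may prefer to generate the meta file text rather than edit it by hand. The following sketch renders one attribute group in the layout of Listing 1, using the simplified attribute syntax shown earlier. The class and function names are hypothetical, the user= and pswd= options on the //SOURCE statement are left out for brevity, and the output should still be checked with KUMPCON VALIDATE.

```python
from dataclasses import dataclass


@dataclass
class Attribute:
    """One attribute line: name, Tivoli type code, maximum size, flags."""
    name: str
    type_code: str
    max_size: int
    key: bool = False
    atomic: bool = False

    def render(self) -> str:
        parts = [self.name, self.type_code, str(self.max_size)]
        if self.key:
            parts.append("KEY")
        if self.atomic:
            parts.append("ATOMIC")
        return " ".join(parts)


def render_group(name, dsn, sql, attributes, ttl=300, interval=20):
    """Render a keyed (K) attribute group as meta file text."""
    lines = [
        f"//NAME {name} K {ttl} Interval={interval}",
        f"//SOURCE ODBC {dsn}",
        f"//SQL {sql}",
        "//ATTRIBUTES",
    ]
    lines.extend(attribute.render() for attribute in attributes)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_group(
        "Apply_Monitor",
        "DB2_TARGET",
        "SELECT * FROM APPLY221222.IBMQREP_APPLYMON order by monitor_time DESC fetch first 100 rows only",
        [
            Attribute("MONITOR_TIME", "D", 28, key=True, atomic=True),
            Attribute("RECVQ", "D", 48),
            Attribute("QDEPTH", "C", 999999),
        ],
    ))
```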
When using the sample meta file from this article's Download section, make sure that you modify it to fit your system's ODBC DSNs, control table schema, application, and attribute group names. You should also check if you need to modify the SQL statements in the meta file's //SQL definitions. Also, make sure to set the Interval= property of each attribute group to the correct value as described in the note above.
When you are done editing the meta file, make sure to validate its syntax with KUMPCON VALIDATE metafile-name in a command line window.
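Before running KUMPCON VALIDATE, a quick pre-check of your own can catch one rule mentioned earlier: no more than five key attributes per attribute group. The sketch below is not a replacement for the validation tool. It assumes the simplified attribute syntax from this article (name, type, maximum size, then optional KEY and ATOMIC flags, then optional @help text), and the function names are hypothetical.

```python
MAX_KEYS = 5  # limit on key attributes per attribute group, as stated above


def key_counts(meta_text):
    """Count KEY attributes per attribute group in meta file text."""
    counts = {}
    group = None
    in_attributes = False
    for raw_line in meta_text.splitlines():
        line = raw_line.strip()
        if line.startswith("//NAME"):
            tokens = line.split()
            group = tokens[1] if len(tokens) > 1 else None
            if group is not None:
                counts[group] = 0
            in_attributes = False
        elif line.startswith("//ATTRIBUTES"):
            in_attributes = True
        elif line.startswith("//"):
            in_attributes = False
        elif in_attributes and group is not None and line:
            # Drop the optional @help text, then look for the KEY flag
            # after the name, type, and maximum-size tokens.
            tokens = line.split("@", 1)[0].split()
            if "KEY" in tokens[3:]:
                counts[group] += 1
    return counts


def groups_with_too_many_keys(meta_text):
    """Return the groups that exceed the key attribute limit."""
    return {g: n for g, n in key_counts(meta_text).items() if n > MAX_KEYS}
```

Calling groups_with_too_many_keys on the text of a meta file returns an empty dictionary when every group stays within the limit.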
Start your Universal Agent instance
Before starting your Universal Agent instance, you need to tell it where to find the meta file it should use. This is done through a configuration file specific to the Universal Agent instance you created in the previous steps. In a default installation, you can find this file at IBM\ITM\TMAITM6\work\KUMPCNFG_[$YOUR_AGENT_INSTANCE_NAME] (for example, IBM\ITM\TMAITM6\work\KUMPCNFG_Q_REPL_AGENT), where [$YOUR_AGENT_INSTANCE_NAME] stands for the name you gave to the new agent instance in Set up the Tivoli Universal Agent. If this file doesn't exist, create an empty text file with that name, open it in a text editor, and add a new line with the meta file name for every meta file you want the agent to load at startup. Also, make sure that your KUMP_INIT_CONFIG_PATH environment variables are set to correct values.
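Creating or extending the configuration file can be scripted as well. The sketch below follows the description above: one meta file name per line, in a file named KUMPCNFG_ followed by the instance name. The function name write_agent_config is made up, the work directory is passed in because its location depends on your installation, and the meta file name in the usage example is hypothetical.

```python
from pathlib import Path


def write_agent_config(work_dir, instance_name, metafile_names):
    """Create or extend the KUMPCNFG file for a Universal Agent instance.

    Each meta file name is written on its own line. Names that are
    already listed are not added a second time.
    """
    path = Path(work_dir) / f"KUMPCNFG_{instance_name}"
    lines = path.read_text().splitlines() if path.exists() else []
    for name in metafile_names:
        if name not in lines:
            lines.append(name)
    path.write_text("".join(f"{line}\n" for line in lines))
    return path


if __name__ == "__main__":
    import tempfile

    # A temporary directory stands in for the agent's work directory.
    with tempfile.TemporaryDirectory() as work_dir:
        config = write_agent_config(work_dir, "Q_REPL_AGENT", ["qrepl.mdl"])
        print(config.name)
        print(config.read_text())
```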
If the Universal Agent is already running and you want to import a new meta file without stopping the agent, you can use KUMPCON IMPORT metafile-name to activate a new meta file. Also, if you want to update an already active meta file without service interruption, use KUMPCON REFRESH metafile-name.
You can now start your new Universal Agent instance by right-clicking it in Manage Tivoli Enterprise Monitoring Services, then selecting Start in the context menu. When you open the Portal Client, you should see a new application being monitored by the universal data provider, and this application should contain a separate workspace for each of your attribute groups from the meta file. Figure 1 shows the Apply_Monitor attribute group from the sample meta file, which uses different graphs to visualize the current status of the replication.
To find out more about KUMPCON, meta files, and the Universal Agent configuration, refer to the Universal Agent User Guide.
The first place to look if something is wrong is the Universal Agent's log directory, located at IBM\ITM\TMAITM6\logs. If the Universal Agent is running but doesn't behave as desired, start the Tivoli Enterprise Portal Client and look at the Data Provider Log (DPLOG) workspace of the Universal Agent instance in question. The DPLOG workspace is similar to a system console log. It provides a detailed audit trail from the data provider.
If your agent instance starts but doesn't show up in the Portal Client, you should also check whether the Universal Agent configuration has the correct IP address or hostname for the primary monitoring server.
You can configure this value under Manage Tivoli Enterprise Monitoring Services by right-clicking the Universal Agent instance, clicking Advanced, clicking Configure advanced..., and then clicking OK. The window that opens contains the agent's connection settings for the primary monitoring server.
Situations and alerts for Q Replication
A situation is a logical expression involving one or more attributes from an attribute group defined in a meta file. Situations are used to monitor the condition and health of systems in your network. They can trigger executables on the system that caused them to fire or simply notify administrators that a certain event occurred. You can manage situations from the Portal Client with the Situation Editor.
Each dedicated agent comes with a set of predefined situations. You may activate them as they are, or take them as a starting point for your own set of situations. The Universal Agent, as used here for monitoring Q Replication, doesn't come with predefined situations because of its generic nature. The table below therefore provides a list of situation definitions you can take as inspiration for your own Q Replication situations. Also, the Formula column from the table below is a good start for the "Threshold" feature of table views in the Portal Client.
|Situation name|Formula|Description|
|APPLY_E2E_LATENCY_LIMIT||If you want to be notified, or execute a command, if replication end-to-end latency for ANY queue from the monitored apply process reached a certain value (here: >10s).|
|APPLY_DETECT_SPILLING||If you want to be notified, or execute a command, if the apply program needed to spill a row to the spill queue in WebSphere MQ.|
|APPLY_QDEPTH_LIMIT||If you want to be notified, or execute a command, if there are too many messages waiting in ANY of the queues from the monitored apply program.|
|APPLY_DETECT_MONSTER_TX||If you want to be notified, or execute a command, if one or more transactions on ANY apply queue exceeded the apply program's memory limit.|
|APPLY_DETECT_MEM_FULL||If you want to be notified, or execute a command, if the apply program could not process transactions from ANY receive queue because agents were using all available memory.|
|APPLY_WARNING||If you want to be notified, or execute a command, if the apply program reported a warning message.|
|APPLY_ERROR||If you want to be notified, or execute a command, if the apply program reported an error message.|
|CAPTURE_TX_SPILLED||If you want to be notified, or execute a command, if the capture program spilled transactions to disk or virtual I/O.|
|CAPTURE_WARNING||If you want to be notified, or execute a command, if the capture program reported a warning message.|
|CAPTURE_ERROR||If you want to be notified, or execute a command, if the capture program reported an error message.|
|QREP_HOTLIST_OFFLINE||If you want to be notified, or execute a command, if one of the Q Replication processes isn't running any more.|
When editing the situation, make sure to select an appropriate sampling interval for the situation under the Condition tab. You should also adjust the settings under the Action tab of your situation definition. In particular, parameters like "If the condition is true for more than one monitored item" or "If the condition stays true over multiple intervals" need to be set to the correct value.
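To make the logic behind a situation such as APPLY_E2E_LATENCY_LIMIT concrete, the sketch below evaluates the same kind of condition over rows of the Apply_Monitor attribute group. It is an illustration in Python, not a Tivoli situation formula. The column names END2END_LATENCY and RECVQ come from Listing 1. The assumption that the latency column holds milliseconds (so that 10 seconds is 10000) should be checked against the control table documentation, and the function name is made up.

```python
def queues_over_latency_limit(rows, limit_ms=10_000):
    """Return the receive queues whose end-to-end latency exceeds the limit.

    rows     -- list of dicts with RECVQ and END2END_LATENCY entries
    limit_ms -- threshold, assumed here to be in milliseconds
    """
    return [row["RECVQ"] for row in rows if row["END2END_LATENCY"] > limit_ms]


if __name__ == "__main__":
    sample = [
        {"RECVQ": "Q1", "END2END_LATENCY": 1200},
        {"RECVQ": "Q2", "END2END_LATENCY": 15000},
    ]
    offenders = queues_over_latency_limit(sample)
    # The situation is true if ANY queue is over the limit.
    print(bool(offenders), offenders)
```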
The real power of the Tivoli Monitoring system becomes visible when you create situations that are triggered from different attribute groups that come from different agents. This is very useful for Q Replication because this product uses other software to do parts of its work. Q Replication runs on top of an operating system, and uses WebSphere MQ to transport messages and DB2 to capture and apply transactions. Proper operation of Q Replication therefore depends heavily on the health of its underlying systems.
The Tivoli Monitoring product family already has dedicated agents for WebSphere MQ, DB2, and all major operating systems. You should install them as well and create situations or workflow definitions that span the whole software stack Q Replication uses. For example, you can do simple root cause analysis: If Q Replication shows errors and the WebSphere MQ Agent tells Tivoli Monitoring that the Admin Queue is down, generate a message that replication has stopped because there is a WebSphere MQ problem.
Note: You can only create situations that use attributes from the same attribute group. If you want to use attributes from different groups in your situation, you must create another situation and embed it in the first. Alternatively, you can create a new attribute group that contains all the attributes you want to include in the situation. You can also use the Workflow Editor to assemble situations.
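The root cause example above amounts to combining the results of two situations, one per agent. The following sketch shows that combination as plain Python. It stands in for an embedded situation or a workflow definition and is not Tivoli syntax. The function name diagnose and the wording of the second message are made up for the example.

```python
def diagnose(qrep_error, mq_admin_queue_down):
    """Combine two situation results into one root cause message.

    qrep_error          -- True if a Q Replication error situation fired
    mq_admin_queue_down -- True if the WebSphere MQ agent reports the
                           Admin Queue as down
    Returns a message, or None if there is nothing to report.
    """
    if qrep_error and mq_admin_queue_down:
        return "Replication has stopped because there is a WebSphere MQ problem."
    if qrep_error:
        return "Q Replication reported an error."
    return None


if __name__ == "__main__":
    print(diagnose(qrep_error=True, mq_admin_queue_down=True))
```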
For more information about situations and the Portal Client, refer to the Tivoli Enterprise Portal User Guide.
Other tools to monitor Q Replication
Other tools for monitoring Q Replication are available as well. First, there is the Replication Alert Monitor that ships with the Q Replication product and is available from the Replication Control Center.
Also, there is the "Q Replication Live Monitor" (developerWorks, September 2005), a small, lightweight tool that graphically displays real-time latency and throughput information, available at developerWorks.
A third tool, quite similar to the live monitor but with a lot more features, is the Q Replication Dashboard, available through the Q Replication Tools page.
This article showed you how to monitor IBM WebSphere Information Integrator Q Replication and integrate it into the IBM Tivoli Monitoring platform. Q Replication status information is available through control tables on each of the systems where the Q Apply and Q Capture processes run. You connect to these control tables by creating a meta file for the Universal Agent's ODBC data provider. The meta file groups the data to read into attribute groups and describes each attribute with a Tivoli-specific data type. Once the Universal Agent is configured and running, you can use the Tivoli Enterprise Portal Client to create and distribute situation definitions that notify you (or even execute a custom command) if a critical event occurs.
- "WebSphere Information Integrator Q Replication" (developerWorks, March 2005): Introduces Q Replication and its architecture.
- "Q Replication Live Monitor" (developerWorks, Sept 2005): Introduces a Live Monitor Tool for Q Replication.
- developerWorks Information Roadmap for Q Replication.
- Tivoli Enterprise Portal User Guide: Information about situation definitions and the Enterprise Portal.
- "Introduction to Replication and Event Publishing" (GC18-7567-00) and "Replication and Event Publishing Guide and Reference" (SC18-7568-00), available at the DB2 UDB, DB2 Connect and DB2 Information Integrator Version 8 product manuals page.
- DB2 Information Center: A good source for all DB2 related information.
- Tivoli Information Center: A good source for all Tivoli related information.
- Tivoli Software Information Center: Download all Tivoli related product documentation.
- Tivoli Universal Agent: Find information on Tivoli Universal Agent.
- developerWorks Information Management zone: Learn more about DB2. Find technical documentation, how-to articles, education, downloads, product information, and more.
Get products and technologies
- Download a trial version of WebSphere Information Integrator V8.2.
- Download a trial version of Tivoli Monitoring Express V6.1.
- Q Replication Tools: An overview of the free software tools that are available to help WebSphere Information Integrator replication and event publishing customers with configuration, monitoring, and problem determination.