Before you start
IBM now has a Web-based, live monitoring tool that greatly simplifies the task of tracking the health of your Q replication and event publishing environment. The Q Replication Dashboard is part of Version 1.1.2 of the Data Studio Administration Console.
IBM Data Studio Administration Console 1.1.2 is a complementary rich Web interface to help you with the following tasks:
- Monitor the health and availability of IBM DB2® for Linux®, UNIX®, and Windows® (Version 9.5) databases, view dashboard metrics, investigate alert details, and troubleshoot problems using expert recommendations.
- Monitor IBM WebSphere® Replication Server and WebSphere Data Event Publisher (Version 8.2 or later), generate replication health reports, and perform basic replication operations.
The focus of this tutorial is on the Q replication dashboard part of the Data Studio Administration Console.
This tutorial introduces you to the dashboard by providing scripts and batch files to set up a source and two target databases, and to create the Q replication and WebSphere MQ objects to replicate data between them. You will download a small application that generates activity at the source database. Once replication starts, you can use the dashboard to watch the overall health of replication, check performance, and diagnose a problem at one of the targets.
About this tutorial
Although the dashboard has many useful monitoring capabilities as Table 1 shows, this tutorial will focus on the following tasks:
- Creating a dashboard view
- Using the health summary
- Tracking latency
- Identifying and diagnosing data exceptions
|Q replication health summary||At-a-glance information on:
|Live monitoring||Live graphs that display latency and throughput statistics|
|Q Capture and Q Apply programs||Ability to:
|Problem determination||From any health category:
After you take this tutorial, you should be able to:
- Use scripts to set up a working Q replication environment
- Create a dashboard view that includes the configurations that you want to monitor
- Use the Summary page to monitor the overall health of replication
- Track replication latency
- Identify and diagnose a data exception
This tutorial should take approximately two hours to finish.
To complete the tutorial, you should have an understanding of Q replication concepts such as source and target servers, control tables, replication queue maps, and Q subscriptions.
You can complete the tutorial on a single Windows computer. The tutorial assumes that your Windows user account has complete access to all DB2 objects through the operating system, which is the default when you install DB2 on Windows. In this case, the user ID is a member of the DB2ADMNS user group.
To learn more about setting up Q replication, see the Q replication tutorial in the Information Management Software for z/OS Solutions Information Center (see Resources).
To take this tutorial, you need the following software installed on a single Windows computer:
- WebSphere Replication Server Version 9.1 or later
- DB2 for Linux, UNIX, and Windows Version 9.1 or later
- WebSphere MQ Version 5.3 or later
- Data Studio Administration Console Version 1.1.2
- Microsoft Windows XP Professional Edition SP2 (32-bit / IA32), Microsoft Windows Server 2003 Standard Edition with R2 SP2 or later (32-bit / IA32), or Microsoft Windows Server 2003 Enterprise Edition with R2 SP2 or later (32-bit / IA32)
To run the tutorial, you need 400MB of free space on the drive where the databases are created and another 150MB free space on the drive from where you invoke the scripts.
Setting up replication
Before using the Q Replication Dashboard, let's set up a running Q replication configuration for you to monitor. We've simplified this part of the task by providing a set of scripts and batch files that will create the configurations that you need and get everything running.
Before you begin
Make sure that DB2 and WebSphere MQ are running.
Figure 1, below, shows what you will be setting up:
Figure 1. Replication setup for tutorial
- A source database, LONDON, and two target databases, DALLAS and TOKYO.
- A queue manager, WORLDWIDE_QM, and three local queues that act as both send and receive queues.
- Unidirectional Q subscriptions between LONDON and DALLAS and between LONDON and TOKYO for four source tables.
- Small Java™ applications that create a steady stream of activity at the source database so that the changes can be replicated in one direction to the two targets and also make changes to a target database to create data exceptions that you can monitor.
- From the Downloads section of the tutorial,
download and extract the files from one of the following compressed files,
depending on your WebSphere Replication Server version:
Table 2. Download options Version Compressed file 9.5 tutorial_setup_v95.zip 9.1 tutorial_setup_v91.zip
- From a system command prompt, change to the tutorial_setup_version
directory and enter
Create_MQto run the Create_MQ.bat file. The batch file does the following things:
- Creates the WebSphere MQ queue manager WORLDWIDE_QM
- Starts the queue manager
- Creates the queues that are needed for Q replication
db2cmdto open a DB2 command window. Make sure the command window points to the tutorial_setup_version directory. Enter
setupto run the Setup.bat file. The batch file does the following things:
- Creates and configures three DB2 databases (LONDON, TOKYO, and DALLAS)
- Creates the source tables at the LONDON database that will be replicated to the TOKYO and DALLAS targets
- Creates Q Capture control tables at LONDON and Q Apply control tables at TOKYO and DALLAS
- Creates replication queue maps and Q subscriptions that map the source tables and queues to the target tables and queues
Figure 2. Results of setup command
Note: The setup command takes some time to complete; it includes backing up the newly created databases. For detailed results, you can view the setup.out file that the command generates in the same directory.
start_replto start the Q Capture program at the LONDON database and the Q Apply programs at the TOKYO and DALLAS databases.
Look for messages similar to the following to confirm that the programs started:
Listing 1. Sample initialization messages from Q Capture
2008-02-15-16.02.03.588000 <asnqwk> ASN0572I "Q Capture" : "ASN" : "WorkerThread" : The program initialized successfully. 2008-02-15-16.01.58.118000 <brwzMain> ASN7526I "Q Apply" : "ASN" : "BR00001" : The Q Apply program has started processing the receive queue "SEND_RECVQ1_LON_DALLAS" for replication queue map "LONDON_ASN_TO_DALLAS_ASN".
Important: After you start the replication programs, leave the command window open to avoid stopping replication.
- Edit the setenv.bat file and replace <your_SQLLIB_location> with the location of the SQLLIB directory where DB2 is installed. The default location is C:\Program Files\IBM\SQLLIB. Save the setenv.bat file.
- Open another command window and change to the tutorial_setup_version
setenv, then enter
Figure 3. Output of the runwork command
The runwork.bat file runs several small Java applications to create a workload in the LONDON database. The workload runs for 1 to 1.5 hours.
Important: Do not close the command window in which you are running the workload, or it will stop the applications from running and could leave your database in an inconsistent state before the next time you run the workload applications. If you stop the applications, run CleanTables.bat before you run your next workload to clean up the tables to avoid any inconsistency problems.
- When you are done with the tutorial and the applications have finished running, you can run the stop_repl.bat batch file to stop the Q Capture and Q Apply programs.
- If the Q Apply program did not start for some reason but Q Capture did start, your queues might fill up if the workload is running. In the WebSphere MQ Explorer, open the WORLDWIDE_QM queue manager in the left navigation pane, select Queues, right-click each queue in the Queues table, and select Clear Messages before you restart Q Capture.
- The Q Apply program is set up to stop reading from the queue in case of errors. If Q Apply appears to not be reading data from the queue, you might have to manually start processing by running the start_queues.bat batch file.
- As a last resort, use the stop_repl.bat file to stop the Q Capture and Q Apply programs, drop the LONDON, DALLAS, and TOKYO databases, and start from the beginning.
After you set up your replication configurations and start transactions flowing from the source to the target, you can set up the Q Replication Dashboard to monitor the progress of your replication environment.
Creating a dashboard view
A dashboard view is the collection of replication or publishing configurations that you want to monitor. The view enables you to keep close tabs on the status and performance of your replication programs, Q subscriptions, and queues.
After you set up a dashboard view, you can save it to use at any time, and it can be shared by many users.
In this lesson, you will create a view to monitor two configurations: LONDON.ASN > DALLAS.ASN, and LONDON.ASN > TOKYO.ASN.
A replication configuration is the set of replication objects (programs, control tables, queues) that enable Q subscriptions of the same subscription type and are defined in a set of replication control tables. For example, the LONDON.ASN > DALLAS.ASN configuration contains the following objects:
- The control tables with the ASN schema on the LONDON database.
- The control tables with the ASN schema on the DALLAS database.
- Two replication queue maps, LONDON_ASN_TO_DALLAS_ASN and LONDON_ASN_TO_DALLAS_ASN2, which contain the SEND_RECVQ1_LON_DALLAS and SEND_RECVQ1_LON_DALLAS combined send queue-receive queues.
- Four unidirectional Q subscriptions: EMPLOYEE0001, DEPARTMENT0001, STAFF0001, and SALES0001.
The LONDON.ASN > TOKYO.ASN configuration contains the same control tables at the LONDON database, but different queue maps, queues, and Q subscriptions that enable unidirectional replication from LONDON to TOKYO.
- Start the Data Studio Server. From the windows Start menu, select All Programs > IBM Data Studio > IBM Data Studio V1.1.2 > Start Data Studio Server.
- Launch the Data Studio Administration Console. From the windows Start menu, select All Programs > IBM Data Studio > IBM Data Studio V1.1.2 > Data Studio Administration Console.
- On the Welcome page, enter your Windows log-in user ID and password for the
machine where you installed DSAC, and click on Log in:
Figure 4. Log-in page for DSAC
- On the left navigation panel, click on the plus (+) sign next to Q
Replication Dashboard, and select View Health:
Figure 5. View Health link under Q Replication Dashboard
The Q Replication Dashboard opens, with the Add Server window in front. (If you already registered a data server, click Create in the Dashboard Views table and then Add in the Server table). You will add the LONDON database as a server:
Figure 6. Add Server window
- Enter the following values, and click OK:
LONDONin the Database alias field.
- The host name or IP address of your computer in the Host name or IP
address field. If you are running DB2 and DSAC on the same local
computer, you can enter
- Your user ID and password for connecting to the LONDON database.
After the Add Server window closes, the Create Dashboard View wizard opens, showing the Server page with LONDON in the Server table.
Why you registered the LONDON server first
You started building your view with the LONDON server because it is the source for both the DALLAS and TOKYO target servers. By choosing LONDON, you enable the dashboard to discover both DALLAS and TOKYO (the control tables at the LONDON server contain the definitions for both configurations). You could build the view by starting with any of the servers, but because DALLAS and TOKYO are not related, you would have had to bring them into the view by using the wizard twice.
Regardless of which server you register first, the dashboard discovers all other servers that are related to this server through a Q subscription or publication.
- With the LONDON data server selected in the Server table, click Next
in the wizard:
Figure 7. Create Dashboard View wizard
- On the Schema page of the wizard in the Schemas for Server LONDON table, the
schema ASN is selected by default, so you can click Next.
The dashboard reads the control tables on the LONDON server with the ASN schema (LONDON.ASN) and discovers all replication or publishing configurations that are defined in them.
- On the Configurations page, two configurations are displayed in the
Configurations for LONDON.ASN table:
- LONDON.ASN > TOKYO.ASN
- LONDON.ASN > DALLAS.ASN
The two configurations are also displayed visually in the Topology area. In the Connection Information column of the table, an "Add missing information" control is shown:
Figure 8. Configurations page of wizard
The dashboard needs information to connect to the Tokyo and Dallas databases to monitor replication at the target as well as the London source.
- Click Add missing information. Fill in the host name or IP address of
your computer, and the user ID and password for connecting to the Tokyo and
Figure 9. Connection information complete
- Click Next. On the Summary page of the wizard, accept the default
view name of london.asn, the default refresh interval of 10 seconds, and the
default threshold for end-to-end latency:
Figure 10. Summary page of wizardHere are brief explanations of some of the concepts on the Summary page:
Table 3. Summary page concepts Concept Explanation Dashboard view name By default, views are named after the server.schema on which they are based. Refresh interval This interval determines how often the dashboard updates monitoring information. The dashboard reads monitoring information that the Q Capture and Q Apply program insert into their control tables. If you want very granular monitoring information, you need to also make sure to set a low monitor_interval for the programs to ensure that they update performance information frequently enough to justify repeated refreshes by the dashboard. Threshold for end-to-end latency You can tell the dashboard to set an alert icon when end-to-end latency rises above a threshold. End-to-end latency is the time between the Q Capture program reading a transaction in the recovery log and the Q Apply program applying the transaction at a target table. The default threshold is 3 seconds. You can choose different thresholds for individual configurations within your view.
After you click Finish on the wizard, the dashboard displays its health summary page:
Figure 11. Health summary page of dashboard
In the next section, you will learn how to use the summary page to monitor replication performance.
Using the dashboard health summary
You can now use the dashboard view that you created to monitor the health of replication between the LONDON source database and the DALLAS and TOKYO target databases. This section gives you a tour of your london.asn dashboard view.
The Summary page has three main parts:
- A topology graphic that depicts each configuration.
- A latency and throughput summary for one of the Q Capture-Q Apply pairs.
- A Replication Configurations table in which each row shows rolled-up states for queues, subscriptions, latency, exceptions, and messages. The following notation in the Q Capture (Send queues) and Q Apply (Receive queues) columns stands for "program state (queues state, rolled up across all queues)":
- In the Replication Configurations table, click on the unidirectional icons
to the left of the two LONDON.ASN Q Capture entries to expand your view of the
Figure 12. Expanded configurations table
The expanded table includes a row for each send queue-receive queue pair. In this case, the send and receive queue are the same local queue. The rows include the following information:
- The status of the send queue and receive queue
- The number of Q subscriptions for each queue, and an icon that shows their status
- The average end-to-end latency for transactions that were sent on the queues during the last refresh interval
- The number of data exceptions that were written to the IBMQREP_EXCEPTIONS control table for Q subscriptions that use the queues
- Icons that notify you of any error messages that the Q Capture or Q Apply programs generate
Note: Some of the values in your Replication Configurations table are likely to be different from the screen capture above, especially latency and the number of exceptions. The latency depends on a number of factors, including the speed of your system, and the number of exceptions will differ depending on the length of time between starting replication and creating the dashboard view.
- In the upper-right area of the Summary page, notice the graphical latency
and throughput summary for the LONDON.ASN > DALLAS.ASN configuration:
Figure 13. Latency and throughput summary
The moving chart tracks two important measures of replication performance for one Q Capture-Q Apply pair:
Table 4. Replication performance measures Measure Description Log latency The average time that elapses between transactions being committed at the source table and the Q Capture program reading the transaction commit statement in the recovery log. This measurement, in seconds, gives you a rough idea of how "caught up" Q Capture is in reading the log. Q Apply throughput The average number of rows per second that the Q Apply program applied to the target table.
To view charts for both configurations at the same time, replace the Topology graphic -- click on the Change throughput summary icon above the topology: The topology view is a useful way of quickly seeing which servers are connected and how. For active monitoring, you might prefer to see both charts.
- In the Replication Configurations table, look for the row that identifies
SEND_RECVQ2_LON_DALLAS. Note that in the Q subscriptions column, both Q
subscriptions are active. To get more details about the two Q subscriptions,
click the column that contains the number 2 next to a green diamond:
Figure 14. Drilling down on Q subscription state
The Q subscription details window that opens provides more detail on the status of all Q subscriptions that use this send queue-receive queue pair.
- In the Q subscription details window, click the number 2 next to
"Active:", as Figure 15 illustrates:
Figure 15. Q subscription details window
The Subscriptions page opens with a table that provides more details about the Q subscriptions and their state. From this page you can start, stop, and reinitialize Q subscriptions and publications, and also indicate that a manual load of the target table is complete. You can also use the drop-down boxes to view Q subscription status for another configuration (Q Capture > Q Apply), or for a different send queue-receive queue within the same configuration:
Figure 16. Q subscription drill down
You can use the health summary table to drill down on programs and queues in addition to Q subscriptions, and the dashboard provides the same drill-down capability for latency statistics, exceptions, and messages. Click any table cell or link in the summary table to drill down into more detailed information.
- Note that the dashboard view is divided into five tabs:
Figure 17. Tabs on dashboard view
In addition to the Summary tab, the following tabs give you the ability to monitor and manage other elements of your replication environment:
Table 5. Dashboard view tabs Tab Description Live graphs Provides graphical, near-real-time latency and throughput information for each send queue-receive queue pair. Programs Shows the status of the Q Capture and Q Apply programs and current values of their operation properties. You can also change property values. Queues Shows the status of queue managers and queues, including queue depth. You can also view and change queue map properties. Subscriptions Shows the status of Q subscriptions or publications.
- Select the Live Graphs tab. Each block of four moving graphs shows
statistics from one send queue-receive queue pair (the latency and throughput
graph on the summary page is for a Q Capture-Q Apply pair). The Live Graphs
tab provides statistics on a more detailed level than the graph on the Summary
page, and enables you to see latency and throughput data for each specific
send queue-receive queue pair rather than just the rolled-up information that
you saw in the Summary page.
The graphs provide these additional statistics:
Table 6. Additional graph statistics Measure Description End-to-end latency The average elapsed time in seconds between transactions being committed at the source database and being applied at the target. This is the same measure for which you can set a threshold when you create or change a dashboard view. Q Capture throughput The number of rows per second that the Q Capture program puts on the send queue.
Figure 18. Live Graph page
You can enlarge the graphs to 150 percent or 200 percent, or change which live graphs are displayed by clicking on the Change live graphs control in the upper-right area:
You can also maximize the number of live graphs that display by clicking on the Open new window control to open the live graphs in a separate window:
Or click on the Legend icon to open a legend window that describes what the colors and lines in the graphs mean:
Figure 19. Legend for live graphs
You have now had a tour of the overall structure of a dashboard view. The next section will show you how to use the dashboard to track and analyze replication latency between the source and target tables.
Using the dashboard to track latency
Low latency is one of the defining characteristics of Q replication, and the Q Replication Dashboard gives you several tools to keep close tabs on how quickly transactions are moving between source and target.
You can set an alert threshold so that the dashboard signals with icons when latency drops below your maximum desired value. When a threshold is exceeded, you can easily view details that show where replication might be slower than expected, and use the analysis to help you tune your system for lower latency.
Follow these steps to drill down on latency:
- From the Summary page of your dashboard view, make sure the configurations table is expanded.
- Look in the row for the SEND_RECVQ_LON_TOKYO combined send and receive queue
in the LONDON.ASN > TOKYO.ASN configuration. Depending on your system
speed, the icon in the Latency column might be red or green. Here is how the
icon is determined:
- You set the alert threshold for end-to-end latency when you create or change a dashboard view. (The default threshold is 3 seconds.)
- If the average end-to-end latency exceeded the threshold the last time the dashboard looked at replication program performance information, a red square icon is shown. Otherwise, a green diamond displays.
Figure 20. Latency value that exceeds the alert threshold
- Click the table cell for the SEND_RECVQ_LON_TOKYO in the Latency column. A
details window opens:
Figure 21. Latency details window
The window gives you quick statistics on the three types of latency that together make up end-to-end latency:
Table 7. Latency types Type Description Q Capture latency The time between a transaction being read in the DB2 log and the message containing the transaction being put on the send queue. MQ latency The time between Q Capture putting a message on a send queue and Q Apply getting the message from the receive queue. Q Apply latency The time between Q Apply getting the message from the receive queue and applying it to the target table.
In this case, for the most recent instance that the latency alert threshold was exceeded, Q Capture latency was the main factor in exceeding the threshold. However, the dashboard helps you look more closely at latency factors before reaching any conclusions.
- On the latency details window, click on the arrow next to the "Time interval" field, and select Last 10 minutes of available data.
- Select Generate Report.
Figure 22. Latency report
A red line is set at the alert threshold. Each individual bar is divided into three colors that represent the three components of end-to-end latency, but in a more dynamic and visual way.
It's easy to see that the Q Capture latency (green portion of each bar) accounts for most of the elapsed time. You also get more detail about how many times the latency threshold was exceeded during the reporting period and the best and highest latency values.
You can also click on Recommendations on the latency details report to generate an analysis of factors that might contribute to higher-than-expected latency.
Now that you've seen the speed of transactions replicating from source to target, let's take a look at how the dashboard helps pinpoint problems that can occur in a replication environment.
Using the dashboard to identify and diagnose data exceptions
The Q Replication Dashboard makes it easy to identify and diagnose data exceptions at target tables by displaying the exceptions prominently in the health summary view, helping you find out what caused them, and providing more detailed information about them.
Figure 23. Exception shown in summary view
The icon in the Exceptions column of the Replication Configurations table turns to a red square when an exception occurs on at least one send queue-receive queue pair in the configuration:
When you expand the table, you can see which queue pair had the exception and use detail windows and reports to diagnose the problem.
One of the ways that exceptions occur is when an application other than the Q Apply program is updating the target table. In this tutorial, the runwork application creates exceptions at the STAFF target table by taking the following actions:
- Deletes a row from the target table
- Updates the same row in the source table
When the update is replicated, the Q Apply program cannot find the row in the target table to update, and an exception occurs.
Follow these steps to identify and diagnose an exception:
- From the summary page, notice the red square that indicates at least one
exception in the LONDON.ASN > DALLAS.ASN configuration:
In the Replication Configurations table, click on the unidirectional icon to the left of the upper LONDON.ASN Q Capture entry to expand your view of the configuration:
Figure 24. Exceptions column expanded
You can see that the exceptions are occurring only on the SEND_RECVQ2_LON_DALLAS send queue-receive queue. The table shows the number of exceptions.
- Select the table cell with the red square. A window opens that provides
details about the most recent exception.
Figure 25. Exception detail window
The "Target table:" field shows that the exception occurred at the STAFF table, and that the Q Apply program was attempting an UPDATE operation at STAFF. The SQLCODE and SQLSTATE that help identify the cause of the exception are listed, along with links to reference information in the DB2 Information Center for more detailed explanations. You can also see that the row was applied despite the exception ("Applied: Yes").
In the Reports area of the window, you can specify a time interval for viewing detailed reports about exceptions that occur for a send queue-receive queue pair.
- To diagnose the exception more closely, select Generate Report. The
Exceptions Report opens in a new browser window. The report provides a
statistical summary of all exceptions reported during the time range that you
specified on the details window. It also has a row for each exception that
includes the following information:
- The time of the exception
- The SQL statement that the Q Apply program was trying to execute when the exception occurred
- The conflict rule that was specified for the Q subscription
Figure 26. Exceptions report
The exceptions that are occurring at the STAFF table were caused by the same UPDATE operation, an attempt to update a row with a value of 10 in the ID column. Because the runwork application deleted the row before it inserted the same row in the source table at the LONDON database, a SQLCODE of 100 is generated (no row was found that meets the specified search condition).
Although the problem triggered an exception, the Q Apply program applied the transaction to the target table for the following reason:
- The STAFF0001 Q subscription specifies that the Q Apply program should check the key values of the TARGET table for conflicts before applying a row.
- Because the runwork application deleted the row with an ID of 10, there is no key in the STAFF table that matches the key in the UPDATE statement, so there is no conflict. The Q Apply program turns the UPDATE into an INSERT and inserts the row with only the values that were specified in the UPDATE statement (ID of 10, SALARY of 74,999.00).
After taking this tutorial, you should have a good idea of how the Q Replication Dashboard fits into your overall set of replication and database tools, and how it can help you stay on top of replication activity and performance.
Continue to explore the dashboard. Some of its features were not covered in this tutorial, including:
- Using the Programs tab to check Q Capture and Q Apply status, and to check or change runtime values for program properties
- Using the Queues tab to view the status of queues and number of messages on queues, and to view and change queue map properties
- Viewing messages and message reports from the Summary tab
By completing this tutorial, you learned about the following concepts and tasks:
- Dashboard views
- Replication configurations
- How to use the Q Replication Dashboard to monitor a Q replication environment
- How to track replication latency
- How to identify and diagnose a data exception
|Files to set up replication (Version 9.5)||tutorial_setup_v95.zip||21KB|
|Files to set up replication (Version 9.1)||tutorial_setup_v91.zip||21KB|
- "Q Replication Information Roadmap" (developerWorks, November 2007): Explore this comprehensive list of information resources for WebSphere Replication Server for z/OS and WebSphere Replication Server.
- Q Replication tutorial (Information Management Software for z/OS Solutions Information Center, IBM, November 2007): Learn how to use the high-speed Q replication technology to replicate data over message queues. This tutorial explains how to set up and run the WebSphere Replication Server and WebSphere MQ.
- developerWorks resource page for IBM Data Studio: Find articles and tutorials, and connect to other resources to expand your Data Studio skills.
- Data Studio Administration Console: Get a free download of the Data Studio Administration Console, which contains the Q Replication Dashboard.
- developerWorks Information Management zone: Learn more about Information Management. Find technical documentation, how-to articles, education, downloads, product information, and more.
- Stay current with developerWorks technical events and webcasts.
- Technology bookstore: Browse for books on these and other technical topics.
Get products and technologies
- Build your next development project with IBM trial software, available for download directly from developerWorks.
- Participate in developerWorks blogs and get involved in the developerWorks community.