Introducing IBM InfoSphere CDC replication for IBM DB2 for i solution

This article introduces a journal-based data replication solution for IBM® i using IBM InfoSphere® CDC, which captures only data changes from the source IBM DB2® for i as they happen and delivers them to the target. The InfoSphere CDC solution supports not only data replication between homogeneous databases, but also supports heterogeneous databases, such as DB2 for Linux, UNIX and Windows (DB2 for LUW), Oracle, and so on.

Pi GuangMing (pigm@cn.ibm.com), Software Engineer, IBM China

Pi GuangMing has 6 years of working experience on the IBM i platform, his expertise include WebSphere, Android, Mobile DB, Web Service, OSGi, Apache, Web Integration etc, and now he is working as project lead of IBM Java Toolbox for i.



28 March 2013

Introducing InfoSphere CDC replication for DB2 for i solution

IBM InfoSphere Change Data Capture (CDC) is a log-based replication solution that captures database changes as they happen. It then delivers them to target databases, Java™ Message Service (JMS) message queues, or extract, transform, and load (ETL) solutions such as InfoSphere DataStage, based on table mappings configured in the InfoSphere CDC Management Console GUI application. Figure 1 illustrates the key components of InfoSphere CDC.

Figure 1. InfoSphere CDC architecture
InfoSphere CDC architecture

The source and target data stores represent the InfoSphere CDC installation and the database made available for replication. A data store is an InfoSphere CDC process on a source or target server. The process accepts requests from an instance of the Access Server and communicates with the data store replication engine to initiate and manage the replication activity. These processes are defined in the access manager perspective of the Management Console. An InfoSphere CDC instance can operate as a source capture engine and a target engine simultaneously.

The Management Console allows you to configure and manage replication on various servers, specify replication parameters, and initiate refresh and mirroring operations from a client workstation. Also, the Management Console allows you to monitor replication operations, latency, event messages, and so on, which is useful for time-critical working environments that require continuous analysis of data movement.

When you log in to the Management Console, you are connecting to the Access Server. The Access Server is a client workstation where a set of background processes direct requests received from the InfoSphere CDC management. The Access Server and the Management Console can be closed on the client workstation without affecting active data replication activities between the source and target servers.

The refresh process performs the initial synchronization of the tables from the source database to the target, while the mirror process performs the replication of changes in a manner of real-time or accumulation of source changes at a later time.

The metadata represents the information about the relevant tables, mappings, subscriptions, notifications, events, and other particulars of a data replication instance that you set up.

Specifically, the product InfoSphere CDC for DB2 for i belongs to the InfoSphere CDC family and is intended for DB2 for i. To find more details about the InfoSphere CDC for DB2 for i architecture, check About InfoSphere CDC for DB2 for i and InfoSphere CDC Management Console.


Install InfoSphere CDC for DB2 for i

The latest version of InfoSphere CDC for DB2 for i (at the time of this publication) is Version 6.1, and it supports IBM DB2 for i version 6.1 and IBM DB2 for i version 7.1. For correct product operation, you might require program temporary fixes (PTFs). For details about the system requirement, check System requirements for InfoSphere CDC for DB2 for i.

Generally, you can install InfoSphere CDC for DB2 for i from scratch, or choose to update from the existing one. In this article, we will cover the brand new install. For details to update installation, check Upgrading InfoSphere CDC for DB2 for i.

Before you install InfoSphere CDC for DB2 for i

First, you need to prepare a name that you want to assign to the InfoSphere CDC product library. DMIRROR (previously called Data Mirror prior to IBM acquisition) is the default name.

The InfoSphere CDC automatically creates a user profile (D_MIRROR) during installation to supervise replication operations. The password for this user profile is intentionally set to *NONE, and should not be changed. As a result, the D_MIRROR user profile should be reserved for InfoSphere CDC in advance.

Installation of the InfoSphere CDC is typically done using the QSECOFR user (or any user who has *SECOFR special authorities). At a minimum, the following special authorities are required for the user installing InfoSphere CDC:

  • *SECADM
  • *JOBCTL
  • *ALLOBJ
  • *SAVSYS
  • *AUDIT
  • *IOSYSCFG
  • *SPLCTL

InfoSphere CDC requires that you allocate a set of ports for communicating with client workstations running the Management Console and other servers. The default ports are 11111, 2222, and 10101. Ensure that these ports available and accessible through firewalls.

It is recommended to check Before you install InfoSphere CDC for DB2 for i for more details before you set up.

Installing InfoSphere CDC for DB2 for i

The following steps help you install InfoSphere CDC.

  1. Sign on with authorities.
    Before you install InfoSphere CDC, you need to sign in with a user profile that has the *SECADM,*JOBCTL, *ALLOBJ, *SAVSYS, *AUDIT, *IOSYSCFG, *SPLCTL authorities. The recommended user is QSECOFR.
  2. Restore the installation program.
    Issue the following command to restore from the save file, as indicated in Listing 1.
    Listing 1. Restore InfoSphere CDC for IBM i save file
    > RSTOBJ OBJ(*ALL) SAVLIB(V6R1M0) DEV(<D-ROM device name>)
     OPTFILE('/DTS/V6/AS400/D_INITIAL/V6R1M0INIT') RSTLIB(QTEMP)

    where<library> is the save file library.
    After you have restored the installation program, you should get the following completion message:
    "3 objects restored from V6R1M0 to QTEMP."
  1. Run the installation program
    Issue the following command to run the installation program, as indicated in Listing 2.
    Listing 2. Install InfoSphere CDC for IBM i
    > ?QTEMP/DMCINSTALL

    Press F4 to access the installation prompt screen.
  1. Specify the CD ROM device name or save file information
    In the Device/Save file name screen that is displayed (as shown in Figure 2), enter the necessary information and press Enter to continue.
Figure 2. Device/Save file name screen
Device/Save file name screen
  1. Accept the software license agreement and begin installation.
    Press F2 to accept the terms expressed in the software license agreement and press Enter to start the installation.
  2. Specify the authorization code.
    The authorization code screen appears. You can get the InfoSphere CDC authorization code from IBM Support or obtain it from the IBM website. The authorization code is no longer verified in product versions greater than or equal to InfoSphere CDC version 6.1 Fix Pack 1. Press Enter to continue.
  3. 7. Specify product and tutorial libraries.
    The product and tutorial library screen appears. Usually, you can accept the default values. Press Enter to install InfoSphere CDC on your server.

    After the installation has been completed, you should get the following status messages:
    "Transformation Server product objects are being restored."
    "Transformation Server communication objects are being created."

    If the installation is unsuccessful, error messages are placed in the job log. Use the DSPJOBLOG command to identify the errors and take the necessary corrective actions before running the installation program again.

    It is recommended to check Before you install InfoSphere CDC for DB2 for i for more details when you set up.

After you install InfoSphere CDC for DB2 for i

The most important step after you install is to define a port that CDC will listen to. Here are some steps.

  1. Determine whether the port is free.
    The standard listening port for CDC is 2222. To determine if the port is free, type the NETSTAT *CNN command, and you can view the TCP/IP connection.
    Press F15 (which is Shift+F3) to subset the connections and enter 2222 for the local port. If you see a listener on the specified port, then you need to choose a different one.
  2. Add the service table entry.
    Enter the WRKSRVTBLE command to work with service table entries. Add a new service by selecting "1" as the option and enter the rest of the service, as indicated in Figure 3.
Figure 3. Add Service Table Entries screen
Add Service Table Entries screen

After the InfoSphere CDC for DB2 for i installed, you need to install the latest fix pack and interim fixes (individual fixes or enhancements that have not been gathered in a fix pack yet).

It is recommended to check After you install InfoSphere CDC for DB2 for i for more details after the installation is completed.


Configure InfoSphere CDC for DB2 for i homogeneous replication

To use InfoSphere CDC for DB2 for i to set up replication between two different DB2 for i systems, you need to install the following components:

  • InfoSphere CDC for DB2 for i
  • Source database
  • Target database
  • InfoSphere CDC Access Server
  • InfoSphere CDC Management Console

Set up InfoSphere for homogeneous replication

This section gives an overview of the steps to set up a replication subscription between two different DB2 for i systems. The following sections then provide detailed explanation and screen captures for the steps.

The following steps provide the basic procedure.

  1. Start an InfoSphere CDC instance for the source database server.
  2. Start an InfoSphere CDC instance for the target database server .
  3. Create data stores and assign users to the data stores:
    • Log in to the Management Console and connect to the Access Server.
    • Create data stores for source and target databases and assign the data stores to users.
  4. Set up a subscription for replication:
    • Create tables in the source and target data stores for CDC replication.
    • Create a new subscription.
    • Map tables for replication and select the replication method.
    • Start replication on subscriptions.

Start InfoSphere CDC instance for the source and target databases

To start the InfoSphere CDC, you need to start the DMIRROR subsystem as indicated in Listing 3, which will automatically begin the pre-started job, DMIRROR (see Figure 4), serving the InfoSphere CDC data replication.

Listing 3. View InfoSphere CDC for IBM i job
> STRSBS DMIRROR/DMIRROR
> WRKACTJOB SBS(DMIRROR)
Figure 4. Work with Active Jobs screen
Work with Active Jobs screen

To view the DMIRROR job log, enter option 5 before the TCPLISTEN job, then enter option 10 on the menu, and then press F10 for detailed messages.

After the listener has been started, you can create the data store in the Management Console and start creating subscriptions and mapping tables.

Create data stores and assign data stores to users

This section describes the steps required to create data stores for your source and target databases and assign them to users.

1. Log in to the Management Console and connect to Access Server.

The first step is to log in to the Management Console. By logging into the Management Console, you are also connecting to Access Server. Figure 5 shows the Management Console login dialog box.

Figure 5. Management Console login dialog box
Management Console login dialog box

When logging in, use the administrator user name, password, and port number provided when InfoSphere Access Server was installed. In the Server Name field, enter the host name of the workstation running Access Server.

2. Create data stores for source and target databases.

Open Access Manager perspective in Management Console (as shown in Figure 6), and then perform the following operations to create data stores for your source and target databases.

  • Click Add new datastore to create a data store.
  • Click Add new user to create a user connecting to the data store.
  • Click Add users to datastore to assign the user to the data store.
Figure 6. Management Console - Access Manager perspective
Management Console - Access Manager perspective

Set up a subscription for replication

A subscription is a connection that is required to replicate data between a source data store and a target data store. The subscription defines the details of the data being replicated and how the source data is applied to the target. To set up a new subscription, switch to the Configuration perspective of the InfoSphere CDC Management Console (see Figure 7) and clickNew Subscription, type subscription name, description, and source and target datastores.

Figure 7. Management Console – New Subscription dialog box
Management Console – New Subscription dialog box

Map source and target tables

This example describes a simple table mapping for a source table to a similar target table using the Map Tables wizard. The Monitor Console also provides options to map to a custom table, filter rows and columns, and define advanced table mappings.

To open the Map Tables wizard (shown in Figure 8), right-click a subscription in the Configuration perspective of the Management Console and click Map Tables.

Figure 8. Management Console – Map Tables wizard
Management Console – Map Tables wizard

Click Next and the wizard takes you to the Set Replication Method page (see Figure 9).

You can choose any one of the two replication options:

  • Mirror (Change Data Capture) option immediately replicates changes made to the source table to the target table.
  • Refresh (Snapshot) option replicates a snapshot of the source table to the target table.

For this example, select Mirror (Change Data Capture) and click Next. Verify whether the table mappings summary looks correct and click Finish.

Figure 9. Map Tables wizard – Set Replication Method page
Map Tables wizard – Set Replication Method page

After the mapping is complete, you can see a new subscription, SUB01, created (see Figure 10).

Figure 10. Management Console – displaying the new subscription
Management Console – displaying the new subscription

Start replication on subscriptions

To start replication on a subscription, switch to the Monitoring perspective of the InfoSphere CDC Management Console.

In the Subscriptions view, right-click a subscription. Choose one of the three ways to start replication: Start Refresh, Start Mirror (Continuous), or Start Mirror (Net Change), as shown in Figure 11.

Figure 11. Management Console – Start mirroring
Management Console – Start mirroring

In this example, we select the Continuous mirroring method. Any data inserted into the source database is replicated continuously to the target database. If everything is successful, you can see a green bar, as shown in Figure 12.

Figure 12. Management Console – mirroring
Management Console – mirroring

After replication is enabled, you can verify the target table values.

Verify data replication

The first example (in Figure 13) shows data being inserted into a DB2 for i system that in turn is being replicated to another system, as indicated in Listing 4. Users can also perform other SQL operations, such as update, delete, and so on.

Listing 4. Insert sample data into the source database
> insert into dtutor/customer(custno, name1) values(2, 'tom');
> select custno, name1 from dtutor/customer where custno=2 and name1='tom';
Figure 13. Insert one record into the CUSTOMER table
Insert one record into the CUSTOMER table

The second example (in Figure 14) shows that you can undo row insertion using the RMVJRNCHG command, as indicated in Listing 5. You can use the APYJRNCHG command to redo the row insertion in a similar way.

Listing 5. Undo sample data insertion from the source database
> CL: RMVJRNCHG JRN(DTUTOR/QSQJRN) FILE((DTUTOR/CUSTOMER CUSTOMER));
> select custno, name1 from dtutor/customer where custno=2 and name1='tom';
Figure 14. Undo data insertion using the RMVJRNCHG command
Undo data insertion using the RMVJRNCHG command

In fact, you can also use the command-line interface to manage data replication. Refer to Commands for InfoSphere CDC for DB2 for i for more details.

Troubleshooting

When running on IBM i, InfoSphere CDC might fail on certain operations and the error report is displayed in the event log, as shown in Figure 15.

Figure 15. Management Console - events log
Management Console - events log

Sometimes, the detail in the event log is not sufficient to analyze the problem and therefore, you need to retrieve further information (spooled file) from the IBM i server, as shown in Figure 16.

Figure 16. Display Spooled File screen
Display Spooled File screen

We just finish the demonstration of data replication between two DB2 for i databases.


Reference

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into IBM i on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=IBM i
ArticleID=862670
ArticleTitle=Introducing IBM InfoSphere CDC replication for IBM DB2 for i solution
publish-date=03282013