SQL Replication - Lesson 3.2: Testing integration with DataStage

With everything hooked up between SQL Replication and DataStage, you can do a quick demonstration of how rows travel from the source database to the target database, into DataStage, and finally into the data set file.

Procedure

On the computer where you saved the samples package for the tutorial, navigate to the sqlrepl-datastage-tutorial\scripts folder for your operating system.
Start SQL Replication by following these steps:
1. Run the startSQLCapture.bat (Windows) or startSQLCapture.sh (Linux or UNIX) file to start the Capture program at the SALES database.
2. Run the startSQLApply.bat (Windows) or startSQLApply.sh (Linux or UNIX) file to start the Apply program at the STAGEDB database.
Open the updateSourceTables.sql file and replace <sales-connect-ID> and <sales-password> with the user ID and password for connecting to the SALES database.
Open a DB2 command window, change directory to sqlrepl-datastage-tutorial\scripts, and run issue the following command:
```
db2 -tvf updateSourceTables.sql
```
The SQL script performs an INSERT, UPDATE, and DELETE operation on both the PRODUCT and INVENTORY tables in the SALES database.
On the system where DataStage is running, open the DataStage Director and run the STAGEDB_AQ00_S00_sequence job by selecting the job and clicking Job > Run Now.
Figure 1. Starting sequence job

Within seconds, the following things should occur:
- The Capture program reads the six row changes in the SALES database log and inserts them into the CD tables.
- The Apply program fetches the change rows from the CD tables at SALES and inserts them into the CCD tables at STAGEDB.
- The two DataStage extract jobs pick up the changes from the CCD tables and write them to the productdataset.ds and inventorydataset.ds files.
You can verify that these steps took place by looking at the data sets.
Perform these steps:
1. Start the Designer and open the STAGEDB_ASN_PRODUCT_CCD_extract job.
2. Double-click the insert_into_a_dataset icon, and in the stage editor click View Data.
3. Accept the defaults in the rows to be displayed window and click OK.
The data set contains three new rows, but the easiest way to verify the change is to scroll to the far right of the Data Browser and look at the last three rows in the IBMSNAP_OPERATION and IBMSNAP_LOGMARKER columns:

Figure 2. Changes to PRODUCT table showing in Data Browser window

The I, U, and D values show the INSERT, UPDATE, and DELETE operation that resulted in each new row, and you can see from the timestamp in the IBMSNAP_LOGMARKER column that the rows were added at the same time.
Perform the same steps to check the inventorydataset.ds file. Here is what you should see:
Figure 3. Changes to INVENTORY table showing in Data Browser window

Lesson checkpoint

Congratulations! You successfully combined SQL Replication and InfoSphere DataStage with two DB2 databases to create a configuration that can enable dynamic warehousing. Using SQL Replication can dramatically speed the extract phase of data warehousing while taking advantage of Information Server's robust capability for the transform and load phases. The entire ETL process can be significantly shortened to the point where your organization can achieve real-time business intelligence.

Lesson 3.2: Testing integration between SQL Replication and DataStage