With everything hooked up between SQL Replication and DataStage,
you can do a quick demonstration of how rows travel from the source
database to the target database, into DataStage, and finally into
the data set file.
- On the computer where you saved the samples package for
the tutorial, navigate to the sqlrepl-datastage-tutorial\scripts folder
for your operating system.
- Start SQL Replication by following these steps:
- Run the startSQLCapture.bat (Windows) or startSQLCapture.sh
(Linux or UNIX) file to start the Capture program at the SALES database.
- Run the startSQLApply.bat (Windows) or startSQLApply.sh
(Linux or UNIX) file to start the Apply program at the STAGEDB database.
- Open the updateSourceTables.sql file
and replace <sales-connect-ID> and <sales-password> with
the user ID and password for connecting to the SALES database.
- Open a DB2 command window, change directory to sqlrepl-datastage-tutorial\scripts,
and run issue the following command:
db2 -tvf updateSourceTables.sql
The SQL script performs an INSERT, UPDATE, and DELETE
operation on both the PRODUCT and INVENTORY tables in the SALES database.
- On the system where DataStage is running, open the DataStage
Director and run the STAGEDB_AQ00_S00_sequence job by selecting the
job and clicking .
Figure 1. Starting
sequence job
Within seconds, the following things should occur:
- The Capture program reads the six row changes in the SALES database
log and inserts them into the CD tables.
- The Apply program fetches the change rows from the CD tables at
SALES and inserts them into the CCD tables at STAGEDB.
- The two DataStage extract jobs pick up the changes from the CCD
tables and write them to the productdataset.ds and inventorydataset.ds files.
You can verify that these steps took place by looking at the
data sets.
- Perform these steps:
- Start the Designer and open the STAGEDB_ASN_PRODUCT_CCD_extract
job.
- Double-click the insert_into_a_dataset icon, and in
the stage editor click View Data.
- Accept the defaults in the rows to be displayed window
and click OK.
The data set contains three new rows, but the easiest
way to verify the change is to scroll to the far right of the Data
Browser and look at the last three rows in the IBMSNAP_OPERATION and
IBMSNAP_LOGMARKER columns:
Figure 2. Changes to PRODUCT table
showing in Data Browser window
The I, U, and D values show the INSERT, UPDATE, and DELETE
operation that resulted in each new row, and you can see from the
timestamp in the IBMSNAP_LOGMARKER column that the rows were added
at the same time.
- Perform the same steps to check the inventorydataset.ds file. Here is what you should see:
Figure 3. Changes to INVENTORY
table showing in Data Browser window
Lesson checkpoint
Congratulations! You successfully
combined SQL Replication and InfoSphere DataStage with two DB2 databases
to create a configuration that can enable dynamic warehousing. Using
SQL Replication can dramatically speed the extract phase of data warehousing
while taking advantage of Information Server's robust capability
for the transform and load phases. The entire ETL process can be significantly
shortened to the point where your organization can achieve real-time
business intelligence.