CDC Replication Engine for Db2® Database behavior in a DPF source environment

It is important to be aware of certain details when deploying CDC Replication in a DPF source environment.

Limitations in the database logging of a DPF environment

When a transaction affects rows in more than one partition, the database log maintains the order of operations occurring in each partition but does not maintain the relative order across the partitions.
In addition, updates that move a row from one partition to another (updates to the partitioning key) are logged by the database as separate delete and insert operations.

The following two scenarios and accompanying examples describe the effect on CDC Replication behavior as a result of these limitations.

Effect on LiveAudit table mappings

These limitations will affect the behavior of a LiveAudit table mapping. The target table will contain all of the operations that occurred, but the order of the operations (within a transaction) may not match the original order on the source system. In addition, updates that cause rows to move between partitions will appear in the target table as two individual delete and insert operations. Example 1 illustrates this scenario.

Example 1 – LiveAudit

Note: In this example TABLE1 has one column PK (a key column defined to be the partitioning key). A key value of PXK* indicates a key value that will end up on partition X. For example, P2K1 refers to a key value resolving to partition 2.

In this example, you execute the following SQL code in your source database:

INSERT INTO TABLE1 VALUES ("P1K1");
INSERT INTO TABLE1 VALUES ("P2K2");
UPDATE TABLE1 SET PK= "P2K3" WHERE PK = "P1K1";
COMMIT

CDC Replication LiveAudit may view the sample SQL code as:

INSERT of "P2K2"
INSERT of "P2K3"
INSERT of "P1K1"
DELETE of "P1K1"

Unique constraints on target tables

Limitations in database logging can also cause apply errors when there is a unique constraint on the target table which is not part of the target key used by CDC Replication (the equivalent of the partitioning key on the source table). The workaround for this limitation is to remove the unique constraint from the target system. The target table will remain synchronized with the source database. Example 2 illustrates this scenario.

Example 2 – Unique constraints

Note: A key value of PXK* indicates a key value that will end up on partition X. For example, P2K1 refers to a key value resolving to partition 2.

In this example, TABLE2 contains two columns, PK and VAL:

PK is the partitioning key and VAL is an attribute that also has a unique constraint.
Assume that TABLE2 already contains two rows with the following data:
- Row 1 ("P1K1", "VAL1"), Row 2 ("P2K2", "VAL2")

You then execute the following SQL code in your source database:

UPDATE TABLE2 SET VAL= "VAL3" WHERE PK = "P1K1";
UPDATE TABLE2 SET VAL= "VAL1" WHERE PK = "P2K2";
COMMIT;

On the target CDC Replication may not execute the source SQL in the same order. If the second update is executed before the first in your target database, then a unique constraint error will occur. If the constraint is removed, regardless of the order in which they are executed, the target table will remain synchronized with the source.

DPF database partition configuration

During initial configuration, CDC Replication will determine all the partitions necessary to detect changes from the DPF database. When you need to change the partition configuration of your source DPF database, you will be required to follow some administrative steps as documented later in this section.