Understanding the behavior of CDC for Db2 for z/OS when DDL changes occur

The behavior of CDC for Db2® for z/OS® regarding Data Definition Language (DDL) operations performed on in-scope tables is dependent on several factors:

  • DDL statement type
  • DB2®'s DATA CAPTURE CHANGES setting
  • Version of DB2

DDL statements only have impact on the CDC Replication Engine for Db2 for z/OS if the types of DDL operations affect its ability to read the log. The following are examples of these types of DDL statements:

  • Adding columns
  • Modifying column formats (such as data type, length or precision)
  • Changing column types (such as CHAR to VARCHAR)
  • Dropping and re-creating tables
  • Turning off DATA CAPTURE CHANGES

These operations cause data to be written to the log which will not be received by the CDC Replication Engine for Db2 for z/OS or cause the log record format of the table (the layout of the data in the database logs) to change. For operations where the log record format changes, the log reader must be directed how to proceed when a change is encountered. CDC Replication metadata must be modified to accommodate the new log record format; otherwise the log reader will fail to properly decode log records after the point of the DDL change. Changes that do not materially affect the physical structure of the table in the log or the DATA CAPTURE CHANGES setting for the table will not interrupt replication.

The CDC Replication Engine for Db2 for z/OS can detect most DDL changes on in-scope tables only if DATA CAPTURE CHANGES is enabled on the SYSIBM.SYSTABLES system catalog table in DB2. If it is not enabled, the CDC Replication Engine for Db2 for z/OS will be aware of DDL changes when a log record is encountered which does not match the definition in the metadata, but its actions will depend on the type of DDL operation and the version of DB2.

  • Adding columns to tables is one type of DDL change which may not affect the ability of the CDC Replication Engine for Db2 for z/OS to read the log, however this is dependent on the version of the database and the level of maintenance installed for CDC Replication. The actions of the CDC Replication Engine for Db2 for z/OS on new columns being added to a table is dependent on the version of DB2.
    • Db2 stores row images in two possible row formats—Db2 row images can be stored in Basic Row Format (BRF) or Reordered Row Format (RRF). BRF row images have the column values stored in order of the column numbers that are assigned when the table is created or extended with additional columns. Added columns are located after the preexisting columns in the row. RRF row images have the column values stored with all the fixed-length columns (INT, CHAR, etc.) stored in increasing order of their column numbers, followed by all the variable-length columns (VARCHAR, VARBINARY, etc.) stored in increasing order of their column numbers. A column that is added to the table might extend the content of the fixed or variable sections of the log record, depending on the DDL change that was made to the table. Replication tries to continue, but it might fail at the point where the log record is interpreted.
    • Effect of the ADDCOLUMNISSCHEMACHANGE keyword—If ADDCOLUMNISSCHEMACHANGE=YES is specified in CHCDBMxx, then any added column will cause the CDC Replication Engine for Db2 for z/OS to behave as if it cannot read the log with respect to tables that have had columns added to them, even if the log could continue to be successfully read.
  • Changing column formats—Most column format changes will cause replication to fail at the point where the log reader attempts to interpret the log record.
  • Dropping and re-creating tables—The log reader will not detect that a table has been dropped and re-created. Tables are recognized by their DBID, PSID and OBID in the database, which will most likely change when the table is re-created. This will cause the log reader to treat the table as not in scope for replication. If the table is created with the same DBID, PSID and OBID then scraping will resume when subsequent changes are made to the table. There is a possibility that another table could be created with the original table's DBID, PSID and OBID. Should this occur, invalid data could be replicated although it would likely fail on format validation. Ensure that Update Source Table Definition is run in Management Console to avoid this possibility after dropping and re-creating tables which are in-scope for replication.
  • DATA CAPTURE CHANGES turned off—The log reader will not detect this and no changes for the table will be replicated as they will not be returned by the DB2 IFI when change records are requested.

If DATA CAPTURE has been set to CHANGES on the SYSIBM.SYSTABLES system catalog table in DB2, and if DATA CAPTURE is then changed to NONE, the log reader will detect this, issue a message and end replication for the subscription.

Adding columns to RRF format tables

When replication is running and a column is added in an RRF table, replication will continue unless ADDCOLUMNISSCHEMACHANGE=YES is specified, but the new column will not be included. So long as ADDCOLUMNISSCHEMACHANGE=YES is not specified, CDC Replication will maintain a history of the table versions and automatically update the table definition in its metadata. Note that when viewing a table in Management Console, you will need to refresh the view in order to see any changes to tables that have occurred during your current session.

There are some specific exceptions to the ability of CDC Replication to adapt to added columns:

  • Adding columns with FIELDPROCs that change the length of the data will stop replication.
  • If the CDC Replication Engine for Db2 for z/OS is not running when you are making changes to the structure of the table, your procedure should include going into Management Console and updating the source table definition.
  • If you skip forward in the log (through a REFRESH, issuing SETLOGPOS or changing a table's status from ACTIVE to IDLE and back to ACTIVE), this will cause CDC Replication to miss changes to table structure recorded in the skipped section of log and it may be unable to interpret a record whose version was not detected.
  • If a table is removed from the CDC Replication Engine for Db2 for z/OS catalog, all its versions will be removed from the history table. A REORG would be required to add the table back to the catalog.

For all changes discussed above which materially affect the structure of the log record (except those special cases noted) a REORG of the table will be required. If the subscription has shut down in error, recovery will require a refresh of the table. To avoid the need for a refresh, the best way to handle these types of changes is to ensure that CDC Replication has replicated all data for the table to the point where it is taken off line for the modification. A REORG will be required to the table so that all future log records contain the correct format for modified columns. You will need to run the Update Source Table Definition in Management Console and re-map as necessary before restarting replication.

The tables below show a synopsis of the behavior for the database versions and level of the CDC Replication Engine for Db2 for z/OS installed. As noted above, there are DDL changes which may affect the log read and those that will not. The ‘Column Added' is listed separately because while it may affect the log read the behavior is different for these DDL changes than for others affecting the log read.

Table 1. Replication behavior with DDL changes
DDL change Behavior
Column added Continues (without the new column) unless ADDCOLUMNISSCHEMACHANGE=YES is specified
Not affecting log read Continues
Affecting log read Stops unless ONSCHEMACHANGE is set to IDLE