Starting a refresh on a subscription

The CDC Replication refresh operation is designed to synchronize source and target tables.

About this task

Tables can become out of synchronization for various reasons, including the following issues:

Parked tables
Parking a table from replication for some time to make changes (such as updating the definition of a source table) and the changes that are taking place on the source are no longer being replicated.
This may cause the target table to become inconsistent with the source table.
Configuration changes
A refresh may be necessary when a set of subscriptions are promoted from a test environment to a production environment.
The promotion operation may add new transformations or other table mapping changes which require the source and target tables to be refreshed in order to prepare for mirroring.
Maintenance operations
Large bulk SQL operations performed during maintenance windows on the source table which affect a majority of the rows may be faster to resynchronize using refresh. Refresh may be faster than mirroring to replicate millions of changes due to the ability to bulk load rows into the target database.

When you have a subscription that contains a target table that is not synchronized with the source table, you can flag the source table for a refresh. Tables that are flagged for refresh and have a replication method of Mirror will be refreshed before mirroring begins.

CDC Replication will refresh all of the flagged tables within a single subscription as one sequential operation which will run to completion. Each table is refreshed individually one at a time until all flagged tables have finished refreshing. Refresh is an operation which applies to a single subscription, so while one subscription is refreshing other subscriptions are not affected; they may continue mirroring data for different tables or refreshing tables as required. To perform a parallel refresh, multiple subscriptions can be used.

Rule-based tables are not part of a refresh. Within Management Console, when you start a refresh, you will see a warning to indicate that tables selected by rules will not be refreshed.

CDC Replication offers two types of refresh operations: Standard Refresh and Differential Refresh.

A Standard Refresh results in a complete copy of the data in a source table being sent to the target table. This operation truncates or deletes the contents of the target table and then inserts the new contents as sent by the source system.

A Differential Refresh updates the target table by applying only the differences between it and the source table. Instead of the target table being cleared at the start of the refresh and repopulated with data, as with the Standard Refresh, the Differential Refresh compares each row in the target table with each row in the source table to identify missing, changed or additional rows. The primary advantage of the Differential Refresh is that the target table stays online during the refresh operation.

There are three possible methods for the Differential Refresh:

Refresh Only
Performs a Differential Refresh by changing any target rows that differ from the source rows.
Refresh and Log Differences
Performs a Differential Refresh, and also creates a log table in the target replication engine metadata to track all changes during the refresh.
The log table is identical to the target table, with the addition of a column to indicate the actions taken during the refresh, such as inserting a row, deleting a row, or updating a row. For an update, both the source and target row images are logged. This log table is created in the same database and tablespace as the TS_CONFAUD table (or DMMD_DMCONFAUD table for the CDC Replication Engine for DB2® for z/OS®), with the same owner as the metadata. The name of the log table is created by combining the subscription name, the target table name, and the refresh start date and time.
Only Log Differences
Creates and populates a log table in the target replication engine metadata to identify all differences between the source and target tables.
The target table is not updated. This allows you to evaluate what the differences are between the target and the source. If you then decide to refresh the table, you can go back to the subscription and select Refresh Only to update the target table or update the target table manually based on the contents of the log table.

Performing a Differential Refresh has some requirements and restrictions:

  • Differential refresh is only available for tables that use Standard replication
  • The collation sequence of the source and target tables must be identical
  • Derived columns on the source table are not supported
  • Any target columns which are mapped to derived expressions, constants or journal control fields will be ignored
  • The key columns of the target table must be mapped directly to columns on the source table.
  • The target table must have a unique key, either a primary key or a unique index with at least one non-nullable column.
  • Differential refresh does not support the Use all searchable columns or Specify the key options when you are specifying key columns for table mapping. Only the Use an index option is valid.

Both a Standard Refresh and Differential Refresh can be further refined through the use of a SQL WHERE clause to only include rows within a specified range. This is useful for tables where only the most recent data requires a refresh.

If you want to refine the refresh through the use of a SQL WHERE clause, then the Row Subset feature requires one of the following conditions be met:
  • The table capture point has been set, either explicitly or through the table having been mirrored in Management Console or the CHCCLP program.
  • The scraping point for the subscription has been set (using the SETLOGPOS command or dmsetbookmark command where applicable)

The order in which data is retrieved from the database during a refresh depends on the type of refresh performed. During a Standard Refresh, no ORDER BY sort is used; the database determines the order in which the data is returned. During a Differential Refresh, CDC Replication queries the database by using an ORDER BY sort on the table. Either a primary key or a unique index key is chosen to sort the source and target tables and determine their differences.

When a refresh is performed with multiple tables, the order in which each individual table is refreshed is based on the group order, as set in Refresh Order option.

After a refresh has successfully completed, the subscription can be restarted for mirroring. CDC Replication will then process the backlog of changes. For tables which weren't refreshed CDC Replication will continue processing changes from the position where mirroring ended. For tables which were refreshed, CDC Replication will process all the changes that committed after refresh began for that table.

Notes:
  • When using CDC Replication in a bidirectional configuration, you need to ensure that the following conditions exist when refreshing a table from side A to side B:
    • There is no application workload occurring on side B for that table while the refresh is occurring.
    • The table isn't being mirrored from side B to side A.
    • After the refresh is complete and before you add the workload back on side B, you need to execute the dmmarktablecapturepoint command for that table.
  • Differential refresh is not available for CDC Replication Engine for Netezza® Technology, or CDC Replication Engine for FlexRep datastores.
  • If no Refresh Order is set for a subscription for the CDC Replication Engine for DB2 for z/OS, then tables are refreshed in alphabetical order.
  • Refresh functionality is not available for the CDC Replication Engine for Teradata when the TPUMP utility is being used.

Procedure

  1. Stop mirroring on any active subscriptions that will have tables refreshed.
  2. Ensure that the tables to be refreshed have been flagged for refresh.
  3. Click Monitoring > Subscriptions.
  4. If the subscription is available for editing, right-click one or more subscriptions and select Start Refresh.

    The Start Refresh dialog shows all tables in the selected subscriptions that have been flagged for refresh

  5. If you want to view the refresh details for a specific table, select it and click Refresh Options.

    If multiple table mappings were selected, any enabled row constraints will not be displayed on the Refresh Options dialog.

  6. Click OK.