Parallelize by table

This mode of Fast Apply is similar to the Group by table mode, but instead of applying the reordered operations on a single database connection, the operations are applied concurrently across multiple database connections.

About this task

CDC Replication does not attempt to balance the number of operations assigned to each connection.

For example, suppose that you are using two database connections and have the following set of source operations:

INSERT TABLE1
INSERT TABLE2
UPDATE TABLE1
INSERT TABLE3
INSERT TABLE2
INSERT TABLE3

CDC Replication may attempt to apply them on the target system as follows. For database connection 1:

INSERT TABLE1
UPDATE TABLE1
INSERT TABLE3
INSERT TABLE3

For database connection 2:

INSERT TABLE2
INSERT TABLE2
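
The following Python sketch illustrates the reordering shown above. It is not the product's implementation. It assumes that operations are grouped by table and that whole tables are handed to connections in round-robin order of first appearance, which reproduces this example; the actual assignment policy is not described here. The names parallelize_by_table and source_operations are hypothetical.

```python
def parallelize_by_table(operations, num_connections):
    """Group operations by table, then assign whole tables to connections.

    Assumption: tables are dealt out round-robin in order of first
    appearance. No attempt is made to balance the operation counts.
    """
    by_table = {}  # dict preserves insertion order (Python 3.7+)
    for op, table in operations:
        by_table.setdefault(table, []).append((op, table))

    connections = [[] for _ in range(num_connections)]
    for index, table_ops in enumerate(by_table.values()):
        # All operations for one table stay on one connection,
        # in their original relative order.
        connections[index % num_connections].extend(table_ops)
    return connections


source_operations = [
    ("INSERT", "TABLE1"),
    ("INSERT", "TABLE2"),
    ("UPDATE", "TABLE1"),
    ("INSERT", "TABLE3"),
    ("INSERT", "TABLE2"),
    ("INSERT", "TABLE3"),
]

connection_1, connection_2 = parallelize_by_table(source_operations, 2)
```

In this sketch, connection 1 receives four operations and connection 2 receives two, which shows why the load across connections can be uneven.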

CDC Replication performs this reordering on a group of transactions from the source system, referred to as a unit of work (UOW). You specify the number of database connections and the unit of work threshold (the maximum size of the unit of work) as two integer values separated by a colon in the Parameter box of the subscription-level user exit dialog box, with the number of connections first. For example, to use 3 database connections and a threshold value of 12000, specify 3:12000.

You can also specify the number of image builder threads in the Parameter box. Add the number of image builder threads after the threshold, separated by a colon. For example, for 3 database connections, a threshold of 12000, and 4 image builder threads, set the parameter to 3:12000:4. If there is more than one image builder thread, operations are distributed to the threads by a round-robin algorithm.
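
The parameter format can be illustrated with a short Python sketch. This is not the product's parser. The function names are hypothetical, the fallback to 4:10000 follows the note in the procedure below, and treating zero or negative values as incorrectly formatted is an assumption. The sketch returns None for the image builder thread count when it is omitted, because the default for that value is not stated here.

```python
DEFAULT = (4, 10000, None)  # connections, threshold, image builder threads


def parse_fast_apply_parameter(text):
    """Parse 'connections:threshold[:image_builder_threads]'."""
    parts = (text or "").strip().split(":")
    if len(parts) not in (2, 3):
        return DEFAULT
    try:
        values = [int(part) for part in parts]
    except ValueError:
        return DEFAULT
    if any(value <= 0 for value in values):
        # Assumption: non-positive values count as badly formatted.
        return DEFAULT
    threads = values[2] if len(values) == 3 else None
    return (values[0], values[1], threads)


def round_robin(operations, num_threads):
    """Deal operations to threads in turn: 0, 1, ..., n-1, 0, 1, ..."""
    return [operations[i::num_threads] for i in range(num_threads)]


two_fields = parse_fast_apply_parameter("3:12000")
three_fields = parse_fast_apply_parameter("3:12000:4")
malformed = parse_fast_apply_parameter("3;12000")
dealt = round_robin(["op1", "op2", "op3", "op4", "op5"], 2)
```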

CDC Replication stops adding transactions to the unit of work once it reaches the threshold value that you set.

Once the threshold is reached, CDC Replication keeps adding operations from the current transaction until it sees the commit for that transaction. If that transaction is very large, the unit of work (UOW) can grow well beyond the threshold. This can use up all of the available memory, and none of the operations in the UOW can be applied until the commit is seen. For this reason, there is a limit on how far a UOW can grow beyond the user-specified threshold. The limit is controlled by the system property target_optimizer_uow_max_percent_threshold, which is expressed as a percentage of the user-specified threshold (the UOW ready threshold). The maximum number of operations that a unit of work can hold is the threshold multiplied by this percentage. The default value is 1000, so by default a UOW can grow to 10 times the user-specified threshold. If this maximum is exceeded, the Fast Apply optimization is abandoned and the data is applied through a single database connection in the original order. At the beginning of the next transaction, CDC Replication begins constructing a new unit of work.
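
As a worked example of this limit, the following Python sketch computes the maximum unit of work size from the threshold and the percentage. The function name max_uow_operations is hypothetical, and the use of integer division for rounding is an assumption.

```python
def max_uow_operations(threshold, max_percent=1000):
    """Maximum operations in a unit of work.

    max_percent stands in for target_optimizer_uow_max_percent_threshold
    and is a percentage of the user-specified threshold.
    """
    return threshold * max_percent // 100


# Threshold of 12000 with the default percentage of 1000.
default_limit = max_uow_operations(12000)

# Same threshold with the percentage lowered to 150.
lowered_limit = max_uow_operations(12000, max_percent=150)
```

With a threshold of 12000, the default percentage allows a unit of work of up to 120000 operations, and a percentage of 150 allows up to 18000.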

The threshold value works like the other threshold values that govern transaction grouping. If there is latency, CDC Replication creates units of work based on this value to maximize throughput and reduce latency. When there is no latency, CDC Replication uses smaller units of work so that it does not add latency artificially.

If the apply of the reordered data across multiple connections fails, CDC Replication will roll back all data in the database and apply the data in the original source order with a single database connection.
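
The recovery behavior described above follows a try, roll back, and retry pattern, sketched below in Python. This shows the control flow only and is not the product's code. The callables apply_parallel, apply_serial, and rollback are hypothetical stand-ins for the real apply and rollback steps.

```python
def apply_unit_of_work(original_ops, per_connection_ops,
                       apply_parallel, apply_serial, rollback):
    """Try the reordered parallel apply; on failure, fall back."""
    try:
        apply_parallel(per_connection_ops)
        return "parallel"
    except Exception:
        # Undo whatever the parallel attempt applied, then replay
        # the operations in source order on a single connection.
        rollback()
        apply_serial(original_ops)
        return "serial"


# Demonstration with fake callables that record what happened.
events = []


def failing_parallel(groups):
    raise RuntimeError("constraint violation on one connection")


outcome = apply_unit_of_work(
    original_ops=["op1", "op2", "op3"],
    per_connection_ops=[["op1", "op3"], ["op2"]],
    apply_parallel=failing_parallel,
    apply_serial=lambda ops: events.append(("serial", list(ops))),
    rollback=lambda: events.append("rollback"),
)
```

The fallback depends on the failure surfacing while an operation is being applied, which is why the guidance that follows asks for constraints to be verified at the time of the operation rather than at commit.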

In general, avoid database constraints between the tables in the target database when using this mode. If you do have constraints, you must ensure that they are verified at the time of the database operation rather than deferred until the commit point, so that the CDC Replication recovery mechanism can work effectively.

Procedure

  1. Start Management Console and log in to Access Server.
  2. Click Configuration > Subscriptions.
  3. Right-click the subscription where you want to enable Fast Apply and select User Exits.
  4. Select Java Class from the User Exit Type list.
  5. Enter the name of the Java™ class user exit for the Fast Apply mode that you want to use in the Class Name box: com.datamirror.ts.target.publication.userexit.fastapply.ParallelizeByTable
  6. Enter the user parameter that you want to make available to the user exit program in the Parameter box. For example, enter the following value for 3 database connections and a threshold of 12000 operations:
    3:12000
    Note: If you do not specify a value for this field, or if you specify an incorrectly formatted string, CDC Replication uses the following default values for the various Fast Apply modes:
    • Group by table: 10000
    • Parallelize by table: 4:10000
    • Parallelize by single table: 4:10000
    • Parallelize single table by hash: 4:10000
    • Group by table Net Effect: 10000
    • Group by table Net Effect Convert Updates: 10000
    • Parallelize by table Net Effect: 4:10000
    • Parallelize by table Net Effect Convert Updates: 4:10000
  7. Click OK.

Results

Fast Apply is now enabled for the subscription.