Parallelize by table
This mode of Fast Apply is similar to the Group by table mode, but instead of applying the reordered operations on a single database connection, the operations are applied concurrently across multiple database connections.
About this task
You should note that CDC Replication does not attempt to balance the number of operations assigned to each connection.
For example, if you are using two database connections and have the following set of source operations:
INSERT TABLE1
INSERT TABLE2
UPDATE TABLE1
INSERT TABLE3
INSERT TABLE2
INSERT TABLE3
CDC Replication may attempt to apply them on the target system as follows. For database connection 1:
INSERT TABLE1
UPDATE TABLE1
INSERT TABLE3
INSERT TABLE3
For database connection 2:
INSERT TABLE2
INSERT TABLE2
CDC Replication performs this ordering on a group of transactions from the source system, which is referred to as a unit of work (UOW). You can specify the unit of work threshold (the maximum size of the unit of work) and the number of database connections by specifying two integer values, separated by a colon, in the Parameter box of the subscription-level user exit dialog box. The number of database connections comes first, followed by the threshold. For example, to use 3 database connections and a threshold value of 12000, specify 3:12000.
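The per-table split in the example above can be illustrated with a minimal Python sketch. The function name `partition_by_table` and the assignment policy (whole tables handed to connections round-robin, in order of first appearance) are assumptions made for illustration. The policy CDC Replication actually uses is not described here, beyond the fact that it does not balance the number of operations per connection.

```python
def partition_by_table(operations, num_connections):
    """Group operations by table, then hand each table to one connection.

    Assumed policy (illustration only): tables are assigned round-robin
    in order of first appearance. All operations for a table stay on the
    same connection and keep their relative source order.
    """
    by_table = {}  # dicts preserve insertion order in Python 3.7+
    for op, table in operations:
        by_table.setdefault(table, []).append((op, table))

    per_connection = [[] for _ in range(num_connections)]
    for index, table_ops in enumerate(by_table.values()):
        per_connection[index % num_connections].extend(table_ops)
    return per_connection


source_operations = [
    ("INSERT", "TABLE1"),
    ("INSERT", "TABLE2"),
    ("UPDATE", "TABLE1"),
    ("INSERT", "TABLE3"),
    ("INSERT", "TABLE2"),
    ("INSERT", "TABLE3"),
]
connections = partition_by_table(source_operations, 2)
```

With two connections, this sketch places the TABLE1 and TABLE3 operations on the first connection and the TABLE2 operations on the second, which matches the example and shows why the connections can end up with unequal workloads.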
The number of image builder threads can also be specified through the User Parameter. Provide the number of image builder threads after the threshold, separated from it by a colon. For example, to have 3 apply connections, a threshold of 12000, and 4 image builder threads, set the parameter to 3:12000:4. If there is more than one image builder thread, the operations are distributed to the threads through a round-robin algorithm.
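As a hedged illustration of the parameter format, the following Python sketch splits a value such as 3:12000:4 into its parts and shows what a round-robin distribution looks like. The names `FastApplyParams`, `parse_fast_apply_parameter`, and `round_robin` are hypothetical, and the check that every value is a positive integer is an assumption, not documented product behavior.

```python
from typing import NamedTuple, Optional


class FastApplyParams(NamedTuple):
    connections: int
    threshold: int
    image_builder_threads: Optional[int]  # None when the third value is omitted


def parse_fast_apply_parameter(value: str) -> FastApplyParams:
    """Parse 'connections:threshold' or 'connections:threshold:threads'."""
    parts = value.strip().split(":")
    if len(parts) not in (2, 3):
        raise ValueError("expected 2 or 3 colon-separated integers")
    numbers = [int(part) for part in parts]  # int() raises ValueError on non-integers
    if any(number <= 0 for number in numbers):
        # Assumption for this sketch: only positive values make sense.
        raise ValueError("all values must be positive")
    threads = numbers[2] if len(numbers) == 3 else None
    return FastApplyParams(numbers[0], numbers[1], threads)


def round_robin(items, num_threads):
    """Distribute items across num_threads buckets in rotating order."""
    buckets = [[] for _ in range(num_threads)]
    for index, item in enumerate(items):
        buckets[index % num_threads].append(item)
    return buckets
```

For example, `parse_fast_apply_parameter("3:12000:4")` yields 3 connections, a threshold of 12000, and 4 image builder threads, and `round_robin` sends the first operation to thread 0, the second to thread 1, and so on, wrapping around after the last thread.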
CDC Replication stops adding transactions to the unit of work once it has reached the threshold value that you set.
Once the threshold is reached, CDC Replication keeps adding operations from the current transaction until the commit of that transaction is seen. If this transaction is very large, the UOW might grow to a large size. This might use up all of the available memory, and none of the operations in the UOW can be applied until that commit is seen. For this reason, there is a limit on how far a UOW can grow beyond the user-specified threshold. The limit is controlled by the system property target_optimizer_uow_max_percent_threshold, which has a default value of 1000. The value is a percentage of the UOW ready threshold: the threshold multiplied by this percentage gives the maximum number of operations that can be in a unit of work. For example, with the default value, a UOW can grow to 10 times the user-specified threshold. Once this maximum is exceeded, the fast apply optimization is abandoned and the data is applied through a single database connection in the original order. At the beginning of the next transaction, CDC Replication begins constructing a new unit of work.
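The arithmetic behind this limit can be written out as a short Python sketch. The function names are hypothetical. Only the property name target_optimizer_uow_max_percent_threshold, its default of 1000, and the example threshold of 12000 come from the text above.

```python
def max_uow_operations(threshold, max_percent=1000):
    """Maximum operations in a unit of work.

    max_percent mirrors target_optimizer_uow_max_percent_threshold and is
    interpreted as a percentage of the user-specified threshold.
    """
    return threshold * max_percent // 100


def exceeds_uow_limit(current_operations, threshold, max_percent=1000):
    """True when the unit of work has grown past the maximum size."""
    return current_operations > max_uow_operations(threshold, max_percent)


limit = max_uow_operations(12000)  # 12000 * 1000 / 100 = 120000
```

With a threshold of 12000 and the default percentage, a unit of work can hold up to 120000 operations before the fast apply optimization is abandoned for that unit of work.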
The threshold value works like the other threshold values around grouping transactions. If there is latency, CDC Replication creates units of work based on this value to maximize throughput and reduce latency. However, when there is no latency, CDC Replication uses smaller units of work so that it does not artificially add latency.
If the apply of the reordered data across multiple connections fails, CDC Replication will roll back all data in the database and apply the data in the original source order with a single database connection.
In general, you should avoid having database constraints between the tables in the target database when using this mode. If you do have constraints, you must ensure that they will be verified at the time of the database operation rather than being deferred until commit point so that the CDC Replication recovery mechanism can work effectively.
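The recovery behavior described above follows a common try-then-fall-back pattern, sketched here in Python. The function `apply_unit_of_work` and its callable arguments (`apply_parallel`, `apply_serial`, `rollback`) are hypothetical placeholders for illustration and do not correspond to any CDC Replication API.

```python
def apply_unit_of_work(operations, apply_parallel, apply_serial, rollback):
    """Try the reordered parallel apply; on failure, roll back and replay.

    All three callables are hypothetical stand-ins:
      apply_parallel(ops): apply reordered ops over several connections
      rollback():          undo everything applied for this unit of work
      apply_serial(ops):   apply ops in source order on one connection
    """
    try:
        apply_parallel(operations)
        return "parallel"
    except Exception:
        # An immediate failure (for example, a constraint violation
        # raised at statement time) triggers the fallback path.
        rollback()
        apply_serial(operations)
        return "serial"
```

This pattern depends on failures surfacing while the operations are being applied. A constraint that is deferred until commit would not raise an error at that point, which is why immediate constraint checking is recommended when constraints cannot be avoided.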