Sizing considerations for the staging store

The CDC Replication staging store is located on your source server and is a cache of change data read from the database logs. The staging store grows as the product accumulates change data, so you must plan your source environment, and disk space in particular, accordingly.

Latent subscriptions

The amount of data within the staging store is related to the latency of your subscriptions. CDC Replication measures latency as the amount of time that passes between when data changes on a source table and when that change is applied on the target table. For example, if an application inserts and commits a row into the source table at 10:00 and CDC Replication applies that row to the target table at 10:15, then the latency for the subscription is 15 minutes.

When all of your subscriptions are mirroring and have very little latency, the volume of data that needs to be kept in the staging store will be relatively small. If all of your subscriptions are mirroring but some are latent, the staging store will contain all of the change data read from the logs for the latent subscriptions for the entire time that they are mirroring. For example, if the difference in latency between the least latent subscription and the most latent subscription is 3 hours, and your database generates 100 GB of log data per hour, the staging store will require approximately 300 GB of disk storage space (3 hours × 100 GB per hour).
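
The arithmetic in the example above can be sketched as a small calculation. This is a rough planning aid only: the function name, its parameters, and the sample latency values are hypothetical and are not part of CDC Replication, and the sketch assumes that the database generates log data at a constant rate.

```python
def estimate_staging_store_gb(latencies_hours, log_gb_per_hour):
    """Rough estimate of staging store size in GB.

    Assumption (taken from the worked example in the text): the required
    space is the spread between the most and least latent subscription,
    multiplied by the hourly log volume.
    """
    if not latencies_hours:
        return 0.0
    spread_hours = max(latencies_hours) - min(latencies_hours)
    return spread_hours * log_gb_per_hour


# Illustrative values: three subscriptions whose latencies span 3 hours,
# on a database that generates 100 GB of log data per hour.
print(estimate_staging_store_gb([0.25, 1.0, 3.25], 100))  # 300.0
```

Treat the result as a lower bound for planning, because log volume is rarely constant in practice.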

Inactive subscriptions

An inactive (not currently replicating) subscription that contains tables with a replication method of Mirror will continue to accumulate change data in the staging store, from the point where mirroring was stopped up to the current point. For this reason, you should delete subscriptions that are no longer required, or change the replication method of all tables in the subscription to Refresh, to prevent the accumulation of change data in the staging store on your source system.

Continuous Capture

Continuous Capture is a product feature designed for replication environments in which it is necessary to separate the reading of the database logs from the transmission of the logical database operations. This is useful when you want to continue processing log data even if replication and your subscriptions stop because of network communication failures over a fragile network, target server maintenance, or some other issue. You can enable or disable Continuous Capture without stopping subscriptions.

Continuous Capture results in additional disk utilization on the source machine, because change data from the database log files accumulates while it is not being replicated to the target machine. This change data is stored in the staging store. Evaluate and understand the additional disk utilization caused by this accumulation before you decide to use this feature in your replication environment.
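
One way to evaluate the additional disk utilization is to estimate how long the source machine could keep accumulating change data while nothing is being replicated. The following is a minimal sketch under stated assumptions: the function and its parameters are hypothetical, not a CDC Replication interface, and it assumes that the staging store grows at roughly the same rate as the database generates log data.

```python
def hours_until_full(free_disk_gb, log_gb_per_hour, reserve_gb=0.0):
    """Approximate hours of accumulation the source disk can absorb.

    Assumptions: the staging store grows at about log_gb_per_hour while
    replication is stopped, and reserve_gb is space to keep free for
    other uses on the same disk.
    """
    if log_gb_per_hour <= 0:
        raise ValueError("log_gb_per_hour must be positive")
    usable_gb = max(free_disk_gb - reserve_gb, 0.0)
    return usable_gb / log_gb_per_hour


# Illustrative values: 500 GB free, 100 GB kept in reserve,
# and 100 GB of log data generated per hour.
print(hours_until_full(500, 100, reserve_gb=100))  # 4.0
```

If the estimated time is shorter than the longest outage you expect, for example a planned target server maintenance window, allocate more disk space before enabling the feature.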