Most sites implement cache invalidation along these lines: database triggers are defined on most tables (out-of-the-box triggers are provided), and when data is updated, the triggers fire and insert an invalidation ID into the CACHEIVL table. The DynaCacheInvalidation job then scans the table and sends the invalidations into the replication domain, and DRS (the Data Replication Service) ensures they reach every server.
When a large amount of data is updated, for example by a feed or a staging propagation, thousands of rows can be inserted into the CACHEIVL table in a short period of time. The next time the DynaCacheInvalidation job runs, it picks up these thousands of rows and sends the corresponding invalidations into the replication domain. Because the job has extended logic to handle the Data Cache, the number of invalidations issued is actually higher than the number of rows in the table.
This sudden burst of invalidations can overload DRS. When invalidations are received faster than they can be processed, they queue up in memory, and too large a backlog can cause an out-of-memory condition or performance degradation.
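To gauge how large a burst is before the job runs, the pending CACHEIVL backlog can be inspected directly. A minimal sketch for DB2; the column name INSERTTIME is an assumption, so check it against your schema:

```sql
-- Count invalidation rows inserted in the last hour (DB2 syntax).
-- INSERTTIME is assumed to be the insertion timestamp column on CACHEIVL.
SELECT COUNT(*) AS pending_rows
  FROM CACHEIVL
 WHERE INSERTTIME > CURRENT TIMESTAMP - 1 HOUR;
```

A count in the thousands right before the DynaCacheInvalidation job runs is the kind of burst that can overload DRS.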
Two techniques can mitigate this. The reduceInvalidationIds setting reduces the number of Data Cache invalidations that are issued, and several newer DynaCacheInvalidation job parameters can limit both the number of invalidations issued and the rate at which they are issued.
The reduceInvalidationIds setting
Since Fix Pack 3, the reduceInvalidationIds setting can reduce the number of Data Cache invalidations by more than 50%, which is very significant. This is achieved by issuing slightly coarser invalidations: when reduceInvalidationIds is enabled, only "WCT" invalidations are issued, where "T" stands for "any type"; the more specific "D" (delete) and "N" (non-delete operation) invalidations are no longer used. The WSTE webcast has more information: Webcast replay: WebSphere Commerce Data Cache Overview and Configuration
The reduceInvalidationIds configuration can be enabled in wc-server.xml. See: Additional WebSphere Commerce data cache configuration
maxTimeToLive="172800" reduceInvalidationIds="true" reduceMemory="false">
DynacacheInvalidationCmd Job options
Since Fix Pack 7, the invalidation job offers several parameters that restrict the number of invalidations issued and the rate at which they are issued. They work in different ways, but the common idea is that when a certain condition or threshold is met, the job either stops processing invalidations and clears the affected cache entirely instead, or throttles processing so that only a limited number of invalidations are issued per second.
See the following: DynaCacheInvalidation URL
Fix Pack 7
Fix Pack 8
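For illustration, these parameters are supplied on the scheduler job's URL. A hedged sketch; the value 10000 is just an example, so confirm the exact syntax against the DynaCacheInvalidation URL documentation linked above:

```
DynaCacheInvalidation?maxInvalidationDataIds=10000
```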
The easiest parameter to implement is maxInvalidationDataIds. maxInvalidationDataIdsPerCache is similar but offers granularity at the cache level. Fine-tuning the values can matter depending on the size of the propagations and when they run, but for sites that clear caches after a propagation, or that run propagations overnight, fine-tuning is typically not needed. For example, I have used maxInvalidationDataIds=10000 before, which helped avoid out-of-memory issues.
If a review of the CACHEIVL table shows that most entries belong to the Data Cache (entries that start with WC), the maxIndividualInvalidationsPerTable setting can be useful as well: once the threshold is reached, only the entries for that table are removed from the cache. This can help preserve other caches, such as the base cache.
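Such a review can be done with a simple grouping query. A sketch for DB2; the column name DATAID is an assumption, so check it against your schema:

```sql
-- Group pending invalidations by whether they target the Data Cache.
-- Data Cache entries are assumed to start with "WC"; DATAID is assumed
-- to be the invalidation ID column on CACHEIVL.
SELECT CASE WHEN DATAID LIKE 'WC%' THEN 'Data Cache' ELSE 'Other' END AS cache_type,
       COUNT(*) AS num_rows
  FROM CACHEIVL
 GROUP BY CASE WHEN DATAID LIKE 'WC%' THEN 'Data Cache' ELSE 'Other' END;
```

If the Data Cache bucket dominates, maxIndividualInvalidationsPerTable is worth considering.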
In the past, to avoid processing thousands of invalidations, I have seen customers disable the DynaCacheInvalidation job while stagingprop runs, or drop the triggers, and then do a full cache clear. Today, with settings like the ones above protecting the JVMs from an overload of invalidations, it is easier to simply let the job run.
Special thanks to Robert Dunn for his help with the content.