Balance the workload across watches and scale for large data
With cooperative file listening, a single watch runs on multiple Launchers. On each Launcher where the watch runs, the file listener monitors the same trigger directory for the same files. Use the Pending Instance Thresholds and the Max Concurrent Map Instances Launcher settings in the Integration Flow Designer to help balance the workload among Launchers.
The Pending Instance Thresholds pause and resume the file listener of a watch. The file listener pauses when the watch's backlog of unprocessed source events reaches the high threshold, and resumes when the backlog falls below the low threshold. While the watch is paused on one Launcher, the listeners on other Launchers can process the arriving source events, which balances the workload.
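The pause and resume behavior described above is a form of hysteresis. The following is a minimal sketch of that logic, not the Launcher's actual implementation; the `ListenerGate` class and its method names are hypothetical and exist only for illustration.

```python
class ListenerGate:
    """Hypothetical model of one watch's file listener on one Launcher."""

    def __init__(self, high: int, low: int) -> None:
        self.high = high      # High Pending Instance Threshold
        self.low = low        # Low Pending Instance Threshold
        self.paused = False

    def update(self, backlog: int) -> bool:
        """Record the current backlog; return True if the listener is paused."""
        if not self.paused and backlog >= self.high:
            self.paused = True    # backlog reached the high threshold
        elif self.paused and backlog < self.low:
            self.paused = False   # backlog fell below the low threshold
        return self.paused


gate = ListenerGate(high=8, low=4)
# Backlog rises to the high threshold, then drains below the low threshold.
states = [gate.update(backlog) for backlog in (3, 8, 6, 4, 3, 5)]
print(states)  # [False, True, True, True, False, False]
```

In this sketch the listener stays paused while the backlog is between the two thresholds, which is what prevents it from toggling on every file.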
Configure the high and low Pending Instance Thresholds to ensure that maps trigger uniformly across multiple Launchers. Adjust these settings so that each Launcher processes a similar number of source-event files, proportionate to the processing speed of the Launcher's server.
For example, if a watch that typically processes 10000 files runs on three Launchers, set the high Pending Instance Threshold value to 3300 so that each Launcher's listener pauses when it has acquired about a third of the source-event files. While the watch on one Launcher is paused, the listeners on the other Launchers continue to process the incoming source-event files. Set the low Pending Instance Threshold value to 1000 so that the paused listener resumes after its Launcher has processed most of the source-event files that it acquired.
- If the high and low Pending Instance Threshold values are too low, the listeners stop and start frequently, which can degrade performance.
- If the high and low Pending Instance Threshold values are too high, the watch on one Launcher can accumulate the inputs while the watches on the other Launchers remain idle.
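The arithmetic behind the example can be sketched as a proportional split of the expected file count. This is only an illustration: `proportional_shares` and the relative speed weights are hypothetical, and the product does not compute thresholds for you.

```python
def proportional_shares(total_files: int, speeds: list[int]) -> list[int]:
    """Split total_files among Launchers in proportion to relative speed.

    speeds holds hypothetical integer weights, one per Launcher, for example
    [1, 1, 1] for three equally fast servers. Results are rounded down.
    """
    total_speed = sum(speeds)
    return [total_files * speed // total_speed for speed in speeds]


# Three equally fast Launchers sharing 10000 files.
equal = proportional_shares(10000, [1, 1, 1])
print(equal)  # [3333, 3333, 3333]

# Three Launchers whose servers differ in speed (assumed weights 4:2:1).
weighted = proportional_shares(10000, [4, 2, 1])
print(weighted)  # [5714, 2857, 1428]
```

The equal split gives 3333 files per Launcher, which the example above rounds to a high threshold of 3300. Each share is a starting point for the high threshold; the low threshold still has to be chosen by observation.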
With very large data and CPU-intensive transformation, low values for the Pending Instance Thresholds can improve scalability and load balancing. Low thresholds limit the number of source-event files that each Launcher can acquire, and they resume the listener only when the Launcher has CPU capacity available to process more files.
You can accommodate CPU-intensive maps and large data by tuning the Max Concurrent Map Instances Launcher setting in the Integration Flow Designer. The Max Concurrent Map Instances setting limits the number of map instances that can run simultaneously within a Launcher process. When you choose a value, consider the following factors:
- Average input volume per hour
- Average time per transformation
- Number of cooperative Launchers
- Number of processors available to each Launcher on its host computer
- CPU utilization during peak load
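The factors listed above can be combined into a rough capacity estimate. The sketch below uses Little's law (average work in progress equals arrival rate times service time) and assumes that each running map occupies one processor. The function names are hypothetical, and the estimate ignores peak-load variation, so treat it as a starting point rather than a sizing rule.

```python
import math


def required_concurrency(files_per_hour: int, seconds_per_map: int) -> int:
    """Average number of maps that must run at once to keep up with input."""
    return math.ceil(files_per_hour * seconds_per_map / 3600)


def has_capacity(needed: int, cpus_per_launcher: list[int]) -> bool:
    """Assumes one CPU-bound map per processor across all Launchers."""
    return needed <= sum(cpus_per_launcher)


# Hypothetical load: 300 files per hour, 60 seconds per transformation.
needed = required_concurrency(300, 60)
print(needed)                           # 5
print(has_capacity(needed, [4, 2, 1]))  # True
print(has_capacity(needed, [2, 1]))     # False
```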
For example, consider a system that processes very large data. Each map takes approximately 60 seconds to run and uses nearly 100% of one CPU. The system has three Launchers (L1, L2, and L3) that are deployed on three different host computers (H1, H2, and H3). The Launchers trigger on the same directory. The host computers have the following numbers of CPUs:
Host | Number of CPUs |
---|---|
H1 | 4 |
H2 | 2 |
H3 | 1 |

For this system, the Launcher settings on each host are as follows:

Launcher on host computer | Max Concurrent Map Instances | High Pending Instance Threshold | Low Pending Instance Threshold |
---|---|---|---|
L1 on H1 | 4 | 8 | 4 |
L2 on H2 | 2 | 4 | 2 |
L3 on H3 | 1 | 2 | 1 |
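In the table above, each Launcher's Max Concurrent Map Instances value equals the CPU count of its host, the high threshold is twice the CPU count, and the low threshold equals the CPU count. The sketch below reproduces the table from that observed pattern and estimates throughput. The pattern is inferred from this one example rather than stated as a general rule, and `settings_for` is a hypothetical helper.

```python
SECONDS_PER_MAP = 60  # from the example: each map runs for about 60 seconds


def settings_for(cpus: int) -> dict[str, int]:
    """Derive settings from a CPU count, following the pattern in the table."""
    return {
        "max_concurrent_map_instances": cpus,  # one CPU-bound map per CPU
        "high_pending_threshold": 2 * cpus,    # pause at twice the running set
        "low_pending_threshold": cpus,         # resume at one running set
    }


cpus_by_launcher = {"L1 on H1": 4, "L2 on H2": 2, "L3 on H3": 1}
table = {name: settings_for(cpus) for name, cpus in cpus_by_launcher.items()}

for name, row in table.items():
    print(name, row)

# Assuming each CPU completes one map every SECONDS_PER_MAP seconds:
maps_per_hour = sum(cpus * 3600 // SECONDS_PER_MAP
                    for cpus in cpus_by_launcher.values())
print(maps_per_hour)  # 420
```

Under these assumptions, L1 handles four sevenths of the files, L2 two sevenths, and L3 one seventh, which matches each host's share of the seven CPUs.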