Troubleshooting
Problem
In SDC 5.9.0 and up, pipelines that use the Azure Data Lake Storage Gen2 (Legacy) origin take longer to stop after reading all data than they did in previous versions.
In previous SDC versions, the Batch Wait Time setting was not applied properly, which caused the pipeline to stop sooner than intended.
Resolving The Problem
In SDC 5.9.0 and up, the Batch Wait Time setting is applied as intended.
The Batch Wait Time setting defaults to 60 seconds. It is intended to allow the pipeline to wait in case any additional files appear in the source directory on Azure.
If the stage is configured to use multiple threads, the Batch Wait Time setting is applied consecutively for each thread, so the pipeline can wait much longer to stop than in previous versions. This is the intended behavior. StreamSets recommends making no changes and allowing the pipeline to run as intended in production. In development or non-production environments, the Batch Wait Time setting can be lowered for convenience.
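As a rough illustration of the multi-thread behavior described above, the following sketch estimates how long a pipeline might wait before stopping. It assumes the simplest reading of "applied consecutively": one full Batch Wait Time per thread, back to back. The function name and the thread counts are hypothetical and are not part of SDC; actual stop times may differ.

```python
def estimated_stop_delay_seconds(num_threads: int,
                                 batch_wait_time_seconds: int = 60) -> int:
    """Rough estimate of the wait before the pipeline stops.

    Assumption (not confirmed by SDC documentation): each thread waits
    one full Batch Wait Time, and the waits happen one after another.
    The default of 60 seconds matches the setting's documented default.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    if batch_wait_time_seconds < 0:
        raise ValueError("batch_wait_time_seconds must not be negative")
    return num_threads * batch_wait_time_seconds


if __name__ == "__main__":
    # Hypothetical thread counts, using the default 60-second setting.
    for threads in (1, 4, 8):
        delay = estimated_stop_delay_seconds(threads)
        print(f"{threads} thread(s): about {delay} seconds")
```

Under this assumption, a stage with 4 threads and the default setting would wait about 240 seconds, while lowering Batch Wait Time to 10 seconds in a development environment would reduce that to about 40 seconds.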
Document Location
Worldwide
Document Information
Modified date:
15 March 2025
UID
ibm17186243