Documentum Advanced Configuration and Performance Tuning
This section describes advanced configuration and performance tuning techniques that can help optimize the connector for your environment.
- Increasing Crawler Aggressiveness
Another way to improve the performance of a connector is to enable multithreading and then reduce the delay between requests. This will put more load on the server where the resource that you are crawling is stored, but will allow the connector to crawl that resource more quickly.

Note: Enabling multithreading is not supported in all connectors, and will increase Watson™ Explorer Engine and resource server memory consumption. Consider increasing the size of the Java heap to prevent Java OutOfMemory errors.

To optimize for speed, set the value of the Delay setting to 0 in the tab for the associated search collection. Setting this value to 0 will eliminate any delay between successive calls to the resource server, and will also cause the connector to create as many threads as it can in order to submit and service those requests.
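As a rough illustration of why a smaller delay and more threads raise both crawl speed and server load, the following Python sketch estimates an upper bound on the request rate. It assumes a simplified model that is not taken from the product: each thread sends a request, waits for the response, and then pauses for the configured delay. The function name, its parameters, and the 50 ms response time are hypothetical and chosen only for illustration.

```python
def max_requests_per_second(threads: int, delay_ms: float, latency_ms: float) -> float:
    """Estimate an upper bound on the request rate.

    Simplified model (an assumption, not the connector's actual scheduler):
    every thread repeats "send request, wait latency_ms for the response,
    pause delay_ms" with no other overhead.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    if delay_ms < 0:
        raise ValueError("delay_ms must not be negative")
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    # One request cycle per thread takes delay_ms + latency_ms milliseconds.
    return threads * 1000.0 / (delay_ms + latency_ms)


if __name__ == "__main__":
    # Illustrative values only: a 50 ms server response time is assumed.
    for threads, delay_ms in [(1, 100), (1, 0), (4, 0)]:
        rate = max_requests_per_second(threads, delay_ms, latency_ms=50)
        print(f"threads={threads} delay={delay_ms} ms -> at most {rate:.1f} requests/s")
```

Under this model, removing the delay leaves the server's response time as the only limit on each thread, which is why memory use and load on the resource server grow together with crawl speed.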
Note: Setting the Delay option to 0 can introduce additional errors because the resource server or Watson Explorer Engine may not be able to keep up with incoming requests. However, it is still useful to try this setting when tuning a connector for performance, because it provides the theoretical maximum performance for the crawl.

To tune for more balanced speed, set the Delay setting to a value greater than 0 and less than the default value of 100 (this value is expressed in milliseconds). You may also want to increase the value of the Concurrent requests to the same host setting in the Crawling aggressiveness section above its default value of 1. This setting controls how many threads the connector creates when starting.

Note: Some of these settings are replicated in the configuration settings for certain connectors, both to highlight their relevance and to enable setting connector-specific Delay values. Settings that are replicated in the seed for a connector take precedence over the crawler settings, but only apply to URLs that are destined for that connector. This enables the use of different settings in multiple connectors that contribute to a single search collection.
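The precedence rule in the note above can be pictured with a small Python sketch. It is only a model of the described behavior, not product code: the dictionary keys, the function name, and the second connector name ("filesystem") are hypothetical, and the numeric values are illustrative.

```python
def effective_settings(crawler_settings: dict, seed_settings: dict, connector: str) -> dict:
    """Return the settings that apply to URLs destined for `connector`.

    Models the rule that values set in a connector's seed override the
    collection-wide crawler settings, but only for that connector.
    All names here are hypothetical.
    """
    merged = dict(crawler_settings)                  # start from the crawler settings
    merged.update(seed_settings.get(connector, {}))  # seed values win for this connector
    return merged


if __name__ == "__main__":
    # Collection-wide crawler settings (illustrative values).
    crawler = {"delay_ms": 50, "concurrent_requests_per_host": 4}
    # Settings replicated in the seed of one connector only.
    seeds = {"documentum": {"delay_ms": 10}}

    print(effective_settings(crawler, seeds, "documentum"))
    print(effective_settings(crawler, seeds, "filesystem"))
```

In this sketch, URLs handled by the connector whose seed sets its own delay use that value, while URLs handled by any other connector in the same search collection keep the crawler-level settings.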