Db2 Text Search indexing threads
Multiple indexing threads work in parallel to parse and index documents. This usually reduces the total elapsed time for text search index updates.
Indexer threads pick documents from the queue and manage the indexing process. They use index preprocessing threads to prepare the document content for indexing, and then write the result to the text index collection.
Index preprocessing threads extract text, identify the language, and tokenize and analyze the document.
Usually, the number of indexer threads and the number of index preprocessing threads are set to the same value. However, in some scenarios, for example when large documents are processed, increasing the number of preprocessing threads might provide a performance benefit.
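The division of work between the two thread types can be pictured with the following minimal sketch. It is a conceptual model only, not the Db2 Text Search implementation: the names `preprocess` and `run_indexing` are hypothetical, the preprocessing step is reduced to lowercasing and whitespace tokenization, and an in-memory dictionary stands in for the text index collection.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def preprocess(text):
    # Hypothetical stand-in for text extraction, language identification,
    # tokenization, and analysis.
    return text.lower().split()


def run_indexing(documents, indexer_threads=4, preprocessing_threads=4):
    """Index {doc_id: text} with separate indexer and preprocessing threads."""
    work = queue.Queue()
    for doc_id, text in documents.items():
        work.put((doc_id, text))

    index = {}               # token -> set of document ids
    lock = threading.Lock()  # coordination needed when threads share one index

    with ThreadPoolExecutor(max_workers=preprocessing_threads) as pool:

        def indexer():
            while True:
                try:
                    doc_id, text = work.get_nowait()
                except queue.Empty:
                    return  # queue drained, this indexer thread is done
                # Hand the document to a preprocessing thread and wait.
                tokens = pool.submit(preprocess, text).result()
                # Write the result to the shared index.
                with lock:
                    for token in tokens:
                        index.setdefault(token, set()).add(doc_id)

        workers = [threading.Thread(target=indexer) for _ in range(indexer_threads)]
        for worker in workers:
            worker.start()
        for worker in workers:
            worker.join()

    return index


if __name__ == "__main__":
    sample = {1: "Db2 Text Search", 2: "Text index update"}
    print(sorted(run_indexing(sample)["text"]))
```

In this model, the lock around the index write is the kind of coordination that limits the gain when several indexer threads update the same collection.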
Indexing thread usage
If multiple indexer threads work on the same collection, the benefit of parallel processing is reduced by the coordination that is required to synchronize the threads. Also, an indexing thread that works alone on a text index performs better while parsing, but there can be a performance hit while merging or writing to disk. For example, four indexing threads working on four different text indexes show better throughput than four indexing threads working on a single text index.
Number of indexing threads
Use at least two indexing threads, and ensure that the number of indexing threads does not exceed the number of available CPUs. To avoid thread sharing, the maximum number of parallel index updates should not exceed the number of indexing threads. With too many indexing threads or too many parallel index updates, overall system performance suffers because of the memory that is used for process context switches.
For example, if 40 text indexes are frequently updated and the system contains 8 CPUs, do not use more than 8 indexing threads. Also, use a staggered update schedule for the text indexes to minimize contention for indexing threads.
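The sizing guidance can be expressed as a small calculation. The following sketch is illustrative only: the function names `clamp_indexing_threads` and `stagger_updates` and the index names are hypothetical, and grouping the updates into fixed-size batches is just one possible way to stagger a schedule.

```python
def clamp_indexing_threads(requested, cpu_count):
    """Keep the thread count at 2 or more and at most the number of CPUs."""
    upper = max(2, cpu_count)  # the two-thread minimum wins on very small systems
    return max(2, min(requested, upper))


def stagger_updates(index_names, max_parallel):
    """Split index updates into batches of at most max_parallel indexes."""
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    return [
        index_names[start:start + max_parallel]
        for start in range(0, len(index_names), max_parallel)
    ]


if __name__ == "__main__":
    # 40 frequently updated text indexes on a system with 8 CPUs.
    indexes = ["TEXT_INDEX_%02d" % n for n in range(1, 41)]  # hypothetical names
    threads = clamp_indexing_threads(requested=16, cpu_count=8)
    batches = stagger_updates(indexes, max_parallel=threads)
    print(threads, len(batches))  # prints "8 5"
```

With these inputs, no more than 8 index updates run at the same time, so each update can have its own indexing thread.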
The default number of indexer threads is 4. The same default applies to index preprocessing threads.
To set the number of indexer threads:
configTool configureParams -configPath <full-path-to-configuration-folder> -numberOfIndexerThreads <value>
where <value> is the number of threads and <full-path-to-configuration-folder> is the full path to the config.xml file for the Db2 Text Search server.
To set the number of index preprocessing threads:
configTool configureParams -configPath <full-path-to-configuration-folder> -numberOfPreprocessingThreads <value>
where <value> is the number of threads and <full-path-to-configuration-folder> is the full path to the config.xml file for the Db2 Text Search server.
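When both settings are changed together, the two command lines can be generated from one place. The following sketch only builds and prints the commands and does not run them. The helper name `config_tool_command` and the path `/hypothetical/db2tss/config` are assumptions for illustration; the `configureParams` option names are the ones shown in this topic.

```python
import shlex

# Option names as shown in this topic; no other options are assumed.
THREAD_PARAMETERS = {"numberOfIndexerThreads", "numberOfPreprocessingThreads"}


def config_tool_command(config_path, parameter, value):
    """Return the configTool argument list for one thread setting."""
    if parameter not in THREAD_PARAMETERS:
        raise ValueError("unsupported parameter: %s" % parameter)
    if value < 1:
        raise ValueError("thread count must be at least 1")
    return [
        "configTool", "configureParams",
        "-configPath", config_path,
        "-" + parameter, str(value),
    ]


if __name__ == "__main__":
    config_path = "/hypothetical/db2tss/config"  # placeholder, not a real default
    for name in ("numberOfIndexerThreads", "numberOfPreprocessingThreads"):
        print(shlex.join(config_tool_command(config_path, name, 8)))
```

Printing the commands first makes it possible to review them before they are run on the Db2 Text Search server.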