Db2 Text Search index planning and optimization
Data source characteristics and configuration choices have a major impact on indexing performance. The main factors are:
- the number of documents to be indexed
- the document size
- the index type
- index update parallelism
- text search server configuration
The processing time for each document is the sum of an approximately fixed time and a variable time. The fixed time is influenced by the document type, such as plain text, XML, or INSO. It is only approximately fixed because memory usage or reuse can cause minor variations. The variable time is determined mainly by the document size and by variations in linguistic processing.
For indexes of INSO documents, handling different MIME types can also affect the processing time.
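The cost model described above can be written compactly. The symbols are illustrative assumptions and do not come from the product documentation: T_fixed is the approximately fixed per-document time for a given document type, and T_var is the variable time, which grows with document size and depends on linguistic processing.

```latex
T_{\text{doc}}(d) \approx T_{\text{fixed}}(\text{type}_d) + T_{\text{var}}(\text{size}_d, \text{language}_d),
\qquad
T_{\text{total}} \approx \sum_{d=1}^{N} T_{\text{doc}}(d)
```

Under this reading, the total indexing time for N documents depends on both the document count (through the fixed term) and the total volume of text (through the variable term).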
The number of documents that can be processed in a given timeframe increases as document size decreases. However, the total throughput, measured as the volume of text indexed per unit of time, is lower for smaller documents than for larger documents, because the fixed cost is incurred for every document regardless of its size.
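The trade-off between document rate and data throughput can be illustrated with a minimal sketch. It assumes a linear variable cost, and the function name `indexing_rates` and the cost values (5 ms fixed, 0.5 ms per KB) are hypothetical, chosen only for illustration. They are not measured Db2 Text Search figures.

```python
def indexing_rates(doc_size_kb, fixed_ms=5.0, variable_ms_per_kb=0.5):
    """Return (documents per second, KB per second) for one document size.

    The cost values are hypothetical placeholders, not Db2 measurements.
    The variable cost is assumed to grow linearly with document size.
    """
    # Per-document time = approximately fixed part + size-dependent part.
    time_per_doc_ms = fixed_ms + variable_ms_per_kb * doc_size_kb
    docs_per_sec = 1000.0 / time_per_doc_ms
    kb_per_sec = docs_per_sec * doc_size_kb
    return docs_per_sec, kb_per_sec


if __name__ == "__main__":
    # Compare small, medium, and large documents under the same cost model.
    for size_kb in (1, 10, 100):
        docs, kb = indexing_rates(size_kb)
        print(f"{size_kb:>4} KB/doc: {docs:8.1f} docs/s, {kb:8.1f} KB/s")
```

With these assumed costs, the smallest documents give the highest document rate but the lowest data throughput, because the fixed cost dominates their processing time. Real values must be obtained by measuring an actual index update.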