Document truncation

The OmniFind Text Search Server for DB2® for i limits the number of characters that can be indexed for each text document. Sometimes this character limit results in the truncation of large text documents in the text search index.

Documents that contain more than 10 million Unicode characters might be truncated by the text search server. For a rich text document, this limit is applied after the document is transformed to plain text.
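As a rough illustration of the limit, the following Python sketch checks whether a plain-text document is long enough that it might be truncated. It is not part of the product. The names MAX_INDEXED_CHARS, may_be_truncated, and unindexed_tail_length are hypothetical, and the sketch assumes that one Unicode character corresponds to one code point, which the text search server might count differently. For a rich text document, the check is meaningful only on the text that remains after transformation to plain text.

```python
# Limit stated in this topic: 10 million Unicode characters.
MAX_INDEXED_CHARS = 10_000_000


def may_be_truncated(plain_text: str, limit: int = MAX_INDEXED_CHARS) -> bool:
    """Return True if the text has more characters than the limit."""
    # len() on a Python str counts Unicode code points, not bytes.
    # Assumption: the server counts characters in a comparable way.
    return len(plain_text) > limit


def unindexed_tail_length(plain_text: str, limit: int = MAX_INDEXED_CHARS) -> int:
    """Return how many characters fall after the limit (0 if none)."""
    return max(0, len(plain_text) - limit)
```

A check of this kind could be run before documents are loaded into the table, so that oversized documents are identified before the job log warning appears.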

If a text document is truncated during the parsing stage, a warning in the job log reports that some documents were not processed completely. The truncated document is only partially indexed: text that follows the point where the limit is reached is not indexed and is not considered during searches.

You might want to remove a truncated document from the text search index to avoid unexpected behavior during search processing, such as the document failing to match terms that appear only after the limit. To remove the document, either delete the corresponding record from the DB2 table or change the value for the document to empty or null.
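The two removal options can be sketched as follows. This example uses Python with an in-memory SQLite database purely as a stand-in for the DB2 table, so that it can be run anywhere. The table name docs, the columns id and body, and both helper functions are hypothetical. The text search index itself is not modeled, and how the index picks up the change is not shown.

```python
import sqlite3


def delete_document(conn: sqlite3.Connection, doc_id: int) -> None:
    """Option 1: remove the record that holds the truncated document."""
    conn.execute("DELETE FROM docs WHERE id = ?", (doc_id,))


def clear_document(conn: sqlite3.Connection, doc_id: int) -> None:
    """Option 2: keep the record but set the document value to null."""
    conn.execute("UPDATE docs SET body = NULL WHERE id = ?", (doc_id,))


# Stand-in table; a real DB2 table would have its own name and columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO docs (id, body) VALUES (?, ?)",
    [(1, "small"), (2, "huge truncated"), (3, "another")],
)

delete_document(conn, 2)  # record 2 is removed entirely
clear_document(conn, 3)   # record 3 stays, but its document is null

rows = conn.execute("SELECT id, body FROM docs ORDER BY id").fetchall()
```

Deleting the record suits cases where nothing else in the record is needed. Setting the value to null keeps the other columns of the record available to applications.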