Pipeline phase

The pipeline phase involves two components: the XML filtering utility and the IBM® Content Search Services index server. You can configure how a Content Search Services index server operates to optimize indexing performance or to control its use of system resources.

The following table describes the steps in the pipeline phase.

Step	Description	Related information
1. Input queue	When an index server receives a text document, it places the document in its input queue.
2. Document preprocessing	XML filtering: The XML filtering utility optionally removes surplus XML elements from XML content.	For information about defining surplus XML elements, see Setting XML elements as non-searchable.
	Language identification: The index server identifies the text language for the text document.	For information about selecting text languages, see Selecting text languages or text analyzers for an object store.
	Tokenization: The server creates tokens for the text document based upon a language-aware analysis of the text. Word stems and other language constructs are identified.	For information about word stems, see Token searches: Language-aware versus exact-match.
3. Output queue	After preprocessing, the text document is sent to the output queue for the index server.
4. Token indexing	The index entry for the object in the target index is updated with the tokens.
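
The flow of the steps in the table can be illustrated with a small, self-contained sketch. This is not the Content Search Services implementation or API. Every name in it (NON_SEARCHABLE, filter_xml, identify_language, tokenize, run_pipeline) is hypothetical, and the stop-word language check and the trailing-"s" stemming are toy stand-ins for the real language-aware analysis. The sketch also assumes that the index holds one entry per object, keyed by a made-up object ID.

```python
import re
from collections import deque

# Hypothetical list of surplus (non-searchable) XML elements; not a product default.
NON_SEARCHABLE = {"metadata"}

# Toy stop-word lists that stand in for real language identification.
STOPWORDS = {"en": {"the", "and", "is"}, "de": {"der", "und", "ist"}}


def filter_xml(text, surplus=NON_SEARCHABLE):
    """Step 2a: remove surplus XML elements together with their content."""
    for name in surplus:
        pattern = rf"<{re.escape(name)}\b[^>]*>.*?</{re.escape(name)}>"
        text = re.sub(pattern, " ", text, flags=re.DOTALL)
    return text


def identify_language(text):
    """Step 2b: pick the language whose stop words occur most often."""
    words = re.findall(r"[a-zäöüß]+", text.lower())
    scores = {lang: sum(w in stops for w in words) for lang, stops in STOPWORDS.items()}
    return max(scores, key=scores.get)


def tokenize(text, lang):
    """Step 2c: create tokens; English words get a toy trailing-'s' stemming."""
    plain = re.sub(r"<[^>]+>", " ", text)  # ignore remaining markup (simplification)
    words = re.findall(r"\w+", plain.lower())
    if lang == "en":
        words = [w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words]
    return words


def run_pipeline(documents):
    """Run (object_id, text) pairs through the four steps; return the index."""
    input_queue = deque(documents)  # step 1: input queue
    output_queue = deque()
    index = {}

    while input_queue:  # step 2: document preprocessing
        object_id, text = input_queue.popleft()
        text = filter_xml(text)
        lang = identify_language(text)
        tokens = tokenize(text, lang)
        output_queue.append((object_id, tokens))  # step 3: output queue

    while output_queue:  # step 4: token indexing
        object_id, tokens = output_queue.popleft()
        index[object_id] = tokens  # update the index entry for the object
    return index
```

For example, calling run_pipeline([("doc1", "<doc><metadata>internal ids</metadata><body>The servers index the documents</body></doc>")]) returns an index in which the entry for doc1 contains the stemmed tokens "server" and "document" and no tokens from the filtered metadata element.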