Indexing is the process by which data prepared by the converters (see Converters Provided with Watson Explorer Engine) is made available for searching. Both indexing and searching are controlled by the indexer-service. As the crawler runs (see Crawling, Seeds, and Connectors), it finds new or modified data, processes it, and forwards it to the indexer. The indexer prepares this data for searching. Once the data has been processed and committed to an index, the new or updated information appears in search results. An index is a read-only, disk-based data structure that is used to respond to searches.
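The commit behavior described above can be illustrated with a small conceptual model. This is only a sketch under stated assumptions: the class `ToyIndexer` and its methods `enqueue`, `commit`, and `search` are hypothetical names invented for illustration, the index is kept in memory rather than on disk, and none of this reflects the actual indexer-service implementation or API. It shows only that forwarded data stays invisible to searches until a commit publishes a new read-only index.

```python
from types import MappingProxyType


class ToyIndexer:
    """Conceptual model only; not the Watson Explorer Engine indexer."""

    def __init__(self):
        self._pending = {}    # url -> text, forwarded but not yet committed
        self._documents = {}  # url -> text, everything committed so far
        # Read-only snapshot used by searches: term -> frozenset of urls.
        self._index = MappingProxyType({})

    def enqueue(self, url, text):
        """Accept new or modified data; it is not searchable yet."""
        self._pending[url] = text

    def commit(self):
        """Fold pending data into the document set and publish a new index."""
        self._documents.update(self._pending)
        self._pending.clear()
        postings = {}
        for url, text in self._documents.items():
            for term in set(text.lower().split()):
                postings.setdefault(term, set()).add(url)
        # Replace the old snapshot instead of mutating it in place.
        self._index = MappingProxyType(
            {term: frozenset(urls) for term, urls in postings.items()}
        )

    def search(self, term):
        """Answer a query from the committed, read-only snapshot only."""
        return sorted(self._index.get(term.lower(), frozenset()))


indexer = ToyIndexer()
indexer.enqueue("http://example.com/a", "Indexing makes data searchable")
before_commit = indexer.search("indexing")  # nothing committed yet
indexer.commit()
after_commit = indexer.search("indexing")   # now visible to searches
print(before_commit, after_commit)
```

Replacing the snapshot on each commit, rather than editing it, is one simple way to model a read-only index in this toy; the real product may work quite differently.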
In Watson Explorer Engine version 10.0.0.2 and later, you can add a lexical analysis language stream to a collection. Pre-built configurations are available for Arabic, Chinese, Czech, Danish, Dutch, French, German, Greek, Hebrew, Italian, Japanese, Korean, Polish, Portuguese, Russian, Spanish, and Turkish, and you can also create your own. See Lexical Analysis Streams for more information.
If you choose to add a Custom Stream, see Stream Definitions for the list of available fields.
To learn how to add an index stream to a search collection, see Adding Index Streams.
The next two sections discuss how you can index the same content in multiple ways and explain the internals of the processes that handle data to be indexed.