Once you have defined the resources that you want to crawl and index, you must define any special processing that you want to apply to those resources. In this tutorial, based on our analysis of the input files, we want to extract certain information from the files that we are crawling and create specific metadata elements from that data.
Processing the content of the resources that you are crawling is known as converting. Once converted, the content is delivered to the search engine for indexing and subsequent use. Watson Explorer Engine includes a number of built-in converters for web pages, PDF documents, Microsoft Office documents, and many other formats. However, because the goal of this tutorial is to identify, extract, and use metadata that is specific to a given set of documents, we will add a custom converter to extract this information.
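As a conceptual illustration only, the idea behind a converter can be sketched in a few lines of Python. This is not Watson Explorer Engine code and does not use its converter framework. The "Name: value" header format, the convert function, and the field names are all assumptions made for this sketch, and the actual input files in this tutorial may be structured differently.

```python
import re

# Hypothetical "Name: value" header lines; the real input files may differ.
FIELD_PATTERN = re.compile(r"^(?P<name>[A-Za-z][A-Za-z ]*):\s*(?P<value>.+)$")


def convert(raw_text, wanted_fields):
    """Split raw text into selected metadata elements and the remaining body."""
    metadata = {}
    body_lines = []
    for line in raw_text.splitlines():
        match = FIELD_PATTERN.match(line)
        name = match.group("name").strip().lower() if match else None
        if name in wanted_fields and name not in metadata:
            # First occurrence of a wanted field becomes a metadata element.
            metadata[name] = match.group("value").strip()
        else:
            # Everything else is kept as ordinary document content.
            body_lines.append(line)
    return {"metadata": metadata, "body": "\n".join(body_lines).strip()}


sample = "Title: Quarterly Report\nAuthor: J. Smith\n\nRevenue grew in the third quarter."
result = convert(sample, {"title", "author"})
print(result["metadata"])
print(result["body"])
```

The sketch separates the two outputs that matter for indexing: named metadata elements that can be searched or displayed on their own, and the remaining body text.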
To proceed to the next section of this tutorial, click Creating a Fast Index for Metadata.