Add a ranker
You can create a ranker with the Create Ranker wizard.
The wizard steps are described below.
Specify the name and description of the ranker. Select the ranker type from the drop-down list.
Add a dataset to your collection
You can select an existing dataset from the drop-down list. Alternatively, you can create a new dataset by uploading a CSV file or by crawling the file system.
- Upload CSV
- For instructions on uploading a CSV file, see Importers.
- File System
- Before crawling the file system, you must give IBM Watson® Explorer oneWEX access to it. For more information, see Providing access to the local filesystem from Watson Explorer oneWEX. You can select multiple directories to crawl. Subdirectories are also crawled.
After you create a dataset, the dataset is crawled for data. When the crawl has completed, you can proceed to the next step.
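Because subdirectories are crawled along with the directories you select, it can help to preview what a recursive crawl would cover before you start. The following is a minimal sketch that uses only the Python standard library. It is not part of Watson Explorer oneWEX, and the function name `list_crawl_candidates` is hypothetical.

```python
import os
import tempfile


def list_crawl_candidates(roots):
    """Return every file under the given root directories, including subdirectories.

    Hypothetical helper for illustration only; it does not call any oneWEX API.
    """
    found = []
    for root in roots:
        # os.walk descends into every subdirectory of root.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                found.append(os.path.join(dirpath, name))
    return sorted(found)


# Demonstration on a throwaway directory tree with two levels of nesting.
with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, "reports", "2023"))
    sample_files = (
        "a.txt",
        os.path.join("reports", "b.txt"),
        os.path.join("reports", "2023", "c.txt"),
    )
    for rel in sample_files:
        with open(os.path.join(base, rel), "w", encoding="utf-8") as fh:
            fh.write("sample")
    candidates = [os.path.relpath(p, base) for p in list_crawl_candidates([base])]
    print(candidates)
```

Passing several root directories to the function mirrors selecting multiple directories in the wizard.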
Similar Document Ranking Setting
Specify fields for machine learning training and prediction. The following fields are required.
- Answer Field
- Specifies the field to use as the answer field.
- Answer Field Type
- Specifies the answer field type.
- Attribute Type
- The answer field contains a list of types.
- ID Type
- The answer field contains a list of similar document IDs.
- Collection Template
- Specifies the name and description of the collection template that the ranker generates. You must specify this template to create a collection that uses this ranking.
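As an illustration of the ID Type answer field, the following sketch builds a small training CSV in which each row lists the IDs of similar documents. The column names (`doc_id`, `title`, `body`, `similar_ids`) and the semicolon separator inside the answer cell are assumptions made for this example. This document does not specify how a list must be encoded, so check the format that your importer expects.

```python
import csv
import io

# Assumed names and separator; none of these are prescribed by the product.
ANSWER_FIELD = "similar_ids"
LIST_SEPARATOR = ";"
FIELDNAMES = ["doc_id", "title", "body", ANSWER_FIELD]

rows = [
    {"doc_id": "D1", "title": "Printer does not start",
     "body": "The printer shows no power light.", ANSWER_FIELD: ["D2", "D3"]},
    {"doc_id": "D2", "title": "Printer has no power",
     "body": "Nothing happens when the power button is pressed.", ANSWER_FIELD: ["D1"]},
    {"doc_id": "D3", "title": "Power light is off",
     "body": "The status light stays dark.", ANSWER_FIELD: []},
]


def write_training_csv(records, out):
    """Write records as CSV, joining each answer list into a single cell."""
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES)
    writer.writeheader()
    for record in records:
        flat = dict(record)
        flat[ANSWER_FIELD] = LIST_SEPARATOR.join(record[ANSWER_FIELD])
        writer.writerow(flat)


def read_answers(source):
    """Read the CSV back and return a mapping of document ID to similar IDs."""
    return {
        row["doc_id"]: [v for v in row[ANSWER_FIELD].split(LIST_SEPARATOR) if v]
        for row in csv.DictReader(source)
    }


buffer = io.StringIO()
write_training_csv(rows, buffer)
buffer.seek(0)
answers = read_answers(buffer)
print(answers)
```

Reading the file back before uploading it is a quick way to confirm that every ID in the answer field refers to a document that exists in the dataset.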
Configure collection fields
To initially configure this collection, select the title, body, and timestamp fields, which applications typically use, and the metadata fields. For advanced use cases, you can configure the fields further after you create the collection.
You can configure the following fields.
- Body field
- Specifies the unstructured text content to be analyzed. For an analytics collection, the enrichment process enriches this field so that documents can be analyzed in later stages. For a search collection, the field is tokenized for better search precision.
- Title field
- Specifies the document title. Document titles are used in various ways in IBM Watson Explorer Content Miner. For example, the Documents view has a Title column. In both analytics and search collections, this field is tokenized for better search precision.
- Date field
- Specifies the document date. The document date is shown in the DATE column of the Documents view and is also used in time-series-based analytics views such as the Time Series, Topic, and Trends views.
- Metadata Facets
- Select the fields that you want to use as facets for your analysis. You cannot select the body field or the title field. The fields that you select here are treated as facet values and are displayed in the Facet tree. You can use these facet values in Watson™ Explorer Content Miner analysis views. This step is important because Watson Explorer Content Miner requires facets for text analytics processes. Note: Whether or not you select these fields, IBM Watson Explorer oneWEX will use all metadata facet fields.
Enrich your collection
Enrichment is a process to generate annotations from unstructured text content. Only existing annotations are listed here, but you can create and apply more later. Enrichments selected here are applied to analyzable text fields (body and title fields in typical collections).
- Select the annotators to enable for this collection. The selected annotators enrich the body text content. The Part of Speech annotator is selected by default. For more information, see the annotators topic (c_ee_adm_annotators.html#c_ee_adm_annotators).
- Select the classifier modules to enable for this collection. The selected classifiers are used to classify results into categories. For more information, see Classifiers.
- Language identification
- Specifies how the language of the text content is determined for the enrichment process. Choose automatic detection or a specific language. The following languages are supported.
- Arabic, Czech, Danish, German, English, Spanish, French, Hebrew, Italian, Japanese, Korean, Dutch, Polish, Portuguese, Russian, Slovak, Turkish, Chinese
Specify the facets for analysis
A facet is a unit of analysis. You analyze the unstructured content with facets and various statistics. Meaningful labels for each facet are important for a successful analysis.
You can review the facets that were produced by the annotators, classifiers, and metadata fields that you selected in previous steps. You can modify these facets.
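To make the idea of a facet as a unit of analysis concrete, the following sketch counts how many documents carry each value of a facet. It is a conceptual illustration only. The sample documents, the facet names `product` and `region`, and the `facet_counts` function are hypothetical, and the sketch does not reflect how Watson Explorer Content Miner computes its statistics.

```python
from collections import Counter

# Hypothetical documents whose metadata fields act as facets.
documents = [
    {"id": "D1", "product": "printer", "region": "EMEA"},
    {"id": "D2", "product": "printer", "region": "APAC"},
    {"id": "D3", "product": "scanner", "region": "EMEA"},
    {"id": "D4", "region": "EMEA"},  # no product value for this document
]


def facet_counts(docs, facet):
    """Count documents per value of one facet, skipping documents without it."""
    return Counter(doc[facet] for doc in docs if facet in doc)


product_counts = facet_counts(documents, "product")
region_counts = facet_counts(documents, "region")
print(product_counts.most_common())
print(region_counts.most_common())
```

A clear facet label such as `product` makes counts like these easy to interpret, which is why the labels chosen in this step matter.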
Confirm the configuration. If you want to change any settings, go back to the relevant wizard step and modify them.