Add a collection
You can add a collection using the IBM Watson® Explorer Admin Console or the Watson™ Explorer Content Miner.
Click Add Collection on the Collections page of Watson Explorer Admin Console or Watson Explorer Content Miner, and then choose a template on the Collection Template page.
The following default collection types are available.
- Content Mining
- This is a collection for general content mining. (Version 12.0.1 or later.)
- Sentiment Analysis
- This is a collection for sentiment analysis. (Version 12.0.1 or later.)
- Analytics
- This is a text analytics collection. (Version 12.0.0 only.)
- Search
- For IBM Watson Explorer Admin Console, this is an enterprise search collection type that you can choose for use in IBM Watson Explorer oneWEX Application Builder. (Version 12.0.2 and later.)
- For Watson Explorer Content Miner, this template does not exist. (Version 12.0.2 and later.)
Other collection templates might also be available if you created new collection templates by deploying rankers.
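As a rough illustration of how the version notes above combine, the following Python sketch lists which default templates would be offered for a given product version and tool. The table and the function names are hypothetical and are not part of any Watson Explorer API. The sketch also assumes that the templates other than Search are offered in both tools, which the list above implies but does not state outright.

```python
# Illustrative model of the default collection types listed above.
# DEFAULT_TEMPLATES and both functions are hypothetical, not a product API.

def parse_version(text):
    """Turn a version string such as '12.0.2' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))

BOTH_TOOLS = {"Admin Console", "Content Miner"}

# (template name, first version, last version or None, tools that offer it)
DEFAULT_TEMPLATES = [
    ("Content Mining", "12.0.1", None, BOTH_TOOLS),
    ("Sentiment Analysis", "12.0.1", None, BOTH_TOOLS),
    ("Analytics", "12.0.0", "12.0.0", BOTH_TOOLS),
    ("Search", "12.0.2", None, {"Admin Console"}),
]

def available_templates(version, tool):
    """Return the default template names for a product version and tool."""
    current = parse_version(version)
    names = []
    for name, first, last, tools in DEFAULT_TEMPLATES:
        if tool not in tools:
            continue
        if current < parse_version(first):
            continue
        if last is not None and current > parse_version(last):
            continue
        names.append(name)
    return names

if __name__ == "__main__":
    print(available_templates("12.0.2", "Admin Console"))
    print(available_templates("12.0.2", "Content Miner"))
```

Templates that you create yourself by deploying rankers are not covered by this sketch.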
If IBM Watson Explorer oneWEX is installed on IBM® Cloud Private with multiple index partitions, the following advanced options are available.
- Number of index partitions
- Specifies the number of index partitions. To maximize the parallelism of a single query request, set this value to the number of pods for the Discovery service. This option can be set only when you create a collection.
- Enable index replication
- Select this option to enable a backup index replica. Enabling this option improves service availability: even if a pod goes down, another pod works as a backup. Note that the required system resources are doubled. This option can be set only when you create a collection.
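The guidance for these two options can be summarized in a small planning helper. This is a minimal sketch: the function name and the returned keys are hypothetical, and the resource multiplier only restates the "doubled" note above rather than measuring actual usage.

```python
# Hypothetical planning helper; not part of any Watson Explorer API.

def plan_index(discovery_pods, enable_replication):
    """Suggest advanced option values for a new collection.

    discovery_pods: number of pods that run the Discovery service.
    enable_replication: whether a backup index replica is wanted.
    """
    if discovery_pods < 1:
        raise ValueError("at least one Discovery pod is required")
    return {
        # Match the partition count to the pod count for query parallelism.
        "index_partitions": discovery_pods,
        # Replication doubles the required system resources.
        "resource_multiplier": 2 if enable_replication else 1,
    }

if __name__ == "__main__":
    print(plan_index(discovery_pods=3, enable_replication=True))
```

Because both options can be set only at creation time, it is worth settling these values before you start the wizard.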
After you select a collection type and click Next, provide a name and a description for your collection. The collection creation wizard guides you through the rest of the collection creation process. These steps are described below.
Add a dataset to your collection
You can select an existing dataset that has already been defined from the drop-down list. Alternatively, you can create a new dataset by uploading a CSV file or by crawling the file system.
- Upload CSV
- For instructions on uploading a CSV file, see Importers.
- File System
- Before crawling the file system, you must give IBM Watson Explorer oneWEX access to it. For more information, see Providing access to the local filesystem from Watson Explorer oneWEX. You can select multiple directories to crawl. Subdirectories are also crawled.
After you create a dataset, the dataset is crawled for data. When the crawl has completed, you can proceed to the next step.
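Before you upload a CSV file, it can help to check that the file is well formed. The following Python sketch uses only the standard csv module; it assumes that the first row is a header row and that every record should have the same number of columns as the header. These are assumptions for illustration, not documented product requirements, and the function name is hypothetical.

```python
import csv
import io

def check_csv(text):
    """Return the header and the record numbers whose column count differs.

    Assumes the first record is a header row (an assumption for this sketch).
    Record numbers start at 1 for the header, so data records start at 2.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if not header:
        raise ValueError("CSV has no header row")
    bad_records = [
        number
        for number, row in enumerate(reader, start=2)
        if len(row) != len(header)
    ]
    return header, bad_records

if __name__ == "__main__":
    sample = "title,body,date\nA,some text,2020-01-01\nB,missing date\n"
    header, bad_records = check_csv(sample)
    print("columns:", header)
    print("records with a different column count:", bad_records)
```

The header names that such a check reports are also the names you would expect to choose from when you configure the collection fields.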
Configure collection fields
Select the title, body, and timestamp fields, which are typically used by applications, and the metadata fields to initially configure this collection. For advanced use cases, you can further configure the fields after you create the collection.
You can configure the following fields.
- Body field
- Specifies unstructured text content data to be analyzed. For an analytics collection, the enrichment process enriches this field in order to analyze documents in later stages. For a search collection, the field is tokenized for better search precision.
- Title field
- Specifies the document title. Document titles are used in various ways in IBM Watson Explorer Content Miner. For example, the Documents view has a Title column. In both analytics and search collections, this field is tokenized for better search precision.
- Date field
- Specifies the document date. The document date is used in the Documents view as the DATE column, and is also used in time-series-based analytics views such as the Time Series, Topic, and Trends views.
- Metadata Facets
- Select fields you want to use as facets for your analysis. You cannot select the body field or the title field. Fields selected here are treated as facet values and will be displayed in the Facet tree. You can use these facet values in Watson Explorer Content Miner analysis views. This is a very important step because Watson Explorer Content Miner requires facets for text analytics processes.
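The field choices described above can be written down and checked before you enter them in the wizard. The following Python sketch is a hypothetical model, not a product API: the class, its attribute names, and the example field names are all assumptions. It encodes the one rule stated above, that the body field and the title field cannot be selected as metadata facets.

```python
from dataclasses import dataclass, field

@dataclass
class FieldConfig:
    """Hypothetical record of the field choices made in the wizard."""
    body: str
    title: str
    date: str
    metadata_facets: list = field(default_factory=list)

    def validate(self, available_fields):
        """Return a list of problems; an empty list means no problem found."""
        errors = []
        roles = (("body", self.body), ("title", self.title), ("date", self.date))
        for role, name in roles:
            if name not in available_fields:
                errors.append(f"{role} field '{name}' is not in the dataset")
        for name in self.metadata_facets:
            if name in (self.body, self.title):
                # The body and title fields cannot be metadata facets.
                errors.append(f"'{name}' is the body or title field")
            elif name not in available_fields:
                errors.append(f"facet field '{name}' is not in the dataset")
        return errors

if __name__ == "__main__":
    config = FieldConfig(
        body="text",
        title="subject",
        date="created",
        metadata_facets=["product", "subject"],
    )
    for problem in config.validate({"text", "subject", "created", "product"}):
        print(problem)
```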
Enrich your collection
This step does not apply to search collections.
Enrichment is the process of generating annotations from unstructured text content. Only existing annotations are listed here, but you can create and apply more later. Enrichments selected here are applied to analyzable text fields (the body and title fields in typical collections).
- Annotators
- Select annotators to be enabled for this collection. Selected annotators enrich the body text content. The Part of Speech annotator is selected by default. For more information, see Annotators.
- Classifiers
- Select classifier modules to be enabled for this collection. Selected classifiers are used to classify results into categories. For more information, see Classifiers.
- Language identification
- Specify how the language of the text content is determined for the enrichment process. Choose automatic detection or a specific language. The following languages are supported.
- Arabic, Czech, Danish, German, English, Spanish, French, Hebrew, Italian, Japanese, Korean, Dutch, Polish, Portuguese, Romanian, Russian, Slovak, Turkish, Chinese
Specify the facets for analysis
A facet is a unit of analysis. You analyze the unstructured content with facets and various statistics. Specifying meaningful labels for each facet is very important for a successful analysis.
You can check and confirm the available facets that were produced by selected annotators, classifiers and metadata fields in previous steps. You can modify these facets.
You can specify the default visualization for each facet.

You can specify a range rule for facets that are of type Long, Double, and Date. Click Edit next to the facet and modify the range rule. For further details, see Interval Faceting.
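To illustrate what a range rule does conceptually, the following Python sketch groups numeric values into intervals and counts the documents in each one. It is a generic illustration of interval bucketing, not the product's range rule syntax; the function names, the boundary values, and the label format are assumptions.

```python
from bisect import bisect_right
from collections import Counter

def range_label(value, boundaries):
    """Return the label of the interval that contains value.

    boundaries must be sorted in ascending order. Each interval includes
    its lower bound and excludes its upper bound.
    """
    index = bisect_right(boundaries, value)
    if index == 0:
        return f"< {boundaries[0]}"
    if index == len(boundaries):
        return f">= {boundaries[-1]}"
    return f"{boundaries[index - 1]} to {boundaries[index]}"

def count_by_range(values, boundaries):
    """Count how many values fall into each interval."""
    return Counter(range_label(value, boundaries) for value in values)

if __name__ == "__main__":
    prices = [5, 12, 99, 100, 250, -3]
    for label, count in count_by_range(prices, [0, 10, 100]).items():
        print(label, count)
```

The same idea applies to Date facets, with dates in place of numbers as the boundaries.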
Save your collection
You can select Enable Domain Adaptation Curator to facilitate natural language processing in Watson Explorer Content Miner. For more information, see Domain Adaptation Curator.
You can choose what occurs after you save your collection. There are three choices if you created your collection in Watson Explorer Content Miner but only one choice if you created it in Watson Explorer Admin Console.
- Run indexing now
- The indexing process starts soon after the collection is created. Available in Watson Explorer Content Miner and Watson Explorer Admin Console.
- Open the collection to configure advanced options
- Watson Explorer Content Miner opens the edit page for the collection so that you can review or change the collection configuration. Available in Watson Explorer Content Miner only.
To run indexing, select Start Index in the collection card of Watson Explorer Content Miner or the collections page of Watson Explorer Admin Console.
- Do nothing
- The indexing process does not start after the collection is created. Available in Watson Explorer Content Miner only.
To run indexing, select Start Index in the collection card of Watson Explorer Content Miner or the collections page of Watson Explorer Admin Console.
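The availability of the three choices can be summarized as a small lookup table. The following Python sketch only restates the descriptions above; the names SAVE_OPTIONS, options_for, and needs_manual_start are hypothetical and are not part of any Watson Explorer API.

```python
# Which save options each tool offers, as described above.
SAVE_OPTIONS = {
    "Run indexing now": {"Content Miner", "Admin Console"},
    "Open the collection to configure advanced options": {"Content Miner"},
    "Do nothing": {"Content Miner"},
}

def options_for(tool):
    """Return the save options that the given tool offers, in listed order."""
    return [name for name, tools in SAVE_OPTIONS.items() if tool in tools]

def needs_manual_start(option):
    """Return True if indexing must be started later with Start Index."""
    if option not in SAVE_OPTIONS:
        raise ValueError(f"unknown save option: {option}")
    return option != "Run indexing now"

if __name__ == "__main__":
    for tool in ("Content Miner", "Admin Console"):
        print(tool, options_for(tool))
```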