The Categories tab

In the Text Analytics Workbench, you can use the Categories tab to create and explore categories as well as tweak the extraction results.

You can also explore the pattern results and use them as category descriptors. In fact, using the Text Mining node to extract text link analysis (TLA) results is a great way to explore and fine-tune templates for your data before you use them directly in the TLA node.

After extracting the concepts and types from your text data, you can begin building categories. You can build categories automatically by using the product's automated techniques, such as semantic networks and concept inclusion, or you can create categories manually by using additional insight that you have about the data. You can also combine both approaches, or load a set of prebuilt categories from a text analysis package.

Note: Not all automatic techniques are available for all languages.

You can refine the extraction results by modifying the linguistic resources directly from the Categories tab.

Categories
A category is a group of closely related ideas and patterns to which documents and records are assigned through a scoring process. Categories organize related concepts and patterns into larger groupings that are easier to work with. A category can combine concepts, types, patterns, rules, and text links.
Descriptors
Descriptors identify whether a record or document belongs in a given category. Every category is made up of a set of descriptors, such as concepts, types, and rules. When some or all of the text in a document or record matches a descriptor, the document or record is matched to the category.
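
The descriptor matching described above can be illustrated with a minimal sketch. This is not the product's scoring process: it assumes that every descriptor is a single lowercase word and that a record matches a category when at least one descriptor appears in its text. The Category class, the match_categories function, and the sample data are hypothetical names introduced only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Category:
    """Hypothetical category: a name plus a set of single-word descriptors."""
    name: str
    descriptors: set[str] = field(default_factory=set)


def match_categories(text: str, categories: list[Category]) -> list[str]:
    """Return the names of categories that have at least one descriptor in text."""
    # Naive tokenization (an assumption): split on whitespace, strip
    # surrounding punctuation, and lowercase each word.
    tokens = {word.strip(".,;:!?\"'()").lower() for word in text.split()}
    matched = []
    for category in categories:
        # A record matches when any descriptor occurs among its tokens.
        if tokens & {d.lower() for d in category.descriptors}:
            matched.append(category.name)
    return matched


# Hypothetical sample categories and record text.
categories = [
    Category("Pricing", {"price", "cost", "expensive"}),
    Category("Support", {"helpdesk", "agent", "support"}),
]

print(match_categories("The price was fair, but the agent was slow.", categories))
# prints ['Pricing', 'Support']
```

A single record can match more than one category, as the sample record does here. Real descriptors can also be types, patterns, and rules, which this sketch does not model.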

Figure 1. Categories tab

Categories pane

You can manage any categories that you build in the Categories pane. You can create or refine categories manually only through the interactive workbench. Select a row in the pane to display information about the corresponding documents, records, or descriptors.

With a category selected, you can change its settings by selecting Build > Change settings from the toolbar. For more information, see Setting options.

Preview pane

When you select a row, the Preview pane shows the text from the documents or records that contain the selected concept. The matching concepts are highlighted so that you can easily identify them in the text.

Descriptors pane

The Descriptors pane shows a list of concepts, types, type patterns, and concept patterns. You can also see if any of these descriptors are currently part of a category.

Searching the Categories tab

To locate information quickly in a particular section:

  1. Click the Find icon on the Categories tab to display the search field.
  2. Type the word string you want to search for. You can use the up and down arrow buttons to control the direction of your search. If a match is found, the text is highlighted.
  3. To look for the next match, click the arrow button again.

Custom category sets

You can download a category set as an .xlsx file. You can customize the category set and then reuse it by uploading the .xlsx file from the Categories tab.