Strategies for Creating Categories

The following list of strategies is by no means exhaustive, but it offers some ideas on how to approach building your categories.

  • When you define the Text Mining node, select a category set from a text analysis package (TAP) so that you begin your analysis with some prebuilt categories. These categories may sufficiently categorize your text right from the start. However, if you want to add more categories, you can edit the Build Categories settings (Categories > Build Settings). Open the Advanced Settings: Linguistics dialog, choose the Category input option Unused extraction results, and build the additional categories.
  • When you define the node, select a category set from a TAP in the Categories and Concepts view in the Interactive Workbench. Next, drag and drop unused concepts or patterns into the categories as you deem appropriate. Then, extend the existing categories you've just edited (Categories > Extend Categories) to obtain more descriptors that are related to the existing category descriptors.
  • Build categories automatically using the advanced linguistic settings (Categories > Build Categories). Then, refine the categories manually by deleting descriptors, deleting categories, or merging similar categories until you are satisfied with the result. Additionally, if you originally built categories without using the Generalize with wildcards where possible option, you can also try to simplify the categories automatically by using Extend Categories with the Generalize option.
  • Import a predefined category file with very descriptive category names and/or annotations. Additionally, if you originally imported without choosing the option to import or generate descriptors from category names, you can later use the Extend Categories dialog and choose the Extend empty categories with descriptors generated from the category name option. Then, extend those categories a second time, but use the grouping techniques this time.
  • Manually create a first set of categories by sorting concepts or concept patterns by frequency and then dragging and dropping the most interesting ones to the Categories pane. Once you have that initial set of categories, use the Extend feature (Categories > Extend Categories) to expand and refine all of the selected categories so they'll include other related descriptors and thereby match more records.
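
The last strategy above (seed categories from the most frequent concepts, then extend them) can be illustrated outside the product with a small Python sketch. This is not the product's linguistic algorithm: the function names, the sample records, and the use of simple co-occurrence counts as a stand-in for "related descriptors" are all assumptions made for illustration.

```python
from collections import Counter

def seed_categories(records, top_n=2):
    """Create one category per most frequent concept (counted once per record)."""
    freq = Counter(c for concepts in records for c in set(concepts))
    # Sort by descending frequency, then alphabetically for a stable order.
    ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return {concept: {concept} for concept, _ in ranked[:top_n]}

def extend_categories(categories, records, min_cooccurrence=2):
    """Add concepts that co-occur with a category's descriptors often enough.

    Co-occurrence is only a hypothetical proxy for relatedness.
    """
    extended = {}
    for name, descriptors in categories.items():
        co = Counter()
        for concepts in records:
            present = set(concepts)
            if present & descriptors:
                co.update(present - descriptors)
        related = {c for c, n in co.items() if n >= min_cooccurrence}
        extended[name] = descriptors | related
    return extended

def matching_records(descriptors, records):
    """Return indexes of records containing at least one descriptor."""
    return [i for i, concepts in enumerate(records) if set(concepts) & descriptors]

# Hypothetical extraction results: one list of concepts per record.
records = [
    ["price", "expensive"],
    ["price", "expensive"],
    ["price", "service"],
    ["price"],
    ["service", "staff"],
    ["service", "staff"],
    ["expensive"],
    ["staff"],
    ["service"],
]

seeds = seed_categories(records, top_n=2)
extended = extend_categories(seeds, records)
before = matching_records(seeds["price"], records)
after = matching_records(extended["price"], records)
```

In this sample, extending the seed categories adds one co-occurring descriptor to each, so the extended "price" category matches one more record than the seed did. That mirrors the purpose of extending categories described above: more related descriptors and therefore more matched records.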

After applying these techniques, we recommend that you review the resulting categories and use manual techniques to make minor adjustments, remove any misclassifications, or add records or words that may have been missed. Additionally, since using different techniques may produce redundant categories, you could also merge or delete categories as needed. See the topic Editing and Refining Categories for more information.
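
As a rough illustration of spotting redundant categories before merging them, the following Python sketch compares categories by the overlap of their descriptors. The Jaccard measure, the 0.5 threshold, and the category names are assumptions chosen for this example; the product does not necessarily identify redundancy this way, and the final decision to merge or delete remains a manual review step.

```python
from itertools import combinations

def jaccard(a, b):
    """Share of descriptors two categories have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def redundant_pairs(categories, threshold=0.5):
    """List category pairs whose descriptor overlap meets the threshold."""
    pairs = []
    items = sorted(categories.items(), key=lambda kv: kv[0])
    for (name_a, desc_a), (name_b, desc_b) in combinations(items, 2):
        score = jaccard(desc_a, desc_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    return pairs

def merge(categories, keep, drop):
    """Return a new mapping where `drop` is folded into `keep`."""
    merged = {k: set(v) for k, v in categories.items() if k != drop}
    merged[keep] = merged[keep] | categories[drop]
    return merged

# Hypothetical categories produced by two different techniques.
categories = {
    "cost": {"price", "expensive", "fee"},
    "pricing": {"price", "expensive", "cheap"},
    "staff": {"staff", "service"},
}

candidates = redundant_pairs(categories)
merged = merge(categories, keep="cost", drop="pricing")
```

Here "cost" and "pricing" share two of four distinct descriptors, so they are flagged as merge candidates, while "staff" is left alone.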