Adding terms

The library tree pane displays libraries and can be expanded to show the type dictionaries that they contain. In the center pane, a term list displays the terms in the selected library or type dictionary, depending on the selection in the tree.

In the Resource Editor, you can add terms to a type dictionary directly in the term pane or through the Add New Terms dialog box. The terms that you add can be single words or compound words. A blank row always appears at the top of the list so that you can add a new term.

Note: These instructions show you how to make changes within the Resource Editor view or the Template Editor. Keep in mind that you can also do this kind of fine-tuning directly from the Extraction Results pane, Data pane, Categories pane, or Cluster Definitions dialog box in the other views. See the topic Refining extraction results for more information.

Term column

In this column, enter single or compound words into the cell. The color in which the term appears depends on the color for the type in which the term is stored or forced. You can change type colors in the Type Properties dialog box. See the topic Creating types for more information.

Force column

In this column, place a pushpin icon in the cell to tell the extraction engine to ignore any other occurrences of this same term in other libraries. See the topic Forcing terms for more information.

Match column

In this column, select a match option to instruct the extraction engine how to match this term to text data. See Table 1 for examples. The default value comes from the type properties, which you can edit. See the topic Creating types for more information. To change the option from the menus, choose Edit > Change Match. The basic match options are listed below; combinations of these are also possible:

  • Start. If the term in the dictionary matches the first word in a concept extracted from the text, this type is assigned. For example, if you enter apple, apple tart will be matched.
  • End. If the term in the dictionary matches the last word in a concept extracted from the text, this type is assigned. For example, if you enter apple, cider apple will be matched.
  • Any. If the term in the dictionary matches any word of a concept extracted from the text, this type is assigned. For example, if you enter apple, the Any option will type apple tart, cider apple, and cider apple tart the same way.
  • Entire Term. If the entire concept extracted from the text matches the exact term in the dictionary, this type is assigned. Adding a term as Entire Term, Entire and Start, Entire and End, Entire and Any, or Entire (no compounds) will force the extraction of a term.

    Furthermore, since the <Person> type extracts only two-part names, such as edith piaf or mohandas gandhi, you may want to explicitly add first names to this type dictionary if you are trying to extract a first name when no last name is mentioned. For example, if you want to catch all instances of edith as a name, you should add edith to the <Person> type using Entire Term or Entire and Start.

  • Entire (no compounds). If the entire concept extracted from the text matches the exact term in the dictionary, this type is assigned and the extraction is stopped to prevent the term from being matched to a longer compound. For example, if you enter apple, the Entire (no compounds) option will type apple and not extract the compound apple sauce unless it is forced elsewhere.
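
The match options above can be read as simple predicates over the words of an extracted concept. The following Python sketch is only an illustration of that reading, not the extraction engine's actual logic. The function name matches is hypothetical, the term is assumed to be a single word, and Start, End, and Any are assumed to apply only to compounds (which is why combined options such as Entire and Start exist).

```python
def matches(term: str, concept: str, option: str) -> bool:
    """Return True if a single-word `term` would type `concept` under `option`.

    Hypothetical helper for illustration; not part of the product.
    """
    words = concept.split()
    is_compound = len(words) > 1
    entire = concept == term
    # Assumption: Start, End, and Any on their own apply only to compounds.
    start = is_compound and words[0] == term
    end = is_compound and words[-1] == term
    any_word = is_compound and term in words
    rules = {
        "Entire Term": entire,
        "Start": start,
        "End": end,
        "Start or End": start or end,
        "Entire and Start": entire or start,
        "Entire and End": entire or end,
        "Entire and (Start or End)": entire or start or end,
        "Any": any_word,
        "Entire and Any": entire or any_word,
        # The sketch only types the exact term; it does not model the engine
        # suppressing extraction of longer compounds.
        "Entire (no compounds)": entire,
    }
    return rules[option]


if __name__ == "__main__":
    for concept in ("apple tart", "cider apple", "cider apple tart"):
        print(concept, matches("apple", concept, "Any"))
```

With the term apple, the Any option accepts all three concepts in the loop, while Start accepts only apple tart and End accepts only cider apple, as in the examples above.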

In the following table, assume that the term apple is in a type dictionary. Depending on the match option, this table shows which concepts would be extracted and typed if they were found in the text.

Table 1. Matching Examples
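Match options for the term apple, and the concepts extracted and typed for each option (yes = extracted and typed, - = not typed):

Match option               apple  apple tart       ripe apple       homemade apple tart
Entire Term                yes    -                -                -
Start                      -      yes              -                -
End                        -      -                yes              -
Start or End               -      yes              yes              -
Entire and Start           yes    yes              -                -
Entire and End             yes    -                yes              -
Entire and (Start or End)  yes    yes              yes              -
Any                        -      yes              yes              yes
Entire and Any             yes    yes              yes              yes
Entire (no compounds)      yes    never extracted  never extracted  never extracted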

Inflect column

In this column, select whether the extraction engine should generate inflected forms of this term during extraction so that they are all grouped together. The default value for this column is defined in the Type Properties dialog box, but you can change this option on a case-by-case basis directly in the column. To change it from the menus, choose Edit > Change Inflection.

Note: This technique does not work effectively with text data written in Japanese. Written Japanese relies on context for grammatical functions like number and gender, so words often have the same form despite different uses.
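
As a rough illustration of what generating inflected forms means for an English term, the Python sketch below produces a naive plural for the last word of a term so that both forms can be grouped together. The function name inflected_forms and the pluralization rules are assumptions made for this example; the extraction engine's own inflection rules are not described here and are certainly richer.

```python
def inflected_forms(term: str, inflect: bool = True) -> set[str]:
    """Return the term plus a naive English plural of its last word.

    Hypothetical helper for illustration; the rules are deliberately simplistic.
    """
    forms = {term}
    if not inflect or not term.strip():
        return forms
    *head, last = term.split()
    if last.endswith(("s", "x", "z", "ch", "sh")):
        plural = last + "es"
    elif last.endswith("y") and len(last) > 1 and last[-2] not in "aeiou":
        plural = last[:-1] + "ies"
    else:
        plural = last + "s"
    forms.add(" ".join(head + [plural]))
    return forms


if __name__ == "__main__":
    print(sorted(inflected_forms("apple tart")))
    print(sorted(inflected_forms("apple", inflect=False)))
```

When the inflect flag is off, only the term as entered is returned, which mirrors leaving the Inflect option unselected for that term.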

Type column

In this column, select a type dictionary from the drop-down list. The list of types is filtered according to your selection in the library tree pane. The first type in the list is always the default type selected in the library tree pane. To change the type from the menus, choose Edit > Change Type.

Library column

This column shows the library in which your term is stored. You can drag and drop a term into another type in the library tree pane to change its library.

To add a single term to a type dictionary

  1. In the library tree pane, select the type dictionary to which you want to add the term.
  2. In the term list in the center pane, type your term in the first available empty cell and set any options you want for this term.

To add multiple terms to a type dictionary

  1. In the library tree pane, select the type dictionary to which you want to add terms.
  2. From the menus, choose Tools > New Terms. The Add New Terms dialog box opens.
  3. Enter the terms you want to add to the selected type dictionary by typing the terms or copying and pasting a set of terms. If you enter multiple terms, you must separate them using the delimiter that is defined in the Options dialog, or add each term on a new line. See the topic Setting Options for more information.
  4. Click OK to add the terms to the dictionary. The match option is automatically set to the default option for this type library. The dialog box closes and the new terms appear in the dictionary.
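
If you prepare a term list outside the product before pasting it into the Add New Terms dialog box, it can help to check how the list will split. The Python sketch below shows one way to do that, splitting on new lines and on a delimiter. The function name parse_terms and the default delimiter "," are assumptions for this example; the actual delimiter is whatever is defined in the Options dialog, and trimming of surrounding spaces is also an assumption.

```python
def parse_terms(pasted: str, delimiter: str = ",") -> list[str]:
    """Split pasted text into terms, one per line or per delimited piece.

    Hypothetical helper for illustration; not part of the product.
    """
    terms = []
    for line in pasted.splitlines():
        # An empty delimiter means each line holds exactly one term.
        pieces = line.split(delimiter) if delimiter else [line]
        for piece in pieces:
            term = piece.strip()
            if term:  # skip blank entries such as trailing delimiters
                terms.append(term)
    return terms


if __name__ == "__main__":
    print(parse_terms("apple, cider apple\napple tart"))
```

Compound terms such as cider apple stay intact because only the delimiter and line breaks separate terms, not spaces.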