Supported Elements for Rules and Macros

The following arguments are accepted for the value parameters in text link analysis rules and macros:

Macros

You can use a macro directly in a text link analysis rule or within another macro. If you are entering the macro name by hand or from within the source view (as opposed to selecting the macro name from a context menu), make sure to prefix the name with a dollar sign character ($), such as $mTopic. The macro name is case sensitive. You can choose from any macro defined in the current Text Link Rules tab when selecting macros through the context menus.

Types

You can use a type directly in a text link analysis rule or macro. If you are entering the type name by hand or in the source view (as opposed to selecting the type from a context menu), make sure to prefix the type name with a dollar sign character ($), such as $Person. The type name is case sensitive. If you use the context menus, you can choose from any type from the current set of resources being used.

If you reference an unrecognized type, you will receive a warning message, and the rule will have a warning icon in the Rules and Macro Tree until you correct it.

Literal Strings

To include information that was never extracted, you can define a literal string for which the extraction engine will search. All extracted words or phrases have been assigned to a type and for this reason, they cannot be used in literal strings. If you use a word that was extracted, it will be ignored, even if its type is <Unknown>.

A literal string can be one or more words. The following rules apply when defining a list of literal strings:

  • Enclose the list of strings in parentheses, such as (his).
  • Use single or compound words.
  • If there is a choice of literal strings, separate each string with the | character, which acts like a Boolean OR, such as (a|an|the) or (his|hers|its).
  • Enter both singular and plural forms if you want to match both. Inflection is not automatically generated.
  • Use lower case only.
  • To reuse literal strings, define them as a macro and then use that macro in your other macros and text link analysis rules.
  • If a string contains periods (full stops) or hyphens, you must include them. For example, to match a.k.a in the text, enter a.k.a, periods included, as the literal string.
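The rules above can be sketched as a small checker. This is only an illustration in Python, not part of the product: the function name parse_literal_list is hypothetical, and the checks cover just the parentheses, | separator, and lower-case rules from the list.

```python
def parse_literal_list(text):
    """Split a literal-string list such as "(a|an|the)" into its alternatives.

    Hypothetical helper: raises ValueError when the list breaks one of the
    rules described in this section.
    """
    if not (text.startswith("(") and text.endswith(")")):
        raise ValueError("the list must be enclosed in parentheses")
    alternatives = [alt.strip() for alt in text[1:-1].split("|")]
    for alt in alternatives:
        if not alt:
            raise ValueError("empty alternative between | characters")
        if alt != alt.lower():
            raise ValueError(f"use lower case only: {alt!r}")
    return alternatives


print(parse_literal_list("(his|hers|its)"))
print(parse_literal_list("(a.k.a)"))
```

Periods and hyphens are kept exactly as typed, and nothing is inflected, so singular and plural forms would each need their own alternative.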

Exclusion Operator

Use ! as an exclusion operator to prevent any expression that matches the negated element from occupying a particular slot. You can only add an exclusion operator by hand through inline cell editing (double-click the cell in the Rule Value table or Macro Value table) or in the source view. For example, if you add $mTopic @{0,2} !($Positive) $Budget to your text link analysis rule, you are looking for text that contains (1) a term assigned to any of the types in the mTopic macro, (2) a word gap of zero to two words, (3) no term assigned to the <Positive> type, and (4) a term assigned to the <Budget> type. This might capture "cars have an inflated price tag" but would ignore "store offers amazing discounts".

To use this operator in the Rule Value table or Macro Value table, double-click the element cell and type the exclamation point and parentheses manually.
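To make the source view syntax easier to read, the following Python sketch splits a rule value into its elements. It is not the product's parser: the ELEMENT pattern and the split_rule_value function are hypothetical, inferred only from the examples in this section, and the sketch assumes that names consist of letters, digits, and underscores and that an exclusion wraps a single $name.

```python
import re

# Hypothetical element grammar, inferred from the examples in this section.
ELEMENT = re.compile(
    r"""
      @\{(?P<min>\d+),(?P<max>\d+)\}     # word gap, e.g. @{0,2}
    | !\(\$(?P<excluded>\w+)\)           # exclusion, e.g. !($Positive)
    | \$(?P<name>\w+)                    # macro or type reference, e.g. $mTopic
    | \((?P<literal>[^()]+)\)            # literal list, e.g. (a|an|the)
    """,
    re.VERBOSE,
)


def split_rule_value(value):
    """Return the elements of a rule value as a list of tagged tuples."""
    elements, pos = [], 0
    while pos < len(value):
        if value[pos].isspace():
            pos += 1
            continue
        m = ELEMENT.match(value, pos)
        if m is None:
            raise ValueError(f"unrecognized element at position {pos}: {value[pos:]!r}")
        if m.group("min") is not None:
            elements.append(("gap", int(m.group("min")), int(m.group("max"))))
        elif m.group("excluded") is not None:
            elements.append(("exclusion", m.group("excluded")))
        elif m.group("name") is not None:
            elements.append(("reference", m.group("name")))
        else:
            elements.append(("literal", m.group("literal").split("|")))
        pos = m.end()
    return elements


for element in split_rule_value("$mTopic @{0,2} !($Positive) $Budget"):
    print(element)
```

The sketch only identifies the four slots; whether a $name refers to a macro or a type, and how the engine evaluates the exclusion, is left to the product.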

Word Gaps (<Any Token>)

A word gap, also referred to as <Any Token>, defines a numeric range of tokens that may be present between two elements. Word gaps are useful for matching similar phrases that differ only slightly because of additional determiners, prepositional phrases, adjectives, or other such words.

Table 1. Example of the elements in a Rule Value table without a word gap
#   Element
1   Unknown
2   mBeHave
3   Positive

Note: In the source view this value is defined as: $Unknown $mBeHave $Positive

This value will match sentences like “the hotel staff was nice”, where hotel staff belongs to type <Unknown>, was is under the macro mBeHave, and nice is <Positive>. But it will not match “the hotel staff was very nice”.

Table 2. Example of the elements in a Rule Value table with an <Any Token> word gap
#   Element
1   Unknown
2   mBeHave
3   <Any Token>
4   Positive

Note: In the source view this value is defined as: $Unknown $mBeHave @{0,1} $Positive

If you add a word gap to your rule value, it will match both “the hotel staff was nice” and “the hotel staff was very nice”.

In the source view or with inline editing, the syntax for a word gap is @{#,#}, where @ signifies a word gap and {#,#} defines the minimum and maximum number of words accepted between the preceding element and the following element. For example, @{1,3} means that a match can be made if at least one word but no more than three words appear between the two elements. @{0,3} means that a match can be made if zero, one, two, or three words appear between the two elements.
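The effect of a word gap on the hotel example can be illustrated with a minimal Python sketch. It is not the extraction engine: the matches function is hypothetical, tokens are given as (text, label) pairs whose labels are assumed to be already known, a gap is written as a (minimum, maximum) tuple, and a word with no relevant label is marked None.

```python
def matches(pattern, tokens):
    """Return True if the whole token sequence fits the pattern.

    pattern: list of labels (type or macro names) and (min, max) word gaps.
    tokens:  list of (text, label) pairs.
    """
    if not pattern:
        return not tokens
    head, rest = pattern[0], pattern[1:]
    if isinstance(head, tuple):
        # Word gap: try skipping every allowed number of tokens.
        low, high = head
        return any(
            matches(rest, tokens[skipped:])
            for skipped in range(low, min(high, len(tokens)) + 1)
        )
    # Ordinary element: the next token must carry the expected label.
    return bool(tokens) and tokens[0][1] == head and matches(rest, tokens[1:])


without_gap = ["Unknown", "mBeHave", "Positive"]       # $Unknown $mBeHave $Positive
with_gap = ["Unknown", "mBeHave", (0, 1), "Positive"]  # $Unknown $mBeHave @{0,1} $Positive

nice = [("hotel staff", "Unknown"), ("was", "mBeHave"), ("nice", "Positive")]
very_nice = [("hotel staff", "Unknown"), ("was", "mBeHave"),
             ("very", None), ("nice", "Positive")]

print(matches(without_gap, nice), matches(without_gap, very_nice))
print(matches(with_gap, nice), matches(with_gap, very_nice))
```

Under these assumptions the pattern without a gap fits only the first phrase, while the pattern with @{0,1} fits both, because the gap may absorb zero words or the single word very.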