Semantic Networks

In this release, the semantic networks technique is only available for English language text.

This technique builds categories using a built-in network of word relationships. For this reason, this technique can produce very good results when the terms are concrete and are not too ambiguous. However, you should not expect the technique to find many links between highly technical/specialized concepts. When dealing with such concepts, you may find the concept inclusion and concept root derivation techniques to be more useful.

How the Semantic Network Technique Works

The idea behind the semantic network technique is to leverage known word relationships to create categories of synonyms or hyponyms. A hyponym is a concept that is a sort of another concept, so that the two stand in a hierarchical relationship, also known as an ISA relationship. For example, if animal is a concept, then cat and kangaroo are hyponyms of animal since they are sorts of animals.

In addition to synonym and hyponym relationships, the semantic network technique also examines part and whole links between any concepts from the <Location> type. For example, the technique will group the concepts normandy, provence, and france into one category because Normandy and Provence are parts of France.

The semantic network technique begins by identifying the possible senses of each concept in the semantic network. When concepts are identified as synonyms or hyponyms, they are grouped into a single category. For example, the technique would create a single category containing these three concepts: eating apple, dessert apple, and granny smith, since the semantic network contains the information that: 1) dessert apple is a synonym of eating apple, and 2) granny smith is a sort of eating apple (meaning it is a hyponym of eating apple).
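The grouping step can be illustrated with a minimal Python sketch. It is not the product's implementation: the tiny network, the names SYNONYMS and HYPERNYM, and the function build_categories are all hypothetical, and the sketch assumes that any synonym or ISA link between two extracted concepts places them in the same category.

```python
# Toy word-relationship data (hypothetical, not the built-in network).
SYNONYMS = [("dessert apple", "eating apple")]
HYPERNYM = {"granny smith": "eating apple"}  # child -> parent (ISA link)


def build_categories(concepts, synonyms, hypernym):
    """Group extracted concepts that are linked by a synonym or ISA relation."""
    parent = {c: c for c in concepts}  # union-find forest over the concepts

    def find(c):
        # Follow parent pointers to the representative of c's group.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    def union(a, b):
        # Only link concepts that were actually extracted.
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    for a, b in synonyms:
        union(a, b)
    for child, par in hypernym.items():
        union(child, par)

    groups = {}
    for c in concepts:
        groups.setdefault(find(c), set()).add(c)
    # A concept with no links does not form a category on its own.
    return [g for g in groups.values() if len(g) > 1]


categories = build_categories(
    ["eating apple", "dessert apple", "granny smith", "keyboard"],
    SYNONYMS,
    HYPERNYM,
)
```

Here the three apple concepts end up in one category, while keyboard, which has no link in the toy network, is left ungrouped.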

Taken individually, many concepts, especially uniterms, are ambiguous. For example, the concept buffet can denote a sort of meal or a piece of furniture. If the set of concepts includes meal, furniture and buffet, then the algorithm is forced to choose between grouping buffet with meal or with furniture. Be aware that in some cases the choices made by the algorithm may not be appropriate in the context of a particular set of records or documents.
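The ambiguity can be made concrete with a small Python sketch. The SENSES table and the candidate_parents function are hypothetical. The sketch only detects that a concept has more than one possible grouping; it does not choose between them, because the rule the technique uses to make that choice is not described here.

```python
# Hypothetical sense inventory: each concept maps to the broader
# concept implied by each of its senses.
SENSES = {"buffet": ["meal", "furniture"]}


def candidate_parents(concept, extracted, senses=SENSES):
    """Return the broader concepts that the given concept could be grouped with."""
    return [p for p in senses.get(concept, []) if p in extracted]


extracted = {"meal", "furniture", "buffet"}
candidates = candidate_parents("buffet", extracted)
is_ambiguous = len(candidates) > 1  # more than one grouping is possible
```

When both meal and furniture are present in the extracted concepts, buffet has two candidate groupings and only one of them can be kept.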

The semantic network technique can outperform concept inclusion with certain types of data. While both the semantic network and concept inclusion recognize that apple pie is a sort of pie, only the semantic network recognizes that tart is also a sort of pie.

The semantic network technique works in conjunction with the other techniques. For example, suppose that you have selected both the semantic network and inclusion techniques and that the semantic network has grouped the concept teacher with the concept tutor (because a tutor is a kind of teacher). The inclusion algorithm can group the concept graduate tutor with tutor and, as a result, the two algorithms collaborate to produce an output category containing all three concepts: tutor, graduate tutor, and teacher.
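The following Python sketch shows one way the two kinds of links could combine. It is an illustration under stated assumptions, not the product's algorithm: inclusion is approximated as "every word of the shorter concept appears in the longer one", the semantic link is supplied by hand, and the names inclusion_links and merge are hypothetical.

```python
def inclusion_links(concepts):
    """Link a concept to any longer concept that contains all of its words."""
    links = []
    for shorter in concepts:
        for longer in concepts:
            if shorter != longer and set(shorter.split()) < set(longer.split()):
                links.append((shorter, longer))
    return links


def merge(concepts, links):
    """Merge concepts into categories; every link joins two groups."""
    group_of = {c: {c} for c in concepts}
    for a, b in links:
        merged = group_of[a] | group_of[b]
        for c in merged:
            group_of[c] = merged
    unique = {frozenset(g) for g in group_of.values()}
    return [set(g) for g in unique if len(g) > 1]


concepts = ["teacher", "tutor", "graduate tutor"]
# Assumed semantic network link: a tutor is a kind of teacher.
semantic_links = [("tutor", "teacher")]
categories = merge(concepts, semantic_links + inclusion_links(concepts))
```

Neither set of links alone covers all three concepts. The semantic link joins tutor and teacher, the inclusion link joins tutor and graduate tutor, and merging both yields a single category.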

Options for Semantic Network

There are a number of additional settings that might be of interest with this technique.

Change the Maximum search distance. Select how far you want the technique to search before producing categories. The lower the value, the fewer results are produced; however, these results will be less noisy and are more likely to be significantly linked or associated with each other. The higher the value, the more results you will get; however, these results may be less reliable or relevant.

For example, depending on the distance, the algorithm searches from Danish pastry up to coffee roll (its parent), then bun (its grandparent), and on upwards to bread.

By reducing the search distance, this technique produces smaller categories that might be easier to work with if you feel that the categories being produced are too large or group too many things together.
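The effect of the search distance can be sketched in Python. This sketch assumes that one unit of distance corresponds to one parent link, which the text above does not state; the HYPERNYM chain and the ancestors_within function are hypothetical.

```python
# Hypothetical ISA chain taken from the example above (child -> parent).
HYPERNYM = {
    "danish pastry": "coffee roll",
    "coffee roll": "bun",
    "bun": "bread",
}


def ancestors_within(concept, max_distance, hypernym=HYPERNYM):
    """Walk up the ISA chain, stopping after max_distance parent links."""
    found = []
    current = concept
    for _ in range(max_distance):
        current = hypernym.get(current)
        if current is None:  # reached the top of the chain
            break
        found.append(current)
    return found


near = ancestors_within("danish pastry", 1)
far = ancestors_within("danish pastry", 3)
```

With a distance of 1, only coffee roll is reached, so danish pastry can be grouped only with its direct parent. With a distance of 3, the search reaches bread, which allows a broader and potentially noisier category.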

Important! We recommend that you do not apply the fuzzy grouping option Accommodate spelling errors for a minimum root character limit of (defined on the Expert tab of the node or in the Extract dialog box) when using this technique, because some false groupings can have a large negative impact on the results.