XML schema concepts

Glossary assets can be defined in the XML schema.

The XML schema identifies glossary assets in one of two ways: with a repository identifier (RID) or with an identity.

A repository identifier (RID) is a generated string that uniquely identifies a category, term, information governance policy, or information governance rule. When you export glossary assets to an XML file, the file includes a RID for each category, term, information governance policy, and information governance rule.

For categories and terms, the identity consists of the name of the category or term and its full context, that is, the path from the top-level category. The context of each category and term is defined in the XML file.

For information governance policies, the identity consists of the policy name and, if the policy is a subpolicy, its full context, that is, the path from the top-level policy. For information governance rules, the identity consists of the rule name.
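As a rough illustration of how an identity is composed, the following sketch joins the context (the path from the top-level category or policy) with the asset name. The `::` separator is inferred from the `Category2::Term2` value that appears in the XML example in this topic. The function name `build_identity` is hypothetical and is not part of the import utility.

```python
def build_identity(path, name):
    """Return an identity string for a glossary asset.

    path: names from the top-level category (or policy) down to the parent.
    name: the name of the asset itself.
    Assumption: '::' separates the segments, as in 'Category2::Term2'.
    """
    return "::".join([*path, name])


# A term named Term2 whose parent is the top-level category Category2.
print(build_identity(["Category2"], "Term2"))  # Category2::Term2

# A rule has no context, so its identity is just the rule name.
print(build_identity([], "Rule1"))  # Rule1
```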

When you import an XML file into an existing catalog, a process of reconciliation occurs between what is defined in the file that is imported and what exists in the catalog.

Reconciliation

Reconciliation refers to the process of determining the content differences between a file being imported and an existing catalog. In trying to merge the new content with the existing content, the import utility first searches for a RID in the existing catalog that matches the RID in the imported file. If it does not find a matching RID, it then searches for a matching identity.
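The lookup order described above (RID first, then identity) can be sketched as follows. This is a minimal illustration, not the import utility's implementation. The `Asset` class and the `reconcile` function are hypothetical names, and the catalog is modeled as a simple list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Asset:
    rid: str
    identity: str


def reconcile(imported: Asset, catalog: List[Asset]) -> Optional[Asset]:
    """Return the catalog asset that matches the imported one, or None."""
    # First pass: look for a matching RID.
    for existing in catalog:
        if existing.rid == imported.rid:
            return existing
    # Second pass: no RID matched, so fall back to the identity.
    for existing in catalog:
        if existing.identity == imported.identity:
            return existing
    return None


catalog = [Asset("asdf", "Category2::Term2"), Asset("reww", "Category1")]

# The RID matches, so the differing identity is never consulted.
print(reconcile(Asset("asdf", "Renamed::Term2"), catalog))
# The RID is unknown, but the identity matches.
print(reconcile(Asset("zzzz", "Category1"), catalog))
```

Matching on the RID first means that an asset that was renamed or moved since the export can still be found, because only its identity has changed.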

Reconciliation is used for terms, categories, information governance policies, information governance rules, and relationships. For example, consider this snippet from an XML file:
<category name="Category1" rid="reww">
  <referencedTerms>
    <termRef identity="Category2::Term2" rid="asdf"/>
  </referencedTerms>
</category>
In this case, the reconciliation process is used for two tasks: to look for the category Category1 in the existing catalog and to look for the referenced term Term2, with a parent category Category2, in the existing catalog.
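To make the two lookups concrete, the following sketch parses the snippet with the Python standard library and lists each item that would need to be reconciled. The function name `lookups_needed` is hypothetical, and the sketch assumes that the snippet is a complete, well-formed fragment with no XML namespace.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<category name="Category1" rid="reww">
  <referencedTerms>
    <termRef identity="Category2::Term2" rid="asdf"/>
  </referencedTerms>
</category>
"""


def lookups_needed(xml_text):
    """Return (kind, name_or_identity, rid) for each item to reconcile."""
    root = ET.fromstring(xml_text.strip())
    # The category itself is looked up by its RID, then by its identity.
    lookups = [("category", root.get("name"), root.get("rid"))]
    # Each referenced term is looked up in the same way.
    for ref in root.iter("termRef"):
        lookups.append(("termRef", ref.get("identity"), ref.get("rid")))
    return lookups


for item in lookups_needed(SNIPPET):
    print(item)
```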

Merging

The reconciliation process determines whether there are content differences between the file being imported and the existing catalog. After reconciliation, the content in the file and the content in the existing catalog are combined, or merged. Merging is the process of selecting which assets and which asset properties to use in the updated catalog when the file and the existing catalog differ, and then combining them. In the example introduced earlier, if the reconciliation process does not find the category Category1 in the existing catalog, then Category1 is added to the existing catalog during the import process.

You can choose from several merge options when you import a file. The merge options determine how conflicts between information in the import file and the existing catalog are resolved for any glossary assets that exist in both places at the time of the import.
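This topic does not list the merge options, so the following sketch uses two illustrative stand-ins, `"file"` and `"catalog"`, to show the general idea of resolving a conflict for an asset that exists in both places. The function `merge_properties` and both option names are hypothetical and do not correspond to the actual options of the import utility.

```python
def merge_properties(catalog_props, file_props, prefer="file"):
    """Combine the properties of one asset that exists in both places.

    prefer: hypothetical switch that decides which side wins a conflict.
            'file' keeps the imported value, 'catalog' keeps the existing one.
    """
    if prefer not in ("file", "catalog"):
        raise ValueError("prefer must be 'file' or 'catalog'")
    merged = dict(catalog_props)
    for key, value in file_props.items():
        # Properties that exist only in the file are always added.
        # Conflicting properties are overwritten only when the file wins.
        if key not in merged or prefer == "file":
            merged[key] = value
    return merged


existing = {"name": "Term2", "description": "Old text"}
incoming = {"description": "New text", "abbreviation": "T2"}

print(merge_properties(existing, incoming, prefer="file"))
print(merge_properties(existing, incoming, prefer="catalog"))
```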

Relationships

You can define certain relationships between glossary assets. For example, one term can be a related term to another term, or an information governance policy can reference a rule. In addition, the assignment of a steward to a term or category constitutes a relationship, and terms that are synonyms of one another also have a relationship.

If a relationship in the file that is being imported refers to an asset by a RID or identity that is not found in the existing catalog, then the relationship is dropped. That is, even if such a relationship existed in the catalog before the import, it no longer exists after the import.

Using the example introduced earlier, if the reconciliation process does not find the referenced term Term2, with a parent category Category2, in the existing catalog, then the referenced term relationship between Category1 and Term2 is dropped.
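The dropping behavior can be sketched as a filter over the references in the file. This is an illustration under simplifying assumptions: the catalog is modeled as two sets of known RIDs and identities, and the function name `resolve_references` is hypothetical.

```python
def resolve_references(references, catalog_rids, catalog_identities):
    """Split references into those that resolve and those that are dropped.

    references: dicts with 'rid' and 'identity' keys, as read from the file.
    """
    kept, dropped = [], []
    for ref in references:
        # A reference resolves if its RID or, failing that, its identity
        # is found in the existing catalog.
        found = (ref.get("rid") in catalog_rids
                 or ref.get("identity") in catalog_identities)
        (kept if found else dropped).append(ref)
    return kept, dropped


refs = [{"identity": "Category2::Term2", "rid": "asdf"}]

# Term2 is not in the catalog, so the referenced term relationship is dropped.
kept, dropped = resolve_references(refs, {"reww"}, {"Category1"})
print(kept, dropped)
```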