Unstructured data sources

Unstructured data sources are information assets that are governed by IBM® StoredIQ®. Asset types include instances, infosets, volumes, and filters.

Unstructured data sources hold data such as email messages, word-processing documents, audio or video files, content from collaboration software, and instant messages. Together with structured data, they give a full picture of the data in the enterprise.

An instance server contains infosets, volumes, and filters. Data in infosets is already classified and comes from volumes. Filters can be applied to infosets and volumes to extract specific data.

To display unstructured data source assets in Information Governance Catalog, a StoredIQ administrator must set up the initial synchronization. Additionally, event notification must be enabled in the Information Governance Catalog Administration pane. When these steps are completed, the assets are synchronized automatically between the products. For more information, see the Synchronizing data topic in the StoredIQ documentation.

Asset types

The following table lists and defines the types of unstructured data sources that can be stored in the metadata repository. Each asset has a unique identity that is determined by the identity components.
Table 1. Unstructured data sources
| Asset type | Definition | Components of the identity of the asset | Contained asset types |
| Instance | A server instance that governs unstructured data sources. | StoredIQ Identifier | Infosets, Volumes, Filters |
| InfoSet | An infoset, which is a collection of specific data. | StoredIQ Identifier | |
| Volume | A volume, which is a data source or destination that is available on the network. | StoredIQ Identifier | |
| Filter | A filter, which is a set of one or more attributes that can be applied to an infoset to create a refined subset of data. | StoredIQ Identifier | |
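
The containment and identity rules in Table 1 can be pictured with a small data model. The following Python sketch is illustrative only and is not an Information Governance Catalog or StoredIQ API: the class names, the `storediq_identifier` field, the `contained_assets` helper, and the sample identifier strings are all hypothetical. It assumes that the StoredIQ identifier alone determines the identity of an asset within its type, and that only an instance contains other assets.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Asset:
    """Hypothetical base type: identity is the StoredIQ identifier only."""
    storediq_identifier: str


@dataclass(frozen=True)
class InfoSet(Asset):
    """A collection of specific data; contains no other asset types."""


@dataclass(frozen=True)
class Volume(Asset):
    """A data source or destination that is available on the network."""


@dataclass(frozen=True)
class Filter(Asset):
    """A set of attributes that refines an infoset into a subset of data."""


@dataclass(frozen=True)
class Instance(Asset):
    """A server instance; the only asset type that contains other assets."""
    # compare=False keeps contained assets out of equality and hashing,
    # so identity depends on the identifier alone (an assumption of this sketch).
    infosets: List[InfoSet] = field(default_factory=list, compare=False)
    volumes: List[Volume] = field(default_factory=list, compare=False)
    filters: List[Filter] = field(default_factory=list, compare=False)

    def contained_assets(self) -> List[Asset]:
        """Return every asset that this instance contains."""
        return [*self.infosets, *self.volumes, *self.filters]


# Placeholder identifiers, invented for illustration.
instance = Instance(
    "instance-001",
    infosets=[InfoSet("infoset-001")],
    volumes=[Volume("volume-001")],
    filters=[Filter("filter-001")],
)
```

In this sketch, two `Instance` objects with the same identifier compare as equal regardless of their contents, while an `InfoSet` and a `Volume` that share an identifier string remain distinct because they are different asset types.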