Governing virtual data (Data Virtualization)

Data Virtualization can connect to Watson™ Knowledge Catalog to govern the virtual data that you publish to governed catalogs. Data governance entails applying business context, data policies, and data protection rules to your virtual data.

Watson Knowledge Catalog is a secure enterprise data catalog management platform. With Watson Knowledge Catalog, you use catalogs to easily find and share your data and other assets. A catalog is a way to organize, label, and search for data assets. An asset in a catalog consists of metadata about a data asset. For more information about catalogs and data assets, see Catalogs.

You can create virtual tables in Data Virtualization from existing Watson Knowledge Catalog data assets that have term assignments. Data Virtualization can use the terms that are assigned to tables in the catalog to rename tables and columns while these tables are being virtualized. See Virtualizing data with business terms for details.

If you have the Data Virtualization Admin or Engineer role, the virtual data that you create by using the Data Virtualization console, including views, is published to a governed catalog automatically. Data Virtualization Admins, Engineers, and Stewards can also publish virtual data to the catalog manually. For more information, see Publishing virtual data.

Additionally, you can add virtual tables and views from the Data Virtualization connection type to the default catalog. See Create Data Virtualization connection in Watson Knowledge Catalog and Adding a data asset from a connection.

The process of virtualizing and publishing data to the catalog consists of multiple steps, and each step involves different users and roles.

Figure 1. Virtualize and publish data to the catalog (diagram of the process to virtualize and publish data)

A catalog data asset contains a set of properties that includes business terms and tags. After your virtual data is in a catalog, you can:
  • Assign business terms, data classes, and tags that are authored in Watson Knowledge Catalog to tables and columns to form a logical structure for your virtual data.
  • Use data protection rules to deny access to your virtual data. These data protection rules can be based on the assigned tags and business terms. Data protection rules are enforced only on data that is published or added to a catalog.

    Optionally, you can use data protection rules to mask your virtual data. (Tech preview: This is a technology preview and is not supported for use in production environments.)
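
The deny behavior described in the list above can be pictured with a small conceptual model. The following Python sketch is illustrative only and is not the Watson Knowledge Catalog API: the names VirtualAsset, DenyRule, and is_access_denied are hypothetical, and the sketch assumes that a rule matches when it shares at least one tag or business term with the asset.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualAsset:
    """Hypothetical stand-in for a virtual table or view."""
    name: str
    in_catalog: bool  # True once the object is published or added to a catalog
    tags: set = field(default_factory=set)
    business_terms: set = field(default_factory=set)


@dataclass
class DenyRule:
    """Hypothetical deny rule keyed on tags and business terms."""
    name: str
    tags: set = field(default_factory=set)
    business_terms: set = field(default_factory=set)

    def matches(self, asset):
        # Assumption: one shared tag or one shared term is enough to match.
        return bool(self.tags & asset.tags
                    or self.business_terms & asset.business_terms)


def is_access_denied(asset, rules):
    # Rules are enforced only on data that is published or added to a catalog.
    if not asset.in_catalog:
        return False
    return any(rule.matches(asset) for rule in rules)
```

In this model, a virtual table that carries a matching tag is denied only after it is in a catalog; the same table outside a catalog is not evaluated against the rules.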

Virtual object owner vs. data asset owner

When a virtual object is published to a catalog, this object becomes a data asset in Watson Knowledge Catalog. There is a difference between virtual object owners and data asset owners:
Virtual object owner
The user who created the virtual object in Data Virtualization.
Data asset owner
The user who owns the asset for a virtual object in a catalog. Typically, the user who created the virtual object is also the asset owner when the virtual object is published to the default catalog automatically. However, this might not always be the case. Asset owners are exempt from Watson Knowledge Catalog data protection rules and policies.
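
The distinction matters because the exemption follows the data asset owner, not the virtual object owner. The following Python sketch is a conceptual illustration only, not product code: PublishedObject and rules_apply_to are hypothetical names, and the user names in the usage note are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class PublishedObject:
    """Hypothetical record for a virtual object that is published to a catalog."""
    name: str
    virtual_object_owner: str  # user who created the object in Data Virtualization
    data_asset_owner: str      # user who owns the resulting asset in the catalog


def rules_apply_to(user, obj):
    # Only the data asset owner is exempt from data protection rules.
    # Creating the virtual object does not grant an exemption by itself.
    return user != obj.data_asset_owner
```

For example, if a hypothetical user alice creates a virtual table and a different user, bob, owns the published asset, then rules_apply_to("alice", obj) returns True and rules_apply_to("bob", obj) returns False.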