Watson Knowledge Catalog on Cloud Pak for Data

Version: 4.0.9    Included   IBM


Watson Knowledge Catalog provides a secure enterprise catalog management platform that is supported by a data governance framework. A catalog connects people to the data and knowledge that they need. The data governance framework ensures that data access and data quality comply with your business rules and standards. Watson Knowledge Catalog provides fine-grained control over which users can perform which tasks through a combination of user roles and permissions, and collaborator roles.
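The way user permissions and collaborator roles combine can be pictured with a small model. This is a minimal sketch, not the Watson Knowledge Catalog API: the role names, action names, the `access_catalogs` permission, and the rule that both layers must allow an action are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical collaborator roles and actions, for illustration only.
COLLABORATOR_ACTIONS = {
    "Admin": {"view_asset", "add_asset", "manage_collaborators"},
    "Editor": {"view_asset", "add_asset"},
    "Viewer": {"view_asset"},
}


@dataclass
class User:
    name: str
    # Permissions granted through the user's roles (hypothetical names).
    permissions: set = field(default_factory=set)
    # Collaborator role per catalog, e.g. {"Sales catalog": "Editor"}.
    catalog_roles: dict = field(default_factory=dict)


def can_perform(user, catalog, action, required_permission="access_catalogs"):
    """Return True only if both layers allow the action (an assumption)."""
    if required_permission not in user.permissions:
        return False
    role = user.catalog_roles.get(catalog)
    # A user with no collaborator role in the catalog gets an empty action set.
    return action in COLLABORATOR_ACTIONS.get(role, set())


# Example users: ana has the permission and an Editor role,
# ben has an Admin collaborator role but lacks the permission.
ana = User("ana", {"access_catalogs"}, {"Sales catalog": "Editor"})
ben = User("ben", set(), {"Sales catalog": "Admin"})
```

In this model, `can_perform(ana, "Sales catalog", "add_asset")` is allowed, while any request from `ben` is refused because the permission check fails before the collaborator role is consulted.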

The data governance framework is composed of governance artifacts that enrich data assets and protect sensitive data from unauthorized access. Governance artifacts are organized in categories and are subject to workflows. Data Stewards and Data Quality Analysts who are collaborators in categories and have the required roles can create governance artifacts, import artifacts from files, or import artifacts from Knowledge Accelerators.

A catalog is how you share assets across your enterprise:

  • Collaborators in a catalog have access to data assets without needing separate credentials or being able to see the credentials.
  • An asset in a catalog consists of metadata about data, including how to access the data, the data format, the classification of the asset, which collaborators can access the data, and other types of metadata that describe the data. Data assets can include both relational data and unstructured data, such as PDF or Microsoft Office documents.
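The asset description above can be sketched as a simple record. This is an illustrative model only, not the product's actual schema: the class names, field names, and the `metadata_for` helper are hypothetical, and the connection values are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    # How to reach the data. Field names are hypothetical.
    url: str
    username: str
    password: str


@dataclass
class CatalogAsset:
    name: str
    data_format: str        # e.g. "relational table" or "PDF"
    classification: str     # e.g. "Confidential"
    connection: Connection  # kept with the asset, not shown to collaborators
    collaborators: set = field(default_factory=set)
    extra_metadata: dict = field(default_factory=dict)

    def metadata_for(self, user):
        """Return the metadata a collaborator can see, without credentials."""
        if user not in self.collaborators:
            raise PermissionError(f"{user} is not a collaborator for {self.name}")
        return {
            "name": self.name,
            "data_format": self.data_format,
            "classification": self.classification,
            "source": self.connection.url,
            **self.extra_metadata,
        }


# Placeholder asset for illustration; none of these values are real.
asset = CatalogAsset(
    name="Q3 orders",
    data_format="relational table",
    classification="Confidential",
    connection=Connection("db2://example.invalid/sales", "svc_user", "placeholder"),
    collaborators={"ana"},
    extra_metadata={"owner": "Sales operations"},
)
view = asset.metadata_for("ana")
```

The point of the sketch is that `view` describes the data and where it lives, but the user name and password stay inside the asset record, which mirrors the idea that collaborators use data assets without seeing the credentials.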

The default catalog is created automatically after you install the Watson Knowledge Catalog service. It differs from other catalogs in these ways:

  • Data Stewards and Data Quality Analysts can create data quality projects and use automated discovery to import metadata about data sets, automatically assign governance artifacts, and analyze the quality of the data sets. Then, they can publish the data sets as data assets to the default catalog.
  • The information assets view shows additional properties of the assets in the default catalog.

Data Stewards can also use data quality projects to run a quick scan, which provides a fast initial analysis of large numbers of tables and files from data sources. Then they can run a deeper analysis or publish the data assets to any catalog.

Data Scientists and Business Analysts can copy catalog assets into analytics projects to analyze data and build models. They can also publish data and analytical assets to any catalog. Watson Knowledge Catalog includes these tools in analytics projects:

  • The Data Refinery tool for preparing and visualizing data.
  • The Metadata import tool to import asset metadata into a project or a catalog.

The following illustration shows the architecture of Watson Knowledge Catalog.

[Illustration: an architectural diagram depicting the relationships just described among the various types of collaborators, the catalogs, and their assets.]

Integrated services

Table 1. Related services

Service               Capability
Watson™ Studio        Prepare, analyze, and model data in a collaborative environment with tools for data scientists, developers, and domain experts.
Data Virtualization   Integrate data sources across multiple types and locations into one logical data view.
Cognos® Dashboards    Identify patterns in your data with sophisticated visualizations. No coding needed.

Compatible data sources

See Supported data sources for a list of data source services that are compatible with Watson Knowledge Catalog.