Watson Knowledge Catalog on Cloud Pak for Data

Version: 4.7.3    Included   IBM

Description

Watson™ Knowledge Catalog provides a secure enterprise catalog management platform that is supported by a data governance framework. A catalog connects people to the data and knowledge that they need. The data governance framework ensures that data access and data quality are compliant with your business rules and standards. Watson Knowledge Catalog provides fine-grained control over which users can perform which tasks through a combination of user roles, permissions, and collaborator roles.

The data governance framework is composed of governance artifacts that enrich data assets and data protection rules that define how to protect sensitive data from unauthorized access. Governance artifacts are organized in categories and subject to workflow. Data Stewards and Data Quality Analysts who are collaborators in categories and have the required roles can create governance artifacts, import artifacts from files, or import artifacts from IBM Knowledge Accelerators.

If you have the Data Governance Express® offering, the features of the Watson Knowledge Catalog service that are included in the following optional installation components are not available:

A catalog is how you share assets across your enterprise:

  • Collaborators in a catalog have access to data assets without needing separate credentials or being able to see the credentials.
  • An asset in a catalog consists of metadata about data, including how to access the data, the data format, the classification of the asset, which collaborators can access the data, and other types of metadata that describe the data. Data assets can include both relational data and unstructured data, such as PDF or Microsoft Office documents.
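
The asset description above can be pictured as a small data structure. The following is only an illustrative sketch: the class name CatalogAsset, its fields, and the example values are hypothetical and are not part of the Watson Knowledge Catalog API. It shows an asset as metadata (access path, format, classification, collaborators) and a check that lets collaborators see the metadata without ever seeing the stored credentials.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class CatalogAsset:
    """Hypothetical model of a catalog asset (not the product's real API)."""
    name: str
    connection_path: str   # how to access the data
    data_format: str       # for example "relational" or "pdf"
    classification: str    # for example "Confidential"
    collaborators: Set[str] = field(default_factory=set)
    extra_metadata: Dict[str, str] = field(default_factory=dict)
    # Credentials are stored with the asset but never returned to collaborators.
    _credentials: str = field(default="", repr=False)

    def can_access(self, user: str) -> bool:
        """Return True if the user is a collaborator on this asset."""
        return user in self.collaborators

    def describe(self, user: str) -> Dict[str, str]:
        """Return the asset metadata for a collaborator, without credentials."""
        if not self.can_access(user):
            raise PermissionError(f"{user} is not a collaborator")
        return {
            "name": self.name,
            "connection_path": self.connection_path,
            "data_format": self.data_format,
            "classification": self.classification,
            **self.extra_metadata,
        }


# Example values are made up for illustration.
asset = CatalogAsset(
    name="customer_orders",
    connection_path="sales_db/ORDERS",
    data_format="relational",
    classification="Confidential",
    collaborators={"steward@example.com"},
    extra_metadata={"owner": "Sales"},
    _credentials="secret-token",
)

view = asset.describe("steward@example.com")
```

In this sketch the collaborator receives the descriptive metadata only, which mirrors the idea that catalog collaborators work with data assets without needing, or being able to see, separate credentials.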

The default catalog is created automatically after you install the Watson Knowledge Catalog service.

Data Stewards and Data Quality Analysts can import asset metadata to a project to enrich the assets through profiling, quality analysis, and assigning business terms. Data Scientists and Business Analysts can copy catalog assets into projects to analyze data and build models. They can also publish data and analytical assets to any catalog. Watson Knowledge Catalog includes these tools in projects:

  • The Data Refinery tool for preparing and visualizing data.
  • The Metadata import tool to import asset metadata into a project or a catalog.
  • The Metadata enrichment tool to profile data, analyze the data quality, automatically assign governance artifacts to the data assets, and then publish the data assets with the enrichment results to a catalog of choice.
  • Data quality rules to evaluate the quality of data.

Integrated services

Table 1. Supplemental services. You can extend the functionality of this service with the following supplemental services, which require this service.

Service | Capability
Data Privacy | De-identify sensitive data to preserve privacy while maintaining utility.
AI Factsheets | Use AI Factsheets to organize and track lineage events, facts, and details across the lifecycle of each of your machine learning models, and increase transparency for model governance needs.
MANTA Automated Data Lineage for IBM Cloud Pak® for Data | Use MANTA Automated Data Lineage for advanced metadata import.

Table 2. Related services. The following related services are often used with this service and provide complementary features, but they are not required.

Service | Capability
Watson Studio | Prepare, analyze, and model data in a collaborative environment with tools for data scientists, developers, and domain experts.
Watson Query | Integrate data sources across multiple types and locations into one logical data view.
Cognos® Dashboards | Identify patterns in your data so you can make timely and effective decisions with visualizations.
Data Replication | Integrate and synchronize your data using near-real-time data delivery with low impact to sources.
DataStage® | Use built-in search, automatic metadata propagation, and simultaneous highlighting of compilation errors to create, edit, load, and run jobs that transform and tailor information for your enterprise.

Compatible data sources

See Supported data sources for a list of data source services that are compatible.