IBM Knowledge Catalog

Version: 5.1.3

Experience: Cloud Pak for Data

Description

IBM® Knowledge Catalog connects people to the data and knowledge that they need. The platform is supported by a data governance framework to ensure that data access and data quality are compliant with your business rules and standards. IBM Knowledge Catalog delivers automated enrichment of data assets with business metadata to align company policies and vocabularies to data in support of AI, analytics, and compliance use cases.

IBM Knowledge Catalog is available in three editions:

IBM Knowledge Catalog: IBM Knowledge Catalog is included in IBM Cloud Pak for Data Enterprise Edition. It provides the methods that your enterprise needs to automate data governance so you can ensure data accessibility, trust, protection, security, and compliance.
IBM Knowledge Catalog Standard Cartridge: IBM Knowledge Catalog Standard Cartridge is a pricing option separate from IBM Cloud Pak for Data Enterprise Edition. It provides basic governance capabilities and AI-augmented data enrichment.
IBM Knowledge Catalog Premium Cartridge: IBM Knowledge Catalog Premium Cartridge is a pricing option separate from IBM Cloud Pak for Data Enterprise Edition. It offers the full governance framework with data privacy, data quality, cataloging, and AI-augmented data enrichment across the data lifecycle.

All editions provide features to manage governance artifacts, such as business terms, data classes, or classifications, that can be applied to data assets. Governance artifacts are organized in categories and subject to approval workflows. Collaborators in categories can create governance artifacts, import artifacts from files, or import artifacts from IBM Knowledge Accelerators. Additionally, governance artifacts can be extended with custom properties and relationships.

Data quality features that are available in IBM Knowledge Catalog and IBM Knowledge Catalog Premium support the proactive management of enterprise data quality through data quality definitions and data quality rules. Automated monitoring of data quality through data quality SLA rules can trigger notifications and remediation workflows for critical data elements at scale based on relationships to business terms.

Data protection rules that define how to protect sensitive data from unauthorized access are also available in IBM Knowledge Catalog and IBM Knowledge Catalog Premium. Data protection rules are enforced automatically in a uniform manner in governed catalogs. You can configure data protection rules to mask sensitive data based on the content, format, or meaning of the data, or the identity of the users who access the data. When you mask data, you unlock the data for users who are not authorized to view sensitive data and avoid the need to maintain multiple copies of the data.
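The masking behavior described above can be illustrated with a minimal sketch. This is not the IBM Knowledge Catalog API or rule schema. The MaskingRule shape, the apply_rules helper, the group names, and the two masking methods shown here are all assumptions made for illustration. The sketch only shows the idea that one copy of the data can serve different users, with columns masked according to their assigned data class and the identity of the user.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskingRule:
    # Hypothetical rule shape for illustration, not the product's schema.
    data_class: str            # data class assigned to a column, e.g. "email"
    method: str                # "redact" or "substitute" in this sketch
    exempt_groups: frozenset   # user groups allowed to see raw values


def mask_value(value: str, method: str) -> str:
    """Mask a single value with the given (illustrative) method."""
    if method == "redact":
        # Replace every character with X, keeping only the length.
        return "X" * len(value)
    if method == "substitute":
        # Replace the value with a stable token so equal inputs still match.
        return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
    raise ValueError(f"unknown masking method: {method}")


def apply_rules(row, column_classes, user_groups, rules):
    """Return a copy of row, masked for a user in the given groups."""
    masked = dict(row)
    for column, value in row.items():
        for rule in rules:
            applies = column_classes.get(column) == rule.data_class
            exempt = bool(rule.exempt_groups & set(user_groups))
            if applies and not exempt:
                masked[column] = mask_value(str(value), rule.method)
                break  # the first matching rule wins in this sketch
    return masked


rules = [MaskingRule("email", "redact", frozenset({"privacy_officers"}))]
row = {"name": "Ada", "contact": "ada@example.com"}
column_classes = {"contact": "email"}

analyst_view = apply_rules(row, column_classes, ["analysts"], rules)
officer_view = apply_rules(row, column_classes, ["privacy_officers"], rules)
```

In this sketch, a user in the hypothetical analysts group sees the contact column redacted, while a user in the exempt group sees the original value. Both views come from the same stored row, which is the reason masking avoids maintaining multiple copies of the data.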

IBM Knowledge Catalog Standard and IBM Knowledge Catalog Premium provide AI-augmented data enrichment on top of the basic governance capabilities:

Names can be expanded based on context and the existing business vocabulary to provide meaningful names for data assets and columns instead of the often cryptic source names.
Pretrained foundation models (a fine-tuned granite-8b model and a fine-tuned slate model) generate gen AI based descriptions for assets and columns that users can easily understand. The models also produce more accurate term assignments by matching terms even when no exact matches are found.

With catalogs, you provide a self-service way to find and share assets across your enterprise:

Collaborators in a catalog have access to data assets without needing separate credentials or being able to see the credentials. Collaborators have roles that control what activities they can perform in the catalog.
Data assets contain information about how to access the data, the data format, data classifications, assigned business terms and other governance artifacts, relationships with other assets, which collaborators can access the data, and other types of metadata that describe the data. Data assets can be relational data or unstructured data, such as PDF or Microsoft Office documents.
Other types of assets in catalogs include operational assets, such as models, notebooks, and dashboards, which data scientists create with tools to work with data.
Search based on data asset metadata and properties and AI-powered recommendations help users find the data that they need.

A default catalog is created automatically after you install an IBM Knowledge Catalog service.

Data scientists find assets in catalogs and then copy the assets into projects where they analyze data and build models with Watson Studio and Watson Machine Learning tools.

Licensing information

Each edition of this service requires a different license:

IBM Knowledge Catalog is included in the IBM Cloud Pak® for Data Enterprise Edition and IBM Cloud Pak for Data Standard Edition licenses.
IBM Knowledge Catalog Standard is included in the IBM Knowledge Catalog Standard Cartridge license.
IBM Knowledge Catalog Premium is included in the IBM Knowledge Catalog Premium Cartridge license.

For more information, see Licenses and entitlements.

Integrated services

Unless otherwise noted, the supplemental and related services can be used with all editions of IBM Knowledge Catalog.

Table 1. Supplemental services. You can extend the functionality of this service with the following supplemental services, which require this service.
Service	Capability
Data Privacy	De-identify sensitive data to preserve privacy while maintaining utility. Available with IBM Knowledge Catalog and IBM Knowledge Catalog Premium.
AI Factsheets	Use AI Factsheets to organize and track lineage events, facts, and details across the lifecycle of each of your machine learning models, and increase transparency for model governance needs.
MANTA Automated Data Lineage for IBM Cloud Pak for Data	Use MANTA Automated Data Lineage for advanced metadata import. You cannot install this service together with IBM Manta Data Lineage.
IBM Manta Data Lineage	Use IBM Manta Data Lineage for advanced metadata import. You cannot install this service together with MANTA Automated Data Lineage.

Table 2. Related services. The following related services are often used with this service and provide complementary features, but they are not required.
Service	Capability
Watson Studio	Prepare, analyze, and model data in a collaborative environment with tools for data scientists, developers, and domain experts.
Data Virtualization	Integrate data sources across multiple types and locations into one logical data view.
Cognos Dashboards	Identify patterns in your data so you can make timely and effective decisions with visualizations.
Data Replication	Integrate and synchronize your data using near-real-time data delivery with low impact to sources.
DataStage	Use built-in search, automatic metadata propagation, and simultaneous highlighting of compilation errors to create, edit, load, and run jobs that transform and tailor information for your enterprise.
