Asset types created through metadata import (IBM Knowledge Catalog)

When you run a metadata import, you create different asset types in projects and catalogs.

Data assets

You can add data assets from connections to a project or a catalog. Data assets that you import to a project are not visible in any catalog until you publish them. After you publish them to a catalog, other catalog users can work with these assets. If you want to run metadata enrichment on the imported assets, import them to a project.

For a data asset such as a table or a view imported from a database, technical metadata includes the following information:

  • Table name
  • Table view description
  • Column information such as name, type, length, and description
  • Data source information (connection information) such as server hostname or IP, and the parent database and schema of the table

This list is not exhaustive. Also, when you import metadata from an unstructured data source such as a Box folder, a different set of metadata is imported. It includes, for example, file name, file type, size, access permissions, owner, creation date, last access date, parent folder, and other information.
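To make the shape of this technical metadata concrete, the following minimal sketch models the items listed above for a database table. The class names, field names, and sample values are hypothetical and chosen only for illustration; they are not the schema that the product uses to store assets.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ColumnInfo:
    # Column information: name, type, length, and description.
    name: str
    data_type: str
    length: Optional[int] = None
    description: str = ""


@dataclass
class TableTechnicalMetadata:
    # Hypothetical container for the technical metadata of one table or view.
    table_name: str
    description: str
    server: str      # server hostname or IP of the data source
    database: str    # parent database of the table
    schema: str      # parent schema of the table
    columns: List[ColumnInfo] = field(default_factory=list)


# Sample values are invented for the illustration.
orders = TableTechnicalMetadata(
    table_name="ORDERS",
    description="Customer orders",
    server="db.example.com",
    database="SALES",
    schema="PUBLIC",
    columns=[
        ColumnInfo("ORDER_ID", "INTEGER"),
        ColumnInfo("CUSTOMER_NAME", "VARCHAR", length=128,
                   description="Name of the customer"),
    ],
)

# Build a fully qualified name from the data source information.
qualified_name = f"{orders.database}.{orders.schema}.{orders.table_name}"
print(qualified_name)
```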

For more information, see Data assets and their properties.

COBOL copybooks

COBOL copybooks describe the data structure of a COBOL program. You can import COBOL copybook maps, virtual tables, and views into projects and catalogs. To add such assets from mainframes, you must use a Data Virtualization Manager for z/OS connection. The imported assets cannot be profiled, enriched through metadata enrichment, or used in Data Refinery.

Business intelligence assets

You can add business intelligence assets to a catalog for inspecting the components of business intelligence reports and how they are related. In this case, the advanced metadata import feature must be enabled. A MANTA Automated Data Lineage for IBM Cloud Pak for Data license key is not required to import such assets. However, to be able to visualize the data flows that transform and populate the source data for the reports, use the Get BI report lineage metadata import option. To use this option, a MANTA Automated Data Lineage for IBM Cloud Pak for Data license key is required.

In business intelligence (BI) reporting, BI tools are used to gather, analyze, and present data. Business intelligence assets are used to organize reports that provide a business view of that data.

You can add reports and the report queries and report query items that they include to a selected catalog, where you can examine the individual components and how they are related.

  • Report: Represents the definition of a report, for example, a monthly sales report based on the information in a reporting database.
  • Report query: Is a child asset of a report. Queries fetch data from views or tables within a reporting database to render the report.
  • Report query item: Is a child asset of a report query and is defined within the report for intermediate processing of data.
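The parent-child relationship between these three asset types can be pictured as a small tree. The following sketch is only an illustration of that hierarchy; the class names, the `item_paths` helper, and the sample report are hypothetical and do not reflect how the catalog stores or exposes these assets.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReportQueryItem:
    # Child of a report query.
    name: str


@dataclass
class ReportQuery:
    # Child of a report; holds its report query items.
    name: str
    items: List[ReportQueryItem] = field(default_factory=list)


@dataclass
class Report:
    # Top-level asset; holds its report queries.
    name: str
    queries: List[ReportQuery] = field(default_factory=list)


def item_paths(report: Report) -> List[str]:
    """Return one 'report / query / item' path per report query item."""
    return [
        f"{report.name} / {query.name} / {item.name}"
        for query in report.queries
        for item in query.items
    ]


# Invented sample data for the illustration.
monthly_sales = Report(
    name="Monthly sales",
    queries=[
        ReportQuery(
            "sales_by_region",
            [ReportQueryItem("region"), ReportQueryItem("total_revenue")],
        ),
    ],
)

paths = item_paths(monthly_sales)
for path in paths:
    print(path)
```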

Business intelligence assets cannot be imported to a project, downloaded, profiled, enriched through metadata enrichment, or used in Data Refinery or Data Virtualization.

Transformation scripts

Transformation scripts describe data transformations that change the format, structure, or values of data and that usually are part of ETL (extract, transform, and load) processes. Transformation scripts are used in data operations, such as manipulating, converting, and cleansing.

The advanced metadata import feature must be installed for importing such assets to a catalog. A MANTA Automated Data Lineage for IBM Cloud Pak for Data license key is not required.

You can import transformation scripts of the types Function, Procedure, Script, and Trigger. The following data sources are supported:

If there is a sequence of many transformation scripts, only the last one is imported.

In the catalog, the Transformation expression property and the preview of a transformation script show the transformation logic. You can also access this information from the asset details in the side panel of the asset's Lineage tab.

Transformation scripts cannot be imported to a project, downloaded, profiled, or enriched through metadata enrichment.

Data model assets

A data model visualizes data elements, called entities, and their relationships and describes the attributes that are associated with each entity. You can add data model assets to a catalog to have a single collection point for all the business knowledge relating to your data management landscape.

The advanced metadata import feature must be installed for importing such assets to a catalog. A MANTA Automated Data Lineage for IBM Cloud Pak for Data license key is not required.

Imported data models are read-only copies of the originals created and maintained in database modeling tools. You can import data models created in these data modeling tools:

  • ER/Studio
  • erwin Data Modeler
  • SAP PowerDesigner

See Preparing data model files for metadata import.

A logical data model visualizes data elements, called entities, and their relationships and describes the attributes that are associated with each entity. For a logical data model, the following asset types are created:

  • Logical model: A logical representation of data objects that are related to a business domain. The model consists of a set of logical entities and their attributes and relationships that can be organized in groups. A logical model can be implemented by a physical data model or a database schema.
  • Logical model attribute: A logical model attribute defines the meaning and purpose of a unit of data.
  • Logical model entity: Logical model entities are assets that represent the data structure in the logical data model.
  • Logical model relationship: A logical model relationship represents a relationship between two logical model entities that can become a foreign key constraint when the logical model is transformed to a physical model.

A physical data model defines the physical structures and relationships of data within a subject domain or application. For a physical data model, the following asset types are created:

  • Physical model: The physical model defines the physical structures and relationships of data within a subject domain or application.
  • Physical model schema: A design schema for data assets that defines the physical structures and relationships of data within a subject domain or application. Each physical model can contain one or more physical model schemas.
  • Physical model table: An asset that represents a table structure in the physical model.
  • Physical model view: An asset that represents a virtual table based on the result set of an SQL statement.
  • Physical model column: An asset that defines the relevant properties or characteristics of a column in a table in the physical model.
  • Physical model constraint: An asset that defines an SQL constraint that is used to specify rules for data in a table, such as a primary key, foreign key, unique, or check constraint.

Depending on the size of the imported data model, a large number of assets might be created in the catalog. To find the root of the model as a starting point, filter the catalog assets on the Logical model or Physical model asset type.
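In the catalog you apply this filter in the user interface. As a rough illustration of the same idea, the sketch below filters an in-memory list of assets down to the model roots. The dictionary keys, the `find_model_roots` helper, and the sample assets are assumptions made for this example and are not part of any product API.

```python
from typing import Dict, List

# Asset types that mark the root of an imported data model.
ROOT_ASSET_TYPES = {"Logical model", "Physical model"}


def find_model_roots(assets: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the assets whose type is a model root."""
    return [asset for asset in assets if asset["asset_type"] in ROOT_ASSET_TYPES]


# Invented sample of assets that one import might create.
catalog_assets = [
    {"name": "Customer model", "asset_type": "Logical model"},
    {"name": "Customer", "asset_type": "Logical model entity"},
    {"name": "customer_id", "asset_type": "Logical model attribute"},
    {"name": "CRM database model", "asset_type": "Physical model"},
    {"name": "CRM", "asset_type": "Physical model schema"},
    {"name": "CUSTOMER", "asset_type": "Physical model table"},
]

roots = find_model_roots(catalog_assets)
root_names = [asset["name"] for asset in roots]
print(root_names)
```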

Data model assets cannot be imported to a project, downloaded, profiled, enriched through metadata enrichment, or used in Data Refinery or Data Virtualization.

Data integration assets

With the Import ETL job metadata import option, you can add data integration assets to a catalog for inspecting the components of ETL jobs and how they are related. In this case, the advanced metadata import feature must be enabled. A MANTA Automated Data Lineage for IBM Cloud Pak for Data license key is not required to import such assets.

To be able to visualize the data flows and transformations in such ETL jobs, use the Get ETL job lineage metadata import option. To use this option, a MANTA Automated Data Lineage for IBM Cloud Pak for Data license key is required.

You can add assets for ETL jobs that are created and maintained in these data integration tools:

  • DataStage on Cloud Pak for Data
  • Informatica PowerCenter
  • InfoSphere DataStage
  • Microsoft SQL Server Integration Services
  • OpenLineage
  • Oracle Data Integrator
  • Talend

For each ETL job or DataStage flow, the following asset types are created:

  • Data integration job: Represents the ETL job.
  • Data integration component: Is a child asset of a data integration job and represents a single component of the ETL job, for example, a stage in a DataStage flow.
  • Data integration column: Is a child asset of a data integration component. Data integration columns describe the input and output columns of a data integration component.

Data integration assets cannot be imported to a project, downloaded, profiled, enriched through metadata enrichment, or used in Data Refinery or Data Virtualization.
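The restrictions stated in the sections above can be collected into one lookup table for quick comparison. The sketch below only restates what this topic lists as unsupported. The operation labels and the `is_listed_as_unsupported` helper are hypothetical names, and an operation that is absent from an entry is not necessarily supported; it is simply not mentioned here.

```python
from typing import Dict, FrozenSet

# Operations that this topic lists as unsupported for each kind of asset.
FULL_SET = frozenset({
    "import_to_project", "download", "profile",
    "enrich", "data_refinery", "data_virtualization",
})

LISTED_AS_UNSUPPORTED: Dict[str, FrozenSet[str]] = {
    "COBOL copybook": frozenset({"profile", "enrich", "data_refinery"}),
    "Business intelligence asset": FULL_SET,
    "Transformation script": frozenset(
        {"import_to_project", "download", "profile", "enrich"}
    ),
    "Data model asset": FULL_SET,
    "Data integration asset": FULL_SET,
}


def is_listed_as_unsupported(asset_kind: str, operation: str) -> bool:
    """True if the topic explicitly lists the operation as unsupported."""
    return operation in LISTED_AS_UNSUPPORTED.get(asset_kind, frozenset())


print(is_listed_as_unsupported("Transformation script", "download"))
print(is_listed_as_unsupported("COBOL copybook", "download"))
```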
