Registration of new information asset types

You can extend the model of IBM® InfoSphere® Information Governance Catalog and add asset types and metadata from other products to give a holistic representation of your assets in the catalog. In addition, new asset types can enrich your lineage reports.

Required skills

The following skills are needed to register new information asset types to the model:
  • Familiarity with modeling: classes, references, attributes
  • Ability to write code that can do the following actions:
    • Build XML documents with ID references
    • Use RESTful APIs

Overview

You can use IBM InfoSphere Metadata Asset Manager to import metadata from design tools, business intelligence tools, databases, and files into the metadata repository of IBM InfoSphere Information Server. You can also use bridges and connectors to import metadata from third-party tools and from databases and files. In addition, asset metadata can be created in the catalog by importing extended data sources from a CSV file or from an ISX file by using the istool command-line utility. Both methods bring a limited number of known asset types into the catalog.

Alternatively, when you need to bring metadata from other products into the catalog, you can dynamically register additional asset types in the catalog. The group of asset types from a single product or tool is called an asset type bundle. An asset type bundle is a compressed file that contains a descriptor XML file for the asset types, icons, and label property files. The bundle defines the asset types and how to display them in the catalog.
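As an illustration of the bundle format, the following minimal sketch packs a descriptor, icons, and label property files into a single ZIP archive in memory. The entry names (asset_type_descriptor.xml, the icons and i18n folders) and the placeholder descriptor content are assumptions made for this example; check the REST API documentation for the layout that your catalog version expects.

```python
import io
import zipfile


def build_bundle(descriptor_xml: str,
                 icons: dict[str, bytes],
                 labels: dict[str, str]) -> bytes:
    """Pack a descriptor, icons, and label property files into one ZIP archive.

    The archive layout used here is an assumption for illustration only.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Descriptor XML that defines the asset types (hypothetical entry name).
        archive.writestr("asset_type_descriptor.xml", descriptor_xml)
        # One entry per icon file (hypothetical folder name).
        for name, data in icons.items():
            archive.writestr(f"icons/{name}", data)
        # One entry per locale-specific label property file (hypothetical folder name).
        for name, text in labels.items():
            archive.writestr(f"i18n/{name}", text)
    return buffer.getvalue()


# Usage with placeholder content; a real descriptor must follow the catalog's schema.
bundle = build_bundle(
    "<descriptor/>",
    {"XYZReport-icon.gif": b"GIF89a"},
    {"labels.properties": "class.XYZReport=Report\n"},
)
```

Building the archive in memory keeps the sketch free of temporary files; writing the returned bytes to disk produces an ordinary compressed file.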

Registration is the process of calling the IBM InfoSphere Information Governance Catalog REST API to add the asset types that are defined in the asset type bundle.

When you register the asset type bundle by using the IBM InfoSphere Information Governance Catalog REST API, the contents of the bundle are brought into the catalog. Therefore, the new asset types and assets are specific to that instance of the catalog. They are not part of the metadata repository, so other applications in InfoSphere Information Server cannot access them. The registered asset types, with their icons, labels, and imported assets, remain in the catalog even during in-place upgrades of InfoSphere Information Server and InfoSphere Information Governance Catalog. If you move your catalog to a different instance of InfoSphere Information Governance Catalog, you must register the asset type bundles again in the new instance.
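A registration call might be prepared as in the following sketch, which builds a multipart POST request that carries the compressed bundle. The resource path /ibm/iis/igc-rest/v1/bundles, the form field name "file", the host name, and the use of basic authentication are assumptions for this example, not confirmed details of the REST API. The request is only constructed here, not sent.

```python
import base64
import urllib.request
import uuid


def build_registration_request(base_url: str, bundle: bytes,
                               user: str, password: str,
                               path: str = "/ibm/iis/igc-rest/v1/bundles"):
    """Build (but do not send) a multipart POST that uploads a bundle.

    The default path and the form field name are assumptions.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="file"; filename="bundle.zip"\r\n',
        b"Content-Type: application/zip\r\n\r\n",
        bundle,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(base_url.rstrip("/") + path,
                                     data=body, method="POST")
    request.add_header("Content-Type",
                       f"multipart/form-data; boundary={boundary}")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical server and credentials.
request = build_registration_request(
    "https://igc.example.com:9445/", b"PK-bytes", "bob", "secret")
# To send it: urllib.request.urlopen(request)
```

Because registered types are specific to one catalog instance, the same request would have to be sent again, with a different base URL, after a move to a new instance.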

Asset import is the process of calling the InfoSphere Information Governance Catalog REST API with generated XML documents to add asset metadata that conforms to the previously registered asset types.
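The following sketch shows the general technique of generating XML with ID references: each asset receives a document-local ID, and a contained asset points to its parent through that ID. The element and attribute names (doc, assets, asset, reference, class, repr, ID, assetIDs) and the asset types are hypothetical placeholders; the actual document format is defined by the REST API.

```python
import xml.etree.ElementTree as ET


def build_asset_document(assets: list[dict]) -> str:
    """Serialize assets to XML, linking children to parents by ID.

    Each dict needs 'type' and 'name', plus an optional 'parent' that holds
    the name of another asset in the list. Names are assumed to be unique.
    """
    # Assign a document-local ID (a1, a2, ...) to every asset.
    ids = {asset["name"]: f"a{index}"
           for index, asset in enumerate(assets, start=1)}
    root = ET.Element("doc")
    container = ET.SubElement(root, "assets")
    for asset in assets:
        element = ET.SubElement(container, "asset", {
            "class": asset["type"],
            "repr": asset["name"],
            "ID": ids[asset["name"]],
        })
        parent = asset.get("parent")
        if parent is not None:
            # The reference carries the parent's ID, not its name.
            ET.SubElement(element, "reference",
                          {"name": "parent", "assetIDs": ids[parent]})
    return ET.tostring(root, encoding="unicode")


xml_text = build_asset_document([
    {"type": "Folder", "name": "Sales"},
    {"type": "Report", "name": "Q1", "parent": "Sales"},
])
```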

Flow import is the process of calling the InfoSphere Information Governance Catalog REST API with generated XML documents to add data flow metadata that connects previously imported assets. The assets in such data flows can be of built-in asset types or of bundle-defined types. For the best user experience, the generation and import of asset XML documents and flow XML documents should happen automatically, or with as little user intervention as possible.
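Because a flow can connect only assets that already exist, an automated generator can check every flow endpoint against the known asset IDs before it produces the flow document. The sketch below illustrates that check; the element and attribute names (flowUnits, flow, sourceIDs, targetIDs) are hypothetical and do not represent the format that the REST API requires.

```python
import xml.etree.ElementTree as ET


def build_flow_document(known_ids, flows) -> str:
    """Serialize (source_id, target_id) pairs after validating the endpoints."""
    flows = list(flows)  # allow generators; the pairs are traversed twice
    unknown = {end for pair in flows for end in pair} - set(known_ids)
    if unknown:
        raise ValueError(
            f"flows reference unknown asset IDs: {sorted(unknown)}")
    root = ET.Element("flowUnits")
    for source, target in flows:
        ET.SubElement(root, "flow",
                      {"sourceIDs": source, "targetIDs": target})
    return ET.tostring(root, encoding="unicode")


flow_text = build_flow_document({"a1", "a2"}, [("a1", "a2")])
```

Failing early on an unknown ID keeps an unattended import from submitting a flow that points to an asset that was never imported.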

The new assets behave mostly like other information assets in the catalog. You can assign terms, stewards, information governance rules, and information governance policies to the new information assets. You can browse, query, run lineage reports, and run filtered lineage reports on them. You can place them in collections.

Registration of asset types offers the following advantages over extension mapping documents and extended data sources:
  • Metadata of more asset types can be included in the catalog, with their specific icons and labels
  • Job-internal data flows are also included in lineage reports rather than only user-defined flows
  • You do not need to wait for a new release of InfoSphere Information Governance Catalog or InfoSphere Metadata Asset Manager that can work with new asset types

Scenario

Bank ABCD must meet Basel III compliance requirements. Data lineage reports must reflect the data flows of all products that the bank uses. XYZ is a BI product. Bank management wants to publish XYZ metadata to the catalog to comply with regulations.

Bob is an integration developer with knowledge of XYZ. As the integration developer, Bob is responsible for extracting relevant metadata from XYZ and importing it into the catalog.

Erin is an information architect at Bank ABCD. As the information architect, Erin is responsible for reviewing and reporting on the Data Quality and Governance Compliance report for the different systems at Bank ABCD.

Bob and Erin take the following steps to get the metadata of their XYZ assets into the catalog:
  1. Bob extends the catalog to support XYZ data sources by doing these steps:
    1. Authors an asset type descriptor XML that defines the XYZ-specific asset types for the catalog.
    2. Creates a bundle, which is a compressed file that contains the asset type descriptor, the asset type icons, and label property files for various locales (languages or countries).
    3. Registers the asset types that are in the compressed file by using the InfoSphere Information Governance Catalog REST API.
  2. Bob populates the catalog with XYZ assets and data flows by doing these steps:
    1. Develops an application that monitors XYZ, extracts the metadata of the relevant assets, and generates XML documents with that metadata in the format that the InfoSphere Information Governance Catalog REST API expects.
    2. Runs the application, which uploads the XYZ metadata to the catalog by using the InfoSphere Information Governance Catalog REST API.
  3. Erin inspects the XYZ asset types and their assets in the catalog by doing these steps:
    1. Browses the new XYZ asset types in the Browse All window.
    2. Inspects the assets of the new types in their Details page.
    3. Obtains data lineage reports with data flows from, and through, the XYZ assets.