Publishing Data Gate table metadata

If the IBM Knowledge Catalog integration feature has been configured, you can publish Data Gate table metadata to IBM Knowledge Catalog with a single click.

Checking the status

A tile that is labeled IBM Knowledge Catalog integration on the dashboard of your Data Gate instance allows you to check the current integration status. Possible status values are:

Not configured
The IBM Knowledge Catalog service is not configured. Make sure that you completed all steps in Configuring Data Gate table metadata publishing to IBM Knowledge Catalog successfully.
Available
The IBM Knowledge Catalog service was integrated successfully. The service can be reached from the Data Gate instance.
Error
The IBM Knowledge Catalog service is installed, but cannot be reached by Data Gate. If the status is Error, run the following checks:
  1. Check whether the IBM Knowledge Catalog service is up and running.
  2. Check whether all Data Gate pods are up and running.

Below the status, you find a timestamp or the value N/A. A timestamp indicates the last time that metadata was published to IBM Knowledge Catalog. N/A means that metadata has not been published yet.

Publishing

If the status is Available, you can click the Publish to catalog button on the dashboard to publish Data Gate table metadata to the configured IBM Knowledge Catalog.

A publishing action has the following effects:

  • Connection assets are created in the catalog for the database connections that are used by the Data Gate instance. These are connection assets for the Db2 for z/OS source database and the Db2 or Db2 Warehouse target database. The names of the created connection assets follow this naming pattern:
    subsystem_name (source|target) - data_gate_pairing_name
  • Data assets are created for all source database tables and target database tables that are managed by the Data Gate instance. However, the creation of these assets depends on the state of the tables. Assets are created only if the tables are in the Loaded or Active state.
  • Tags with the following labels are added to all connection assets and data assets that are created:
    • source or target
    • The Data Gate pairing name
  • Relationships between the source database tables and the corresponding target database tables are created.
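
The naming pattern and the tags described above can be illustrated with a minimal sketch. The function names and sample values are hypothetical and are not part of Data Gate or IBM Knowledge Catalog. The sketch also assumes that the parentheses in the pattern are literal characters in the asset name; if they only group the two alternatives, omit them.

```python
def connection_asset_name(subsystem_name: str, role: str, pairing_name: str) -> str:
    """Build a name of the form 'subsystem_name (source|target) - data_gate_pairing_name'.

    Assumption: the parentheses appear literally in the resulting name.
    """
    if role not in ("source", "target"):
        raise ValueError(f"role must be 'source' or 'target', got {role!r}")
    return f"{subsystem_name} ({role}) - {pairing_name}"


def asset_tags(role: str, pairing_name: str) -> list[str]:
    """Return the two tags added to each created asset: role and pairing name."""
    return [role, pairing_name]


# Hypothetical subsystem and pairing names, for illustration only.
print(connection_asset_name("DSN1", "source", "pairing_a"))  # DSN1 (source) - pairing_a
print(asset_tags("target", "pairing_a"))                     # ['target', 'pairing_a']
```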

If you click the Publish to catalog button again, and if tables in the Data Gate instance have changed since the last publishing action, the changes are reflected in the associated IBM Knowledge Catalog instance:

  • If new tables have been added, assets for these tables are added to IBM Knowledge Catalog, provided that the state of the tables permits asset creation.
  • Table state changes from a state that did not permit asset creation to a state that permits asset creation result in the creation of data assets.
  • Table state changes from a state that permitted asset creation to a state that does not permit asset creation are ignored. That is, related data assets are not removed.
  • If tables have been removed from the Data Gate instance, the information in IBM Knowledge Catalog stays the same. Existing data assets are not removed.
  • Existing assets are re-created or overwritten according to the settings of the configured IBM Knowledge Catalog. For more information, see Detection and handling of duplicates.
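
The add-only behavior of repeated publishing can be modeled with a short sketch. This is a simplified illustration, not how Data Gate is implemented: the catalog is reduced to a set of table names, the `publish` function is hypothetical, and "Other" stands in for any table state that is not Loaded or Active.

```python
# States that permit asset creation, as stated in the Publishing section.
ELIGIBLE_STATES = {"Loaded", "Active"}


def publish(catalog: set[str], tables: dict[str, str]) -> set[str]:
    """Return the data assets in the catalog after one publishing action.

    Assets are added for tables in an eligible state. Nothing is ever removed,
    whether a table left an eligible state or was removed from the instance.
    """
    eligible = {name for name, state in tables.items() if state in ELIGIBLE_STATES}
    return catalog | eligible


# First publication: only T1 is in an eligible state.
catalog = publish(set(), {"T1": "Loaded", "T2": "Other"})
print(sorted(catalog))  # ['T1']

# T2 becomes Active (asset created); T1 leaves the eligible state (asset kept).
catalog = publish(catalog, {"T1": "Other", "T2": "Active"})
print(sorted(catalog))  # ['T1', 'T2']

# T1 is removed from the Data Gate instance: the catalog stays the same.
catalog = publish(catalog, {"T2": "Active"})
print(sorted(catalog))  # ['T1', 'T2']
```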

Detection and handling of duplicates

IBM Knowledge Catalog offers different strategies for the detection of duplicates and various options for the handling of duplicate assets. The strategies and options to be used are determined when you publish metadata assets. The duplicate detection strategy evaluates the asset name, its resource key, or a combination of both. The duplicate asset handling strategy defines the action that IBM Knowledge Catalog takes when the detection strategy finds a duplicate asset. IBM Knowledge Catalog can react in the following ways:
  • Update a previous asset with the information of a newly published duplicate
  • Allow the creation of a duplicate asset
  • Preserve the original asset and reject the duplicate

When metadata assets are published to the configured catalog, Data Gate relies on the settings and capabilities of IBM Knowledge Catalog regarding the detection of duplicates. Whether an asset is identified as a duplicate depends on the strategy that is configured for the target metadata catalog. Bear this in mind when you use the IBM Knowledge Catalog integration feature, because Data Gate always publishes the metadata of all tables that are managed by the Data Gate instance, subject to the conditions that are outlined in the Publishing section.

Attention: Data Gate publishes metadata assets according to predefined naming patterns. If the target metadata catalog uses the duplicate detection by name strategy when you click the Publish to catalog button, and a previously published metadata asset has been renamed, the asset is not identified as a duplicate. As a result, the asset is created again with the original name that is generated by Data Gate. If a connection asset is affected, all related data assets are also re-created. To avoid the creation of duplicate assets by Data Gate, do not rename assets, or consider using resource keys as the duplicate detection strategy for the target metadata catalog.
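
The effect of renaming an asset under the different detection strategies can be shown with a small model. This sketch does not use the IBM Knowledge Catalog API. The `Asset` class, the strategy identifiers, and the sample values are hypothetical, and the sketch assumes that renaming an asset in the catalog leaves its resource key unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    resource_key: str


def is_duplicate(existing: Asset, incoming: Asset, strategy: str) -> bool:
    """Decide whether 'incoming' duplicates 'existing' under the given strategy."""
    if strategy == "name":
        return existing.name == incoming.name
    if strategy == "resource_key":
        return existing.resource_key == incoming.resource_key
    if strategy == "name_and_resource_key":
        return (existing.name, existing.resource_key) == (incoming.name, incoming.resource_key)
    raise ValueError(f"unknown strategy: {strategy!r}")


# The asset as Data Gate generates it on every publication.
generated = Asset(name="DSN1 (source) - pairing_a", resource_key="key-001")
# The same asset after a user renamed it in the catalog.
renamed_in_catalog = Asset(name="Production source", resource_key="key-001")

# Detection by name misses the renamed asset, so a new asset would be created.
print(is_duplicate(renamed_in_catalog, generated, "name"))          # False
# Detection by resource key still recognizes it as a duplicate.
print(is_duplicate(renamed_in_catalog, generated, "resource_key"))  # True
```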

Data Gate, on the other hand, overrides the duplicate handling method that is chosen at the catalog level when it publishes assets. Data Gate enforces the following methods based on the asset type:

  • The strategy that is implemented for the handling of data asset duplicates is to always preserve original assets and reject duplicates. If a data asset, that is, an asset representing a table, exists, it is not modified and no additional asset is created. This behavior prevents the repeated re-creation of data assets by Data Gate and the loss of user-provided changes to the asset, except for changes to the asset name. See the preceding Attention note.
  • The strategy that is implemented for the handling of connection asset duplicates is to always update the original asset.

    That is, the existing source and target database connection assets are updated with each metadata publication from Data Gate. Duplicates are not created. This behavior ensures that connection information that is related to the source and target databases (for example, user name and password) is always up to date. If connection credentials are updated, perform the following actions to propagate these changes to the configured IBM Knowledge Catalog:

  • If you want to update the credentials for the Db2 for z/OS connection, first update these credentials in the Data Gate source definition. See Updating a source definition for Data Gate. Then publish the metadata again to IBM Knowledge Catalog through the Data Gate UI.
  • If you want to update the credentials for the target database, first update these in the IBM Knowledge Catalog configuration wizard. See Configuring Data Gate table metadata publishing to IBM Knowledge Catalog. Then publish the metadata again to IBM Knowledge Catalog through the Data Gate UI.
Note: