Managing existing metadata imports in a project (IBM Knowledge Catalog)
After you create a metadata import asset, you can view, edit, or delete the metadata import asset, and you can rerun an import.
Manage existing metadata imports:
- View and edit the metadata import
- View the real-time progress and metrics of the metadata import job
- Pause and resume the metadata import
- Rerun the import
- Duplicate a metadata import
- Delete a metadata import asset
You can also perform these tasks with APIs instead of the user interface. The links to these APIs are listed in the Learn more section.
Required permissions
To manage and rerun a metadata import, you must have these roles and permissions:
- The Manage asset discovery user permission.
- The Admin or the Editor role in the project.
- The Admin or the Editor role in the catalog to which you want to import or publish the assets.
- Access to the connections to the data sources of the data assets to be imported and the SELECT or a similar permission on the corresponding databases.
Viewing the metadata import
Metadata import assets are listed in the Metadata imports section of the Assets page. To view an asset, click its name or select View from the asset's action menu.
When you view the metadata import asset, you can see the list of assets imported with a run of the associated import job. You can work with these assets, edit the metadata import, or rerun the import.
For each imported asset, you can see the following information:
- The asset name, which provides a link to the asset in the project or catalog.
- The asset type, such as Data or Report. For data assets, the format, such as Relational table, is also shown. For other asset types, the format column shows a dash (—).
- The asset context, such as the parent or file path.
- The date and time that the asset was last imported.
- The import status, which can be Imported for successfully imported data, In progress, or Removed if the asset couldn't be reimported. See Rerunning the import.
You can view additional information for an asset, publish it to a catalog (not in projects marked as sensitive), or delete the asset. When you delete an asset from the list of imported assets, it is deleted from the project or catalog to which it was imported but not from the metadata import scope. You can publish or delete a set of assets by selecting the assets individually or by using the Select all or Select all on page option.
You can work with imported data assets in exactly the same way as with connected data assets. Imported assets have a tag automatically assigned that reflects the asset's parent if applicable.
The About this metadata import side panel provides a summary of the import configuration, job details, schedule, advanced options, a list of related assets, and tags. Click View metrics to open a dedicated page for the latest job run to monitor the job status or view logs. To hide the details of the About this metadata import panel, click the information icon.
To edit the metadata import asset, click Edit metadata import. You can change these configuration settings:
- Metadata import asset details, such as the asset name, the description, or tags. Changing the asset name does not change the name of the associated import job. You cannot change the connection or the import target.
- The data scope for specific types of import:
  - For metadata imports with the goal Discover, the entire data scope.
  - For metadata imports with the goal Get ETL job lineage or Get BI report lineage, the scope of data assets to be included in the metadata import, if the import additionally includes connections for source or target data assets.
  For imports of data models, business intelligence reports, or ETL jobs, and for lineage imports, you can't change the scope. The only exception is the scope of source and target data assets if they are part of the import:
  - For data model imports, you can't change the data model file or the data modeling tool that you selected when you set the initial import scope.
  - For ETL job or ETL job lineage imports, you can't change the ETL job file, the data integration tool, or the connection to the data integration tool that you selected when you set the initial import scope.
  - For business intelligence reports or lineage imports for reports, you can't change the report input file, the reporting tool, or the connection to the reporting tool that you selected when you set the initial import scope.
  - For lineage imports, you can't change the selected connections or assets.
  However, you can duplicate a metadata import asset and update the scope in the new metadata import. See Duplicating a metadata import asset.
- The schedule.
- The advanced import options.
View the real-time progress and metrics of the metadata import job
Monitor the progress and metrics of your asset and lineage metadata import job runs in real time on the Run metrics tab in Projects > PROJECT_NAME > MDI_RUN_JOB > Job run details.
As your job runs, the tab displays detailed visual views with charts, progress bars, and counts.
If an import fails, you can see and download logs with details by clicking Download failed assets log.
Pause and resume the metadata import
Pause and resume your metadata import jobs by using the Pause job run and Resume job run buttons in Projects > PROJECT_NAME > MDI_RUN_JOB > Job run details.
To make the buttons available, complete these steps:
- Log in to the cluster as Admin with the oc command-line tool.
- Open the metadata-discovery resource:
    oc edit deployment metadata-discovery -n <CPD NAMESPACE>
- In the environment variable section, add the following entry:
    - name: enable_discovery_with_pause_resume_support
      value: "true"
- Open the wkc-cr resource:
    oc edit wkc wkc-cr
- In the spec section, set ignoreForMaintenance to true.
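The manual edits above could also be scripted. The following is a minimal sketch, not a documented procedure: it only assembles non-interactive command lines that mirror the two `oc edit` steps, using `oc set env` and `oc patch`. The helper name `pause_resume_commands` and the namespace value are hypothetical, and it is an assumption that these commands have the same effect in your environment as the manual edits. Review the printed commands before you run them.

```python
import json
import shlex


def pause_resume_commands(cpd_namespace: str) -> list[str]:
    """Build (but do not run) oc commands mirroring the manual steps.

    Hypothetical helper: the namespace is a placeholder supplied by the caller.
    """
    # Step 1: add the environment variable to the metadata-discovery deployment.
    env_cmd = [
        "oc", "set", "env", "deployment/metadata-discovery",
        "enable_discovery_with_pause_resume_support=true",
        "-n", cpd_namespace,
    ]
    # Step 2: set ignoreForMaintenance to true in the spec of the wkc-cr resource.
    patch = json.dumps({"spec": {"ignoreForMaintenance": True}})
    patch_cmd = ["oc", "patch", "wkc", "wkc-cr", "--type", "merge", "-p", patch]
    # shlex.join quotes the JSON patch so that it can be pasted into a shell.
    return [shlex.join(env_cmd), shlex.join(patch_cmd)]


if __name__ == "__main__":
    for command in pause_resume_commands("my-cpd-namespace"):
        print(command)
```

The sketch prints the commands instead of running them so that an administrator can check the namespace and resource names first.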
Rerunning the import
If you did not configure a schedule, you can manually rerun the metadata import at any time in several ways:
- Open the metadata import asset and select Reimport assets.
- Open the metadata import asset and click the job name in the About this metadata import side panel, which takes you to the job page. Click the run icon on this page.
- Go to the project's Jobs page and run the import job from there.
Reimporting refreshes the asset information. Existing assets are updated, which means that any content changes are merged. New assets in the data source might be added, depending on the defined scope. If you removed an asset from the metadata import asset, project, or catalog, that asset is imported again unless you also removed it from the scope.
Assets that were removed from the data scope or deleted from the data source after the last import can't be reimported. By default, the status of such assets is set to Removed, but no assets are deleted from the target project or catalog. To clean up the target project or catalog, you can choose to delete, on reimport, assets that are no longer available in the data source or assets that were removed from the import scope.
When you import ETL jobs or lineage metadata, only assets that aren't available in the data source can be deleted. When you import data models, none of the deletion options apply.
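The rules in the preceding two paragraphs can be summarized as a small decision function. This is an illustrative sketch only: the names `ReimportOptions`, `reimport_outcome`, the flag names, and the `import_kind` labels are hypothetical and do not correspond to product settings or API fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReimportOptions:
    # Hypothetical flags for the two cleanup choices described above.
    delete_missing_from_source: bool = False
    delete_removed_from_scope: bool = False


def reimport_outcome(in_scope: bool, in_source: bool,
                     options: ReimportOptions,
                     import_kind: str = "discover") -> str:
    """Return what happens to a previously imported asset on reimport.

    import_kind uses hypothetical labels: "discover", "etl_job",
    "lineage", or "data_model".
    """
    if in_scope and in_source:
        return "Imported"  # refreshed; content changes are merged
    if import_kind == "data_model":
        return "Removed"  # none of the deletion options apply
    if not in_source and options.delete_missing_from_source:
        return "Deleted"
    # For ETL job and lineage imports, only assets that are missing
    # from the data source can be deleted.
    if (not in_scope and import_kind not in ("etl_job", "lineage")
            and options.delete_removed_from_scope):
        return "Deleted"
    return "Removed"  # default: status changes, nothing is deleted


if __name__ == "__main__":
    cleanup = ReimportOptions(delete_missing_from_source=True)
    print(reimport_outcome(in_scope=True, in_source=False, options=cleanup))
```

With the default options, every asset that can't be reimported ends up with the status Removed and stays in the target project or catalog.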
When you reimport assets, the asset status might incorrectly be set to Removed with the reason The asset was deleted from the data source. This can be due to connectivity issues during the reimport. Depending on the metadata import settings for reimport, such assets might be deleted from the target project or catalog. Rerun the import at a later time, either manually or through a scheduled reimport. When the connection is restored, the assets are imported again.
By default, all asset properties are updated when assets are reimported. However, you can configure a metadata import to skip updating any of these properties: asset name, asset description, and column description.
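As a rough illustration of the skip behavior, the following sketch merges reimported properties into an existing asset while leaving the skipped ones untouched. The function name `apply_reimport` and the property keys are hypothetical stand-ins for the three skippable properties. The sketch does not reflect how the product stores assets.

```python
# Hypothetical keys for the three properties that can be skipped.
SKIPPABLE = frozenset({"name", "description", "column_descriptions"})


def apply_reimport(existing: dict, incoming: dict,
                   skip: frozenset = frozenset()) -> dict:
    """Return the asset properties after a reimport.

    Properties listed in `skip` keep their existing values; all other
    properties take the values that come from the data source.
    """
    unknown = set(skip) - SKIPPABLE
    if unknown:
        raise ValueError(f"Properties cannot be skipped: {sorted(unknown)}")
    updated = dict(existing)
    for key, value in incoming.items():
        if key in skip and key in existing:
            continue  # keep the value that is already on the asset
        updated[key] = value
    return updated


if __name__ == "__main__":
    current = {"name": "Customers (curated)", "description": "old text"}
    source = {"name": "CUSTOMERS", "description": "new text"}
    print(apply_reimport(current, source, skip=frozenset({"name"})))
```

In this example the edited asset name survives the reimport, while the description is refreshed from the data source.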
If the metadata import is configured accordingly, only new or modified data assets are imported.
You can rerun a data model import to update the model information in the catalog after you upload an updated version of the .zip file that serves as the data scope.
Depending on the outcome of the metadata import job run, a completion message or an error notification is displayed.
A completion message is displayed when the job run completed successfully, completed with warnings, or completed with errors. An error notification is displayed if the entire job run failed. Either type of notification contains a link to the job run log that provides details about the specific job run.
Email notifications
You can set up an email notification to receive a brief summary of the reimported data (job status and the number of modified, added, and removed assets). To configure email notifications, create notification events. For more information, see Forwarding notifications to email in the IBM Software Hub documentation. To configure notification events, you must be an administrator or a user with the Platform Administration > Manage configurations permission.
Duplicating a metadata import asset
After you create a metadata import asset, you can change the entire scope only for metadata imports with the goal Discover. As an alternative for metadata imports with other goals, you can duplicate the metadata import asset and adjust the scope and other settings.
To duplicate a metadata import asset, open the asset and select Duplicate from the overflow menu next to the asset name.
The default name of a duplicated metadata import asset is Copy of <name of original metadata import>_timestamp, for example, Copy of BI report_2023-08-14_20:38:21. The name of the associated job follows the same pattern. You can change the name of the metadata import asset and of the job.
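The naming pattern can be reproduced with a short sketch. The function name `default_duplicate_name` is hypothetical, and the timestamp layout is inferred from the single example above. Which time zone the product uses for the timestamp is not stated, so the caller supplies the value.

```python
from datetime import datetime


def default_duplicate_name(original_name: str, created_at: datetime) -> str:
    """Build a name of the form 'Copy of <name>_<YYYY-MM-DD_HH:MM:SS>'.

    The timestamp format is an assumption based on the documented example.
    """
    return f"Copy of {original_name}_{created_at:%Y-%m-%d_%H:%M:%S}"


if __name__ == "__main__":
    print(default_duplicate_name("BI report", datetime(2023, 8, 14, 20, 38, 21)))
    # prints: Copy of BI report_2023-08-14_20:38:21
```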
The import target, scope, and advanced option settings are prefilled, but you can change them as required. A schedule that is defined in the original metadata import is not carried over. Configure a new schedule as required.
Deleting a metadata import asset
You can delete a metadata import asset from a project in one of these ways:
- Select the Delete option from the overflow menu for the asset on the project Assets page.
- Open the asset and select Delete from the overflow menu next to the asset name.
The metadata import configuration and its associated metadata import job are deleted. Assets in the project or a catalog that were imported with this metadata import asset are not affected.
Learn more
- Managing assets in projects
- Finding and viewing assets in a catalog
- Marking a project as sensitive
- IBM Software Hub roles and permissions
- IBM Knowledge Catalog API: Edit a metadata import asset
- IBM Knowledge Catalog API: Start a metadata import job
- IBM Knowledge Catalog API: Delete a metadata import asset