Managing reporting for IBM Knowledge Catalog
Take control of IBM Knowledge Catalog data synchronization into the reporting data mart.
Prerequisites
Before you start the reporting synchronization, make sure that you use a clean schema that contains no reporting tables.
If you are using an existing schema, you don't need to delete the schema itself. Instead, drop every table in that schema:
DROP TABLE <SCHEMA_NAME>.<TABLE_NAME>;
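When a schema holds many tables, the statements can be generated rather than typed by hand. The following is a minimal sketch, not part of the product: the function name `build_drop_statements`, the schema name `REPORTING`, and the table names are hypothetical, and the identifier check is a conservative assumption that you might need to adjust for your database. The sketch only builds the statement text; it does not connect to a database.

```python
import re

# Conservative pattern for unquoted SQL identifiers.
# This is an assumption; adjust it to the naming rules of your database.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_drop_statements(schema_name, table_names):
    """Return one DROP TABLE statement per table in the given schema."""
    for name in [schema_name, *table_names]:
        # Reject anything that is not a plain identifier to avoid
        # building malformed or unsafe SQL text.
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"Unsupported identifier: {name!r}")
    return [f"DROP TABLE {schema_name}.{table};" for table in table_names]


if __name__ == "__main__":
    # Hypothetical schema and table names, for illustration only.
    for statement in build_drop_statements("REPORTING", ["TABLE_A", "TABLE_B"]):
        print(statement)
```

Review the generated statements before you run them, because dropping a table cannot be undone without a backup.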
Reporting on assets metadata
Whenever you create a project, a catalog, or a category, you can use the Allow reporting on asset metadata control switch to decide whether asset and artifact metadata can be sent to an external reporting database. By default, this setting is switched off.
When you enable reporting, the Reporting Administrator can set up reports for your workspace, and asset or artifact metadata can be sent to an external reporting database.
The reporting administrator can review these settings in the Reporting setup page:
- Go to Administration > Governance and catalogs > Reporting setup.
- Select Categories, Projects, or Catalogs from the left side panel and review the following column:
  - For Catalogs and Projects: Asset metadata
  - For Categories: Artifact metadata
If the column states Unavailable, the owner decided to disable metadata reporting. For more information, see Configuring reporting settings for IBM Knowledge Catalog in the IBM Software Hub documentation.
Reporting synchronization
- When you click Start reporting, the data is sent to the selected database, and you can start generating reports with SQL queries. Refer to the data model diagram to get started with the queries.
The data is automatically synchronized between IBM Knowledge Catalog and the database. Any change in the catalog, project, category, or data protection rule that is enabled for reporting is reflected on the database.
- When you stop reporting, the data is no longer synchronized and it is deleted from the database. The existing reporting settings are retained.
- If an interruption occurs, you can pause the synchronization of IBM Knowledge Catalog data into the reporting data mart instead of stopping the reporting completely. Any updates that are made to assets or artifacts while synchronization is paused are processed when you resume synchronization.
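The pause and resume behavior can be pictured with a small model. This is an illustrative sketch only and does not reflect how the product is implemented: the class `SyncModel` and its methods are hypothetical names. It shows that changes made while paused are held back and then applied in order on resume.

```python
from collections import deque


class SyncModel:
    """Toy model: changes made while paused are held and applied on resume."""

    def __init__(self):
        self.paused = False
        self.pending = deque()  # changes waiting for resume
        self.data_mart = []     # changes already applied to the data mart

    def record_change(self, change):
        # While paused, hold the change; otherwise apply it immediately.
        if self.paused:
            self.pending.append(change)
        else:
            self.data_mart.append(change)

    def pause(self):
        self.paused = True

    def resume(self):
        # Apply held changes in the order in which they were made.
        self.paused = False
        while self.pending:
            self.data_mart.append(self.pending.popleft())
```

In this model, stopping the reporting would correspond to clearing `data_mart`, which is why pausing is the lighter option for a temporary interruption.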
Automatic synchronization and update of data in the data mart
Data is initially synchronized with the data mart when you enable and start the reporting.
If a failure occurs, automatic synchronization is retried up to four times.
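The retry behavior can be sketched as a bounded loop. This is a hedged illustration, not the product's implementation: the function `sync_with_retries` and the `sync_once` callable are hypothetical, and the sketch assumes that "retried up to four times" means one initial attempt followed by at most four retries.

```python
MAX_RETRIES = 4  # from the text: synchronization is retried up to four times


def sync_with_retries(sync_once, max_retries=MAX_RETRIES):
    """Call sync_once(); on failure, retry up to max_retries times.

    Returns the number of attempts made when a call succeeds.
    Re-raises the last error when all retries are used up.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            sync_once()
            return attempts
        except Exception:
            # attempts counts the initial try plus retries, so stop
            # once the number of retries exceeds the limit.
            if attempts > max_retries:
                raise
```

Under this assumption, a synchronization that keeps failing is attempted five times in total before the error is surfaced.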
After reporting is established, the data mart is synchronized only when changes occur in the data for which reporting is configured. Such changes include changes to assets or governance artifacts, or to their attributes, relationships, or assignments.
You can't configure a synchronization interval.
Reporting synchronization after you restore a backup of an external database
Backing up the external database ensures data consistency between Cloud Pak for Data and the reporting database if a Cloud Pak for Data backup is restored.
Back up the reporting database whenever a Cloud Pak for Data backup is taken. When you restore a Cloud Pak for Data backup, restore the external reporting database from the backup that was taken at the same time. For more information, see Creating and scheduling online backups of Cloud Pak for Data with IBM Storage Fusion in the IBM Software Hub documentation.
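If you keep several reporting database backups, you need to pick the one that pairs with a given Cloud Pak for Data backup. The following sketch shows one way to do that by timestamp. It is an assumption-laden illustration: the function `matching_db_backup` is hypothetical, and the five-minute tolerance is an arbitrary example value, not a documented limit.

```python
from datetime import datetime, timedelta


def matching_db_backup(cpd_backup_time, db_backup_times,
                       tolerance=timedelta(minutes=5)):
    """Return the database backup time closest to the Cloud Pak for Data
    backup time, or None if no backup lies within the tolerance."""
    closest = min(
        db_backup_times,
        key=lambda t: abs(t - cpd_backup_time),
        default=None,
    )
    if closest is None or abs(closest - cpd_backup_time) > tolerance:
        return None
    return closest


if __name__ == "__main__":
    # Hypothetical timestamps, for illustration only.
    cpd = datetime(2024, 1, 10, 2, 0)
    db_backups = [datetime(2024, 1, 9, 2, 1), datetime(2024, 1, 10, 2, 2)]
    print(matching_db_backup(cpd, db_backups))
```

A result of `None` signals that no paired backup exists, in which case the restored cluster and the reporting database might not be consistent.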
You can restore an IBM Cloud Pak for Data backup on a different cluster for testing purposes. You can use this cluster as a standby (passive) cluster in case the source (active) cluster is lost. However, reporting database backups cannot be automatically restored on the passive recovery side. Do not perform any write operations on the passive cluster. Pause the synchronization of IBM Knowledge Catalog data on the passive cluster to the reporting data mart. For more information, see Offline backup and restore to a different cluster with the Cloud Pak for Data OADP backup and restore utility or Cloud Pak for Data online backup and restore to a different cluster with IBM Storage Fusion in the IBM Software Hub documentation.
If you want the passive cluster to become the new active cluster, delete the Cloud Pak for Data deployment and then restore the latest backup, or any other backup that you prefer to restore. Then, restore the database backup that was taken at the same time as the Cloud Pak for Data backup. For more information, see Restoring a Cloud Pak for Data online backup to a different cluster with IBM Storage Fusion in the IBM Software Hub documentation.
When the restore operation is completed, the reporting synchronization resumes because it was enabled when the IBM Cloud Pak for Data backup was taken on the active cluster.
Handling synchronization failures and manual restart of the synchronization
If the initial synchronization for a particular item fails, the metadata that is related to that item is not synchronized to the target tables in the data mart. Instead, this data is skipped until you resolve the underlying problem. After you resolve the issue that triggered the error, the details of the missing asset or artifact are automatically updated in the data mart.
Furthermore, whenever the reporting settings are updated, all assets that were skipped are also queued for an update.
If the synchronization fails after you modify the reporting settings, the synchronization is still established for the previous settings.
If you believe that the cluster or database is out of sync, you can restart the synchronization manually in the user interface with one of the following options:
- For those items that failed. This option restarts only the containers or features that failed.
- For items that failed and items in the queue. This option restarts all the items that had not yet started (if there are any) and the failed containers or features.
- For all configured items. This option restarts all the items that had not yet started (if there are any) and the failed or passed containers or features.
Depending on the option you choose, the process might take a while. For more information, see Setting up reporting for IBM Knowledge Catalog.
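The scope of the three restart options can be summarized as a selection over item states. The sketch below is illustrative only: the `State` and `RestartOption` enumerations, the function `items_to_restart`, and the item names are hypothetical and do not correspond to a product API. The "queued" state stands for items that had not yet started.

```python
from enum import Enum


class State(Enum):
    FAILED = "failed"
    QUEUED = "queued"  # not yet started
    PASSED = "passed"


class RestartOption(Enum):
    FAILED_ONLY = 1        # items that failed
    FAILED_AND_QUEUED = 2  # items that failed and items in the queue
    ALL_CONFIGURED = 3     # all configured items


# Which item states each restart option covers.
_SCOPE = {
    RestartOption.FAILED_ONLY: {State.FAILED},
    RestartOption.FAILED_AND_QUEUED: {State.FAILED, State.QUEUED},
    RestartOption.ALL_CONFIGURED: {State.FAILED, State.QUEUED, State.PASSED},
}


def items_to_restart(items, option):
    """Given a mapping of item name to State, return the sorted names
    of the items that the chosen option restarts."""
    return sorted(name for name, state in items.items()
                  if state in _SCOPE[option])


if __name__ == "__main__":
    # Hypothetical containers, for illustration only.
    items = {
        "catalog-a": State.PASSED,
        "project-b": State.FAILED,
        "category-c": State.QUEUED,
    }
    for option in RestartOption:
        print(option.name, items_to_restart(items, option))
```

Each option covers everything that the previous one covers, which is why the broader options are more likely to take longer.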
If you want to remove the reporting data from the database and start with a new configuration, complete these steps:
- Click Stop reporting. The data is no longer synchronized and it is deleted from the database. The existing reporting settings are retained.
- Click Reset settings. The settings are restored to the default state. You can then define a new connection and configure the reporting in a different way.