Known issues and limitations for IBM Knowledge Catalog
The following known issues and limitations apply to IBM Knowledge Catalog and to watsonx.data intelligence.
Known issues
General
- Assets imported with the user admin instead of cpadmin
- Search bar returning incorrect results
- Assessment link in email notifications doesn't work
Installing, upgrading, and uninstalling
- When upgrading with the Identity Management Service enabled, the upgrade might fail
- During uninstalling IBM Knowledge Catalog, the glossary cr dependency is not deleted
- IBM Knowledge Catalog 5.3.0 installation might not complete
- IBM Knowledge Catalog portal-catalog pod out of sync after upgrade
- After installing IBM Knowledge Catalog, some pods might go into the error state and related jobs will fail
- In IBM Knowledge Catalog Standard 5.3.0, semantic automation pods might be missing some labels
- Upgrading IBM Knowledge Catalog from 5.1 or 5.2 to 5.3.x might fail if you customized resources
- Installation of IBM Knowledge Catalog 5.3.1 on Power clusters with ODF storage might fail
- Installation of IBM Knowledge Catalog 5.3.1 fails on FIPS-enabled clusters
Migration and removal of legacy functions
For known issues with migration from InfoSphere Information Server, see Known issues for migration from InfoSphere Information Server.
Catalogs and Projects
- Unauthorized users might have access to profiling results
- Cannot run import operations on a container package exported from another Cloud Pak for Data cluster
- Data protection rules don't apply to column names that contain spaces
- Preview of data from file-based connections other than IBM Cloud Object Storage is not fully supported
- Scroll bar is not visible when adding assets to a project on MacOS
- Unexpected assets filtering results in catalogs
- Can't use Git projects with identical data assets
- Mismatched column metadata information
- For SQL assets, the SQL validation dropdown might not open automatically when you retest
- Inconsistent behavior when publishing assets after metadata enrichment
- Publishing asset from project to catalog fails despite success message
- Data quality score not available for columns in the catalog lineage view
- Attempt to download protected data assets is allowed
- The Allow duplicates option isn't available when publishing assets from projects to catalogs
- Description isn't available after importing data virtualization assets to a catalog
- Profiling and previewing assets fail after metadata import with the Overwrite original assets option
- Profiling page doesn't show changes after reimporting an asset
- Page crashes when working with data protection rules
- Catalog page fails to load with 502 Bad Gateway error
- Can't run export and import jobs on non-FIPS clusters with cpd-cli
Governance artifacts
- Error Couldn't fetch reference data values shows up on screen after publishing reference data
- Publishing large reference data sets fails with Db2 transaction log full
- Imported data assets with assigned out-of-the-box data classes or terms have incorrect identifiers resulting in no enforcement of data protection rules
- Can't cascade delete a Knowledge Accelerator category
- wkc-glossary-service pod restarting when adding or updating multiple business terms
- Column data is not redacted in accordance with the rule precedence
- Bulk edit does not enforce mandatory properties
- ZIP import with specified or empty merge method issues for mandatory custom properties and relationships
- Applying default values for the Predefined values mandatory property type doesn't work correctly in ZIP import
- Changing artifact's primary category to a category for which mandatory custom properties apply is blocked
- Providing missing mandatory custom relationship in a category, which already has other mandatory custom relationships specified, fails
- Editing the Make this source a required field setting in a custom relationship definition takes a long time or times out
- Editing artifacts fails after upgrading from 5.1
Workflows
- Workflow types not available
- Connection of wkc-workflow-service to its Postgres database does not use SSL/TLS encryption
Metadata import
- Business data lineage is incomplete for the metadata imports with Get ETL job lineage or Get BI report lineage goals
- Assets are not imported from the IBM Cognos Analytics source when the content language is set to Japanese
- Dummy assets get created for any file assets that come from Amazon S3 to show the complete business data lineage if Get ETL job lineage is performed
- SocketTimeoutException during metadata import
- Business lineage shows Name not available when you're using Get BI report lineage option to import Tableau data assets
- Reference assets are not marked as Removed from scope or Removed from source
- Assets removed from the metadata import scope are marked incorrectly
- Description field is empty after importing data virtualization assets to catalog
Metadata enrichment
- Running primary key or relations analysis doesn't update the enrichment and review statuses
- Issues with the Microsoft Excel add-in
- Republishing doesn't update primary key information in catalog
- Masked data might be profiled when the data source is IBM watsonx.data
- In small system configurations, large metadata enrichment jobs can fail
- Data quality output configuration in metadata enrichment can't be removed
- Primary category for generated terms is added to the category scope for term assignment
Data quality
- Rules bound to columns of the data type NUMERIC in data assets from Oracle data sources might not work
- Runs of migrated data quality rules complete with warnings
- Data quality rules with duplicated variable names from multiple data quality definitions might not work
- Can't drill-down into data quality output on DataStax Enterprise
- In deployments with IBM Knowledge Catalog Premium, access to SLA features might be blocked
- Data quality scores might not be available for data quality rules created in earlier versions
- Can't create rules or SQL assets with plain-text queries for data quality output tables
Generative AI capabilities
- Socket timeout errors can occur for Text2SQL API calls
- Can't work with a remote watsonx.ai instance if the system is configured to run models on CPU
- Project collaborators with the Viewer role can use the toggle to enable or disable the Data intelligence settings
- The project settings page for data intelligence shows the main IBM Cloud Pak for Data toolbar
- Known issues with the Text-to-SQL capability
MANTA Automated Data Lineage for IBM Cloud Pak for Data
- Metadata import jobs for getting lineage might take very long to complete
- Chrome security warning for Cloud Pak for Data deployments where MANTA Automated Data Lineage is enabled
- Can't 'Get lineage' with a DB2 connection in FIPS environments
- Data Virtualization connection fails when SSL is required
- Data export is incomplete due to table deduction in Query Service
- Validation shows complete for Data Virtualization metadata import jobs
- Upgrading MANTA Automated Data Lineage from 5.2.2 to 5.3.1 patch 1 and 2 results in an error
Business lineage
- Business data lineage is incomplete for the metadata imports with Get ETL job lineage or Get BI report lineage goals
- Hops for components and columns of components work only inside the data integration flow area of an expanded job node
- Viewing large business lineage graph results in a service error
Relationship explorer
- 404 Error occurs when opening relationship explorer
- Primary and foreign keys are not visible for Google BigQuery assets
- Some relationships are not displayed in relationship explorer after the upgrade
- Error 404 when trying to view data asset from the relationship explorer
- Original asset names instead of current asset names show in relationship explorer when opening it from the project
Limitations
Installing, upgrading, and uninstalling
Catalogs and Projects
- Default catalog is missing
- Special or double-byte characters in the data asset name are truncated on download
- Catalog UI does not update when changes are made to the asset metadata
- A blank page might be rendered when you search for terms while manually assigning terms to a catalog asset
- Profiling in catalogs, projects, and metadata enrichment might fail for Teradata connections
- Catalog asset search doesn't support special characters
- Can't add individual group members as asset Members
- Long names of the asset owners get truncated when hovering over their avatars
- Duplicate actions fail if dynamic IP addresses are used
- Project assets that are added while you create segmented data assets might not be available for selection
- An extra path to manage catalogs in the navigation menu
- Row filter data protection rules are not available for previews of SQL query asset type
- Pagination selection dropdown no longer available on the Catalogs page
- Can't add empty or zero-byte files to catalogs
- Migrating data source definitions from the Platform assets catalog will fail
- Associated connection isn't added when you add a vector index from catalog to project
- You can't manage more than 20 assets at the same time
- Issues with publishing data quality rules
Governance artifacts
Metadata import
- Metadata import jobs might be stuck due to issues related to RabbitMQ
- Data assets might not be imported when running an ETL job lineage import for DataStage flows
- When a job for importing lineage metadata hangs, it cannot be stopped
- Only files with .sql extension can be provided as manual input for metadata import from the Oracle and PostgreSQL sources
- Assets and lineage might not be imported if many connections use the same data source
- When you import a project from a .zip file, the metadata import asset is not imported
- Lineage metadata cannot be imported from the Informatica PowerCenter connection
Metadata enrichment
- Profiling in catalogs, projects, and metadata enrichment might fail for Teradata connections
- For assets from SAP OData sources, the metadata enrichment results do not show the table type
Data quality
MANTA Automated Data Lineage for IBM Cloud Pak for Data
- Not all stages are displayed in technical data lineage graph for the imported DataStage ETL flow
- Lineage is not re-imported after the connection is reimported in Manta Admin UI
- Columns are displayed as numbers for a DataStage job lineage in the catalog
Business lineage
General issues
You might encounter these known issues and restrictions when you work with the IBM Knowledge Catalog service.
Assets imported with the user admin instead of cpadmin
For Cloud Pak for Data clusters with Identity Management Service enabled, the default administrator is cpadmin. However, for import, the default administrative user admin is used. Therefore, the assets are imported with the admin user instead of cpadmin.
Applies to: 5.3.0 and later
Workaround:
Before running the import, apply the following workaround:
- Edit the config map by running the following command:
  oc edit cm catalog-api-exim-cm
- Manually update the environment variable admin_username in import-job.spec.template.spec.env from:
  - name: admin_username
    value: ${admin_username}
  to:
  - name: admin_username
    value: cpadmin
Search bar returning incorrect results
Searching for assets when using the search bar returns unexpected results if only one or two characters are used.
Applies to: 5.3.0 and later
Workaround: Type at least three characters in the search bar.
Assessment link in email notifications doesn't work
Applies to: 5.3.0
In email notifications for data quality SLA assessment, the link to the assessment does not work due to an extra dot in the URL.
Workaround: To access the assessment, copy the link to a new browser window, remove the extra dot between ibm and com, and press Enter.
Installing, upgrading and uninstalling
You might encounter these known issues while installing, upgrading or uninstalling IBM Knowledge Catalog.
When upgrading with the Identity Management Service enabled, the upgrade might fail
Applies to: 5.3.0
Fixed in: 5.3.1
If you have the Identity Management Service enabled and you have IBM Knowledge Catalog Premium or IBM Knowledge Catalog Standard installed, or a service that depends on either of these two editions of IBM Knowledge Catalog, upgrading might fail.
Workaround: Run the following steps to fix this issue:
- Back up your data from the PersistentVolumeClaims (PVC). This step is not mandatory, but it is recommended in case any issues occur while you fix the upgrade. Run the following commands to back up your data:
  oc get po | grep semantic-automation
  oc cp semantic-automation-xx-yy:/exchange/configuration ./configuration/
- Remove the PVC:
  oc delete pvc volumes-ikcfilestorage-pvc
- Restart the semantic-automation, semantic-embedding, and semantic-text-generation pods:
  oc get po | grep semant
  oc delete po <x y z>
- Debug any failing job pods:
  oc get po | grep ikc-lib-volume
  oc debug ikc-lib-volume-instance-xxxxx
- Recreate the PVC volume:
  /bin/sh /wkc/genkeys.sh
  exit
- Check that the semantic pods are running again:
  oc get po | grep semant
During uninstalling IBM Knowledge Catalog, the glossary cr dependency is not deleted
Applies to: 5.3.0
When uninstalling IBM Knowledge Catalog, the glossary cr dependency is not deleted. As a result, a later reinstallation of IBM Knowledge Catalog does not complete properly. Because the IKC operator is removed during the initial uninstall, the operator can no longer clean up the glossary cr that remains.
Workaround: Run the following command in the cluster to patch the glossary cr:
oc patch Glossary glossary-cr -p '{"metadata":{"finalizers": null}}' --type=merge
IBM Knowledge Catalog 5.3.0 installation might not complete
Applies to: 5.3.0
Fixed in: 5.3.1
The installation of IBM Knowledge Catalog 5.3.0 might not complete because the Spark cluster instance could not be provisioned.
Workaround: Force reconciliation of the wkc custom resource:
- Delete the wdp-profiling-iae-init job:
  oc delete job wdp-profiling-iae-init -n ${PROJECT_CPD_INST_OPERANDS}
- Make sure that the job and the corresponding pod are deleted:
  oc get job -n ${PROJECT_CPD_INST_OPERANDS} | grep wdp-profiling-iae-init
  oc get pod -n ${PROJECT_CPD_INST_OPERANDS} | grep wdp-profiling-iae-init
  These commands should not return any output.
- Initiate reconciliation of the wkc operator by deleting the wkc-operator pod:
  oc -n ${PROJECT_CPD_INST_OPERATORS} delete pod `oc get pods -n ${PROJECT_CPD_INST_OPERATORS} | grep ibm-cpd-wkc-operator | cut -d' ' -f1`
  Check whether the wkc-cr status changed to InProgress and monitor the newly created wkc-operator pod log.
IBM Knowledge Catalog portal-catalog pod out of sync after upgrade
Applies to: 5.3.0
When upgrading IBM Knowledge Catalog and changing editions, the portal-catalog pod might become out of sync, leading to missing functionality that should be enabled from the upgrade.
Workaround: To enable the missing functionality, restart the portal-catalog pod after upgrading IBM Knowledge Catalog.
After installing IBM Knowledge Catalog, some pods might go into the error state and related jobs will fail
Applies to: 5.3.0
After installing IBM Knowledge Catalog, during the post-install steps when the apply-cr commands are running, pods related to kg-resync-glossary and jobs related to these pods might fail.
Workaround: To fix this issue, run the following steps:
- Check for pods that are in the failed status:
  oc get pod -n ${PROJECT_CPD_INST_OPERANDS} | grep kg-resync-glossary-
- Check the corresponding job status for those pods:
  oc get job kg-resync-glossary -n ${PROJECT_CPD_INST_OPERANDS}
- Delete the kg-resync-glossary job:
  oc delete job kg-resync-glossary -n ${PROJECT_CPD_INST_OPERANDS}
- Reconcile the custom resource (CR) by restarting the wkc-operator pod:
  oc delete pod ibm-cpd-wkc-operator-xxxx-xxxx -n ${PROJECT_CPD_INST_OPERATORS}
- Wait for the CR reconciliation to complete and check the pods. The kg-resync-glossary-xxxx pod should then be completed.
In IBM Knowledge Catalog Standard 5.3.0, semantic automation pods might be missing some labels
Applies to: 5.3.0
Fixed in: 5.3.1
If custom models are enabled in IBM Knowledge Catalog Standard 5.3.0, some semantic automation pods might be missing the icpdsupport/app and icpdsupport/module labels. As a result, these pods can't be tracked on the IBM Software Hub Monitoring page or might be missed by serviceability scripts that identify pods based on the labels.
Custom models are no longer available starting with version 5.3.1.
Upgrading IBM Knowledge Catalog from 5.1 or 5.2 to 5.3.x might fail if you customized resources
Applies to: 5.3.0 and later
If you customized the resources for any services in IBM Knowledge Catalog from 5.1 or 5.2, the upgrade to version 5.3.x can fail because the configuration doesn't include a requests: section. For example, you might have added the following entry in the wkc-cr:
wkc_term_assignment_resources:
  limits:
    cpu: "2"
    memory: 4Gi
This specification can cause an error similar to the following one:
wkc_term_assignment_resources.requests.cpu : 'dict object' has no attribute 'requests'
Starting with version 5.3.0, the requests: section is required.
Workaround: Edit the wkc-cr and update resources by adding requests: sections as required. For example:
wkc_term_assignment_resources:
  limits:
    cpu: "2"
    memory: 4Gi
  requests:
    cpu: "1"
    memory: 1Gi
You can use the oc patch command to update the configuration. See the following example:
oc patch wkc wkc-cr -n ${PROJECT_CPD_INSTANCE} --type=merge -p '{"spec":{"wkc_term_assignment_resources":{"limits":{"cpu":"2","memory":"4Gi"},"requests":{"cpu":"1","memory":"1Gi"}}}}'
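A hand-typed JSON patch is easy to get wrong, so it can help to assemble the payload from its parts before passing it to oc patch. The following is a minimal sketch, not part of the product: the helper name build_resources_patch is hypothetical, and the values mirror the example above.

```shell
#!/bin/sh
# Hypothetical helper: prints a merge patch that sets both limits and requests
# for one resource key in the wkc-cr spec.
# Arguments: <resource_key> <limit_cpu> <limit_mem> <request_cpu> <request_mem>
build_resources_patch() {
  printf '{"spec":{"%s":{"limits":{"cpu":"%s","memory":"%s"},"requests":{"cpu":"%s","memory":"%s"}}}}' \
    "$1" "$2" "$3" "$4" "$5"
}

patch=$(build_resources_patch wkc_term_assignment_resources 2 4Gi 1 1Gi)
echo "$patch"
# The payload could then be passed to the patch command, for example:
#   oc patch wkc wkc-cr -n ${PROJECT_CPD_INSTANCE} --type=merge -p "$patch"
```

Building the payload in one place keeps the limits and requests objects as siblings under the same resource key, which is the structure the requests: requirement expects.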
Installation of IBM Knowledge Catalog 5.3.1 on Power clusters with ODF storage might fail
Applies to: 5.3.1
Fixed in: 5.3.1 Patch 1
On Power clusters with ODF storage, you might see intermittent out-of-memory errors in the dp-transform-iae-thirdparty-lib-vol job during installation or restore of IBM Knowledge Catalog 5.3.1. An OOMKilled status is shown for the job when you run this command:
oc get po -l job-name=dp-transform-iae-thirdparty-lib-vol -owide
Workaround: You can apply Patch 1 to fix the issue.
Alternatively, you can update the job template:
- Get the name of the operator pod:
  oc -n ${PROJECT_CPD_INST_OPERATORS} get pods -l app.kubernetes.io/name=ibm-cpd-wkc-operator
- Copy the job template file to your local file system:
  oc -n ${PROJECT_CPD_INST_OPERATORS} rsync <wkc_op_pod_name>:/opt/ansible/5.3.1/roles/wkc-core/roles/0075-wkc-lite/dp_transform/templates/iae-thirdparty-lib-copyjars-job.yaml.j2 .
- Edit the local copy and search for limits. Increase the memory limit to 512M:
  limits:
    cpu: 500m
    memory: 512M
- Copy the file back:
  oc -n ${PROJECT_CPD_INST_OPERATORS} cp iae-thirdparty-lib-copyjars-job.yaml.j2 <wkc_op_pod_name>:/opt/ansible/5.3.1/roles/wkc-core/roles/0075-wkc-lite/dp_transform/templates/
Installation of IBM Knowledge Catalog 5.3.1 fails on FIPS-enabled clusters
Applies to: 5.3.1
Fixed in: 5.3.1 Patch 1
Installing IBM Knowledge Catalog 5.3.1 fails on clusters where FIPS mode is enabled because the wkc-bi-data-service pod crashes with readiness and liveness probe failures. This occurs in environments where the default settings for the probe initial delays are insufficient.
Workaround: You can apply Patch 1 to fix the issue.
Alternatively, you can increase the delay period. Patch the WKC-CR custom resource with updated initial delay values:
oc patch wkc wkc-cr --type json -p '[{"op":"add","path":"/spec/wkc_bi_data_service_readiness_probe_initial_delay_seconds","value":300},{"op":"add","path":"/spec/wkc_bi_data_service_liveness_probe_initial_delay_seconds","value":300}]'
Catalog and project issues
You might encounter these known issues and restrictions when you use catalogs.
Unauthorized users might have access to profiling results
Applies to: 5.3.0 and later
Users who are collaborators with any role in a project or a catalog can view an asset profile even if they don't have access to that asset at the data source level or in Data Virtualization.
Workaround: Before you add users as collaborators to a project or a catalog, make sure they are authorized to access the assets in the container and thus to view the asset profiles.
Cannot run import operations on a container package exported from another Cloud Pak for Data cluster
Applies to: 5.3.0 and later
When you're importing a container package exported from another Cloud Pak for Data cluster, permissions on the archive must be configured so that import operations are available on the target cluster and the files within the archive can be accessed.
Workaround: To extract the export archive and modify permissions, complete the following steps:
- Create a temporary directory:
  mkdir temp_directory
- Extract the archive:
  tar -xvf cpd-exports-<export_name>-<timestamp>-data.tar --directory temp_directory
- Run the following command on the target cluster:
  oc get ns $CLUSTER_CPD_NAMESPACE -o=jsonpath='{@.metadata.annotations.openshift\.io/sa\.scc\.supplemental-groups}'
  Example output: 1000700000/10000
- Apply the first part of the output of the previous step (for example, 1000700000) as the new ownership on all files within the archive. Example:
  cd temp_directory/
  chown -R 1000700000:1000700000 <export_name>
- Archive the fixed files with the directory. Use the same export name and timestamp as the original exported tar:
  tar -cvf cpd-exports-<export_name>-<timestamp>-data.tar <export_name>/
- Upload the archive.
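The ownership step uses only the part of the supplemental-groups value that precedes the slash. As a minimal sketch, that part can be extracted with shell parameter expansion; the function name first_group_id is hypothetical, and the sample value is the example output shown in the steps above.

```shell
#!/bin/sh
# Hypothetical helper: returns the part before the slash of a
# supplemental-groups value such as 1000700000/10000.
first_group_id() {
  printf '%s\n' "${1%%/*}"
}

# Sample value from the example output; on a real cluster it would come from
# the oc get ns command in the steps above.
supplemental="1000700000/10000"
owner=$(first_group_id "$supplemental")

# Print the resulting ownership command instead of running it.
echo "chown -R ${owner}:${owner} <export_name>"
```

The sketch only prints the chown command so that the value can be checked before any file ownership is changed.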
Data protection rules don't apply to column names that contain spaces
Applies to: 5.3.0 and later
If a column name contains trailing or leading spaces during import, the column cannot be masked using data protection rules.
Workaround: When you're importing columns, ensure that column names don't contain trailing or leading spaces.
Preview of data from file-based connections other than IBM Cloud Object Storage is not fully supported
Applies to: 5.3.0 and later
Connected assets from file-based connections other than IBM Cloud Object Storage do not preview correctly. Data might appear in a table with missing and/or incorrect data. There is no workaround at this time.
Scroll bar is not visible when adding assets to a project on MacOS
When adding assets to a project, the scroll bar might not be available in the Selected assets table, showing a maximum of 5 assets.
Applies to: 5.3.0 and later
Workaround: Change the MacOS settings:
- Click the Apple symbol in the top-left corner of your Mac's menu bar, then click System Settings.
- Scroll down and select Appearance.
- Under the Show scroll bars option, click the radio button next to Always.
Unexpected assets filtering results in catalogs
Applies to: 5.3.0 and later
In catalogs, when you search for an asset by using the Find assets field, the search might return assets whose names don't match the name string that you typed, as well as assets that contain a keyword in a property or a related item associated with the typed name string.
Can't use Git projects with identical data assets
Applies to: 5.3.0 and later
Identical data assets don't work with Git projects.
Workaround: To publish assets from catalog to a Git project, check out a branch in the Git project first.
Mismatched column metadata information
Applies to: 5.3.0 and later
If you add columns to an existing asset, you can see the new columns in the Assets tab, but not in the Overview or Profile tabs.
Workaround: Reprofile the asset to view the changes by running a metadata import with the Discover option.
For SQL assets, the SQL validation dropdown might not open automatically when you retest
Applies to: 5.3.0
Fixed in: 5.3.1
If you create a query-based asset and you close the SQL validation dropdown after an initial test, the dropdown does not automatically open when you click the Test query button again.
Workaround: Open the dropdown by clicking the chevron next to the SQL validation label.
Inconsistent behavior when publishing assets after metadata enrichment
Applies to: 5.3.0
Fixed in: 5.3.1
If you run a metadata enrichment on a connected data asset in a project and publish the asset back to a governed catalog from the metadata enrichment screen, the status of the asset doesn't change from Draft to Referenced. As a result, any changes to shared properties made on a catalog-level aren't reflected on a project-level for such an asset.
Workaround: Publish the asset from the asset browser instead.
Publishing asset from project to catalog fails despite success message
Applies to: 5.3.0
When publishing an asset from a project to a catalog, the system displays a success notification, but the asset does not appear in the catalog as expected.
Workaround:
Check the network console for any errors.
Data quality score not available for columns in the catalog lineage view
Applies to: 5.3.0
Fixed in: 5.3.1
You can't view data quality score for columns in the catalog lineage view for connected assets. The score is available in the catalog where the asset is published, but it doesn't show up in the lineage view of a different catalog that contains the same connected asset.
Attempt to download protected data assets is allowed
Applies to: 5.3.0 and later
After you apply a data protection rule to an asset, for example partial redacting of columns, and attempt to download it by clicking the download icon, you get a Prepare asset for download message, which confirms the download will be available through notifications. No notification ever appears, as the asset is protected and not available for download.
The Allow duplicates option isn't available when publishing assets from projects to catalog
Applies to: 5.3.1
When you're publishing assets from a project to catalog, the Allow duplicates option isn't listed as a possible duplicate action for the selected assets.
Workaround: If you want to add assets to a catalog and allow the creation of duplicates for these assets, you must add the asset directly from the catalog as a connected asset.
Description isn't available after importing data virtualization assets to a catalog
Applies to: 5.3.1
After a data virtualization asset is reimported to a catalog, the tags are retained, but the description is not.
Workaround: Add the description for the data virtualization asset after the reimport. The update is reflected in the project and across catalogs.
Profiling and previewing assets fail after metadata import with the Overwrite original assets option
Applies to: 5.3.1
If you select Overwrite original assets as the merging setting when you import asset metadata in a catalog, the Profile and Asset tabs in the catalog aren't available after the metadata import is completed.
Workaround:
- If you want to update existing asset metadata or add new metadata to columns and on an asset-level, select Update original assets instead of the Overwrite original assets option.
- If you remove existing asset metadata or relationship details, you can't use metadata import. You must remove the metadata or relationship information for each asset individually.
Profiling page doesn't show changes after reimporting an asset
Applies to: 5.3.1
If an asset was changed in the data source, for example, columns were renamed or added and this asset is reimported, the asset profile is not automatically updated. The changes are not reflected on the asset's Profile page.
Workaround: Update the asset profile manually. Go to the asset's Profile page and click Update profile.
Page crashes when working with data protection rules
Applies to: 5.3.1
When you are opening, creating, or editing a data protection rule, or switching browsers when creating a data protection rule, the page might crash intermittently.
Workaround: Reload the page or create the same rule again.
Catalog page fails to load with 502 Bad Gateway error
Applies to: 5.3.1 Patch 4
Fixed in: Hotfix for Patch 4, Patch 5
When you open a catalog from Catalogs > All catalogs, the page fails to load and shows a 502 Bad Gateway error. The portal-catalog pod might crash repeatedly, preventing access to all catalog UI functionality.
This situation might occur in the following scenarios:
- A user group added as a catalog or asset collaborator is later deleted.
- Data Virtualization is installed and configured for auto‑publishing virtual tables to a default catalog, where assets are created with a special service ID (icp4d-dev) that is incorrectly handled as a group ID in Patch 4.
The Catalog UI does not correctly handle cases where a user or group ID referenced as a catalog or asset collaborator no longer exists in IBM Software Hub user management.
Workaround:
For scenario 1 and scenario 2, apply a hotfix for the portal-catalog image on top of patch 4 as described in Catalog UI page fails with 502 Bad Gateway error when a missing user groups is used in a catalog
Can't run export and import jobs on non-FIPS clusters with cpd-cli
Applies to: 5.3.1
Fixed in: Patch 5
When you run project, space, or catalog import or export jobs on a non-FIPS cluster with cpd-cli, the jobs fail with the SSL/TLS Initialization Failure error.
Workaround:
- As instance administrator, edit the config map:
  oc edit configmap catalog-api-aux-cm -n <namespace>
- For exportjob.sh and importjob.sh, locate the following entries in the else block:
  -Dserver.ssl.key-store=${SCRATCH_SPACE}/security/cacerts
  -Djavax.net.ssl.trustStore=${SCRATCH_SPACE}/security/cacerts
  -Djavax.net.ssl.keyStore=${SCRATCH_SPACE}/security/cacerts
- Add the .p12 extension to the entries:
  -Dserver.ssl.key-store=${SCRATCH_SPACE}/security/cacerts.p12
  -Djavax.net.ssl.trustStore=${SCRATCH_SPACE}/security/cacerts.p12
  -Djavax.net.ssl.keyStore=${SCRATCH_SPACE}/security/cacerts.p12
- Save the config map and run your job again.
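The change to the three entries is the same in each case: .p12 is appended to a path that ends in /security/cacerts. The following sketch shows that rewrite with sed on sample text only. The function name add_p12_suffix is hypothetical, and the sketch assumes that each path is followed by whitespace or the end of the line, which might not match the actual script content in the config map.

```shell
#!/bin/sh
# Hypothetical helper: appends .p12 to paths that end in /security/cacerts.
# Paths that already end in .p12 are left unchanged.
add_p12_suffix() {
  sed -E 's#(/security/cacerts)([[:space:]]|$)#\1.p12\2#g'
}

# Sample input: the three entries from the steps above, kept literal by single quotes.
printf '%s\n' \
  '-Dserver.ssl.key-store=${SCRATCH_SPACE}/security/cacerts' \
  '-Djavax.net.ssl.trustStore=${SCRATCH_SPACE}/security/cacerts' \
  '-Djavax.net.ssl.keyStore=${SCRATCH_SPACE}/security/cacerts' | add_p12_suffix
```

Because entries that already carry the extension are not matched, running the rewrite a second time does not append .p12 twice.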
Governance artifacts issues
You might encounter these known issues and restrictions when you use governance artifacts.
Error Couldn't fetch reference data values shows up on screen after publishing reference data
Applies to: 5.3.0 and later
When new values are added to a reference data set, and the reference data set is published, the following error is displayed when you try to click on the values:
Couldn't fetch reference data values. WKCBG3064E: The reference_data_value for the reference_data which has parentVersionId: <ID> and code: <code> does not exist in the glossary. WKCBG0001I: Need more help?
When the reference data set is published, the currently displayed view changes to Draft-history, as marked by the green label at the top. The Draft-history view does not let you view the reference data values.
Workaround: To view the values, click Reload artifact so that you can view the published version.
Publishing large reference data sets fails with Db2 transaction log full
Applies to: 5.3.0 and later
Publishing large reference data sets might fail with a Db2 error such as:
The transaction log for the database is full. SQLSTATE=57011
Workaround: Publish the set in smaller chunks, or increase Db2 transaction log size as described in the following steps.
- Modify the transaction log settings with the following commands. LOGPRIMARY 5 is the default value and should not be changed:
  db2 update db cfg for bgdb using LOGPRIMARY 5
  db2 update db cfg for bgdb using LOGSECOND 251
  db2 update db cfg for bgdb using LOGFILSIZ 20480
- Restart Db2.
You can calculate the required transaction log size as follows:
(LOGPRIMARY + LOGSECOND) * LOGFILSIZ
For publishing large sets, the following Db2 transaction log sizes are recommended:
- 5GB for 1M reference data values and 300K relationships
- 20GB for 1M reference data values and 1M relationships
- 80GB for 1M reference data values and 4M relationships
where the relationship count is the sum of the parent, term and value mapping relationships for reference data values in the set.
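As a worked example of the formula, the following sketch computes the log space for the LOGPRIMARY, LOGSECOND, and LOGFILSIZ values used in the workaround. It assumes that LOGFILSIZ is expressed in 4 KB pages, which is how Db2 documents that parameter but is not stated in this topic; the function name log_space_bytes is hypothetical.

```shell
#!/bin/sh
# Total transaction log space in bytes:
# (LOGPRIMARY + LOGSECOND) * LOGFILSIZ, with LOGFILSIZ counted in 4 KB pages.
# Arguments: <LOGPRIMARY> <LOGSECOND> <LOGFILSIZ>
log_space_bytes() {
  echo $(( ($1 + $2) * $3 * 4096 ))
}

bytes=$(log_space_bytes 5 251 20480)
echo "$bytes bytes, $(( bytes / 1024 / 1024 / 1024 )) GiB"
# prints "21474836480 bytes, 20 GiB"
```

Under that assumption, the settings in the workaround provide about 20 GB of log space, which corresponds to the recommendation for 1M reference data values and 1M relationships.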
Imported data assets with assigned out-of-the-box data classes or terms have incorrect identifiers resulting in no enforcement of data protection rules
When you migrate data assets across Cloud Pak for Data instances and these assets have out-of-the-box data classes or terms assigned, the imported data assets indicate correct data class or term assignments but the assigned artifact ID is incorrect. As a result, any operations that reference the data class or term, such as data protection rules, can't be applied to the imported data assets.
Relationships between catalog assets and out-of-the-box governance artifacts cannot be migrated correctly.
Applies to: Cloud Pak for Data 4.0 and later
Workaround: None
Can't cascade delete a Knowledge Accelerator category
Applies to: 5.3.0 and later
If you run a cascade delete on a Knowledge Accelerator category, the operation might fail due to deadlocks.
Workaround: If deadlocks occur, retry the cascade delete on the same root category until the category is deleted.
wkc-glossary-service pod restarting when updating or creating multiple business terms
When you update or create large numbers of business terms, the wkc-glossary-service pod restarts because it reaches its CPU limit.
Applies to: 5.3.0 and later
Workaround: Increase the CPU limit for wkc-glossary-service as described in Manually scaling resources for services.
Column data is not redacted in accordance with the rule precedence
Applies to: 5.3.0 and later
Some advanced partial masking options get ignored by the rule precedence mechanism when the rules resolution is executed, which results in incorrect data redaction.
Workaround: If there are multiple rules with partial masking specified, ensure that all of the partial masking columns are the same in all of these rules for the same data class.
Publishing asset from project to catalog fails despite success message
Applies to: 5.3.0
When publishing an asset from a project to a catalog, the system displays a success notification, but the asset does not appear in the catalog as expected.
Workaround:
Check the network console for any errors.
Bulk edit does not enforce mandatory properties
Applies to: 5.3.1
- Bulk edit does not validate missing mandatory values for custom properties when these properties are restricted to a specific category. Validation recognizes and filters out only the global mandatory custom properties. As a result of the bulk edit, updates are applied to artifacts with missing values. Such edits would be blocked when editing a single artifact.
- Bulk edit validation does not recognize and filter out attributes with missing mandatory custom relationships.
Workaround: None
ZIP import with specified or empty merge method issues for mandatory custom properties and relationships
Applies to: 5.3.1
Validation of mandatory custom properties and relationships doesn't work in ZIP import, if the merge method is set to specified or empty. Artifacts with missing values for mandatory custom properties or relationships are imported without any warning.
Also, applying default values in ZIP import with those merge methods doesn't work.
Workaround: Set ZIP import merge option to all.
Applying default values for the Predefined values mandatory property type doesn't work correctly in ZIP import
Applies to: 5.3.1
Applying default values in ZIP import with the all merge method doesn't work for the Predefined values mandatory custom property type. The following error is returned:
Custom property {name} contains invalid value.
As a result, the affected artifact is not imported due to the mandatory value missing.
Workaround: None
Changing artifact's primary category to a category for which mandatory custom properties apply is blocked
Applies to: 5.3.1
Moving an artifact to a category for which a mandatory custom property is defined is blocked because the mandatory values are missing.
Workaround: The administrator can temporarily remove the mandatory checkbox from the custom property definition. After the artifact is moved to the category, the mandatory checkbox can be re-enabled.
Providing missing mandatory custom relationship in a category, which already has other mandatory custom relationships specified, fails
Applies to: 5.3.1
When a new mandatory custom relationship definition for the Category artifact type is created, the user is asked to fill in the missing mandatory custom relationship value before editing the category in any way. If that category already has other mandatory custom relationships assigned, providing a value for the missing one fails.
Once there is at least one mandatory custom relationship value assigned to a category, creating another mandatory custom relationship definition applicable to this category renders the category uneditable.
Workaround: All the mandatory custom relationships need to be assigned to the categories in one edit action. Alternatively, the administrator can remove the mandatory checkbox from the custom relationship definition.
Editing the Make this source a required field setting in a custom relationship definition takes a long time or times out
Applies to: 5.3.1
If a custom relationship definition is restricted by a category, editing the Make this source a required field setting might take a few minutes or time out.
Workaround: None
Editing artifacts fails after upgrading from 5.1.x
Applies to: 5.3.0, 5.3.1
After upgrading from 5.1.x to 5.3.x, it might not be possible to edit governance artifacts that were created prior to the upgrade. When trying to modify an artifact, the following message is displayed:
Unable to execute action
The following exception can be found in the logs:
"exception":"\njava.lang.NullPointerException: Cannot invoke "java.lang.Integer.intValue()" because the return value of "com.ibm.glossary.model.coreentity.AbstractBusinessEntity.getVersionSequence()" is null\n
Workaround: Download and run the corrective scripts as described in this support document.
Workflows issues
You might encounter these known issues and restrictions when you use workflows.
Connection of wkc-workflow-service to its Postgres database does not use SSL/TLS encryption
Applies to: 5.3.0
Fixed in: IBM Knowledge Catalog 5.3.0 and IBM watsonx.data intelligence 2.3.0 patch
Starting in version 5.2.0, the workflow service uses a Postgres database as its internal database, which resides on the same OpenShift cluster. When connecting to the database, wkc-workflow-service does not use SSL/TLS encryption. Therefore, this intra-cluster connection is unencrypted.
Metadata import issues
You might encounter these known issues when you work with metadata import.
Assets are not imported from the IBM Cognos Analytics source when the content language is set to Japanese
Applies to: 5.3.0 and later
If you import metadata from a Cognos Analytics connection where the user's content language is set to Japanese, no assets are imported. The issue occurs when you create a metadata import with the Get BI report lineage goal.
Workaround: In Cognos Analytics, change the user's content language from Japanese to English. Find the user for which you want to change the language, and change this setting in the Personal tab. Run the metadata import again.
Dummy assets get created for any file assets that come from Amazon S3 to show the complete business data lineage if Get ETL job lineage is performed
Applies to: 5.3.0 and later
If you run a Get ETL job lineage import that involves an Amazon S3 connection, dummy assets are created for any file assets that come from that connection so that the complete business data lineage can be shown. If you then run a metadata import for the same Amazon S3 connection, you end up with duplicates: the dummy asset created by the Get ETL job lineage import and the valid asset discovered during the metadata import.
SocketTimeoutException during metadata import
Applies to: 5.3.0 and later
During metadata import, when records from a CSV file that contains more than 30,000 rows are read, SocketTimeoutException is returned. This indicates a network issue where the connection between the client and server was unexpectedly closed.
Workaround:
- Log in to the OpenShift console.
- Go to Workloads > Pods > metadata-discovery-pod.
- Go to the Environment section.
- Search for the manta_wf_export_download environment variable and set it to true.
  Example:
  manta_wf_export_download=true
  By setting the variable, you're bypassing the socket timeout issue and downloading the CSV file to the local system. As a result, the CSV file can be read locally rather than over the network. After the CSV file is read, the locally downloaded file is deleted from the local system.
Business lineage shows Name not available when you're using Get BI report lineage option to import Tableau data assets
Applies to: 5.3.0 and later
When you use the Get BI report lineage option to import Tableau data assets, business lineage for some of these assets shows Name not available instead of the actual data asset names. This is caused by extra object types that are generated by the MANTA Automated Data Lineage service.
Workaround: Use technical data lineage to show the names of all data assets.
Reference assets are not marked as Removed from scope or Removed from source
Applies to: 5.3.0
If you remove an asset that was initially imported into the catalog with the associated primary and foreign keys from the metadata import scope or remove the asset from the data source, the asset itself is marked accordingly as Removed from scope or Removed from source in the catalog on reimport, but the associated PK and FK constraint assets aren't. However, when you delete the data asset from the catalog, the associated constraint assets are also deleted.
Assets removed from the metadata import scope are marked incorrectly
Applies to: 5.3.0
Fixed in: IBM Knowledge Catalog 5.3.0 and IBM watsonx.data intelligence 2.3.0 patch
If you remove an asset from the metadata import scope and rerun the import, the data asset in the catalog has the tag Removed from source instead of Removed from scope.
Metadata enrichment issues
You might encounter these known issues when you work with metadata enrichment.
Running primary key or relations analysis doesn't update the enrichment and review statuses
Applies to: 5.3.0 and later
The enrichment status is set or updated when you run a metadata enrichment with the configured enrichment options (Profile data, Analyze quality, Assign terms). However, the enrichment status is not updated when you run a primary key analysis or a relationship analysis. In addition, the review status does not change from Reviewed to Reanalyzed after review if new keys or relationships were identified.
Issues with the Microsoft Excel add-in
Applies to: 5.3.0 and later
The following issues are known for the Review metadata add-in for Microsoft Excel:
- When you open the drop-down list to assign a business term or a data class, the entry Distinctive name is displayed as the first entry. If you select this entry, it shows up in the column but does not have any effect.
- Updating or overwriting existing data in a spreadsheet is currently not supported. You must use an empty template file whenever you retrieve data.
- If another user works on the metadata enrichment results while you are editing the spreadsheet, the other user's changes can get lost when you upload the changes that you made in the spreadsheet.
- Only assigned data classes and business terms are copied from the spreadsheet columns Assigned / suggested data classes and Assigned / suggested business terms to the corresponding entry columns. If multiple business terms are assigned, each one is copied to a separate column.
Republishing doesn't update primary key information in catalog
Applies to: 5.3.0 and later
If you remove primary key information from a data asset that initially was published with the primary key information to a catalog with the duplicate-asset handling method Overwrite original assets in the metadata enrichment results and then republish the asset to that catalog, the primary key information on the catalog asset remains intact.
Workaround: Delete the existing catalog asset before you republish the data asset from the metadata enrichment results.
Masked data might be profiled when the data source is IBM watsonx.data
Applies to: 5.3.0 and later
If a user who is not the owner of a protected data asset in IBM watsonx.data adds such asset to a project and runs metadata enrichment on it, the masked data is sent for profiling. As a result, even the asset owner will see the profile with masked data.
Workaround: None.
In small system configurations, large metadata enrichment jobs can fail
Applies to: 5.3.0 and later
In a system that is configured for smaller workloads, metadata enrichments that contain a lot of assets can fail with out-of-memory errors.
Workaround: To resolve the issue, you have these options:
- Increase the CPU and memory values of the profiling pod.
- Add one more replica.
Update these parameters in the wkc-cr custom resource:
- For the number of replicas: the wdp_profiling_min_replicas and wdp_profiling_max_replicas values
- For the CPU and memory values: the requests and limits entries for the wdp_profiling_resources parameter
Use the oc patch command to update the values. See the following example:
oc patch wkc wkc-cr -n ${PROJECT_CPD_INSTANCE} --type=merge -p '{"spec":{"wdp_profiling_min_replicas":"4","wdp_profiling_max_replicas":"4","wdp_profiling_resources":{"requests":{"cpu": "300m", "memory": "600Mi"}, "limits":{"cpu": "4000m", "memory": "8192Mi"}}}}'
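If you need to try several replica counts or limits, it can help to generate the patch document instead of editing the JSON by hand. The following is a minimal sketch: the helper name build_profiling_patch is hypothetical, and the request values are simply copied from the example above.

```shell
# Print a merge patch for the profiling settings in the wkc-cr custom resource.
# Arguments: replica count, CPU limit, memory limit.
# The request values (300m, 600Mi) mirror the documented example.
build_profiling_patch() {
  printf '{"spec":{"wdp_profiling_min_replicas":"%s","wdp_profiling_max_replicas":"%s","wdp_profiling_resources":{"requests":{"cpu":"300m","memory":"600Mi"},"limits":{"cpu":"%s","memory":"%s"}}}}\n' \
    "$1" "$1" "$2" "$3"
}

# Usage (requires oc and an active login session):
# oc patch wkc wkc-cr -n "${PROJECT_CPD_INSTANCE}" --type=merge -p "$(build_profiling_patch 4 4000m 8192Mi)"
```

The sketch sets the minimum and maximum replica count to the same value, as the documented example does.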
Data quality output configuration in metadata enrichment can't be removed
Applies to: 5.3.0
Fixed in: 5.3.1
After you save a metadata enrichment that you configured with the Run data quality analysis objective and a custom data quality output table, you can no longer remove the output table configuration from the metadata enrichment. When you edit the metadata enrichment, you can click Customize in the Run data quality analysis tile and then switch off the output table setting, but that change doesn't persist.
Workaround: To remove the configured data quality output database from the metadata enrichment, complete these steps:
- Generate a token as described in Generating a bearer token.
- Set the HOST variable:
  export HOST=<host_name>
  where <host_name> is the hostname of your deployment.
- Set the PROJECT_ID variable:
  export PROJECT_ID=<project_id>
  where <project_id> is the ID of the project that holds the metadata enrichment. You can find this ID in the browser URL between the project_id= and &context strings.
- Set the MDE_ID variable:
  export MDE_ID=<mde_id>
  where <mde_id> is the ID of the metadata enrichment. You can find this ID in the browser URL between the /display/ and ?project_id= strings.
- Set the TOKEN variable:
  export TOKEN=<access_token_value>
  where <access_token_value> is the token that you created in the first step.
- Run the following cURL command:
  curl -X PATCH -H "Content-Type: application/merge-patch+json" -H "Authorization: Bearer $TOKEN" "https://$HOST/metadata_enrichment/v3/metadata_enrichment_assets/$MDE_ID?project_id=$PROJECT_ID" -d '{"objective":{"data_quality":{"structured":{"dq_exceptions_database":null,"dq_exception_records_database":null}}}}'
  You might need to run the command with the -k option if the certificate verification fails.
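Instead of copying the two IDs by hand, you can cut them out of the browser URL with shell parameter expansion. The following sketch assumes only what is described above: the metadata enrichment ID sits between /display/ and ?project_id=, and the project ID sits between project_id= and &context. The function names and the sample URL are hypothetical.

```shell
# Print the metadata enrichment ID: the text between /display/ and ?project_id=
extract_mde_id() (
  rest=${1#*/display/}          # drop everything up to and including /display/
  id=${rest%%\?project_id=*}    # drop ?project_id= and everything after it
  printf '%s\n' "$id"
)

# Print the project ID: the text between project_id= and &context
extract_project_id() (
  rest=${1#*project_id=}        # drop everything up to and including project_id=
  id=${rest%%&context*}         # drop &context and everything after it
  printf '%s\n' "$id"
)

# Usage with a made-up URL:
# url='https://cpd.example.com/some/path/display/1234-abcd?project_id=9876-ffff&context=icp4data'
# export MDE_ID=$(extract_mde_id "$url")
# export PROJECT_ID=$(extract_project_id "$url")
```

Quote the URL in single quotes when you pass it to the functions so that the shell does not interpret the & and ? characters.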
Primary category for generated terms is added to the category scope for term assignment
Applies to: 5.3.0 and later
Fixed in: 5.3.1 Patch 1
In deployments where automatic glossary generation is available (generative AI is enabled), a primary category for the generated terms is required. If you don't select a category, the [uncategorized] category is set by default.
The selected or set category is automatically added to the category scope for enrichment. Thus, additional data classes, business terms, and classifications might be considered for enrichment, for example, for term assignment.
Workaround: You can apply Patch 1 to fix the issue.
Alternatively, you can select one of the categories that you included in the enrichment category scope as the primary category for generated terms.
Data quality issues
You might encounter these known issues when you work with data quality assets.
Rules bound to columns of the data type NUMERIC in data assets from Oracle data sources might not work
Applies to: 5.3.0 and later
Testing or running a data quality rule that is bound to a NUMERIC column in a data asset from an Oracle data source fails if the data source is connected through a Generic JDBC connection.
Workaround: Use the native connector.
Runs of migrated data quality rules complete with warnings
Applies to: 5.3.0 and later
When you run a data quality rule that was migrated from InfoSphere Information Server, you might see a message like the following one:
7/2/2025 11:22:30 WARNING IIS-DSEE-TFIP-00072 <Modify_1> When checking operator: When binding output schema variable "outRec": When binding output interface field "col1" to field "col1": Implicit conversion from source type "int32" to result type "string[variable_max=10]": Converting number to string.
Such warnings are displayed when you run the DataStage job for a data quality rule that was created in InfoSphere Information Server as a Rule Stage with an output link of the type Violation details and then migrated to Cloud Pak for Data.
Workaround: You can ignore such warnings, or set up a DataStage message handler to suppress such messages or to reduce their severity.
Data quality rules with duplicated variable names from multiple data quality definitions might not work
Applies to: 5.3.0 and later
When you create a data quality rule from multiple data quality definitions that reference the same variable name, the variables are internally renamed for differentiation by adding a suffix. If the renaming results in a name collision with other variables from the definitions, the data quality rule will not work.
Can't drill-down into data quality output on DataStax Enterprise
Applies to: 5.3.0 and later
When you try to access details of data quality issues that are stored in a table on DataStax Enterprise, an error similar to the following one is displayed:
Loading failed values failed.
There was a problem loading the values that
failed profiling.
Internal Server Error Failed to get Flight info: CDICO2005E: Table could not be found:
SQL syntax error: [IBM][Cassandra JDBC Driver][Cassandra]syntax error or access rule
violation: base table or view not found: dgexceptions. If the table exists, then the user
may not be authorized to see it.
This error occurs because the name of the data quality output table is created in a way that makes it case-sensitive in the underlying Cassandra database so that an uppercase name is generated. However, the SELECT statement for the lookup is constructed in a way that Cassandra search looks for a lowercase table name.
Data quality scores might not be available for data quality rules created in earlier versions
Applies to: 5.3.0
Fixed in: 5.3.1
In some cases, no data quality scores are available for a rule that was created in an earlier product version. This issue is caused by missing asset relationships. Such relationships are usually set up when rules from an earlier release are migrated during the upgrade to the new product version.
Workaround: Complete the following steps:
- Open the affected data quality rule.
- In the Related items, select Add related items > Add related columns.
- Select the Validates data quality of relationship and click Next.
- Select a data asset and then the columns for which you want the rule to report a score, and click Add.
Can't create rules or SQL assets with plain-text queries for data quality output tables created before 5.3.1
Applies to: 5.3.0 and later
For data quality rule output tables that were created in version 5.3.0 or earlier, you cannot create a new SQL-based data quality rule by using plain text queries. Also, you cannot create SQL assets (assets of the type query) from these tables.
Generative AI capabilities
You might encounter these known issues when you work with the generative AI capabilities in metadata enrichment or use the Text2SQL feature.
Socket timeout errors can occur for Text2SQL API calls
Applies to: 5.3.0 and later
Socket timeout errors can occur for Text2SQL API calls if the load balancer timeout setting for watsonx.ai is too low.
Workaround: Check the load balancer configuration and make sure that the timeout is set to 600s (10m). For more information, see Changing load balancer timeout settings.
Can't work with a remote watsonx.ai instance if the system is configured to run models on CPU
Applies to: 5.3.0 and later
If you installed IBM Knowledge Catalog or watsonx.data intelligence with the option enableModelsOn:cpu, the models used for the generative AI capabilities run locally on CPU. You cannot work with foundation models on a remote watsonx.ai instance.
For more information about setting up a connection with a remote watsonx.ai instance, see Configuring the setup for enabled gen AI capabilities in the Cloud Pak for Data documentation or Configuring the setup for enabled gen AI capabilities in the watsonx.data intelligence documentation.
Workaround: To be able to work with remote models, change your system configuration:
- Log in to Red Hat OpenShift Container Platform as a user with sufficient permissions to complete the task.
  oc login <your_openshift_cluster_url>
- Set the context to the project where IBM Knowledge Catalog or watsonx.data intelligence is deployed:
  oc project ${PROJECT_CPD_INST_OPERANDS}
- Patch the semanticautomation-cr custom resource to set the installation parameter enableModelsOn to remote.
  oc patch sal semanticautomation-cr --type=merge -p '{"spec":{"enableModelsOn": "remote"}}'
Project collaborators with the Viewer role can use the toggle to enable or disable the Data intelligence settings
Applies to: 5.3.0
Fixed in: 5.3.1
Project collaborators with the Viewer role can switch the toggle setting for enabling or disabling the Data intelligence settings even though they have only read access to the settings. As a result, changing the toggle setting returns an error similar to the following one:
Disabling failed
Disabling generative AI capabilities failed. Try again.
The project settings page for data intelligence shows the main IBM Cloud Pak for Data toolbar
Applies to: 5.3.0
Fixed in: 5.3.1
In the IBM Cloud Pak for Data and Data Fabric experiences, the project settings page for data intelligence shows an additional IBM Cloud Pak for Data toolbar. This additional toolbar does not impact the functionality of the settings UI. You can still change the settings as required.
Known issues with the Text-to-SQL capability
Applies to: 5.3.0
Fixed in: IBM Knowledge Catalog 5.3.0 and IBM watsonx.data intelligence 2.3.0 patch and 5.3.1
You might encounter the following known issues for the Text-to-SQL feature:
- Even if the Text-to-SQL capability is not enabled for IBM Knowledge Catalog or watsonx.data intelligence in IBM Software Hub, the UI for SQL assets and SQL-based data quality rules shows a notification asking to enable natural language queries.
- If a project contains similar data assets from various connections, incorrect SQL statements are created when you use plain text queries to create SQL assets or SQL-based data quality rules.
- Incorrect SQL statements might be created when you use plain text queries to create SQL assets from a PostgreSQL connection.
- SQL assets (assets of the type query) might be processed when you use plain text queries to create new SQL assets or SQL-based data quality rules.
- Even if the Text-to-SQL capability is enabled for IBM Knowledge Catalog or watsonx.data intelligence in IBM Software Hub, the create wizards for SQL assets and SQL-based data quality rules do not show the tag.
- If IBM Knowledge Catalog or watsonx.data intelligence is installed with the installation option enableModels=CPU, the Text-to-SQL capability should not be enabled.
MANTA Automated Data Lineage
You might encounter these known issues and restrictions when MANTA Automated Data Lineage is used for capturing lineage.
Metadata import jobs for getting lineage might take very long to complete
Applies to: 5.3.0
If multiple lineage scans are requested at the same time, the corresponding metadata import jobs for getting lineage might take a very long time to complete, because MANTA Automated Data Lineage workflows can't run in parallel but are executed sequentially.
Chrome security warning for Cloud Pak for Data deployments where MANTA Automated Data Lineage for IBM Cloud Pak for Data is enabled
Applies to: 5.3.0
When you try to access a Cloud Pak for Data cluster that has MANTA Automated Data Lineage for IBM Cloud Pak for Data enabled from the Chrome web browser, the message Your connection is not private is displayed and you can't proceed.
This is due to MANTA Automated Data Lineage for IBM Cloud Pak for Data requiring an SSL certificate to be applied and occurs only if a self-signed certificate is used.
Workaround: To bypass the warning for the remainder of the browser session, type thisisunsafe anywhere on the window. Note that this code changes every now and then. The mentioned code is valid as of the date of general availability of Cloud Pak for Data 4.6.0. You can search the web for the updated code if necessary.
Can't Get lineage with a DB2 connection in FIPS environments
Applies to: 5.3.0 and later
If you try to import metadata for data assets with the Get lineage scenario with a DB2 connection in a FIPS environment, the metadata import fails and the following error message is displayed.
Error creating the metadata import Metadata import creation failed due to connection validation errors. Not all connections in the metadata import passed validation. Check the log for the complete validation results.
Data Virtualization connection fails when SSL is required
Applies to: 5.3.0 and later
When you create a connection to Data Virtualization and SSL is required, the metadata import lineage job fails with an exception even though the connection validation passed.
Workaround:
- In the project where the metadata lineage import job will be executed, create a new independent Data Virtualization connection.
- Download the Data Virtualization SSL certificate from CP4D Menu > Data > Data virtualization > Menu > Configure connection > Download SSL Certificate.
- In the project where you run the MDI Lineage job, create a new non-Platform Data Virtualization connection with exactly the same details as the Data Virtualization Platform connection in CP4D Menu > Data > Connections and enter the SSL certificate contents in the SSL certificate field.
- Use that connection to run the metadata lineage import job.
Data export is incomplete due to table deduction in Query Service
Applies to: 5.3.0 and later
When you reference a table from another system in the Query Service using the synonym of that table, then the resulting object is deduced and exported data is incomplete.
Validation shows complete for Data Virtualization metadata import jobs
Applies to: 5.3.1
When you create a metadata import for Data Virtualization, the created job is shown as validated even if the user has insufficient permissions. This might result in missing metadata and errors in the MANTA extraction log.
Business lineage issues
You might encounter these known issues and restrictions with lineage.
Business data lineage is incomplete for the metadata imports with Get ETL job lineage or Get BI report lineage goals
Applies to: 5.3.0 and later
In some cases, when you display business lineage between databases and ETL jobs or BI reports, some assets are missing, for example, a starting database. The data was imported by using the Get ETL job lineage or Get BI report lineage import option. Technical data lineage correctly shows all assets.
Workaround: Sometimes MANTA Automated Data Lineage cannot map the connection information from an ETL job or a BI report to the existing connections in IBM Knowledge Catalog. Follow these steps to solve the issue:
- Open the MANTA Automated Data Lineage Admin UI:
  https://<CPD-HOSTNAME>/manta-admin-gui/
- Go to Log Viewer and from the Source filter select Workflow Execution.
- From the Workflow Execution filter, select the name of the lineage workflow that is associated with the incomplete business lineage.
- Look for the dictionary_manta_mapping_errors issue category and expand it.
- In each entry, expand the error and click View Log Details.
- In each error details, look for the value of connectionString. For example, in the following error message, the value of the connectionString parameter is DQ DB2 PX.
  2023/11/14 18:40:12.186 PM [CLI] WARN - <provider-name> [Context: [DS Job 2_PARAMETER_SET] flow in project [ede1ab09-4cc9-4a3f-87fa-8ba1ea2dc0d8_lineage]] DICTIONARY_MANTA_MAPPING_ERRORS - NO_MAPPING_FOR_CONNECTION
  User message: Connection in use could not be automatically mapped to one of the database connections configured in MANTA.
  Technical message: There is no mapping for the connection Connection [type=DB2, connectionString=DQ DB2 PX, serverName=dataquack.ddns.net, databaseName=cpd, schemaName=null, userName=db2inst1].
  Solution: Identify the particular database technology DB2 leading to "DQ DB2 PX" and configure it as a new connection or configure the manual mapping for that database technology in MANTA Admin UI.
  Lineage impact: SINGLE_INPUT
- Depending on the connection that you used for the metadata import, go to Configuration > CLI > connection server > connection server Alias Mapping, for example DB2 > DB2 Alias Mapping.
- Select the connection used in workflow and click Full override.
- In the Connection ID field, add the value of the connectionString parameter that you found in the error details, for example DQ DB2 PX.
- Rerun the metadata import job in IBM Knowledge Catalog.
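If you copy the log details to a file, the connectionString value can be pulled out with sed rather than read by eye. This is a minimal sketch that assumes the message format shown in the example above, where the value is followed by a comma. The function name is hypothetical and is not part of MANTA Automated Data Lineage.

```shell
# Read log text on standard input and print the connectionString value
# of every line that contains one (the text between "connectionString=" and the next comma).
extract_connection_string() {
  sed -n 's/.*connectionString=\([^,]*\),.*/\1/p'
}

# Usage, assuming the log details were saved to a local file:
# extract_connection_string < manta-error-details.log
```

For the example message above, the function prints DQ DB2 PX, which is the value to enter in the Connection ID field.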
Hops for components and columns of components work only inside the data integration flow area of an expanded job node
Applies to: 5.3.0 and later
When working with the lineage graph, hops for components and columns of data integration components work only inside the data integration flow area of an expanded job node and don't connect columns of nodes outside of the flow area.
Viewing large business lineage graph results in a service error
Applies to: 5.3.0 and later
When you click the Lineage tab for a large business lineage graph, after around two minutes the following error message is displayed:
Lineage service error. An error occurred in the lineage service. Try to reload the lineage graph, and if the error persists, contact your administrator.
When you click the Lineage tab again, the lineage generation process starts, but after a while the following error is displayed:
Error 404 – Not Found
Workaround: To fix the issue, modify some of the configuration settings:
- Increase the timeout value in the nginx pod:
  - Open the configuration YAML file by running the command oc edit zenextension wkc-base-routes -n ${PROJECT_CPD_INSTANCE}.
  - In the section location /data/catalogs, change the value of the proxy_read_timeout variable to 10m, and save the changes.
  - Restart the nginx pod by running the command oc rollout restart deploy/ibm-nginx -n ${PROJECT_CPD_INST_OPERANDS}
- Allocate more memory to the service wkc-data-lineage-service, 8 GB or more, and enable horizontal scaling for the service:
  - Open the lineage deployment for editing by running the command oc edit deploy -n ${PROJECT_CPD_INSTANCE} wkc-data-lineage-service.
  - In the containers section, change the default graph memory threshold by adding the variable lineage_memory_threshold_limit:
    - name: lineage_memory_threshold_limit
      value: "0.5"
  - In the resources/limits section, increase the default memory allocation to 8192Mi:
    resources:
      limits:
        cpu: "2"
        ephemeral-storage: 1Gi
        memory: 8192Mi
      requests:
        cpu: 250m
        ephemeral-storage: 50Mi
        memory: 512Mi
  - Change these settings to horizontally scale the lineage service to 2 or 3 instances:
    spec:
      progressDeadlineSeconds: 600
      replicas: 3
      revisionHistoryLimit: 10
After you modify the configuration, try to open the Lineage tab again. If the error still occurs, change the value of the lineage_memory_threshold_limit parameter to 0.4.
Relationship explorer issues
You might encounter these known issues and restrictions with relationship explorer.
404 Error occurs when opening relationship explorer
Applies to: 5.3.0 and later
Clicking the Explore relationships button from a catalog or project asset results in a 404 error.
Workaround: Check if your Cloud Pak for Data experience includes the IBM Knowledge Catalog any edition service. The relationship explorer is an IBM Knowledge Catalog any edition feature and can’t be accessed without it.
Primary and foreign keys are not visible for Google BigQuery assets
Applies to: 5.3.0 and later
When you select a Google BigQuery asset on the relationship explorer canvas, the Item metadata panel does not show primary and foreign keys.
Some relationships are not displayed in relationship explorer after the upgrade
Applies to: 5.3.0 and later
After upgrading from 5.1.x or 5.2.x to 5.3.0, relationships created before the upgrade are not displayed in relationship explorer. Relationships created after the upgrade display correctly.
Workaround: After the upgrade, in the cluster where Knowledge Graph is deployed, delete the sync job. Then, run the reconciliation of the Knowledge Graph custom resource.
Error 404 when trying to view data asset from the relationship explorer
Applies to: 5.3.1
The Go to data asset link in the relationship explorer metadata view for an asset might not work if you also change the workspace. This issue only occurs when you select a different workspace than the one originally selected.
Workaround: Navigate to the project and view the asset from there.
Original asset names instead of current asset names show in relationship explorer when opening it from the project
Applies to: 5.3.0 and later
After you update an asset name in the catalog, add the asset to the project, and open relationship explorer from there, the original asset name shows instead of the updated asset name, even though the display name settings are set to show the current asset name.
Limitations
Installing, upgrading, and uninstalling
When uninstalling Manta Data Lineage, re-installing IBM Knowledge Catalog runs into issues
Applies to: 5.3.0 and later
You can install Manta Data Lineage with IBM Knowledge Catalog. If you uninstall Manta Data Lineage, and then try to re-install the wkc-cr for IBM Knowledge Catalog, you might run into issues. The wkc-post-install-init pod might fail to restart.
Workaround: To fix this issue, restart the ibm-nginx pods, then restart the wkc-operator pod. This will put the wkc-operator in the completed state.
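The restart order in this workaround can be sketched as a small script. This is a minimal sketch under stated assumptions: the ibm-nginx deployment and the PROJECT_CPD_INST_OPERANDS variable appear elsewhere in this document, but the operator's project and deployment name are not given here, so they are passed in as arguments that you must look up yourself. The run helper, DRY_RUN variable, and function name are hypothetical.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1: project (namespace) that contains the wkc-operator deployment (look this up).
# $2: name of the wkc-operator deployment (look this up).
restart_nginx_then_operator() {
  operator_ns="$1"
  operator_deploy="$2"
  operands_ns="${PROJECT_CPD_INST_OPERANDS:?set PROJECT_CPD_INST_OPERANDS first}"

  # Restart the ibm-nginx pods first and wait for the rollout to finish.
  run oc rollout restart deploy/ibm-nginx -n "$operands_ns"
  run oc rollout status deploy/ibm-nginx -n "$operands_ns"
  # Then restart the operator pod.
  run oc rollout restart "deploy/$operator_deploy" -n "$operator_ns"
}
```

With DRY_RUN=1 the function only prints the commands, which is a quick way to confirm the names before anything is restarted.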
Catalogs and projects
Duplicate actions fail if dynamic IP addresses are used
Applies to: 5.3.0 and later
Duplicate actions work only for connections with static IP addresses. If the connection is using a hostname with a dynamic IP address, duplicate actions might fail during connection creation.
Long names of the asset owners get truncated when hovering over their avatars
Applies to: 5.3.0 and later
When you hover over the avatar to show the long name of the asset owner in the side panel, the name gets truncated if it is longer than 40 characters or contains a space or a special character. A name that is longer than 40 characters displays correctly as long as it contains a space or a hyphen (-) within the first 40 characters.
Can't add individual group members as asset members
Applies to: 5.3.0 and later
You can't add individual group members as asset members. You can add individual group members as catalog collaborators and then as asset members.
Catalog asset search doesn't support special characters
Applies to: 5.3.0 and later
If search keywords contain any of the following special characters, the search filter doesn't return the most accurate results.
Search keywords:
. + - && || ! ( ) { } [ ] ^ " ~ * ? : \
Workaround: To obtain the most accurate results, search only for the keyword after the special character. For example, instead of AUTO_DV1.SF_CUSTOMER, search for SF_CUSTOMER.
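If search keywords are generated by a script, the workaround can be applied automatically. The following is a minimal sketch with a hypothetical function name. It keeps only the text after the last special character from the list above and, as a simplification, treats a single & or | as special even though the list names only && and ||.

```shell
# Print the part of keyword $1 that follows the last special character.
# A keyword without special characters is printed unchanged.
# An empty result means nothing follows the last special character.
keyword_after_special() {
  printf '%s\n' "$1" | LC_ALL=C sed 's/.*[][+.!(){}^"~*?:&|\\-]//'
}
```

For example, keyword_after_special 'AUTO_DV1.SF_CUSTOMER' prints SF_CUSTOMER, which matches the example in the workaround. The underscore is not in the list of special characters, so it is preserved.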
Missing default catalog and predefined data classes
Applies to: 5.3.0 and later
The automatic creation of the default catalog after installation of the IBM Knowledge Catalog service can fail. If it does, the predefined data classes are not automatically loaded and published as governance artifacts.
Workaround: Ask someone with the Administrator role to follow the instructions for creating the default catalog manually.
Special or double-byte characters in the data asset name are truncated on download
Applies to: 5.3.0 and later
When you download a data asset with a name that contains special or double-byte characters from a catalog, these characters might be truncated from the name. For example, a data asset named special chars!&@$()テニス.csv will be downloaded as specialchars!().csv.
The following character sets are supported:
- Alphanumeric characters: 0-9, a-z, A-Z
- Special characters: ! - _ . * ' ( )
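To anticipate what a downloaded file might be called, the supported character sets can be applied to a name locally. This is a minimal sketch with a hypothetical function name. It assumes that every unsupported character is simply dropped, which is inferred from the single example above, so the actual download behavior might differ.

```shell
# Print name $1 with every character outside the supported sets removed.
# LC_ALL=C makes tr work on bytes, so multi-byte characters are dropped entirely.
downloaded_name() {
  printf '%s' "$1" | LC_ALL=C tr -cd "0-9a-zA-Z!_.*'()-"
}
```

Applied to the name from the example, downloaded_name 'special chars!&@$()テニス.csv' prints specialchars!().csv.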
Catalog UI does not update when changes are made to the asset metadata
Applies to: 5.3.0 and later
If the Catalog UI is open in a browser while an update is made to the asset metadata, the Catalog UI page will not automatically update to reflect this change. Outdated information will continue to be displayed, causing external processes to produce incorrect information.
Workaround: After the asset metadata is updated, refresh the Catalog UI page at the browser level.
A blank page might be rendered when you search for terms while manually assigning terms to a catalog asset
Applies to: 5.3.0 and later
When you search for a term to assign to a catalog asset and change that term while the search is running, a blank page might be shown instead of the search results.
Workaround: Rerun the search.
Project assets that are added while you create segmented data assets might not be available for selection
Applies to: 5.3.0 and later
If assets are added to the project while you are viewing the list of data assets to pick the column for segmentation, these new assets are listed, but you cannot select them.
Workaround: Cancel the creation process and start anew.
An extra path to manage catalogs in the navigation menu
Applies to: 5.3.0 and later
If you have the Manage catalogs user permission, an extra Administration > Catalogs path to manage catalogs shows up in the navigation menu.
Row filter data protection rules are not available for previews of SQL query asset type
Applies to: 5.3.0 and later
Row filter data protection rules do not work for previews of the SQL query asset type. If any row filter rule applies to the SQL query asset in a governed catalog, the asset viewer will see an error.
Workaround: Choose one of the following methods:
- As a Data Steward, disable the row filter rule that is applied to the SQL query asset.
- Edit the SQL query to include the required WHERE predicates when you create the asset in the project, and then publish it to the catalog.
Pagination selection dropdown no longer available on the Catalogs page
Applies to: 5.3.0 and later
The pagination selection dropdown that lets you navigate to a specific page is no longer available on the Catalogs page. You must use the left (previous page) or right (next page) arrows to get to the required page.
Can't add empty or zero-byte files to catalogs
Applies to: 5.3.0
You can't add empty or zero-byte files when you're adding local files to a catalog (on the catalog asset page, when you click Add to catalog and select Local files).
Workaround: Use the API to add empty or zero-byte files to catalogs as described in the Data and AI Common Core API documentation.
Migrating data source definitions from the Platform assets catalog will fail
Applies to: 5.3.0 and later
Data source definitions can't be migrated and attempts to migrate data source definitions will cause the migration to fail.
Workaround: There is currently no workaround for this issue.
You can migrate all other content from the Platform assets catalog without issues.
Associated connection isn't added when you add a vector index from catalog to project
Applies to: 5.3.0 and later
When you add a vector index (In memory, Elasticsearch, watsonx.data Milvus) to a project from a catalog, the associated connection isn't automatically added with it.
You can't manage more than 20 assets at the same time
Applies to: 5.3.0 and later
You can edit or delete only up to 20 assets at the same time.
Issues with publishing data quality rules
Applies to: 5.3.1
When a rule has a connection for an input data asset and a connection with the same name already exists in the catalog, publishing the rule does not overwrite or update the connection in the catalog, regardless of the duplicate asset handling settings in the catalog.
Workaround: Publish the connection before you publish the data quality rule.
Governance artifacts
Cannot use CSV to move data class between Cloud Pak for Data instances
Applies to: 5.3.0 and later
If you export data classes with the matching method Match to reference data to CSV, and then import them into another Cloud Pak for Data instance, the import fails.
Workaround: For moving governance artifact data from one instance to another, especially data classes of this matching method, use the ZIP format export and import. For more information about the import methods, see Import methods for governance artifacts in the Cloud Pak for Data documentation.
Metadata import
Metadata import jobs might be stuck due to issues related to RabbitMQ
Applies to: 5.3.0 and later
If the metadata-discovery pod starts before the rabbitmq pods are up after a cluster reboot, metadata import jobs can get stuck while attempting to get the job run logs.
Workaround: To fix the issue, complete the following steps:
- Log in to the OpenShift console by using admin credentials.
- Go to Workloads > Pods.
- Search for rabbitmq.
- Delete the rabbitmq-0, rabbitmq-1, and rabbitmq-2 pods. Wait for the pods to be back up and running.
- Search for discovery.
- Delete the metadata-discovery pod. Wait for the pod to be back up and running.
- Rerun the metadata import job.
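The pod deletions in these steps can also be done from the command line instead of the OpenShift console. The following is a minimal sketch with a hypothetical function name. It assumes that you are logged in with oc and know the project that contains the pods, and it relies on the workload controllers to re-create the deleted pods.

```shell
# Delete every pod in project $1 whose name matches the grep pattern $2.
# Returns 1 without deleting anything if no pod matches.
restart_pods_matching() {
  ns="$1"
  pattern="$2"
  pods="$(oc get pods -n "$ns" -o name | grep -- "$pattern")" || {
    echo "no pods matching '$pattern' in $ns" >&2
    return 1
  }
  # Word splitting is intended here: $pods holds one pod name per line.
  oc delete -n "$ns" $pods
}
```

A possible sequence is to call restart_pods_matching with the pattern 'rabbitmq-[0-9]', check with oc get pods that the rabbitmq pods are running again, and only then call it with the pattern metadata-discovery before rerunning the metadata import job.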
Data assets might not be imported when running an ETL job lineage import for DataStage flows
Applies to: 5.3.0 and later
When you create and run a metadata import with the goal Get ETL job lineage where the scope is determined by the Select all DataStage flows and their dependencies in the project option, data assets from the connections associated with the DataStage flows are not imported.
Workaround: Explicitly select all DataStage flows and connections when you set the scope instead of using the Select all DataStage flows and their dependencies in the project option.
When a job for importing lineage metadata hangs, it cannot be stopped
Applies to: 5.3.0 and later
When you run a lineage metadata import and the job stops responding, the job can't be stopped.
Only files with .sql extension can be provided as manual input for metadata import from the Oracle and PostgreSQL sources
Applies to: 5.3.0 and later
When you import metadata from the Oracle and PostgreSQL sources, only .sql files can be used as manual input. Other formats like files with .pck extension can't be used. This limitation is applicable when you install IBM Manta Data Lineage.
Assets and lineage might not be imported if many connections use the same data source
Applies to: 5.3.0 and later
If more than one connection points to the same data source, for example to the same Db2 database, importing lineage might not be successful. Assets and lineage metadata might not be imported in such a case. When you create a connection to use with metadata import, make sure that only one connection points to a selected data source.
When you import a project from a .zip file, the metadata import asset is not imported
Applies to: 5.3.0 and later
When you import a project from a file, metadata import assets might not be imported. The issue occurs when a metadata import asset was imported to a catalog, not to a project, in the source system from which the project was exported. This catalog does not exist on the target system and the metadata import asset can't be accessed.
Workaround: After you import the project from a file, duplicate metadata import assets and add them to a catalog that exists on the target system. For details, see Duplicating a metadata import asset.
Lineage metadata cannot be imported from the Informatica PowerCenter connection
Applies to: 5.3.0 and later
When you import lineage metadata from the Informatica PowerCenter connection, the metadata job run fails with the following message:
400 [Failed to create discovery asset. path=/GLOBAL_DESEN/DM_PES_PESSOA/WKF_BCB_PES_PESSOA_JURIDICA_DIARIA_2020/s_M_PEJ_TOTAL_03_CARREGA_ST3_2020/SQ_FF_ACFJ671_CNAE_SECUND�RIA details=ASTSV3030E: The field 'name' should contain valid unicode characters.]",
"more_info" : null
Workaround: Ensure that the encoding value is the same in the workflow file in Informatica PowerCenter and in the connection that was created in Automatic Data Lineage. If the values are different, use the one from the Informatica PowerCenter workflow file.
To solve the issue, complete these steps:
- Open Automatic Data Lineage: https://<CPD-HOSTNAME>/manta-admin-gui/
- Go to Connections > Data Integration Tools > IFPC and select the connection for which the metadata import failed.
- In the Inputs section, change the value of the Workflow encoding parameter to match the value from the Informatica PowerCenter workflow file.
- Save the connection.
- In IBM Knowledge Catalog, reimport assets for the metadata import that failed.
Metadata enrichment
Profiling in catalogs, projects, and metadata enrichment might fail for Teradata connections
Applies to: 5.3.0 and later
If a Generic JDBC connection for Teradata exists with a driver version before 17.20.00.15, profiling in catalogs and projects, and metadata enrichment of data assets from a Teradata connection fails with an error message similar to the following one:
2023-02-15T22:51:02.744Z - cfc74cfa-db47-48e1-89f5-e64865a88304 [P] ("CUSTOMERS") - com.ibm.connect.api.SCAPIException: CDICO0100E: Connection failed: SQL error: [Teradata JDBC Driver] [TeraJDBC 16.20.00.06] [Error 1536] [SQLState HY000] Invalid connection parameter name SSLMODE (error code: DATA_IO_ERROR)
Workaround: For this workaround, users must be enabled to upload or remove JDBC drivers. For more information, see Enable users to upload, delete, or view JDBC drivers.
Complete these steps:
- Go to Data > Connectivity > JDBC drivers and delete the existing JAR file for Teradata (terajdbc4.jar).
- Edit the Generic JDBC connection, remove the selected JAR files, and add SSLMODE=ALLOW to the JDBC URL.
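Where the parameter goes in the URL depends on whether the URL already carries parameters. The following is a minimal sketch with a hypothetical function name and example host. It assumes the usual Teradata JDBC URL layout, jdbc:teradata://host/NAME=value,NAME=value, so verify the result against your driver documentation.

```shell
# Print JDBC URL $1 with SSLMODE=ALLOW added, assuming Teradata's
# jdbc:teradata://host/NAME=value,NAME=value layout.
add_sslmode_allow() {
  url="$1"
  case "$url" in
    *SSLMODE=*) printf '%s\n' "$url" ;;                # already set: leave unchanged
    */)         printf '%sSSLMODE=ALLOW\n' "$url" ;;   # trailing slash, no parameters yet
    *//*/*)     printf '%s,SSLMODE=ALLOW\n' "$url" ;;  # append to the existing parameter list
    *)          printf '%s/SSLMODE=ALLOW\n' "$url" ;;  # host only
  esac
}
```

For example, add_sslmode_allow 'jdbc:teradata://td.example.com/DATABASE=sales' prints jdbc:teradata://td.example.com/DATABASE=sales,SSLMODE=ALLOW.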
For assets from SAP OData sources, the metadata enrichment results do not show the table type
Applies to: 5.3.0 and later
In general, metadata enrichment results show for each enriched data asset whether the asset is a table or a view. This information cannot be retrieved for data assets from SAP OData data sources and is thus not shown in the enrichment results.
Data quality
Rules run on columns of type timestamp with timezone fail
Applies to: 5.3.0 and later
The data type timestamp with timezone is not supported. You can't apply data quality rules to columns with that data type.
MANTA Automated Data Lineage for IBM Cloud Pak for Data
Not all stages are displayed in technical data lineage graph for the imported DataStage ETL flow
Applies to: 5.3.0 and later
When you import a DataStage ETL flow and view it in the technical data lineage graph, only three stages are displayed, even when four stages were imported.
Workaround: By default, three connected elements are displayed in the graph. To display more elements, click the expand icon on the last or the first displayed element on the graph.
Lineage is not reimported after the connection is reimported in Manta Admin UI
Applies to: 5.3.0 and later
After you delete and restore connections in Manta Admin UI, reimporting the lineage metadata for these connections in Cloud Pak for Data fails. The issue occurs when the following steps are taken:
- You create projects, connections to data sources and catalogs in Cloud Pak for Data.
- You run metadata imports with the goal Get lineage. The imports are successful and the connections are listed in Manta Admin UI.
- You export all connections from Manta Admin UI by using the Export all connections option.
- You remove all connections from Manta Admin UI.
- You import the same connections in Manta Admin UI again by using the Import connection option.
- You rerun the metadata imports with the goal Get lineage again for the same connections but the imports are not successful and errors are displayed.
Workaround: The issue is caused by the presence of the truststore properties in the exported .json connection files. For example, for the Db2 data source, the properties are db2.extractor.truststore.path and db2.extractor.truststore.password.
You can apply one of the following workarounds:
- Before you import the connection back in Manta Admin UI, manually remove these properties from the connection .json file.
- Export connections one by one, not by using the Export all connections option. When you export the connections individually, the truststore properties are not saved in the .json file.
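Before you import a connection file again, you can check whether it still contains truststore properties. The following is a minimal sketch with a hypothetical function name. It assumes that the property names in other data sources follow the same pattern as the Db2 example (a name that contains .truststore.), and it only reports matches. Removing them remains a manual edit.

```shell
# List the lines of exported connection file $1 that mention truststore properties.
# Exit status is 0 if at least one such line exists, non-zero otherwise.
find_truststore_properties() {
  grep -n '\.truststore\.' "$1"
}
```

If the function prints nothing for a file, no property that matches the pattern is left in it.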
Columns are displayed as numbers for a DataStage job lineage in the catalog
Applies to: 5.3.0 and later
The columns for a lineage that was imported from a DataStage job are not displayed correctly in the catalog. Instead of column names, column numbers are displayed. The issue occurs when the source or target of a lineage is a CSV file.
Business lineage
An unnecessary edge appears when expanding data integration assets
Applies to: 5.3.0 and later
After expanding a data integration asset and clicking Show next or Show all, the transformer nodes will have an unnecessary edge that points to themselves.