Known issues and limitations for IBM Knowledge Catalog

The following known issues and limitations apply to IBM Knowledge Catalog and to watsonx.data intelligence.

Known issues

General

Installing, upgrading, and uninstalling

Migration and removal of legacy functions
For known issues with migration from InfoSphere Information Server, see Known issues for migration from InfoSphere Information Server.

Catalogs and Projects

Governance artifacts

Custom workflows

Workflow types not available

Metadata import

Metadata enrichment

Data quality

Generative AI capabilities

MANTA Automated Data Lineage for IBM Cloud Pak for Data

Business lineage

Relationship explorer

Limitations

Catalogs and Projects

Governance artifacts

Metadata import

Metadata enrichment

Data quality

Business lineage

General issues

You might encounter these known issues and restrictions when you work with the IBM Knowledge Catalog service.

Assets imported with the user admin instead of cpadmin

Applies to: 5.2.0

For Cloud Pak for Data clusters with Identity Management Service enabled, the default administrator is cpadmin. However, for import, the default administrative user admin is used. Therefore, the assets are imported with the admin user instead of cpadmin.

Workaround:

Before running the import, apply the following workaround:

  1. Edit the config map by executing oc edit cm catalog-api-exim-cm

  2. Manually update the environment variable admin_username in import-job.spec.template.spec.env from:

    - name: admin_username
    value: ${admin_username}
    

    to:

    - name: admin_username
    value: cpadmin
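
Alternatively, a non-interactive sketch of the same change with sed, assuming the ${admin_username} placeholder occurs only in this config map and that the config map is in the operands project; review the output before applying:

oc get cm catalog-api-exim-cm -n ${PROJECT_CPD_INST_OPERANDS} -o yaml | sed 's/\${admin_username}/cpadmin/' | oc apply -f -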
    

Search bar returning incorrect results

Searching for assets by using the search bar returns unexpected results if only one or two characters are used.

Applies to: 5.2.0

Workaround: Type at least three characters in the search bar.

Installing, upgrading, and uninstalling

You might encounter these known issues while installing, upgrading, or uninstalling IBM Knowledge Catalog.

After uninstalling Manta Data Lineage, re-installing IBM Knowledge Catalog runs into issues

Applies to: 5.2.0

You can install Manta Data Lineage with IBM Knowledge Catalog. If you uninstall Manta Data Lineage and then try to re-install the wkc-cr for IBM Knowledge Catalog, you might run into issues. The wkc-post-install-init pod might fail to restart.

Workaround: To fix this issue, restart the ibm-nginx pods, and then restart the wkc-operator pod. This puts the wkc-operator in the completed state.

After the upgrade to 5.2, predefined roles are missing permissions

Applies to: 5.2.x

After the upgrade from IBM Knowledge Catalog 4.7.x or 4.8.x to IBM Knowledge Catalog 5.2.x, some permissions are missing from Data Engineer, Data Quality Analyst, and Data Steward roles. Users with these roles might not be able to run metadata imports or access any governance artifacts.

Workaround: To add any missing permissions to the Data Engineer, Data Quality Analyst, and Data Steward roles, restart the zen-watcher pod by running the following command:

oc delete pod $(oc get pod -n ${PROJECT_CPD_INST_OPERANDS} -o custom-columns="Name:metadata.name" -l app.kubernetes.io/component=zen-watcher --no-headers) -n ${PROJECT_CPD_INST_OPERANDS}

After upgrading to 5.2.x, the ingestion service pod crashes

Applies to: 5.2.0

After upgrading IBM Knowledge Catalog to version 5.2.x or later, the Knowledge Graph ingestion service pod wdp-kg-ingestion-service-xxx might crash.

Workaround: Run the following steps after upgrading:

  1. Find the number of pods running the ingestion service:

    oc get deployment wdp-kg-ingestion-service -n ${PROJECT_CPD_INSTANCE}
    
  2. Scale the ingestion service to 0:

    oc scale deployment wdp-kg-ingestion-service --replicas=0 -n ${PROJECT_CPD_INSTANCE}
    
  3. Wait for the ingestion service pods to end. Run the following command to check:

    oc get pod -n ${PROJECT_CPD_INSTANCE} | grep ingestion
    
  4. Get the credentials to log in to the RabbitMQ web console:

    oc get secret rabbitmq-ha -o json -n ${PROJECT_CPD_INSTANCE}
    

    Keep note of the values for the following:

    • rabbitmq-username
    • rabbitmq-password

    Decode the username and password, and use the decoded values when logging in to the RabbitMQ console:

    echo <rabbitmq-password> | base64 -d
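
    For convenience, you can also extract and decode each value in one step:

    oc get secret rabbitmq-ha -o jsonpath='{.data.rabbitmq-username}' -n ${PROJECT_CPD_INSTANCE} | base64 -d
    oc get secret rabbitmq-ha -o jsonpath='{.data.rabbitmq-password}' -n ${PROJECT_CPD_INSTANCE} | base64 -d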
    
  5. On your local machine run the following:

    oc port-forward rabbitmq-ha-0 15671:15671
    
  6. Open the following URL in your browser:

    https://localhost:15671/#/queues
    
  7. Using the RabbitMQ web console, find and delete all the queues starting with kg*. The complete list includes:

    kg-queue
    kg-queue-cams-bulk
    kg-queue-glossary
    kg-queue-policy
    
  8. Scale the ingestion service back to the original size, which depends on the installation of your cluster. For example, you can run:

    oc scale deployment wdp-kg-ingestion-service --replicas=1 -n ${PROJECT_CPD_INSTANCE}
    
  9. Wait for all pods to become ready. Run the following to check the progress:

    oc get pods -n ${PROJECT_CPD_INSTANCE} | grep ingestion
    
  10. Using the RabbitMQ web console, verify that the kg* queues are re-created.

  11. If you see assets that do not display the lineage graph correctly, run the re-sync operation:

    oc create job -n ${PROJECT_CPD_INSTANCE} --from=cronjob/wkc-search-lineage-cronjob lineage-job
    

    This operation is time-consuming and should only be run if necessary. For more information about re-sync, see Resync of lineage metadata.
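
    If you want to wait for the re-sync job to finish from the command line, a sketch with oc wait; size the timeout to your data volume:

    oc wait --for=condition=complete job/lineage-job -n ${PROJECT_CPD_INSTANCE} --timeout=4h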

IBM Knowledge Catalog portal-catalog pod out of sync after upgrade

Applies to: 5.2.0

When upgrading IBM Knowledge Catalog and changing editions, the portal-catalog pod might become out of sync, leading to missing functionality that should have been enabled by the upgrade.

Workaround: To enable the missing functionality, restart the portal-catalog pod after upgrading IBM Knowledge Catalog.

After installing IBM Knowledge Catalog, some pods might go into the error state and related jobs will fail

Applies to: 5.2.0
Fixed in: 5.2.1

After installing IBM Knowledge Catalog, during the post-install steps when the apply-cr commands are running, pods related to kg-resync-glossary and the jobs related to these pods might fail.

Workaround: To fix this issue, run the following steps:

  1. Check for pods that are in the failed status:
    oc get pod -n ${PROJECT_CPD_INST_OPERANDS} | grep kg-resync-glossary-
    
  2. Check the corresponding job status for those pods:
    oc get job kg-resync-glossary -n ${PROJECT_CPD_INST_OPERANDS}
    
  3. Delete the kg-resync-glossary job:
    oc delete job kg-resync-glossary -n ${PROJECT_CPD_INST_OPERANDS}
    
  4. Reconcile the custom resource (CR) by restarting the wkc-operator pod:
    oc delete pod ibm-cpd-wkc-operator-xxxx-xxxx -n ${PROJECT_CPD_INST_OPERATORS}
    
  5. Wait for the CR reconciliation to complete and check the pods. The kg-resync-glossary-xxxx pod should then be in the completed state.
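
If you don't know the exact name of the operator pod, the following sketch resolves it by its name prefix (shown as ibm-cpd-wkc-operator-xxxx-xxxx in step 4):

oc delete pod $(oc get pods -n ${PROJECT_CPD_INST_OPERATORS} --no-headers -o custom-columns=":metadata.name" | grep ibm-cpd-wkc-operator) -n ${PROJECT_CPD_INST_OPERATORS}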

Installing IBM Knowledge Catalog on Power architecture fails to deploy Data Quality or Knowledge Graph services

Applies to: 5.2.0
Fixed in: 5.2.1

When you install IBM Knowledge Catalog on Power architecture (ppc64le), the installation will fail if one of the Data Quality or Knowledge Graph features is enabled. These components are not supported on Power architecture (ppc64le) and should not be enabled.

Workaround: Run the following commands to fix this issue:

  1. In the IBM Knowledge Catalog custom resource (wkc-cr) spec, set enableDataQuality and enableKnowledgeGraph to false:

    oc patch wkc wkc-cr --patch '{"spec": {"enableDataQuality": false}}' --type='merge' -n ${PROJECT_CPD_INST_OPERANDS}
    oc patch wkc wkc-cr --patch '{"spec": {"enableKnowledgeGraph": false}}' --type='merge' -n ${PROJECT_CPD_INST_OPERANDS}
    
  2. Delete knowledgegraph CRs if they exist:

    oc delete knowledgegraph knowledge-graph-cr -n ${PROJECT_CPD_INST_OPERANDS}
    
  3. Reconcile the IBM Knowledge Catalog custom resource (wkc-cr) by restarting the wkc-operator pod:

    oc delete pod ibm-cpd-wkc-operator-xxxx-xxxx -n ${PROJECT_CPD_INST_OPERATORS}
    

Scaling IBM Knowledge Catalog resources on Power architecture results in an out-of-memory failure

Applies to: 5.2.0
Fixed in: 5.2.1

When you try to scale IBM Knowledge Catalog resources on Power architecture, the apply-scale-config command times out. The wkc operator goes into the OOMKilled status.

Workaround: In the wkc operator, increase the memory limits to 2Gi and the memory requests to 1.5Gi.

oc -n ${PROJECT_CPD_INST_OPERATORS} patch csv ibm-cpd-wkc.v2.2.0 \
  --type='json' \
  -p='[
    {
      "op": "replace",
      "path": "/spec/install/spec/deployments/0/spec/template/spec/containers/0/resources",
      "value": {
        "limits": {
          "cpu": "750m",
          "ephemeral-storage": "2Gi",
          "memory": "2Gi"
        },
        "requests": {
          "cpu": "100m",
          "ephemeral-storage": "2Gi",
          "memory": "1536Mi"
        }
      }
    }
  ]'

After the upgrade to 5.2.x, custom resource settings for the wkc-term-assignment pod are lost

Applies to: 5.2.0 and later

When you upgrade IBM Knowledge Catalog to version 5.2.x on a Red Hat OpenShift on IBM Cloud (ROKS) cluster, some custom resource settings for the wkc-term-assignment pod aren't applied, and the default values are used instead for the cpu and memory limits.

Workaround: When you upgrade IBM Knowledge Catalog to version 5.2.x, explicitly set these wkc-cr properties:

spec:
  wkc_term_assignment_resources:
    limits:
      cpu: "2"
      memory: 4Gi
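
Alternatively, the same values can be applied with a single merge patch, following the oc patch pattern that is used elsewhere in this document:

oc patch wkc wkc-cr --type=merge -n ${PROJECT_CPD_INST_OPERANDS} -p '{"spec":{"wkc_term_assignment_resources":{"limits":{"cpu":"2","memory":"4Gi"}}}}'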

Catalog and project issues

You might encounter these known issues and restrictions when you use catalogs.

Details for masked columns display incorrectly

Applies to: 5.2.0
Fixed in: 5.2.1

On the asset preview page, the value for Masked columns displays an incorrect count. This is known to happen for virtualized join views and for watsonx.data connected data. In addition, the masked columns indicator icon is either missing from the header of columns with masked data or is displayed incorrectly.

When a deep enforcement solution is configured to protect a data source, column masking is applied by that configured solution. Each protection solution has its own semantics for applying data masking, so the masking indicators that are displayed in the user interface might not align with the columns that are actually masked.

For details on how masking rules apply to virtualized views, see Authorization model for views in the Cloud Pak for Data documentation.

Workaround: None.

Unauthorized users might have access to profiling results

Applies to: 5.2.0 and later

Users who are collaborators with any role in a project or a catalog can view an asset profile even if they don't have access to that asset at the data source level or in Data Virtualization.

Workaround: Before you add users as collaborators to a project or a catalog, make sure they are authorized to access the assets in the container and thus to view the asset profiles.

Cannot run import operations on a container package exported from another Cloud Pak for Data cluster

Applies to: 5.2.0 and later

When you import a container package that was exported from another Cloud Pak for Data cluster, permissions on the archive must be configured so that import operations succeed on the target cluster and the files within the archive can be accessed.

Workaround: To extract the export archive and modify permissions, complete the following steps. A condensed script follows the list.

  1. Create a temporary directory:

    mkdir temp_directory
    
  2. Extract the archive:

    tar -xvf cpd-exports-<export_name>-<timestamp>-data.tar --directory temp_directory
    
  3. On the target cluster, run the following command:

    oc get ns $CLUSTER_CPD_NAMESPACE -o=jsonpath='{@.metadata.annotations.openshift\.io/sa\.scc\.supplemental-groups}'
    

    Example output: 1000700000/10000.

  4. Apply the first part of the output of the previous step (for example, 1000700000) as the new ownership on all files within the archive. Example:

    cd temp_directory/
    chown -R 1000700000:1000700000 <export_name>
    
  5. Archive the fixed files with the directory. Use the same export name and timestamp as the original exported tar:

    tar -cvf cpd-exports-<export_name>-<timestamp>-data.tar <export_name>/
    
  6. Upload the archive.
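
For convenience, here are the same steps as a single sketch. The placeholders and the namespace variable are the same as in the steps above; cut extracts the first part of the supplemental-groups annotation:

mkdir temp_directory
tar -xvf cpd-exports-<export_name>-<timestamp>-data.tar --directory temp_directory
GID=$(oc get ns $CLUSTER_CPD_NAMESPACE -o=jsonpath='{@.metadata.annotations.openshift\.io/sa\.scc\.supplemental-groups}' | cut -d/ -f1)
cd temp_directory/
chown -R ${GID}:${GID} <export_name>
tar -cvf cpd-exports-<export_name>-<timestamp>-data.tar <export_name>/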

Data protection rules don't apply to column names that contain spaces

Applies to: 5.2.0 and later

If a column name contains trailing or leading spaces during import, the column cannot be masked using data protection rules.

Workaround: When you're importing columns, ensure that column names don't contain trailing or leading spaces.

Preview of data from file-based connections other than IBM Cloud Object Storage is not fully supported

Applies to: 5.2.0

Connected assets from file-based connections other than IBM Cloud Object Storage do not preview correctly. The preview might show a table with missing or incorrect data. There is no workaround at this time.

Scroll bar is not visible when adding assets to a project on MacOS

Applies to: 5.2.0

When adding assets to a project, the scroll bar might not be available in the Selected assets table, showing a maximum of 5 assets.

Workaround: Change the MacOS settings:

  1. Click the Apple symbol in the top-left corner of your Mac's menu bar, then click System Settings.
  2. Scroll down and select Appearance.
  3. Under the Show scroll bars option, click the radio button next to Always.

Unexpected assets filtering results in catalogs

Applies to: 5.2.0

In catalogs, when you search for an asset by using the Find assets field, the search might return assets whose names don't match the string that you typed, as well as assets that contain that string in a property or a related item.

Can't create a connection if you're including a reference connection

Applies to: 5.2.0 and 5.2.1
Fixed in: 5.2.2

When you're adding connections that contain references to catalogs, you might see the following error: Unable to create connection. An unexpected error occurred of type Null pointer error. No further error information is available.

Workaround: Reference connections are not supported. Ensure that the platform connection doesn't contain any reference connections.

Migrating data source definitions from the Platform assets catalog will fail

Applies to: 5.2.0

Data source definitions can't be migrated, and attempts to migrate them cause the migration to fail.

Workaround: There is currently no workaround for this issue.
You can migrate all other content from the Platform assets catalog without issues.

Can't use Git projects with identical data assets

Applies to: 5.2.0 and later

Identical data assets don't work with Git projects.

Workaround: To publish assets from catalog to a Git project, check out a branch in the Git project first.

Lineage not established for document libraries with no document sets

Applies to: 5.2.0
Fixed in: 5.2.1

If you create a document library in a project with no document sets, lineage isn't properly created because the lineage structure is established only when a relationship is set from a document set to the document library. The View lineage button is available, but the feature doesn't work as expected.

Workaround: To use lineage, add at least one document set to your document library.

Masking flow job page crashes if Target schema is selected

Applies to: 5.2.0

If you use the Mask option as the asset context option to create a masking flow job on the Select targets page, the masking flow job page crashes.

Workaround:

  1. In the project, from the Assets tab, click New Asset.
  2. Select Copy and mask data, and then select Bulk copy or Copy related records across tables.
  3. Select a connection and schema on the Select targets page, and proceed to run the job.

Connected data assets with shared properties in the Draft state

Applies to: 5.2.0 and later
Fixed in: 5.2.1

When you add a connected data asset from a governed catalog to a project, the asset changes its state to Draft even if none of the asset properties were updated.

When you publish a connected data asset from a project to a governed catalog, the asset remains in the Draft state.

Mismatched column metadata information

Applies to: 5.2.0 and later

If you add columns to an existing asset, you can see the new columns in the Assets tab, but not in the Overview or Profile tabs.

Workaround: Reprofile the asset to view the changes by running a metadata import with the Discover option.

Asset membership is changed or removed when identical data assets are published from projects to catalogs

Applies to: 5.2.1
Fixed in: 5.2.2

When an identical data asset is added to a project and then published from that project to a governed catalog in which identical data assets are recognized, the existing asset membership information is changed for all published identical data assets across catalogs. Asset members from the asset in the project are applied to all identical data assets with the same identification key in published catalogs. In most cases, the asset owner becomes the user who created the asset in the project. As a result, asset members might lose access to assets.

For example, an identical data asset asset1 is published in catalog1, catalog2, and catalog3 and has user1 as the asset owner and user2 as the asset editor.

The same identical data asset asset1 is added to project1 by user1. No other asset members are specified.

When asset1 from project1 is published to a catalog, the asset membership value for all already published instances of asset1 (in catalog1, catalog2, and catalog3) is changed from user1 (asset owner) and user2 (asset editor) to user1 (asset owner).

Workaround: After you edit and publish identical data assets from a project to catalog, update the asset members manually.

Creating a project activity report generates an incomplete file

Applies to: 5.2.0 and later

When creating an activity report for a project with logging enabled, the generated file omits expected events.

Workaround:

A cluster administrator must restart the event logger API pods. After the restart, create the report again. The new report will include new events, but previously missed events cannot be recovered.
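
A sketch of that restart from the command line; the deployment name is an assumption to verify first, because it can differ by release:

oc get deploy -n ${PROJECT_CPD_INST_OPERANDS} | grep -i event-logger
oc rollout restart deployment/<event-logger-deployment> -n ${PROJECT_CPD_INST_OPERANDS}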

Can't add user groups when you're using a non-English language in your browser

Applies to: 5.2.2

If you're using a non-English language in your browser and want to add a user group on the Asset access control page, after you click Add user group, the Add user tearsheet with a list of single users opens instead.

Workaround: Change the language of your browser to English.

Governance artifacts issues

You might encounter these known issues and restrictions when you use governance artifacts.

Error Couldn't fetch reference data values is displayed after publishing reference data

Applies to: 5.2.0

When new values are added to a reference data set, and the reference data set is published, the following error is displayed when you try to click on the values:

Couldn't fetch reference data values. WKCBG3064E: The reference_data_value for the reference_data which has parentVersionId: <ID> and code: <code> does not exist in the glossary. WKCBG0001I: Need more help?

When the reference data set is published, the currently displayed view changes to Draft-history, as marked by the green label at the top. The Draft-history view does not allow you to view the reference data values.

Workaround: To view the values, click Reload artifact so that you can view the published version.

Publishing large reference data sets fails with Db2 transaction log full

Applies to: 5.2.0

Publishing large reference data sets might fail with a Db2 error such as:

The transaction log for the database is full. SQLSTATE=57011

Workaround: Publish the set in smaller chunks, or increase Db2 transaction log size as described in the following steps.

  1. Modify the transaction log settings with the following commands:

    db2 update db cfg for bgdb using LOGPRIMARY 5 --> default value, should not be changed
    db2 update db cfg for bgdb using LOGSECOND 251
    db2 update db cfg for bgdb using LOGFILSIZ 20480
    
  2. Restart Db2.

You can calculate the required transaction log size as follows:

(LOGPRIMARY + LOGSECOND) * LOGFILSIZ
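
LOGFILSIZ is specified in 4 KB pages, so the settings shown above provide (5 + 251) * 20480 pages * 4 KB = 20 GB of log space.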

For publishing large sets, the following Db2 transaction log sizes are recommended:

  • 5GB for 1M reference data values and 300K relationships
  • 20GB for 1M reference data values and 1M relationships
  • 80GB for 1M reference data values and 4M relationships

where the relationship count is the sum of the parent, term and value mapping relationships for reference data values in the set.

Imported data assets with assigned out-of-the-box data classes or terms have incorrect identifiers resulting in no enforcement of data protection rules

Applies to: Cloud Pak for Data 4.0 and later

When you migrate data assets across Cloud Pak for Data instances and these assets have out-of-the-box data classes or terms assigned, the imported data assets indicate correct data class or term assignments, but the assigned artifact ID is incorrect. As a result, any operations that reference the data class or term, such as data protection rules, can't be applied to the imported data assets.

Relationships between catalog assets and out-of-the-box governance artifacts cannot be migrated correctly.

Workaround: None.

Business terms remain after the semantic automation layer integration is deleted from IBM watsonx.data

Applies to: 5.2.0
Fixed in: 5.2.1

Business terms that were imported to IBM Knowledge Catalog for a semantic automation layer (SAL) integration in watsonx.data are not removed when the integration is deleted. This can result in duplicate business terms if a new SAL integration is subsequently enabled and the same or similar business terms are uploaded again.

Workaround: To avoid duplicate business terms, the cluster administrator or the user who originally created the SAL registration must manually delete all business terms that were imported for the SAL integration.

Can't cascade delete a Knowledge Accelerator category

Applies to: 5.2.0 and later

If you run cascade delete of a Knowledge Accelerator category, the operation might fail due to deadlocks.

Workaround: In case of deadlocks, retry cascade delete of the same root category until it's deleted.

wkc-glossary-service pod restarting when updating or creating multiple business terms

Applies to: 5.2.0

When you update or create large numbers of business terms, the wkc-glossary-service pod restarts because it reaches the CPU limit.

Workaround: Increase the CPU limit for wkc-glossary-service as described in Manually scaling resources for services.

Generating business terms into a view-only category

Applies to: 5.2.0
Fixed in: 5.2.1

If you have the Viewer role on a category, you can select the category as the primary category for a metadata enrichment job and successfully generate draft business terms. However, you can't access or publish the business terms after they're generated. Only the admin user can access such terms.

Workaround: Before you select a primary category, ensure that you have the Admin, Editor, or Owner role.

403 Forbidden errors when generating business terms

Applies to: 5.2.0
Fixed in: 5.2.1

When you're generating business terms, you might see 403 Forbidden errors. The errors are caused by the username, displayName, and permissions not being set correctly when a Cloud Pak for Data token is generated.

Workaround: Manually set the TA_CP4D_SERVICE_TOKEN_API_VERSION=2 environment variable as described in https://www.ibm.com/support/pages/node/7236111.
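
A sketch of setting the variable from the command line, assuming it belongs on the term-assignment deployment; the deployment name is an assumption, and the support page above has the authoritative steps:

oc set env deployment/wkc-term-assignment TA_CP4D_SERVICE_TOKEN_API_VERSION=2 -n ${PROJECT_CPD_INST_OPERANDS}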

Column data is not redacted in accordance with the rule precedence

Applies to: 5.2.0

Some advanced partial masking options are ignored by the rule precedence mechanism when rule resolution is executed, which results in incorrect data redaction.

Workaround: If there are multiple rules with partial masking specified, ensure that all of the partial masking columns are the same in all of these rules for the same data class.

Column data is not redacted in accordance with the selected redact rule

Applies to: 5.2.0
Fixed in: 5.2.1

If you created an advanced masking rule to partially redact the User name and the Domain name, the column data doesn't get redacted as expected.

Custom workflows issues

You might encounter these known issues and restrictions when you use custom workflows.

Workflow types not available

Applies to: 5.2.2

When custom request workflows are used and a user group is added to the list of start users (in Workflow Type > Overview > Access), the workflow types can no longer be retrieved. This leads to missing workflow types in the workflow management UI and the task inbox. The Start new request button is not available either.

Workaround: To solve this issue manually, access the workflow type that contains the user group by using the following URL:

https://<HOST>/gov/workflow/types/<WORKFLOW_TYPE_ID>

Go to Overview > Access and remove the user group from the list of users that can start a request.

If you don't know the workflow type ID, follow these steps to get all IDs:

  1. Open a terminal and run the following command to get an authorization bearer token:

    curl -k -X POST https://<HOST>/icp4d-api/v1/authorize \
      -H 'cache-control: no-cache' \
      -H 'content-type: application/json' \
      -d '{"username":"<USERNAME>","password":"<PASSWORD>"}'
    
  2. Run the following command to retrieve all workflow types:

    curl -X GET "https://<HOST>/v3/workflow_types" --insecure -H "accept: application/json" -H "Authorization: Bearer <TOKEN>"
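
If you script these two calls, you can capture the bearer token from the authorization response; a sketch that assumes jq is available and that the token is returned in the token field:

TOKEN=$(curl -k -s -X POST https://<HOST>/icp4d-api/v1/authorize \
  -H 'content-type: application/json' \
  -d '{"username":"<USERNAME>","password":"<PASSWORD>"}' | jq -r .token)
curl -k -X GET "https://<HOST>/v3/workflow_types" -H "accept: application/json" -H "Authorization: Bearer ${TOKEN}"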
    

Metadata import issues

You might encounter these known issues when you work with metadata import.

Assets are not imported from the IBM Cognos Analytics source when the content language is set to Japanese

Applies to: 5.2.0 and later

If you want to import metadata from the Cognos Analytics connection, where the user's content language is set to Japanese, no assets are imported. The issue occurs when you create a metadata import with the Get BI report lineage goal.

Workaround: In Cognos Analytics, change the user's content language from Japanese to English. Find the user for which you want to change the language, and change this setting in the Personal tab. Run the metadata import again.

When you import a project from a .zip file, the metadata import asset is not imported

Applies to: 5.2.0 and later

When you import a project from a file, metadata import assets might not be imported. The issue occurs when a metadata import asset was imported to a catalog, not to a project, in the source system from which the project was exported. This catalog does not exist on the target system and the metadata import asset can't be accessed.

Workaround: After you import the project from a file, duplicate metadata import assets and add them to a catalog that exists on the target system. For details, see Duplicating a metadata import asset.

Lineage metadata cannot be imported from the Informatica PowerCenter connection

Applies to: 5.2.0 and later

When you import lineage metadata from the Informatica PowerCenter connection, the metadata job run fails with the following message:

400 [Failed to create discovery asset. path=/GLOBAL_DESEN/DM_PES_PESSOA/WKF_BCB_PES_PESSOA_JURIDICA_DIARIA_2020/s_M_PEJ_TOTAL_03_CARREGA_ST3_2020/SQ_FF_ACFJ671_CNAE_SECUND�RIA details=ASTSV3030E: The field 'name' should contain valid unicode characters.]",
"more_info" : null

Workaround: Ensure that the encoding value is the same in the workflow file in Informatica PowerCenter and in the connection that was created in Automatic Data Lineage. If the values are different, use the one from the Informatica PowerCenter workflow file.
To solve the issue, complete these steps:

  1. Open Automatic Data Lineage:

    https://<CPD-HOSTNAME>/manta-admin-gui/
    
  2. Go to Connections > Data Integration Tools > IFPC and select the connection for which the metadata import failed.

  3. In the Inputs section, change the value of the Workflow encoding parameter to match the value from the Informatica PowerCenter workflow file.

  4. Save the connection.

  5. In IBM Knowledge Catalog, reimport assets for the metadata import that failed.

Dummy assets get created for any file assets that come from Amazon S3 to show the complete business data lineage if Get ETL job lineage is performed

Applies to: 5.2.0

If you perform a Get ETL job lineage import that involves an Amazon S3 connection, dummy assets are created for any file assets that come from that connection, so that the complete business data lineage can be shown. If you then perform a metadata import for the same Amazon S3 connection, duplicates result: the dummy asset that was created by the Get ETL job lineage import and the valid asset that is discovered during the metadata import.

SocketTimeoutException during metadata import

Applies to: 5.2.0 and later

During metadata import, when records from a CSV file that contains more than 30,000 rows are read, SocketTimeoutException is returned. This indicates a network issue where the connection between the client and server was unexpectedly closed.

Workaround:

  1. Log in to the OpenShift console.

  2. Go to Workloads > Pods > metadata-discovery-pod.

  3. Go to the Environment section.

  4. Search for the manta_wf_export_download environment variable and set it to true.

    Example:

    manta_wf_export_download=true
    

    By setting the variable, you're bypassing the socket timeout issue and downloading the CSV file to the local system. As a result, the CSV file can be read locally rather than over the network. After the CSV file is read, the locally downloaded file is deleted from the local system.
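
Alternatively, instead of using the OpenShift console, you can set the variable from the CLI; a sketch that assumes the pod belongs to a deployment named metadata-discovery (verify the name with oc get deploy first):

oc set env deployment/metadata-discovery manta_wf_export_download=true -n ${PROJECT_CPD_INST_OPERANDS}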

Business lineage shows Name not available when you use the Get BI report lineage option to import Tableau data assets

Applies to: 5.2.0

When you use the Get BI report lineage option to import Tableau data assets, business lineage for some of these assets shows Name not available instead of the actual data asset names because of extra object types that are generated by the MANTA Automated Data Lineage service.

Workaround: Use technical data lineage to show the names of all data assets.

Asset metadata import fails with a timeout error when using connections created by using the CLI

Applies to: 5.2.1
Fixed in: 5.2.2

When you use connections that you created by using the CLI to import asset metadata, the metadata import job fails with a timeout error after running for 60 minutes. To avoid the issue, create the connection from the Platform connections page or the project UI.

Workaround:

  1. In Admin > Platform connections, create a connection with the connection type: personal parameter.
  2. Add the connection to a project.
  3. Create metadata import with the connection.
  4. Target the metadata import assets to a project.
  5. Ensure that the assets get imported to the project successfully.

Metadata enrichment issues

You might encounter these known issues when you work with metadata enrichment.

Running primary key or relations analysis doesn't update the enrichment and review statuses

Applies to: 5.2.0 and later

The enrichment status is set or updated when you run a metadata enrichment with the configured enrichment options (Profile data, Analyze quality, Assign terms). However, the enrichment status is not updated when you run a primary key analysis or a relationship analysis. In addition, the review status does not change from Reviewed to Reanalyzed after review if new keys or relationships were identified.

Issues with the Microsoft Excel add-in

Applies to: 5.2.0 and later

The following issues are known for the Review metadata add-in for Microsoft Excel:

  • When you open the drop-down list to assign a business term or a data class, the entry Distinctive name is displayed as the first entry. If you select this entry, it shows up in the column but does not have any effect.

  • Updating or overwriting existing data in a spreadsheet is currently not supported. You must use an empty template file whenever you retrieve data.

  • If another user works on the metadata enrichment results while you are editing the spreadsheet, the other user's changes can get lost when you upload the changes that you made in the spreadsheet.

  • Only assigned data classes and business terms are copied from the spreadsheet columns Assigned / suggested data classes and Assigned / suggested business terms to the corresponding entry columns. If multiple business terms are assigned, each one is copied to a separate column.

Republishing doesn't update primary key information in catalog

Applies to: 5.2.0 and later

If you remove primary key information from a data asset that initially was published with the primary key information to a catalog with the duplicate-asset handling method Overwrite original assets in the metadata enrichment results and then republish the asset to that catalog, the primary key information on the catalog asset remains intact.

Workaround: Delete the existing catalog asset before you republish the data asset from the metadata enrichment results.

Masked data might be profiled when the data source is IBM watsonx.data

Applies to: 5.2.0 and later

If a user who is not the owner of a protected data asset in IBM watsonx.data adds such an asset to a project and runs metadata enrichment on it, the masked data is sent for profiling. As a result, even the asset owner will see the profile with masked data.

Workaround: None.

Shallow relationship analysis might not return results

Applies to: 5.2.0
Fixed in: 5.2.1

If you run a shallow key relationship analysis on data assets for which no profiles exist, the relationship analysis job runs and completes without error, but no key relationships are computed. Neither the job status nor the job run log shows error messages or other information explaining the missing relationships.

Workaround: Run metadata enrichment with the Profile data objective or run advanced profiling on the data assets for which you want to identify relationships before you run a key relationship analysis.

Review key relationships button might remain disabled after analysis run

Applies to: 5.2.0
Fixed in: 5.2.1

After running a key relationship analysis, the Review key relationships button might remain disabled even if relationships were assigned. In that case, you can't access and review the results.

Workaround: To enable the button to be able to view the relationship information, refresh the browser page.

Users with Editor role in the project and the catalog can't publish enriched assets

Applies to: 5.2.0
Fixed in: 5.2.0 day 0 patch

Publishing enriched data assets to a catalog fails for users who have the Editor role in the project and who are asset editors with the Editor role in the target catalog.

Workaround: Change the user's catalog collaborator role to Admin or have the asset owner publish the enriched assets.

No display names are generated if the retry count is exceeded

Applies to: 5.2.0 and 5.2.1
Fixed in: 5.2.2

When you run metadata enrichment with the Expand metadata objective to generate display names, no names might be suggested or assigned if the internal retry count is exceeded.

Workaround: Rerun the metadata enrichment.

In small system configurations, large metadata enrichment jobs can fail

Applies to: 5.2.2

In a system that is configured for smaller workloads, metadata enrichments that contain a lot of assets can fail with out-of-memory errors.

Workaround: To resolve the issue, you have these options:

  • Increase the CPU and memory values of the profiling pod.
  • Add one more replica.

Update these parameters in the wkc-cr custom resource:

  • For the number of replicas: the wdp_profiling_min_replicas and wdp_profiling_max_replicas values
  • For the CPU and memory values: the requests and limits entries for the wdp_profiling_resources parameter

Use the oc patch command to update the values. See the following example:

oc patch wkc wkc-cr -n ${PROJECT_CPD_INSTANCE} --type=merge -p '{"spec":{"wdp_profiling_min_replicas":"4","wdp_profiling_max_replicas":"4","wdp_profiling_resources":{"requests":{"cpu": "300m", "memory": "600Mi"}, "limits":{"cpu": "4000m", "memory": "8192Mi"}}}}'

Data quality issues

You might encounter these known issues when you work with data quality assets.

Rules bound to columns of the data type NUMERIC in data assets from Oracle data sources might not work

Applies to: 5.2.0 and later

Testing or running a data quality rule that is bound to a NUMERIC column in a data asset from an Oracle data source fails if the data source is connected through a Generic JDBC connection.

Workaround: Use the native connector.

Runs of migrated data quality rules complete with warnings

Applies to: 5.2.0 and later

When you run a data quality rule that was migrated from InfoSphere Information Server, you might see a message like the following one:

7/2/2025 11:22:30 WARNING IIS-DSEE-TFIP-00072 <Modify_1> When checking operator: When binding output schema variable "outRec": When binding output interface field "col1" to field "col1": Implicit conversion from source type "int32" to result type "string[variable_max=10]": Converting number to string.

Such warnings are displayed when you run the DataStage job for a data quality rule that was created in InfoSphere Information Server as a Rule Stage with an output link of the type Violation details and then migrated to Cloud Pak for Data.

Workaround: You can ignore such warnings, or set up a DataStage message handler to suppress such messages or to reduce their severity.

Accessing data quality information might show an error message

Applies to: 5.2.0
Fixed in: 5.2.1

When you access data quality information for an asset or a column from the metadata enrichment results or an asset's Data quality page, you might see an error message. This can happen when the connection session times out.

Workaround: Retry the action, for example, open the profiling results again.

Changes to toggle status on the Data quality page might not be reflected in the catalog after publishing an asset

Applies to: 5.2.0
Fixed in: 5.2.1

If you switch off the Contributes to overall score option for a column on an asset's Data quality page in the project and publish the asset, the toggle status on the Data quality page in the catalog is still On. The actual data quality score is properly reflected.

Historical stability check might not immediately reflect changes in row count

Applies to: 5.2.0
Fixed in: 5.2.1

If the row count for an asset was constant, a sudden change is not immediately reflected in the score provided by the historical stability check. However, when you drill down into the results, you can see the changes in row count.

Workaround: Rerun the historical stability check to have the score recalculated.

Data quality rules with duplicated variable names from multiple data quality definitions might not work

Applies to: 5.2.0 and later

When you create a data quality rule from multiple data quality definitions that reference the same variable name, the variables are internally renamed for differentiation by adding a suffix. If the renaming results in a name collision with other variables from the definitions, the data quality rule will not work.

Possible values check doesn't work for numeric columns

Applies to: 5.2.0 and later

If you create a possible values check for a numeric column in metadata enrichment, data quality analysis returns an incorrect result. All values in the column are flagged as incorrect because the input values are treated as string values instead of numeric values.

Table names for rule output on DataStax Enterprise can't contain hyphens

Applies to: 5.2.2

If you configure a rule output table on DataStax Enterprise with a name that contains a hyphen, running the data quality rule fails with an output error. This applies to user-defined table names and to names that are dynamically created through the use of parameters.

Workaround: Specify a table name that does not contain any hyphens. Do not use parameters to dynamically create the table name.

Can't drill down into data quality output on DataStax Enterprise

Applies to: 5.2.2

When you try to access details of data quality issues that are stored in a table on DataStax Enterprise, an error similar to the following one is displayed:

 Loading failed values failed.
 There was a problem loading the values that failed profiling.
 Internal Server Error Failed to get Flight info: CDICO2005E: Table could not be found:
 SQL syntax error: [IBM][Cassandra JDBC Driver][Cassandra] syntax error or access rule
 violation: base table or view not found: dgexceptions. If the table exists, then the user
 may not be authorized to see it.

This error occurs because the data quality output table is created with a name that is case-sensitive in the underlying Cassandra database, so an uppercase table name is generated. However, the SELECT statement for the lookup is constructed so that Cassandra looks for a lowercase table name, and the table is not found.

Generative AI capabilities

You might encounter these known issues when you work with the generative AI capabilities in metadata enrichment or use the Text2SQL feature.

Socket timeout errors can occur for Text2SQL API calls

Applies to: 5.2

Socket timeout errors can occur for Text2SQL API calls if the load balancer timeout setting for watsonx.ai is too low.

Workaround: Check the load balancer configuration and make sure that the timeout is set to 600s (10m). For more information, see Changing load balancer timeout settings.

Can't work with a remote watsonx.ai instance if the system is configured to run models on CPU

Applies to: 5.2.0 and later

If you installed IBM Knowledge Catalog or watsonx.data intelligence with the option enableModelsOn:cpu, the models used for the generative AI capabilities run locally on CPU. You cannot work with foundation models on a remote watsonx.ai instance.

For more information about setting up a connection with a remote watsonx.ai instance, see Configuring the setup for enabled gen AI capabilities in the Cloud Pak for Data documentation or Configuring the setup for enabled gen AI capabilities in the watsonx.data intelligence documentation.

Workaround: To be able to work with remote models, change your system configuration:

  1. Log in to Red Hat OpenShift Container Platform as a user with sufficient permissions to complete the task.

    oc login <your_openshift_cluster_url>
    
  2. Set the context to the project where IBM Knowledge Catalog or watsonx.data intelligence is deployed:

    oc project ${PROJECT_CPD_INST_OPERANDS}
    
  3. Patch the semanticautomation-cr custom resource to set the installation parameter enableModelsOn to remote.

     oc patch sal semanticautomation-cr --type=merge -p '{"spec":{"enableModelsOn": "remote"}}'
    

MANTA Automated Data Lineage

You might encounter these known issues and restrictions when MANTA Automated Data Lineage is used for capturing lineage.

Metadata import jobs for getting lineage might take very long to complete

Applies to: 5.2.0 and later

If multiple lineage scans are requested at the same time, the corresponding metadata import jobs for getting lineage might take very long to complete. This is because MANTA Automated Data Lineage workflows can't run in parallel and are executed sequentially.

Chrome security warning for Cloud Pak for Data deployments where MANTA Automated Data Lineage for IBM Cloud Pak for Data is enabled

Applies to: 4.8.0 and later

When you try to access a Cloud Pak for Data cluster that has MANTA Automated Data Lineage for IBM Cloud Pak for Data enabled from the Chrome web browser, the message Your connection is not private is displayed and you can't proceed. This is due to MANTA Automated Data Lineage for IBM Cloud Pak for Data requiring an SSL certificate to be applied and occurs only if a self-signed certificate is used.

Workaround: To bypass the warning for the remainder of the browser session, type thisisunsafe anywhere in the window. Note that this code changes occasionally; the code mentioned here is valid as of the general availability date of Cloud Pak for Data 4.6.0. You can search the web for the updated code if necessary.

Columns are displayed as numbers for a DataStage job lineage in the catalog

Applies to: 5.2.0 and later

The columns for a lineage that was imported from a DataStage job are not displayed correctly in the catalog. Instead of column names, column numbers are displayed. The issue occurs when the source or target of a lineage is a CSV file.

MANTA Automated Data Lineage will not function properly on IBM Knowledge Catalog Standard

Applies to: 5.2.0 and later

If you install MANTA Automated Data Lineage when you have IBM Knowledge Catalog Standard installed as the prerequisite, MANTA will not function properly.

If you want to install MANTA, you must have IBM Knowledge Catalog Premium installed.

Not all stages are displayed in technical data lineage graph for the imported DataStage ETL flow

Applies to: 5.2.0 and later

When you import a DataStage ETL flow and view it in the technical data lineage graph, only three stages are displayed, even when four stages were imported.

Workaround: By default, three connected elements are displayed in the graph. To display more elements, click the expand icon on the last or the first displayed element on the graph.

Can't use the Get lineage scenario with a Db2 connection in FIPS environments

Applies to: 5.2.0 and later

If you try to import metadata for data assets with the Get lineage scenario with a Db2 connection in a FIPS environment, the metadata import fails and the following error message is displayed.

Error creating the metadata import Metadata import creation failed due to connection validation errors. Not all connections in the metadata import passed validation. Check the log for the complete validation results.

Can’t create metadata hierarchy because REST API template isn’t available in Open Manta Designer

Applies to: 5.2.0
Fixed in: 5.2.1

You can't create any metadata hierarchy for the REST API resource type in Open Manta Designer, because the manta-open-manta-designer-common-42.5.0.jar\resource_templates\REST API.yml file isn't available. In addition, if you try to import any existing metadata with the REST API resource from Manta Flow Server to Open Manta Designer, the import fails.

Workaround: Contact IBM Support to obtain the manta-open-manta-designer-common-42.5.0.jar\resource_templates\REST API.yml file.

Data Virtualization connection fails when SSL is required

Applies to: 5.2.0

When you create a connection to Data Virtualization and SSL is required, the metadata import lineage job fails with an exception even though the connection validation passed.

Workaround:

  1. In the project where the metadata lineage import job will be executed, create a new independent Data Virtualization connection.

  2. Download the Data Virtualization SSL certificate from CP4D Menu > Data > Data virtualization > Menu > Configure connection > Download SSL Certificate.

  3. In the project where you run the MDI lineage job, create a new non-platform Data Virtualization connection with exactly the same details as the platform Data Virtualization connection in CP4D Menu > Data > Connections, and enter the SSL certificate contents in the SSL certificate field.

  4. Use that connection to run the metadata lineage import job.

Data export is incomplete due to table deduction in Query Service

Applies to: 5.2.0 and later

When you reference a table from another system in the Query Service by using the synonym of that table, the resulting object is deduced and the exported data is incomplete.

Importing lineage for Microsoft Azure SQL Database fails

Applies to: 5.2.1
Fixed in: 5.2.2

When you create a lineage metadata import with the Get lineage goal from the Microsoft Azure SQL Database connection, an error occurs and the import is not created. This issue occurs when you use MANTA Automated Data Lineage.

Business lineage issues

You might encounter these known issues and restrictions with lineage.

Business data lineage is incomplete for the metadata imports with Get ETL job lineage or Get BI report lineage goals

Applies to: 5.2.0 and later

In some cases, when you display business lineage between databases and ETL jobs or BI reports, some assets are missing, for example, a starting database. The data was imported by using the Get ETL job lineage or Get BI report lineage import option. Technical data lineage correctly shows all assets.

Workaround: Sometimes MANTA Automated Data Lineage cannot map the connection information from an ETL job or a BI report to the existing connections in IBM Knowledge Catalog. Follow these steps to solve the issue:

  1. Open the MANTA Automated Data Lineage Admin UI:

    https://<CPD-HOSTNAME>/manta-admin-gui/
    
  2. Go to Log Viewer and from the Source filter select Workflow Execution.

  3. From the Workflow Execution filter, select the name of the lineage workflow that is associated with the incomplete business lineage.

  4. Look for the dictionary_manta_mapping_errors issue category and expand it.

  5. In each entry, expand the error and click View Log Details.

  6. In each error details, look for the value of connectionString. For example, in the following error message, the value of the connectionString parameter is DQ DB2 PX.

    2023/11/14 18:40:12.186 PM [CLI] WARN - <provider-name> [Context: [DS Job 2_PARAMETER_SET] flow in project [ede1ab09-4cc9-4a3f-87fa-8ba1ea2dc0d8_lineage]]
    DICTIONARY_MANTA_MAPPING_ERRORS - NO_MAPPING_FOR_CONNECTION
    User message: Connection in use could not be automatically mapped to one of the database connections configured in MANTA.
    Technical message: There is no mapping for the connection Connection [type=DB2, connectionString=DQ DB2 PX, serverName=dataquack.ddns.net, databaseName=cpd, schemaName=null, userName=db2inst1].
    Solution: Identify the particular database technology DB2 leading to "DQ DB2 PX" and configure it as a new connection or configure the manual mapping for that database technology in MANTA Admin UI.
    Lineage impact: SINGLE_INPUT
    
  7. Depending on the connection that you used for the metadata import, go to Configuration > CLI > connection server > connection server Alias Mapping, for example DB2 > DB2 Alias Mapping.

  8. Select the connection used in workflow and click Full override.

  9. In the Connection ID field, add the value of the connectionString parameter that you found in the error details, for example DQ DB2 PX.

  10. Rerun the metadata import job in IBM Knowledge Catalog.

Hops for components and columns of components work only inside the data integration flow area of an expanded job node

Applies to: 5.2.0

When working with the lineage graph, hops for components and columns of data integration components work only inside the data integration flow area of an expanded job node and don't connect columns of nodes outside of the flow area.

Filtering business lineage might not filter assets

Applies to: 5.2.1
Fixed in: 5.2.2

When you apply a filter to a business lineage, the graph on the UI might contain assets that do not match the filter.

Viewing large business lineage graph results in a service error

Applies to: 5.2.0 and later

When you click the Lineage tab for a large business lineage graph, after around two minutes the following error message is displayed:

Lineage service error. An error occurred in the lineage service. Try to reload the lineage graph, and if the error persists, contact your administrator.

When you click the Lineage tab again, the lineage generation process starts, but after a while the following error is displayed:

Error 404 – Not Found

Workaround: To fix the issue, modify some of the configuration settings:

  1. Increase the timeout value in nginx pod:
    1. Open the configuration YAML file by running the command oc edit zenextension wkc-base-routes -n ${PROJECT_CPD_INSTANCE}.
    2. In the section location /data/catalogs, change the value of the proxy_read_timeout variable to 10m, and save the changes.
    3. Restart the nginx pod by running the command oc rollout restart deploy/ibm-nginx -n ${PROJECT_CPD_INST_OPERANDS}
  2. Allocate more memory to the service wkc-data-lineage-service, 8 GB or more, and enable horizontal scaling for the service:
    1. Edit the lineage deployment by running the command oc edit deploy -n ${PROJECT_CPD_INSTANCE} wkc-data-lineage-service.
    2. In the containers section, change the default graph memory threshold by adding the variable lineage_memory_threshold_limit:
      - name: lineage_memory_threshold_limit
        value: "0.5"
      
    3. In the resources/limits section, increase the default memory allocation to 8192Mi:
      resources:
        limits:
          cpu: "2"
          ephemeral-storage: 1Gi
          memory: 8192Mi
        requests:
          cpu: 250m
          ephemeral-storage: 50Mi
          memory: 512Mi
      
    4. Change these settings to horizontally scale the lineage service to 2–3 instances:
      spec:
        progressDeadlineSeconds: 600
        replicas: 3
        revisionHistoryLimit: 10
      

After you modify the configuration, try to open the Lineage tab again. If the error still occurs, change the value of the lineage_memory_threshold_limit parameter to 0.4.

Relationship explorer issues

You might encounter these known issues and restrictions with relationship explorer.

Governance artifacts and assets can’t be viewed on the canvas

Applies to: 5.2.0
Fixed in: 5.2.1

After upgrading to Cloud Pak for Data 5.2.0, newly created governance artifacts and assets don't appear on the relationship explorer canvas.

Workaround: Complete these steps:

  1. Run the following command:

    oc get pod -n <CPD_NAMESPACE> | grep ingestion
    

    Example of an output:

    wdp-kg-ingestion-service-dbb96df99-bjx4g       1/1     Running                  41 (2d18h ago)   8d
    wdp-kg-ingestion-service-dbb96df99-jtzjt       1/1     Running                  37 (2d18h ago)   8d
    wdp-kg-ingestion-service-dbb96df99-xh24z       1/1     Running                  27 (2d19h ago)   8d
    
  2. Delete every pod that was returned by the previous command:

    oc delete pod wdp-kg-ingestion-service-dbb96df99-bjx4g
    oc delete pod wdp-kg-ingestion-service-dbb96df99-jtzjt
    oc delete pod wdp-kg-ingestion-service-dbb96df99-xh24z
    
  3. Wait for the old pods to terminate. New pods are started automatically. Wait for the new pods to be in the 1/1 Running state by running the command again:

    oc get pod -n <CPD_NAMESPACE> | grep ingestion
    
  4. To view missing assets, resync your lineage metadata. See Resync of lineage metadata.

  5. To view missing governance artifacts, run the following command:

    curl -k -X POST "https://<hostname>/v3/glossary_terms/admin/resync?artifact_type=all" --header "Content-Type: application/json" --header "Accept: application/json" --header "Authorization: Bearer ${TOKEN}" -d '{}'
    

    Replace <hostname> with your Cloud Pak for Data installation hostname.

  6. Alternatively, you can reimport your data for both assets and governance artifacts to CAMS.
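
To generate the bearer token that is used in step 5, you can call the platform authorization API, as in the custom workflows section earlier; a sketch that assumes jq is available:

TOKEN=$(curl -k -s -X POST https://<hostname>/icp4d-api/v1/authorize \
  -H 'content-type: application/json' \
  -d '{"username":"<USERNAME>","password":"<PASSWORD>"}' | jq -r .token)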

404 Error occurs when opening relationship explorer

Applies to: 5.2.0 and later

Clicking the Explore relationships button from a catalog or project asset results in a 404 error.

Workaround: Check whether your Cloud Pak for Data deployment includes an edition of the IBM Knowledge Catalog service. The relationship explorer is an IBM Knowledge Catalog feature and can't be accessed without the service.

Primary and foreign keys are not visible for Google BigQuery assets

Applies to: 5.2.0 and later

When you select a Google BigQuery asset on the relationship explorer canvas, the Item metadata panel does not show primary and foreign keys.

Removed categories still appear in the relationship explorer

Applies to: 5.2.0
Fixed in: 5.2.1

When a category is removed from a relationship, the relationship explorer still displays the deleted category. Even after you update or replace a category relationship, the old category might continue to appear in the relationship explorer, despite no longer being associated with the asset.

Not all available governance artifacts are displayed on relationship explorer

Applies to: 5.2.1 and later

When you visualize data in the relationship explorer, not all available governance artifacts are listed as related items. For example, when you select a primary category as the basis of the visualization, only some of its related governance artifacts can be added to the canvas.

Limitations

Catalogs and projects

Duplicate actions fail if dynamic IP addresses are used

Applies to: 5.2.0 and later

Duplicate actions work only for connections with static IP addresses. If the connection is using a hostname with a dynamic IP address, duplicate actions might fail during connection creation.

Long names of the asset owners get truncated when hovering over their avatars

Applies to: 5.2.0

When you hover over the avatar to show the full name of the asset owner in the side panel, the name gets truncated if it is longer than 40 characters. A name that is longer than 40 characters displays correctly only if it contains a space or '-' within the first 40 characters.

Can't add individual group members as asset members

Applies to: 5.2.0 and later

You can't add individual group members directly as asset members. You can, however, add individual group members as catalog collaborators first and then as asset members.

Catalog asset search doesn't support special characters

Applies to: 5.2.0 and later

If search keywords contain any of the following special characters, the search filter doesn't return the most accurate results.

Special characters:

. + - && || ! ( ) { } [ ] ^ " ~ * ? : \

Workaround: To obtain the most accurate results, search only for the keyword after the special character. For example, instead of AUTO_DV1.SF_CUSTOMER, search for SF_CUSTOMER.

Missing default catalog and predefined data classes

Applies to: 5.2.0

The automatic creation of the default catalog after installation of the IBM Knowledge Catalog service can fail. If it does, the predefined data classes are not automatically loaded and published as governance artifacts.

Workaround: Ask someone with the Administrator role to follow the instructions for creating the default catalog manually.

Special or double-byte characters in the data asset name are truncated on download

Applies to: 5.2

When you download a data asset with a name that contains special or double-byte characters from a catalog, these characters might be truncated from the name. For example, a data asset named special chars!&@$()テニス.csv will be downloaded as specialchars!().csv.

The following character sets are supported:

  • Alphanumeric characters: 0-9, a-z, A-Z
  • Special characters: ! - _ . * ' ( )

Catalog UI does not update when changes are made to the asset metadata

Applies to: 5.2

If the Catalog UI is open in a browser while an update is made to the asset metadata, the page does not automatically refresh to reflect the change. Outdated information continues to be displayed, which can cause external processes to produce incorrect information.

Workaround: After the asset metadata is updated, refresh the Catalog UI page at the browser level.

A blank page might be rendered when you search for terms while manually assigning terms to a catalog asset

Applies to: 5.2

When you search for a term to assign to a catalog asset and change that term while the search is running, a blank page might be shown instead of the search results.

Workaround: Rerun the search.

Project assets that are added while you create segmented data assets might not be available for selection

Applies to: 5.2

If assets are added to the project while you are viewing the list of data assets to pick the column for segmentation, these new assets are listed, but you cannot select them.

Workaround: Cancel the creation process and start anew.

An extra path to manage catalogs in the navigation menu

Applies to: 5.2.0

If you have the Manage catalogs user permission, an extra Administration > Catalogs path to manage catalogs shows up in the navigation menu.

Migrated connections aren't listed during importing from the asset browser

Applies to: 5.2.0

After you migrate connections and you try to import assets from any of the migrated connections, the asset browser is empty. No connections are listed even though the connection is available in the project.

Row filter data protection rules are not available for previews of SQL query asset type

Applies to: 5.2.0

Row filter data protection rules do not work for previews of the SQL query asset type. If any row filter rule applies to an SQL query asset in a governed catalog, the asset viewer sees an error.

Workaround: Choose one of the following methods:

  • As a Data Steward, disable the row filter rule, which is applied on the SQL query asset.
  • Edit the SQL query to include the required WHERE predicates when you create the asset in the project, and then publish it to the catalog.

Pagination selection dropdown no longer available on the Catalogs page

Applies to: 5.2.1 and later

The pagination selection dropdown that lets you navigate to a specific page is no longer available on the Catalogs page. You must use the left (previous page) or right (next page) arrows to get to the required page.

Governance artifacts

Cannot use CSV to move data class between Cloud Pak for Data instances

Applies to: 5.2.0

If you try to export data classes with the matching method Match to reference data to CSV, and then import them into another Cloud Pak for Data instance, the import fails.

Workaround: For moving governance artifact data from one instance to another, especially data classes of this matching method, use the ZIP format export and import. For more information about the import methods, see Import methods for governance artifacts in the Cloud Pak for Data documentation.

Unable to use masked data in visualizations from data assets imported from version 4.8 or earlier

Applies to: 5.2.0 and later

If you import data assets with masked data from version 4.8 or earlier into your project, you cannot use these assets to create visualizations. If you attempt to generate a chart in the Visualization tab of a data asset from an imported asset that has masked data, the following error message is displayed: Bad Request: Failed to retrieve data from server. Masked data is not supported.

Workaround: To properly mask data with imported data assets in visualization, you must configure your platform with Data Virtualization as a protection solution. For more information, see the Data Virtualization as a protection solution section of the Protection solutions for data source definitions topic.

Metadata import

Metadata import jobs might be stuck due to issues related to RabbitMQ

Applies to: 5.2

If the metadata-discovery pod starts before the rabbitmq pods are up after a cluster reboot, metadata import jobs can get stuck while attempting to get the job run logs.

Workaround: To fix the issue, complete the following steps:

  1. Log in to the OpenShift console by using admin credentials.
  2. Go to Workloads > Pods.
  3. Search for rabbitmq.
  4. Delete the rabbitmq-0, rabbitmq-1, and rabbitmq-2 pods. Wait for the pods to be back up and running. To perform the deletions from the command line instead, see the sketch after this list.
  5. Search for discovery.
  6. Delete the metadata-discovery pod. Wait for the pod to be back up and running.
  7. Rerun the metadata import job.
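
If you prefer the oc CLI to the OpenShift console, the pod deletions in steps 4 and 6 can be done as in the following sketch; the grep filter for the metadata-discovery pod is an assumption based on the pod name above, and ${PROJECT_CPD_INST_OPERANDS} must be set to your instance namespace:

    # Delete the RabbitMQ pods; they are restarted automatically
    oc delete pod rabbitmq-0 rabbitmq-1 rabbitmq-2 -n ${PROJECT_CPD_INST_OPERANDS}
    # After the RabbitMQ pods are Running again, delete the metadata-discovery pod
    oc delete pod $(oc get pod -n ${PROJECT_CPD_INST_OPERANDS} -o custom-columns="Name:metadata.name" --no-headers | grep metadata-discovery) -n ${PROJECT_CPD_INST_OPERANDS}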

Data assets might not be imported when running an ETL job lineage import for DataStage flows

Applies to: 5.1

When you create and run a metadata import with the goal Get ETL job lineage where the scope is determined by the Select all DataStage flows and their dependencies in the project option, data assets from the connections associated with the DataStage flows are not imported.

Workaround: Explicitly select all DataStage flows and connections when you set the scope instead of using the Select all DataStage flows and their dependencies in the project option.

When a job for importing lineage metadata hangs, it cannot be stopped

Applies to: 5.2.0 and later

When you run a lineage metadata import and the job stops responding, the job can't be stopped.

Only files with .sql extension can be provided as manual input for metadata import from the Oracle and PostgreSQL sources

Applies to: 5.2.0 and later

When you import metadata from Oracle and PostgreSQL sources, only .sql files can be used as manual input. Other formats, such as files with the .pck extension, can't be used. This limitation applies when IBM Manta Data Lineage is installed.

Assets and lineage might not be imported if many connections use the same data source

Applies to: 5.2.0

If more than one connection points to the same data source, for example to the same Db2 database, importing lineage might not be successful. Assets and lineage metadata might not be imported in that case. When you create a connection to use with metadata import, make sure that only one connection points to a selected data source.

Metadata enrichment

Profiling in catalogs, projects, and metadata enrichment might fail for Teradata connections

Applies to: 5.2

If a Generic JDBC connection for Teradata exists with a driver version before 17.20.00.15, profiling in catalogs and projects, and metadata enrichment of data assets from a Teradata connection fails with an error message similar to the following one:

2023-02-15T22:51:02.744Z - cfc74cfa-db47-48e1-89f5-e64865a88304 [P] ("CUSTOMERS") - com.ibm.connect.api.SCAPIException: CDICO0100E: Connection failed: SQL error: [Teradata JDBC Driver] [TeraJDBC 16.20.00.06] [Error 1536] [SQLState HY000] Invalid connection parameter name SSLMODE (error code: DATA_IO_ERROR)

Workaround: For this workaround, users must be enabled to upload or remove JDBC drivers. For more information, see Enable users to upload, delete, or view JDBC drivers.

Complete these steps:

  1. Go to Data > Connectivity > JDBC drivers and delete the existing JAR file for Teradata (terajdbc4.jar).
  2. Edit the Generic JDBC connection, remove the selected JAR files, and add SSLMODE=ALLOW to the JDBC URL. An example URL follows this list.
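
For illustration, the resulting JDBC URL might look like the following sketch, where <host> and <database> are placeholders and any other parameters in your URL remain unchanged:

    jdbc:teradata://<host>/DATABASE=<database>,SSLMODE=ALLOW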

For assets from SAP OData sources, the metadata enrichment results do not show the table type

Applies to: 5.2

In general, metadata enrichment results show for each enriched data asset whether the asset is a table or a view. This information cannot be retrieved for data assets from SAP OData data sources and is thus not shown in the enrichment results.

Data quality

Rules run on columns of type timestamp with timezone fail

Applies to: 5.2

The data type timestamp with timezone is not supported. You can't apply data quality rules to columns with that data type.

Business lineage

An unnecessary edge appears when expanding data integration assets

Applies to: 5.2.0 and later

After you expand a data integration asset and click Show next or Show all, the transformer nodes have an unnecessary edge that points back to themselves.