Known issues and limitations for IBM Knowledge Catalog

The following known issues and limitations apply to IBM Knowledge Catalog.

Known issues

General

Installing, upgrading, and uninstalling

Migration and removal of legacy functions

For known issues with migration and removal of legacy functions, see Known issues for migration and Known issues for migration from InfoSphere Information Server.

Catalogs and Projects

Governance artifacts

Governance artifact workflows

Custom workflows

Metadata import

Metadata enrichment

Data quality

MANTA Automated Data Lineage for IBM Cloud Pak for Data

Lineage

Also see:

Limitations

Catalogs and Projects

Governance artifacts

Metadata enrichment

Data quality

General issues

You might encounter these known issues and restrictions when you work with the IBM Knowledge Catalog service.

Assets imported with the user admin instead of cpadmin

For Cloud Pak for Data clusters with Identity Management Service enabled, the default administrator is cpadmin. However, for import, the default administrative user admin is used. Therefore, the assets are imported with the admin user instead of cpadmin.

Applies to: 4.8.0 and later

Workaround:

Before running the import, apply the following workaround:

  1. Edit the config map by executing oc edit cm catalog-api-exim-cm

  2. Manually update the environment variable admin_username in import-job.spec.template.spec.env from:

    - name: admin_username
    value: ${admin_username}
    

    to:

    - name: admin_username
    value: cpadmin
    

The services catalog page on the user interface displays two IBM Knowledge Catalog tiles

Applies to: 4.8.0 and later

When you navigate to the services catalog page in the user interface (UI) and select a Data governance service, you might see two tiles that display IBM Knowledge Catalog.

Workaround: To fix this issue, restart the zen-watcher pod by running:

oc delete pod zen-watcher-nnnnnn-nnnnn -n ${PROJECT_CPD_INST_OPERANDS}

After successfully restarting the pod, the services catalog page shows only one IBM Knowledge Catalog tile.
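Because the exact zen-watcher pod name includes generated suffixes, you can look it up first; for example:

oc get pods -n ${PROJECT_CPD_INST_OPERANDS} | grep zen-watcher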

Heavy I/O load can cause out-of-memory failures of the wkc-db2u instance

Applies to: 4.8.2 and later

After a metadata enrichment job fails, you see that the pods for the glossary service, data quality rules, and wkc-db2 were restarted. When you check the status of the wkc-db2 pod, you see the following error:

Error:
  terminated:
          exitCode: 143
          reason: OOMKilled

This error indicates that resource limits must be increased.

Workaround: Scale up the Db2 instance for the IBM Knowledge Catalog service on Cloud Pak for Data to enhance availability and increase processing capacity. Allocate additional memory and CPU resources to the existing Db2 deployment by completing these steps:

  1. Specify the CPU and memory limit. In this example, CPU is set to 8 vCPU and memory is set to 15 Gi. Modify the values according to your needs.

    oc patch db2ucluster db2oltp-wkc --type=merge --patch '{"spec": {
    "podConfig": {
        "db2u": {
            "resource": {
                "db2u": {
                    "limits": {
                        "cpu": "8",
                        "memory": "15Gi"
                    }
                }
            }
        }
    }
    }}'
    
  2. Wait for the c-db2oltp-wkc-db2u-0 pod to restart.

For more information, see Scaling up Db2 for IBM Knowledge Catalog. If needed, also complete steps 3 to 6 of the described procedure.
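To confirm that the new limits are in effect after the restart, you can query the pod's resource limits. This sketch assumes the container layout of the c-db2oltp-wkc-db2u-0 pod:

oc get pod c-db2oltp-wkc-db2u-0 -n ${PROJECT_CPD_INST_OPERANDS} -o jsonpath='{.spec.containers[*].resources.limits}'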

Installing, upgrading, and uninstalling

You might encounter these known issues while installing, upgrading, or uninstalling IBM Knowledge Catalog.

When you upgrade to version 4.8.x, new permissions aren't added to the Administrator and Data Quality Analyst roles

Applies to: Upgrades to 4.8.x

When you upgrade to version 4.8.x from version 4.7 or earlier, the new permissions Manage data quality SLA rules (manage_data_quality_sla_rules) and Drill down to issue details (data_quality_drill_down) are not added to the default permission sets of the Administrator (zen_administrator_role) and Data Quality Analyst (wkc_data_quality_analyst_role) roles.

Workaround: Reapply the content of the wkc-user-role-extensions ConfigMap:

  1. Log in to Red Hat OpenShift Container Platform as a user with sufficient permissions to complete the task.

    ${OC_LOGIN}
    
  2. Copy the wkc-user-role-extensions ConfigMap to a YAML file:

    oc get cm wkc-user-role-extensions -n ${PROJECT_CPD_INST_OPERANDS} -o yaml > wkc-user-role-extensions-reapply.yaml
    
  3. Edit the wkc-user-role-extensions-reapply.yaml file and apply these changes:

    • Remove the managedFields and creationTimestamp entries.
    • Change the name value.
  4. Apply the new ConfigMap manually:

    oc create -f wkc-user-role-extensions-reapply.yaml
    

After the ConfigMap is reapplied successfully, the Administrator and Data Quality Analyst roles should include the new permissions.
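If you prefer to script the edits in step 3 rather than change the YAML file by hand, the following sketch shows one way to do it. It assumes the yq v4 CLI is available, and the new ConfigMap name is only an example:

yq -i 'del(.metadata.managedFields, .metadata.creationTimestamp) | .metadata.name = "wkc-user-role-extensions-reapply"' wkc-user-role-extensions-reapply.yaml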

After upgrading to 4.8.0 or 4.8.1, no assets might be imported when you run a lineage import

Applies to: Upgrades to 4.8.0 or 4.8.1
Fixed in: 4.8.2

After upgrading to version 4.8.0 or 4.8.1, lineage imports might fail because the trust store in MANTA Automated Data Lineage is corrupted.

Workaround: To fix the issue, complete these steps:

  1. Open the administration GUI for MANTA Automated Data Lineage. Enter the following URL in a web browser replacing hostname with your Cloud Pak for Data cluster URL:

    https://<hostname>/manta-admin-gui/
    
  2. From the menu bar, select Configuration.

  3. From the menu, select Common > Common Config and click Edit.

  4. In the SSL section, click Edit next to MANTA Flow CLI System Connectors Settings.

  5. In the Edit truststore settings window, click Recreate.

  6. For the Use generated password option, select false.

  7. In the Password and Confirm Password fields, enter the default password mantaConnectorsTruststore and click Confirm.

  8. Save the new configuration.

When uninstalling, terminating PVCs might get stuck

Applies to: 4.8

During the uninstall of IBM Knowledge Catalog, the PVCs c-db2oltp-wkc-meta and c-db2oltp-iis-meta might get stuck in the Terminating state.

Note:

This issue might occur in environments that were upgraded, and it prevents new installations of IBM Knowledge Catalog unless the PVCs are removed altogether.

When you inspect the PVCs, you can see which pods are still using them and thus prevent them from being deleted. For example:

Used By:       c-db2oltp-wkc-11.5.8.0-cn1-to-11.5.8.0-cn5-f6d4j
               c-db2oltp-wkc-11.5.8.0-cn1-to-11.5.8.0-cn5-qhwzm
               db2u-ssl-rotate-db2oltp-wkc-k78kg

In the example, the c-db2oltp-wkc-meta PVC is still being used.
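You can check which pods still mount a PVC by describing it; for example, assuming the instance namespace variable used elsewhere in this topic:

oc describe pvc c-db2oltp-wkc-meta -n ${PROJECT_CPD_INST_OPERANDS} | grep -A 5 "Used By"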

Workaround: To ensure the PVC is properly deleted, the completed pods and jobs that are still mounting the PVC must be manually deleted.

Follow these steps to manually delete the PVCs:

  1. Delete completed jobs db2u-ssl-rotate-db2oltp-iis and db2u-ssl-rotate-db2oltp-wkc if they exist:

    oc delete job db2u-ssl-rotate-db2oltp-iis -n ${PROJECT_CPD_INST_OPERANDS} --ignore-not-found
    oc delete job db2u-ssl-rotate-db2oltp-wkc -n ${PROJECT_CPD_INST_OPERANDS} --ignore-not-found
    
  2. Delete the completed upgrade pods, if they exist:

    oc delete po c-db2oltp-wkc-11.5.8.0-cn1-to-11.5.8.0-cn5-f6d4j -n ${PROJECT_CPD_INST_OPERANDS}
    oc delete po c-db2oltp-wkc-11.5.8.0-cn1-to-11.5.8.0-cn5-qhwzm -n ${PROJECT_CPD_INST_OPERANDS}
    

When installing or upgrading IBM Knowledge Catalog, the wdp-profiling-iae-thirdparty-lib-volume-instance job might fail

Applies to: 4.8.0 and 4.8.1

During the deployment of IBM Knowledge Catalog, the wdp-profiling-iae-thirdparty-lib-volume-instance job might fail, and the following message appears in the IBM Knowledge Catalog custom resource (CR):

Failed at task: Deploy job resource and wait for it to complete - Item: iae-thirdparty-lib-volume-instance
      The error was: Failed to patch object: b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Job.batch \\"wdp-profiling-iae-thirdparty-lib-volume-instance\\" is invalid: spec.template: Invalid value: core.PodTemplateSpec{ObjectMeta:v1.ObjectMeta{Name:\\"\\", GenerateName:\\"\\", Namespace:\\"\\", SelfLink:\\"\\", UID:\\"\\", ResourceVersion:\\"\\", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:\\u003cnil\\u003e, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{\\"app\\":\\"wdp-profiling\\", \\"app.kubernetes.io/instance\\":\\"0075-wkc-lite\\", \\"app.kubernetes.io/managed-by\\":\\"Tiller\\", \\"app.kubernetes.io/name\\":\\"wdp-profiling-chart\\", \\"chart\\":\\"wdp-profiling-chart\\", \\"controller-uid\\":\\"67636a6b-1b5d-45c3-a960-d4b6948b3886\\", \\"helm.sh/chart\\":\\"wdp-profiling-chart\\", \\"heritage\\":\\"Tiller\\", \\"icpdsupport/addOnId\\":\\"wkc\\", \\"icpdsupport/app\\":\\"api\\", \\"icpdsupport/module\\":\\"wdp-profiling\\", \\"job-name\\":\\"wdp-profiling-iae-thirdparty-lib-volume-instance\\", \\"release\\":\\"0075-wkc-lite\\"}, Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference{v1.OwnerReference{APIVersion:\\"wkc.cpd.ibm.com/v1beta1\\", Kind:\\"WKC\\", Name:\\"wkc-cr\\", UID:\\"e59deb0f-2f4a-4e13-bfcf-274c6c267fa1\\", Controller:(*bool)(nil), BlockOwnerDeletion:(*bool)(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Spec:core.PodSpec{Volumes:[]core.Volume{core.Volume{Name:\\"wdp-profiling-iae-thirdparty-lib-volume-config\\", VolumeSource:core.VolumeSource{HostPath:(*core.HostPathVolumeSource)(nil), EmptyDir:(*core.EmptyDirVolumeSource)(nil), GCEPersistentDisk:(*core.GCEPersistentDiskVolumeSource)(nil), AWSElasticBlockStore:(*core.AWSElasticBlockStoreVolumeSource)(nil), GitRepo:(*core.GitRepoVolumeSource)(nil), Secret:(*core.SecretVolumeSource)(nil), NFS:(*core.NFSVolumeSource)(nil), ISCSI:(*core.ISCSIVolumeSource)(nil), Glusterfs:(*core.GlusterfsVolumeSource)(nil), PersistentVolumeClaim:(*core.PersistentVolumeClaimVolumeSource)(nil), RBD:(*core.RBDVolumeSource)(nil), Quobyte:(*core.QuobyteVolumeSource)(nil), FlexVolume:(*core.FlexVolumeSource)(nil), Cinder:(*core.CinderVolumeSource)(nil), CephFS:(*core.CephFSVolumeSource)(nil), Flocker:(*core.FlockerVolumeSource)(nil), DownwardAPI:(*core.DownwardAPIVolumeSource)(nil), FC:(*core.FCVolumeSource)(nil), AzureFile:(*core.AzureFileVolumeSource)(nil), ConfigMap:(*core.ConfigMapVolumeSource)(0xc0e8edf600), VsphereVolume:(*core.VsphereVirtualDiskVolumeSource)(nil), AzureDisk:(*core.AzureDiskVolumeSource)(nil), PhotonPersistentDisk:(*core.PhotonPersistentDiskVolumeSource)(nil), Projected:(*core.ProjectedVolumeSource)(nil), PortworxVolume:(*core.PortworxVolumeSource)(nil), ScaleIO:(*core.ScaleIOVolumeSource)(nil), StorageOS:(*core.StorageOSVolumeSource)(nil), CSI:(*core.CSIVolumeSource)(nil), Ephemeral:(*core.EphemeralVolumeSource)(nil), core.Volume{Name:\\"secrets-mount\\", VolumeSource:core.VolumeSource{HostPath:(*core.HostPathVolumeSource)(nil), EmptyDir:(*core.EmptyDirVolumeSource)(nil), GCEPersistentDisk:(*core.GCEPersistentDiskVolumeSource)(nil), AWSElasticBlockStore:(*core.AWSElasticBlockStoreVolumeSource)(nil), GitRepo:(*core.GitRepoVolumeSource)(nil), Secret:(*core.SecretVolumeSource)(nil), NFS:(*core.NFSVolumeSource)(nil), ISCSI:(*core.ISCSIVolumeSource)(nil), Glusterfs:(*core.GlusterfsVolumeSource)(nil), 
PersistentVolumeClaim:(*core.PersistentVolumeClaimVolumeSource)(nil), RBD:(*core.RBDVolumeSource)(nil), Quobyte:(*core.QuobyteVolumeSource)(nil), FlexVolume:(*core.FlexVolumeSource)(nil), Cinder:(*core.CinderVolumeSource)(nil), CephFS:(*core.CephFSVolumeSource)(nil), Flocker:(*core.FlockerVolumeSource)(nil), DownwardAPI:(*core.DownwardAPIVolumeSource)(nil), FC:(*core.FCVolumeSource)(nil), AzureFile:(*core.AzureFileVolumeSource)(nil), ConfigMap:(*core.ConfigMapVolumeSource)(nil), VsphereVolume:(*core.VsphereVirtualDiskVolumeSource)(nil), AzureDisk:(*core.AzureDiskVolumeSource)(nil), PhotonPersistentDisk:(*core.PhotonPersistentDiskVolumeSource)(nil), Projected:(*core.ProjectedVolumeSource)(0xc06a6d6640), PortworxVolume:(*core.PortworxVolumeSource)(nil), ScaleIO:(*core.ScaleIOVolumeSource)(nil), StorageOS:(*core.StorageOSVolumeSource)(nil), CSI:(*core.CSIVolumeSource)(nil), Ephemeral:(*core.EphemeralVolumeSource)(nil)}, InitContainers:[]core.Container(nil), Containers:[]core.Container{core.Container{Name:\\"wdp-profiling-iae-thirdparty-lib-volume-instance\\", Image:\\"cp.icr.io/cp/cpd/wkc-init-container-wkc@sha256:d49ce63df1c06546a22b0c483b1e3f2a2159c3e30d81208b9e30105dbc2d7a0e\\", Command:[]string{\\"/bin/sh\\", \\"/wkc/genkeys.sh\\"}, Args:[]string(nil), WorkingDir:\\"\\", Ports:[]core.ContainerPort(nil), EnvFrom:[]core.EnvFromSource(nil), Env:[]core.EnvVar{core.EnvVar{Name:\\"GATEWAY_HOST\\", Value:\\"\\", ValueFrom:(*core.EnvVarSource)(0xc06a6d6560), Resources:core.ResourceRequirements{Limits:core.ResourceList{\\"cpu\\":resource.Quantity{i:resource.int64Amount{value:500, scale:-3}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\\"500m\\", Format:\\"DecimalSI\\"}, \\"memory\\":resource.Quantity{i:resource.int64Amount{value:256, scale:6}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\\"256M\\", Format:\\"DecimalSI\\", Requests:core.ResourceList{\\"cpu\\":resource.Quantity{i:resource.int64Amount{value:100, scale:-3}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\\"100m\\", Format:\\"DecimalSI\\"}, \\"memory\\":resource.Quantity{i:resource.int64Amount{value:128, scale:6}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\\"128M\\", Format:\\"DecimalSI\\"}, VolumeMounts:[]core.VolumeMount{core.VolumeMount{Name:\\"wdp-profiling-iae-thirdparty-lib-volume-config\\", ReadOnly:false, MountPath:\\"/wkc\\", SubPath:\\"\\", MountPropagation:(*core.MountPropagationMode)(nil), SubPathExpr:\\"\\"}, core.VolumeMount{Name:\\"secrets-mount\\", ReadOnly:true, MountPath:\\"/etc/.secrets\\", SubPath:\\"\\", MountPropagation:(*core.MountPropagationMode)(nil), SubPathExpr:\\"\\", VolumeDevices:[]core.VolumeDevice(nil), LivenessProbe:(*core.Probe)(nil), ReadinessProbe:(*core.Probe)(nil), StartupProbe:(*core.Probe)(nil), Lifecycle:(*core.Lifecycle)(nil), TerminationMessagePath:\\"/dev/termination-log\\", TerminationMessagePolicy:\\"File\\", ImagePullPolicy:\\"IfNotPresent\\", SecurityContext:(*core.SecurityContext)(0xc0c3a7a060), Stdin:false, StdinOnce:false, TTY:false, EphemeralContainers:[]core.EphemeralContainer(nil), RestartPolicy:\\"Never\\", TerminationGracePeriodSeconds:(*int64)(0xc0e2059810), ActiveDeadlineSeconds:(*int64)(nil), DNSPolicy:\\"ClusterFirst\\", NodeSelector:map[string]string(nil), ServiceAccountName:\\"zen-norbac-sa\\", AutomountServiceAccountToken:(*bool)(0xc0e20596d5), NodeName:\\"\\", SecurityContext:(*core.PodSecurityContext)(0xc0551e50e0), ImagePullSecrets:[]core.LocalObjectReference(nil), Hostname:\\"\\", Subdomain:\\"\\", SetHostnameAsFQDN:(*bool)(nil), 
Affinity:(*core.Affinity)(0xc072d56870), SchedulerName:\\"default-scheduler\\", Tolerations:[]core.Toleration(nil), HostAliases:[]core.HostAlias(nil), PriorityClassName:\\"\\", Priority:(*int32)(nil), PreemptionPolicy:(*core.PreemptionPolicy)(nil), DNSConfig:(*core.PodDNSConfig)(nil), ReadinessGates:[]core.PodReadinessGate(nil), RuntimeClassName:(*string)(nil), Overhead:core.ResourceList(nil), EnableServiceLinks:(*bool)(nil), TopologySpreadConstraints:[]core.TopologySpreadConstraint(nil), OS:(*core.PodOS)(nil): field is immutable","reason":"Invalid","details":{"name":"wdp-profiling-iae-thirdparty-lib-volume-instance","group":"batch","kind":"Job","causes":[{"reason":"FieldValueInvalid","message":"Invalid value: core.PodTemplateSpec{ObjectMeta:v1.ObjectMeta{Name:\\"\\", GenerateName:\\"\\", Namespace:\\"\\", SelfLink:\\"\\", UID:\\"\\", ResourceVersion:\\"\\", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:\\u003cnil\\u003e, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{\\"app\\":\\"wdp-profiling\\", \\"app.kubernetes.io/instance\\":\\"0075-wkc-lite\\", \\"app.kubernetes.io/managed-by\\":\\"Tiller\\", \\"app.kubernetes.io/name\\":\\"wdp-profiling-chart\\", \\"chart\\":\\"wdp-profiling-chart\\", \\"controller-uid\\":\\"67636a6b-1b5d-45c3-a960-d4b6948b3886\\", \\"helm.sh/chart\\":\\"wdp-profiling-chart\\", \\"heritage\\":\\"Tiller\\", \\"icpdsupport/addOnId\\":\\"wkc\\", \\"icpdsupport/app\\":\\"api\\", \\"icpdsupport/module\\":\\"wdp-profiling\\", \\"job-name\\":\\"wdp-profiling-iae-thirdparty-lib-volume-instance\\", \\"release\\":\\"0075-wkc-lite\\"}, Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference{v1.OwnerReference{APIVersion:\\"wkc.cpd.ibm.com/v1beta1\\", Kind:\\"WKC\\", Name:\\"wkc-cr\\", UID:\\"e59deb0f-2f4a-4e13-bfcf-274c6c267fa1\\", Controller:(*bool)(nil), BlockOwnerDeletion:(*bool)(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Spec:core.PodSpec{Volumes:[]core.Volume{core.Volume{Name:\\"wdp-profiling-iae-thirdparty-lib-volume-config\\", VolumeSource:core.VolumeSource{HostPath:(*core.HostPathVolumeSource)(nil), EmptyDir:(*core.EmptyDirVolumeSource)(nil), GCEPersistentDisk:(*core.GCEPersistentDiskVolumeSource)(nil), AWSElasticBlockStore:(*core.AWSElasticBlockStoreVolumeSource)(nil), GitRepo:(*core.GitRepoVolumeSource)(nil), Secret:(*core.SecretVolumeSource)(nil), NFS:(*core.NFSVolumeSource)(nil), ISCSI:(*core.ISCSIVolumeSource)(nil), Glusterfs:(*core.GlusterfsVolumeSource)(nil), PersistentVolumeClaim:(*core.PersistentVolumeClaimVolumeSource)(nil), RBD:(*core.RBDVolumeSource)(nil), Quobyte:(*core.QuobyteVolumeSource)(nil), FlexVolume:(*core.FlexVolumeSource)(nil), Cinder:(*core.CinderVolumeSource)(nil), CephFS:(*core.CephFSVolumeSource)(nil), Flocker:(*core.FlockerVolumeSource)(nil), DownwardAPI:(*core.DownwardAPIVolumeSource)(nil), FC:(*core.FCVolumeSource)(nil), AzureFile:(*core.AzureFileVolumeSource)(nil), ConfigMap:(*core.ConfigMapVolumeSource)(0xc0e8edf600), VsphereVolume:(*core.VsphereVirtualDiskVolumeSource)(nil), AzureDisk:(*core.AzureDiskVolumeSource)(nil), PhotonPersistentDisk:(*core.PhotonPersistentDiskVolumeSource)(nil), Projected:(*core.ProjectedVolumeSource)(nil), PortworxVolume:(*core.PortworxVolumeSource)(nil), ScaleIO:(*core.ScaleIOVolumeSource)(nil), StorageOS:(*core.StorageOSVolumeSource)(nil), CSI:(*core.CSIVolumeSource)(nil), Ephemeral:(*core.EphemeralVolumeSource)(nil), core.Volume{Name:\\"secrets-mount\\", 
VolumeSource:core.VolumeSource{HostPath:(*core.HostPathVolumeSource)(nil), EmptyDir:(*core.EmptyDirVolumeSource)(nil), GCEPersistentDisk:(*core.GCEPersistentDiskVolumeSource)(nil), AWSElasticBlockStore:(*core.AWSElasticBlockStoreVolumeSource)(nil), GitRepo:(*core.GitRepoVolumeSource)(nil), Secret:(*core.SecretVolumeSource)(nil), NFS:(*core.NFSVolumeSource)(nil), ISCSI:(*core.ISCSIVolumeSource)(nil), Glusterfs:(*core.GlusterfsVolumeSource)(nil), PersistentVolumeClaim:(*core.PersistentVolumeClaimVolumeSource)(nil), RBD:(*core.RBDVolumeSource)(nil), Quobyte:(*core.QuobyteVolumeSource)(nil), FlexVolume:(*core.FlexVolumeSource)(nil), Cinder:(*core.CinderVolumeSource)(nil), CephFS:(*core.CephFSVolumeSource)(nil), Flocker:(*core.FlockerVolumeSource)(nil), DownwardAPI:(*core.DownwardAPIVolumeSource)(nil), FC:(*core.FCVolumeSource)(nil), AzureFile:(*core.AzureFileVolumeSource)(nil), ConfigMap:(*core.ConfigMapVolumeSource)(nil), VsphereVolume:(*core.VsphereVirtualDiskVolumeSource)(nil), AzureDisk:(*core.AzureDiskVolumeSource)(nil), PhotonPersistentDisk:(*core.PhotonPersistentDiskVolumeSource)(nil), Projected:(*core.ProjectedVolumeSource)(0xc06a6d6640), PortworxVolume:(*core.PortworxVolumeSource)(nil), ScaleIO:(*core.ScaleIOVolumeSource)(nil), StorageOS:(*core.StorageOSVolumeSource)(nil), CSI:(*core.CSIVolumeSource)(nil), Ephemeral:(*core.EphemeralVolumeSource)(nil)}, InitContainers:[]core.Container(nil), Containers:[]core.Container{core.Container{Name:\\"wdp-profiling-iae-thirdparty-lib-volume-instance\\", Image:\\"cp.icr.io/cp/cpd/wkc-init-container-wkc@sha256:d49ce63df1c06546a22b0c483b1e3f2a2159c3e30d81208b9e30105dbc2d7a0e\\", Command:[]string{\\"/bin/sh\\", \\"/wkc/genkeys.sh\\"}, Args:[]string(nil), WorkingDir:\\"\\", Ports:[]core.ContainerPort(nil), EnvFrom:[]core.EnvFromSource(nil), Env:[]core.EnvVar{core.EnvVar{Name:\\"GATEWAY_HOST\\", Value:\\"\\", ValueFrom:(*core.EnvVarSource)(0xc06a6d6560), Resources:core.ResourceRequirements{Limits:core.ResourceList{\\"cpu\\":resource.Quantity{i:resource.int64Amount{value:500, scale:-3}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\\"500m\\", Format:\\"DecimalSI\\"}, \\"memory\\":resource.Quantity{i:resource.int64Amount{value:256, scale:6}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\\"256M\\", Format:\\"DecimalSI\\", Requests:core.ResourceList{\\"cpu\\":resource.Quantity{i:resource.int64Amount{value:100, scale:-3}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\\"100m\\", Format:\\"DecimalSI\\"}, \\"memory\\":resource.Quantity{i:resource.int64Amount{value:128, scale:6}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\\"128M\\", Format:\\"DecimalSI\\"}, VolumeMounts:[]core.VolumeMount{core.VolumeMount{Name:\\"wdp-profiling-iae-thirdparty-lib-volume-config\\", ReadOnly:false, MountPath:\\"/wkc\\", SubPath:\\"\\", MountPropagation:(*core.MountPropagationMode)(nil), SubPathExpr:\\"\\"}, core.VolumeMount{Name:\\"secrets-mount\\", ReadOnly:true, MountPath:\\"/etc/.secrets\\", SubPath:\\"\\", MountPropagation:(*core.MountPropagationMode)(nil), SubPathExpr:\\"\\", VolumeDevices:[]core.VolumeDevice(nil), LivenessProbe:(*core.Probe)(nil), ReadinessProbe:(*core.Probe)(nil), StartupProbe:(*core.Probe)(nil), Lifecycle:(*core.Lifecycle)(nil), TerminationMessagePath:\\"/dev/termination-log\\", TerminationMessagePolicy:\\"File\\", ImagePullPolicy:\\"IfNotPresent\\", SecurityContext:(*core.SecurityContext)(0xc0c3a7a060), Stdin:false, StdinOnce:false, TTY:false, EphemeralContainers:[]core.EphemeralContainer(nil), RestartPolicy:\\"Never\\", 
TerminationGracePeriodSeconds:(*int64)(0xc0e2059810), ActiveDeadlineSeconds:(*int64)(nil), DNSPolicy:\\"ClusterFirst\\", NodeSelector:map[string]string(nil), ServiceAccountName:\\"zen-norbac-sa\\", AutomountServiceAccountToken:(*bool)(0xc0e20596d5), NodeName:\\"\\", SecurityContext:(*core.PodSecurityContext)(0xc0551e50e0), ImagePullSecrets:[]core.LocalObjectReference(nil), Hostname:\\"\\", Subdomain:\\"\\", SetHostnameAsFQDN:(*bool)(nil), Affinity:(*core.Affinity)(0xc072d56870), SchedulerName:\\"default-scheduler\\", Tolerations:[]core.Toleration(nil), HostAliases:[]core.HostAlias(nil), PriorityClassName:\\"\\", Priority:(*int32)(nil), PreemptionPolicy:(*core.PreemptionPolicy)(nil), DNSConfig:(*core.PodDNSConfig)(nil), ReadinessGates:[]core.PodReadinessGate(nil), RuntimeClassName:(*string)(nil), Overhead:core.ResourceList(nil), EnableServiceLinks:(*bool)(nil), TopologySpreadConstraints:[]core.TopologySpreadConstraint(nil), OS:(*core.PodOS)(nil): field is immutable","field":"spec.template"}]},"code":422}\n'
    reason: Failed

Symptoms: In some environments, OpenShift tries to patch the wdp-profiling-iae-thirdparty-lib-volume-instance job during the wkc-cr reconciliation, and the patch might fail.

Workaround: Delete the wdp-profiling-iae-thirdparty-lib-volume-instance job and continue with the wkc-cr reconciliation process by running:

oc delete job wdp-profiling-iae-thirdparty-lib-volume-instance -n ${PROJECT_CPD_INST_OPERANDS}
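To confirm that reconciliation continues after the job is deleted, you can inspect the status of the custom resource. This sketch assumes the wkc-cr custom resource named in the error message above:

oc get wkc wkc-cr -n ${PROJECT_CPD_INST_OPERANDS} -o jsonpath='{.status}'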

Legacy migration prompt message displays incorrect version when upgrading to Version 4.8

Applies to: 4.8.0 and 4.8.1

When you upgrade to Version 4.8 and run the legacy migration, the prompt message for the migration displays the wrong target release. Instead of Version 4.8, it shows Version 4.7.

An example of the prompt message:

Your deployment includes Watson Knowledge Catalog. Watson Knowledge Catalog requires a data migration when you upgrade to Cloud Pak for Data 4.7. Depending on what legacy metadata you use in Watson Knowledge Catalog, the data migration tool might not cover your metadata yet. If you start the upgrade of any Cloud Pak for Data component without ensuring that the Watson Knowledge Catalog migration tool can migrate your metadata to 4.7, you might experience loss of data and your Watson Knowledge Catalog instance could end up in a non-working state after the upgrade. The only way to recover would be to restore your 4.6.5 deployment from backup. Contact IBM Support for more information. A tool is available to check if your metadata can be migrated to 4.7.
Have you run the tool and validated that your metadata can be migrated?
If you want to continue with the migration, please type: 'I have validated that I can migrate my metadata and I want to continue'

Workaround: None. There is no impact to the upgrade process. Continue with the upgrade to Version 4.8.

Upgrading from Version 4.6.x to Version 4.8 shows incorrect status

Applies to: 4.8.0 and 4.8.1

You might encounter this issue when upgrading from Version 4.6.x to Version 4.8. The status of the Unified Governance (UG) and the InfoSphere Information Server (IIS) services shows as inprogress while IBM Knowledge Catalog shows as completed.

Workaround: None. The status will continue to show inprogress but has no impact on the upgrade. Continue with the upgrade from Version 4.6.x to Version 4.8.

Categories have no category collaborators after installing Version 4.8.1

Applies to: 4.8.1 and later

Categories might not be visible after installing IBM Knowledge Catalog Version 4.8.1.

Workaround:

  1. Log in to the cluster and run the following commands in the c-db2oltp-wkc-db2u-0 container:

    oc exec -it c-db2oltp-wkc-db2u-0 bash
    db2 connect to BGDB
    db2 "set session authorization \"999\""
    db2 "UPDATE BG.CATEGORY SET MIGRATION_STATUS='NOT_MIGRATED' WHERE ARTIFACT_ID='e39ada11-8338-3704-90e3-681a71e7c839'"
    
  2. After running the commands, exit the Db2 console.

  3. Run the following command from the command line interface (CLI):

    curl -X 'POST' \
        "https://$HOST/v3/categories/collaborators/bootstrap" \
        -H 'accept: application/json' \
        -H "Authorization: Bearer $TOKEN" \
        -d ''
    

    Set the HOST and TOKEN variables correctly.

    Note:

    To learn how to generate a token, see Generating a bearer token.

  4. Verify that the command completed successfully:

    curl -X 'GET' \
        "https://$HOST/v3/categories/collaborators/bootstrap/status" \
        -H 'accept: application/json' \
        -H "Authorization: Bearer $TOKEN"
    

    The command must return a SUCCESS message.

  5. Wait about another minute for the caches to be rebuilt, then refresh the categories page. The [Uncategorized] category is now visible.
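For steps 3 and 4, one way to set the HOST and TOKEN variables is shown in the following sketch. The cpd route name is an assumption about your environment, and the token must be generated as described in Generating a bearer token:

HOST=$(oc get route cpd -n ${PROJECT_CPD_INST_OPERANDS} -o jsonpath='{.spec.host}')
TOKEN=<bearer_token>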

Upgrading to 4.8.1 or 4.8.2 may not work if additional service components are installed

Applies to: 4.8.1 and 4.8.2
Fixed in: 4.8.3

This issue can be observed when IBM Knowledge Catalog is installed with other service components like AI Factsheets, Watson Studio, and other services.

When you upgrade IBM Knowledge Catalog to version 4.8.1 or 4.8.2, the upgrade partially removes AI Factsheets. This causes upgrade failures for other services, such as Watson Studio, when zenExtension pods are restarted during the upgrade process.

Symptoms:

  1. Check whether any of the zenExtension resources are in Failed status:
    oc get zenExtension -n $PROJECT_CPD_INST_OPERANDS
    
  2. If any are in Failed state, check the reason for the failure:
    oc -n $PROJECT_CPD_INST_OPERANDS get zenExtension <failed zenExtension>  -o yaml
    
  3. Check the message output. If it has a similar message to the example below, then apply the workaround:
    host not found in upstream "wkc-factsheet-service:443" in /nginx_data/extensions/upstreams/wkc_fs-routes_ie_223.conf:5
    nginx: [emerg] host not found in upstream "wkc-factsheet-service:443" in /nginx_data/extensions/upstreams/wkc_fs-routes_ie_223.conf:5
    nginx: configuration file /usr/local/openresty/nginx/conf/nginx.conf test failed
    

Workaround:

  1. Find and delete the fs-routes zenExtension resource:
    oc get zenExtension -n $PROJECT_CPD_INST_OPERANDS | grep fs-routes
    oc delete zenExtension fs-routes -n $PROJECT_CPD_INST_OPERANDS
    
  2. Check whether the failing component CR status shows InProgress, and then Completed.
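For example, you can watch the zenExtension resources until they leave the Failed state:

oc get zenExtension -n $PROJECT_CPD_INST_OPERANDS -w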

Upgrading to 4.8.2 requires a cleanup script after upgrade

Applies to: 4.8.2

When upgrading to version 4.8.2, users might have to clean up unused permissions after the upgrade. This can also occur if users uninstall IBM Knowledge Catalog on an upgraded environment, or perform legacy migration cleanup in an upgraded environment.

To clean up the unused permissions, follow the workaround.

Workaround:

  1. Enter the CCS job pod wkc-base-roles-init:
    oc debug job/wkc-base-roles-init --as-user=122323
    
  2. Copy the following content into a Python script, for example /tmp/cleanup.py:
    import json
    import os
    import requests
    import sys
    import argparse
    import urllib3
    urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
    # This script retrieves all platform roles and removes any permissions that
    # no longer exist in the current set of registered permissions.
    service_broker_secret = None
    cpd_namespace = os.environ.get('CPD_NAMESPACE')
    verify_cert = '/etc/wdp_certs/internal-nginx.cert.pem'
    zen_core_api_svc = f'https://zen-core-api-svc.{cpd_namespace}.svc.cluster.local:4444'
    usermgmt_svc = f'https://usermgmt-svc.{cpd_namespace}.svc.cluster.local:3443'
    
    #update the below environment variable to match how you mount the service broker secret in your chart .yaml file
    if os.environ.get('ZEN_SERVICE_BROKER_SECRET') is not None:
        service_broker_secret = os.environ.get('ZEN_SERVICE_BROKER_SECRET')
    elif os.path.isfile('/etc/.secrets/ZEN_SERVICE_BROKER_SECRET'):
        with open('/etc/.secrets/ZEN_SERVICE_BROKER_SECRET') as file:
            service_broker_secret = file.read().strip()
    else:
        sys.exit('could not find zen-service-broker-secret environment variable')
    def get_internal_service_token():
        print('\n>> Requesting internal-service-token from zen-core-api-svc...')
        url = f'{zen_core_api_svc}/internal/v1/service_token'
        headers = {
            'secret': service_broker_secret
        }
        r = requests.get(url, headers=headers, verify=False)
        if r.status_code != 200:
            print('>> Error requesting internal_service_token - status_code - ' + str(r.status_code))
            try:
                print(json.dumps(r.json(), indent=2))
            except Exception:
                print(r.text)
            sys.exit('request to retrieve internal_service_token failed')
        else:
            print('>> Successfully requested internal_service_token - status_code - 200')
            try:
                resp = r.json()
                if resp['token']:
                    return 'Bearer ' + resp['token']
                else:
                    sys.exit('could not parse internal_service_token from the response')
            except Exception:
                sys.exit('could not parse internal_service_token from the response')
    def get_platform_roles(token):
        print('\n>> Requesting platform-roles from usermgmt-svc...')
        url = f'{usermgmt_svc}/v1/roles'
        headers = {
            'Authorization': token
        }
        r = requests.get(url, headers=headers, verify=False)
        if r.status_code != 200:
            print('>> Error retrieving all platform-roles - status_code - ' + str(r.status_code))
            try:
                print(json.dumps(r.json(), indent=2))
            except Exception:
                print(r.text)
            sys.exit('request to retrieve roles failed')
        else:
            print('>> Successfully retrieved all platform-roles - status_code - 200')
            try:
                resp = r.json()
                if resp['rows']:
                    print(json.dumps(resp['rows'], indent=2))
                    return resp['rows']
                else:
                    sys.exit('could not parse roles from the response')
            except Exception:
                sys.exit('could not parse roles from the response')
    def get_permissions(token):
        print('\n>> Getting permissions...')
        url = f'{zen_core_api_svc}/openapi/v1/permissions'
        headers = {
            'Authorization': token
        }
        r = requests.get(url, headers=headers, verify=False)
        if r.status_code != 200:
            print('>> Error retrieving permissions - status_code - ' + str(r.status_code))
            try:
                print(json.dumps(r.json(), indent=2))
            except Exception:
                print(r.text)
            sys.exit('request to retrieve permissions failed')
        else:
            print('>> Successfully retrieved permissions - status_code - 200')
            try:
                resp = r.json()
                if resp['Permissions']:
                    #print(json.dumps(resp['Permissions'], indent=2))
                    return resp['Permissions']
                else:
                    sys.exit('could not parse permissions from the response')
            except Exception:
                sys.exit('could not parse permissions from the response')
    def refresh_user_permissions(token, roles_list, perm_list):
        print('\n>> Verifying permissions of roles from usermgmt-svc...')
        flagger = False
        for role_obj in roles_list:
            if role_obj and type(role_obj) is dict and \
                role_obj['id'] is not None and \
                role_obj['doc'] and type(role_obj['doc']) is dict and \
                role_obj['doc']['role_name'] is not None and \
                role_obj['doc']['permissions'] and type(role_obj['doc']['permissions']) is list and len(role_obj['doc']['permissions']) > 0:
                print('>> Verifying permissions for role - ' + role_obj['id'])
                url = f'{usermgmt_svc}/v1/role/' + role_obj['id']
                headers = {
                    'Authorization': token,
                    'Content-Type': 'application/json'
                }
                print('>> Current permissions for role - ' + role_obj['id'])
                print(json.dumps(role_obj['doc']['permissions'], indent=2))
                role_permissions = role_obj['doc']['permissions']
                valid_permissions = list(filter(lambda p: (p in perm_list),role_permissions))
                invalid_permissions = list(filter(lambda p: not(p in perm_list), role_permissions))
                payload_data = {
                    'role_name': role_obj['doc']['role_name'],
                    'description': role_obj['doc']['description'] or '',
                    'permissions': valid_permissions
                }
                r = requests.put(url, headers=headers, data=json.dumps(payload_data), verify=False)
                if r.status_code != 200:
                    print('>> Error refreshing role with extension_name - ' + role_obj['id'] + ' - status_code - ' + str(r.status_code))
                    try:
                        print(json.dumps(r.json(), indent=2))
                    except Exception:
                        print(r.text)
                    flagger = True
                else:
                    if len(invalid_permissions) == 0:
                        print('>> Nothing to purge for role with extension_name - ' + role_obj['id'])
                    else:
                        print('>> Successfully purged invalid permissions from role with extension_name - ' + role_obj['id'] + ' - status_code - 200')
                        print('>> Purged permissions:')
                        print(json.dumps(invalid_permissions, indent=2))
            else:
                continue
        if flagger == True:
            sys.exit('some roles were not refreshed - exit and retry')
        return
    def main():
        bearer_token = get_internal_service_token()
        role_extensions = get_platform_roles(bearer_token)
        permissions_list = get_permissions(bearer_token)
        refresh_user_permissions(bearer_token, role_extensions, permissions_list)
    if __name__ == '__main__':
        print("========== Refresh user-profiles with roles that contain modified permissions ==========")
        main()
    
    To create the file, run the following command, paste the script content, and end the input with a line that contains only EOF:
    cat << EOF > /tmp/cleanup.py
    
  3. Run the script:
    python3 /tmp/cleanup.py
    

Installing IBM Knowledge Catalog may fail because of common core services failure

Applies to: 4.8.2, 4.8.3, and 4.8.4
Fixed in: 4.8.5

When you install IBM Knowledge Catalog, the installation might fail because the installation of the common core services fails.

Symptoms: The IBM Knowledge Catalog zenExtension sometimes fails with the following error:

host not found in upstream "catalog-api-upstream"

This error occurs even though catalog-api-upstream is up and running.

You can see the following error by running:

oc describe zenextensions wkc-base-routes -n ${PROJECT_CPD_INST_OPERANDS}

Workaround:

  1. Delete the wkc-base-routes zenExtension by running:
    oc delete zenextension wkc-base-routes -n ${PROJECT_CPD_INST_OPERANDS}
    
  2. Find the common core services operator pod by running:
    oc get pods -n <cpd-operator-namespace> | grep ccs-operator
    
  3. Delete the common core services operator pod to reconcile.
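A sketch of steps 2 and 3 combined, assuming a single matching common core services operator pod:

oc delete $(oc get pods -n <cpd-operator-namespace> -o name | grep ccs-operator) -n <cpd-operator-namespace>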

Catalog and project issues

You might encounter these known issues and restrictions when you use catalogs.

Missing previews

Applies to: 4.8.0 and later

You might not see previews of assets in these circumstances:

  • In a catalog or project, you might not see previews or profiles of connected data assets that are associated with connections that require personal credentials. You are prompted to enter your personal credentials to start the preview or profiling of the connection asset.
  • In a catalog, you might not see previews of JSON, text, or image files that were published from a project.
  • In a catalog, the previews of JSON and text files that are accessed through a connection might not be formatted correctly.
  • In a project, you cannot view the preview of image files that are accessed through a connection.

Publishing a Cognos Dashboard with more than one image creates multiple attachments with the same image data

Applies to: 4.8.0, 4.8.1, and 4.8.2
Fixed in: 4.8.3

When you publish a Cognos dashboard to a catalog and choose to add more than one preview of the dashboard, all attachments in the catalog show the image added last. This issue occurs when you select multiple images at a time and drag them into the publish page and when you add files one at a time.

In addition, when you browse for files in the publishing step, you can select only one file. To add further images, you must drag them to the publish page.

Whenever you publish the same dashboard again, it will have the images from the previously published assets as well as the newly added images. For example, if you publish dashboard A with images 1, 2 and 3, it will have 3 screen captures of image 3. If you publish dashboard A again with images 4, 5, 6, it will have 5 screen captures, 3 with image 3 and 2 with image 6.

Cannot delete a custom relationship definition for catalog assets

Applies to: 4.8.0 and later

After you add a custom relationship of a type to catalog assets, you cannot delete that relationship type's definition on the Asset and artifacts definitions page.

Workaround: To delete a custom relationship definition, you need to delete all other existing relationships of that type first.

Canceling import job does not prevent processing

Applies to: 4.8.0 and later

Canceling an import job using cpd-cli will not stop the job from being processed.

Workaround: Restart the portal-job-manager pod using the command: kubectl rollout restart deployment portal-job-manager.

Unauthorized users might have access to profiling results

Applies to: 4.8.0 and later

Users who are collaborators with any role in a project or a catalog can view an asset profile even if they don't have access to that asset at the data source level or in Watson Query.

Workaround: Before you add users as collaborators to a project or a catalog, make sure they are authorized to access the assets in the container and thus to view the asset profiles.

Duplicate columns in files will not be displayed in the columns table

Applies to: 4.8.0 and later

If a CSV or other structured file type contains duplicate columns with the same name, only the first instance of each column will be displayed in the columns table on the asset Overview page.

Duplicate action fails when IP address changes

Applies to: 4.8.0 and later

Duplicate actions can fail during connection creation if the connection is using a hostname with a dynamic IP address.

Cannot run import operations on a container package exported from another Cloud Pak for Data cluster

Applies to: 4.8.0 and later

When you import a container package that was exported from another Cloud Pak for Data cluster, permissions must be configured on the archive so that import operations on the target cluster can access the files within the archive.

Workaround: To extract the export archive and modify permissions, complete the following steps:

  1. Create a temporary directory by running:
    mkdir temp_directory
    
  2. Extract the archive by running:
    tar -xvf cpd-exports-<export_name>-<timestamp>-data.tar --directory temp_directory
    
  3. Run the following command on the target cluster:
    oc get ns $CLUSTER_CPD_NAMESPACE -o=jsonpath='{@.metadata.annotations.openshift\.io/sa\.scc\.supplemental-groups}'
    
    Example output: 1000700000/10000.
  4. The first part of the output of the previous step (for example, 1000700000) needs to be applied as the new ownership of all files within the archive. Example:
    cd temp_directory/
    chown -R 1000700000:1000700000 <export_name>
    
  5. Archive the fixed files with the directory, using the same export name and timestamp as the original exported tar:
    tar -cvf cpd-exports-<export_name>-<timestamp>-data.tar <export_name>/
    
  6. Upload the archive.
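Before you upload the archive, you can verify the new file ownership by listing the archive contents; for example:

tar -tvf cpd-exports-<export_name>-<timestamp>-data.tar | head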

Relationships are not removed when assets are deleted from a catalog

Applies to: 4.8.0 and later

When deleting an asset that is part of a column-to-column or asset-to-column relationship, the relationship is not removed.

Workaround: Manually remove the relationship from the remaining asset or column. A failure message might appear even if the relationship is successfully removed.

Data protection rules do not apply to column names that contain spaces

Applies to: 4.8.0 and later

If a column name contains trailing or leading spaces during import, the column cannot be masked using data protection rules.

Workaround: When importing columns, ensure column names do not contain trailing or leading spaces.

Viewers see edit options for metadata import assets

Applies to: 4.8.0 and later

Catalog Viewers see the property edit options and the Reimport button for metadata import assets but will receive a permission error when attempting to edit.

Log files are not available when exporting assets with cpd-cli export-import command

Applies to: 4.8.0 and 4.8.1
Fixed in: 4.8.2

The cpd-cli export-import export logs command does not return any logs for the export job in the catalog-api directory.

Workaround: To access the logs, run oc get pods | grep catalog-api-export-job, and find the corresponding export job.
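After you identify the export job pod, you can retrieve its logs directly. This sketch assumes the instance namespace variable used elsewhere in this topic:

oc get pods -n ${PROJECT_CPD_INST_OPERANDS} | grep catalog-api-export-job
oc logs <catalog-api-export-job-pod> -n ${PROJECT_CPD_INST_OPERANDS}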

Preview of data from file-based connections other than IBM Cloud Object Storage is not fully supported

Applies to: 4.8.0 and later

Connected assets from file-based connections other than IBM Cloud Object Storage do not preview correctly. The preview might show a table with missing or incorrect data. There is no workaround at this time.

Scroll bar is not visible when adding assets to a project on MacOS

Applies to: 4.8.0 and later

When adding assets to a project, the scroll bar might not be available in the Selected assets table, so that a maximum of 5 assets is shown.

Workaround: Change the MacOS settings:

  1. Click the Apple symbol in the top-left corner of your Mac's menu bar, then click System Settings.
  2. Scroll down and select Appearance.
  3. Under the Show scroll bars option, click the radio button next to Always.

Data quality SLA rule compliance and remediation section is shown in catalogs

Applies to: 4.8.0, 4.8.1, 4.8.2, and 4.8.3
Fixed in: 4.8.4

On a catalog asset's Data quality page, the Data quality SLA rule compliance and remediation section is shown although that feature is not available in catalogs. When you click Enable SLA rules in that section, the error message Error 400 Bad request is shown. The Back button on the error window and the browser back button do not work.

Workaround: Click the menu icon and navigate back from there.

Limited number of rows displayed in the asset preview

Applies to: 4.8.4
Fixed in: 4.8.5

The number of rows for SQL data assets in the preview is limited to 100.

Editing NoClassDetected column data class returns a blank page

Applies to: 4.8.4

When you try to edit the NoClassDetected column data class by clicking Edit, a blank page is displayed.

Workaround: Edit the data class for such columns on the Profile tab, and refresh the page so that the update is reflected on the asset details page.

Clicking NoClassDetected column data class returns the Error 404 message

Applies to: 4.8.4 and later

When you click the NoClassDetected column data class, an Error 404 page not found message is displayed instead of a redirect to the Data classes page in the governance UI.

Workaround: Go to the governance UI page and search for the NoClassDetected data class.

Actions in the overflow menu of a query-based data asset are enabled although not supported

Applies to: 4.8.4 and later

When you open the overflow menu of a query-based data asset in the All assets list for the project, the following options are enabled although they are currently not supported:

  • Publish to catalog
  • Promote to space
  • Prepare data

Preview fails when you edit the SQL query for an asset with term assignments created through metadata enrichment

Applies to: 4.8.4 and later

When you edit the SQL query for a query-based data asset after you assigned business terms by running metadata enrichment, preview for the updated asset fails. You can then no longer edit the SQL query for the asset.

Catalog migration duplicate asset handling is by default set to allow duplicates

Applies to: 4.8.4 and later

When you migrate catalog assets, the default setting is incorrectly set to allow duplicates.

Workaround: Explicitly set the duplicate_action parameter in your import.yaml. For more information about the duplicate_action parameter, see Selecting the scope of import.

Governance artifacts

You might encounter these known issues and restrictions when you use governance artifacts.

Cannot use masked assets in Data Refinery

Applies to: 4.8.4 and later

For any masked assets, the Data Refinery jobs fail. If you have access to the initial data assets before masking, the workaround is to use Data Refinery with unmasked assets.

Cannot use CSV to move data class between Cloud Pak for Data instances

Applies to: 4.8.0 and later

If you try to export data classes with matching method Match to reference data to CSV, and then import it into another Cloud Pak for Data instance, the import fails.

Workaround: For moving governance artifact data from one instance to another, especially data classes of this matching method, use the ZIP format export and import. For more information about the import methods, see Import methods for governance artifacts.

Error Couldn't fetch reference data values shows up on screen after publishing reference data

Applies to: 4.8.0 and later

When new values are added to a reference data set, and the reference data set is published, the following error is displayed when you try to click on the values:

Couldn't fetch reference data values. WKCBG3064E: The reference_data_value for the reference_data which has parentVersionId: <ID> and code: <code> does not exist in the glossary. WKCBG0001I: Need more help?

When the reference data set is published, the currently displayed view changes to Draft-history, as marked by the green label at the top. The Draft-history view does not allow you to view the reference data values.

Workaround: To view the values, click Reload artifact so that you can view the published version.

Issue when creating or editing data quality SLA rules on Firefox

Applies to: 4.8.0, 4.8.1, 4.8.2, 4.8.3, and 4.8.4
Fixed in: 4.8.5

If you use Mozilla Firefox as your web browser, you might encounter this issue. When you create or edit a data quality SLA rule based on terms, you must type every character twice when you enter a term name to have the name registered properly.

Workaround: Use a different web browser, such as Google Chrome or Microsoft Edge.

Publishing large reference data sets fails with Db2 transaction log full

Applies to: 4.8.0 and later

Publishing large reference data sets might fail with a Db2 error such as:

The transaction log for the database is full. SQLSTATE=57011

Workaround: Publish the set in smaller chunks, or increase Db2 transaction log size as described in the following steps.

  1. Modify the transaction log settings with the following commands:

    db2 update db cfg for bgdb using LOGPRIMARY 5 --> default value, should not be changed
    db2 update db cfg for bgdb using LOGSECOND 251
    db2 update db cfg for bgdb using LOGFILSIZ 20480
    
  2. Restart Db2.

You can calculate the required transaction log size as follows:

(LOGPRIMARY + LOGSECOND) * LOGFILSIZ
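For example, with the values from step 1 and Db2's default log page size of 4 KB (LOGFILSIZ is specified in 4 KB pages), this works out to about 20 GB:

echo $(( (5 + 251) * 20480 * 4 / 1024 / 1024 ))   # prints 20 (GB)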

For publishing large sets, the following Db2 transaction log sizes are recommended:

  • 5GB for 1M reference data values and 300K relationships
  • 20GB for 1M reference data values and 1M relationships
  • 80GB for 1M reference data values and 4M relationships

where the relationship count is the sum of the parent, term and value mapping relationships for reference data values in the set.

Governance artifact workflows

You might encounter these known issues and restrictions when you use governance workflows.

Workflow notifications are not sent to the notification bell

Applies to: 4.8.0 and later

Regardless of notification settings specified in a workflow configuration, notifications from workflows are not sent to the notification bell. There is currently no workaround for this issue.

Custom workflows

You might encounter these known issues and restrictions when you use custom workflows.

HTTP method PATCH might not be supported in custom workflows

Applies to: 4.8.0, 4.8.1, 4.8.2, and 4.8.3
Fixed in: 4.8.4

Custom workflow templates might call a REST API by using the HTTP task activity offered by the Flowable workflow engine. The HTTP task activity in version 6.5.0 of the embedded Flowable workflow engine that is used in IBM Knowledge Catalog does not support the HTTP method PATCH. Trying to call a REST API using that method results in a "requestMethod is invalid" error. GET, POST, PUT, and DELETE methods work fine.

Workaround: Modify your REST API call to use the POST method instead, and add this special header to your request:

X-HTTP-Method-Override: PATCH

For this workaround to work, the called service must understand and correctly interpret this header field. Calls to REST APIs provided by the wkc-glossary-service service are known to work properly.
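Outside the workflow engine, the equivalent call looks like the following sketch; the URL path, payload, and field names are placeholders, and $TOKEN is a bearer token for an authorized user. In the Flowable HTTP task itself, you would set the request method to POST and add the header to the task's request headers:

curl -X POST "https://<hostname>/v3/<resource>/<id>" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -H "X-HTTP-Method-Override: PATCH" \
  -d '{"<field>": "<new value>"}'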

Metadata import

You might encounter these known issues when you work with metadata import.

Column information might not be available for data assets imported through lineage import

Applies to: 4.8.0, 4.8.1, and 4.8.2
Fixed in: 4.8.3

When a metadata import is configured to get lineage from multiple connections, and databases with the same name exist in these data sources, the tables from these databases are imported but their column information is not.

Workaround: Configure a separate metadata import for each connection pointing to same-named databases.

Running concurrent metadata import jobs on multiple metadata-discovery pods might fail

Applies to: 4.8.0 and later

When you run several metadata import jobs in parallel on multiple metadata-discovery pods, an error might occur, and an error message similar to the following one is written to the job run log:

Error 429: CDICW9926E: Too many concurrent user requests: 50

Workaround: You can resolve the issue in one of these ways:

  • Increase the maximum number of concurrent requests allowed per user. In the wdp-connect-connection pod, change the value of the MAX_CONCURRENT_REQUESTS_PER_USER environment variable, for example:

    MAX_CONCURRENT_REQUESTS_PER_USER: 100
    
  • If you don't have enough resources to increase the number of concurrent requests per user, reduce the number of threads connecting to the source. By default, 20 worker threads in a metadata-discovery pod access the wdp-connect-connection pod concurrently. If you define 4 pods for metadata import, 80 worker threads will access the data source at the same time. In a metadata-discovery pod, change the value of the discovery_create_asset_thread_count environment variable. For example:

    discovery_create_asset_thread_count: 10
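
One way to change these environment variables is to run oc set env against the owning deployments. This is only a sketch: the deployment names are assumed to match the pod names, and an operator reconciliation might revert manual changes:

oc set env deployment/wdp-connect-connection MAX_CONCURRENT_REQUESTS_PER_USER=100 -n ${PROJECT_CPD_INST_OPERANDS}
oc set env deployment/metadata-discovery discovery_create_asset_thread_count=10 -n ${PROJECT_CPD_INST_OPERANDS}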
    

Metadata import jobs might be stuck due to issues related to RabbitMQ

Applies to: 4.8.0 and later

If the metadata-discovery pod starts before the rabbitmq pods are up after a cluster reboot, metadata import jobs can get stuck while attempting to get the job run logs.

Workaround: To fix the issue, complete the following steps:

  1. Log in to the OpenShift console by using admin credentials.
  2. Go to Workloads > Pods.
  3. Search for rabbitmq.
  4. Delete the rabbitmq-0, rabbitmq-1, and rabbitmq-2 pods. Wait for the pods to be back up and running.
  5. Search for discovery.
  6. Delete the metadata-discovery pod. Wait for the pod to be back up and running.
  7. Rerun the metadata import job.
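If you prefer the command line over the OpenShift console, the equivalent deletions look like the following sketch; the rabbitmq pod names are taken from the steps above, and the metadata-discovery pod is matched by name:

oc delete pod rabbitmq-0 rabbitmq-1 rabbitmq-2 -n ${PROJECT_CPD_INST_OPERANDS}
oc delete $(oc get pods -n ${PROJECT_CPD_INST_OPERANDS} -o name | grep metadata-discovery) -n ${PROJECT_CPD_INST_OPERANDS}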

Deleting all assets from a metadata import at once might not work

Applies to: 4.8.0 and later

To delete all imported assets from a metadata import at once, you can use the Select all option. However, if the number of assets that you want to delete is very large, deletion might fail.

Workaround: Delete the assets in batches, for example, by using the Select all on page option.

Data assets might not be imported when running an ETL job lineage import for DataStage flows

Applies to: 4.8.0 and later

When you create and run a metadata import with the goal Get ETL job lineage where the scope is determined by the Select all DataStage flows and their dependencies in the project option, data assets from the connections associated with the DataStage flows are not imported.

Workaround: Explicitly select all DataStage flows and connections when you set the scope instead of using the Select all DataStage flows and their dependencies in the project option.

Can't import or reimport metadata from the Apache Hive connection

Applies to: 4.8.0
Fixed in: 4.8.1

Issues occur when you use the Apache Hive connection for the metadata import. See the details, depending on your Cloud Pak for Data installation.

  • Fresh installation: When you create a metadata import with the Get lineage import goal, and use the Apache Hive connection as a scope, an error occurs, and the metadata import is not saved.
  • Upgrade from a previous version: When you reimport metadata from the Apache Hive connection, the assets are not imported.

Workaround: To resolve both issues, reset the Apache Hive connection in MANTA Automated Data Lineage:

  1. Open the MANTA Automated Data Lineage for IBM Cloud Pak for Data Admin UI:

    https://<CPD-HOSTNAME>/manta-admin-gui/
    
  2. Go to Connections > Databases > Hive and select the connection that you want to reset.

  3. In the Setup the connection section, change the value of the Hive distribution option to Cloudera, and save the changes.

  4. Change the value of the Hive distribution option back to Apache and save the changes.

After the connection is reset, create a new metadata import or reimport metadata again.

Testing connection fails when creating a metadata import with the Apache Hive connection configured with SSL

Applies to: 4.8.0, 4.8.1, 4.8.2, and 4.8.3
Fixed in: 4.8.4

When you configure a new metadata import, and create a connection to Apache Hive with SSL enabled, testing the connection fails. When you click the Test connection button, an error is displayed. However, the connection is created and the metadata is imported successfully.

Assets are not imported from the IBM Cognos Analytics source when the content language is set to Japanese

Applies to: 4.8.0 and later

If you import metadata from the Cognos Analytics connection and the user's content language is set to Japanese, no assets are imported. The issue occurs when you create a metadata import with the Get BI report lineage goal.

Workaround: In Cognos Analytics, change the user's content language from Japanese to English. Find the user for which you want to change the language, and change this setting in the Personal tab. Run the metadata import again.

When you import a project from a .zip file, the metadata import asset is not imported

Applies to: 4.8.0 and later

When you import a project from a file, metadata import assets might not be imported. The issue occurs when a metadata import asset was imported to a catalog, not to a project, in the source system from which the project was exported. This catalog does not exist on the target system and the metadata import asset can't be accessed.

Workaround: After you import the project from a file, duplicate metadata import assets and add them to a catalog that exists on the target system. For details, see Duplicating a metadata import asset.

Testing connection to Power BI, Qlik Sense, or SAP BusinessObjects fails

Applies to: 4.8.0 and later

When you create a connection to Power BI, Qlik Sense, or SAP BusinessObjects, testing the connection fails with errors.

Workaround: Create a connection without testing it.

After you upgrade to 4.8.1 or later, creating metadata import with the Qlik Sense connection fails

Applies to: 4.8.1 and later

When you upgrade IBM Knowledge Catalog to version 4.8.1 or later, you cannot import metadata from the Qlik Sense connection. When you create the metadata import, a validation error might occur, and the import is not created.

Workaround: To solve the issue, complete these steps:

  1. Modify the metadata-discovery-service-config ConfigMap settings in the OpenShift console:
    1. Log in to the OpenShift console by using admin credentials.
    2. Go to Workloads > ConfigMaps.
    3. Search for metadata-discovery-service-config.
    4. In the YAML tab, set the manta_scanner_validation_enabled setting to false.
    5. Save the changes and restart the metadata discovery pod.
  2. Edit the connection in MANTA Automated Data Lineage:
    1. Open the MANTA Automated Data Lineage Admin UI:

      https://<CPD-HOSTNAME>/manta-admin-gui/
      
    2. Go to Reporting & BI and select the Qlik Sense connection for which you couldn't import metadata.

    3. Click Edit and edit keystore settings.

    4. Click Recreate to re-create a store file.

    5. Click Add entry to add the client certificate.

    6. Validate the connection and save the changes.

After you edit the ConfigMap and connection, create the metadata import in IBM Knowledge Catalog again.
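
For step 1, if you prefer to edit the ConfigMap from the command line instead of the OpenShift console, you can use a sequence like the following sketch (replace the pod name placeholder with the actual metadata-discovery pod name):

    oc edit configmap metadata-discovery-service-config -n ${PROJECT_CPD_INST_OPERANDS}
    # set manta_scanner_validation_enabled to false and save, then restart the discovery pod:
    oc get pods -n ${PROJECT_CPD_INST_OPERANDS} | grep metadata-discovery
    oc delete pod <metadata-discovery-pod-name> -n ${PROJECT_CPD_INST_OPERANDS}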

A connection to IBM watsonx.data can't be created

Applies to: 4.8.1
Fixed in: 4.8.2

The IBM watsonx.data connection is not available in IBM Cloud Pak for Data version 4.8.1.

The Google Cloud Storage connection can't be found when you create a metadata import

Applies to: 4.8.2
Fixed in: 4.8.3

When you create a new connection to use with metadata import with the Discover goal, the Google Cloud Storage connection is not listed in the available connections.

Workaround: Clear the filter Metadata import (discovery).

Lineage metadata can't be imported from the Apache Hive connection

Applies to: 4.8.2 and later

When you import lineage metadata from the Apache Hive connection, no assets are added. The workflow in MANTA Automated Data Lineage fails.

Workaround: Rerun the import. After running the import again, assets are successfully added in IBM Knowledge Catalog.

No lineage view for assets imported with metadata import

Applies to: 4.8.2 and later

No business lineage is shown for data assets that were imported by running metadata import with the Get lineage option. Existing lineage information is not updated.

Workaround: Edit the ccs-cr to update the default setting for a property:

  1. Run the following command:

    oc edit ccs ccs-cr
    
  2. Update the settings to skip sending messages to Global Search:

    catalog_api_jvm_args_extras: -Dfeature.skip_sending_messages_to_gs_only=true
    

If the issue still exists after you update the settings, try to resynchronize the catalog metadata as described in Resync of lineage metadata.
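
Equivalently, you can apply the same setting non-interactively with a merge patch. This sketch assumes that the property sits directly under the top-level spec element of the ccs-cr; verify the existing layout with oc get ccs ccs-cr -o yaml before you patch:

    oc patch ccs ccs-cr --type=merge --patch '{"spec": {"catalog_api_jvm_args_extras": "-Dfeature.skip_sending_messages_to_gs_only=true"}}'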

Not all tables and views are imported from the IBM Db2 for z/OS connection

Applies to: 4.8.4
Fixed in: 4.8.5

When you import lineage metadata from the IBM Db2 for z/OS connection, only some of the tables and views are imported.

Workaround: To resolve the issue, change advanced configuration settings for this connection in MANTA Automated Data Lineage:

  1. Open the MANTA Automated Data Lineage for IBM Cloud Pak for Data Admin UI:

    https://<CPD-HOSTNAME>/manta-admin-gui/
    
  2. Go to Connections > Databases > DB2 and select the connection that you want to configure.

  3. In the Advanced configuration section, change the value of the Extract extended attributes setting to false, and save the changes.

After you modify the connection settings, rerun the metadata import job.

Lineage metadata can't be imported on FIPS-enabled clusters

Applies to: 4.8.4
Fixed in: 4.8.5

Importing lineage metadata is not supported on FIPS-enabled clusters. This limitation is applicable when you install IBM Cloud Pak for Data 4.8.4, or when you upgrade to this version.

An error occurs when you create a metadata import from DataStage on Cloud Pak for Data with the Include DataStage job runs option

Applies to: 4.8.4
Fixed in: 4.8.5

When you create a metadata import with the Get ETL lineage goal, an error with status code 400 is displayed and the job can't be created. The issue occurs for metadata imports from DataStage on Cloud Pak for Data when the Include DataStage job runs advanced option is selected.

Metadata enrichment

You might encounter these known issues when you work with metadata enrichment.

Running primary key or relations analysis doesn't update the enrichment and review statuses

Applies to: 4.8.0 and later

The enrichment status is set or updated when you run a metadata enrichment with the configured enrichment options (Profile data, Analyze quality, Assign terms). However, the enrichment status is not updated when you run a primary key analysis or a relationship analysis. In addition, if new keys or relationships are identified after a review, the review status does not change from Reviewed to Reanalyzed.

In environments upgraded from a version before 4.7.1, you can't filter relationships in the enrichment results by assigned primary keys

Applies to: 4.8.0 and later

Starting in Cloud Pak for Data 4.7.1, you can use the Primary key filter in the key relationships view of the enrichment results to see only key relationships with an assigned primary key. This information is not available in upgrade environments if you upgraded from a version before 4.7.1. Therefore, the filter doesn't work as expected.

Workaround: To generate the required information, you can rerun primary key analysis or update primary key assignments manually.

Writing metadata enrichment output to an earlier version of Apache Hive than 3.0.0

Applies to: 4.8.0 and later

If you want to write data quality output generated by metadata enrichment to an Apache Hive database at an earlier software version than 3.0.0, set the following configuration parameters in your Apache Hive Server:

set hive.support.concurrency=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.enforce.bucketing=true;   # not required for version 2

set hive.compactor.initiator.on=true;
set hive.compactor.cleaner.on=true;   # might not be available depending on the version
set hive.compactor.worker.threads=1;

For more information, see Hive Transactions.

Metadata enrichment or profiling can't be run on data from connections that use Cloud Pak for Data credentials

Applies to: 4.8.0, 4.8.1, and 4.8.2
Fixed in: 4.8.3

You can't profile or run metadata enrichment on data assets from connections that are configured to use the user's platform login credentials to authenticate to the data source.

Workaround: Update the connection to not use the platform login credentials or select a different authentication method, for example, API key.

Metadata enrichment runs on connections with personal credentials might fail

Applies to: 4.8.0 and 4.8.1
Fixed in: 4.8.2

When you set up a metadata enrichment for a connection with personal credentials that was created by another user, the metadata enrichment job fails unless you previously unlocked the connection with your credentials. The metadata enrichment job log then contains one of the following error messages:

Data asset cannot be profiled as the user do not have stored credentials of the connection.

or

Data asset cannot be profiled as the data asset is associated to a connection configured to use personal credentials and the user has not yet provided credentials for that connection.

Workaround: If you are authorized to access the connection, unlock the connection with your credentials. In the project, open one of the assets in your metadata enrichment scope. Enter your credentials on the Preview tab or on the Profile tab. Then, rerun the metadata enrichment.

Data class First Name might not be assigned as expected

Applies to: 4.8.0, 4.8.1, 4.8.2, 4.8.3, and 4.8.4
Fixed in: 4.8.5

After metadata enrichment is complete, a column that contains first names might not have the expected predefined data class First Name assigned. This can happen if the column has too few values or contains so many different first names that the calculated confidence score is lower than the defined data class threshold (or minimum confidence score).

Workaround: Check the threshold that is set for the data class First Name and the threshold for data class assignments in the project in general. You can adjust the thresholds as required; a threshold that is set on an individual data class overrides the project setting. For more information, see Adding data matching to data classes: Thresholds and Metadata enrichment default settings: Data class assignment.

Term assignment with the built-in model might fail when corrupt terms exist

Applies to: 4.8.0 and 4.8.1
Fixed in: 4.8.2

Corrupt business terms (business terms without a category ID) might exist in the system. In this case, the built-in ML model for term assignment can't be created, which causes term assignment to fail. The job run details show an error message like this one:

- TAS0017E: An unexpected error occurred while running the term assignment profile of data asset
- ''47d78aeb-b110-4ed4-a329-8c0609047e9e'' in catalog ''null'' / project ''46d05fde-f47b-4e23-b9ac-804844e9ad71''.
- Message: ''["TAS0005E: Term assignment algorithm ''ML based term assignment'' failed with error message
 ''TAS0044E: An error occurred while calling a REST API ''POST /v1/term_assignment_models/predict'':
''HTTP 500 Internal Server Error''. Details: ''{\\"errors\\":[{\\"code\\":\\"finley_internal_error\\",\\"message\\":
\\"Model binary could not be retrieved.\\"}],\\"status_code\\":500,\\"trace\\":
\\"78e830a1-39cd-4863-b213-9ea8018bed9f\\"}\\n''.  ''."]''

Workaround: Delete the corrupt business terms by completing these steps:

  1. Get the service ID credentials:

    oc get secret wdp-service-id -o json -n zen
    

    The returned information looks like this:

    {
        "apiVersion": "v1",
        "data": {
            "service-id": "<serviceID>",
            "service-id-credentials": "<serviceID_credentials>"
        },
        "kind": "Secret",
        "metadata": {
            ...
    }
    
  2. Decode the credentials from Base64 once. Run the following command with the credentials returned in the previous step.

    echo -n '<serviceID_credentials>' | base64 --decode
    

    The result should look similar to this example:

    aWNwNGQtAHW2Ong3eTdBMEV0Nnp1cEQ=
    
  3. Assign this value to the variable TOKEN:

    export TOKEN="<decoded value>"
    
  4. Retrieve the artifact IDs of any corrupt terms.

    1. Create a param.json file with this content:
      {
        "_source": ["artifact_id"],
        "query": {
          "bool": {
            "must": [
              {
                "term": {
                  "metadata.artifact_type": "glossary_term"
                }
              }
            ],
            "must_not": [
              {
                "exists": {
                  "field": "categories.primary_category_id"
                }
              }
              ]
          }
        }
      }
      
    2. Run the following cURL command to retrieve the artifact ID. Replace <CPD_HOST> with the hostname of your deployment.
      curl -k -s -X POST https://<CPD_HOST>/v3/search -H "Authorization: Basic $TOKEN"  -H 'accept: application/json' -H 'Content-Type: application/json' -d @param.json
      
      The returned result contains the artifact IDs of all business terms without a category ID.
  5. Delete the terms in question by running the following command for each artifact ID. Replace <CPD_HOST> with the hostname of your deployment.

    curl -k -s -X DELETE https://<CPD_HOST>/v3/search_index/providers/glossary/artifacts/<artifact_id> -H "Authorization: Basic $TOKEN"  -H 'accept: application/json' -H 'Content-Type: application/json'
    

After you successfully delete any corrupt terms, rerun the metadata enrichment.
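
For reference, steps 1 to 3 can be combined into a single command that reads the credentials directly from the secret and decodes them. This is a sketch that uses the same secret name and namespace as in the steps above:

    export TOKEN=$(oc get secret wdp-service-id -n zen -o jsonpath='{.data.service-id-credentials}' | base64 --decode)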

Metadata enrichment job is stuck in running state

Applies to: 4.8.0 and 4.8.1
Fixed in: 4.8.2

The metadata enrichment job is stuck in running state, where the job log shows profiling to be complete, but term assignment is constantly shown as being in progress for the same set of assets as in this example:

====================================================================

Metadata enrichment job run (eb47d880-f017-49f0-88e1-9c195838c608) is in state 'Running'.

Enrichment asset summary:
Total assets: 3
 - Assets with status 'Created': 0
 - Assets with status 'In progress': 3
 - Assets with status 'Not found': 0
 - Assets with status 'Completed': 0
   - Completed successfully: 0
   - Completed with errors due to failed profiling operation: 0
   - Completed with errors due to failed term assignment operation: 0

Profiling status:
HB tasks with status 'Completed': 1
Active/Failed HB tasks: 0
25b7bd3a-5714-40b6-987e-86a4a69f7749 - 2023-11-22T15:28:09.890754325Z
 - 2023-11-22T15:28:09.947215300Z [+0s] - SUBMITTED
 - 2023-11-22T15:28:29.014515579Z [+19s] - RUNNING
 - 2023-11-22T15:29:39.560035406Z [+89s] - COMPLETED

Term assignment status:
Assets in progress: 3
2023-11-22T16:03:18.753Z - af211632-e243-4411-9a76-598baacb1d8b ("MORTGAGE_PROPERTY")
2023-11-22T16:03:18.754Z - b9a70594-f51d-46cf-80f4-6c8d9f5062d7 ("MORTGAGE_CUSTOMER")
2023-11-22T16:03:18.755Z - c1b30f6a-706f-4950-9c15-f7bf98f808aa ("MORTGAGE_DEFAULT")

Workaround: Restart the term assignment pods and the metadata enrichment service manager pod. To restart the pods, you can scale the pods to zero and then back to the previous value:

oc scale deploy wkc-mde-service-manager --replicas=0 
oc scale deploy wkc-term-assignment --replicas=0 
oc scale deploy wkc-mde-service-manager --replicas=N 
oc scale deploy wkc-term-assignment --replicas=N  

When the pods are back up and running, rerun the metadata enrichment.
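
Note that to determine the value to use for N, you can check the current replica counts before you scale the deployments down, for example:

oc get deploy wkc-mde-service-manager wkc-term-assignment -o custom-columns=NAME:.metadata.name,REPLICAS:.spec.replicas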

Metadata enrichment fails because of spark-hb-control-plane pod error

Applies to: 4.8.3

During installation or upgrade, while the cluster is validating the workload, the metadata enrichment job can fail because of an error in the spark-hb-control-plane pod.

Symptoms: Check the logs of the following pods to determine whether an error occurred.

The log for the wdp-profiling pod:

[ERROR   ] {"class_name":"com.ibm.wdp.profiling.impl.messaging.consumer.DataProfileConsumer","method_name":"handleDataProfileFailure","class":"com.ibm.wdp.profiling.impl.messaging.consumer.DataProfileConsumer","method":"handleDataProfileFailure","appname":"wdp-profiling","user":"NONE","thread_ID":"7b","trace_ID":"y2ocu5avprf8ehdm0suu2d7y","transaction_ID":"NONE","timestamp":"2024-02-15T00:03:05.501Z","tenant":"NONE","session_ID":"NONE","perf":"false","auditLog":"false","loglevel":"SEVERE","message":"THROW. The WDPException is: Internal Server Error Failed to start Humming-Bird job..","msg_ID":"CDIWC2006E","exception":"com.ibm.wdp.service.common.exceptions.WDPException: CDIWC2006E: Internal Server Error Failed to start Humming-Bird job..\n\tat
...
CDIWC2006E: Internal Server Error Failed to start Humming-Bird job..

The log for the spark-hb-control-plane-xx pod:

...
[2/15/24, 0:19:03:731 UTC] 00000323 id=         com.ibm.iae.ctm.api.ClusterTemplateManagerImpl               E getClusterTemplate spark-3.3-wkc-profiling-cp4d-template No cluster template not found for give template.
[2/15/24, 0:19:03:731 UTC] 00000323 id=         application.api.spark.base.AbstractApplicationConfigPreparer E getClusterTemplate  bd657c5b-69c7-4fe1-b987-77d577a54d0d Cluster template could not be found:
com.ibm.iae.ctm.exceptions.ClusterTemplateNotFoundExecption: Cluster template with id: spark-3.3-wkc-profiling-cp4d-template node found
        at com.ibm.iae.ctm.api.ClusterTemplateManagerImpl.getClusterTemplate(ClusterTemplateManagerImpl.java:67)
...

Workaround: To resolve this issue, follow these steps:

  1. Find the spark-hb-control-plane pod:
    oc get pod  -n  ${PROJECT_CPD_INST_OPERANDS}  | grep spark-hb-control-plane
    
  2. Delete the spark-hb-control-plane pod:
    oc delete pod <podname> -n  ${PROJECT_CPD_INST_OPERANDS}
    
  3. Run the Metadata enrichment job again after the spark-hb-control-plane pod is up and running.

Issues with the Microsoft Excel add-in

Applies to: 4.8.4 and later

The following issues are known for the Review metadata add-in for Microsoft Excel:

  • When you open the drop-down list to assign a business term or a data class, the entry Distinctive name is displayed as the first entry. If you select this entry, it shows up in the column but does not have any effect.

  • Updating or overwriting existing data in a spreadsheet is currently not supported. You must use an empty template file whenever you retrieve data.

  • If another user works on the metadata enrichment results while you are editing the spreadsheet, the other user's changes can get lost when you upload the changes that you made in the spreadsheet.

  • Only assigned data classes and business terms are copied from the spreadsheet columns Assigned / suggested data classes and Assigned / suggested business terms to the corresponding entry columns. If multiple business terms are assigned, each one is copied to a separate column.

Profiling or running data quality rules fails for query-based assets from certain data sources

Applies to: 4.8.4 and later

If you create a data asset by using an SQL query, the following issues can occur:

  • For data assets from a Watson Query data source, profiling fails.
  • For data assets from a watsonx.data data source, profiling and running data quality rules fail.

Suggested primary key information isn't shown in the asset details

Applies to: 4.8.4
Fixed in: 4.8.5

Suggested primary keys are not shown on the Keys tab in the side panel that provides the asset details after the enrichment completes.

Workaround: Refresh your browser to see suggested primary keys in the side panel.

Term assignment results based on name matching might be different between versions

Applies to: 4.8.2, 4.8.3, and 4.8.4
Fixed in: 4.8.5

For term assignment based on name matching, the results that are returned in Cloud Pak for Data 4.8.0 or 4.8.1 might differ from the results in Cloud Pak for Data 4.8.2 or later because a different algorithm is used.

Workaround: To use the same name matching algorithm in 4.8.2 and later as before, complete these steps:

  1. Log in to the cluster. Run the following command as a user with sufficient permissions to complete this task:

    oc login <OpenShift_URL>:<port>
    
  2. Edit the IBM Knowledge Catalog custom resource by running the following command:

    oc edit WKC wkc-cr
    
  3. Add the following entry after the top-level spec element in the yaml:

    wkc_term_assignment_name_matching_algorithm: "FUZZY_MATCHING"
    

    Make sure to indent the entry by two spaces.
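
Alternatively, you can apply the same change non-interactively with a merge patch. This sketch adds the entry under the top-level spec element, just as the manual edit does:

    oc patch WKC wkc-cr --type=merge --patch '{"spec": {"wkc_term_assignment_name_matching_algorithm": "FUZZY_MATCHING"}}'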

The change is picked up the next time the operator is reconciled, which can take 5 - 10 minutes. You can check in these ways whether the change is applied:

  • Check whether the wkc-term-assignment pod was restarted.
  • Run the command oc get WKC wkc-cr -o yaml. The status information shows if and when the reconciliation was run.

Data quality

You might encounter these known issues when you work with data quality assets.

Rules with multiple joins might return incorrect results for data assets from Apache Cassandra, Apache Hive, MongoDB, or Oracle data sources

Applies to: 4.8.0 and later

A data quality rule that is created from one or more data quality definitions and contains multiple joins might return incorrect results when it is run on data assets from Apache Cassandra, Apache Hive, MongoDB, or Oracle data sources that are connected through a Generic JDBC connection.

Workaround: Use the respective native connector.

Rules bound to columns of the data type NUMERIC in data assets from Oracle data sources might not work

Applies to: 4.8.0 and later

Testing or running a data quality rule that is bound to a NUMERIC column in a data asset from an Oracle data source fails if the data source is connected through a Generic JDBC connection.

Workaround: Use the native connector.

Viewing data quality assets requires the Manage data quality assets permission

Applies to: 4.8.0 and 4.8.1
Fixed in: 4.8.2

To see data quality assets in a project and to view individual data quality definitions and rules, users must have the Manage data quality assets user permission in addition to a collaborator role in the project.

Workaround: Assign the Manage data quality assets permission to any user who needs to view data quality assets.

The list of dimensions for filtering data quality checks might be incomplete

Applies to: 4.8.0, 4.8.1, and 4.8.2
Fixed in: 4.8.3

On the Data quality tab, the list of dimensions for filtering the data quality checks might not include all dimensions for which a score is available.

Workaround: To see the issues for a specific dimension, sort the list of data quality checks by dimension.

Rule testing in the review step fails if the bound data comes from an IBM watsonx.data data source

Applies to: 4.8.2 and later

In the review step of creating a rule, when you test a rule that is bound to a data asset from an IBM watsonx.data data source, the test fails with an error message similar to this one:

 An unknown error occurred. Exception SCAPIException was caught during processing of the request: Connection failed. Please check connection properties: SCAPI error:
 Connection failed. Please check connection properties: IBM watsonx.data on Cloud Pak for Data authentication error: {"errors":null, "exception":"Given API endpoint
 /v1/catalogs was not found in supported APIs list, please refer IBM watsonx.data documentation for supported APIs at “,”message”:”Not Found”:”message code”:”404
 Not Found","status_code":404};

However, the rule is still properly saved when you click Create.

Workaround: Ignore the error or skip the test step for such rules.

Runs of migrated data quality rules complete with warnings

Applies to: 4.8.0 and later

When you run a data quality rule that was migrated from the legacy data quality feature or from InfoSphere Information Server, you might see the message Run successful with warnings.

Workaround: None. You can ignore such warnings.

MANTA Automated Data Lineage

You might encounter these known issues and restrictions when MANTA Automated Data Lineage is used for capturing lineage.

Metadata import jobs for getting lineage might take very long to complete

Applies to: 4.8.0 and later

If multiple lineage scans are requested at the same time, the corresponding metadata import jobs for getting lineage might take very long to complete. This is because MANTA Automated Data Lineage workflows can't run in parallel; they are run sequentially.

Chrome security warning for Cloud Pak for Data deployments where MANTA Automated Data Lineage for IBM Cloud Pak for Data is enabled

Applies to: 4.8.0 and later

When you try to access a Cloud Pak for Data cluster that has MANTA Automated Data Lineage for IBM Cloud Pak for Data enabled from the Chrome web browser, the message Your connection is not private is displayed and you can't proceed. This happens because MANTA Automated Data Lineage for IBM Cloud Pak for Data requires an SSL certificate to be applied, and it occurs only if a self-signed certificate is used.

Workaround: To bypass the warning for the remainder of the browser session, type thisisunsafe anywhere in the window. Note that this code changes periodically. The mentioned code is valid as of the date of general availability of Cloud Pak for Data 4.6.0. You can search the web for the updated code if necessary.

Columns are displayed as numbers for a DataStage job lineage in the catalog

Applies to: 4.8.0 and later

The columns for a lineage that was imported from a DataStage job are not displayed correctly in the catalog. Instead of column names, column numbers are displayed. The issue occurs when the source or target of a lineage is a CSV file.

Lineage

You might encounter these known issues and restrictions with lineage.

Lineage graph exported to PDF file might be incomplete

Applies to: 4.8.0 and later
Fixed in: 4.8.5

If your lineage graph resolution is more than 1451 × 725, you can't export the entire generated graph to a PDF file.

Workaround: In the mini map, select Fit graph to screen, and download your graph.

Lineage metadata doesn't show on the Knowledge Graph after upgrading

Applies to: 4.8.0

After upgrading to 4.7.2, an unknown error appears on the lineage tab.

Workaround: To see the Knowledge Graph, resynchronize the catalogs' metadata. See Resync of lineage metadata.

Business data lineage is incomplete for the metadata imports with Get ETL job lineage or Get BI report lineage goals

Applies to: 4.8.0 and later

In some cases, when you display business lineage between databases and ETL jobs or BI reports, some assets are missing, for example, a starting database. The data was imported by using the Get ETL job lineage or Get BI report lineage import option. Technical data lineage correctly shows all assets.

Workaround: Sometimes MANTA Automated Data Lineage cannot map the connection information from an ETL job or a BI report to the existing connections in IBM Knowledge Catalog. Follow these steps to solve the issue:

  1. Open the MANTA Automated Data Lineage Admin UI:

    https://<CPD-HOSTNAME>/manta-admin-gui/
    
  2. Go to Log Viewer and from the Source filter select Workflow Execution.

  3. From the Workflow Execution filter, select the name of the lineage workflow that is associated with the incomplete business lineage.

  4. Look for the dictionary_manta_mapping_errors issue category and expand it.

  5. In each entry, expand the error and click View Log Details.

  6. In the details of each error, look for the value of connectionString. For example, in the following error message, the value of the connectionString parameter is DQ DB2 PX.

    2023/11/14 18:40:12.186 PM [CLI] WARN - <provider-name> [Context: [DS Job 2_PARAMETER_SET] flow in project [ede1ab09-4cc9-4a3f-87fa-8ba1ea2dc0d8_lineage]]
    DICTIONARY_MANTA_MAPPING_ERRORS - NO_MAPPING_FOR_CONNECTION
    User message: Connection in use could not be automatically mapped to one of the database connections configured in MANTA.
    Technical message: There is no mapping for the connection Connection [type=DB2, connectionString=DQ DB2 PX, serverName=dataquack.ddns.net, databaseName=cpd, schemaName=null, userName=db2inst1].
    Solution: Identify the particular database technology DB2 leading to "DQ DB2 PX" and configure it as a new connection or configure the manual mapping for that database technology in MANTA Admin UI.
    Lineage impact: SINGLE_INPUT
    
  7. Depending on the connection that you used for the metadata import, go to Configuration > CLI > connection server > connection server Alias Mapping, for example DB2 > DB2 Alias Mapping.

  8. Select the connection that is used in the workflow and click Full override.

  9. In the Connection ID field, add the value of the connectionString parameter that you found in the error details, for example DQ DB2 PX.

  10. Rerun the metadata import job in IBM Knowledge Catalog.

Business lineage between data assets and IBM Cognos Analytics assets is incomplete

Applies to: 4.8.0 and later

After you import metadata from IBM Cognos Analytics with its source data assets, business data lineage does not include these source data assets.

Data integration assets show columns

Applies to: 4.8.2

When you select Show Columns on a data asset that is connected with a data integration asset, columns might be connected to columns of the data integration job asset that should not be shown.

Workaround: Expand the data integration asset before you select Show Columns on the connected data asset.

An unnecessary edge appears when expanding data integration assets

Applies to: 4.8.3

After you expand a data integration asset and click Show next or Show all, the transformer nodes have an unnecessary edge that points to themselves.

Path of the ETL flow is not highlighted

Applies to: 4.8.2
Fixed in: 4.8.4

Nodes of an ETL flow are not highlighted if they are on the path of a currently selected node outside of the ETL flow. Likewise, if you select an ETL flow component node, the path highlighting does not extend outside of the ETL flow.

Asset’s name is not displayed on a node

Applies to: 4.8.4
Fixed in: 4.8.5

An asset's name might not be displayed if it is too long to fit into the node's space.

Workaround: Click the asset to view its name on the side panel.

Find asset panel shows No business terms assigned

Applies to: 4.8.4

Business terms assigned to the asset might not show on the Find asset panel.

Workaround: Click the asset to open the Details panel and check which business terms are assigned.

Limitations

Catalogs and projects

Missing default catalog and predefined data classes

Applies to: 4.8.0 and later

The automatic creation of the default catalog after installation of the IBM Knowledge Catalog service can fail. If it does, the predefined data classes are not automatically loaded and published as governance artifacts.

Workaround: Ask someone with the Administrator role to follow the instructions for creating the default catalog manually.

Special or double-byte characters in the data asset name are truncated on download

Applies to: 4.8.0 and later

When you download a data asset with a name that contains special or double-byte characters from a catalog, these characters might be truncated from the name. For example, a data asset named special chars!&@$()テニス.csv will be downloaded as specialchars!().csv.

The following character sets are supported:

  • Alphanumeric characters: 0-9, a-z, A-Z
  • Special characters: ! - _ . * ' ( )

Catalog UI does not update when changes are made to the asset metadata

Applies to: 4.8.0 and later

If the Catalog UI is open in a browser while an update is made to the asset metadata, the Catalog UI page will not automatically update to reflect this change. Outdated information will continue to be displayed, causing external processes to produce incorrect information.

Workaround: After the asset metadata is updated, refresh the Catalog UI page at the browser level.

A blank page might be rendered when you search for terms while manually assigning terms to a catalog asset

Applies to: 4.8.0 and later

When you search for a term to assign to a catalog asset and change that term while the search is running, it can happen that a blank page is shown instead of any search results.

Workaround: Rerun the search.

Governance artifacts

Masked data is not supported in data visualizations

Applies to: 4.8.0 and later

Masked data is not supported in data visualizations. If you attempt to work with masked data while generating a chart in the Visualizations tab of a data asset in a project, the following error message is displayed: Bad Request: Failed to retrieve data from server. Masked data is not supported.

Metadata enrichment

In some cases, you might not see the full log of a metadata enrichment job run in the UI

Applies to: 4.8.0 and later

If the list of errors in a metadata enrichment run is exceptionally long, only part of the job log might be displayed in the UI.

Workaround: Download the entire log and analyze it in an external editor.

Schema information might be missing when you filter enrichment results

Applies to: 4.8.0 and later

When you filter assets or columns in the enrichment results on source information, schema information might not be available.

Workaround: Rerun the enrichment job and apply the Source filter again.

Profiling in catalogs, projects, and metadata enrichment might fail for Teradata connections

Applies to: 4.8.0 and later

If a generic JDBC connection for Teradata exists with a driver version before 17.20.00.15, profiling in catalogs and projects, and metadata enrichment of data assets from a Teradata connection fails with an error message similar to the following one:

2023-02-15T22:51:02.744Z - cfc74cfa-db47-48e1-89f5-e64865a88304 [P] ("CUSTOMERS") - com.ibm.connect.api.SCAPIException: CDICO0100E: Connection failed: SQL error: [Teradata JDBC Driver] [TeraJDBC 16.20.00.06] [Error 1536] [SQLState HY000] Invalid connection parameter name SSLMODE (error code: DATA_IO_ERROR)

Workaround: Complete these steps:

  1. Go to Data > Platform connections > JDBC drivers and delete the existing JAR file for Teradata (terajdbc4.jar).
  2. Edit the generic JDBC connection, remove the selected JAR files, and add SSLMODE=ALLOW to the JDBC URL.
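
For example, the resulting JDBC URL might look similar to the following sketch, where the host and database values are placeholders that you replace with your own values:

    jdbc:teradata://<host>/DATABASE=<database>,SSLMODE=ALLOW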

For assets from SAP OData sources, the metadata enrichment results do not show the table type

Applies to: 4.8.0 and later

In general, metadata enrichment results show for each enriched data asset whether the asset is a table or a view. This information cannot be retrieved for data assets from SAP OData data sources and is thus not shown in the enrichment results.

Data quality

Rules run on columns of type timestamp with timezone fail

Applies to: 4.8.0 and later

The data type timestamp with timezone is not supported. You can't apply data quality rules to columns with that data type.

Rules fail because the job's warning limit is exceeded

Applies to: 4.8.0 and later

For some rules, the associated DataStage job fails because the warning limit is reached. The following error message is written to the job log:

Warning limit 100 for the job has been reached, failing the job.

The default limit for jobs associated with data quality rules is 100.

Workaround: Edit the configuration of the DataStage job and set the warning limit to 1,000. Then, rerun the job.

Parent topic: Known issues and limitations in Cloud Pak for Data