Limitations and known issues for Watson Knowledge Catalog
These known issues apply to Watson Knowledge Catalog.
- General
- User groups not supported in certain areas
- Migration of legacy HDFS connections defaults to SSL
- No failover for some services when a node becomes unavailable
- Searching platform connections might not return results
- Search strings containing special characters might not render results
- Automated discovery fails after changing from an ia.apt file to a dynamic apt file
- The Watson Knowledge Catalog roles initialization job fails when the common core service (ccs) operator reconciles on 4.6.3
- Data quality permissions are missing from the Data Quality Analyst role
- The Administrator role's permission set does not include the Access data quality asset types permission
- Images for AI Factsheets fail to mirror
- Installing and upgrading
- The advanced metadata import feature can't be enabled in FIPS-enabled environments
- Installing the advanced metadata import feature may cause operator pods to crash
- During installation, the wkc-db2u-init job can get stuck
- The Metadata import (lineage) feature is not active when upgrading from 4.5.x to 4.6.4
- Installing or upgrading to version 4.6.4 requires a patch for offline backup and restore, and metadata enrichment
- Upgrading to version 4.6.5 may fail because of the Db2u container
- When installing 4.6.5 on an OCS cluster, an issue occurs in the is-en-conductor-0 pod
- After upgrading to 4.6.x, the workflow pod might not start
- Connections that use credentials from a vault
- Catalogs
- Missing previews
- Default catalog is missing
- Event details incomplete
- Publishing a Cognos Dashboard with more than one image creates multiple attachments with the same image data
- Catalog UI does not update when changes are made to the asset metadata
- Can't add business terms to a catalog asset
- A blank page might be rendered when you search for terms while manually assigning terms to a catalog asset
- No automatic reprofiling of assets where profiling failed when the assets were added to the catalog
- Cannot delete a custom relationship definition for catalog assets
- Server outage occurs while previewing masked assets for the first time
- Assigning business terms to columns may cause an error
- Special or double-byte characters in the data asset name are truncated on download
- Profiling in catalogs, projects, and metadata enrichment might fail for Teradata connections
- Can’t edit custom relationships
- The default catalog is not created automatically in version 4.6.4
- Duplicate columns in files will not be displayed in the columns table
- Cannot import or export catalog assets
- Governance artifacts
- Synchronize the data policy service (DPS) category caches
- Row filtering not applied to asset profile
- Masked data is not supported in data visualizations
- Artifacts are not synced to Information Assets view
- Custom category roles created without the view permission by using Watson Data API
- Cannot use CSV to move data class between Cloud Pak for Data instances
- Error Couldn't fetch reference data values shows up on screen after publishing reference data
- Troubleshooting problems with [uncategorized] category missing after deployment
- Errors show up when viewing related content for data classes and governance rules
- ZIP import of a reference data set with composite key fails
- Using Reload artifact from reference data set details page does not refresh the page
- Reference data set can't be modified even if it is no longer set as a validator
- Governance artifact workflows
- Custom workflows
- Legacy data discovery and data quality
- Column analysis/auto discovery analysis generates "data out of range" error
- Connections that use Cloud Pak for Data credentials for authentication can't be used in discovery jobs
- Incorrect connections are associated with connected data assets after automated discovery
- Changes to platform-level connections aren't propagated for discovery
- Column analysis fails if system resources or the Java heap size are not sufficient
- Connection that was made by using HDFS through the Execution Engine cannot be used with auto discovery
- Known issues with Hive and HDFS connections for data discovery
- Platform connections with encoded certificates cannot be used for discovery
- Certain connections with spaces in the name can't be used in discovery
- In some cases, automated discovery jobs show the status running although they actually failed
- The discovery operation intermittently fails
- Previous connection is deleted from IMAM when running discovery with a new connection pointing to the same source
- Discovery fails for MongoDB schemas or tables where the name contains special characters
- Inconsistent hostnames in the XMETA database for JDBC connections used in automated discovery
- Connection might not be usable in discovery after updating the password in IMAM
- Automated discovery might not work for generic JDBC platform connections with values in the JDBC properties field
- Db2 SSL/TLS connections can't be used in discovery
- Some connection names aren't valid for automated discovery
- Can't view the results of a data rule run
- Automated discovery might fail due to the length of the import area
- Create external bundle location option not supported when scheduling analysis jobs
- Can't preview assets added to the default catalog by running automated discovery on a connection that uses credentials from a vault
- Connections that use credentials from a vault are not synced to the default catalog
- Can't refine data assets added to the default catalog by running automated discovery on a connection that uses credentials from a vault
- Overriding rule or rule set runtime settings at the columns level causes an error
- When a generic JDBC connection for Snowflake is synced to the default catalog, the connection type changes to native Snowflake
- Can't update the password of connections used in metadata import areas from the command line
- Setting up reporting
- Quick scan migration
- Metadata import in projects
- Can't publish more than 700 assets at once
- Column information might not be available for data assets imported through lineage import
- Running concurrent metadata import jobs on multiple metadata-discovery pods might fail
- Metadata import jobs might be stuck in running state
- Metadata import jobs might be stuck due to issues related to RabbitMQ
- After upgrading to 4.6.0, the metadata import option Get lineage is disabled
- Installing the advanced metadata import feature may cause operator pods to crash
- Can’t create lineage imports after upgrading from 4.5.3 to 4.6.3
- Can’t create metadata import assets for getting lineage from Db2 SSL or Microsoft SQL Server connections
- Metadata enrichment
- In some cases, you might not see the full log of a metadata enrichment job run in the UI
- Schema information might be missing when you filter enrichment results
- Metadata enrichment runs on connections with personal credentials might fail
- Issues with search on the Assets tab of a metadata enrichment asset
- Information about removed business terms isn't stored if those are removed in bulk
- Term assignments can't be removed after a term is deleted
- Profiling in catalogs, projects, and metadata enrichment might fail for Teradata connections
- Turning off the schedule for an enrichment might not be possible
- Issues with updating an existing metadata enrichment
- Running primary key or relations analysis doesn't update the enrichment and review statuses
- Can't run primary key analysis on data assets from an SAP OData data source
- Can't run primary key analysis on data assets from an Apache Cassandra data source
- Running relationship analysis on more than 200 data assets at once renders no results
- Metadata enrichment job is stuck in running state
- When running enrichment on a large number of assets, profiling might fail for a subset
- If you do a bulk removal of data classes, the Governance tab in the column details is missing data class information
- Restarts of the profiling pod might keep the metadata enrichment job in running state
- When running enrichment on a large number of assets, profiling might fail for some assets due to gateway errors
- Business term or data class information panels in metadata enrichment do not show the correct information
- Data quality in projects
- Can't run rule on CSV file if the output table contains column names with special characters
- Rules run on columns of type time in data assets from Amazon Redshift data sources do not return proper results
- Rule testing in the review step fails if the bound data comes from an Apache Hive data source connected through Knox
- Rules binding columns of type BigInt in data assets coming from a Snowflake connection might fail
- Data from embedded array fields in MongoDB tables might not be written to the output table for a data quality rule
- Issues with rules in FIPS-enabled environments
- Rules run on columns of type timestamp with timezone fail
- Results of rule test and rule run deviate if the expression contains a comparison between string and numeric columns
- Rule testing in the review step fails if the rule contains joins for columns from SAP HANA data sources
- Rules with random sampling fail for data assets from Teradata data sources
- MANTA Automated Data Lineage for IBM Cloud Pak for Data
- Metadata import jobs for getting lineage might take very long to complete
- The advanced metadata import feature can't be enabled in FIPS-enabled environments
- Lineage import fails for schemas with names that contain special characters
- The metadata import option "Get lineage" gets intermittently disabled
- Chrome security warning for Cloud Pak for Data deployments where MANTA Automated Data Lineage is enabled
- Lineage imports fail if the data scope is narrowed to schema
- Rerun of a lineage import fails if assets were deleted from the source
- Lineage
- Assets connected through promoted lineage flows appear on column lineage
- Reimporting lineage after upgrade might lead to flows pointing to placeholder assets
- Missing column names in lineage for Netezza Performance Server and Google BigQuery data assets
- Accessing the lineage tab in the user interface does not work
Also see:
- Known issues for Data Refinery
- Troubleshooting Watson Knowledge Catalog
- Cluster imbalance overwhelms worker nodes
General issues
You might encounter these known issues and restrictions when you work with the Watson Knowledge Catalog service.
User groups not supported in certain areas
These areas do not support user groups:
- Data discovery
- Data quality
- Information assets
Applies to: 4.6
Migration of legacy HDFS connections defaults to SSL
After migrating legacy WebHDFS connections, you might receive the following error from the migrated Apache HDFS connection:
The assets request failed: CDICO0100E: Connection failed: SCAPI error: An invalid custom URL (https://www.example.com) was specified. Specify a valid HTTP or HTTPS URL.
Workaround: Modify your WebHDFS URL protocol from https to http.
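For example, a migrated connection URL would change from something like this (the host, port, and path are illustrative; use the values from your own connection):
https://namenode.example.com:9870/webhdfs/v1
to:
http://namenode.example.com:9870/webhdfs/v1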
Applies to: 4.6
No failover for some services when a node becomes unavailable
The following pods of the InfoSphere Information Server and Unified Governance services use only one replica:
c-db2oltp-iis-db2u
c-db2oltp-wkc-db2u
cassandra
gov-admin-ui
gov-catalog-search-index
gov-catalog-search-service
gov-enterprise-search-ui
gov-quality-ui
gov-ui-commons
ia-analysis
igc-ui-react
iis-services
is-en-conductor
kafka
omag
shop4info-event-consumer
shop4info-mappers-service
shop4info-rest
shop4info-scheduler
shop4info-type-registry-service
solr
zookeeper
As a result, these pods are not automatically restarted on another node if the node they're running on becomes unavailable.
The pods are restarted when the node becomes available again or if they are rescheduled to another node.
Applies to: 4.6
Searching platform connections might not return results
Searching for a connection on the Platform connections page might not return results because only the displayed connections are searched, although there might be more connections.
Workaround: Click Show more until all connections are loaded and rerun your search.
Applies to: 4.6.0
Search strings containing special characters might not render results
When you don't use global search to search for an asset but use the search field on a project or catalog page, search strings that contain special characters such as an underscore (_) might not return results.
Workaround: Rerun your search with a search string without special characters.
Applies to: 4.6
Automated discovery fails after changing from an ia.apt file to a dynamic apt file
You might encounter this issue when running an automated discovery job and you change the apt file from an ia.apt file to a dynamic apt file.
Workaround: To fix this issue, run the following patch command:
oc patch iis iis-cr --type merge --patch '{"spec": {"iis_en_compute_resources":{"requests":{"cpu": "200m", "memory": "1000Mi"},"limits":{"cpu": "2", "memory": "4Gi"}}}}'
Note that the listed "limits" values ("cpu": "2", "memory": "4Gi") may vary depending on the automated discovery jobs being run.
Applies to: 4.6.1
The Watson Knowledge Catalog roles initialization job fails when the common core service (ccs) operator reconciles in 4.6.3
You might encounter this issue during the roles initialization job if you do not have the optional Data Quality feature enabled, which means you have set enableDataQuality: False in the install.yaml file.
This issue only applies to users on 4.6.3.
Workaround: Apply the missing data quality permission access_data_quality_asset_types to prevent ccs reconciliation failures by following these steps:
- Apply the following configmap in the cpd-instance
namespace:cat <<EOF | oc apply -f - apiVersion: v1 data: extensions: | [ { "extension_point_id": "zen_permissions", "extension_name": "access_data_quality_asset_types", "display_name": "{{.global_wkc_access_data_quality_asset_types}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "key": "access_data_quality_asset_types", "category": "{{.global_wkc_category_data_curation}}", "category_description": [ "{{.global_wkc_category_data_curation_description}}" ], "description": [ "{{.global_wkc_access_data_quality_asset_types_description}}" ] } } ] kind: ConfigMap metadata: labels: app: wkc-lite app.kubernetes.io/instance: 0075-wkc-lite app.kubernetes.io/managed-by: Tiller app.kubernetes.io/name: wkc-lite chart: wkc-lite helm.sh/chart: wkc-lite heritage: Tiller icpdata_addon: "true" icpdata_addon_version: 4.6.3 release: 0075-wkc-lite name: wkc-permission-extensions-dq namespace: ${PROJECT_CPD_INSTANCE} EOF
- Force a reconciliation of the ccs operator by running the following:
oc delete pods -l app.kubernetes.io/name=ibm-cpd-ccs-operator -n ${PROJECT_CPD_OPS}
Applies to: 4.6.3
Fixed in: 4.6.4
Data quality permissions are missing from the Data Quality Analyst role
You might encounter this issue when running Metadata enrichment and Metadata import as a user with the Data Quality Analyst role.
Workaround: Apply the following missing data quality permissions and role extension configmaps to enable the use of Metadata enrichment and Metadata import:
- Apply the first permission and configmap in the cpd-instance
namespace:cat <<EOF | oc apply -f - apiVersion: v1 data: extensions: | [ { "extension_point_id": "zen_permissions", "extension_name": "wkc_manage_discovery_perm", "display_name": "{{.global_wkc_discover_assets}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "key": "manage_discovery", "category": "{{.global_wkc_category_data_curation}}", "category_description": [ "{{.global_wkc_category_data_curation_description}}" ], "description": [ "{{.global_wkc_discover_assets_description1}}", "{{.global_wkc_discover_assets_description2}}" ], "actions": [ { "description": "{{.global_wkc_action_discovery}}", "tooltip": "{{.global_wkc_discovery_tooltip}}" }, { "description": "{{.global_wkc_action_rerun_discovery}}" }, { "description": "{{.global_wkc_action_delete_discovery}}" } ] } }, { "extension_point_id": "zen_permissions", "extension_name": "wkc_manage_information_assets_perm", "display_name": "{{.global_wkc_manage_information_assets}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "key": "manage_information_assets", "category": "{{.global_wkc_category_catalogs}}", "category_description": [ "{{.global_wkc_category_catalogs_description1}}", "{{.global_wkc_category_catalogs_description2}}" ], "description": [ "{{.global_wkc_manage_information_assets_description}}" ], "actions": [ { "description": "{{.global_wkc_action_browse_assets}}" }, { "description": "{{.global_wkc_action_data_lineage}}", "tooltip": "{{.global_wkc_data_lineage_tooltip}}" }, { "description": "{{.global_wkc_action_business_lineage}}", "tooltip": "{{.global_wkc_business_lineage_tooltip}}" }, { "description": "{{.global_wkc_action_add_assets}}" }, { "description": "{{.global_wkc_action_edit_assets}}" }, { "description": "{{.global_wkc_action_delete_assets}}" }, { "description": "{{.global_wkc_action_lineage_reports}}" } ] } }, { "extension_point_id": "zen_permissions", "extension_name": "wkc_access_information_assets_perm", "display_name": "{{.global_wkc_access_information_assets_view}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "key": "access_information_assets", "category": "{{.global_wkc_category_catalogs}}", "category_description": [ "{{.global_wkc_category_catalogs_description1}}", "{{.global_wkc_category_catalogs_description2}}" ], "description": [ "{{.global_wkc_access_information_assets_view_description}}" ], "actions": [ { "description": "{{.global_wkc_action_browse_assets}}" }, { "description": "{{.global_wkc_action_explore_assets}}" }, { "description": "{{.global_wkc_action_data_lineage}}", "tooltip": "{{.global_wkc_data_lineage_tooltip}}" }, { "description": "{{.global_wkc_action_business_lineage}}", "tooltip": "{{.global_wkc_business_lineage_tooltip}}" } ] } }, { "extension_point_id": "zen_permissions", "extension_name": "wkc_manage_metadata_import_perm", "display_name": "{{.global_wkc_import_metadata}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "key": "manage_metadata_import", "category": "{{.global_wkc_category_data_curation}}", "category_description": [ "{{.global_wkc_category_data_curation_description}}" ], "description": [ "{{.global_wkc_import_metadata_description1}}", "{{.global_wkc_import_metadata_description2}}" ], "actions": [ { "description": "{{.global_wkc_action_metadata_repository}}" }, { "description": "{{.global_wkc_action_metadata_servers}}" }, { "description": 
"{{.global_wkc_action_import_settings}}" } ] } }, { "extension_point_id": "zen_permissions", "extension_name": "wkc_manage_quality_perm", "display_name": "{{.global_wkc_manage_data_quality}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "key": "manage_quality", "category": "{{.global_wkc_category_data_curation}}", "category_description": [ "{{.global_wkc_category_data_curation_description}}" ], "description": [ "{{.global_wkc_manage_data_quality_description1}}", "{{.global_wkc_manage_data_quality_description2}}" ], "actions": [ { "description": "{{.global_wkc_action_quality_rules}}", "tooltip": "{{.global_wkc_quality_rules_tooltip}}" }, { "description": "{{.global_wkc_action_edit_quality_rules}}" }, { "description": "{{.global_wkc_action_delete_quality_rules}}" }, { "description": "{{.global_wkc_action_create_automation_rules}}" }, { "description": "{{.global_wkc_action_edit_automation_rules}}" }, { "description": "{{.global_wkc_action_delete_automation_rules}}" }, { "description": "{{.global_wkc_action_configure_analysis_settings}}" }, { "description": "{{.global_wkc_action_run_quality_analysis}}" } ] } }, { "extension_point_id": "zen_permissions", "extension_name": "wkc_access_advanced_governance_capabilities_perm", "display_name": "{{.global_wkc_access_advanced_governance}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "key": "access_advanced_governance_capabilities", "category": "{{.global_wkc_category_catalogs}}", "category_description": [ "{{.global_wkc_category_catalogs_description1}}", "{{.global_wkc_category_catalogs_description2}}" ], "description": [ "{{.global_wkc_access_advanced_governance_description1}}", "{{.global_wkc_access_advanced_governance_description2}}" ], "actions": [ { "description": "{{.global_wkc_action_import_metadata}}", "tooltip": "{{.global_wkc_import_metadata_tooltip}}" }, { "description": "{{.global_wkc_action_configure_assets}}", "tooltip": "{{.global_wkc_configure_assets_tooltip}}" }, { "description": "{{.global_wkc_action_custom_information}}", "tooltip": "{{.global_wkc_custom_information_tooltip}}" }, { "description": "{{.global_wkc_action_custom_attributes}}", "tooltip": "{{.global_wkc_custom_attributes_tooltip}}" } ] } }, { "extension_point_id": "zen_permissions", "extension_name": "wkc_view_quality_perm", "display_name": "{{.global_wkc_view_data_quality}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "key": "view_quality", "category": "{{.global_wkc_category_data_curation}}", "category_description": [ "{{.global_wkc_category_data_curation_description}}" ], "description": [ "{{.global_wkc_view_data_quality_description1}}" ] } } ] kind: ConfigMap metadata: labels: app: wkc-lite app.kubernetes.io/instance: 0075-wkc-lite app.kubernetes.io/managed-by: Tiller app.kubernetes.io/name: wkc-lite chart: wkc-lite helm.sh/chart: wkc-lite heritage: Tiller icpdata_addon: "true" icpdata_addon_version: ${VERSION} release: 0075-wkc-lite name: ug-permission-extensions namespace: ${PROJECT_CPD_INSTANCE} EOF
- Apply the second permission and configmap in the cpd-instance namespace. Depending on whether you have the Data Quality feature enabled, you will need to apply a different permission and configmap.
- If you have the Data Quality feature enabled, apply the following permission and configmap:
cat <<EOF | oc apply -f - apiVersion: v1 data: extensions: | [ { "extension_point_id": "zen_user_roles", "extension_name": "wkc_business_analyst_role", "display_name": "{{.global_business_analyst_role_name}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "description": "{{.global_business_analyst_role_description}}", "permissions": [ "access_catalog", "view_quality", "create_space" ] } }, { "extension_point_id": "zen_user_roles", "extension_name": "zen_data_engineer_role", "display_name": "{{.global_data_engineer_role_name}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "description": "{{.global_data_engineer_role_description}}", "permissions": [ "view_quality", "manage_metadata_import", "manage_discovery", "manage_quality", "create_space" ] } }, { "extension_point_id": "zen_user_roles", "extension_name": "wkc_data_quality_analyst_role", "display_name": "{{.global_data_quality_analyst_role_name}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "description": "{{.global_data_quality_analyst_role_description}}", "permissions": [ "manage_metadata_import", "manage_discovery", "manage_quality", "view_governance_artifacts", "author_governance_artifacts", "access_data_quality_asset_types", "access_catalog", "create_space" ] } }, { "extension_point_id": "zen_user_roles", "extension_name": "wkc_data_steward_role", "display_name": "{{.global_data_steward_role_name}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "description": "{{.global_data_steward_role_description}}", "permissions": [ "manage_metadata_import", "manage_discovery", "view_quality", "create_space" ] } }, { "extension_point_id": "zen_user_roles", "extension_name": "zen_administrator_role", "display_name": "{{.global_administrator_role_name}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "description": "{{.global_administrator_role_description}}", "permissions": [ "manage_quality", "manage_discovery", "manage_metadata_import" ] } } ] kind: ConfigMap metadata: labels: app: wkc-lite app.kubernetes.io/instance: 0075-wkc-lite app.kubernetes.io/managed-by: Tiller app.kubernetes.io/name: wkc-lite chart: wkc-lite helm.sh/chart: wkc-lite heritage: Tiller icpdata_addon: "true" icpdata_addon_version: ${VERSION} release: 0075-wkc-lite name: ug-user-role-extensions namespace: ${PROJECT_CPD_INSTANCE} EOF
- If you do not have the Data Quality feature enabled, apply the following permission and configmap:
cat <<EOF | oc apply -f - apiVersion: v1 data: extensions: | [ { "extension_point_id": "zen_user_roles", "extension_name": "wkc_business_analyst_role", "display_name": "{{.global_business_analyst_role_name}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "description": "{{.global_business_analyst_role_description}}", "permissions": [ "access_catalog", "view_quality", "create_space" ] } }, { "extension_point_id": "zen_user_roles", "extension_name": "zen_data_engineer_role", "display_name": "{{.global_data_engineer_role_name}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "description": "{{.global_data_engineer_role_description}}", "permissions": [ "view_quality", "manage_metadata_import", "manage_discovery", "manage_quality", "create_space" ] } }, { "extension_point_id": "zen_user_roles", "extension_name": "wkc_data_quality_analyst_role", "display_name": "{{.global_data_quality_analyst_role_name}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "description": "{{.global_data_quality_analyst_role_description}}", "permissions": [ "manage_metadata_import", "manage_discovery", "manage_quality", "view_governance_artifacts", "author_governance_artifacts", "access_catalog", "create_space" ] } }, { "extension_point_id": "zen_user_roles", "extension_name": "wkc_data_steward_role", "display_name": "{{.global_data_steward_role_name}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "description": "{{.global_data_steward_role_description}}", "permissions": [ "manage_metadata_import", "manage_discovery", "view_quality", "create_space" ] } }, { "extension_point_id": "zen_user_roles", "extension_name": "zen_administrator_role", "display_name": "{{.global_administrator_role_name}}", "match_permissions": "", "match_instance_id": "", "match_instance_role": "", "meta": {}, "details": { "description": "{{.global_administrator_role_description}}", "permissions": [ "manage_quality", "manage_discovery", "manage_metadata_import" ] } } ] kind: ConfigMap metadata: labels: app: wkc-lite app.kubernetes.io/instance: 0075-wkc-lite app.kubernetes.io/managed-by: Tiller app.kubernetes.io/name: wkc-lite chart: wkc-lite helm.sh/chart: wkc-lite heritage: Tiller icpdata_addon: "true" icpdata_addon_version: ${VERSION} release: 0075-wkc-lite name: ug-user-role-extensions namespace: ${PROJECT_CPD_INSTANCE} EOF
Applies to: 4.6.4
Fixed in: 4.6.5
The Administrator role's permission set does not include the Access data quality asset types permission
For new installations of Cloud Pak for Data 4.6.3 or 4.6.4, the predefined permission set of the Administrator role does not include the Access data quality asset types permission. Thus, users with the Administrator role do not have access to those asset types in projects even if the optional data quality component is installed.
Workaround: Manually add the Access data quality asset types permission to the Administrator role.
Applies to: 4.6.3 and 4.6.4
Fixed in: 4.6.5
Images for AI Factsheets fail to mirror
Mirroring images for AI Factsheets might not work during an air-gapped installation. If image mirroring fails, you cannot continue the installation.
Applies to: 4.6.0, 4.6.1, 4.6.2, 4.6.3 and 4.6.4
Fixed in: 4.6.5
Installing and upgrading
You might encounter these known issues while either installing or upgrading Watson Knowledge Catalog.
The advanced metadata import feature can't be enabled in FIPS-enabled environments
MANTA Automated Data Lineage for IBM Cloud Pak for Data is not supported in a FIPS-enabled environment. If you try to enable the advanced metadata import feature when installing or after installing Watson Knowledge Catalog, the installation fails.
Applies to: 4.6.0, 4.6.1 and 4.6.2
Fixed in: 4.6.3
Installing the advanced metadata import feature may cause operator pods to crash
When installing MANTA Automated Data Lineage on large OpenShift clusters, MANTA Automated Data Lineage operator pods may crash with the following errors: OOMKilled and CrashLoopBackOff.
The operator pods crash because the container does not have enough memory allocated to complete the installation successfully.
Workaround: These steps are specific to MANTA Automated Data Lineage.
- Identify the problem cluster by accessing the manta-adl-operator-controller on the cluster.
If the manta-adl-operator-controller is in the installing phase, run the following:
oc get -n ibm-common-services --template="{{.status.phase}}" "$(oc get pods -n ibm-common-services -o name | grep manta-adl-operator-controller)"
If the manta-adl-operator-controller has had several restarts and the restart count is unusually high, run the following:
oc get -n ibm-common-services "$(oc get pods -n ibm-common-services -o name | grep manta-adl-operator-controller)"
- Once the cluster has been determined, run the following patch command to update the resource requirements:
oc patch csv manta-adl-operator.v1.6.2 -n ibm-common-services --type merge -p '{"spec": {"install": {"spec": {"deployments": [{"name": "manta-adl-operator-controller-manager", "spec": {"selector":{"matchLabels": {"control-plane": "controller-manager"}}, "template": {"spec": {"containers": [{"name": "manager", "resources": {"requests": {"memory": "512Mi"} } }] } } } }]}}}}'
Applies to: 4.6
Fixed in: 4.6.1
During installation, the wkc-db2u-init job can get stuck
When installing Watson Knowledge Catalog, the wkc-db2u-init job can get stuck in the RUNNING state.
Workaround: Refer to the steps on this support page: WKC Db2U failed to provision the databases BGDB, ILGDB, and WFDB.
After you complete those steps, return to the Watson Knowledge Catalog installation instructions: Installing Watson Knowledge Catalog.
Applies to: 4.6
The Metadata import (lineage) feature is not active when upgrading from 4.5.x to 4.6.4
When upgrading from version 4.5.x to 4.6.4, the Metadata import (lineage) feature might not be active.
Workaround:
- Log in to the OpenShift console.
- Navigate to the Workloads section and then to ConfigMaps.
- Delete the following config map: manta-route-flow-tls
- Restart the manta-adl-operator-controller-manager pod.
- Verify that the manta-route-flow-tls config map has been created automatically.
- Ensure that the following line exists under manta-dataflow-server and manta-dataflow-server: access_by_lua_file /nginx_data/checkjwt.lua
- Verify that the following line is under manta-dataflow-server: proxy_pass https://$manta_server:8080
Applies to: 4.6.4
Fixed in: 4.6.5
Installing or upgrading to version 4.6.4 requires a patch for offline backup and restore, and metadata enrichment
When installing or upgrading to version 4.6.4, users will need to install this patch if they use offline backup and restore, or Metadata enrichment. This patch fixes issues with offline backup and restore, global search, and profiling. Otherwise, it is not mandatory. You may still install the patch if you plan to use these features in the future.
Workaround: Install the patch as described in Installing the patch for version 4.6.4.
Applies to: 4.6.4
Fixed in: 4.6.5
Upgrading to version 4.6.5 may fail because of the Db2u container
When upgrading to version 4.6.5, the upgrade may fail because the DB2COMM value inside the c-db2oltp-wkc-db2u-0 pod or the c-db2oltp-iis-db2u-0 pod is set to NULL.
The c-db2oltp-wkc-db2u-0 pod
Follow these steps and workaround for the c-db2oltp-wkc-db2u-0 pod:
- After upgrading to version 4.6.5, check the status of the wkc-cr using the following command:
oc get wkc wkc-cr
- If the wkc-cr is in the Inprogress or Failed state, check whether any of the following pods show a READY count of 0 or 1 and are in the Running or CrashLoopBackOff state:
Manta-adl-operator-catalog-xxxx
dp-transform-xxxx
wdp-lineage-xxxx
wdp-policy-service-xxxx
wkc-bi-data-service-xxxx
wkc-glossary-service-xxxx
wkc-workflow-service-xxxx
- If any of these pods are in that state, check the logs of the pods and look for db2u communication issues.
- Go into the c-db2oltp-wkc-db2u-0 pod using the following command:
oc exec -it c-db2oltp-wkc-db2u-0 bash
- Run the following to find the DB2COMM value:
db2set -all|grep DB2COMM
Workaround: Follow these steps to fix the issue:
- Run the following command to set the DB2COMM value:
db2set DB2COMM=TCPIP,SSL
- Run the following command to stop Db2:
db2stop
- Run the following command to start Db2:
db2start
- Delete any of the pods that show a READY count of 0 or 1 and are in the Running or CrashLoopBackOff state by running:
oc delete pods < >
Where < > is the list of pods that you want to delete (a sketch follows).
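For example, a minimal sketch of how you might spot and delete the affected pods (the pod names below are hypothetical placeholders; substitute the names from your own cluster):
# Show pods that are not fully ready or are crash-looping
oc get pods | grep -E 'CrashLoopBackOff|0/1'
# Delete the affected pods so that they are re-created
oc delete pods wdp-lineage-abc123 wkc-workflow-service-def456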
The c-db2oltp-iis-db2u-0 pod
Follow these steps and workaround for the c-db2oltp-iis-db2u-0 pod:
- After upgrading to version 4.6.5, check the status of the wkc-cr using the following command:
oc get wkc wkc-cr
- If the wkc-cr is in the Inprogress or Failed state, check whether any of the following pods show a READY count of 0 or 1 and are in the Running or CrashLoopBackOff state:
audit-trail-service-xxxx
gov-app-config-service-xxxxx
gov-user-prefs-service-xxxx
shop4info-mappers-service-0
- If any of these pods are in that state, check the logs of the pods and look for db2u communication issues.
- Go into the c-db2oltp-iis-db2u-0 pod using the following command:
oc exec -it c-db2oltp-iis-db2u-0 bash
- Run the following to find the DB2COMM value:
db2set -all|grep DB2COMM
Workaround: Follow these steps to fix the issue:
- Run the following command to set the DB2COMM value:
db2set DB2COMM=TCPIP,SSL
- Run the following command to stop Db2:
db2stop
- Run the following command to start Db2:
db2start
- Delete any of the pods that show a READY count of 0 or 1 and are in the Running or CrashLoopBackOff state by running:
oc delete pods < >
Where < > is the list of pods that you want to delete.
Applies to: 4.6.5
When installing 4.6.5 on an OCS cluster, an issue occurs in the is-en-conductor-0 pod
During a new installation of Watson Knowledge Catalog on version 4.6.5, an issue occurs with the OCS CSI driver on OCP 4.12: it doesn't set the correct FS group ID when mounting volumes. This leads to an issue in the is-en-conductor-0 pod.
Workaround:
- Run the following command to edit the SCC:
oc edit scc wkc-iis-scc
- Change the fsGroup parameter from RunAsAny to MustRunAs:
fsGroup:
  type: MustRunAs
- Restart the failing is-en-conductor-0 pod (see the example that follows).
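For example, one way to restart the pod is to delete it so that it is re-created by its controller (a sketch; add the -n <namespace> option for your instance namespace if needed):
oc delete pod is-en-conductor-0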
Applies to: 4.6.5
After upgrading to 4.6.x, the workflow pod might not start
Applies to: 4.6.0 - 4.6.6
After upgrading to 4.6.x, the wkc-workflow-service pod does not come up. The liveness probe does not respond, and the pod keeps restarting over and over again. In the pod log, you'll see an exception stack trace. The last Caused by entry in the trace is similar to this one: LockException: Could not acquire change log lock.
Workaround: As a user with sufficient permissions, complete the following steps:
- Log in to Red Hat OpenShift Container Platform. Then, log in to the Db2 pod that runs the workflow database:
oc rsh c-db2oltp-wkc-db2u-0
sudo su db2inst1
Alternatively, open a bash shell:
oc exec -it -n ${PROJECT_CPD_INST_OPERANDS} c-db2oltp-wkc-db2u-0 -- bash
- Connect to the workflow database:
db2
connect to WFDB
- Check the current change log locks:
select * from FLW_EV_DATABASECHANGELOGLOCK;
This command should return 1 record at maximum.
- If there is an entry, clear the lock:
delete from FLW_EV_DATABASECHANGELOGLOCK;
COMMIT;
- Close the connection to Db2 and restart the workflow service pod, for example, by deleting it, as shown in the sketch that follows.
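A minimal sketch of this last step, assuming the workflow pod name starts with wkc-workflow-service (verify the actual pod name in your cluster):
terminate
exit
oc get pods -n ${PROJECT_CPD_INST_OPERANDS} | grep wkc-workflow-service
oc delete pod <wkc-workflow-service-pod-name> -n ${PROJECT_CPD_INST_OPERANDS}
The terminate command closes the Db2 command line processor session, and exit leaves the Db2 pod before you delete the workflow pod.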
Connections that use credentials from a vault
You might encounter these known issues and restrictions when you use connections that use credentials from a vault.
After upgrading, the settings for vault password enforcement and use of shared credentials are reset
After upgrading to version 4.6, the settings for vault password enforcement and use of shared credentials are reset to their default values in the config-wdp-connect-connection configmap.
Workaround: Update the config-wdp-connect-connection configmap:
- Run the following command to edit the configmap:
oc edit configmap config-wdp-connect-connection
- Change the parameter settings as follows:
To enable vault password enforcement:
allow-only-vaulted-password: "true"
To disable shared credentials:
allow-shared-credentials: "false"
- Save your changes to the config-wdp-connect-connection configmap.
- Restart the wdp-connect pods. For example, scale the pods to zero and then back to their previous value. Pods wdp-connect-connector and wdp-connect-connection must be restarted:
oc scale deploy wdp-connect-connection --replicas=0
oc scale deploy wdp-connect-connector --replicas=0
oc scale deploy wdp-connect-connection --replicas=N
oc scale deploy wdp-connect-connector --replicas=N
Applies to: 4.6
Catalog issues
You might encounter these known issues and restrictions when you use catalogs.
Missing previews
You might not see previews of assets in these circumstances:
- In a catalog or project, you might not see previews or profiles of connected data assets that are associated with connections that require personal credentials. You are prompted to enter your personal credentials to start the preview or profiling of the connection asset.
- In a catalog, you might not see previews of JSON, text, or image files that were published from a project.
- In a catalog, the previews of JSON and text files that are accessed through a connection might not be formatted correctly.
- In a project, you cannot view the preview of image files that are accessed through a connection.
Applies to: 4.6
Missing default catalog and predefined data classes
The automatic creation of the default catalog after installation of the Watson Knowledge Catalog service can fail. If it does, the predefined data classes are not automatically loaded and published as governance artifacts.
Workaround: Ask someone with the Administrator role to follow the instructions for creating the default catalog manually.
Applies to: 4.6
Event details incomplete
For classification updates, the event details on the asset's Activities page don't include the original and updated values.
Applies to: 4.6.0
Fixed in: 4.6.1
Publishing a Cognos Dashboard with more than one image creates multiple attachments with the same image data
When you publish a Cognos dashboard to a catalog and choose to add more than one preview of the dashboard, all attachments in the catalog show the image added last. This issue occurs when you select multiple images at a time and drag them into the publish page and when you add files one at a time.
In addition, when you browse for files in the publishing step, you can select only one file. To add further images, you must drag them to the publish page.
Whenever you publish the same dashboard again, it will have the images from the previously published assets as well as the newly added images. For example, if you publish dashboard A with images 1, 2 and 3, it will have 3 screen captures of image 3. If you publish dashboard A again with images 4, 5, 6, it will have 5 screen captures, 3 with image 3 and 2 with image 6.
Applies to: 4.6
Catalog UI does not update when changes are made to the asset metadata
If the Catalog UI is open in a browser while an update is made to the asset metadata, the Catalog UI page will not automatically update to reflect this change. Outdated information will continue to be displayed, causing external processes to produce incorrect information.
Workaround: After the asset metadata is updated, refresh the Catalog UI page at the browser level.
Applies to: 4.6
Can't add business terms to a catalog asset
If terms on an asset in the default catalog were initially assigned through automated discovery, you can't add or remove terms even if you're the asset owner.
Workaround: Update the terms on the discovered asset in the discovery results and publish them to the default catalog.
Applies to: 4.6
A blank page might be rendered when you search for terms while manually assigning terms to a catalog asset
When you search for a term to assign to a catalog asset and change that term while the search is running, a blank page might be shown instead of the search results.
Workaround: Rerun the search.
Applies to: 4.6
No automatic reprofiling of assets where profiling failed when the assets were added to the catalog
If automatic profiling of an asset failed for some reason at the time the asset was added to the catalog, the asset isn't reprofiled automatically.
Workaround: The asset owner can manually trigger profiling of such an asset, or delete the asset and add it to the catalog again.
Applies to: 4.6.0
Fixed in: 4.6.1
Cannot delete a custom relationship definition for catalog assets
After you add a custom relationship of a type to a catalog, you can't delete it on the Asset and artifacts definitions page.
Workaround: To delete a custom relationship definition, you need to delete all other existing relationships of that type first.
Applies to: 4.6
Server outage occurs while previewing masked assets for the first time
When you preview masked assets for the first time, a server outage occurs. Every subsequent attempt to preview a masked asset completes successfully.
Workaround: Refresh the Catalog UI page at the browser level.
Applies to: 4.6.1
Fixed in: 4.6.3
Assigning business terms to columns may cause an error
When assigning business terms to columns on the Overview tab of the asset details page in a catalog, the error "The attribute of the asset couldn't be changed" might occur.
Workaround: To assign business terms to columns, the asset needs to be profiled. If it was profiled before, you need to repeat profiling. Then refresh the asset details page and assign a business term to a column again.
Applies to: 4.6.1
Fixed in: 4.6.3
Special or double-byte characters in the data asset name are truncated on download
When you download a data asset with a name that contains special or double-byte characters from a catalog, these characters might be truncated from the name. For example, a data asset named special chars!&@$()テニス.csv will be downloaded as specialchars!().csv.
The following character sets are supported:
- Alphanumeric characters: 0-9, a-z, A-Z
- Special characters: ! - _ . * ' ( )
Applies to: 4.6
The default catalog is not created automatically in version 4.6.4
When you install version 4.6.4, the default catalog is not created automatically.
Before running the following steps, make sure that you run the jq commands locally.
Workaround: Follow the steps to create the default catalog.
- Create a new catalog in the user interface with the Enforce data protection rules and Allow duplicates options enabled.
- Note down the GUID of this new catalog once created. You can get the GUID for the catalog from the browser address bar.
- Replace the GUID in the following commands with the GUID value for your catalog.
oc exec wdp-couchdb-0 -- bash -c 'curl -ks -u admin:`cat /etc/.secrets/COUCHDB_PASSWORD` https://localhost:6984/v2_admin/38c642c2-cfc7-4038-bc54-d2ae224eed7d' | jq -c ".metadata.uid = \"ibm-default-catalog\"" > /tmp/newCat.json
oc cp -c couchdb /tmp/newCat.json wdp-couchdb-0:/tmp/newCat.json
oc exec wdp-couchdb-0 -- bash -c 'curl -ks -u admin:`cat /etc/.secrets/COUCHDB_PASSWORD` -H "Content-Type: application/json" -X POST "https://localhost:6984/v2_admin" -d @/tmp/newCat.json'
In this example, the GUID is 38c642c2-cfc7-4038-bc54-d2ae224eed7d.
Applies to: 4.6.4
Fixed in: 4.6.5
Duplicate columns in files will not be displayed in the columns table
If a CSV or other structured file type contains duplicate columns with the same name, only the first instance of each column will be displayed in the columns table on the asset Overview page.
Applies to: 4.6.0 and later
Cannot import or export catalog assets
Assets fail to export from a catalog or import into a catalog.
Applies to: 4.6.1 and 4.6.2
Fixed in: 4.6.3.
Governance artifacts
You might encounter these known issues and restrictions when you use governance artifacts.
Synchronize the data policy service (DPS) category caches
For performance purposes, the data policy service (DPS) keeps a copy of glossary categories in caches. When categories are created, updated, or deleted, the glossary service publishes RabbitMQ events to reflect these changes. The DPS listens to these events and updates the caches. However, on rare occasions, a message might be lost when the RabbitMQ service is down or too busy. The DPS provides a REST API utility to update the cache.
You can run the following REST API utility during a downtime that has no category changes to avoid unexpected enforcement results during the run and inaccurate cache updates:
curl -v -k -X GET --header "Content-Type: application/json" --header "Accept: application/json" --header "Authorization: Bearer ${token}" "${uri}/v3/enforcement/governed_items/sync/category"
This REST API is available in Watson Knowledge Catalog version 4.6.5 or later.
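For example, one way to set ${token} and ${uri} before you call the utility is through the Cloud Pak for Data authorization endpoint (a sketch; the route host name, user name, and password are placeholders for your own deployment):
uri="https://<cpd-route-hostname>"
token=$(curl -k -s -X POST "${uri}/icp4d-api/v1/authorize" -H "Content-Type: application/json" -d '{"username":"<admin-user>","password":"<password>"}' | jq -r .token)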
Applies to: 4.6.0 and later
Row filtering not applied to asset profile
If you have data protection rules that filter rows from data assets, although the rows are removed from the asset preview, the rows are not filtered from the asset profile analysis in catalogs or projects. For example, if your data protection rule filters rows based on a specific value in a column, you might see that value in the Frequency section of the profile. Other aggregation statistics on the asset profile are also compiled with those filtered rows.
Applies to: 4.6.0, 4.6.1, 4.6.2, 4.6.3, and 4.6.4
Fixed in: 4.6.5
Masked data is not supported in data visualizations
Masked data is not supported in data visualizations. If you attempt to work with masked data while generating a chart on the Visualizations tab of a data asset in a project, the following error message is displayed: Bad Request: Failed to retrieve data from server. Masked data is not supported.
Applies to: 4.6.4
Artifacts are not synced to Information Assets view
Governance artifacts that were created on the core version of Watson Knowledge Catalog before installing the base version of Watson Knowledge Catalog are not synced to the Information Assets view. This problem occurs even after a manual reindex.
Workaround: Run a batch load command to sync the artifacts. See the following example:
curl -k -X GET "https://<hostname>/v3/glossary_terms/admin/resync?artifact_type=XXXXX" -H "accept: application/json" -H "Authorization: bearer <token>"
where <artifact_type> can be all, category, glossary_term, classification, data_class, reference_data, policy, or rule.
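For example, to resync all governance artifact types at once (the hostname and token are placeholders for your own deployment):
curl -k -X GET "https://<hostname>/v3/glossary_terms/admin/resync?artifact_type=all" -H "accept: application/json" -H "Authorization: bearer <token>"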
Applies to: 4.6
Custom category roles created without the view permission by using Watson Data API
When you create a custom category role on the UI, the view permission is added by default and cannot be removed. When you use Watson Data API, you can create a custom role without the view permission. However, when you open this role on the UI, the view permission is incorrectly displayed as added.
Applies to: 4.6.0 and later
Cannot use CSV to move data class between Cloud Pak for Data instances
If you try to export data classes with the matching method Match to reference data to CSV, and then import them into another Cloud Pak for Data instance, the import fails.
Workaround: For moving governance artifact data from one instance to another, especially data classes of this matching method, use the ZIP format export and import. For more information about the import methods, see Import methods for governance artifacts.
Applies to: 4.6
Error Couldn't fetch reference data values shows up on screen after publishing reference data
When new values are added to a reference data set, and the reference data set is published, the following error is displayed when you try to click on the values:
Couldn't fetch reference data values. WKCBG3064E: The reference_data_value for the reference_data which has parentVersionId: <ID> and code: <code> does not exist in the glossary. WKCBG0001I: Need more help?
When the reference data set is published, the currently displayed view changes to Draft-history, as marked by the green label at the top. The Draft-history view does not allow you to view the reference data values.
Workaround: To view the values, click Reload artifact so that you can view the published version.
Applies to: 4.6
Troubleshooting problems with [uncategorized] category missing after deployment
If the out-of-the-box default category [uncategorized] is missing after deployment, the reason might be that at the time the glossary is installed, there are no users configured and therefore the permission creation fails. Restarting the bootstrap process resolves the issue.
Bootstrap is a process of setting up initial ACLs during:
- New installation
- Upgrade from previous version
- Migration of data from InfoSphere Information Server
This process is intended to be run once for every case above. When completed successfully, it does not have to be run again. It is also incremental: if it fails, you can run it again, and it starts from the point where it failed.
Workaround:
- Verify the bootstrap status:
curl -X GET "https://$HOST/v3/categories/collaborators/bootstrap/status" -H "Authorization: Bearer $TOKEN" -k
If the status is different than SUCCESS, you should also get some error information, for example:
{ "status": "FAILED", "current_step": "Retrieve user data", "completion_message": "Failed with exception, check logs for details", "completed_records": 0, "errors": [ "WKCBG2227E: There are no users with 'Manage Categories' permission status: OK, 'Manage Glossary' permission status: OK and 'Author Governance Artifacts' permission status: OK." ] }
- Run the bootstrap process again:
curl -X POST https://$HOST/v3/categories/collaborators/bootstrap -H "Authorization: Bearer $TOKEN" -k
The process runs in the background, wait a while and verify the status with the command:
curl -X GET "https://$HOST/v3/categories/collaborators/bootstrap/status" -H "Authorization: Bearer $TOKEN" -k
Successful status is as follows:
{ "status": "SUCCEEDED", "completion_message": "Bootstrap process completed", "completed_records": 3, "total_records": 3 }
where record represents a single operation performed during bootstrap.
Errors show up when viewing related content for data classes and governance rules
When displaying the Related content tab for data classes and governance rules, multiple Couldn't load related content error messages might be displayed in notifications. The related content, if it exists, is displayed anyway.
Workaround: No workaround is needed.
Applies to: 4.6.4
Fixed in: 4.6.5
ZIP import of a reference data set with composite key fails
When importing a reference data set with a composite key using a ZIP file, the import process fails with errors on custom columns.
Workaround: Use the UI to import the reference data set from a CSV file.
Applies to: 4.6.4
Fixed in: 4.6.5
Using Reload artifact from reference data set details page does not refresh the page
When you click Reload artifact on the draft preview of a recently published reference data set, the following error is displayed:
Couldn't load artifact details
The page does not reload.
Workaround: Reload the page using the web browser reload option.
Applies to: 4.6.4 and 4.6.5
Reference data set can't be modified even if it is no longer set as a validator
When deleting an unpublished draft of a reference data set, its custom columns are not properly removed from the system. If a custom column has a validator reference data set specified, that validator reference data set can't be modified or deleted.
Workaround: Remove all custom columns before discarding the unpublished draft of a reference data set. If the custom columns are part of the composite key, you must first remove all the reference data values, then remove the custom columns.
Applies to: 4.6.4, 4.6.5
Governance artifact workflows
You might encounter these known issues and restrictions when you use governance workflows.
Workflow stalls for user tasks without an assignee
A user task might have no assignee because, for example, the category or artifact role selected for that task isn't assigned to any user or an empty user group was added. In this case, the workflow stalls.
You can prevent this in these ways:
- Assign a fallback user or user group to a step in addition to the category or artifact role.
- Specify a user group as the member of a category role or an artifact role. That way, the user group is assigned to the user task. Changes to user groups take effect immediately, even in already created user tasks.
Applies to: 4.6.0
Fixed in: 4.6.1
Custom workflows
You might encounter these known issues and restrictions when you use custom workflows.
HTTP method PATCH might not be supported in custom workflows
Custom workflow templates might call a REST API by using the HTTP task activity offered by the Flowable workflow engine. The HTTP task activity in version 6.5.0 of the embedded Flowable workflow engine that is used in Watson Knowledge Catalog 3.0, 3.5, and 4.0 does not support the HTTP method PATCH. Trying to call a REST API using that method results in a "requestMethod is invalid" error. GET, POST, PUT, and DELETE methods work fine.
Workaround: Modify your REST API call to use the POST method instead, and add this special header to your request:
X-HTTP-Method-Override: PATCH
For this workaround to actually work, the called service must understand and correctly interpret this header field. Calls to REST APIs provided by the wkc-glossary-service service have worked properly.
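As an illustration only, an equivalent call made outside the workflow engine might look like the following, where the endpoint and payload are hypothetical; in a workflow template you set the POST method, the header, and the body on the HTTP task itself:
curl -k -X POST "https://<hostname>/<rest-endpoint>" -H "Authorization: Bearer <token>" -H "Content-Type: application/json" -H "X-HTTP-Method-Override: PATCH" -d '{"<field>": "<new value>"}'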
Applies to: 3.0, 3.5, 4.0, 4.5, and 4.6
Action buttons are disabled and custom workflow can't be activated
When a custom workflow template uses required form fields of a type other than link or text, the action buttons are disabled and the custom workflow can't be activated. New requests that are based on this workflow can't be submitted, and task actions can't be executed.
Workaround: There is currently no workaround for this issue. When you build workflow templates, avoid required form properties of types other than link or text. You can use optional form properties instead.
Applies to: 4.6.5
Legacy data discovery and data quality
You might encounter these known issues and restrictions when you use automated discovery or work in data quality projects.
Column analysis/auto discovery analysis generates "data out of range" error
A "data out of range" error is produced when you run a column analysis/auto discovery analysis.
Workaround: Complete the following steps to support persisting values of more than 20 characters in the DATACLASSIFICATION column of the frequency distribution (FD) table.
- Log on to the Db2U pod:
oc rsh c-db2oltp-iis-db2u-0 bash
- Connect to the Information Analyzer database where the FD tables are persisted:
db2 connect to iadb
- Generate a script that contains an ALTER TABLE command for each of the FD tables that are in the Information Analyzer database:
db2 -x "select trim('alter table ' || trim(tabschema) || '.' || trim(tabname) || ' alter column ' || colname || ' SET DATA TYPE VARGRAPHIC(100);') from syscat.columns where tabschema like 'IAUSER%' and tabname like '%FD' and colname='DATACLASSIFICATION'" > /tmp/MODIFY_FD_TABLES.sql
- Run the script that contains the ALTER TABLE command:
db2 -tvf /tmp/MODIFY_FD_TABLES.sql
Applies to: 4.6
Connections that use Cloud Pak for Data credentials for authentication can't be used in discovery jobs
When you create a discovery job, you cannot add a platform connection that is configured to use Cloud Pak for Data credentials for authentication.
Workaround: Modify the platform connection to use other supported authentication options or select a different connection.
Applies to: 4.6
Incorrect connections that are associated with connected data assets after automated discovery
When you add connected data assets through automated discovery, the associated connection assets might be incorrect. Connections that have the same database and hostnames are indistinguishable to automated discovery, despite different credentials and table names. For example, many Db2 databases on IBM Cloud have the same database and hostnames. An incorrect connection with different credentials might be assigned and then the data asset can't be previewed or accessed.
Applies to: 4.6
Changes to platform-level connections aren't propagated for discovery
After you add a platform-level connection to the data discovery area, any subsequent edit to or deletion of the platform-level connection is not propagated to the connection information in the data discovery area and is not effective.
Workaround: Delete the discovery connection manually. You must have the Access advanced governance permission to be able to complete the required steps:
- Go to Governance > Metadata import
- Go to the Repository Management tab.
- In the Navigation pane, select Browse assets > Data connections.
- Select the connection that you want to remove and click Delete.
Re-add updated platform-level connections to the data discovery area as needed.
Applies to: 4.6
Column analysis fails if system resources or the Java heap size are not sufficient
Column analysis might fail due to insufficient system resources or insufficient Java heap size. In this case, modify your workload management system policies as follows:
- Open the Information Server operations console by entering its URL in your browser:
https://<server>/ibm/iis/ds/console/
- Go to Workload Management > System Policies. Check the following settings and adjust them if necessary:
Job Count setting: If the Java heap size is not sufficient, reduce the number to 5. The default setting is 20.
Job Start setting: Reduce the maximum number of jobs that can start within the specified timeframe from 100 in 10 seconds (which is the default) to 1 in 5 seconds.
Applies to: 4.6
Automated discovery might fail when the data source contains a large amount of data
When the data source contains a large amount of data, automated discovery can fail. The error message indicates that the buffer file systems ran out of file space.
Workaround: To have the automated discovery complete successfully, use one of these workarounds:
- Use data sampling to reduce the number of records that are being analyzed. For example, set the sample size to 10% of the total number of records.
- Have an administrator increase the amount of scratch space for the engine that is running the analysis process. The administrator needs to use the Red Hat OpenShift cluster tools to increase the size of the volume where the scratch space is, typically /mnt/dedicated_vol/Engine in the is-en-conductor pod. Depending on the storage class that is used, the scratch space might be on a different volume. The size requirements for scratch space depend on the workload. As a rule, make sure to have enough scratch space to fit the largest data set that is processed. Then, multiply this amount by the number of similar analyses that you want to run concurrently. For more information about expanding volumes, see the instructions in the OpenShift Container Platform documentation.
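If the storage class supports volume expansion, the persistent volume claim that backs the scratch space can be resized with oc. This is only a sketch: the claim name is a placeholder that you must look up first, and the target size depends on your workload.
# the claim name and target size are placeholders; adjust them to your environment
oc get pvc | grep en-conductor
oc patch pvc <scratch-space-pvc> -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'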
Applies to: 4.6
A connection that was made by using HDFS through the Execution Engine cannot be used with automated discovery
In Platform connections, you can make an HDFS connection two ways: by using HDFS through the Execution Engine or by using Apache HDFS. A connection that was made by using HDFS through the Execution Engine cannot be used with automated discovery.
Workaround: Use the Apache HDFS option to make an HDFS connection with automated discovery.
Applies to: 4.6
Platform connections with encoded certificates cannot be used for discovery
SSL-enabled platform connections that use a base64 encoded certificate cannot be used in discovery jobs. Connections that use decoded certificates will work.
Applies to: 4.6
Certain connections with spaces in the name can't be used in discovery
When you try to set up a discovery job with a platform connection that is configured with an SSL certificate and has a name containing spaces, an error occurs.
Workaround: Remove any spaces from the platform connection name and try adding the connection again.
Applies to: 4.6
In some cases, automated discovery jobs show the status Running although they actually failed
An automated discovery job might appear to be in Running state when it actually failed because some DataStage jobs failed after a restart of the is-en-conductor-0 pod.
If an automated discovery job is stuck in the Running state, cancel it by clicking Cancel the analyze phase for all assets.
Applies to: 4.6
The discovery operation intermittently fails
When the discovery operation fails, WebSphere Application Server reports the error "java.lang.IllegalStateException: This PersistenceBroker instance is already closed" in the log file. Additionally, queries to the XMETA repository fail because of this error.
Workaround: Restart the iis-services pod.
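A minimal sketch for restarting the pod with oc, assuming the pod name starts with iis-services (the pod is re-created automatically after deletion):
# assumes the pod name starts with iis-services
oc delete pod $(oc get pods | grep -i "iis-services" | awk '{print $1}')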
Applies to: 4.6
Connection is deleted from IMAM when running discovery with a new connection pointing to the same source
Discovering the same data source using two different data connections results in the earlier data connection getting deleted. For more information, see Discovering the same data sets using two different data connections gets the earlier data connection deleted.
Applies to: 4.6
Discovery fails for MongoDB schemas or tables where the name contains special characters
If the name of a MongoDB schema or table contains a special character such as a hyphen (-), the asset name is shown without that character in the asset browser. Also, discovery fails for such an asset.
Applies to: 4.6
Inconsistent hostnames in the XMETA database for JDBC connections used in automated discovery
Depending on where a JDBC connection used in automated discovery was created, assets are stored in the XMETA database under these hostnames:
- The JDBC URL if the connection was synced from the default catalog to the XMETA database
- The is-en-conductor pod name if the connection is a platform connection or was created in the automated discovery UI
In some cases, this might lead to duplicate sets of assets in the XMETA database for the same JDBC connection.
Applies to: 4.5.0 and later
Connection might not be usable in discovery after updating the password in IMAM
If the password for a data connection is updated in IMAM, you might not be able to use that connection in automated discovery right after the update.
Workaround: After the connection password is updated, go to the import area for that connection and reimport the data.
Applies to: 4.6.0
Fixed in: 4.6.1
Automated discovery might not work for generic JDBC platform connections with values in the JDBC properties field
Automated discovery does not work with a Generic JDBC connection that was created in the Platform connections UI and has JDBC property information in the JDBC properties field.
Workaround: Append all JDBC properties to the URL in the JDBC url field instead.
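As a purely hypothetical illustration (driver, host, database, and property names are made up, and the exact separator for URL properties depends on the JDBC driver), a property would move from the JDBC properties field into the URL:
Before: JDBC url jdbc:somedb://dbhost:50001/MYDB and JDBC properties sslConnection=true
After: JDBC url jdbc:somedb://dbhost:50001/MYDB;sslConnection=true and an empty JDBC properties field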
Applies to: 4.6
Db2 SSL/TLS connections can't be used in discovery
When you create a discovery job, you can't add a Db2 platform connection that is configured with SSL and uses a custom TLS certificate. When you try to add such a platform connection to an automated discovery or quick scan job, the following error occurs:
Failed to add connection. No connection could be created for discovery. Try again later
The request createDataConnection could not be processed because the following error occurred in the server: The connector failed to connect to the data source. The reported error is: com.ibm.db2.jcc.am.SqlInvalidAuthorizationSpecException: [jcc][t4][201][11237][4.28.11] Connection authorization failure occurred. Reason: Security mechanism not supported. ERRORCODE=-4214, SQLSTATE=28000. Transaction ID: 4899157575.
This error occurs because the security mechanism that is set by default for all SSL connections when a discovery connection is added doesn't match the security mechanism defined at the Db2 server level.
Workaround: Create 2 connections with the same name:
- A platform connection in Data > Platform connections
- A connection for use in discovery in Catalogs > Metadata import
When you set up a discovery job, use the metadata import connection.
Applies to: 4.6
Some connection names aren't valid for automated discovery
If the name of a connection in the default catalog contains special characters, this connection can't be used in automated discovery jobs.
Workaround: Do not use any special characters in connection names.
Applies to: 4.6
Can't view the results of a data rule run
In a data quality project, the following error occurs when you try to view the run details of a data rule.
CDICO0100E: Connection failed: Disconnect non transient connection error: [jcc][t4][2043][11550][4.26.14] Exception No route to host error: Error opening socket to server /172.30.217.104 on port 50,000 with message: No route to host (Host unreachable). ERRORCODE=-4499, SQLSTATE=08001
Workaround: Complete the following steps:
- Log in to the iis-services pod by running the following command:
oc rsh `oc get pods | grep -i "iis-services" | awk -F' ' '{print $1}'` bash
- Run the following command for each data quality project where data rules are defined, replacing the placeholders with appropriate credentials and the project name:
/opt/IBM/InformationServer/ASBServer/bin/IAAdmin.sh -user <username> -password <password> -migrateXmetaMetadataToMicroservice -projectName <DQ-PROJECT_NAME> -forceUpdate true
Applies to: 4.6
Automated discovery might fail due to the length of the import area
Running automated discovery on deep levels of the select source can result in creating an import area name that exceeds 255 characters, which causes automated discovery to fail.
Workaround: Complete the steps in Enabling discovery of assets with long names.
Applies to: 4.6
Create external bundle location option not supported when scheduling analysis jobs
When you set up scheduling for analyzing a data asset in a data quality project, the option Create external bundle location is shown. You can select this option and also specify an external bundle location. However, no such file is created when you click Analyze.
Applies to: 4.6
Can't preview assets added to the default catalog by running automated discovery on a connection that uses credentials from a vault
When you run automated discovery on a connection that uses credentials from a vault and have selected to publish the discovered data assets to the default catalog, the data assets cannot be previewed in the catalog.
Applies to: 4.6
Connections that use credentials from a vault are not synced to the default catalog
Some types of connections that are used in automated discovery are synchronized to the default catalog. However, such connections are not synchronized if they use credentials from a vault.
Applies to: 4.6
Can't refine data assets added to the default catalog by running automated discovery on a connection that uses credentials from a vault
You can't refine a data asset that you added from the default catalog to the project if that asset was published to the catalog as a result of running automated discovery on a connection that uses credentials from a vault.
Workaround: Complete the following steps:
- Add the respective connection to the project.
- Create and run a metadata import to import the data assets into the project.
- Refine the imported data assets as required.
Applies to: 4.6
Overriding rule or rule set runtime settings at the columns level causes an error
In a data quality project, if you try to override runtime settings for a rule or a rule set when you run the rule or rule set on the Rules tab in the Column page, an error occurs. Instead of the Data quality user interface, an error message is displayed.
Workaround: Override the rule or rule set run time settings only when you run the rule or rule set on the Rules tab in the Project or Data asset page.
Applies to: 4.6
When a generic JDBC connection for Snowflake is synced to the default catalog, the connection type changes to native Snowflake
If you run automated discovery on a generic JDBC connection for Snowflake that was created as a platform connection and publish the results, the connection is synced to the default catalog as a native Snowflake connection. When you add the platform connection as a data source to Watson Query, business terms that were added to assets from that connection during automated discovery won't show up.
Workaround: Use native Snowflake connections in automated discovery instead of generic JDBC connections.
Applies to: 4.6.0
Can't update the password of connections used in metadata import areas from the command line
You cannot use the updateDCPwd action of the imam command-line interface to update the password for a data connection that is used in a metadata import area. With an incorrect password, some services, such as column analysis, will not work properly in automated discovery or data quality projects.
Workaround: To update a connection password, follow the instructions in one of these IBM Support documents:
- Unable to update data connection password of a data source in imam by using the command-line interface in Cloud Pak for Data
- Unable to update data connection password of a data source in imam by using the command-line interface in Cloud Pak for Data (vault-enabled environment)
Applies to: 4.6.5
Setting up reporting
Cannot load schema for SSL connections to PostgreSQL sources during reporting setup
If you select a PostgreSQL connection that is configured to use SSL communication to configure the reporting database, the database schemas can't be loaded. This also applies to the existing PostgreSQL connections that have an SSL certificate configured.
Workaround: Remove the SSL certificate that is part of the PostgreSQL connection.
Applies to: 4.6.4
Fixed in: 4.6.5
Quick scan migration
You might encounter these known issues and restrictions when you migrate quick scan jobs.
Generic JDBC connections to Google BigQuery sources used in quick scan might time out during migration
Before you can migrate quick scan jobs that were run against a Google BigQuery data source by using a generic JDBC connection, validate the connection. In the Platform connections page, open the connection and test it to ensure it works. If the connection test fails, create a platform connection of the type Google BigQuery with the same name as the generic JDBC connection. Otherwise, the connection might time out during migration.
Applies to: 4.6
Metadata import in projects
You might encounter these known issues and restrictions when you work with metadata import.
Can't publish more than 700 assets at once
When you publish assets from within a metadata import asset, you can't publish more than 700 assets in a single step.
Workaround: If the metadata import contains more than 700 assets and you want to publish all of them to a catalog, publish the assets in batches of up to 700 assets per publish.
Applies to: 4.6
Column information might not be available for data assets imported through lineage import
When a metadata import is configured to get lineage from multiple connections, and databases with the same name exist in these data sources, the tables from these databases are imported but their column information is not.
Workaround: Configure a separate metadata import for each connection pointing to same-named databases.
Applies to: 4.5.0 and later
Running concurrent metadata import jobs on multiple metadata-discovery pods might fail
When you run several metadata import jobs in parallel on multiple metadata-discovery pods, an error might occur, and an error message similar to the following one is written to the job run log:
Error 429: CDICW9926E: Too many concurrent user requests: 50
Workaround: You can resolve the issue in one of these ways:
- Increase the maximum number of concurrent requests allowed per user. In the wdp-connect-connection pod, change the value of the MAX_CONCURRENT_REQUESTS_PER_USER environment variable, for example:
MAX_CONCURRENT_REQUESTS_PER_USER: 100
- If you don't have enough resources to increase the number of concurrent requests per user, reduce the number of threads connecting to the source. By default, 20 worker threads in a metadata-discovery pod access the wdp-connect-connection pod concurrently. If you define 4 pods for metadata import, 80 worker threads will access the data source at the same time. In a metadata-discovery pod, change the value of the discovery_create_asset_thread_count environment variable, for example:
discovery_create_asset_thread_count: 10
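Both values are environment variables on the respective pods, so they are typically changed on the owning workloads rather than on individual pods. A minimal sketch, assuming the deployments carry the same names as the pods:
# assumes deployments named like the pods; verify the names in your cluster first
oc set env deployment/wdp-connect-connection MAX_CONCURRENT_REQUESTS_PER_USER=100
oc set env deployment/metadata-discovery discovery_create_asset_thread_count=10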
Applies to: 4.6
Metadata import jobs might be stuck in running state
Metadata import jobs that are stuck in running state might not be cancelled automatically.
Workaround: Cancel the job manually:
- Go to a Jobs page, either the general one or the one of the project that contains the metadata import asset.
- Look for the respective job and cancel it.
Applies to: 4.6.0, 4.6.1, and 4.6.2
Fixed in: 4.6.3
Metadata import jobs might be stuck due to issues related to RabbitMQ
If the metadata-discovery pod starts before the rabbitmq pods are up after a cluster reboot, metadata import jobs can get stuck while attempting to get the job run logs.
Workaround: To fix the issue, complete the following steps:
- Log in to the OpenShift console by using admin credentials.
- Go to Workloads > Pods.
- Search for rabbitmq.
- Delete the rabbitmq-0, rabbitmq-1, and rabbitmq-2 pods. Wait for the pods to be back up and running.
- Search for discovery.
- Delete the metadata-discovery pod. Wait for the pod to be back up and running.
- Rerun the metadata import job.
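If you prefer the command line over the console, the same pod restarts can be triggered with oc; the namespace is a placeholder for your Cloud Pak for Data instance project:
# <cpd-instance-namespace> is a placeholder for your instance namespace
oc delete pod rabbitmq-0 rabbitmq-1 rabbitmq-2 -n <cpd-instance-namespace>
oc delete pod -n <cpd-instance-namespace> $(oc get pods -n <cpd-instance-namespace> | grep metadata-discovery | awk '{print $1}')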
Applies to: 4.6
After upgrading to 4.6.0, the metadata import option Get lineage is disabled
After upgrading from 4.5.x to 4.6.0, the metadata import Get lineage option is disabled. Rerunning existing lineage imports also fails.
Workaround: Complete the following steps:
-
Change the port for the manta-server service from 8080 to 8443:
- Log in to the OpenShift console.
- Go to Networking > Services and search for manta-server.
- Delete the manta-server service to have it created anew.
- Make sure that port and targetPort for the newly created service are set to 8443.
-
Restart the manta-dataflow pod.
-
Complete the following changes in the ibm-nginx pod:
-
Go to Workloads > Deployments.
-
Search for ibm-nginx and scale the pod down to 1.
After that single ibm-nginx pod is up, click the pod name and go to the Terminal tab.
-
Edit the .conf file:
vi /user-home/global/nginx-conf.d/manta-routes-flow-tls.conf
-
Update the content as follows:
set_by_lua $nsdomain 'return os.getenv("NS_DOMAIN")';
location /manta-dataflow-server {
  set $manta_server manta-server.$nsdomain;
  proxy_set_header Host $host;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  proxy_pass https://$manta_server:8443;
  proxy_buffer_size 32k;
  proxy_busy_buffers_size 40k;
  proxy_buffers 64 4k;
}
location /manta-admin-gui {
  set $manta_admin manta-admin.$nsdomain;
  proxy_set_header Host $host;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  proxy_pass https://$manta_admin:8181;
  proxy_buffer_size 32k;
  proxy_busy_buffers_size 40k;
  proxy_buffers 64 4k;
}
-
-
Change the network policy:
-
Go to Networking > NetworkPolicies and search for
manta-dataflow
. -
Open the yaml tab.
-
Change the ports as shown:
ports:
- port: 8443
  protocol: TCP
- port: 8181
  protocol: TCP
-
Applies to: 4.6.0
Can’t create lineage imports after upgrading from 4.5.3 to 4.6.3
After upgrading from 4.5.3 to 4.6.3, you can't create metadata imports for getting lineage.
Workaround: After the upgrade, an IBM Cloud Pak for Data administrator must delete the Open Manta Integration export connection created in the previous release.
- Obtain an authorization token as described in Generating an authorization token or API key.
- Delete Open MANTA Integration Export_Connection from the MANTA Automated Data Lineage configuration by using the API:
curl -X DELETE \
  'https://<cpd__url_>/manta-admin-gui/public/configurator/v1/connections/Open%20MANTA%20Integration%20Export/Open%20MANTA%20Integration%20Export_Connection' \
  -H 'Authorization: Bearer {token}'
Applies to: 4.6.3
Fixed in: 4.6.4
Can’t create metadata import assets for getting lineage from Db2 SSL or Microsoft SQL Server connections
In a new installation of 4.6.3, you can't create metadata import assets for getting lineage if the connection type is Db2 SSL or Microsoft SQL Server.
Workaround: To work around the issue, complete these steps:
- Log in to the OpenShift console by using admin credentials.
- Navigate to Workloads > ConfigMaps > metadata-discovery-service-config.
- Set the manta_scanner_validation_enabled property to false.
- Restart the metadata-discovery pod.
This workaround disables the validation of connections established through MANTA Automated Data Lineage when the metadata import asset is created. Thus, users can create invalid metadata import assets, for example, where the privileges aren't sufficient. Such metadata imports fail when run.
Applies to: 4.6.3 and later
Metadata enrichment
You might encounter these known issues and restrictions when you work with metadata enrichment.
In some cases, you might not see the full log of a metadata enrichment job run in the UI
If the list of errors in a metadata enrichment run is exceptionally long, only part of the job log might be displayed in the UI.
Workaround: Download the entire log and analyze it in an external editor.
Applies to: 4.6
Schema information might be missing when you filter enrichment results
When you filter assets or columns in the enrichment results on source information, schema information might not be available.
Workaround: Rerun the enrichment job and apply the Source filter again.
Applies to: 4.6
Metadata enrichment runs on connections with personal credentials might fail
When you set up a metadata enrichment for a connection with personal credentials that was created by another user, the metadata enrichment job fails unless you unlocked the connection with your credentials before. The metadata enrichment job log then contains one of the following error messages:
Data asset cannot be profiled as the user do not have stored credentials of the connection.
or
Data asset cannot be profiled as the data asset is associated to a connection configured to use personal credentials and the user has not yet provided credentials for that connection.
Workaround: If you are authorized to access the connection, unlock the connection with your credentials. In the project, open one of the assets in your metadata enrichment scope. Enter your credentials on the Preview tab or on the Profile tab. Then, rerun the metadata enrichment.
Applies to: 4.6.0
Fixed in: 4.6.1
Issues with search on the Assets tab of a metadata enrichment asset
When you search for an asset on the Assets tab of a metadata enrichment asset, no results might be returned. Consider these limitations:
- Search is case sensitive.
- The result contains only records that match the exact search phrase or start with the phrase.
Applies to: 4.6
Information about removed business terms isn't stored if those are removed in bulk
When you remove a business term from several columns in the metadata enrichment results at once, information about this removal might not be saved, and thus no negative feedback is provided to the term assignment services.
Workaround: Remove the business term from each column individually. To do so, go to the Governance tab of the column details.
Applies to: 4.6.0
Fixed in: 4.6.1
Term assignments can't be removed after a term is deleted
After a term is deleted, you can no longer remove assignments of that term from assets or columns.
Workaround: You can apply one of these workarounds:
-
Turn off checking whether a term exists by setting the wkc_term_assignment_feature_termupdate_check_terms variable to false:
- Log in to the cluster. Run the following command as a user with sufficient permissions to complete this task:
oc login <OpenShift_URL>:<port>
- Edit the Watson Knowledge Catalog custom resource by running the following command:
oc edit WKC wkc-cr
- Add the following entry after the top-level
spec
element in the yaml:
wkc_term_assignment_feature_termupdate_check_terms: "false"
Make sure to indent the entry by two spaces.
The change is picked up the next time the operator is reconciled, which can take 5 - 10 minutes. You can check in these ways whether the change is applied:
- Check whether the wkc-term-assignment pod was restarted.
- Run the command oc get WKC wkc-cr -o yaml. The status information shows if and when the reconciliation was run.
Remove all terms by using the bulk action Remove business terms and reassign all terms manually. However, note that this bulk action also removes automatic assignments, suggestions and rejections, which will impact future automatic term assignments.
Applies to: 4.6.1 and 4.6.2
Fixed in: 4.6.3
Profiling in catalogs, projects, and metadata enrichment might fail for Teradata connections
If a generic JDBC connection for Teradata exists with a driver version before 17.20.00.15, profiling in catalogs and projects and metadata enrichment of data assets from a Teradata connection fail with an error message similar to the following one:
2023-02-15T22:51:02.744Z - cfc74cfa-db47-48e1-89f5-e64865a88304 [P] ("CUSTOMERS") - com.ibm.connect.api.SCAPIException: CDICO0100E: Connection failed: SQL error: [Teradata JDBC Driver] [TeraJDBC 16.20.00.06] [Error 1536] [SQLState HY000] Invalid connection parameter name SSLMODE (error code: DATA_IO_ERROR)
Workaround: Apply one of these workarounds:
- If you don't work with automated discovery or data quality projects, complete these steps:
  - Go to Data > Platform connections > JDBC drivers and delete the existing JAR file for Teradata (terajdbc4.jar).
  - Edit the generic JDBC connection, remove the selected JAR files, and add SSLMODE=ALLOW to the JDBC URL.
- If you use a Teradata connection that was created as a generic JDBC platform connection in automated discovery or any data quality project, complete these steps:
  - Go to Data > Platform connections > JDBC drivers and delete the existing JAR file for Teradata (terajdbc4.jar).
  - Upload the latest Teradata driver version (17.20.00.15). Make sure to use the same name for the JAR file as for the previously used driver. Otherwise, automated discovery will fail because the driver can't be found.
Applies to: 4.6.3 and later
Turning off the schedule for an enrichment might not be possible
After you configure a repeating schedule for an existing metadata enrichment, you can't disable this schedule the next time you edit the enrichment. When you change the setting to Schedule off, you can't save the changes and an error is shown.
Workaround: Edit the metadata enrichment, clear the Repeat checkbox in the schedule configuration, and save the metadata enrichment. This stops future repetition of the enrichment job. Depending on the initial configuration, the enrichment job might run once.
Applies to: 4.6.4
Fixed in: 4.6.5
Issues with updating an existing metadata enrichment
When you edit an existing metadata enrichment, the following issues can occur:
- You can't clear the description of the metadata enrichment.
- Changes to custom sampling settings aren't saved if the sample size is provided in percent.
Workaround: To work around the issues:
- To remove the description, overwrite the existing description with a single character such as a hyphen (-).
- To change the custom sample settings for percentage samples:
- Change the sample size setting to Fixed Number and save your changes.
- Edit the metadata enrichment again, update any of the custom sampling options, and save your changes.
Applies to: 4.6.4
Fixed in: 4.6.5
Running primary key or relations analysis doesn't update the enrichment and review statuses
The enrichment status is set or updated when you run a metadata enrichment with the configured enrichment options (Profile data, Analyze quality, Assign terms). However, the enrichment status is not updated when you run a primary key analysis or a relationship analysis. In addition, the review status does not change from Reviewed to Reanalyzed after review if new keys or relationships were identified.
Applies to: 4.6.4
Can't run primary key analysis on data assets from an SAP OData data source
When you try to run primary key analysis on data assets from an SAP OData data source, analysis fails with the error message java.lang.IllegalArgumentException: The field <table>.<column> doesn't exist in the schema.
Applies to: 4.6.4
Can't run primary key analysis on data assets from an Apache Cassandra data source
When you try to run primary key analysis on data assets from an Apache Cassandra data source, analysis fails with the error message java.util.concurrent.ExecutionException: com.ibm.wdp.service.common.exceptions.WDPException: KEYA1001E: The REST API service call was not handled successfully.
Applies to: 4.6.4
Fixed in: 4.6.5
Running relationship analysis on more than 200 data assets at once renders no results
If you try to run relationship analysis on more than 200 data assets at once, the analysis job times out after about 30 minutes and does not return any results.
A system administrator can check the Spark executor log for a timeout message:
-
Log in to the cluster.
-
Get the wdp-profiling logs:
Run the following command to get the pod ID.
oc get pod | grep profiling
-
Run the following command replacing wdp-profiling-pod-id with the pod ID you obtained in the previous step.
oc exec -i <wdp-profiling-pod-ID> -- cat /logs/messages.log > <wdp-profiling-pod-ID>.log
The log contains a message similar to the following one:
Key Analysis Engine tasks were started, hbTaskId: <YYYY-ZZZZ>
-
-
Locate the hb_tasks log by using the following API:
GET <CPD URL>/v1/hb_tasks/<YYYY-ZZZZ>/logs
This call will return the path to the log file:
"/opt/ibm/wlp/output/defaultServer/<YYYY-ZZZZ>.log"
-
Get the log file by running the following command:
oc exec -i <wdp-profiling-pod-ID> -- cat /opt/ibm/wlp/output/defaultServer/<YYYY-ZZZZ>.log > <YYYY-ZZZZ>.log
-
Check the log file for an error message similar to this one:
[executor-heartbeater] Executor: 94 - Issue communicating with driver in heartbeater org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10000 milliseconds]. This timeout is controlled by spark.executor.heartbeatInterval
Workaround: Select less than 200 data assets for each relationship analysis run.
Applies to: 4.6.4
Fixed in: 4.6.5
Metadata enrichment job is stuck in running state
If the metadata enrichment service manager can't process the events for tracking the metadata enrichment job status, the job can get stuck.
Workaround: Restart the metadata enrichment service manager pod. To restart the wkc-mde-service-manager pod, you can scale the pod to zero and then back to its previous value:
oc scale deploy wkc-mde-service-manager --replicas=0
oc scale deploy wkc-mde-service-manager --replicas=N
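To restore the previous replica count after scaling down, you can record it first, for example:
# returns the current replica count so that you can reuse it as N
oc get deploy wkc-mde-service-manager -o jsonpath='{.spec.replicas}'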
Applies to: 4.6.4 and later
When running enrichment on a large number of assets, profiling might fail for a subset
When metadata enrichment is run on a large number of data assets, processing might fail for a subset of the data assets. For each asset that couldn't be enriched, the following error message is written to the log of the metadata enrichment job:
DataProfile cannot be created as a profile already exists in the DataAsset with id \"<asset_ID>\"
Workaround: Rerun enrichment on the assets for which processing failed.
Applies to: 4.6.5
If you do a bulk removal of data classes, the Governance tab in the column details might temporarily be missing data class information
If you remove data classes from a set of columns in one go by using More > Remove data class, the respective data classes should be added to the list of suggested data classes for the columns. Instead, the data class section is empty and no button for adding a data class is shown.
Workaround: Leave the metadata enrichment results page and return, or refresh the page.
Applies to: 4.6.5
Restarts of the profiling pod might keep the metadata enrichment job in running state
When metadata enrichment is run on a large number of data assets, the wdp-profiling pod might restart several times, which can cause the metadata enrichment job to be stuck in running state.
Workaround: Scale up the wdp-profiling pod to 2 replicas and rerun the job.
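A minimal sketch, assuming the wdp-profiling deployment has the same name as the pod prefix:
# assumes a deployment named wdp-profiling; verify the name in your cluster first
oc scale deploy wdp-profiling --replicas=2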
Applies to: 4.6.5
When running enrichment on a large number of assets, profiling might fail for some assets due to gateway errors
When metadata enrichment is run on a large number of data assets, profiling might fail for some of the data assets due to gateway errors.
Workaround: Check the log of the metadata enrichment job for the IDs of the assets that couldn't be profiled. Rerun metadata enrichment for these assets. Alternatively, you can profile each asset individually from the asset's Profile page.
Applies to: 4.6.5
Business term or data class information panels in metadata enrichment do not show the correct information
When you click an assigned or suggested term or data class in the Governance tab in the metadata enrichment results, an information panel with a subset of properties defined for the artifact is shown. Not all sections of this information panel are currently filled properly.
Workaround: To see all properties of an artifact, open the artifact directly.
Applies to: 4.6
Data quality in projects
You might encounter these known issues and restrictions when you work with data quality assets in projects.
Can't run rule on CSV file if the output table contains column names with special characters
When you define the output table for a rule that is to run on a CSV file, you can select column names with special characters, which causes the DataStage flow to fail.
Workaround: Before you save your rule, make sure that any output column names comply with the following conventions:
- The name starts with an alphabetic character.
- The name does not contain any special characters other than underscores.
Applies to: 4.6.0
Fixed in: 4.6.1
Rules run on columns of type time in data assets from Amazon Redshift data sources do not return proper results
For data assets from Amazon Redshift data sources, columns of type time are imported with type timestamp. You can't apply time-specific data quality rules to such columns.
Applies to: 4.6.0, 4.6.1, 4.6.2, 4.6.3, and 4.6.4
Fixed in: 4.6.5
Rule testing in the review step fails if the bound data comes from an Apache Hive data source connected through Knox
In the review step of creating a rule, when you test a rule that is bound to a data asset from an Apache Hive data source connected through Knox, the test fails with an error message similar to this one:
Exception SCAPIException was caught during processing of the request: CDICO2034E: The property [zookeeper_discovery] is not supported.
Workaround: Configure such a rule with sampling to run it against a limited number of rows, and skip the test step during rule creation. After an initial run provides meaningful results, you can change the sample size or remove sampling altogether.
Applies to: 4.6.3
Fixed in: 4.6.4
Rules binding columns of type BigInt in data assets coming from a Snowflake connection might fail
A data quality rule will fail if a variable in the rule expression is bound to a column of the data type BigInt in a data asset coming from a Snowflake connection and that column contains negative numbers. Running such a rule will cause the following error:
java.lang.IllegalArgumentException The value of <some_negative_number> for field <column_name> is either too small or too large to fit into the BigInt type.
Applies to: 4.6.4
Fixed in: 4.6.5
Data from embedded array fields in MongoDB tables might not be written to the output table of a data quality rule
If the configured output table for a data quality rule for data assets from a MongoDB data source includes a column that contains an array, the respective column data is not written to the output table.
Applies to: 4.6.4
Fixed in: 4.6.5
Issues with rules in FIPS-enabled environments
In a FIPS-enabled environment, the following issues can occur if the connection is configured to use SSL for secure communication:
- You cannot create SQL-based rules for such a connection.
- When you test a rule created from data quality definitions, the test fails with an error message similar to the following one:
  Exception SCAPIException was caught during processing of the request: CDICO0100E: The SSL certificate format is not valid; the certificate must conform to X509 PM format. Certificate parsing error: Key store error: JKS not found, user certificate:
  However, actual runs of such data quality rules are successful.
These issues do not occur if the connection is not configured to use SSL.
Applies to: 4.6.4 and later
Rules run on columns of type timestamp with timezone fail
The data type timestamp with timezone is not supported. You can't apply data quality rules to columns with that data type.
Applies to: 4.6.5
Results of rule test and rule run deviate if the expression contains a comparison between string and numeric columns
If the rule expression contains a comparison between columns with string and numeric values, the rule test and the rule run return different results.
Workaround: When you configure the expression in your data quality definition, use the val(x) expression (or the parse x as a number block element) to convert the string values of column x to numeric values for comparison.
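For example, with hypothetical columns code_str (string) and code_num (numeric), an expression of the form code_str = code_num would produce deviating test and run results; converting the string side avoids the deviation. A sketch of the pattern:
val(code_str) = code_num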
Applies to: 4.6.5
Rule testing in the review step fails if the rule contains joins for columns from SAP HANA data sources
If a rule contains bindings that require to join columns from a table from an SAP HANA data source, the rule test in the review step fails with an error message similar to this one:
Exception IllegalArgumentException was caught during processing of the request: Field t1_col2 does not exist in the input row
Applies to: 4.6.5
Rules with random sampling fail for data assets from Teradata data sources
If you bind data from a Teradata data source in a rule that is configured to use random sampling, the rule job fails to run. The underlying DataStage flow supports only the sampling types none and block level sampling for Teradata data sources.
Applies to: 4.6.5
MANTA Automated Data Lineage
You might encounter these known issues and restrictions when MANTA Automated Data Lineage is used for capturing lineage.
Metadata import jobs for getting lineage might take very long to complete
If multiple lineage scans are requested at the same time, the corresponding metadata import jobs for getting lineage might take very long to complete because MANTA Automated Data Lineage workflows can't run in parallel; they are executed sequentially.
Applies to: 4.6.0 and later
Lineage import fails for schemas with names that contain special characters
If the scope of a metadata import for getting lineage includes a schema with a name that contains special characters, the lineage import fails.
Workaround: Make sure the scope of your metadata import does not include any schemas with names that contain special characters.
Applies to: 4.6.0
Fixed in: 4.6.1
The metadata import option "Get lineage" gets intermittently disabled
Although MANTA Automated Data Lineage for IBM Cloud Pak for Data is correctly enabled and the script count is not exhausted, the metadata import Get lineage option might be disabled.
Workaround: Restart the manta-keycloak pod to enable the option again:
-
Log in to the cluster as a project administrator.
-
Get the full name of the manta-keycloak pod by running the following command:
oc get pods -n ${PROJECT_CPD_INSTANCE} | grep manta-keycloak
-
Restart the pod by running the following command:
oc delete pod <manta-keycloak-pod> -n ${PROJECT_CPD_INSTANCE}
Applies to: 4.6.0
Fixed in: 4.6.1
Chrome security warning for Cloud Pak for Data deployments where MANTA Automated Data Lineage for IBM Cloud Pak for Data is enabled
When you try to access a Cloud Pak for Data cluster that has MANTA Automated Data Lineage for IBM Cloud Pak for Data enabled from the Chrome web browser, the message Your connection is not private is displayed and you can't proceed.
This is due to MANTA Automated Data Lineage for IBM Cloud Pak for Data requiring an SSL certificate to be applied and occurs only if a self-signed certificate is used.
Workaround: To bypass the warning for the remainder of the browser session, type thisisunsafe anywhere in the window. Note that this code changes from time to time; the code mentioned here is valid as of the general availability date of Cloud Pak for Data 4.6.0. You can search the web for the updated code if necessary.
Applies to: 4.6
Lineage imports fail if the data scope is narrowed to schema
If the data scope of your lineage import is narrowed to schema for one or more connections, the import fails.
Workaround: Complete the following steps:
-
Log in to an infrastructure node of your Red Hat OpenShift cluster.
-
Create the files config.properties and openExportCommon.properties, each with this content:
manta.cli.systemTruststore.path=/opt/mantaflow/keystores/tls-truststore.pkcs12
manta.cli.systemTruststore.password=${MTLS_KEYSTORE_PASSWORD}
manta.cli.systemTruststore.verifyCertificateHostname=false
-
Copy the files to the manta-admin-gui pod:
- Get the ID of the manta-admin-gui pod.
oc get pod | grep manta-admin-gui
-
Run the following command. Replace manta-admin-gui-pod-id with the pod ID obtained in the previous step.
oc cp config.properties <manta-admin-gui-pod-id>:/opt/mantaflow/cli/scenarios/manta-dataflow-cli/conf/
-
Run the following command. Replace manta-admin-gui-pod-id with the pod ID obtained in the previous step.
oc cp openExportCommon.properties <manta-admin-gui-pod-id>:/opt/mantaflow/cli/scenarios/manta-dataflow-cli/conf/
Applies to: 4.6.1 and 4.6.2
Fixed in: 4.6.3
Rerun of a lineage import fails if assets were deleted from the source
When you rerun a lineage import after assets were deleted from the data source, the reimport fails with an error message similar to this one:
message" : "This error occurred while accessing the connectors service: The assets request failed: CDICO2005E: Table could not be found: SCHEMA.TABLE. If the table exists, then the user may not be authorized to see it.
Applies to: 4.6.3 and later
The metadata import option Get Lineage gets intermittently disabled
At times, you will see the Get lineage tile disabled when you create a metadata import and the following message is displayed:
Entitlement information couldn't be retrieved
Workaround: Restart the manta-admin-gui and manta-dataflow pods:
- Log in to the Red Hat OpenShift console.
- Go to Deployments.
- Search for manta-admin-gui and open the entry.
- Scale the pod down, then scale it back up.
- Repeat steps 3 and 4 for manta-dataflow.
Applies to: 4.6.4
Lineage
You might encounter these known issues and restrictions with lineage.
Assets connected through promoted lineage flows appear on column lineage
To improve the performance of lineage for assets that are higher in the hierarchy, business lineage promotes lineage flows from lower levels to higher levels. As a result, other assets that are connected through a promoted lineage flow can appear in column lineage.
Applies to: 4.6.0 and 4.6.1
Fixed in: 4.6.2
Reimporting lineage after upgrade might lead to flows pointing to placeholder assets
When you rerun a metadata import for getting lineage after upgrading, some flows might point to placeholder assets. The problem occurs for lineage imported with metadata import before version 4.6.1.
Workaround: Visit IBM Cloud Pak for Data support for assistance.
Applies to: 4.6.1 and later
Missing column names in lineage for Netezza Performance Server and Google BigQuery data assets
Column names are missing when you import lineage for data assets from Netezza Performance Server and Google BigQuery data sources.
Applies to: 4.6.3
Fixed in: 4.6.4
Can't edit custom relationships
Users can't edit target-to-source custom relationships for artifacts, or source-to-target and target-to-source custom relationships for categories.
Workaround: You can edit target-to-source custom relationships for an artifact by accessing the source artifact. You can't edit custom relationships for categories.
Applies to: 4.6.3
Fixed in: 4.6.5
Predefined link value can’t be displayed correctly
When a custom workflow template includes a predefined link value, the link value isn't displayed and the message No link set appears.
Workaround: In a custom workflow that uses a predefined link value, don't set the value to the plain link but to a JSON string of the following pattern: '{"name":"ANY NAME TO DESCRIBE THE LINK", "url":"URL"}'
Applies to: 4.6.5 Fixed in: 4.7
Accessing the lineage tab in the user interface does not work
In some instances, after applying the license to access lineage, the tab for lineage stays disabled.
Workaround: Restart the manta-admin-gui and manta-dataflow pods.
Follow these steps to restart the pods:
- Launch the OpenShift console.
- Navigate to Deployments.
- Search for the manta-admin-gui pod and open it.
- Scale down the manta-admin-gui pod and then scale it back up.
- Search for the manta-dataflow pod and open it.
- Scale down the manta-dataflow pod and then scale it back up.
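If you prefer the command line, the same scale down and up can be done with oc, assuming deployments named manta-admin-gui and manta-dataflow:
# assumes deployments named manta-admin-gui and manta-dataflow; verify the names first
oc scale deployment manta-admin-gui --replicas=0
oc scale deployment manta-admin-gui --replicas=1
oc scale deployment manta-dataflow --replicas=0
oc scale deployment manta-dataflow --replicas=1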
Applies to: 4.6.5 and 4.6.6
Parent topic: Limitations and known issues in Cloud Pak for Data