Known issues and limitations

The following limitations and known issues apply to watsonx.data intelligence and watsonx.data integration.

Known issues for watsonx.data intelligence

Ownership of subdomains within private business domains in Data Product Hub cannot be individually modified

If the visibility of the parent domain is set to Selected community members only, then it is not possible to change individual ownership of any of the subdomains inside.

Workaround: To modify the owners of a subdomain, modify the owners of the parent domain so that the subdomains inherit the ownership from the parent domain.

Catalog asset search doesn't support special characters

If search keywords contain any of the following special characters, the search filter doesn't return the most accurate results.

Special characters:

. + - && || ! ( ) { } [ ] ^ " ~ * ? : \

Workaround: To obtain the most accurate results, search only for the keyword after the special character. For example, instead of AUTO_DV1.SF_CUSTOMER, search for SF_CUSTOMER.
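The workaround can be applied programmatically before a search term is submitted. The following is a minimal sketch, not part of the product: the function name keyword_after_special is hypothetical, and it treats a single & or | as special, which is a simplification of the && and || operators in the list.

```python
import re

# Characters from the list above. Assumption: single & and | are treated
# as special, which also covers the two-character operators && and ||.
SPECIAL_CHARS = '.+-&|!(){}[]^"~*?:\\'


def keyword_after_special(term: str) -> str:
    """Return the part of a search term that follows the last special character.

    Terms without special characters are returned unchanged.
    """
    pattern = "[" + re.escape(SPECIAL_CHARS) + "]"
    # Split on every special character and drop empty fragments.
    parts = [p.strip() for p in re.split(pattern, term) if p.strip()]
    return parts[-1] if parts else ""


print(keyword_after_special("AUTO_DV1.SF_CUSTOMER"))  # SF_CUSTOMER
```

If the term ends with a special character, the sketch falls back to the last non-empty fragment before it.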

Predefined governance artifacts might not be available

If you don't see any predefined classifications or data classes, reinitialize your tenant by using the following API call:

curl -X POST "https://api.dataplatform.cloud.ibm.com/v3/glossary_terms/admin/initialize_content" -H "Authorization: Bearer $BEARER_TOKEN" -k

Assets are blocked if evaluation fails

In a catalog with policies enforced, file-based data assets that have a header can't have duplicate column names, and column names can't contain a period (.) or a single quotation mark (').

If evaluation fails, the asset is blocked to all users except the asset owner. All other users see an error message that the data asset cannot be viewed because evaluation failed and the asset is blocked.
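To catch these header problems before a file is added to a catalog, the column names can be checked locally. The sketch below is only an illustration under stated assumptions: header_violations is a hypothetical helper, and it checks only the three restrictions named above, not whatever else policy evaluation might verify.

```python
from collections import Counter


def header_violations(columns):
    """Return a list of problems found in a file's header column names."""
    problems = []
    counts = Counter(columns)
    for name, occurrences in counts.items():
        if occurrences > 1:
            problems.append(f"duplicate column name: {name!r}")
        if "." in name:
            problems.append(f"period in column name: {name!r}")
        if "'" in name:
            problems.append(f"single quotation mark in column name: {name!r}")
    return problems


for problem in header_violations(["id", "id", "cust.name", "owner's_note"]):
    print(problem)
```

An empty result means that none of the three restrictions is violated.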

Issues with the Microsoft Excel add-in

The following issues are known for the Review metadata add-in for Microsoft Excel:

  • When you open the drop-down list to assign a business term or a data class, the entry Distinctive name is displayed as the first entry. If you select this entry, it shows up in the column but does not have any effect.

  • Updating or overwriting existing data in a spreadsheet is currently not supported. You must use an empty template file whenever you retrieve data.

  • If another user works on the metadata enrichment results while you are editing the spreadsheet, the other user's changes can get lost when you upload the changes that you made in the spreadsheet.

  • Only assigned data classes and business terms are copied from the spreadsheet columns Assigned / suggested data classes and Assigned / suggested business terms to the corresponding entry columns. If multiple business terms are assigned, each one is copied to a separate column.

Imported lineage mappings don't appear in the UI

You can import lineage mappings (Add to catalog > Lineage mapping), but the lineage mappings don't appear in the data lineage UI. The lineage mapping option is available only for business lineage available with IBM Knowledge Catalog.

Folder location doesn't automatically update when changing target asset in Data Refinery flow settings

When you change the target asset in the Data Refinery Flow settings, the folder location doesn't automatically update in the View Info panel if the new target asset is located in a different folder.

Workaround: Manually update the folder location in the Flow settings to match the location of the new target asset.


Target table loss and job failure when you use the Update option in a Data Refinery flow

Using the Update option for the Write mode target property for relational data sources (for example Db2) replaces the original target table and the Data Refinery job might fail.

Workaround: Use the Merge option as the Write mode and Append as the Table action.

Data column headers cannot contain special characters

Data with column headers that contain special characters might cause Data Refinery jobs to fail, and give the error Supplied values don't match positional vars to interpolate.

Workaround: Remove the special characters from the column headers.
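One way to apply this workaround to a file before it is used in a flow is to rewrite the header row. The following sketch makes assumptions that are not in the product documentation: the text does not define "special characters", so the example treats anything other than letters, digits, and underscores as special, and sanitize_headers is a hypothetical helper name.

```python
import re


def sanitize_headers(headers):
    """Replace runs of special characters with "_" and keep names unique."""
    used = set()
    cleaned = []
    for header in headers:
        # Assumption: "special" means anything that is not a word character.
        base = re.sub(r"\W+", "_", header).strip("_") or "column"
        candidate = base
        suffix = 2
        # Sanitizing can make two headers collide, so add a numeric suffix.
        while candidate in used:
            candidate = f"{base}_{suffix}"
            suffix += 1
        used.add(candidate)
        cleaned.append(candidate)
    return cleaned


print(sanitize_headers(["Order #", "price ($)", "name"]))
```

The numeric suffix keeps the names distinct when two different headers reduce to the same cleaned name.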

Error opening a Data Refinery flow with connection with personal credentials

When you open a Data Refinery flow that uses a data asset that is based on a connection with personal credentials, you might see an error.

Workaround: To open a Data Refinery flow that has assets that use connections with personal credentials, you must first unlock the connection. You can unlock the connection either by editing the connection and entering your personal credentials, or by previewing the asset in the project, where you are prompted to enter your personal credentials. After you unlock the connection, you can open the Data Refinery flow.

DataStage lineage job can produce no lineage

If you create a DataStage project export through the API, the GET /v2/asset_exports/{export_id} endpoint might never finish and keep returning the PENDING status. As a result, the DataStage lineage job reports completion but produces no lineage.

Workaround: Manually upload the project export as an asset and add the .zip file to the metadata import job to be processed.
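If you call the export endpoint from your own scripts, a bounded polling loop can detect an export that stays in PENDING so that you know when to fall back to the manual upload. This is a sketch under assumptions: wait_for_export and its parameters are hypothetical, fetch_status stands for whatever code of yours calls the endpoint and returns the status string, and only the PENDING value comes from the text above.

```python
import time


def wait_for_export(fetch_status, timeout_s=600.0, interval_s=10.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll until the export leaves PENDING or the timeout expires.

    fetch_status: callable that returns the current status string
    (hypothetical; supply your own call to the export endpoint).
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status != "PENDING":
            # Any other status is treated as final in this sketch.
            return status
        if clock() >= deadline:
            raise TimeoutError(
                "Export is still PENDING; upload the project export manually."
            )
        sleep(interval_s)
```

The clock and sleep parameters exist only so that the loop can be exercised without real waiting.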

Use the API to delete connections that are in use

From the user interface, you cannot delete a connection that is in use by a published data product. You must first retire all the data products that use the connection. You can bypass this restriction by using the API.

To delete a connection that is in use by one or more published data products, call the DELETE Connection endpoint. For example:

DELETE /v2/connections/{connection_id}

When a connection is deleted by using the API, the items in the data products that use the deleted connection cannot be delivered. If there are items in a data product that use other connections, those items will still be deliverable.

For details, see Delete connection.

The column-level profile information for a connected data asset with a column of type DATE does not show rows

In the column-level profile information for a connected data asset with a column of type DATE, no rows are displayed when you click show rows in the tabs Data Classes, Format or Types.

Can't create rules or SQL assets with plain-text queries for data quality output tables created prior to February 2026

For data quality rule output tables that were created before February 2026, you cannot create a new SQL-based data quality rule by using plain text queries. Also, you cannot create SQL assets (assets of the type query) from these tables.

In flows where a single document class is selected, classification and extraction might not work

In a flow where only one document class is provided for processing, the documents might not be properly processed.

In the Unstructured Data Integration flow, the Classification operator or Extract operator might fail to classify the documents or extract any entities respectively.

In unstructured data curation, the analysis flow might properly classify the documents. However, when you run the processing flow, the metrics might show that the documents were skipped for extraction or no entities were extracted.

Workaround: Manually update the generated flow:

  1. Replace the Classification operator with an Extract operator, and select all document classes.
  2. Remove any additional Extract operators that appear later in the flow.

Incremental ingestion of unstructured data is not supported for Slack

When Slack is used as a data source in an Unstructured Data Integration flow, incremental ingestion is not supported. Documents are ingested again each time the flow is rerun, even if they were not modified. There is currently no workaround for this issue.

Language annotator completes with warnings or errors when documents with unknown language are processed

When processing documents with unknown language, the Language annotator node status might report Completed with warnings or Completed with errors, but the logs do not show the reason for that status.

Workaround: You can define how to proceed when the language is not recognized by using the Filter if language cannot be detected toggle:

  • When set to On, such documents are filtered out from further processing. The final status of the node would then be Completed With Errors.
  • When set to Off (default), documents with unknown language are processed, the language lang_name is defined as "UKNOWN" and the language score lang_score is set to 0. The final status of the node is then Completed With Warnings.

Milvus node fails with character length exception

Milvus node fails with the following exception:

MilvusException: (code=1100, message=length of varchar field text exceeds max length

Workaround: Use the chunking operator in the flow. Milvus only supports fields with maximum size of 65,536 characters, so the chunking operator is mandatory for files with text exceeding this limit.
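The idea behind the chunking requirement can be illustrated outside the flow. The sketch below is not the chunking operator itself: chunk_text is a hypothetical helper that splits on a fixed character count, and the 65,536 value is taken from the limit stated above. If the limit in your deployment applies to encoded bytes rather than characters, multibyte text needs a smaller chunk size.

```python
# Limit as stated in this topic (assumed here to be counted in characters).
MAX_FIELD_CHARS = 65_536


def chunk_text(text, max_chars=MAX_FIELD_CHARS):
    """Split text into consecutive pieces of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


chunks = chunk_text("x" * 150_000)
print([len(chunk) for chunk in chunks])
```

A fixed-size split ignores sentence and paragraph boundaries, so it shows only why every stored piece stays within the field limit.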

Related assets not copied when copying flow from a catalog to a project

When you copy the Unstructured Data Integration flow from a catalog to a project by using the Add to project option, the related assets are not copied.

Workaround: Identify and copy the related assets to the project and then manually edit the flow in the project to set the related assets.

Alternatively, execute the following curl command:

curl --insecure -X POST "/udp/v1/flows/{flow_id}/deepcopy" -H "Authorization: Bearer <bearer_token>" -H "Content-Type: application/json" -d '{"container_kind": "catalog", "container_id": "<catalog_id>", "target_container_kind": "project", "target_container_id": "<project_id>"}'
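The same request can be prepared from a script. The following sketch only builds the request and does not send it. The base_url parameter and the function name build_deepcopy_request are assumptions, because the command above shows only the path. The sketch also does not disable certificate verification the way --insecure does.

```python
import json
import urllib.request


def build_deepcopy_request(base_url, flow_id, catalog_id, project_id, bearer_token):
    """Build (but do not send) the deep-copy request shown in the curl command."""
    payload = {
        "container_kind": "catalog",
        "container_id": catalog_id,
        "target_container_kind": "project",
        "target_container_id": project_id,
    }
    return urllib.request.Request(
        # base_url is a placeholder for the host of your environment.
        url=f"{base_url.rstrip('/')}/udp/v1/flows/{flow_id}/deepcopy",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
    )
```

To send the request, pass the returned object to urllib.request.urlopen.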

Can't provide parameter values for scheduled jobs

When you schedule a job for a future run of a flow that has parameters defined, you cannot provide values for these parameters.

Workaround: Make sure you provide default values for parameters, as these values will be used in future runs.

Strings with quotes might get truncated when using the Data Intelligence Agent

When you're using llama-3-2 models while working with the Data Intelligence Agent to generate SQL queries, strings with quotes might get truncated.

Workaround: After you get the confirmation prompt, verify that the generated query was not truncated and edit it if required.

Flows using document libraries can't be promoted to space

A flow that uses a document library does not work after it is promoted from a project to a space, because the document library can't be promoted along with the flow.

Workaround:

  1. When designing the flow in a project, create a local parameter or a parameter set for the document library ID.
  2. Assign this parameter in the property panel of the document set operator, instead of directly entering the value of the document library ID.
  3. Promote the flow to a space when ready.
  4. Create the document library in space.
  5. When executing the flow in space, pass the document library as a parameter or a parameter set.

Flows created by Unstructured Data Curation fail in deployment spaces at the Document Set operator

When you promote a flow created by Unstructured Data Curation to a deployment space, the flow might fail due to missing Presto connection configuration. The Document Set operator fails with an error Missing or Invalid 'asset_id' id because the project settings (including Presto connection configuration) are not automatically promoted to spaces.

Workaround: Before running an Unstructured Data Curation flow in a deployment space, you must manually configure a Presto connection:

  1. Navigate to the deployment space Manage tab.
  2. Locate the Document set storage section.
  3. Add and configure a Presto connection.
  4. Save the settings.
  5. Run the promoted flow.

This configuration is required for the Document Set operator to access the necessary storage resources in the deployment space environment.

Issues with publishing data quality rules

When a rule has a connection for input data asset and a connection with the same name already exists in the catalog, publishing the rule does not overwrite or update the connection in the catalog, regardless of the duplicate asset handling settings in the catalog.

Workaround: Publish the connection before you publish the data quality rule.

The Public Access group does not exist

The public access group AccessGroupID-PublicAccess is automatically added as a collaborator with the Viewer role to top-level categories. This group doesn't exist on AWS, which results in errors when you try to view the list of group members on the Access Control tab for a top-level category.

Initial login takes you to the IBM Cloud Pak for Data experience

When you log in to the platform by using the default URL aws.data.ibm.com, you are taken to the IBM Cloud Pak for Data experience. This experience contains only a subset of the features that are available in watsonx.data intelligence. For example, it does not provide access to the data lineage feature.

Workaround: To gain access to the full set of watsonx.data intelligence features, you have these options:

  • Append ?context=df to the default URL before you log in.
  • If you logged in with the default URL, click the Switch platform icon next to your avatar on the home page, and select Data Fabric.
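For bookmarks or scripts that generate the login link, the first option amounts to setting the context query parameter. This is a minimal sketch under assumptions: with_full_context is a hypothetical helper, and the https:// scheme is assumed because the text gives only the host name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_full_context(url):
    """Return the URL with the query parameter context=df set."""
    parts = urlsplit(url)
    # Keep other query parameters, but replace any existing context value.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "context"]
    query.append(("context", "df"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_full_context("https://aws.data.ibm.com"))
```

Replacing an existing context value, instead of appending a second one, avoids ambiguous URLs.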

Can't publish imported assets from within the metadata import asset

When you open a metadata import asset and select one or more imported assets for publishing to a catalog, publishing fails.

Can't add assets from catalog to project due to missing permission

When you try to add an asset from a catalog to a project, an access-related error is shown and the asset is not added to the selected project even if you have the Administrator role in the project. This error can be due to the Add catalog assets to projects permission missing, which is part of the watsonx.data intelligence service.

Workaround: Make sure the watsonx.data intelligence service is provisioned in the account or you have the Account admin role assigned in the IBM SaaS console.

Private connectivity with Satellite is not available on AWS

Connecting to a source with private connectivity using a Satellite Connector or a Satellite location is not supported on AWS.

Workaround: There is currently no workaround.

Unable to upload a data contract file when creating a data product in AWS

When you create a data product, you cannot upload a data contract PDF file. Only a URL link to the contract is supported.

Delivery methods not supported in AWS

AWS does not support the listed delivery methods:

  • Data extract
  • Access in watsonx.data
  • Deliver to watsonx.data

For delivery methods that are available in AWS, see Working with delivery methods.

Data view is not available in data products when using AWS

When you are viewing your data products in AWS, you are unable to request a data view of your assets.

Extra collaborators in the Category view

For public access groups, for Categories, an extra collaborator (??) is displayed in the Collaborators section even though no collaborators were added. It's not an actual collaborator. Also, the API response includes AccessGroupId-public Access. This group doesn't exist and isn't available in the Artifact access control list or in AWS SaaS console.

You can ignore the ?? collaborator.

Advanced analysis and data quality rules fail for uploaded CSV files (GovCloud)

For CSV files that you upload to the project from your local file system, running advanced analysis (advanced key or relationship analysis or advanced profiling) or data quality rules fails.

Advanced analysis fails with the error The [personal_credentials] flag is required and the metadata enrichment job run log shows messages like the following ones:

Key analysis job run (bba2b5aa-0a08-4c8d-a713-691dd9f408f7) is in state 'Completed with errors'.

Primary key detection task (299b56bd-96e6-4ade-889b-5317629eb483) of type 'pk_deep' is in state 'Completed with errors':
Key analysis service info:
 - Key analysis area id: 63349cc9-d814-4485-ac06-a5b2781704bd
Key analysis error: KEYA1001E: The REST API service call was not handled successfully. (status code=400, reason=Bad Request, url=https://internal.api.dai.ibmforusgov.com/v2/connections?project_id=d85a535d-2a76-4673-a040-a9b81c062b46, method=POST, response={"trace":"8v5hk7zll3ayydcufx2nxdx6h","errors":[{"code":"invalid_payload","message":"The [personal_credentials] flag is required.","more_info":"Set the flag and call the API again.","extra":{"environment_name":"daigovprodaws","http_status":400,"id":"CDICO9035E","source_cluster":"NULL","source_component":"wdp-connect-connection","timestamp":"2025-11-27T01:41:23.167Z","transaction_id":"7dtdv4cgd6wgzoefwatwtzzav"}}]})

For data quality rules, the following error occurs when you test or actually run the rule:

An unknown error occurred. Exception IOException was caught during processing of the request: The [personal_credentials] flag is required.

Workaround: To work around the issue, complete these steps:

  1. Upload the CSV files you want to analyze to a supported file storage such as Amazon S3 or Google Cloud Storage.
  2. In your project, create a connection to the storage and import the CSV files.
  3. Run the analysis or the data quality rule on the imported assets.

Data asset profile view is not available

The profile view for data assets shows a 500 Internal server error when you do either of the following:

  • Import a data asset from the Platform assets catalog into a project and then try to view the data asset profile from the overflow menu.
  • View the Profile tab for a catalog asset and click a column name.

There is currently no workaround for this issue.

The meta-llama/llama-3-3-70b-instruct model doesn't respond for Data Intelligence Agent

If you're using the meta-llama/llama-3-3-70b-instruct model and don't get a response for more than 1 minute, the following message is returned: Sorry, I didn't receive any response from the agent: No response from agent..

Workaround: You can restart the chat, type the prompt again, or select another LLM from Settings.

The SQL Query Generation tool fails for Data Intelligence Agent

The reporting_sql_query_generation tool fails with the following error message:

fastmcp.exceptions.ToolError:
SQL generation API call failed:
Server Error (Status: 500) – Internal server error.

Connected data asset name updates aren't reflected in the Hierarchies view

If you change the name for connected data assets, the name changes aren't reflected when you're browsing asset hierarchies and in the related asset hierarchies panels.

Assets disappear in the Hierarchies panel and crash

For database connections with more than 10,000 assets, multiple network calls are performed to retrieve asset details. The initial requests successfully fetch and display up to 10,000 assets. When a subsequent fetch is triggered to load additional assets, they're not fetched and the assets already displayed in the Hierarchies panel disappear. The following error message is displayed: No contents. Items will appear here after they are added.

Inaccurate number of assets shown in the Hierarchies tree view

The number of assets displayed in the Hierarchies tree view is inaccurate as compared to the panel view due to the following reasons:

  • Unlike other assets, assets from database connections aren't categorized by schema name and tables when they're fetched to be displayed in the Hierarchies panel view. As a result, up to 1,000 assets might be fetched in one call, but only 1 schema might be fetched if the retrieved assets include assets from a database connection.
  • If more than 1,000 assets are available, they can't be fetched and displayed in the Hierarchies view.

Can't list business terms with the list_business_terms_by_search_term tool

The list_business_terms_by_search_term tool is broken and fails due to a missing ctx argument. When you're trying to prompt an AI agent to list business terms for an item, you get the following response:

list_business_terms_by_search_term is currently broken on the connected MCP server and fails with a missing ctx argument, so business terms for xxx could not be retrieved with that tool.

Workaround: Use the UI to see available business terms.

Can't import glossary because the glossary_csv_import tool is broken and fails

The glossary_csv_import tool is broken and fails. When you prompt an AI agent to import a glossary, an error message appears.

Workaround: Use the UI to import CSVs.

Local assets published from project to catalog are unavailable in Hierarchies view

If a local asset is added to a project and then published from the project to a catalog, the local asset isn't available in the Hierarchies view.

Workaround: Directly add a local file asset to the catalog. The local file asset appears under the default Cloud Object Storage connection.

Known issues for watsonx.data integration

In flows where a single document class is selected, classification and extraction might not work

In a flow where only one document class is provided for processing, the documents might not be properly processed.

In the Unstructured Data Integration flow, the Classification operator or Extract operator might fail to classify the documents or extract any entities respectively.

In unstructured data curation, the analysis flow might properly classify the documents. However, when you run the processing flow, the metrics might show that the documents were skipped for extraction or no entities were extracted.

Workaround: Manually update the generated flow:

  1. Replace the Classification operator with an Extract operator, and select all document classes.
  2. Remove any additional Extract operators that appear later in the flow.

Related assets not copied when copying flow from a catalog to a project

When you copy the Unstructured Data Integration flow from a catalog to a project using Add to project, related assets are not copied.

Workaround: Copy related assets manually and update the flow in the project. Alternatively, run:

curl --insecure -X POST "/udp/v1/flows/{flow_id}/deepcopy" -H "Authorization: Bearer <bearer_token>" -H "Content-Type: application/json" -d '{"container_kind": "catalog", "container_id": "<catalog_id>", "target_container_kind": "project", "target_container_id": "<project_id>"}'

Flows using document libraries can't be promoted to space

Document libraries can't be promoted from a project to a space.

Workaround:

  1. When designing the flow in a project, create a local parameter or a parameter set for the document library ID.
  2. Assign this parameter in the property panel of the document set operator, instead of directly entering the value of the document library ID.
  3. Promote the flow to a space when ready.
  4. Create the document library in space.
  5. When executing the flow in space, pass the document library as a parameter or a parameter set.

Document Set and Entity Store operators using Python Orchestrator fail

The Document Set and Entity Store operators using Python Orchestrator might fail with the following error:

Node Document set failed and caused aborting the branch execution: Please check if the MinIO bucket associated with the catalog, and the service route has been created or not.

Workaround: Ensure the following two prerequisites are met for these operators:

  • Access to the associated metadata store bucket

    Each metadata store (for example, Hive or SQL) is associated with an S3 or Cloud Object Storage bucket. The user executing the operators must have access permissions to this underlying bucket. If the access is not already granted, you must add the required user or group to the bucket using the watsonx Infrastructure Manager Console. Without bucket access, the operators are not able to read or write data to the metadata store.

  • Handling the default MinIO bucket (Non-production usage)

    For exploratory or non-production scenarios, watsonx.data includes a default MinIO bucket that is automatically associated with the metadata store. However, this default bucket uses an internal S3 endpoint that is not accessible from external systems such as Unstructured Data Integration. If you plan to use this default MinIO bucket, you must expose the endpoint externally so that it can be accessed by outside systems.

    Note: Creating the edge route exposes the MinIO console externally, allowing external clients to interact with it.

    Follow these steps to expose the MinIO bucket:

    1. Access the MinIO Console.
    2. Create an edge route to expose the MinIO service:

       oc create route edge ibm-lh-lakehouse-minio-console --service=ibm-lh-lakehouse-minio-svc --port=9000

    3. Retrieve the route host for the MinIO service:

       oc get routes ibm-lh-lakehouse-minio-console

       You will now see that the route is port forwarded and is accessible from external systems.
    4. Extract the access and secret keys if needed:

       oc extract secret/ibm-lh-config-secret --to=- --keys=env.properties | grep -E "LH_S3_ACCESS_KEY|LH_S3_SECRET_KEY"

    This step is only required when using the default internal MinIO bucket for testing or non-production purposes. Production-grade metadata stores already use S3 or COS buckets with external endpoints, and do not require port forwarding.

The flow node output preview table is not available when using Spark orchestrator

When you run a flow that uses Spark orchestrator, the preview table that shows all the node output is not available.

Workaround: There is currently no workaround for this issue.

Iceberg metastore connection test is always successful

When you create a connection to Iceberg metastore and click Test connection, the test always passes. There is no validation for this test, so the result is unreliable.

Workaround: There is currently no workaround for this issue.

Entity store operator fails if the target table has special characters depending on the source used

The Entity store operator will fail if the target table has special characters and Iceberg metastore is used.

Workaround: There is currently no workaround for this issue.

Document set operator fails with Schema not found

The Document set operator fails when it runs in Spark orchestrator.

Workaround: Document set operations are supported only for catalogs that are connected to the Spark engine within the Lakehouse. You can't use an external Presto connection to create document set or ingest data using ingest document set. Ensure both the Spark engine and the catalog are present in the Lakehouse and connected.

Limitations for watsonx.data intelligence and watsonx.data integration

For assets from SAP OData sources, the metadata enrichment results do not show the table type

In general, metadata enrichment results show for each enriched data asset whether the asset is a table or a view. This information cannot be retrieved for data assets from SAP OData data sources and is thus not shown in the enrichment results.

You can't manage more than 20 assets at the same time

You can edit or delete only up to 20 assets at the same time.

Data Refinery does not support the Satellite Connector

You cannot use a Satellite Connector to connect to a database with Data Refinery.

Creating new workflow templates is not available for Data Product Hub

You cannot create a new workflow template for Data Product Hub.

Workaround: Predefined templates for 1-step and 2-step approvals are available in the Work Template Files tab. To access them, go to the navigation menu and click: Administration > Configurations and settings > Workflow management > Access request for data product > Work Template Files.

Limitations for Unstructured Data Integration multilingual support

Supported languages:

  • English
  • Japanese

When working with documents in Japanese, the following limitations apply:

  • In text extraction, some entity key values might be missing or incorrectly extracted from Japanese documents.
  • Text extraction alters the original order of the content for Japanese language.
  • PII and HAP annotator doesn't work for documents in Japanese even with multilingual models.

When a document library includes documents in multiple languages, the library search results and the prompt lab queries return results in all languages. If you want the results in one language only, you can design the unstructured data flow so that each schema only contains one language.