Known issues and limitations

The following limitations and known issues apply to Cloud Pak for Data as a Service.

List of IBM Knowledge Catalog issues

List of Masking flow issues

List of Data Virtualization issues

List of watsonx.ai Studio issues

List of watsonx.ai Studio limitations

List of Data Refinery known issues

List of Data Refinery limitations

List of Visualizations issues

List of machine learning issues

List of machine learning limitations

List of Watson OpenScale issues

List of SPSS Modeler issues

List of SPSS Modeler limitations

List of connection issues

Issues with Cloud Object Storage

  • List of machine learning issues
    • Error with assets using watsonx.ai Runtime in projects specifying Cloud Object Storage with Key Protect enabled.
    • AutoAI
    • Federated Learning
    • Pipelines
  • List of SPSS Modeler issues
    • Unable to save model to project specifying Cloud Object Storage with Key Protect enabled.
  • List of notebooks issues
    • Unable to save model to project specifying Cloud Object Storage with Key Protect enabled.

IBM Knowledge Catalog

If you use IBM Knowledge Catalog, you might encounter these known issues and restrictions when you use catalogs.

Connection asset type doesn't get permanently deleted after the removal

An asset of type Connection is not deleted immediately after removal and still appears in the trash, even if the asset removal configuration in the catalog UI is set to Purge assets automatically upon removal.

Catalog asset search doesn't support special characters

If search keywords contain any of the following special characters, the search filter doesn't return the most accurate results.

Special characters:

. + - && || ! ( ) { } [ ] ^ " ~ * ? : \

Workaround: To obtain the most accurate results, search only for the keyword after the special character. For example, instead of AUTO_DV1.SF_CUSTOMER, search for SF_CUSTOMER.
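
If search terms are prepared programmatically, the workaround above can be sketched in Python. This is a minimal sketch and not part of any IBM API: the function name keyword_after_special_chars is hypothetical, and treating & and | as single special characters (the list shows them as && and ||) is an assumption.

```python
# Characters that the catalog search does not handle well, taken from the list above.
# Assumption: "&" and "|" are treated individually, although the list shows "&&" and "||".
SPECIAL_CHARS = set('.+-&|!(){}[]^"~*?:\\')


def keyword_after_special_chars(term: str) -> str:
    """Return the part of term that follows the last special character."""
    last = max((i for i, ch in enumerate(term) if ch in SPECIAL_CHARS), default=-1)
    return term[last + 1:]


print(keyword_after_special_chars("AUTO_DV1.SF_CUSTOMER"))  # SF_CUSTOMER
```

A term without special characters is returned unchanged.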

Masked data is not supported in data visualizations

Masked data is not supported in data visualizations. If you attempt to work with masked data while generating a chart in the Visualizations tab of a data asset in a project, the following error message is displayed: Bad Request: Failed to retrieve data from server. Masked data is not supported.

Data is not masked in some project tools

When you add a connected data asset that contains masked columns from a catalog to a project, the columns remain masked when you view the data and when you refine the data in the Data Refinery tool. However, other tools in projects do not preserve masking when they access data through a connection. For example, when you load connected data in a Notebook, a DataStage flow, a dashboard, or other project tools, you access the data through a direct connection and bypass masking.

Predefined governance artifacts might not be available

If you don't see any predefined classifications or data classes, reinitialize your tenant by using the following API call:

curl -X POST "https://api.dataplatform.cloud.ibm.com/v3/glossary_terms/admin/initialize_content" -H "Authorization: Bearer $BEARER_TOKEN" -k

Add collaborators with lowercase email addresses

When you add collaborators to the catalog, enter email addresses with all lowercase letters. Mixed-case email addresses are not supported.
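
If you keep collaborator addresses in a list before you enter them, a small Python helper can normalize them first. This is only an illustrative sketch: the function name normalize_emails and the sample address are hypothetical.

```python
def normalize_emails(addresses):
    """Trim surrounding whitespace and convert each address to lowercase."""
    return [address.strip().lower() for address in addresses]


# Hypothetical sample address, used only for illustration.
print(normalize_emails(["Jane.Doe@Example.com "]))  # ['jane.doe@example.com']
```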

Object Storage connection restrictions

When you look at a Cloud Object Storage (S3 API) or Cloudant connection, the folder itself is listed as a child asset.

Multiple concurrent connection operations might fail

An error might be encountered when multiple users are running connection operations concurrently. The error message can vary.

Can't enable data protection rule enforcement after catalog creation

You cannot enable the enforcement of data protection rules after you create a catalog. To apply data protection rules to the assets in a catalog, you must enable enforcement during catalog creation.

Assets are blocked if evaluation fails

The following restrictions apply to data assets in a catalog with policies enforced: File-based data assets that have a header can't have duplicate column names, a period (.), or single quotation mark (') in a column name.

If evaluation fails, the asset is blocked to all users except the asset owner. All other users see an error message that the data asset cannot be viewed because evaluation failed and the asset is blocked.
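
To catch these column-name problems before you add a file-based asset to a catalog, you can check the header row locally. The following Python sketch covers only the three restrictions named above. The function name find_header_problems and the sample data are hypothetical, and the check does not reproduce the actual evaluation that the catalog performs.

```python
import csv
import io


def find_header_problems(header):
    """Return a list of readable problems found in a list of column names."""
    problems = []
    seen = set()
    for name in header:
        if name in seen:
            problems.append(f"duplicate column name: {name!r}")
        seen.add(name)
        if "." in name:
            problems.append(f"period in column name: {name!r}")
        if "'" in name:
            problems.append(f"single quotation mark in column name: {name!r}")
    return problems


# Hypothetical CSV content; in practice, read the first row of your own file.
sample = "id,cust.name,id,owner's\n1,a,2,b\n"
header = next(csv.reader(io.StringIO(sample)))
for problem in find_header_problems(header):
    print(problem)
```

An empty result means that none of the three restrictions is violated.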

Only the data class filter in metadata enrichment results is case-sensitive

When you filter metadata enrichment results on the Column tab, only the Data class entries are case-sensitive. The entries in the Business terms, Schemas, and Assets filters are all lowercase regardless of the actual casing of the value.

Enrichment details for an asset might not reflect the settings applied on latest enrichment run

After you edit the enrichment options for a metadata enrichment that was run at least once, the asset details might show the updated options instead of the options applied in the latest enrichment run.

Can't access individual pages in a metadata enrichment asset directly

If the number of assets or columns in a metadata enrichment asset spans several pages, you can't go to a specific page directly. The page number drop-down list is disabled. Use the Next page and Previous page buttons instead.

In some cases, you might not see the full log of a metadata enrichment job run in the UI

If the list of errors in a metadata enrichment run is exceptionally long, only part of the job log might be displayed in the UI.

Workaround: Download the entire log and analyze it in an external editor.

Schema information might be missing when you filter enrichment results

When you filter assets or columns in the enrichment results on source information, schema information might not be available.

Workaround: Rerun the enrichment job and apply the Source filter again.

Writing metadata enrichment output to an earlier version of Apache Hive than 3.0.0

If you want to write data quality output generated by metadata enrichment to an Apache Hive database at an earlier software version than 3.0.0, set the following configuration parameters in your Apache Hive Server:

set hive.support.concurrency=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.enforce.bucketing=true;   # (not required for version 2)

set hive.compactor.initiator.on=true;
set hive.compactor.cleaner.on=true;   # might not be available depending on the version
set hive.compactor.worker.threads=1;

For more information, see Hive Transactions.

Business terms filter in enrichment results might not immediately reflect assignment changes

When you assign or unassign business terms manually, the business terms filter might not immediately reflect these changes.

Workaround: Refresh the page by clicking the Refresh icon.

For assets from SAP OData sources, the metadata enrichment results do not show the table type

In general, metadata enrichment results show for each enriched data asset whether the asset is a table or a view. This information cannot be retrieved for data assets from SAP OData data sources and is thus not shown in the enrichment results.

Issues with the Microsoft Excel add-in

The following issues are known for the Review metadata add-in for Microsoft Excel:

  • When you open the drop-down list to assign a business term or a data class, the entry Distinctive name is displayed as the first entry. If you select this entry, it shows up in the column but does not have any effect.

  • Updating or overwriting existing data in a spreadsheet is currently not supported. You must use an empty template file whenever you retrieve data.

  • If another user works on the metadata enrichment results while you are editing the spreadsheet, the other user's changes can get lost when you upload the changes that you made in the spreadsheet.

  • Only assigned data classes and business terms are copied from the spreadsheet columns Assigned / suggested data classes and Assigned / suggested business terms to the corresponding entry columns. If multiple business terms are assigned, each one is copied to a separate column.

Masking flow

If you use Masking flow, you might encounter these known issues and restrictions when you are privatizing data.

Masking flow jobs might fail

During a masking flow job, Spark might attempt to read all of a data source into memory. Errors might occur when there isn't enough memory to support the job. The largest volume of data that can fit into the largest deployed Spark processing node is approximately 12 GB.

watsonx.ai Studio

You might encounter some of these issues when getting started with and using notebooks:

Duplicating a notebook doesn't create a unique name in the new projects UI

When you duplicate a notebook in the new projects UI, the duplicate notebook is not created with a unique name.

Can't create assets in older accounts

If you're working in an instance of watsonx.ai Studio that was activated before November 2017, you might not be able to create operational assets, like notebooks. If the Create button stays gray and disabled, you must add the watsonx.ai Studio service to your account from the Services catalog.

500 internal server error received when launching watsonx.ai Studio

Rarely, you may receive an HTTP internal server error (500) when launching watsonx.ai Studio. This might be caused by an expired cookie stored for the browser. To confirm the error was caused by a stale cookie, try launching watsonx.ai Studio in a private browsing session (incognito) or by using a different browser. If you can successfully launch in the new browser, the error was caused by an expired cookie. You have a choice of resolutions:

  1. Exit the browser application completely to reset the cookie. You must close and restart the application, not just close the browser window. Restart the browser application and launch watsonx.ai Studio to reset the session cookie.
  2. Clear the IBM cookies from the browsing data and launch watsonx.ai Studio. Look in the browsing data or security options in the browser to clear cookies. Note that clearing all IBM cookies may affect other IBM applications.

If the 500 error persists after performing one of these resolutions, check the status page for IBM Cloud incidents affecting watsonx.ai Studio. Additionally, you may open a support case at the IBM Cloud support portal.

Error during login

You might get this error message while trying to log in to watsonx.ai Studio: "Access Manager WebSEAL could not complete your request due to an unexpected error." Try to log in again. Usually the second login attempt works.

Failure to export a notebook to HTML in the Jupyter Notebook editor

When you are working with a Jupyter Notebook created in a tool other than watsonx.ai Studio, you might not be able to export the notebook to HTML. This issue occurs when the cell output is exposed.

Workaround

  1. In the Jupyter Notebook UI, go to Edit and click Edit Notebook Metadata.

  2. Remove the following metadata:

    "widgets": {
       "state": {},
       "version": "1.1.2"
    }
    
  3. Click Edit.

  4. Save the notebook.
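
The same metadata can also be removed outside the editor, because a notebook file is JSON. The following Python sketch assumes that the widgets entry sits at the top level of the notebook metadata, as shown above. The function name strip_widgets_metadata and the in-memory sample notebook are hypothetical.

```python
import json


def strip_widgets_metadata(notebook: dict) -> dict:
    """Remove the top-level "widgets" entry from the notebook metadata, if present."""
    notebook.get("metadata", {}).pop("widgets", None)
    return notebook


# Hypothetical minimal notebook structure, used only for illustration.
nb = {
    "cells": [],
    "metadata": {
        "widgets": {"state": {}, "version": "1.1.2"},
        "kernelspec": {"name": "python3"},
    },
    "nbformat": 4,
    "nbformat_minor": 5,
}
cleaned = strip_widgets_metadata(nb)
print(json.dumps(cleaned["metadata"]))
```

For a real file, load the .ipynb file with json.load, apply the function, and write the result back with json.dump. Keep a copy of the original file first.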

Manual installation of some TensorFlow libraries is not supported

Some TensorFlow libraries are preinstalled, but if you try to install additional TensorFlow libraries yourself, you get an error.

Connection to notebook kernel is taking longer than expected after running a code cell

If you try to reconnect to the kernel and immediately run a code cell (or if the kernel reconnection happened during code execution), the notebook doesn't reconnect to the kernel and no output is displayed for the code cell. You need to manually reconnect to the kernel by clicking Kernel > Reconnect. When the kernel is ready, you can try running the code cell again.

Using the predefined sqlContext object in multiple notebooks causes an error

You might receive an Apache Spark error if you use the predefined sqlContext object in multiple notebooks. Create a new sqlContext object for each notebook. See this Stack Overflow explanation.

Connection failed message

If your kernel stops, your notebook is no longer automatically saved. To save it, click File > Save manually, and you should get a Notebook saved message in the kernel information area, which appears before the Spark version. If you get a message that the kernel failed, to reconnect your notebook to the kernel click Kernel > Reconnect. If nothing you do restarts the kernel and you can't save the notebook, you can download it to save your changes by clicking File > Download as > Notebook (.ipynb). Then you need to create a new notebook based on your downloaded notebook file.

Can't connect to notebook kernel

If you try to run a notebook and you see the message Connecting to Kernel, followed by Connection failed. Reconnecting and finally by a connection failed error message, the reason might be that your firewall is blocking the notebook from running.

If watsonx.ai Studio is installed behind a firewall, you must add the WebSocket connection wss://dataplatform.cloud.ibm.com to the firewall settings. Enabling this WebSocket connection is required when you're using notebooks and RStudio.

Insufficient resources available error when opening or editing a notebook

If you see the following message when opening or editing a notebook, the environment runtime associated with your notebook has resource issues:

Insufficient resources available
A runtime instance with the requested configuration can't be started at this time because the required hardware resources aren't available.
Try again later or adjust the requested sizes.

To find the cause, try checking the status page for IBM Cloud incidents affecting watsonx.ai Studio. Additionally, you can open a support case at the IBM Cloud Support portal.

Limitations when using watsonx.ai Studio:

Files that are uploaded through the watsonx.ai Studio UI are not validated or scanned for potentially malicious content

Files that you upload through the watsonx.ai Studio UI are not validated or scanned for potentially malicious content. It is strongly recommended that you run security software, such as an anti-virus application on all files before uploading your files to ensure the security of your content.

Data Refinery known issues

Target table loss and job failure when you use the Update option in a Data Refinery flow

Using the Update option for the Write mode target property for relational data sources (for example Db2) replaces the original target table and the Data Refinery job might fail.

Workaround: Use the Merge option as the Write mode and Append as the Table action.

Data Refinery limitations

Data column headers cannot contain special characters

Data with column headers that contain special characters might cause Data Refinery jobs to fail, and give the error Supplied values don't match positional vars to interpolate.

Workaround: Remove the special characters from the column headers.
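
If the source file is prepared with a script, the headers can be cleaned before the data reaches Data Refinery. The documentation does not list which characters count as special, so the following Python sketch assumes that everything other than ASCII letters, digits, underscores, and spaces is special. The function name clean_header is hypothetical.

```python
import re

# Assumption: any character other than ASCII letters, digits, underscores,
# and spaces is treated as "special".
SPECIAL_PATTERN = re.compile(r"[^0-9A-Za-z_ ]")


def clean_header(name: str) -> str:
    """Remove assumed special characters and trim surrounding spaces."""
    return SPECIAL_PATTERN.sub("", name).strip()


print([clean_header(h) for h in ["unit price ($)", "order#id", "region"]])
```

Check the cleaned headers for duplicates or empty names, because removing characters can make two headers identical.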

Data Refinery does not support the Satellite Connector

You cannot use a Satellite Connector to connect to a database with Data Refinery.

Error opening a Data Refinery flow with connection with personal credentials

When you open a Data Refinery flow that uses a data asset that is based on a connection with personal credentials, you might see an error.

Workaround: To open a Data Refinery flow that has assets that use connections with personal credentials, you must unlock the connection. You can unlock the connection either by editing the connection and entering your personal credentials, or by previewing the asset in the project, where you are prompted to enter your personal credentials. After you unlock the connection, you can open the Data Refinery flow.

Visualizations issues

You might encounter some of these issues when working with the Visualization tab in a Data asset in a project.

The column-level profile information for a connected data asset with a column of type DATE does not show rows

In the column-level profile information for a connected data asset with a column of type DATE, no rows are displayed when you click show rows in the tabs Data Classes, Format or Types.

watsonx.ai Runtime issues

You might encounter some of these issues when working with watsonx.ai Runtime.

Region requirements

You can only associate a watsonx.ai Runtime service instance with your project when the watsonx.ai Runtime service instance and the watsonx.ai Studio instance are located in the same region.

Accessing links if you create a service instance while associating a service with a project

While you are associating a watsonx.ai Runtime service to a project, you have the option of creating a new service instance. If you choose to create a new service, the links on the service page might not work. To access the service terms, APIs, and documentation, right click the links to open them in new windows.

Federated Learning assets cannot be searched in All assets, search results, or filter results in the new projects UI

You cannot search Federated Learning assets from the All assets view, the search results, or the filter results of your project.

Workaround: Click the Federated Learning asset to open the tool.

Deployment issues

  • A deployment that is inactive (no scores) for a set time (24 hours for the free plan or 120 hours for a paid plan) is automatically hibernated. When a new scoring request is submitted, the deployment is reactivated and the score request is served. Expect a brief delay of 1 to 60 seconds for the first score request after activation, depending on the model framework.
  • For some frameworks, such as SPSS Modeler, the first score request for a deployed model after hibernation might result in a 504 error. If this happens, submit the request again; subsequent requests should succeed.
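
A client can handle the 504 response after hibernation by resubmitting the request. The following Python sketch shows the retry logic only and does not call the watsonx.ai Runtime API: send_request is a hypothetical callable that returns a status code and a body, and the retry count and delay are arbitrary assumptions.

```python
import time


def score_with_retry(send_request, retries=2, delay_seconds=1.0):
    """Call send_request() and resubmit while it returns HTTP status 504.

    send_request is a hypothetical callable that returns (status_code, body).
    The last response is returned even if it is still a 504.
    """
    attempts = retries + 1
    for attempt in range(attempts):
        status, body = send_request()
        if status != 504 or attempt == attempts - 1:
            return status, body
        time.sleep(delay_seconds)


# Simulated deployment: the first request after hibernation fails, the second succeeds.
responses = iter([(504, None), (200, {"predictions": []})])
status, body = score_with_retry(lambda: next(responses), delay_seconds=0)
print(status)  # 200
```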

Previewing masked data assets is blocked in deployment space

A data asset preview may fail with this message: This asset contains masked data and is not supported for preview in the Deployment Space

Deployment spaces currently don't support masking data so the preview for masked assets has been blocked to prevent data leaks.

Batch deployment jobs that use large inline payload might get stuck in starting or running state

If you provide a large asynchronous payload for your inline batch deployment, the runtime manager process can run out of heap memory.

In the following example, 92 MB of payload was passed inline to the batch deployment, which caused the heap to run out of memory.

Uncaught error from thread [scoring-runtime-manager-akka.scoring-jobs-dispatcher-35] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[scoring-runtime-manager]
java.lang.OutOfMemoryError: Java heap space
	at java.base/java.util.Arrays.copyOf(Arrays.java:3745)
	at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:172)
	at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:538)
	at java.base/java.lang.StringBuilder.append(StringBuilder.java:174)
   ...

This could result in concurrent jobs getting stuck in the starting or running state. The starting state can be cleared only by deleting the deployment and creating a new deployment. The running state can be cleared without deleting the deployment.

As a workaround, use data references instead of inline data for large payloads that are provided to batch deployments.

watsonx.ai Runtime limitations

AutoAI known limitations

  • Currently, AutoAI experiments do not support double-byte character sets. AutoAI only supports CSV files with ASCII characters. Users must convert any non-ASCII characters in the file name or content, and provide input data as a CSV as defined in this CSV standard.

  • To interact programmatically with an AutoAI model, use the REST API instead of the Python client. The APIs for the Python client required to support AutoAI are not generally available at this time.
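
Before you upload a CSV file for an AutoAI experiment, you can check locally whether its content is limited to ASCII characters. The following Python sketch reports the lines that contain non-ASCII characters. The function name non_ascii_lines and the sample data are hypothetical, and the check does not validate the CSV structure itself.

```python
def non_ascii_lines(text: str):
    """Return (line_number, line) pairs for lines that contain non-ASCII characters."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if not line.isascii()
    ]


# Hypothetical CSV content; in practice, read the text of your own file.
sample = "id,city\n1,Zurich\n2,Zürich\n"
print(non_ascii_lines(sample))  # [(3, '2,Zürich')]
```

The same str.isascii() check can be applied to the file name.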

Data module not found in IBM Federated Learning

The data handler for IBM Federated Learning is trying to extract a data module from the FL library but is unable to find it. You might see the following error message:

ModuleNotFoundError: No module named 'ibmfl.util.datasets'

The issue possibly results from using an outdated DataHandler. Review and update your DataHandler to conform to the latest spec. Refer to the most recent MNIST data handler, or ensure that your sample versions are up to date.

Setting environment variables in a conda yaml file does not work for deployments

Setting environment variables in a conda yaml file does not work for deployments. This means that you cannot override existing environment variables, for example LD_LIBRARY_PATH, when deploying assets in watsonx.ai Runtime.

As a workaround, if you're using a Python function, consider setting default parameters. For details, see Deploying Python functions.

Files that are uploaded through the deployment space UI are not validated or scanned for potentially malicious content

Files that you upload through the deployment space UI are not validated or scanned for potentially malicious content. It is strongly recommended that you run security software, such as an anti-virus application on all files before uploading your files to ensure the security of your content.

Watson OpenScale issues

You might encounter the following issues in Watson OpenScale:

Drift configuration is started but never finishes

Drift configuration is started but never finishes and continues to show the spinner icon. If the spinner runs for more than 10 minutes, the system might be left in an inconsistent state. Workaround: Edit the drift configuration, and then save it. The system might come out of this state and complete the configuration. If reconfiguring drift does not rectify the situation, contact IBM Support.

SPSS Modeler issues

You might encounter some of these issues when working in SPSS Modeler.

SPSS Modeler runtime restrictions

watsonx.ai Studio does not include SPSS functionality in Peru, Ecuador, Colombia and Venezuela.

Timestamp data measured in microseconds

If you have timestamp data that is measured in microseconds, you can use the more precise data in your flow. However, you can import data that is measured in microseconds only from connectors that support SQL pushback. For more information about which connectors support SQL pushback, see Supported data sources for SPSS Modeler.

SPSS Modeler limitations

Languages supported by Text Analytics

The Text Analytics feature in SPSS Modeler supports the following languages:

  • Dutch
  • English
  • French
  • German
  • Italian
  • Japanese
  • Portuguese
  • Spanish

SPSS Modeler doesn't support Satellite Connector

You cannot use a Satellite Connector to connect to a database with SPSS Modeler.

Merge node and Unicode characters

The Merge node treats the following very similar Japanese characters as the same character.
Japanese characters

Connection issues

You might encounter this issue when working with connections.

Apache Impala connection does not work with LDAP authentication

If you create a connection to an Apache Impala data source and the Apache Impala server is set up for LDAP authentication, the username and password authentication method in Cloud Pak for Data as a Service will not work.

Workaround: Disable the Enable LDAP Authentication option on the Impala server. See Configuring LDAP Authentication in the Cloudera documentation.

Orchestration Pipelines known issues

These issues pertain to Orchestration Pipelines.

Asset browser does not always reflect count for total numbers of asset type

When selecting an asset from the asset browser, such as choosing a source for a Copy node, you see that some of the assets list the total number of that asset type available, but notebooks do not. That is a current limitation.

Cannot delete pipeline versions

Currently, you cannot delete saved versions of pipelines that you no longer need. All versions will be deleted when the pipeline is deleted.

Deleting an AutoAI experiment fails under some conditions

Using a Delete AutoAI experiment node to delete an AutoAI experiment that was created from the Projects UI does not delete the AutoAI asset. However, the rest of the flow can complete successfully.

Cache appears enabled but is not enabled

If the Copy assets Pipelines node's Copy mode is set to Overwrite, cache is displayed as enabled but remains disabled.

Pipelines cannot save some SQL statements

Pipelines cannot save when SQL statements with parentheses are passed in a script or user variable.

To resolve this issue, replace all instances of parentheses with their respective ASCII code (( with #40 and ) with #41) and replace the code when you set it as a user variable.

For example, the statement select CAST(col1 as VARCHAR(30)) from dbo.table in a Run Bash script node will cause an error. Instead, use the statement select CAST#40col1 as VARCHAR#4030#41#41 from dbo.table and replace the instances when setting it as a user variable.
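
The substitution described above can be sketched in Python. The function names encode_parentheses and decode_parentheses are hypothetical. The decoding step assumes that the original statement does not itself contain the literal text #40 or #41.

```python
def encode_parentheses(sql: str) -> str:
    """Replace ( with #40 and ) with #41 so that the statement can be saved."""
    return sql.replace("(", "#40").replace(")", "#41")


def decode_parentheses(encoded: str) -> str:
    """Restore the parentheses, for example when setting the user variable.

    Assumption: the original statement contains no literal "#40" or "#41".
    """
    return encoded.replace("#40", "(").replace("#41", ")")


statement = "select CAST(col1 as VARCHAR(30)) from dbo.table"
encoded = encode_parentheses(statement)
print(encoded)  # select CAST#40col1 as VARCHAR#4030#41#41 from dbo.table
print(decode_parentheses(encoded) == statement)  # True
```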

Orchestration Pipelines abort when limit for annotations is reached

Pipeline expressions require annotations, which have a limit due to the limit for annotations in Kubernetes. If you reach this limit, your pipeline will abort without displaying logs.

Orchestration Pipelines limitations

These limitations apply to Orchestration Pipelines.

Single pipeline limits

These limitations apply to a single pipeline, regardless of configuration.

  • A single pipeline cannot contain more than 120 standard nodes.
  • A pipeline with a loop cannot contain more than 600 nodes across all iterations (for example, 60 iterations of 10 nodes each).
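
The arithmetic behind these limits can be checked with a short Python sketch. The function name within_pipeline_limits is hypothetical, and checking the two limits independently of each other is an assumption, because the text does not state how standard nodes and loop nodes are counted together.

```python
MAX_STANDARD_NODES = 120  # limit for standard nodes in a single pipeline
MAX_LOOP_NODES = 600      # limit for nodes across all loop iterations


def within_pipeline_limits(standard_nodes, iterations=0, nodes_per_iteration=0):
    """Return True if both documented limits are respected.

    Assumption: the two limits are checked independently of each other.
    """
    loop_nodes = iterations * nodes_per_iteration
    return standard_nodes <= MAX_STANDARD_NODES and loop_nodes <= MAX_LOOP_NODES


print(within_pipeline_limits(120, iterations=60, nodes_per_iteration=10))  # True
print(within_pipeline_limits(10, iterations=61, nodes_per_iteration=10))   # False
```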

Input and output size limits

Input and output values, which include pipeline parameters, user variables, and generic node inputs and outputs, cannot exceed 10 KB of data.
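
A value can be checked against this limit before it is passed to a pipeline. The following Python sketch makes two assumptions that the text does not confirm: 10 KB is interpreted as 10 * 1024 bytes, and the size is measured on the UTF-8 encoding of the value. The function name fits_size_limit is hypothetical.

```python
# Assumption: 10 KB means 10 * 1024 bytes, measured on the UTF-8 encoded value.
MAX_VALUE_BYTES = 10 * 1024


def fits_size_limit(value: str) -> bool:
    """Return True if the UTF-8 encoded value does not exceed the assumed limit."""
    return len(value.encode("utf-8")) <= MAX_VALUE_BYTES


print(fits_size_limit("a" * 10240))  # True
print(fits_size_limit("a" * 10241))  # False
```

Characters outside the ASCII range take more than one byte in UTF-8, so the character count alone can underestimate the size.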

Batch input limited to data assets

Currently, input for batch deployment jobs is limited to data assets. This means that certain types of deployments, which require JSON input or multiple files as input, are not supported. For example, SPSS models and Decision Optimization solutions that require multiple files as input are not supported.

Bash scripts throw errors with curl commands

The Bash scripts in your pipelines might cause errors if you implement curl commands in them. To prevent this issue, set your curl commands as parameters. To save a pipeline that causes an error when saving, try exporting the isx file and importing it into a new project.

Issues with Cloud Object Storage

These issues apply to working with Cloud Object Storage.

Issues with Cloud Object Storage when Key Protect is enabled

Key Protect in conjunction with Cloud Object Storage is not supported for working with watsonx.ai Runtime assets. If you are using Key Protect, you might encounter these issues when you are working with assets in watsonx.ai Studio.

  • Training or saving these watsonx.ai Runtime assets might fail:
    • AutoAI
    • Federated Learning
    • Pipelines
  • You might be unable to save an SPSS model or a notebook model to a project

Issues with watsonx.governance

The Action button isn't displayed in a questionnaire assessment

Applies to: Governance console 9.0.0.3
Fixed in: Instances created in 9.0.0.4 or later

What's happening

When you add a question template and link it to a questionnaire assessment, the Action button is not displayed.

To verify the cause of the issue, do the following steps:

  1. Open the Administration menu, and then click Workflow > Questionnaire Assessment Workflow.
  2. Click the arrow that connects Start and Applicability Assessment.
  3. Expand Conditions. Notice that two conditions are defined.

Why it's happening

The default value of the AI Assessment Type field on the questionnaire assessment is Not Determined. But this field value should be empty.

How to fix it

To resolve this issue, you need to make a change so that the questionnaire assessment meets the criteria for the Applicability Assessment workflow stage.

  1. Open the questionnaire assessment and click the Task tab.
  2. Clear the value of the AI Assessment Type field.

Integration limitation with OpenPages

When AI Factsheets is integrated with OpenPages, the fields created in the field groups MRG-UserFacts-Model or MRG-UserFact-Model and MRG-UserFacts-ModelEntry or MRG-UserFact-ModelUseCase are synced to the modelfacts_user_op and model_entry_user_op asset type definitions. However, when you create the fields from the OpenPages application, do not specify the fields as required and do not specify a range of values. If you mark them as required or assign a range of values, the sync will fail.

Delay showing prompt template deployment data in a factsheet

When a deployment is created for a prompt template, the facts for the deployment are not added to the factsheet immediately. You must first evaluate the deployment or view the lifecycle tracking page to add the facts to the factsheet.

Redundant attachment links in factsheet

A factsheet tracks all of the events for an asset over all phases of the lifecycle. Attachments show up in each stage, creating some redundancy in the factsheet.

Attachments for prompt templates are not saved on import or export

If your AI use case contains attachments for a prompt template, the attachments are not preserved when the prompt template asset is exported from a project or imported into a project or space. You must reattach any files following the import operation.