Known issues and limitations for Data Refinery
- Known issues
- Unable to specify target data with an Elasticsearch connection in a Data Refinery flow
- Data Refinery might fail when you upgrade IBM Software Hub to version 5.3.0
- Target table loss and job failure when you use the Update option in a Data Refinery flow
- Google BigQuery connection: TRUNCATE TABLE statement fails in Data Refinery flow jobs
- Limitations
- Changes to interactive code templates
- Data Refinery flows with connections that use vault secrets only work if the user has permission to access the vault secrets.
- Tokenize GUI operation does not work for Data Refinery on R 4.3 with Spark 3.4 or on Default Spark 3.4 + R 4.3 environments
- Error opening a Data Refinery flow
- Error opening a Data Refinery flow with connection with personal credentials
- Data Refinery does not support the Satellite Connector
- Data column headers cannot contain special characters
- Unable to use masked data in visualizations from data assets imported from 4.8 or earlier
- Tokenize GUI operation might not work on large data assets
- Data Refinery does not support Kerberos authentication
Known issues
See also Known issues for Common core services for the issues in other services in Cloud Pak for Data that might affect Data Refinery.
- Unable to specify target data with an Elasticsearch connection in a Data Refinery flow
-
Applies to: 5.3.1 Patch 2
Fixed in: 5.3.2 Patch 4
If you create a Data Refinery flow with an Elasticsearch connection as the target output location, the Next button remains disabled when you try to create a new file within the connection. You are thus prevented from creating the Data Refinery flow.
- Data Refinery might fail when you upgrade IBM® Software Hub to version 5.3.0
-
Applies to: 5.3.0 and later
When you upgrade Watson Studio or IBM Knowledge Catalog to IBM Software Hub version 5.3.0 from an earlier version, Data Refinery might fail to complete and you might get this error:
The conditional check '( "Bound" == test_pvc_info.resources[0].status.phase )' failed. The error was: error while evaluating conditional (( "Bound" == test_pvc_info.resources[0].status.phase )): list object has no element 0.
Workaround: Run the following commands:
1. Retrieve the service broker secret:
   oc get secret zen-service-broker-secret -o jsonpath='{.data.token}' | base64 -d
2. Open a remote shell in an Nginx pod:
   oc -n <cpd_instance_namespace> rsh <ibm-nginx-xxx>
   To list the Nginx pods:
   oc -n <cpd_instance_namespace> get pods | grep ibm-nginx | egrep -v "tester|spark"
3. Get the volume instance ID:
   curl -ks -X GET 'https://internal-nginx-svc.<cpd_instance_namespace>.svc:12443/zen-data/v3/service_instances?fetch_all_instances=true&display_name=<dataplane_instance_namespace>::datarefinerylibvol' --header 'Content-Type: application/json' --header 'secret: <service-broker-secret>' | jq -r '.service_instances[0].id'
   where <service-broker-secret> is the secret retrieved in step 1. Also, replace <cpd_instance_namespace> and <dataplane_instance_namespace> with the names you use on your cluster.
4. Delete the volume instance by using the instance ID:
   curl -k -X DELETE 'https://internal-nginx-svc.<cpd_instance_namespace>.svc:12443/zen-data/v3/service_instances/<instance_id>' --header 'secret: <service-broker-secret>'
   where <instance_id> is the volume service instance ID returned in step 3.
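The workaround steps above can be chained into one helper, sketched here under several assumptions: you are already logged in with oc, jq is installed on the machine where you run the script, and curl is available inside the Nginx pod. The sketch uses oc exec in place of the interactive oc rsh session, which is a substitution and not part of the documented procedure. The function names instances_url, volume_query, and delete_datarefinery_volume are hypothetical and exist only for illustration.

```shell
# Hypothetical helper: base URL of the service instances API for a namespace.
instances_url() {
  printf 'https://internal-nginx-svc.%s.svc:12443/zen-data/v3/service_instances' "$1"
}

# Hypothetical helper: query string that selects the Data Refinery library volume.
volume_query() {
  printf 'fetch_all_instances=true&display_name=%s::datarefinerylibvol' "$1"
}

# Hypothetical wrapper around steps 1 to 4. It is only defined here, never
# called automatically, because the last step deletes the volume instance.
delete_datarefinery_volume() {
  cpd_ns="$1"
  dataplane_ns="$2"

  # Step 1: retrieve the service broker secret (uses the current project,
  # exactly as in the documented command).
  secret="$(oc get secret zen-service-broker-secret -o jsonpath='{.data.token}' | base64 -d)" || return 1

  # Step 2: pick the first Nginx pod, skipping tester and spark pods.
  pod="$(oc -n "$cpd_ns" get pods --no-headers | grep ibm-nginx | grep -Ev 'tester|spark' | awk 'NR==1 {print $1}')"
  [ -n "$pod" ] || { echo "No ibm-nginx pod found in $cpd_ns" >&2; return 1; }

  # Step 3: get the volume instance ID (assumption: curl runs in the pod,
  # jq runs locally on the returned JSON).
  instance_id="$(oc -n "$cpd_ns" exec "$pod" -- curl -ks -X GET \
    "$(instances_url "$cpd_ns")?$(volume_query "$dataplane_ns")" \
    --header 'Content-Type: application/json' \
    --header "secret: $secret" | jq -r '.service_instances[0].id')"
  if [ -z "$instance_id" ] || [ "$instance_id" = "null" ]; then
    echo "No datarefinerylibvol instance found" >&2
    return 1
  fi

  # Step 4: delete the volume instance by using the instance ID.
  oc -n "$cpd_ns" exec "$pod" -- curl -k -X DELETE \
    "$(instances_url "$cpd_ns")/$instance_id" \
    --header "secret: $secret"
}
```

To use the sketch, source it and call delete_datarefinery_volume <cpd_instance_namespace> <dataplane_instance_namespace>. Because the final request is destructive, it might be safer to run steps 1 to 3 by hand first and confirm the instance ID before calling the function.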
- Target table loss and job failure when you use the Update option in a Data Refinery flow
-
Applies to: 5.3.0 and later
Using the Update option for the Write mode target property for relational data sources (for example, Db2) replaces the original target table, and the Data Refinery job might fail.
Workaround: Use the Merge option as the Write mode and Append as the Table action.
- Google BigQuery connection: TRUNCATE TABLE statement fails in Data Refinery flow jobs
-
Applies to: 5.3.0 and later
If you run a Data Refinery flow job with data from a Google BigQuery connection and the DDL includes a TRUNCATE TABLE statement, the job fails.
Limitations
- Changes to interactive code templates
- You can no longer use the following interactive code templates in Data Refinery:
  - mutate_if
  - select_if
  - summarize_if
- Data Refinery flows with connections that use vault secrets only work if the user has permission to access the vault secrets.
-
If the source or target data of a Data Refinery flow uses connections that reference vault secrets, the user running the Data Refinery flow job must have permission to access the vault secrets. Otherwise, you get the following error:
authorization_failed no access to secret.
- Tokenize GUI operation does not work for Data Refinery on R 4.3 with Spark 3.4 or on Default Spark 3.4 + R 4.3 environments
-
Data Refinery flow jobs that include the Tokenize GUI operation do not work in Spark and R environments. Users can use the Default Data Refinery XS environment for small data sets.
- Error opening a Data Refinery flow
-
When you open the Data Refinery user interface, you might get the following error:
The selected data set wasn't loaded. Error occurred while launching the container (retry attempts exceeded).
Workaround: Delete the existing interactive RuntimeAssembly (RTA) as follows:
oc -n <CPD_INSTANCE_NAMESPACE> delete rta -l type=service,component=shaper
or in Segregation of Duty (SoD) mode:
oc -n <DATAPLANE_NAMESPACE> delete rta -l type=service,component=shaper
- Error opening a Data Refinery flow with connection with personal credentials
-
When you open a Data Refinery flow that uses a data asset that is based on a connection with personal credentials, you might see an error.
Workaround: To open a Data Refinery flow that has assets that use connections with personal credentials, you must first unlock the connection. You can unlock the connection either by editing the connection and entering your personal credentials, or by previewing the asset in the project, where you are prompted to enter your personal credentials. After you unlock the connection, you can open the Data Refinery flow.
- Data Refinery does not support the Satellite Connector
-
You cannot use a Satellite Connector to connect to a database with Data Refinery.
- Data column headers cannot contain special characters
- Unable to use masked data in visualizations from data assets that are imported from version 4.8 or earlier
-
Applies to: 5.3.0 and later
- Tokenize GUI operation might not work on large data assets
- Data Refinery does not support Kerberos authentication
- Data Refinery does not support connecting to data with Kerberos authentication.