Known issues and limitations for Data Virtualization

The following known issues and limitations apply to Data Virtualization.

For additional solutions to problems that you might encounter with Data Virtualization, see the troubleshooting topic Troubleshooting the Data Virtualization service.

Administration issues
The following issues impact administrator users in Data Virtualization.
General issues
The following general issues impact Data Virtualization users.
Audit issues
The following issues and limitations apply to auditing in Data Virtualization.
Data source and connection issues and limitations
Installation and upgrade issues
The following issue applies to installing and upgrading Data Virtualization. For more information, see Installing Data Virtualization and Upgrading Data Virtualization.
Data virtualization issues
The following issues and limitations apply to virtualization in Data Virtualization.
User and group management issues
The following issues and limitations apply to user and group management in Data Virtualization. For more information about user and group management, see Virtualizing data.
Data governance issues
The following issues and limitations apply to data governance in Data Virtualization.
Caching issues
The following issues and limitations apply to caching in Data Virtualization.
File issues
The following issues and limitations apply to files in Data Virtualization.

Administration issues

Data Virtualization agent pod logs are missing from the log collection diagnostic job
Applies to: 5.3.0

When you run an IBM Cloud Pak® for Data diagnostic job for Data Virtualization, the diagnostic zip file does not contain logs from the Data Virtualization agent pods.

Workaround: After you provision a Data Virtualization instance, complete these steps to patch the Data Virtualization endpoint so that agent pod logs are collected. Run this patch one time for each Data Virtualization instance, and reapply the patch whenever you re-provision your Data Virtualization instance.
  1. Download a copy of dvendpoint_collect_logs.sh to your local environment:
    oc -n ${DV_INSTANCE_NAMESPACE} cp c-db2u-dv-dvendpoint-0:/opt/ibm/dvendpoint_collect_logs.sh dvendpoint_collect_logs.sh
  2. Update the local copy of dvendpoint_collect_logs.sh:
    sed -i "s/PYTHON_VERSION/3.11/" dvendpoint_collect_logs.sh
  3. Create a configmap in the Data Virtualization instance namespace containing the contents of your local copy of dvendpoint_collect_logs.sh:
    oc -n ${DV_INSTANCE_NAMESPACE} create configmap dvendpoint-collect-logs-sh --from-file=dvendpoint_collect_logs.sh
  4. Patch the Data Virtualization endpoint statefulset:
    oc -n $DV_INSTANCE_NAMESPACE set volume statefulset/c-db2u-dv-dvendpoint \
          --add --name=dvendpoint-collect-logs-sh --type=configmap --containers=dvendpoint \
          --mount-path="/opt/ibm/dvendpoint_collect_logs.sh" \
          --sub-path="dvendpoint_collect_logs.sh" --configmap-name=dvendpoint-collect-logs-sh \
          --default-mode="755" --read-only=false --overwrite
  5. Optional: If you need to unpatch, run this command:
    oc -n $DV_INSTANCE_NAMESPACE set volume statefulset/c-db2u-dv-dvendpoint \
            --remove --name=dvendpoint-collect-logs-sh --containers=dvendpoint \
            --mount-path="/opt/ibm/dvendpoint_collect_logs.sh" \
            --sub-path="dvendpoint_collect_logs.sh" \
            --read-only=false
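To confirm that the patch from step 4 was applied, a hedged check that lists the volume mounts of the dvendpoint container (it uses only the names from the patch commands above):
    oc -n $DV_INSTANCE_NAMESPACE get statefulset c-db2u-dv-dvendpoint \
          -o jsonpath='{.spec.template.spec.containers[?(@.name=="dvendpoint")].volumeMounts}'
The output should include a mount at /opt/ibm/dvendpoint_collect_logs.sh that is backed by the dvendpoint-collect-logs-sh configmap.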

General issues

Unlocking a Data Virtualization connection incorrectly requires an access token after specifying to use your platform login credentials

Applies to: 5.3.0

When you use a Data Virtualization connection in catalogs or projects, you might be prompted to unlock the connection with your own personal credentials. If you select Use my platform login credentials, the Access token input field is incorrectly marked as required, and you are unable to connect.

Workaround: Open the Data Virtualization connection asset explicitly in the current catalog or project (Platform connections) and then select Use my platform login credentials. You will not be prompted to provide an Access token. Save the changes.

Running the NUMTABSREMOTE view for multiple catalog sources returns -1 as the total number of tables

Applies to: 5.3.0

When you run the NUMTABSREMOTE view to retrieve the total number of tables across multiple catalog sources, the output returns -1; this is expected. The console displays xxx/?, where xxx is a placeholder for the number of tables in the selected catalog. For example: 123/?.
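A hedged illustration of how the view might be queried; the DVSYS schema qualifier is an assumption, so adjust it to the schema where the view is defined in your instance:
SELECT * FROM DVSYS.NUMTABSREMOTE;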

Audit issues

Audit entries with multibyte characters are converted to hexadecimal string
Applies to: 5.3.0
If an audited attribute contains multibyte characters, those characters are converted to their hexadecimal equivalents when they are captured in the audit logs. For example, a table named DËMO in schema TEST is logged as "objschema":"TEST","objname":"44C38B4D4F". There is currently no workaround for this issue.
Incomplete audit logging when stopping Big SQL or upon Db2 termination
Applies to: 5.3.0
If you run the bigsql stop command, or if Db2 terminates abruptly (an unhandled crash), any statements or events that are still in the audit buffer but not yet written to disk are not captured in the audit logs. You can configure the buffer size by using the audit_buf_sz configuration parameter.
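A hedged example of checking and increasing the audit buffer size from the Db2 command line; the value of 1000 4-KB pages is illustrative only, and the change typically takes effect only after an instance restart:
db2 get dbm cfg | grep -i audit_buf_sz
db2 update dbm cfg using audit_buf_sz 1000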
Audit runs generate an ignorable error entry
Applies to: 5.3.0
Every audit run generates an additional (false) error entry in the diagnostic logs, indicating a path access permission issue. You can safely ignore this error.
Example of an error that you can ignore:
2025-11-13-13.18.07.454178+000 E6480E972             LEVEL: Error (OS)
PID     : 2109615              TID : 140412667792896 PROC : db2fmp (
INSTANCE: db2inst1             NODE : 000            DB   : BIGSQL
APPID   : *N0.DB2.251030051340
HOSTNAME: c-db2u-dv-db2u-0
FUNCTION: DB2 UDB, oper system services, sqloQualifyPath, probe:20
MESSAGE : ZRC=0x870F0152=-2029059758=SQLO_RC_NOT_SET
          "A return code from a function that conditionally does not set one has been inappropriately consumed."
CALLED  : OS, -, realpath                         OSERR: EACCES (13)
DATA #1 : Codepath, 8 bytes
0
DATA #2 : signed integer, 4 bytes
2
DATA #3 : String, 53 bytes
/mnt/logs/audit/temp_del_path/stmt_20251113131807.del
DATA #4 : unsigned integer, 8 bytes
4096
DATA #5 : String, 53 bytes
/mnt/logs/audit/temp_del_path/stmt_20251113131807.del
DATA #6 : String, 105 bytes
Search for ossError*Analysis probe point after this log entry for further
self-diagnosis of this problem.
Audit log entries might not be forwarded to the zen-audit service

Applies to: 5.3.0

You might lose log entries when the log-streaming pod crashes or the audit log PVC is experiencing issues, resulting in logs not being forwarded to the zen-audit service.

Data source and connection issues

For information about limitations and known issues that apply to data sources and connections, see Limitations and known issues for data sources in Data Virtualization.

Data virtualization issues

The List view might display additional tables
Applies to: 5.3.0

The List view on the Virtualize page might show tables that you do not have access to, in addition to all of the tables that you do have access to. However, you cannot preview or virtualize the tables that you do not have access to.

The List view might not display the column name or type of some tables
Applies to: 5.3.0
The column name and type metadata of some tables might not be displayed in the List view. However, you can still virtualize those tables and modify the metadata of the virtualized table by selecting Edit columns from the Review cart and Virtualize tables page.
Unable to create the schema in the Shopping Cart page after dropping the same schema
Applies to: 5.3.0
If you previously dropped a schema, you cannot create a schema with the same name on the Shopping Cart page.
Workaround: In RunSQL, run these commands together:
SET SYSHADOOP.CATALOG_SYNC_MODE='N';
CREATE SCHEMA ${SCHEMA_NAME};
SET SYSHADOOP.CATALOG_SYNC_MODE='Y';
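For example, a sketch with a hypothetical schema named SALES_VT substituted for ${SCHEMA_NAME}:
SET SYSHADOOP.CATALOG_SYNC_MODE='N';
CREATE SCHEMA SALES_VT;
SET SYSHADOOP.CATALOG_SYNC_MODE='Y';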
Tables are not published to catalog after virtualization
Applies to: 5.3.0
When you add a table to the cart and then virtualize it, the virtualization completes after a period of time, but the table is not published to the catalog. In addition, this error appears: The operation has timed out.
Workaround: To publish the table to the catalog, select the table from the Virtualized data page, and then attempt to publish it to the default catalog.
The List view on the Virtualize page does not consistently display tables or views
Applies to: 5.3.0
For data sources that have personal credentials enabled, the List view does not accurately display the tables or views that you have access to. This issue occurs because the List view displays the tables or views that are visible to the CCDEFINER user.
Workaround: Use the Explore view to view the tables and views that you have access to.
The virtualization status might be stuck on the Virtualized data page even after virtualization is complete

Applies to: 5.3.0

When you navigate to the Virtualize page and then virtualize an object in the Files tab, the virtualization status for that object on the Virtualized data page might be stuck in Started even after the process completes.

Workaround: Refresh the Virtualized data page.

Recreating virtualized tables after using the DROP NICKNAME SQL statement results in object already exists error

Applies to: 5.3.0

When you use the DROP NICKNAME SQL statement to drop tables that have a defined nickname, attempting to recreate and virtualize the tables in the Review cart and virtualize tables page results in an Object already exists error.
Workaround: Use the delete API to remove all of the data that is associated with the virtualized tables, and then virtualize the tables again by using the same nickname. The delete API does not indicate whether it ran successfully, and you might encounter an error; regardless, attempt to virtualize the tables again. To use the delete API, you need an authorization token. For more information, see Delete virtualized data and Get authorization token.
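A hedged curl sketch of that flow; the host, credentials, and the delete endpoint path are placeholders, so use the exact endpoints that are documented in Get authorization token and Delete virtualized data:
# Request a platform authorization token from the Cloud Pak for Data authorization API.
curl -k -X POST "https://<CPD_HOST>/icp4d-api/v1/authorize" \
  -H "Content-Type: application/json" \
  -d '{"username": "<USERNAME>", "password": "<PASSWORD>"}'
# Pass the returned token as a bearer token to the delete API.
# The path below is a placeholder; substitute the documented Delete virtualized data endpoint.
curl -k -X DELETE "https://<CPD_HOST>/<delete-virtualized-data-endpoint>" \
  -H "Authorization: Bearer <TOKEN>"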
Data that is virtualized with double-byte coded character sets and special characters might appear truncated

Applies to: 5.3.0

When you virtualize a table with a double-byte coded character set and special characters, such as Swedish characters, data might appear to be cut off.

Workaround: When virtualizing a table, use the Edit columns menu to extend the field lengths for those columns.

Virtualizing a table with many columns or long column names might fail

Applies to: 5.3.0

Virtualizing some tables in Data Virtualization might fail with a Virtual table creation failed message. This can happen for tables that contain columns with large STRING and CLOB data. Virtualizing the table after reducing the column width, or even choosing just one of the columns, still fails with an error message.

Workaround:
  1. Set the remote schema filter to a schema where the large table does not exist so that it does not get cached.
  2. Create the virtual table with a direct CREATE OR REPLACE NICKNAME statement that includes the SOURCELIST definition option, as shown in the following example. The SOURCELIST value describes the remote location of the table to be virtualized in the format <CID>:<RemoteSchema>. The CID is the connection ID for the source; it can be found in the source details on the Data sources page, or in the result of the select * from dvsys.listrdbc query.
    /* IBM_DVSYS */ CREATE OR REPLACE NICKNAME "ADMIN"."VERY_LARGE_TABLE" for QPLEX.gaiandb."VERY_LARGE_TABLE"
        ( 
            VERY_LONG_COLUMN_IDENTIFIER1,
            VERY_LONG_COLUMN_IDENTIFIER2,
            VERY_LONG_COLUMN_IDENTIFIER3,
            ...
            VERY_LONG_COLUMN_IDENTIFIER999,
            VERY_LONG_COLUMN_IDENTIFIER1000
    ) OPTIONS(SOURCELIST 'DB210000:DRV');
    
    An example of an SQL command to create a large virtual table.
  3. Query individual columns of the large virtual table.
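    A minimal sketch, reusing the hypothetical nickname and column identifiers from the previous step:
    SELECT VERY_LONG_COLUMN_IDENTIFIER1,
           VERY_LONG_COLUMN_IDENTIFIER2
    FROM "ADMIN"."VERY_LARGE_TABLE"
    FETCH FIRST 10 ROWS ONLY;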

    An example of an SQL statement to query a large table.

  4. Run aggregations and most other queries.
    Note:

    You cannot select all columns for the large table with a SELECT * statement. You see an error message that is similar to the following message.

    SQL5105N  The statement failed because a Big SQL component encountered an 
    error.  Component receiving the error: "DV-FMP".  Component returning the 
    error: "Virtualization Connector".  Log entry identifier: "GAI-001-NA".  
    Reason: "".  SQLSTATE=58040
    

Installation and upgrade issues

listnodes does not display remote connectors after upgrade from version 5.2.0 to 5.3.0

Applies to: 5.3.0

After you upgrade to IBM Cloud Pak for Data version 5.3.0, the remote connectors that are associated with the Data Virtualization instance do not show up in listnodes when you run select node_name from dvsys.listnodes; only the qpagents are displayed. Upgrading the remote connectors by using the stored procedure causes the following error, which prevents the remote connectors from being updated automatically.

Example error:

[db2inst1@c-db2u-dv-db2u-0 - Db2U ~]$ db2 "call dvsys.updateremoteconnector('apiocp418fips1ucpfyreibmcom:6556',?,?)"

  Value of output parameters
  --------------------------
  Parameter Name  : NUM_UPGRADED
  Parameter Value : 0

  Parameter Name  : DIAGS
  Parameter Value : null,null, failed with The number of columns in the derived column list must match the number of columns in table 'GQ'.;

  Return Status = 0
[db2inst1@c-db2u-dv-db2u-0 - Db2U ~]$
Workaround: Update the remote connectors manually by completing these steps.
  1. On the remote connector machine, run the following command from the sysroot sub-directory inside of your remote connector directory:
    killGaianServers.sh
  2. Create a new remote connector install script from the IBM Cloud Pak for Data console. Use a temporary directory on the Remote Connector machine as your target install directory for this step.
  3. Run the newly created remote connector install script. This script downloads the latest remote connector release and installs it in your chosen temporary directory.
  4. Promptly stop the new remote connector:
    Linux
    On Linux, run the killGaianServer.sh script.
    Windows
    On Windows, run the uninstall_service.bat file.
  5. Copy the contents of sysroot/lib in your new temporary directory to sysroot/lib in your original remote connector directory to update the old remote connector jars (a shell sketch follows these steps).
  6. Start your original remote connector:
    Linux
    On Linux, run this command:
    nohup datavirtualization_start.sh
    Windows
    On Windows, start the DataVirtualizationService service. For example: DataVirtualizationService6414.
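A hedged shell sketch of step 5; <TEMP_INSTALL_DIR> and <ORIGINAL_RC_DIR> are placeholders for your temporary install directory and your original remote connector directory:
cp -p <TEMP_INSTALL_DIR>/sysroot/lib/* <ORIGINAL_RC_DIR>/sysroot/lib/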
Upgrading your Data Virtualization instance from IBM Cloud Pak for Data versions 4.8 and 5.1.2 displays an Error updating Db2 database error

Applies to: 5.3.0

When you attempt to upgrade your Data Virtualization instance to 5.3.0, the upgrade fails and the head pod displays an Error updating Db2 database error similar to this example:
2025-06-07T18:53:46.044414567Z MESSAGE: Failed to create module SYSIBMADM.IDAX_MESS_P_RAISE_MESSAGE so forced to recreate all objects in the array
2025-06-07T18:53:46.044414567Z MESSAGE: Error creating module: SYSIBMADM.IDAX_MESS_P_RAISE_MESSAGE
... ...
2025-06-07T18:53:46.156792202Z The database update (db2updv121) exited with a non-zero return-code 72, See /db2u/tmp/db2updv.BIGSQL.20250607184055.log for more details ...
2025-06-07T18:53:46.156792202Z + return 0
2025-06-07T18:53:46.156792202Z + exit 72
2025-06-07T18:53:46.159809670Z + [[ 72 -ne 0 ]]
2025-06-07T18:53:46.159809670Z + log_echo bigsql_update_or_upgrade_db 'Error updating Db2 database'
Additionally, a Db2 query indicates that there are no routines with names that match IDAX_MESS:
[db2inst1@c-db2u-dv-db2u-0 - Db2U ~]$ db2 "select varchar(routineschema,10)routineschema, varchar(routinename,25)routinename, varchar(specificname,35)specificname from syscat.routines where routinemodulename = 'IDAX' and (routinename like 'IDAX_MESS%' or specificname like 'IDAX_MESS%')"

ROUTINESCHEMA ROUTINENAME               SPECIFICNAME
------------- ------------------------- -----------------------------------

  0 record(s) selected.
However, the worker pods are upgraded successfully:
2025-06-07T18:36:04.195966791Z + echo -e '/opt/dv/current/db2u-dv-setup.sh 2025-06-07_18.36.04.191_UTC  INFO db2u-dv-setup,PHASE:end,MSG:Start up DV on c-db2u-dv-db2u-1.c-db2u-dv-db2u-internal.thanos.svc.cluster.local'
2025-06-07T18:36:04.200718069Z + ec=0
Workaround: Scale the Data Virtualization head and worker pods down and then back up to restart them and resume the upgrade. Replace instances of <DV_INSTANCE_NAMESPACE> with your Data Virtualization instance namespace.
  1. Obtain the current number of replicas for the Data Virtualization head and worker statefulset (sts):
    DV_STS_REPLICAS=$(oc -n <DV_INSTANCE_NAMESPACE> get sts c-db2u-dv-db2u -ojsonpath={.spec.replicas})
  2. Scale down the Data Virtualization head and worker pods:
    oc -n <DV_INSTANCE_NAMESPACE> scale sts c-db2u-dv-db2u --replicas=0
    Wait for all of the c-db2u-dv-db2u-xxx pods to delete.
  3. Scale up the Data Virtualization head and worker pods to the replicas that you obtained from the first step:
    oc -n <DV_INSTANCE_NAMESPACE> scale sts c-db2u-dv-db2u --replicas=$DV_STS_REPLICAS
    The Data Virtualization head and worker pods restart and the Data Virtualization upgrade resumes automatically.
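A hedged way to confirm the restart is to list the head and worker pods and watch them return to the Running state:
oc -n <DV_INSTANCE_NAMESPACE> get pods | grep c-db2u-dv-db2u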
Data Virtualization passthrough route c-db2u-dv-db2u is missing after upgrade

Applies to: 5.3.0

When you upgrade your Data Virtualization instance, the passthrough route is missing and Data Virtualization displays the error FAIL: route c-db2u-dv-db2u does not exist in namespace zen.
The error might resemble this example:
=======================================================================
dv-validate.sh : 12. Check able to connect to DV port
=======================================================================

Tue Jun 3 08:02:02 UTC 2025 12. Check able to connect to DV SSL port

$ oc get --namespace zen -o jsonpath='{.spec.host}' route c-db2u-dv-db2u
Error from server (NotFound): routes.route.openshift.io "c-db2u-dv-db2u" not found
dv-validate.sh : FAIL: route c-db2u-dv-db2u does not exist in namespace zen
$ date
Tue Jun  3 08:02:03 UTC 2025
Symptoms: Running the following command does not return a route. Replace <DV_INSTANCE_NAMESPACE> with your Data Virtualization instance namespace.
oc -n <DV_INSTANCE_NAMESPACE> get route c-db2u-dv-db2u
Workaround: To work around this issue, either use a NodePort instead of the passthrough route, or manually create the route after the Data Virtualization instance upgrade. For more information about NodePort, see Configuring network requirements.

To manually create the route, complete these steps:

  1. Set the DV_NAMESPACE variable with the OpenShift® namespace that contains the Data Virtualization instance.
  2. Set the EXTERNAL_HOST variable with the external host name accessible to the end users.
  3. Run this command to obtain the Data Virtualization formation ID.
    DV_FORMATION_UID=$(oc -n $DV_NAMESPACE get formation db2u-dv -o jsonpath="{.metadata.uid}")
  4. Run this command to create the route in the same namespace as you set in the DV_NAMESPACE variable. The name of the route is c-db2u-dv-db2u.
    cat <<EOF | oc apply -f -
    apiVersion: route.openshift.io/v1
    kind: Route
    metadata:
      annotations:
        openshift.io/host.generated: "true"
      labels:
        app.kubernetes.io/component: cr
        app.kubernetes.io/name: db2u-dv
        icpdsupport/addOnId: dv
        icpdsupport/module: instance
      name: c-db2u-dv-db2u
      namespace: $DV_NAMESPACE
      ownerReferences:
      - apiVersion: db2u.databases.ibm.com/v1
        kind: Formation
        name: db2u-dv
        uid: $DV_FORMATION_UID
    spec:
      host: c-db2u-dv-db2u-$DV_NAMESPACE.$EXTERNAL_HOST
      port:
        targetPort: 50001
      tls:
        termination: passthrough
      to:
        kind: Service
        name: c-db2u-dv-dv-head-svc
        weight: 100
      wildcardPolicy: None
    EOF
  5. Verify the new route by running this command:
    oc -n $DV_NAMESPACE get route c-db2u-dv-db2u 
    This command should return one route.
Upgrading remote connectors from Cloud Pak for Data version 4.8.8 fails with SQL4302N error

Applies to: 5.3.0

When you upgrade your Data Virtualization remote connectors from Cloud Pak for Data version 4.8.8 to 5.3.0, the dvsys.updateremoteconnector() procedure fails and displays the SQL4302N error.

This issue occurs because the jars in the /mnt/blumeta0/home/db2inst1/sqllib/function directory are not upgraded correctly.

The error might resemble this example:
select node_name,agent_class from dvsys.listnodes where agent_class = 'R' 

NODE_NAME                                                                        AGENT_CLASS         
-------------------------------------------------------------------------------- --------------------
apiocp455fips6rcpfyreibmcom:6556                                                 R                   

  1 record(s) selected.


call dvsys.updateremoteconnector('',?,?) 
SQL4302N  Procedure or user-defined function "DVSYS.UPDATEREMOTECONN", 
specific name "DV_UPDATEREMOTEGAIANNO" aborted with an exception 
"com/ibm/gaiandb/GaianDBC".  SQLSTATE=38501

select node_name,agent_class from dvsys.listnodes where agent_class = 'R' 

NODE_NAME                                                                        AGENT_CLASS         
-------------------------------------------------------------------------------- --------------------
apiocp455fips6rcpfyreibmcom:6556                                                 R                   

  1 record(s) selected.
Workaround:
  1. Log into the Data Virtualization head pod and switch to the db2inst1 user:
    oc -n ${DV_INSTANCE_NAMESPACE} rsh c-db2u-dv-db2u-0 bash
    su - db2inst1
  2. In the sqllib/function directory, complete the following steps (a shell sketch follows this procedure):
    1. Delete the dv_caching_predict.py file.
    2. Delete the model_rr_d3_d3_1g.pkl file.
    3. Delete all of the jar files except for the BIDDL.jar file.
  3. Restart Db2 by running these commands:
    1. db2stop force
    2. db2start
  4. On the remote connector host, install Java 21 from IBM Semeru Runtimes Downloads.
  5. Open the datavirtualization.env file, and replace the existing JAVA_HOME path with the new Java 21 path.
  6. In the Data Virtualization head pod, as the db2inst1 user, run the dvsys.updateremoteconnector() stored procedure. For example:
    db2 "call dvsys.updateremoteconnector('',?,?)"
  7. On the remote connector host, navigate to the sysroot/data directory and complete the following:
    1. In the derby.properties file, change the derby.authentication.provider property to com.ibm.gaiandb.queryplex.DVAuthenticator.
    2. In the gaiandb_config.properties file, change the EXTENDER_CLASS_NAME property to com.ibm.gaiandb.queryplex.ConfluenceExtender.
  8. Start the remote connector:
    • Linux
      On Linux, run this command:
      nohup datavirtualization_start.sh
    • Windows
      On Windows, start the DataVirtualizationService service. For example: DataVirtualizationService6414.
  9. In the Data Virtualization head pod as the db2inst1 user, verify the remote connector by running these commands:
    Note: In addition, queries that you created using the remote connector before the upgrade should now work again.
    1. db2 connect to bigsql
    2. db2  "select node_name from dvsys.listnodes"
      The result displays the endpoints and the remote connector nodes, similar to this example:
      NODE_NAME                                                                       
      --------------------------------------------------------------------------------
      AdminNode                                                                       
      apiocp455b1ccpfyreibmcom:6556                                                   
      qpendpoint_1:6415                                                               
      qpendpoint_2:6416                                                               
      qpendpoint_3:6417                                                               
      qpendpoint_4:6418                                                               
      qpendpoint_5:6419                                                               
      
        7 record(s) selected.
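A hedged shell sketch of step 2, run as the db2inst1 user inside the head pod; double-check the directory contents before you delete anything:
# The function directory that is named earlier in this issue
cd /mnt/blumeta0/home/db2inst1/sqllib/function
rm -f dv_caching_predict.py model_rr_d3_d3_1g.pkl
# Remove every jar file except BIDDL.jar
find . -maxdepth 1 -name "*.jar" ! -name "BIDDL.jar" -delete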
Upgrading remote connectors from Cloud Pak for Data version 5.0 fails with Failed to upgrade GaianDB jars null error

Applies to: 5.3.0

When you upgrade your Data Virtualization remote connectors from Cloud Pak for Data version 5.0, the upgrade fails with the Failed to upgrade GaianDB jars null error.

The error resembles the following example:
select node_name,agent_class from dvsys.listnodes where agent_class = 'R' 

NODE_NAME                                                                        AGENT_CLASS         
-------------------------------------------------------------------------------- --------------------
apiocp417fips6ecpfyreibmcom:6556                                                 R                   

  1 record(s) selected.


call dvsys.updateremoteconnector('',?,?) 

  Value of output parameters
  --------------------------
  Parameter Name  : NUM_UPGRADED
  Parameter Value : 0

  Parameter Name  : DIAGS
  Parameter Value : apiocp417fips6ecpfyreibmcom:6556, failed with The exception 'java.lang.Exception: Failed to upgrade GaianDB jars null' was thrown while evaluating an expression.;

  Return Status = 0

select node_name,agent_class from dvsys.listnodes where agent_class = 'R' 

NODE_NAME                                                                        AGENT_CLASS         
-------------------------------------------------------------------------------- --------------------

  0 record(s) selected.
Workaround: Complete these steps to work around the issue.
  1. On the remote connector machine, run the following command from the sysroot sub-directory inside of your remote connector directory.
    killGaianServers.sh
  2. Create a new remote connector install script from the Cloud Pak for Data console. Use a temporary directory on the Remote Connector machine as your target install directory for this step.
  3. Run the newly created remote connector install script. This script downloads the latest remote connector release and installs it in your chosen temporary directory.
  4. Promptly stop the new remote connector.
    • Linux

      On Linux, run the killGaianServer.sh script.

    • Windows
      On Windows, run the uninstall_service.bat file.
  5. Copy the contents of sysroot/lib in your new temporary directory to your original remote connector directory sysroot/lib to update the old remote connector jars.
  6. Start your original remote connector.
    • Linux
      On Linux, run this command:
      nohup datavirtualization_start.sh
    • Windows
      On Windows, start the DataVirtualizationService service. For example: DataVirtualizationService6414.
Upgrading existing Cloud data source connections to Data Virtualization on Cloud Pak for Data version 5.3.0 causes certificate errors

Applies to: 5.3.0

When you upgrade an existing Cloud data source connection from a release prior to 5.1.0 to Data Virtualization on Cloud Pak for Data version 5.3.0, you might encounter this error stating that the certificate cannot be found:
... Reason: /opt/ibm/db2/V11.5.0.0/java/jdk64/jre/lib/security/cacerts (No such file or directory)
In releases prior to version 5.1.0, certificates that were required to connect to an external data source were stored in the default Java trust store. In version 5.1.0, additional security requirements (including FIPS support) require certificates to be stored in trust stores specific to each data source connection. As a result, when you upgrade to version 5.3.0, existing connections might continue to point to the old Java trust store instead of the required trust store for each data source.

Workaround: To work around this issue, complete these steps.
  1. Run this stored procedure to recreate all the connections using the newly updated connection information.
    call DVSYS.RESTORECONFIG('',?,?)
    Ensure that all of the connections (CIDs) are listed as AVAILABLE in the web client. If this step does not fix the problem, or if some connections remain UNAVAILABLE, proceed to the next step.
  2. Conditional step: If the previous step did not fix the problem, then complete this conditional step in the RunSQL console.
    1. Run this query to retrieve the CID of all UNAVAILABLE connections.
      select CID from DVSYS.listrdbc where cstatus='UNAVAILABLE';
      Take a note of the CID for the next step.
    2. Run this query to retrieve the CCID, USR and PWD. Replace <CID> with the CID from the previous step.
      Note: You must repeat these steps for each UNAVAILABLE connection.
      select * from dvsys.connection_details where cid='<CID>'
      For example:
      select * from dvsys.connection_details where cid='DB210000'
      If the connection uses vaulted credentials, the output displays the user and password masked using the string {SECRET_REF}.
    3. Run the following setRdbcX stored procedure. Ensure that you replace the <SRC_TYPE>, <HOST_NAME>, <DATABASE_PORT>, <DATABASE_NAME>, <USR>, <PWD>, <USE_SSL>, and <CCID> values with the values from the connection_details table, and the associated <NODE_NAME> and <CID> values from the first step.

      You must also add additional options including EDITCID and CCID in <ADDITIONAL_OPTIONS>. See the following example.

      Note: If the connection does not use vaulted credentials, then you must explicitly replace <USR> and <PWD> with the correct user and password. This requires a user with the Data Virtualization Admin role.
      call DVSYS.setRdbcX('<SRC_TYPE>','<HOST_NAME>','<DATABASE_PORT>','<DATABASE_NAME>','','<USR>','<PWD>','<USE_SSL>','<VALIDATE_CERT>','','','<NODE_NAME>','<ADDITIONAL_OPTIONS>',?,?,?);
      For example:
      call DVSYS.setRdbcX('DB2','924186d3-ec79-4daf-aa77-3297081a929b.zu9tc4nd0urletsa6ufg.ds.app.cloud','31101','bludb','','{SECRET_REF}','{SECRET_REF}', '1', '0', '', '', 'qpendpoint_3:6417', 'EDITCID=DB210000,CCID=85b3d676-f627-4f57-a8fd-aad00782c2de',?,?,?) 
      The connection is recreated successfully.
Upgrading remote connectors from IBM Cloud Pak for Data versions prior to 5.1.0 causes an Unable to initialise GaianNode data (aborting) error

Applies to: 5.3.0

When you upgrade remote connectors to IBM Cloud Pak for Data version 5.3.0 from versions prior to 5.1.0 by using the DVSYS.UPDATEREMOTECONNECTOR stored procedure, the remote connectors do not start and are not displayed in the output of select node_name from dvsys.listnodes. An error message similar to the following example can be found in <Remote_Connector_Install_Path>/sysroot/data/gaiandb-stdout.log.

Example error message:

Unable to initialise GaianNode data (aborting): Another GaianNode or Derby instance may be attached to db folder 'gaiandb'? Otherwise, authentication may have failed or the database may be corrupted (delete it & restart to recycle it). Also check optional messages below.
java.sql.SQLException: java.sql.SQLException: Failed to start database 'gaiandb6556' with class loader jdk.internal.loader.ClassLoaders$AppClassLoader@f1fd8dc9, see the next exception for details. SQL: CREATE FUNCTION maintainConnection2 (nodeid VARCHAR(100), usr VARCHAR(10), pwd VARCHAR(10), extraInfo VARCHAR(32672)) RETURNS VARCHAR(32672) PARAMETER STYLE JAVA LANGUAGE JAVA NO SQL EXTERNAL NAME 'com.ibm.gaiandb.GaianDBConfigProcedures.maintainConnection2'
Workaround:
  1. Change the following properties in the gaiandb_config.properties and derby.properties files. Both files are located in the <Remote_Connector_Install_Path>/sysroot/data directory (a sed sketch follows these steps).
    • In the gaiandb_config.properties file, replace EXTENDER_CLASS_NAME=com.ibm.gaiandb.confluence.ConfluenceExtender with this property:
      EXTENDER_CLASS_NAME=com.ibm.gaiandb.queryplex.ConfluenceExtender
    • In the derby.properties file, replace derby.authentication.provider=com.ibm.gaiandb.confluence.DVAuthenticator with this property:
      derby.authentication.provider=com.ibm.gaiandb.queryplex.DVAuthenticator
  2. Restart the Remote connector by completing one of the following options.
    • Linux
      For Linux systems, run this command to start the Remote connector from the directory <Remote_Connector_Install_Path>/sysroot:
      nohup ./datavirtualization_start.sh &
    • Windows
      For Windows systems, open the Microsoft Windows Services console (or services.msc), and then restart the Remote connector. Your Remote connector is named DataVirtualizationService<port> .

      For example, the Remote connector is named DataVirtualizationService6414, and you would restart this service in the Microsoft Windows Services console.
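A hedged sed sketch of step 1 for Linux systems; the -i.bak option keeps backup copies, and <Remote_Connector_Install_Path> is a placeholder for your actual path:
cd <Remote_Connector_Install_Path>/sysroot/data
sed -i.bak 's/EXTENDER_CLASS_NAME=com\.ibm\.gaiandb\.confluence\.ConfluenceExtender/EXTENDER_CLASS_NAME=com.ibm.gaiandb.queryplex.ConfluenceExtender/' gaiandb_config.properties
sed -i.bak 's/derby\.authentication\.provider=com\.ibm\.gaiandb\.confluence\.DVAuthenticator/derby.authentication.provider=com.ibm.gaiandb.queryplex.DVAuthenticator/' derby.properties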

Db2REST pod in CrashLoopBackOff state

Applies to: 5.3.0

You cannot connect to the Db2 REST server. After you enable REST, the REST pod enters the CrashLoopBackOff state and repeatedly restarts without ever starting successfully.
You see the following log:
chmod: cannot access '/db2u/tmp/os_envar_configmap': No such file or directory
/db2u/scripts/include/common_functions.sh: line 23: /db2u/tmp/os_envar_configmap: No such file or directory
ls: cannot access '/opt/ibm/dbrest/auth': No such file or directory
......+.+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*....+......+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*............+..+.+.....+.......+...........+....+..+....+..............+...+...+.......+...+............+...+....................+.+...+..+.........+....+.......................+............+...+....+........+.+.....+.+.....+....+...............+.....+.+.....+......+....+..................+......+..+....+...+..+....+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
...........+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*..........+.........+.....+......+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*.+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-----
Starting up IBM Db2 REST container...
Starting the Db2 REST server...
Starting server with https
/opt/ibm/dbrest/scripts/../bin/db2rest-server: error while loading shared libraries: libcrypt.so.1: cannot open shared object file: No such file or directory
SERVER STATUS: INACTIVE
#############################################################
###  IBM Db2 REST container was started successfully   ###
#############################################################
* If you used docker/oc/kubectl logs to monitor progress, detach from the
console by pressing Ctrl+C.
* To get a command line prompt, issue the following command from the host:
    [ docker | oc | kubectl -n <NS> ] exec -it <rest container name> bash
#############################################################
Workaround: Complete the following steps to resolve this issue:
  1. Find your affected db2ucluster resource by running the following command:
    oc get db2ucluster
    Refer to the following example output:
    NAME      STATE      AGE
    db2u-dv   Ready      34m
  2. Disable REST for your db2ucluster.
    1. Run the following command to edit your db2ucluster:
      oc edit db2ucluster deployment_ID
    2. Set rest to false in your CR. Refer to the following example:
      spec:
        ...
        addOns:
          rest:
            enabled: false
  3. Save your db2u-release Config Map:
    oc get cm db2u-release -n ${PROJECT_CPD_INST_OPERATORS} -o yaml > db2u-release.yaml
  4. Extract, then replace your REST images.
    1. Extract your Db2 11.5.9.0 REST image:
      REST_IMAGE_1159=$(oc get cm db2u-release -n ${PROJECT_CPD_INST_OPERATORS} -o=jsonpath='{.data.json}' | jq '.databases.db2u["11.5.9.0"].images.rest')
    2. Extract your Db2 12.1.0.0 REST image:
      REST_IMAGE_CN1=$(oc get cm db2u-release -n ${PROJECT_CPD_INST_OPERATORS} -o=jsonpath='{.data.json}' | jq '.databases.db2u["12.1.0.0"].images.rest')
    3. Replace your Db2 12.1.0.0 REST image with your 11.5.9.0 REST image:
      sed -i.bak "s|\"rest\": $REST_IMAGE_CN1|\"rest\": $REST_IMAGE_1159|g" db2u-release.yaml
  5. Apply your changes to your ConfigMap:
    oc apply -f db2u-release.yaml -n ${PROJECT_CPD_INST_OPERATORS}
  6. Delete your ConfigMap YAML file:
    rm db2u-release.yaml
  7. Enable REST for your db2ucluster.
    1. Run the following command to edit your db2ucluster:
      oc edit db2ucluster deployment_ID
    2. Set rest to true in your CR. Refer to the following example:
      spec:
        ...
        addOns:
          rest:
            enabled: true
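After you re-enable REST, a hedged check that the REST pod has left the CrashLoopBackOff state (the grep pattern is an assumption about the pod name, so adjust it for your instance):
oc -n <DV_INSTANCE_NAMESPACE> get pods | grep -i rest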

User and group management issues

Privileges and authorities that are granted to user groups are not considered when you create views

Applies to: 5.3.0

Privileges and authorities that are granted to user groups are not considered when you create views. This limitation is a result of a Db2 limitation on groups.

Workaround: To work around this issue, instead of granting access to specific groups from the Manage access page, grant access to the public by selecting All data virtualization users. Then, create a data protection rule to define Allow or Deny rules that are based on groups.

Data governance issues

Cannot create caches after changing a shared credentials connection to personal credentials
Applies to: 5.3.0

If you change a data source connection that uses shared credentials to use personal credentials through USER_PERSONAL_CREDENTIAL=TRUE, then Data Virtualization cannot create caches (both user-defined and auto-cache) for virtual objects based on that connection. There is currently no workaround for this issue.

Connections incorrectly appear to use personal credentials when they use shared credentials
Applies to: 5.3.0

If you change a connection that uses personal credentials (USER_PERSONAL_CREDENTIAL=TRUE) back to shared credentials, the connection might incorrectly appear in the web client as a personal credentials connection even though it now uses shared credentials. There is currently no workaround for this issue.

RCAC data protection rules are not applied to a view if both Deny access and masking IBM Knowledge Catalog rules are applied to the table referenced by the view
Applies to: 5.3.0

If a table has both the Deny access rule and external RCAC (column and row filters) IBM Knowledge Catalog data protection rules applied to it, then the RCAC rules are not applied to any views that reference that table, leaving the view unmasked.

Workaround:

Use Data Virtualization Db2 authorizations to deny access and define IBM Knowledge Catalog data protection rules to mask data and filter rows instead of doing both through IBM Knowledge Catalog data protection rules. For more information, see Authorization model for views.

You cannot apply business terms when you virtualize files in data sources on the Files tab

Applies to: 5.3.0

When you virtualize files in Data > Data virtualization > Virtualize, the Business terms column is not available for data sources on the Files tab. These data sources do not support business terms.

You cannot see business term assignments or grouped tables in the Explore view

Applies to: 5.3.0

On the Virtualize page, you cannot see business term assignments and you cannot see grouped tables.

Workaround: Switch to the List view to see business term assignments and grouped tables.

Cannot see list of available tables in the strict virtualization mode

Applies to: 5.3.0

In strict virtualization mode (where you can see tables only if they have at least one column with a business term assignment), when you navigate to the Virtualize page, the console appears to load the table list for a while without showing any tables. Loading can be much slower than in the default virtualization mode because the console evaluates the list of eligible tables that can be virtualized based on the term assignments to data source table and column names.

A virtualized object cannot be used in Cognos Dashboards without credentials and an appropriate role

Applies to: 5.3.0

Using a Data Virtualization virtualized object in Cognos Dashboards displays an error message if you do not enter credentials for the Data Virtualization connection or if you do not have the correct role.

If you did not enter a username and password when you created the connection to the Data Virtualization data source, you see the following error: Missing personal credentials. If you are not assigned the Admin or Steward role for this object, you see the following error: Unable to access this asset.

To work around this issue, see A virtualized object cannot be used in Cognos Dashboards without credentials and an appropriate role in Data Virtualization.

Caching issues

Online backup of Data Virtualization can affect cache refreshes

Applies to: 5.3.0

If you schedule an online backup in Data Virtualization and if at the same time a cache refresh operation runs, the cache write operation can fail.

Online backup in Data Virtualization works by quiescing database write and update operations. Cache refresh is a write operation that updates Data Virtualization caches with fresh data from remote data sources. Both online backup and cache refresh operations can be scheduled to run periodically. These schedules can conflict: when a cache tries to refresh, an online backup might already be in progress, or one might start while the cache refresh is running. In these scenarios, cache refresh operations can fail with an error similar to the following message. The cache remains active, but it contains stale data.

The statement failed because a Big SQL component encountered an error. 
Component receiving the error: "BigSQL IO". Component returning the error: "UNKNOWN". 
Log entry identifier: "[BSL-1-b1bf428f]". Reason: "Cannot add block to /warehouse".. 
SQLCODE=-5105, SQLSTATE=58040, DRIVER=4.27.25

Cache auto-refresh might also fail; you can either refresh the cache manually or wait for the next auto-refresh. However, if the online backup schedule conflicts with a cache's refresh schedule, every auto-refresh of that cache fails.

Workaround: Reschedule the cache's refresh or the online backup schedule to avoid conflicts. You can also refresh the cache manually at any time.

File issues

You cannot preview long string values in headings in CSV, TSV, or Excel files

Applies to: 5.3.0

When you use the first row as column headings, the string values in that row must not exceed the maximum Db2 identifier length of 128 characters and cannot be duplicated. If your file has string names in the header row with values that are too long or are duplicated, an error message is displayed when you try to preview your file in Data Virtualization.
400: Missing ResultSet:java.sql.SQLSyntaxErrorException: Long column type column or parameter 'COLUMN2' not permitted in declared global temporary tables or procedure definitions.

Column heading names are case-insensitive and converted to uppercase in the API response, which is exposed by the console. Therefore, a column that is named ABC is considered the same as a column that is named abc. However, the columns can be renamed to mixed case when you virtualize your data source.

Workaround: Review the first row in the data file that contains the intended column heading names and make the necessary changes to avoid this limitation.
You might encounter errors when you virtualize large Excel files

Applies to: 5.3.0

You might experience an error when you preview or virtualize large files with Excel format (XLS):
ERROR 400 'Your InputStream was neither an OLE2 stream nor an OOXML stream' was thrown while evaluating an expression.
Workaround: You can follow these steps:
  1. Load the file data into a table on a supported data source.
  2. Virtualize the table.
  3. Optionally, you can change the format of the file; for example, to CSV format before you virtualize it again.
Encoding detection override for file data with Japanese characters

Applies to: 5.3.0

For text files that are exposed by remote connectors, Cloud Pak for Data automatically detects the encoding scheme of flat data files, such as CSV and TSV files. However, to avoid decoding issues (for example, with files that contain Japanese characters), it is recommended that you set the encoding scheme manually for flat data files.