Data source issues
setrdbcx fails with a java.lang.NullPointerException error when connecting to the Snowflake data source
Applies to: 5.2.0
When you attempt to connect to the Snowflake data source, the setrdbcx stored procedure fails to run and displays a java.lang.NullPointerException error.
The error might resemble this example:
2025-05-04 10:01:26 [ INFO] Stmt: Call dvsys.setrdbcx('Snowflake', 'wv03539.us-east-2.aws.snowflakecomputing.com', 443, 'TM_DV_DB', 'warehouse=DV_WH;schema=TM_DV;role=TM_DV_ROLE', 'TM_DV_USER', 'DpEE9k9ucqkppGp@NdV2', 1, 0, '', '', 'qpendpoint_3', '', ?, ?, ?) (fvt_utils.py:4677)
2025-05-04 10:01:52 [ WARNING] dvsys.setrdbcx diags = qpendpoint_3, failed with The exception 'java.lang.Exception: java.lang.NullPointerException: Cannot invoke "java.io.File.exists()" because "this.cacheFile" is null' was thrown while evaluating an expression.; (fvt_utils.py:4692)
Workaround: Attempt to connect to the Snowflake data source again until the connection is established.
Note: The workaround might not work in all cases.
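Because the failure is intermittent, you can script the retries. The following is a minimal sketch, not an official procedure: it assumes that you run the db2 command line in the Data Virtualization head pod, and it mirrors the setrdbcx arguments from the example log. Every <...> value is a placeholder for your own connection details.
# Hedged retry sketch: call dvsys.setrdbcx until the connection is established.
# All <...> values are placeholders; reuse the arguments from your own failed call.
db2 connect to bigsql
for attempt in 1 2 3 4 5; do
  if db2 "call dvsys.setrdbcx('Snowflake', '<account host>', 443, '<database>', 'warehouse=<warehouse>;schema=<schema>;role=<role>', '<username>', '<password>', 1, 0, '', '', '<endpoint>', '', ?, ?, ?)"; then
    break                       # connection established
  fi
  sleep 60                      # wait before the next attempt
done
db2 terminate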
Collecting statistics for a virtualized Presto table results in an error
Applies to: 5.2.0
When you try to collect the statistics of a virtualized Presto table by using the COLLECT_STATISTICS stored procedure in Run SQL, you might encounter the error No statistics returned for table.
Workaround: Wait for an interval of 2-3 minutes and then attempt to collect the statistics again.
Note: You can apply a permanent fix on the data source side by setting the parameter http-server.max-request-header-size=<value up to 10TB> in the Presto coordinator configuration file. For more information, see Update Presto engine.
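For example, the following sketch appends the property to the coordinator configuration and relies on a restart to pick it up. The configuration path and the value shown are assumptions; adjust both for your deployment.
# Hedged sketch: raise the request-header limit on the Presto coordinator.
# The path and the 5MB value are assumptions; adjust them for your deployment.
echo 'http-server.max-request-header-size=5MB' >> /opt/presto/etc/config.properties
# Restart the Presto coordinator so that the new property takes effect.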
Unable to connect to an SSL-enabled data source in Data Virtualization by using a remote connector if you use custom certificates
Applies to: 5.2.0
After you create your own custom certificates to use when Data Virtualization connects to the IBM® Software Hub platform, you are not able to connect to a remote SSL-enabled data source.
For more information about remote data sources and connectors in Data Virtualization, see Accessing data sources by using remote connectors in Data Virtualization.
Workaround: Complete the following steps to resolve this issue and connect to remote data sources:
1. Download the CA certificate files from /etc/pki/ca-trust/source/anchors in the Data Virtualization head pod c-db2u-dv-db2u-0.
2. Upload the CA certificate files that you downloaded in step 1 to each remote connector computer.
3. If a file has more than one CA certificate, split the certificates into individual files (see the sketch after these steps).
4. Run the following command to set AGENT_JAVA_HOME to the Java home that is used by the remote connectors:
AGENT_JAVA_HOME=$(tac <Remote connector install path>/datavirtualization.env | grep -i java_home | grep -v "#" -m 1 | cut -d'=' -f2-)
5. For each file that you created in step 3, run the following command to add the certificates to the Java cacerts truststore in AGENT_JAVA_HOME/lib/security/cacerts. Make sure that you provide a unique alias value for the -alias command parameter for each file.
keytool -storepass changeit -keystore $AGENT_JAVA_HOME/lib/security/cacerts -importcert -alias <unique alias> -rfc -file <absolute path to certificate file created in step 3> -noprompt
6. Restart the remote connector. For more information, see Managing connectors on remote data sources.
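The following is a minimal sketch for steps 3 and 5, assuming that the downloaded certificates arrive as a single PEM bundle named ca-bundle.pem (a hypothetical name) and that AGENT_JAVA_HOME is already set as in step 4.
# Hedged sketch: split a multi-certificate PEM bundle into one file per
# certificate (cert-00.pem, cert-01.pem, ...), then import each file under a
# unique alias. ca-bundle.pem and the dvca- alias prefix are assumptions.
csplit -z -f cert- -b '%02d.pem' ca-bundle.pem '/-----BEGIN CERTIFICATE-----/' '{*}'
for f in cert-*.pem; do
  keytool -storepass changeit -keystore $AGENT_JAVA_HOME/lib/security/cacerts \
    -importcert -alias "dvca-${f%.pem}" -rfc -file "$(pwd)/$f" -noprompt
done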
Querying a virtualized table in a Presto catalog with a matching schema from a previous catalog might result in an error
Applies to: 5.2.0
When you use a new catalog and then query a virtualized table from a previous catalog with an identical schema name, the query might run if the schema exists in the new catalog, but it can still fail with an error. The error can be caused by differences in column definitions within the tables under the same schema name across catalogs.
Special characters are not supported in MongoDB database names
Applies to: 5.2.0
You cannot use special characters such as semicolons and single quotation marks in a MongoDB database name.
Limited file types are supported with the Microsoft Azure Data Lake Storage Gen2 data source connection
Applies to: 5.2.0
You can connect to the Microsoft Azure Data Lake Storage Gen2 data source from Data Virtualization. However, only the following file types are supported with this connection in Data Virtualization:
- CSV
- TSV
- ORC
- Parquet
- JSON Lines
Special characters are not preserved in databases or schemas with MongoDB connections after you upgrade to Data Virtualization on IBM Software Hub
Applies to: 5.2.0
After you upgrade from an earlier version of Data Virtualization on IBM Software Hub to version 5.2.0 with a MongoDB connection, special characters in the database or schema are not preserved, even though SpecialCharBehavior=Include is set in the data source connection. The updated MongoDB driver (version 6.1.0) that is used in Data Virtualization on IBM Software Hub does not recognize special characters in tables by default. This might cause issues with your results when you query a virtual table that has special characters.
The DECFLOAT data type is not supported in Data Virtualization
Applies to: 5.2.0
The DECFLOAT data type is not supported in Data Virtualization. As a result, the DECFLOAT type is converted to DOUBLE, and the special numeric values NaN, INF, and -INF are converted to NULL.
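As an illustration (the table and column names here are hypothetical), a source column that is defined as DECFLOAT surfaces as DOUBLE in the virtual table, and rows that hold the special values return NULL:
# Hedged sketch: observe the DECFLOAT-to-DOUBLE conversion on a virtual table.
# MYSCHEMA.RATES and its RATE column are hypothetical names.
db2 connect to bigsql
db2 "describe select RATE from MYSCHEMA.RATES"   # RATE is reported as DOUBLE
db2 "select RATE from MYSCHEMA.RATES"            # NaN, INF, and -INF rows return NULL
db2 terminate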
The Data sources page might fail to load data sources when remote connectors are added, edited, or removed
Applies to: 5.2.0
If you remove a connection to a remote connector, or if the remote connector becomes unavailable because credentials expire, the Data sources page fails to load this connection.
Unable to add a connection to an SAP S/4HANA data source with an SAP OData connection
Applies to: 5.2.0
If you try to connect to an SAP S/4HANA data source that contains many tables, the connection might time out and fail. Increasing timeout parameters has no impact.
To work around this issue, run the following commands:
db2 connect to bigsql
db2 "call DVSYS.setRdbcX('SAPS4Hana', '<Data source IP:port>', '0', '', 'CreateSchema=ForceNew', '<username>', '<password>', '0', '0', '', '', '<internal connector instance and port>', ?,?,?);"
db2 terminate
You cannot connect to a MongoDB data source with special characters in a database name
Applies to: 5.2.0
The current MongoDB JDBC driver does not support connections to databases whose names contain special characters.
When you virtualize data that contains LOB (CLOB/BLOB) or Long Varchar data types, the preview might show the columns as empty
Applies to: 5.2.0
Although the preview might show these columns as empty, after you virtualize the table, the data is available in Virtualized data for the columns that contain LOB or Long Varchar data types.
Remote data sources - Performance issues when you create a data source connection
Applies to: 5.2.0
You try to create a data source connection by searching a different host, but the process takes several minutes to complete. This performance issue occurs only when both of the following conditions are met:
- The remote data source is connected to multiple IBM Software Hub clusters.
- Data Virtualization connects to multiple data sources in different IBM Software Hub clusters by using the remote connectors.
To solve this issue, ensure that your Data Virtualization connections are on a single IBM Software Hub cluster.
Query fails due to unexpectedly closed connection to data source
Applies to: 5.2.0
Data Virtualization does not deactivate the connection pool for a data source when your instance runs a continuous workload against virtual tables from that data source. Instead, Data Virtualization waits for a period of complete inactivity before it deactivates the connection pool. The waiting period can create stale connections in the connection pool that get closed by the data source service and lead to query failures.
Workaround: Check the properties for persistent connections (keep-alive parameter) for your data sources. You can try two workarounds:
- Consider disabling the keep-alive parameter inside any data sources that receive continuous workload from Data Virtualization.
- You can also decrease the settings for the corresponding Data Virtualization properties, RDB_CONNECTION_IDLE_SHRINK_TIMEOUT_SEC and RDB_CONNECTION_IDLE_DEACTIVATE_TIMEOUT_SEC, as shown in the following examples:
CALL DVSYS.SETCONFIGPROPERTY('RDB_CONNECTION_IDLE_SHRINK_TIMEOUT_SEC', '10', '', ?, ?); -- default 20s, minimum 5s
CALL DVSYS.SETCONFIGPROPERTY('RDB_CONNECTION_IDLE_DEACTIVATE_TIMEOUT_SEC', '30', '', ?, ?); -- default 120s, minimum 5s
Decreasing the RDB_CONNECTION_IDLE_SHRINK_TIMEOUT_SEC and RDB_CONNECTION_IDLE_DEACTIVATE_TIMEOUT_SEC settings might help if there are small gaps of complete inactivity that were previously too short for the Data Virtualization shrink and deactivate timeouts to take effect.
Schema map refresh in-progress message appears for reloaded connections that do not require a schema map refresh
Applies to: 5.2.0
The Schema map refresh in-progress message appears when you reload connections in Data Virtualization, even when the data source does not require a schema map refresh. Only connections from data sources such as Google BigQuery, MongoDB, SAP S/4HANA, and Salesforce.com require a schema map refresh to update any changes in tables and columns for existing connections.