Known issues and limitations for Watson Query

The following known issues and limitations apply to Watson Query.

For additional solutions to problems that you might encounter with Watson Query, see the troubleshooting topic Troubleshooting the Watson Query service.

Administration issues
The following issues apply to administration tasks in Watson Query. For more information, see Administering Watson Query.
Data source issues
The following issues and limitations apply to data sources in Watson Query. For more information about data sources, see Supported data sources in Watson Query. For additional solutions to problems that you might encounter with data source connections, see the troubleshooting topic Troubleshooting data source connections in Watson Query.
Installation and upgrade issues
The following issue applies to installing and upgrading Watson Query. For more information, see Installing Watson Query and Upgrading Watson Query.
Data virtualization issues
The following issues and limitations apply to virtualization in Watson Query. For additional solutions to problems that you might encounter with data virtualization, see the troubleshooting topic Troubleshooting virtualization issues in Watson Query.
User and group management issues
The following issues and limitations apply to user and group management in Watson Query. For more information about user and group management, see Managing roles for users and groups in Watson Query.
Data governance issues
The following issues and limitations apply to data governance in Watson Query. For additional solutions to problems that you might encounter with data governance, see the troubleshooting topic Troubleshooting governance issues in Watson Query.
Caching issues
The following issues and limitations apply to caching in Watson Query. For additional solutions to problems that you might encounter with caching, see the troubleshooting topic Troubleshooting data caches and queries in Watson Query.
File issues
The following issues and limitations apply to files in Watson Query.
Resolved issues
The following issues and limitations are fixed in Watson Query.

Administration issues

Diagnostic logs for an instance appear to be missing

Applies to: 4.7.0, 4.7.1, and 4.7.2

Fixed in: 4.7.3

After you collect diagnostics for a Watson Query instance, the dv-engine-bundle-*.zip file appears to be missing. The file might not be missing, but permission errors prevent you from downloading it.

Workaround: Run the following command to ensure permissions on the logs are open:
oc rsh c-db2u-dv-db2u-0 bash
su - db2inst1
sudo chmod -R 777 /mnt/logs/diaglogs/dv
Shutting down Watson Query service instances in tethered namespaces does not work as expected

Applies to: 4.7.0, 4.7.1, and 4.7.2

Fixed in: 4.7.3

The shut-down process should fail if instances in tethered namespaces are not shut down first. When you shut down the Watson™ Query service, however, the shut-down completes successfully, even if instances in tethered namespaces are not shut down.

Also, forcefully shutting down the Watson Query service instance should forcefully shut down all instances in tethered namespaces. When you forcefully shut down the Watson Query service, however, the procedure does not forcefully shut down any Watson Query service instances that are running in tethered namespaces.

Workaround: Directly shut down and restart the Watson Query service instances in tethered namespaces, rather than shutting them down as part of shutting down and restarting all instances through the Watson Query service CR.

For more information, see Shutting down and restarting services.

Data source issues

Watson Query connections to Microsoft Azure Data Lake Storage Gen2 might experience delays, timeouts, or hangs when using TLSv1.3 with specific versions of the JDK
Applies to: 4.7.0 and later

Microsoft Azure Data Lake Storage supports TLSv1.3, which is not compatible with the version of Java (Java 8) that Watson Query is configured to use. Running queries might result in failures accompanied by an InvalidAbfsRestOperationException error.

Workaround:
  1. Run the following command from the OpenShift® cluster within the namespace where Watson Query was provisioned:
    
    for podname in $(oc get pods -l "role in (db,hurricane)" -o=custom-columns=NAME:metadata.name --no-headers) ; do oc exec  ${podname} -- su - db2inst1 -c 'echo "export IBM_JAVA_OPTIONS=\"\${IBM_JAVA_OPTIONS} -Dhttps.protocols=TLSv1.2 \"" >> ~/.bashrc ' ; done 
  2. On the head node as the db2inst1 user, run the following command to restart Watson Query:
    
    bigsql stop ; bigsql start
Kerberized Cloudera Impala cluster is not supported
Applies to: 4.7.3 and later
When you establish a connection to Cloudera Impala with Kerberos authentication, the cluster fails when you attempt to virtualize and preview tables. This issue is caused by overutilization of resources by the JDBC getTables() API.
Workaround: To make the cluster workable again, remove the Kerberized Cloudera Impala connection by completing the following steps.
Note: Make sure there are no users on the UI before you remove the connection to prevent triggering the issue again.
  1. Make sure an instance administrator has access to the terminal of the head node pod c-db2u-dv-db2u-0 as the db2inst1 user.

    For more information on how to complete this step, see the Before you begin section in Disabling and enabling pass-through authentication in Watson Query.

  2. Restart the hosted agents by running the following command:
    [db2inst1@c-db2u-dv-db2u-0 - Db2U ~]$ cd /opt/dv/current/qp/helper_scripts/
    [db2inst1@c-db2u-dv-db2u-0 - Db2U helper_scripts]$ ./qp_agents restart
  3. As the db2inst1 user, connect to the instance from the CLI and run the dvsys.removerdbcx stored procedure to remove the problematic data source. Refer to the following example for more information. See also removeRdbcX stored procedure (Variation 1).
    Note: Do not use the UI to complete this step.
    [db2inst1@c-db2u-dv-db2u-0 - Db2U ~]$ db2 "select * from dvsys.listrdbc"
    
    The following output is condensed to the relevant columns:

    NODE_NAME          CID         SRCTYPE  SRCHOSTNAME              SRCPORT  DBNAME   CPOOLSIZE  CPOOLALL  CSTATUS
    qpendpoint_1:6415  IMPAL10000  Impala   myhostname.mydomain.com  21050    default  2          2         AVAILABLE
    DRIVER: com.ibm.jdbc.impala.ImpalaDriver
    URL: jdbc:ibm:impala://myhostname.mydomain.com:21050;databaseName=default;AuthenticationMethod=kerberos;ServicePrincipalName=impala/myhostname.mydomain.com@MYDOMAIN.COM;LoginConfigName=JAAS_CFG_IMPAL10000
    
      1 record(s) selected.
    
    [db2inst1@c-db2u-dv-db2u-0 - Db2U ~]$ db2 "call dvsys.removerdbcx('IMPAL10000','qpendpoint_1:6415',?,?)"
    
      Value of output parameters
      --------------------------
      Parameter Name  : NUM_REMOVED
      Parameter Value : 1
    
      Parameter Name  : DIAGS
      Parameter Value : 
    
      Return Status = 0
Special characters are not supported in MongoDB database names

Applies to: 4.7.0 and later

You cannot use special characters such as semicolons and single quotes in a MongoDB database name.
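A pre-check on the client side can catch this limitation before a connection attempt fails. The following sketch is a hypothetical helper, not part of Watson Query; the set of disallowed characters shown covers only the examples named above:

```python
# Hypothetical pre-check: reject MongoDB database names that contain
# characters Watson Query cannot handle (for example, semicolons and
# single quotes, as documented above).
DISALLOWED_CHARS = set(";'")

def is_valid_mongodb_db_name(name: str) -> bool:
    """Return True if the database name avoids the unsupported characters."""
    return bool(name) and not (set(name) & DISALLOWED_CHARS)
```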
Unable to add MySQL IBM Cloud connections from existing platform connections to Watson Query

Applies to: 4.7.3

You cannot create MySQL IBM Cloud SSL connections from platform connections into Watson Query.

Workaround: Create the connections directly in Watson Query by using the latest MariaDB driver: mariadb-java-client-2.7.2.jar.

Complete the following steps:

  1. Create a Generic JDBC connection in Watson Query.
  2. Enter the following JDBC URL, replacing <your-host-name>, <port>, <your-database-name>, <your-username>, <your-password>, and <your-truststore-location> with the appropriate values:
    jdbc:mariadb://<your-host-name>:<port>/<your-database-name>?trustServerCertificate=true&user=<your-username>&password=<your-password>&keyStore=<your-truststore-location>&enabledSslProtocolSuites=TLSv1.2&useSSL=true
  3. Upload the following .jar file:

    mariadb-java-client-2.7.2.jar

  4. Select the following driver class:

    org.mariadb.jdbc.Driver

  5. Click Create to save the new connection without entering any other details. The rest of the properties, such as user and password, are included in the JDBC URL that you entered in step 2.
Note: The new connection provides encrypted communications to the data source but does not verify the certificate.

For more information, see About MariaDB Connector/J in the MariaDB documentation.
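The JDBC URL in step 2 can also be assembled programmatically. The following sketch is a hypothetical helper; the host, port, database, and credential values used in the test are placeholders, and the parameter names come from the URL format shown above:

```python
# Hypothetical helper that assembles the Generic JDBC URL from step 2.
# The parameter names match the MariaDB Connector/J URL shown in the
# workaround; all argument values are placeholders.
def build_mariadb_url(host: str, port: int, database: str,
                      user: str, password: str, truststore: str) -> str:
    """Build the SSL-enabled MariaDB JDBC URL used for the workaround."""
    params = (
        f"trustServerCertificate=true&user={user}&password={password}"
        f"&keyStore={truststore}&enabledSslProtocolSuites=TLSv1.2&useSSL=true"
    )
    return f"jdbc:mariadb://{host}:{port}/{database}?{params}"
```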

The List view in the Virtualize page does not display certain Hive schemas correctly

Applies to: 4.7.3 and later

For Hive connections, on the Virtualize page, the List view might not show data assets, due to restricted column access.

Workaround: Use the Explore view to virtualize the data assets. Alternatively, go to the Data sources page, select the specific Hive schema, save the setting, and then try to reload the data assets in the List view.

Only the connection owner can transfer ownership of a data source connection on the Manage access page

Applies to: 4.7.0 and 4.7.1

Fixed in: 4.7.2

In Watson Query versions 4.7.0 and 4.7.1, only the connection owner or a user with Db2 SECADM privilege can transfer ownership of a data source connection on the Manage access page in Watson Query. The Transfer ownership button is disabled for non-owners.

Workaround:
If a deleted connection is added again by a user who is not the previous owner, and the previous owner is not available to transfer ownership, a Watson Query Admin user can run the transfer connection owner API in Run SQL. The following example shows the API call:
call DVSYS.TRANSFER_CONNECTION_OWNER('DB210000', 'user1', 'R', ?)

Replace DB210000 with the CID value of your connection and replace user1 with the user who you want to transfer ownership to.

Note:

Don't run the API call a second time, or the user will lose ownership of the connection. If the API is run a second time by mistake, you can run the following command to recover ownership:

CALL DVSYS.GRANT_CONNECTION_PRIVILEGE('CONTROL','DB210000','U','user1', ?)

Replace DB210000 with the CID, and replace user1 with the original username.

For more information about Run SQL, see Running SQL in Watson Query.
Limited file types are supported with the Microsoft Azure Data Lake Storage Gen2 data source connection

Applies to: 4.7.0 and later

You can connect to Microsoft Azure Data Lake Storage Gen2 data source from Watson Query. However, only the following file types are supported with this connection in Watson Query:
  • CSV
  • TSV
  • ORC
  • Parquet
  • JSON Lines
Inconsistent varchar type mapping returned by dvsys.reloadtables in databases or schemas with MongoDB connections after you upgrade to Watson Query on Cloud Pak for Data version 4.7.0

Applies to: 4.7.0, 4.7.1, and 4.7.2

Fixed in: 4.7.3

After you upgrade from an earlier version of Watson Query on Cloud Pak for Data to version 4.7.0, with a MongoDB connection, you might see reduced varchar size values in column details when you virtualize tables.
Workaround: You can manually increase the varchar size values in the shopping cart if the discovered sizes are too small.
Special characters are not preserved in databases or schemas with MongoDB connections after you upgrade to Watson Query on Cloud Pak for Data version 4.7.0

Applies to: 4.7.0 and later

After you upgrade from an earlier version of Watson Query on Cloud Pak for Data to version 4.7.0, with a MongoDB connection, special characters in the database or schema are not preserved even though SpecialCharBehavior=Include is set in the data source connection. The updated MongoDB driver (version 6.1.0) that is used in Watson Query on Cloud Pak for Data does not recognize special characters in tables by default. This might cause issues with your results when you query a virtual table that has special characters.
List view on the Virtualize page can't list tables for duplicated connections for the same data source and username

Applies to: 4.7.0 and later

In Data > Data virtualization > Data sources, when you try to create a connection to a data source that has already been connected to, you might see the message The connection could not be added. A connection to this data source already exists for this username.

If you create duplicate connections to the same data source with the same credentials in Platform connections and Watson Query data sources, you do not see a warning message; however, the Virtualize page can't list tables for the duplicated connections.

Workaround: Make sure that you do not have the same connection with the same set of credentials added twice as a data source in Watson Query.
The DECFLOAT data type is not supported in Watson Query
Applies to: 4.7.0 and later

The DECFLOAT data type is not supported in Watson Query. As a result, the type DECFLOAT is converted to DOUBLE and the special numeric values NaN, INF, and -INF are converted to NULL.
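The documented mapping of DECFLOAT values can be illustrated with a small sketch. This is illustrative only, not the service's internal conversion code:

```python
# Illustrative sketch of the documented mapping: DECFLOAT values become
# DOUBLE, and the special values NaN, INF, and -INF become NULL (None).
import math
from decimal import Decimal

def decfloat_to_double(value: Decimal):
    """Convert a DECFLOAT-like Decimal to a float, mapping NaN and
    infinities to None, as described above."""
    d = float(value)
    if math.isnan(d) or math.isinf(d):
        return None
    return d
```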

The Data sources page might fail to load data sources when remote connectors are added, edited, or removed

Applies to: 4.7.0 and later

If you remove a connection to a remote connector or the remote connector becomes unavailable because credentials expire, the Data sources page fails to load this connection.

After you create some connections in Watson Query, you cannot edit that connection from Platform connections

Applies to: 4.7.0 and later

The following connection types cannot be edited from the Platform connections page if they are created on the Data sources page in Watson Query.
  • Db2®
  • Denodo
  • SAP HANA
  • SAP S/4HANA
  • Spark

Workaround: Edit these connections from the Data sources page only.

Cannot create connections to Db2 on Cloud data source with API keys that are stored in vaults

Applies to: 4.7.0 and later

When you add a connection to a Db2 on Cloud data source through a remote connector or a local agent, and you use an API key that is stored in a vault as your credentials, the action fails with an Error SQL4302N "Exception occurred during connection" message.
To work around this issue for a local agent, enter your API key directly, not as a secret reference in a vault. You cannot work around this issue for Db2 on Cloud data sources that are accessed through a remote connector.
Unable to add a connection to an SAP S/4HANA data source with an SAP OData connection

Applies to: 4.7.0 and later

If you try to connect to an SAP S/4HANA data source that contains many tables, the connection might time out and fail. Increasing timeout parameters has no effect.
To work around this issue, run the following commands.
db2 connect to bigsql
db2 "call DVSYS.setRdbcX('SAPS4Hana', '<Data source IP:port>', '0', '', 'CreateSchema=ForceNew', '<username>', '<password>', '0', '0', '', '', '<internal connector instance and port>', ?,?,?);"
db2 terminate
Some queries to SAP HANA data sources do not show correct results

Applies to: 4.7.0 and later

When you query virtualized tables in SAP HANA data sources that select certain columns and use the NOT operator as a filter on those columns, you might see incorrect results.

Tables in a MongoDB data source might be missing when you virtualize

Applies to: 4.7.0 and later

When you create a connection to MongoDB, you see only tables that were created in the MongoDB data source before the connection was added.

For example, if you have 10 tables in your MongoDB data source when you create a connection, you see 10 tables when you start to virtualize the table. If a user adds new tables into the MongoDB data source after the connection is added and before you click Virtualize, Watson Query won't display the new tables under the Virtualize tab.

Workaround: To see all recently added virtualized MongoDB tables, delete the connection to MongoDB and re-create the connection.

You cannot connect to a MongoDB data source with special characters in a database name

Applies to: 4.7.0 and later

The current MongoDB JDBC Driver does not support connection to database names that contain special characters.

When you virtualize data that contains LOB (CLOB/BLOB) or Long Varchar data types, the preview might show the columns as empty

Applies to: 4.7.0 and later

This is a display issue in the preview only. After you virtualize the table, the data for columns that contain LOB or Long Varchar data types is available in Virtualized data.

Errors when you edit an SSL-based connection

Applies to: 4.7.0 and later

When you edit an SSL-based connection, the operation might fail with an error. For example, you might see the following message.
DB_WARNING: ENGINE_JDBC_CONN_ATTEMPT_ERROR: Failed JDBC Connection attempt in XXX ms for: YYY;sslTrustStoreLocation=/path/.dsTempTrustStore;, cause: .... Message: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty.

To work around this issue, see Errors when you edit an SSL-based connection in Watson Query.

Support of timestamp data type up to nanoseconds

Applies to: 4.7.0 and later

Watson Query supports the timestamp data type up to nanosecond precision. When a remote data source's timestamp type has a scale greater than 9 fractional digits, Watson Query returns timestamp column values truncated to nanoseconds. Additionally, for timestamp predicates, Watson Query compares only up to nanoseconds.
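The truncation behavior can be sketched as follows. This is illustrative only, not the service's code, and operates on timestamp strings for simplicity:

```python
# Illustrative sketch: truncate the fractional-seconds part of a
# timestamp string to at most 9 digits (nanosecond precision), as
# Watson Query does for remote timestamps with a larger scale.
def truncate_to_nanoseconds(timestamp: str) -> str:
    """Keep at most 9 fractional-second digits of a timestamp string."""
    if "." not in timestamp:
        return timestamp
    whole, frac = timestamp.split(".", 1)
    return f"{whole}.{frac[:9]}"
```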

The Explore view on the Watson Query Virtualize page does not show tables in IBM® Db2 for i
Applies to: 4.7.0 and later

You cannot use the Explore view on the Watson Query Virtualize page to preview and virtualize tables in IBM Db2 for i.

Workaround:

Use the List view on the Watson Query Virtualize page to preview and virtualize tables in IBM Db2 for i.

Limitations for SSL-based data source connections

Applies to: 4.7.0 and later

The following limitations and known issues apply to SSL-based data source connections in Watson Query:
Support of only PEM certificate format for built-in data sources
When you add a built-in data source to Watson Query, and you upload an SSL certificate, Watson Query supports SSL certificates with PEM format only. If you add a built-in data source to Watson Query, and you use an SSL certificate with a non-PEM format, the connection fails.

You can use openssl to convert your SSL certificates to PEM format before you upload the SSL certificate file to add a data source connection. For more information, see Cannot connect to a data source.

Issues when you upload SSL certificates for connections that require a third-party driver
When you add a data source connection by uploading a third-party driver and you try to use an SSL certificate, you might encounter an error. It is recommended that you use non-SSL connections for data sources that require third-party drivers.
Query performance issues against Db2 data sources

Applies to: 4.7.0 and later

When you run a query against Db2 data sources, you might encounter performance issues. For example, these performance issues might appear if your SQL statement uses a string comparison, such as c1 = 'abc' or c2 = c3, where c1, c2, and c3 are string data types such as char or varchar. To avoid these performance issues, you must modify the collating sequence (COLLATING_SEQUENCE) server option of the data source. For more information about this server option, see Collating sequence.

Remote data sources - Performance issues when you create data source connection

Applies to: 4.7.0 and later

When you create a data source connection by searching for a different host, the process might take several minutes to complete. This performance issue occurs only when these two conditions are met:
  • The remote data source is connected to multiple Cloud Pak for Data clusters.
  • Watson Query connects to multiple data sources in different Cloud Pak for Data clusters by using the remote connectors.

To solve this issue, ensure that your Watson Query connections are on a single Cloud Pak for Data cluster.

Remote data sources - Cannot use system junction points

Applies to: 4.7.0 and later

Watson Query does not support browsing data on remote data sources by using paths that contain system junction points. System junction points provide backward compatibility with earlier versions of Windows. For example, on Windows 10, C:\Documents and Settings is a system junction point to C:\Users. Thus, when you browse files on a remote Windows data source, you cannot enter a path that contains system junction points, such as C:\Documents and Settings. By default, system junction points are hidden from Windows users.
Note: Watson Query does support junction points and symbolic links that are created by Windows users.
Query fails due to unexpectedly closed connection to data source

Applies to: 4.7.0 and later

When your instance runs a continuous workload against virtual tables from a particular data source, Watson Query does not deactivate the connection pool for that data source. Instead, Watson Query waits for a period of complete inactivity before it deactivates the connection pool. The waiting period can create stale connections in the connection pool that are closed by the data source service and lead to query failures.

Workaround: Check the properties for persistent connection (keep-alive parameter) for your data sources. You can try two workarounds:

  • Consider disabling the keep-alive parameter inside any data sources that receive continuous workload from Watson Query.
  • You can also decrease the settings for corresponding Watson Query properties, RDB_CONNECTION_IDLE_SHRINK_TIMEOUT_SEC and RDB_CONNECTION_IDLE_DEACTIVATE_TIMEOUT_SEC, as shown in the following examples: 

    CALL DVSYS.SETCONFIGPROPERTY('RDB_CONNECTION_IDLE_SHRINK_TIMEOUT_SEC', '10', '', ?, ?);    -- default 20s, minimum 5s
    CALL DVSYS.SETCONFIGPROPERTY('RDB_CONNECTION_IDLE_DEACTIVATE_TIMEOUT_SEC', '30', '', ?, ?);    -- default 120s, minimum 5s
    Decreasing the RDB_CONNECTION_IDLE_SHRINK_TIMEOUT_SEC and RDB_CONNECTION_IDLE_DEACTIVATE_TIMEOUT_SEC settings might help if there are small gaps of complete inactivity that were previously too short for the Watson Query shrink and deactivate timeouts to take effect.
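To see why shortening the timeouts helps, consider a sketch that checks whether any period of complete inactivity in a workload is long enough for pool deactivation to occur. This is illustrative only; the 120-second default comes from the RDB_CONNECTION_IDLE_DEACTIVATE_TIMEOUT_SEC setting described above:

```python
# Illustrative sketch: the pool is deactivated only if at least one
# idle gap lasts as long as the deactivate timeout. Shorter timeouts
# let smaller gaps of inactivity trigger deactivation.
def pool_would_deactivate(idle_gaps_sec, deactivate_timeout_sec=120):
    """Return True if any idle gap reaches the deactivate timeout
    (default 120s, per RDB_CONNECTION_IDLE_DEACTIVATE_TIMEOUT_SEC)."""
    return any(gap >= deactivate_timeout_sec for gap in idle_gaps_sec)
```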

Data virtualization issues

Previewing objects with row filtering data protection rules with multiple conditions applied causes a Runtime Error

Applies to: 4.7.0 and later

When a row filtering data protection rule with multiple filter conditions applies to a Watson Query object, attempting to preview that object in the Watson Query web user interface causes a Runtime Error.

Workaround: If you are using row filters with an AND condition, create two separate filtering rules instead. Alternatively, you can access the object by using a Run SQL query or through a catalog or project asset.

Data that is virtualized with double-byte coded character sets and special characters might appear truncated

Applies to: 4.7.0 and later

When you virtualize a table with a double-byte coded character set and special characters, such as Swedish characters, data might appear to be cut off.

Workaround: When virtualizing a table, use the Edit columns menu to extend the field lengths for those columns.
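The truncation can occur because multi-byte characters need more bytes than the column's declared length assumes. The following sketch illustrates the byte-versus-character mismatch for Swedish characters; it is illustrative only:

```python
# Illustrative sketch: Swedish (and other non-ASCII) characters occupy
# more than one byte in UTF-8, so a column sized by character count can
# run out of space and appear truncated.
def utf8_byte_length(text: str) -> int:
    """Bytes needed to store the string in UTF-8, which can exceed the
    character count for double-byte or accented characters."""
    return len(text.encode("utf-8"))
```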

You cannot use special characters in schema names that are used in remote data sources

Applies to: 4.7.0 and later

You can virtualize a table in the remote data source, but querying the virtual table fails with an SQL5105N error similar to the following example.

[ERROR] SQL error: [IBM][CLI Driver][DB2/LINUXX8664] SQL5105N The statement failed because a Big SQL component encountered an error. Component receiving the error: "DV-FMP". Component returning the error: "Virtualization Connector". Log entry identifier: "GAI-003-NA". SQLSTATE=58040 SQLCODE=-5105
You cannot use special characters such as semicolons in schema names when you virtualize a table in MongoDB on Cloud

Applies to: 4.7.0 and later

When you try to virtualize a table in a MongoDB on Cloud data source that has a semicolon in the schema name, the virtualization fails. The MongoDB on Cloud data source removes the semicolon from the schema name, and queries of this virtual table fail with error SQL5105N.

Tables in Watson Query appear to be virtualized in the UI but they are not actually virtualized

Applies to: 4.7.0

Fixed in: 4.7.3

Virtualization of object store files silently fails, causing tables in Watson Query to not be virtualized as expected, even though they appear to be virtualized in the Watson Query UI.

However, if you refresh the Virtualized data page, the virtualization status changes from the "Success" state (green) to the "Failed" state (red), accurately indicating the failed status of the virtualized tables.

Workaround: Refresh the target directory, preview the objects, add them to the cart, and attempt the virtualization again.

Leaving the page during a bulk deletion of tables causes errors

Applies to: 4.7.0 and later

If you delete tables in bulk, for example more than 30 at the same time, don't leave the page until the deletion is complete.

Virtualizing large tables might be slow or might fail

Applies to: 4.7.0 and later

When you virtualize multiple large tables from data sources, you might see the following messages in the Virtualize objects dialog box.
Error: CREATE OR REPLACE NICKNAME -- DB2 SQL Error: SQLCODE=-1229, SQLSTATE=40504, SQLERRMC=null, DRIVER=4.29.24
Or
Error: CREATE OR REPLACE NICKNAME -- "<schema>"."<table>" already exists in Server "<server>" (use REPLACE=Y to overwrite)
Or
The table <schema>.<table> already exists. Use REPLACE=Y to overwrite CallStack

This issue occurs even when you do not have tables with the same schema name and table name. The root cause is a timeout and retry sequence that is triggered by HAProxy timeouts. To work around this issue, ensure that your HAProxy settings meet the recommended values. For more information, see Configuring network requirements for Watson Query and Changing load balancer timeout settings. If virtualization of large tables takes longer than 5 minutes, increase the HAProxy timeout values and retry the virtualization.
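As an illustration only, the HAProxy timeouts in question are set in the load balancer's haproxy.cfg. The values below are placeholder assumptions, not the documented recommendations; use the values from Configuring network requirements for Watson Query.

```
# Illustrative haproxy.cfg fragment; timeout values are placeholders
defaults
    timeout connect 10s     # TCP connection establishment
    timeout client  300s    # maximum inactivity on the client side
    timeout server  300s    # maximum inactivity on the server side
```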

Deleting a virtual table causes a dependent object error

Applies to: 4.7.0 and later

When you create a virtual table from a Db2 source and then create a cache that is based on the virtual table, you might get a dependent object error when you delete the cache, and then attempt to delete the virtual table.

Workaround:

Choose one of the following workarounds:

  • Test the cache queries before you delete the virtual table:
    1. Click Add Cache.
    2. Enter a valid cache definition and then select Test Queries > Input SQL queries.
    3. Provide one or more valid SQL queries for matching.
    4. Select Add queries > Test cache.
    5. Exit the cache creation steps to return to the virtual objects that need to be deleted.
    6. Attempt to delete the virtual table again.
  • Delete the temporary objects from the Run SQL editor by running the command: drop table CACHESYS.DVTMPMxxxx
    Note: Delete only objects that begin with DVTMPxxx, not other objects such as DVxxxx or DVxxx_B.

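To find leftover temporary objects before you drop them, you can query the Db2 catalog from the Run SQL editor. This is a sketch; the DVTMPM0001 table name in the comment is hypothetical.

```sql
-- List candidate temporary cache tables whose names begin with DVTMPM
SELECT TABSCHEMA, TABNAME
FROM SYSCAT.TABLES
WHERE TABSCHEMA = 'CACHESYS'
  AND TABNAME LIKE 'DVTMPM%';

-- Then drop each matching table individually, for example (hypothetical name):
-- DROP TABLE CACHESYS.DVTMPM0001;
```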
Duplicate data asset is displayed in virtualized data view after the asset is granted to public

Applies to: 4.7.0, 4.7.1, and 4.7.2

Fixed in: 4.7.3

A duplicated data asset is displayed for the asset owner in the virtualized data list after access to the table is granted to public.

Workaround: Grant the access to a specific user, group, or role instead of granting access to public.

Virtualizing objects in Cloud Object Storage fails

Applies to: 4.7.0 and later

When a cluster is new or when the Cloud Object Storage data source is newly added in Watson Query, virtualization of objects might fail with the following error: Failed to read COS schema.

Workaround: Virtualize the objects again.

Users can create a virtualized data view in only one instance of Watson Query
Applies to: 4.7.0 and 4.7.1
Fixed in: 4.7.2

If you have more than one Watson Query instance in a cluster, users can create a view on the Virtualized data page in only one instance. View creation fails in the other Watson Query instances.

Workaround: Users can run the CREATE VIEW statement in Run SQL. For more information, see Running SQL in Watson Query.
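For example, a sketch of such a statement, in which the view MYSCHEMA.RECENT_ORDERS and the virtualized table USER999.ORDERS are hypothetical names:

```sql
-- Hypothetical example: create the view directly from Run SQL
CREATE VIEW MYSCHEMA.RECENT_ORDERS AS
  SELECT ORDER_ID, ORDER_DATE, AMOUNT
  FROM USER999.ORDERS
  WHERE ORDER_DATE >= CURRENT DATE - 30 DAYS;
```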
Virtualizing a table with many columns or long column names might fail

Applies to: 4.7.0 and later

Virtualizing tables that contain columns with large STRING and CLOB data might fail with a Virtual table creation failed message. The virtualization fails even if you reduce the column width or select only one of the columns.

Workaround: See Virtualizing a table with many columns or long column names fails in Watson Query.

Preview of tables to be virtualized might not be correct when the table contains a column that is named id

Applies to: 4.7.0 and later

If you try to virtualize a table that contains a column that is named id, preview shows only the column value of the first row in the table. This issue occurs because the keyword id is reserved for the user interface that displays the table.

In addition, if the id column contains duplicate values, all rows that share a value in the id column display only the values of the first row that has that value.

For example, see the following table.

id   a2           a3
1    C1-sjis-ok   A0_*&
1    abc          123

This table appears as follows because the second row is replaced with the data from the first row.

id   a2           a3
1    C1-sjis-ok   A0_*&
1    C1-sjis-ok   A0_*&

Workaround: Rename the column of the table to a unique name that is not id. If you cannot rename the column name, you must ensure that values in the column are unique.
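If you cannot change the source column and cannot guarantee unique values, one option is to expose the data through a view that aliases the id column. The schema, table, and alias names here are hypothetical.

```sql
-- Hypothetical example: alias the reserved column name id in a view
CREATE VIEW MYSCHEMA.MYTABLE_V AS
  SELECT id AS RECORD_ID, a2, a3
  FROM MYSCHEMA.MYTABLE;
```

Querying MYSCHEMA.MYTABLE_V returns the same rows with the column exposed as RECORD_ID, so the reserved id name is not used by the preview.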

Installation and upgrade issues

Old .jar files remain after upgrade to Watson Query version 4.7.3
Applies to: 4.7.3

When you upgrade to Watson Query on Cloud Pak for Data version 4.7.3, the old .jar files remain in the $IBM_DB_HOME/function/jar/DVSYS directory. This does not affect Watson Query functionality, and no workaround is necessary.

Developer users can create a Watson Query instance in a namespace that already has an instance provisioned
Applies to: 4.7.0 and later

Although Watson Query supports multiple instances in tethered namespaces, a namespace can have only one provisioned instance. Users with the Developer role, however, can bypass this restriction and mistakenly provision more than one Watson Query instance in a namespace.

Workaround:

To avoid this issue, users with the Developer role must check with an Admin user to determine which namespaces do not already have a provisioned Watson Query instance. Developer users can then choose one of those namespaces to provision the instance.

User and group management issues

Privileges and authorities that are granted to user groups are not considered when you create views

Applies to: 4.7.0 and later

Privileges and authorities that are granted to user groups are not considered when you create views. This limitation is a result of a Db2 limitation on groups.

For more information, see Restrictions on the use of group privileges when executing DDL statements or binding packages.

See also SQL0727N error when you query view results in Watson Query.

Workaround: Instead of granting access to specific groups from the Manage access page, grant access to the public by selecting All data virtualization users. Then, create a data protection rule that defines Allow or Deny conditions based on groups. For more information, see Governing virtual data with data protection rules in Watson Query.

Data governance issues

Unauthorized objects might be published by Engineer and User roles when the Restrict visibility setting is disabled

Applies to: 4.7.0 and later

When you disable the Restrict visibility setting in Watson Query, users with the Engineer role might be able to publish any object to a catalog or project. This behavior also extends to users with the User role when publishing a REST API.

Workaround: Enable the Restrict visibility service setting in Watson Query. For more information, see Managing the visibility of virtual objects in Watson Query.

Some Watson Query audit events in tethered projects might not be logged

Applies to: 4.7.0, 4.7.1, and 4.7.2

Fixed in: 4.7.3

If Watson Query is provisioned in an OpenShift project that is different from the project where the Cloud Pak for Data control plane is provisioned (a tethered project), audit events in Watson Query Governance might not be logged.

For more information about Watson Query audit events, see Audit events for Watson Query.

Workaround: none

You cannot apply business terms when you virtualize files in data sources on the Files tab

Applies to: 4.7.0 and later

When you virtualize files in Data > Data virtualization > Virtualize, the Business terms column is not available for data sources on the Files tab. These data sources do not support business terms.

You cannot see business term assignments or grouped tables in the Explore view

Applies to: 4.7.0 and later

On the Virtualize page, you cannot see business term assignments and you cannot see grouped tables.

Workaround: Switch to the List view to see business term assignments and grouped tables.

Access control issues when you preview assets with masked or filtered data

Applies to: 4.7.0 and later

When you preview, download, or refine Watson Query data assets in Cloud Pak for Data services other than Watson Query (for example, Watson Knowledge Catalog, Watson Studio, and Data Refinery), and data masking or row-level filtering applies, the preview is subject only to the data protection rules and to catalog or project access control. Watson Query access controls are not enforced.

The Watson Query internal access controls, which you manage by using Manage access in the Watson Query UI, do not apply to the preview from the other services when masking or row-level filtering applies. To control access in the other services, you must define rules that manage access to the catalogs, projects, data assets, or connections.

Cannot see list of available tables in the strict virtualization mode

Applies to: 4.7.0 and later

In strict virtualization mode (where you can see tables only if they have at least one column with a business term assignment), when you go to the Virtualize page, the console appears to load the table list for a long time without showing any tables. Loading can be much slower than in the default virtualization mode because the console evaluates which tables are eligible for virtualization based on term assignments to data source table and column names.

A virtualized object cannot be used in Cognos® Dashboards without credentials and an appropriate role

Applies to: 4.7.0 and later

Using a virtualized object in Cognos Dashboards displays an error message if you do not enter credentials for the Watson Query connection or if you do not have the correct role.

If you did not enter a username and password when you created the connection to the Watson Query data source, you see the following error: Missing personal credentials. If you are not assigned the Admin or Steward role for this object, you see the following error: Unable to access this asset.

To work around this issue, see A virtualized object cannot be used in Cognos Dashboards without credentials and an appropriate role in Watson Query.

Caching issues

Online backup of Watson Query can affect cache refreshes

Applies to: 4.7.0 and later

If you schedule an online backup in Watson Query and if at the same time a cache refresh operation runs, the cache write operation can fail.

Online backup in Watson Query works by quiescing database write and update operations. Cache refresh is a write operation that updates Watson Query caches with fresh data from remote data sources. Both operations can be scheduled to run periodically, and these schedules can conflict: whenever a cache tries to refresh, an online backup might already be in progress or might start while the refresh is running. In these scenarios, cache refresh operations can fail with an error similar to the following message, and the cache remains active but contains stale data.

The statement failed because a Big SQL component encountered an error. 
Component receiving the error: "BigSQL IO". Component returning the error: "UNKNOWN". 
Log entry identifier: "[BSL-1-b1bf428f]". Reason: "Cannot add block to /warehouse".. 
SQLCODE=-5105, SQLSTATE=58040, DRIVER=4.27.25

Cache auto-refresh might also fail and you can either manually refresh or wait for the next auto-refresh. However, if the online backup is scheduled such that it conflicts with any cache's refresh schedule, then every auto-refresh of those caches will fail.

Workaround: Reschedule the cache refresh or the online backup to avoid conflicts. You can also refresh the cache manually at any time.

A cache might not be deleted and you cannot reuse the cache name as a result

Applies to: 4.7.0, 4.7.1, and 4.7.2

Fixed in: 4.7.3

In Watson Query on Cloud Pak for Data, you might not be able to successfully delete a cache. The cache might end up in an Inactive state while also being in a Failed state, with one of the following error messages:
  • The current transaction has been rolled back because of a deadlock or timeout.
  • An autonomous transaction executing a procedure has been terminated abnormally.
When one of these errors occurs, the cache is inactive and does not interfere with your query workload. However, the cache is not fully deleted, which prevents you from creating another cache with the same name.

Workaround: Use a different name when you create a new cache.

File issues

You cannot preview long string values in headings in CSV, TSV, or Excel files

Applies to: 4.7.0 and later

When you use the first row as column headings, the string values in that row must not exceed the maximum Db2 identifier length of 128 characters and cannot be duplicated. If your file has string names in the header row with values that are too long or are duplicated, an error message is displayed when you try to preview your file in Watson Query.
400: Missing ResultSet:java.sql.SQLSyntaxErrorException: Long column type column or parameter 'COLUMN2' not permitted in declared global temporary tables or procedure definitions.

Column heading names are case-insensitive and are converted to uppercase in the API response that the console exposes. Therefore, a column that is named ABC is considered the same as a column that is named abc. However, you can rename the columns to mixed case when you virtualize your data source.

Workaround: Review the first row of the data file, which contains the intended column heading names, and make the necessary changes to avoid this limitation.

You might encounter errors when you virtualize large Excel files

Applies to: 4.7.0 and later

You might experience an error when you preview or virtualize large files with Excel format (XLS):
ERROR 400 'Your InputStream was neither an OLE2 stream nor an OOXML stream' was thrown while evaluating an expression.
Workaround: Complete the following steps:
  1. Load the file data into a table on a supported data source. For more information, see Supported data sources.
  2. Virtualize the table. For more information, see Creating a virtualized table.
  3. Optionally, change the format of the file (for example, to CSV format) before you virtualize it again.
Encoding detection override for file data with Japanese characters

Applies to: 4.7.0 and later

For text files that are exposed by remote connectors, Cloud Pak for Data automatically detects the encoding scheme of flat data files, such as CSV and TSV files. However, to avoid decoding issues, it is recommended that you set the encoding scheme manually for flat data files. For more information, see Setting encoding scheme.

Only UTF-8 character encoding is supported for CSV, TSV, and JSON files in Cloud Object Storage

Applies to: 4.7.0 and later

For text files in Cloud Object Storage data sources with CSV, TSV, and JSON formats, only UTF-8 character encoding is supported in Watson Query. Cloud Object Storage binary formats such as Optimized Row Columnar (ORC) or Parquet are unaffected because they transparently encode character types.

Resolved issues

You cannot successfully virtualize flat files that contain byte order mark (BOM) characters in the first row of data

Fixed in: 4.7.0

If you virtualize CSV, TSV, or JSON files that have BOM characters in the first row, you cannot successfully virtualize and preview the data because extra characters are appended to the virtualized column names.

Workaround: To avoid this issue, do not use BOM characters at the beginning of CSV, TSV, or JSON files. Review the first row in the data file that contains the intended column heading names and make the necessary changes.
Incorrect status messages are displayed when statistics are collected or deleted from virtualized data assets
Fixed in: 4.7.0

When you collect or delete statistics for virtualized data assets in Watson Query, the statistics are not always displayed as expected because of a job status update lag. For example, the statistics collection status might show Not collected even though the statistics were successfully collected.

Workaround: If the collection status messages do not display the expected status values, select View job detail to view accurate collection status values. You can also delete a job manually and create a new job to collect accurate statistics.

Cannot add a Hive or other Kerberos-enabled data source connection with personal credentials from Platform connections to Watson Query
Fixed in: 4.7.0
When you add a new Hive or other Kerberos-enabled data source connection from Platform connections to Watson Query, and the connection is configured with personal credentials, you might be prompted to add the required credentials. However, after you provide the credentials, the connection is not added and you receive an error message similar to the following example:
Can't authenticate connection Permission to the Asset API is denied. Contact IBM software support. The message received from the Asset API was: {"trace":"c5b73p65qayanboafvtj7a5xq","errors":[{"code":"not_authorized","message":"ASTSV3108E: User '1000331067' with catalog role '[viewer]' must be an editor or owner of the asset to perform the operation 'ASSET_ATTACHMENT_ATTRIBUTE_CREATE_UPDATE_DELETE'."}]}.
Assigned business terms are not displayed in the Watson Query Virtualize list
Fixed in: 4.7.0

When you view the list of assigned business terms in the Watson Query Virtualize page and the assigned business terms belong to the same table under the same physical data source but in different Watson Knowledge Catalog catalogs, some assigned terms might not be displayed.

This issue occurs when you create a new connection in Watson Knowledge Catalog rather than selecting an existing platform connection. When you query the connection, the hostname is obtained and used to construct the data source resource key. The key might get a different value each time, so the different connections might not be identified as the same asset because the resource keys differ.

Workaround:

To work around this issue and view the assigned business terms, complete the following steps:
  1. In Platform connections, create a new connection.
  2. Import this connection from the Watson Knowledge Catalog catalog, import the table from this connection, and then assign the terms to this table manually.
  3. Import the connection from the Watson Query data source.
  4. Go to the Virtualize page to see the assigned terms.
Business terms are not displayed in the Watson Query Virtualize page when the data source schema name contains special characters
Fixed in: 4.7.0
Business terms might not be displayed in the Watson Query Virtualize page when the data source schema name contains characters other than uppercase A-Z, underscore, or digits 0 to 9. As a result, you cannot virtualize tables from those schemas in strict virtualization mode. In standard mode, you cannot name virtual tables and columns to match business terms.
Authorizations are not carried over when you rename groups in Watson Query

Fixed in: 4.7.0. For more information about how this issue is resolved, see MIGRATE_GROUP_AUTHZ stored procedure in Watson Query.

Watson Query group-level authorizations are assigned to group names. When groups are renamed, Watson Query authorizations are not automatically propagated to the new group name.
Workaround: You must manually revoke authorizations from the previous group names and apply them to the new ones.
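If you migrate the grants manually with SQL instead of using the stored procedure, the pattern is a paired REVOKE and GRANT per object. The table and group names below are hypothetical.

```sql
-- Hypothetical example: move a SELECT authorization from the old group name to the new one
REVOKE SELECT ON TABLE MYSCHEMA.MYTABLE FROM GROUP OLDGROUP;
GRANT SELECT ON TABLE MYSCHEMA.MYTABLE TO GROUP NEWGROUP;
```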