Slow performance for queries linked to pushdown issues to data sources

You might experience delays in receiving results from views in Watson Query due to an SQL dialect mismatch between SAP HANA and Db2® data sources.

Symptoms

When you query a view from a SAP HANA remote data source, it takes a long time to return results.

Causes

Watson Query is not pushing down the query to SAP HANA because of an SQL dialect mismatch. As a result, Federation applies predicates or functions locally instead of pushing them down to the remote data source, and it might also split down the query into multiple branches.

To determine whether the query is split into multiple branches, generate an Explain plan by using the db2expln command, and then view the results. A query that is pushed down to the remote data source displays only two boxes.

For more information, see db2expln - SQL and XQuery Explain command.

Resolving the problem

To resolve the problem, use a remote view to query your tables. If you cannot use a remote view, then try to virtualize your queries natively. Lastly, for every query that you cannot virtualize natively, virtualize each remote table that the native query references. Then construct and optimize the equivalent Watson Query query that references these virtual tables instead of the original remote ones.

Primary solution: Use a remote view
To resolve this problem, use a remote view to query your tables directly from SAP HANA. Watson Query queries a virtual table that points directly to a view, which means the remote data source runs the underlying query.
Secondary solution: Virtualize your tables natively
Virtualize your tables natively if you can't use a remote view to query your tables by completing the following steps.
  1. Determine which queries or virtual tables you need to virtualize and then consider the following.
    • Some queries cannot be virtualized natively such as ones that reference positional parameters or variables, or if the application cannot be modified to reference a single abstract object (such as a virtual table) for the query. In these cases, you need to virtualize the referenced remote tables and then optimize the equivalent query against virtual tables. Refer to the Optional tertiary solution: Enhance query performance.
    • Identify what tables referenced in the virtualized native queries were also virtualized separately. If you don't need to query these virtual tables outside of your virtualized native queries, then consider removing them to avoid maintaining unnecessary virtual tables.
  2. Run the VIRTUALIZENATIVEQUERY stored procedure to virtualize a remote query as a virtual table so that Federation can send the query to SAP HANA without splitting it into multiple branches or apply predicates or functions locally.

    To run the stored procedure, see VIRTUALIZENATIVEQUERY stored procedure.

  3. Run the stored procedure for each remote query that you want to virtualize.
  4. After you run the stored procedure, you can run queries on the virtual table the same way you would with any other virtualized table or query by using SELECT * FROM <VTSCHEMA>.<VTNAME>.

    Replace <VTSCHEMA> with the schema of the virtual table, and replace <VTNAME> with the name of the virtual table.

Optional tertiary solution: Enhance query performance
If you cannot virtualize your queries natively, then virtualize each remote table that the native query references. Then create the equivalent Watson Query query that references these newly created virtual tables instead of the original remote ones. After, consider the following steps to optimize your virtual tables before you query them.
  1. Enable MAX_PUSHDOWN query mode, which attempts to push down as much of the query processing to the data sources despite SQL dialect differences.

    To learn how to enable MAX_PUSHDOWN, see Setting the query mode in Watson Query.

    Note: If you are experiencing query result issues and cannot enable MAX_PUSHDOWN, then adjust the following pushdown properties by referring to Setting the Federation server options.
    • db2_maximal_pushdown
    • varchar2_compat
    • db2_varchar_blankpadded_comparison
    • varchar_no_trailing_blanks
    • collating_sequence
  2. Simplify your SQL queries by avoiding complex structures like Common Table Expressions (CTE), On-Line Analytical Processing (OLAP) functions, or other functions that are specific to Db2. This helps ensure that your query is compatible with the same SQL dialect as the data source, and can be pushed down more effectively.
  3. Query your SAP HANA tables.