Query Optimizer overview
In the domain of database management systems, query optimizers play a pivotal role in ensuring efficient execution of database queries. Query Optimizer, a component of IBM® watsonx.data, improves the performance of queries that are processed by Presto (C++) engine. If optimization is analyzed to be feasible, the query undergoes rewriting; otherwise, the native engine optimization takes precedence.
Within watsonx.data, Query Optimizer operates as a component, tasked with optimizing queries. It accepts a standard SQL query as input and generates an optimized SQL equivalent, which is tailored for enhanced execution by Presto (C++). In instances where optimization is not feasible, the system reverts to using the original query.
Query Optimizer emerges as a valuable addition to the watsonx.data, empowering users to optimize their queries and achieve enhanced performance from their engines.
You can see Query Optimizer section for more information.
Advantages of Query Optimizer
The query optimization feature of Db2 is leveraged in watsonx.data and the key factors considered include:
- watsonx.data uses Db2 as the Decades-Honed
Query Optimization for Peak Performance.Leveraging extensive development, query optimization feature of Db2 analyzes your SQL queries and generates optimal execution plans. Key factors considered include:
- Accurate statistics:
RUNSTATS
gathers data distribution and cardinality estimates for informed decisions. - Well-designed indexes and constraints: These guide the optimizer towards efficient access paths and enforce data integrity.
- Advanced techniques for complex queries: Cost-based optimization and cardinality estimation ensure efficient processing.
- Accurate statistics:
-
Enhanced Query Performance: Query Optimizer effectively optimizes queries, leading to significant performance improvements.
-
Seamless Integration: Query Optimizer seamlessly integrates with existing watsonx.data infrastructure, ensuring a smooth adoption process.
-
Flexible Optimization: Query Optimizer operates flexibly as users can enable and disable the feature either at global or session level.
- Query Optimizer supports Hive and Iceberg tables.
- Query Optimizer cannot be configured from the watsonx.data web console as the console does not use session variables.
- Query Optimizer cannot verify whether an original query or a rewritten query was run. Verify the Presto (C++) log to get this information.
- When metastores are synced, all schemas and tables are in the uppercase. For example,
"
catalog.SCHEMA.TABLE
". - For optimal performance, you must define constraints like
NOT NULL
,Primary key
, andForeign key
in Query Optimizer after the tables are synced. - Query Optimizer does not support views.
- Decimal and float columns selected using the SELECT query might interchange and can cause mismatch in data type.
- Three-part name queries need quotation marks around the catalog name in lowercase
(
catalog.SCHEMA.TABLE
). Query returns an error otherwise. - Certain queries (full outer join, anti join) do not return the correct result.
- Special characters in the identifier do not work properly.
- Query Optimizer currently only supports
SELECT
statements.