Query optimization with materialized query tables

Materialized query tables (MQTs) are a powerful way to improve response time for complex analytical queries because their data consists of precomputed results from the tables that you specify in the materialized query table definitions.
MQTs can improve response time, particularly for queries that use one or more of the following types of data:
  • Aggregate data over one or more dimensions
  • Joins and aggregate data over a group of tables
  • Data from a commonly accessed subset of data
The larger the tables on which MQTs are based, the more significant are the potential improvements in response time when you use MQTs.

MQTs can effectively eliminate overlapping work among queries. Computations are performed only once when MQTs are built and once each time that they are refreshed, and their content can be reused during the execution of many queries.

You can create MQTs as Db2® tables on the head node only. You can also create indexes on Db2 MQTs, but these indexes also reside on the head node only.

Hadoop MQTs are stored on HDFS. The data from these MQTs is accessible from any node in your cluster, thereby reducing network traffic. And because there is usually more storage capacity on HDFS than on local disks, capacity is less of an issue when using Hadoop MQTs.

You can create Hadoop MQTs over nicknames (federated sources), which enables you to cache some of the data from the federated tables in Db2 Big SQL. Querying these MQTs over nicknames can be far more efficient than querying the nicknames themselves. You need to drop such MQTs before you can drop the nicknames on which they are based.

Hadoop MQTs must be maintained by the user. If an underlying base table has grown, you must manually insert the new data into the MQT; otherwise, the new rows will not be accessible when the MQT is used. Altering a base table may not be allowed if an MQT is defined on the base table. If an error (SQL0270N) is returned, you must drop the MQT before altering the table. If the base table is dropped, any Hadoop MQT on the base table is automatically removed.

It is not recommended to alter an MQT or a base table with dependent MQTs from Hive. After a table with a dependent MQT is altered in Hive, Db2 Big SQL might not be able to apply the changes, and subsequent queries might fail. Such dependent MQTs must be dropped before their base table can be synced successfully. Recreate the MQTs after the sync operation completes.

The following table shows the operations that you can perform on an MQT from either Db2 Big SQL or Hive and the expected results.

Table 1. Valid operations on MQTs from Db2 Big SQL or Hive
Operation Effects on Db2 Big SQL tables Effects on Hive tables
Create an MQT from Db2 Big SQL The MQT is registered and a Hadoop table is created. A Hive table is created.
Populate the base table from Db2 Big SQL or Hive The base table is populated and the MQT becomes stale. The Hive base table is populated.
Populate a Hadoop MQT from Db2 Big SQL or Hive The MQT is populated or refreshed. The Hive table is populated.
Enable query optimization Db2 Big SQL considers the MQT during query planning. Not applicable
Alter MQT column definitions from Db2 Big SQL Not allowed Not applicable
Alter Hive equivalent MQT column definitions The MQT must be dropped and then recreated to reflect the changes from Hive. Hive equivalent MQT column definitions are updated.
Alter column definitions of MQT base tables in Db2 Big SQL Not allowed Not applicable
Alter column definitions of MQT base tables in Hive The base table definition in Hive and Db2 Big SQL are out of sync. The MQT must be dropped before the base table definitions can be synchronized. Base table column definitions change.
Drop an MQT The MQT is dropped. The Hive equivalent MQT is dropped.
Drop a base table The base table and all of its dependent MQTs are dropped. The base table is dropped.

If a query directly references an MQT, security policies are not inherited from the base table. However, if the query is selecting from the base table, security policies on the base table are applied to the MQT.

MQT matching or automatic query rewrite is possible with Hadoop MQTs. This means that queries do not need to be manually rewritten to take advantage of any Hadoop MQT.

You can create Hadoop MQTs as partitioned tables (see CREATE TABLE (HADOOP) statement). Retrieving data from a partitioned Hadoop MQT can be faster than retrieving data from a nonpartitioned Hadoop MQT.

For more information about Hadoop MQTs and Db2 Big SQL performance, see Hadoop Materialized Query Tables (MQTs) Boost Big SQL Performance.