Query optimization with materialized query tables
- Aggregate data over one or more dimensions
- Joins and aggregate data over a group of tables
- Data from a commonly accessed subset of data
MQTs can effectively eliminate overlapping work among queries. Computations are performed only once when MQTs are built and once each time that they are refreshed, and their content can be reused during the execution of many queries.
You can create MQTs as Db2® tables on the head node only. You can also create indexes on Db2 MQTs, but these indexes also reside on the head node only.
Hadoop MQTs are stored on HDFS. The data from these MQTs is accessible from any node in your cluster, thereby reducing network traffic. And because there is usually more storage capacity on HDFS than on local disks, capacity is less of an issue when using Hadoop MQTs.
You can create Hadoop MQTs over nicknames (federated sources), which enables you to cache some of the data from the federated tables in Db2 Big SQL. Querying these MQTs over nicknames can be far more efficient than querying the nicknames themselves. You need to drop such MQTs before you can drop the nicknames on which they are based.
Hadoop MQTs must be maintained by the user. If an underlying base table has grown, you must manually insert the new data into the MQT; otherwise, the new rows will not be accessible when the MQT is used. Altering a base table may not be allowed if an MQT is defined on the base table. If an error (SQL0270N) is returned, you must drop the MQT before altering the table. If the base table is dropped, any Hadoop MQT on the base table is automatically removed.
It is not recommended to alter an MQT or a base table with dependent MQTs from Hive. After a table with a dependent MQT is altered in Hive, Db2 Big SQL might not be able to apply the changes, and subsequent queries might fail. Such dependent MQTs must be dropped before their base table can be synced successfully. Recreate the MQTs after the sync operation completes.
The following table shows the operations that you can perform on an MQT from either Db2 Big SQL or Hive and the expected results.
Operation | Effects on Db2 Big SQL tables | Effects on Hive tables |
---|---|---|
Create an MQT from Db2 Big SQL | The MQT is registered and a Hadoop table is created. | A Hive table is created. |
Populate the base table from Db2 Big SQL or Hive | The base table is populated and the MQT becomes stale. | The Hive base table is populated. |
Populate a Hadoop MQT from Db2 Big SQL or Hive | The MQT is populated or refreshed. | The Hive table is populated. |
Enable query optimization | Db2 Big SQL considers the MQT during query planning. | Not applicable |
Alter MQT column definitions from Db2 Big SQL | Not allowed | Not applicable |
Alter Hive equivalent MQT column definitions | The MQT must be dropped and then recreated to reflect the changes from Hive. | Hive equivalent MQT column definitions are updated. |
Alter column definitions of MQT base tables in Db2 Big SQL | Not allowed | Not applicable |
Alter column definitions of MQT base tables in Hive | The base table definition in Hive and Db2 Big SQL are out of sync. The MQT must be dropped before the base table definitions can be synchronized. | Base table column definitions change. |
Drop an MQT | The MQT is dropped. | The Hive equivalent MQT is dropped. |
Drop a base table | The base table and all of its dependent MQTs are dropped. | The base table is dropped. |
If a query directly references an MQT, security policies are not inherited from the base table. However, if the query is selecting from the base table, security policies on the base table are applied to the MQT.
MQT matching or automatic query rewrite is possible with Hadoop MQTs. This means that queries do not need to be manually rewritten to take advantage of any Hadoop MQT.
You can create Hadoop MQTs as partitioned tables (see CREATE TABLE (HADOOP) statement). Retrieving data from a partitioned Hadoop MQT can be faster than retrieving data from a nonpartitioned Hadoop MQT.
For more information about Hadoop MQTs and Db2 Big SQL performance, see Hadoop Materialized Query Tables (MQTs) Boost Big SQL Performance.