Incorrect results when querying recently refreshed or loaded partitioned tables
SELECT might return incorrect results when run concurrently with ADD or DROP PARTITION.
Symptoms
The correct or expected results are seen if the SELECT is run again, generally within a half hour.
Causes
When multiple operations that evict or load table metadata into the scheduler cache are run concurrently against the same table, it is possible to hit a race condition. This race condition results in the refreshed cache not reflecting the latest changes to the table. As a result, a SELECT from that table may not return the expected data.
ADD and DROP PARTITION statements are passed through to Hive, which uses its own internal locking, these statements do not block ongoing DML, like SELECT, which might depend on the affected data.
By default, ALTER Datalake TABLE ADD (or DROP) PARTITION commands will not evict the entire table from the cache, but will load (or evict) the specified partition metadata only.
The entire table is evicted when:
- Dropping a partial partition.
- Altering a table when the parameter scheduler.cache.delta.updates.for.partitions is disabled (set to FALSE in bigsql_conf.xml).
Resolving the problem
There are a few ways to address this rare condition:
- Do not run concurrent DDL and DML on the same table.
- Enable scheduler.cache.delta.updates.for.partitions (if it has been disabled).
- Stop the concurrent queries (and any other connections concurrently querying the table), and rerun the DROP PARTITION and ADD PARTITION statements. This will forcibly evict the stale cache reference from the scheduler.
- To force a refresh on the table metadata, run the following
command:.
For more information, see MSCK REPAIR TABLE (Datalake) statement.db2 MSCK REPAIR TABLE <TABLENAME>
- Run the HCAT_CACHE_SYNC stored procedure on the affected table. For more information, see HCAT_CACHE_SYNC stored procedure.