HCAT_SYNC_OBJECTS fails with SQL5105N

The HCAT_SYNC_OBJECTS stored procedure returns SQL5105N and a FileNotFound exception is written to the scheduler log during an attempted table scan.

Symptoms

SQL5105N is returned after an HCAT_SYNC_OBJECTS procedure call. The log entry contains a FileNotFound exception.

Causes

The underlying partition directory for a partitioned table might have been removed outside of Big SQL. The scheduler encounters this exception when it attempts to rescan the table. The following example shows the error that is returned during an HCAT_SYNC_OBJECTS operation:

bigsql$ db2 "CALL SYSHADOOP.HCAT_SYNC_OBJECTS(
  'bigsql', 'parttab', 'a', 'REPLACE', 'CONTINUE','TRANSFER OWNERSHIP TO hive')"
SQL5105N The statement failed because a Big SQL component encountered an
error. Component receiving the error: "SCHEDULER". Component returning the
error: "FRONT-END". Log entry identifier: "[SCL-0-3d22b14dc]". SQLSTATE=58040
Use the SYSHADOOP.LOG_ENTRY table function to get log file information for a particular log entry identifier.

Diagnosing the problem

The failure is not in the HCAT_SYNC_OBJECTS procedure itself, but in the scheduler, which attempts to locate the table partitions. The error does not prevent the HCAT_SYNC_OBJECTS procedure from completing, but returning the error from the scheduler takes precedence over returning the procedure's result set. To see more details about the failure, query the log entry. For example:

select * from TABLE( SYSHADOOP.LOG_ENTRY('SCL-0-3d22b14dc'))

2019-07-08 07:49:15,004 ERROR com.ibm.biginsights.bigsql.scheduler.server.cache.DescriptorTableCache
[pool-1-thread-2] : [SCL-0-3d20f6df8] Incomplete-table: Front-End-Exception. schemaName=bigsql tableName=parttab.
com.thirdparty.cimp.catalog.TableLoadingException: Failed to load metadata for table: parttab
at com.thirdparty.cimp.catalog.HdfsTable.load(HdfsTable.java:1593)
at com.thirdparty.cimp.catalog.TableLoader.load(TableLoader.java:85)
at com.thirdparty.cimp.catalog.TableLoadingMgr$3.call(TableLoadingMgr.java:296)
at com.thirdparty.cimp.catalog.TableLoadingMgr$3.call(TableLoadingMgr.java:267)
at java.util.concurrent.FutureTask.run(FutureTask.java:277)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.lang.Thread.run(Thread.java:812)
Caused by: com.thirdparty.cimp.catalog.CatalogException: Failed to create partition:
at com.thirdparty.cimp.catalog.HdfsTable.createPartition(HdfsTable.java:1334)
at com.thirdparty.cimp.catalog.HdfsTable.loadPartitions(HdfsTable.java:900)
at com.thirdparty.cimp.catalog.HdfsTable.load(HdfsTable.java:1586)
... 7 more
Caused by: java.io.FileNotFoundException: File hdfs://burners1.fyre.ibm.com:8020/tmp/bigsql/parttab/c4=15 does not exist.
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:748)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:114)
at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:809)
at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:805)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:805)
at com.thirdparty.cimp.catalog.HdfsTable.createPartition(HdfsTable.java:1254)
... 9 more
In this example, the partition directory for partition (c4=15) does not exist. It might have been removed outside of Db2® Big SQL.
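To confirm the diagnosis, you can check whether the partition directory still exists in HDFS. The following is a minimal sketch that uses the path from the log entry shown above; the host name and directory are specific to this example and must be replaced with the path from your own log entry.

```shell
# Path taken from the FileNotFoundException in the log entry above.
# Replace with the path reported in your own log entry.
PART_DIR="hdfs://burners1.fyre.ibm.com:8020/tmp/bigsql/parttab/c4=15"

# hdfs dfs -test -d returns 0 only if the directory exists.
if hdfs dfs -test -d "$PART_DIR"; then
  echo "partition directory present: $PART_DIR"
else
  echo "partition directory missing: $PART_DIR"
fi
```

If the directory is missing, the table definition still references a partition whose data is gone, which matches the scheduler failure described here.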

Resolving the problem

You can use the ALTER TABLE (HADOOP/HBASE) statement to remove a partition from the table definition, as shown in the following example:
db2 "ALTER TABLE bigsql.parttab DROP PARTITION (c4=15)"
Best practice is to use Db2 Big SQL or Hive to remove partitions. Removing partition directories directly in HDFS is not communicated to the scheduler, which in turn causes the error condition that is described here.
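The recovery sequence can be sketched as follows. This is a sketch only, assuming the commands are run as a user with the db2 CLI on the Big SQL head node; the table and partition names are taken from the example above.

```shell
# Table and stale partition from the example above.
TABLE="bigsql.parttab"
PARTITION="c4=15"

# Guard so the sketch degrades gracefully where the db2 CLI is absent.
if ! command -v db2 >/dev/null 2>&1; then
  echo "db2 CLI not found; run these commands on the Big SQL head node"
else
  # Remove the stale partition from the table definition.
  db2 "ALTER TABLE $TABLE DROP PARTITION ($PARTITION)"

  # Re-run the sync to confirm that it now completes without SQL5105N.
  db2 "CALL SYSHADOOP.HCAT_SYNC_OBJECTS(
    'bigsql', 'parttab', 'a', 'REPLACE', 'CONTINUE', 'TRANSFER OWNERSHIP TO hive')"
fi
```

After the drop, the scheduler's table metadata no longer references the missing directory, so the rescan succeeds.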