升级 Db2 Big SQL 后查询失败并返回 SQL-5105 错误 (Db2 Big SQL)

Db2® Big SQL 3.0.1 升级后,查询运行失败。

症状

SQL 查询失败并返回 SQL-5105 错误,其中带有类似于以下消息的格式不正确文件异常。

SQL5105N The statement failed because a Big SQL component encountered an error. Component receiving the error: "BigSQL IO". Component returning the error: "UNKNOWN". Log entry identifier: "[BSL-1-5895b4337]".
Reason: "Malformed ORC file s3a://ceph1". SQLSTATE=58040Also, in the bigsql.log an error trace similar to the following is observed:
<timestamp> ERROR com.ibm.biginsights.bigsql.dfsrw.reader.DfsBaseReader [Master-1-S:5.1001.1.0.0.95] : [BSL-1-5895b4337] Exception raised by Reader at node: 1 Scan ID: S:5.1001.1.0.0.95 
Table: test.table23 Spark: false VORC: true VPQ: false VAVRO: false VTEXT: false VRCFILE: false VANALYZE: false Exception Label: UNMAPPED(org.apache.orc.FileFormatException: Malformed ORC file s3a://ceph1-os-bucket/warehouse/tablespace/external/hive/test.db/table23/._biginsights_stats/09e382157f925948f772e0c55f116c56.TEMP/1.akmv. 
Invalid postscript.) org.apache.orc.FileFormatException: Malformed ORC file s3a://ceph1-os-bucket/warehouse/tablespace/external/hive/test.db/table23/._biginsights_stats/09e382157f925948f772e0c55f116c56.TEMP/1.akmv. 
Invalid postscript. at org.apache.orc.impl.ReaderImpl.ensureOrcFooter(ReaderImpl.java:274) at org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:580) at org.apache.orc.impl.ReaderImpl.<init>(ReaderImpl.java:369) at org.apache.orc.OrcFile.createReader(OrcFile.java:358) 
at com.ibm.biginsights.bigsql.dfsrw.orc.DfsOrcReader.createOrcReader(DfsOrcReader.java:433) at com.ibm.biginsights.bigsql.dfsrw.orc.DfsOrcSplit2Batch.getOrcReader(DfsOrcSplit2Batch.java:343) at com.ibm.biginsights.bigsql.dfsrw.orc.DfsOrcSplit2Batch.split2Batch(DfsOrcSplit2Batch.java:135) 
at com.ibm.biginsights.bigsql.dfsrw.jaro.DfsSplitManager$SplitRunnable.run(DfsSplitManager.java:132) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at java.lang.Thread.run(Thread.java:820) 
at com.ibm.biginsights.bigsql.dfsrw.jaro.DfsSplit2BatchThread.run(DfsSplit2BatchThread.java:58)

原因

IBM® Cloud Pak for Data 3.0.1 上的 Db2 Big SQL 中,Analyze 实用程序将概要文件(用于增量分析)存储在 HDFS 或对象存储存储区上的表目录位置下的特殊目录 ._biginsights_stats 中。在 Cloud Pak for Data 3.5 中,Analyze 实用程序不再将概要文件存储在 HDFS 或对象存储上。但是,在一些罕见的情况下,._biginsights_stats 下的临时目录中存在旧的概要文件。这些文件可能导致查询失败并返回 SQL-5105 错误。

解决问题

请仔细移除表位置下的 ._biginsights_stats 目录中的任何 TEMP 目录。例如,在症状部分中描述的 bigsql.log 错误中,移除以下目录:

s3a://ceph1-os-bucket/warehouse/tablespace/external/hive/test.db/table23/._biginsights_stats/09e382157f925948f772e0c55f116c56.TEMP

移除此类 TEMP 目录后,对此表的查询将成功运行。此外,下次 Analyze 实用程序在此表上运行时,将自动移除旧的 ._biginsights_stats 目录。