APAR status
Closed as fixed if next.
Error description
A SELECT from a Parquet table with large data values may fail with:

SQL5105N The statement failed because a Big SQL component encountered an error. Component receiving the error: "BIGSQL Native IO". Component returning the error: "BIGSQL Native IO". Log entry identifier: "[NRL-003-27b174de ]". SQLSTATE=58040

The native reader log shows:

E1205 17:20:29.406167 4611 bi-dfs-reader.cc:494] [NRL-003-27b174de ] SQL CODE -5105: Error calling GetNext on Scan node (couldn't deserialize thrift msg: No more data to read. ParquetScanner: could not read data page because page header exceeded maximum size of 8.00 MB)

The C++ reader sets the maximum Parquet page header size to 8 MB, so an attempt to read a page with a larger header (caused by a large string column, for example) fails.

The fix makes the maximum header size configurable, with a default of 8 MB (8388608 bytes). It adds support for a new configuration parameter, which must be added to the bigsql-conf.xml file on all nodes, after which Big SQL must be restarted:

<property>
  <name>dfsio.max_page_header_size</name>
  <value>8388608</value>
</property>

Once this property is present on all nodes, changes made on the head node are propagated to the workers when Big SQL is restarted, so the maximum can be raised as needed.
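As an illustration only, the following Python sketch reads or updates the dfsio.max_page_header_size property in the text of a bigsql-conf.xml file. It assumes the file uses a Hadoop-style layout with <property> elements under a single root element (the root name <configuration> in the sample is an assumption, as the APAR shows only the <property> fragment). The helper names are hypothetical and are not part of Big SQL. The sketch works on strings because the APAR does not state where the file is located. Note that ElementTree discards XML comments when it re-serializes, so this is not a drop-in editing tool.

```python
import xml.etree.ElementTree as ET

PROPERTY_NAME = "dfsio.max_page_header_size"  # property name taken from the APAR text
DEFAULT_BYTES = 8 * 1024 * 1024               # documented default: 8 MB = 8388608 bytes


def mb_to_bytes(megabytes):
    """Convert a size in MB (1 MB = 1024 * 1024 bytes) to bytes."""
    return megabytes * 1024 * 1024


def _find_property(root):
    """Return the <property> element for PROPERTY_NAME, or None if absent."""
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == PROPERTY_NAME:
            return prop
    return None


def get_max_page_header_size(conf_xml):
    """Return the configured size in bytes, or None if the property is not set."""
    prop = _find_property(ET.fromstring(conf_xml))
    if prop is None:
        return None
    return int(prop.findtext("value", "").strip())


def set_max_page_header_size(conf_xml, size_bytes):
    """Return the XML text with the property added or its value updated."""
    root = ET.fromstring(conf_xml)
    prop = _find_property(root)
    if prop is None:
        # Property missing: append a new <property> block under the root.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = PROPERTY_NAME
    value = prop.find("value")
    if value is None:
        value = ET.SubElement(prop, "value")
    value.text = str(size_bytes)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical file content; the real bigsql-conf.xml holds other properties.
    sample = "<configuration></configuration>"
    updated = set_max_page_header_size(sample, mb_to_bytes(16))
    print(get_max_page_header_size(updated))
```

The value is expressed in bytes, so a 16 MB limit would be written as 16777216. The edited file still has to be placed on all nodes and Big SQL restarted, as described above.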
Local fix
Using the Java reader instead of the native C++ reader works around the problem, although it may have performance implications:

db2 "SET DFS_EXTERNAL_INPUT_LIBRARY = 'JAVA'"
<run select>
db2 "SET DFS_EXTERNAL_INPUT_LIBRARY = NULL"
Problem summary
See the error description above.
Problem conclusion
Temporary fix
Comments
APAR Information
APAR number
PI92937
Reported component name
INFO BIGINSIGHT
Reported component ID
5725C0900
Reported release
425
Status
CLOSED FIN
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2018-01-25
Closed date
2020-09-09
Last modified date
2020-09-09
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Applicable component levels
Line of Business: Data and AI (LOB10)
Business Unit: IBM Software w/o TPS (BU059)
Product: IBM Db2 Big SQL (SSCRJT)
Platform: Platform Independent (PF025)
Version: 425
Document Information
Modified date:
10 September 2020