APAR status
Closed as program error.
Error description
Over time, BigSQL operations may begin to fail because of excessive memory usage by the DFS C++ read/write process. Errors include SQL5199 reason code 2 (unable to allocate more memory for the FMP memory set).

The excessive memory usage is caused by the large number of malloc memory arenas created under RedHat's per-thread arena strategy. A malloc arena is created for every thread within the process, up to a maximum of 8 times the number of cores.

Additional overhead comes from the default memory map (mmap) threshold, which grows to the size of the largest block ever freed (this behavior is not specific to RedHat). Any allocation below this threshold is retained for reuse, which both contributes to fragmentation and prevents the application from returning memory to the operating system when it calls free.

The memory footprint is accounted for in the DFSRW_PRIVATE memory consumer of the BIGSQL DB2 instance memory controller. To view it, run "db2pd -dbptnmem" on a given worker node and check the current usage for the DFSRW_PRIVATE consumer.

The db2diag.log should show excessive sizes for the DFSRW_PRIVATE consumer at the time of the errors, e.g.

    $ grep DFSRW_PRIVATE <db2diag.log>
    DFSRW_PRIVATE - Current size : 80008000 KB, HWM : 80008000 KB, Cached : 0 KB

DFSRW_PRIVATE is normally no higher than roughly 50% of the Instance Memory limit.
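The check described above can be scripted. The following is a minimal sketch, not part of the APAR: the function names dfsrw_sizes and over_half are hypothetical, the pattern assumes the db2diag.log line has the same layout as the sample shown in the description, and the instance memory limit in KB must be supplied by the caller.

```shell
#!/bin/sh
# Read db2diag.log lines on stdin and print the current size and
# high-water mark (both in KB) for every DFSRW_PRIVATE entry.
# Assumes the line layout matches the sample in the error description.
dfsrw_sizes() {
  sed -n 's/.*DFSRW_PRIVATE - Current size[[:space:]]*:[[:space:]]*\([0-9][0-9]*\) KB,[[:space:]]*HWM[[:space:]]*:[[:space:]]*\([0-9][0-9]*\) KB.*/current_kb=\1 hwm_kb=\2/p'
}

# Succeed when a consumer size ($1, in KB) is above 50% of the
# instance memory limit ($2, in KB). The limit is an input chosen
# by the caller; this sketch does not query DB2 for it.
over_half() {
  [ "$1" -gt $(( $2 / 2 )) ]
}
```

One possible use is "grep DFSRW_PRIVATE <db2diag.log> | dfsrw_sizes", followed by over_half with the reported current_kb value and the instance memory limit for that node.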
Local fix
Add a cap of 16 malloc arenas and an mmap threshold of 1 MB by inserting the following into <bigsql home>/sqllib/userprofile :

    <snip>
    HIVE_HOME=$BIGSQL_DIST_HOME/hive
    export HIVE_HOME
    MALLOC_ARENA_MAX=16
    export MALLOC_ARENA_MAX
    MALLOC_MMAP_THRESHOLD_=1048576
    export MALLOC_MMAP_THRESHOLD_
    DB2ENVLIST="LD_LIBRARY_PATH DB2LIBPATH BIGSQL_DIST_HOME BIGSQL_HOME HADOOP_HOME EGO_CONFDIR"
    DB2ENVLIST="${DB2ENVLIST} HADOOP_CONF_DIR HIVE_HOME"
    DB2ENVLIST="${DB2ENVLIST} LIBHDFS_OPTS DB2_EXT_TABLE_READER HADOOP_MAPRED_HOME"
    DB2ENVLIST="${DB2ENVLIST} SQOOP_HOME BIGSQL_DIST_LIB BIGSQL_DIST_VAR BIGSQL_AUX_JARS_PATH"
    DB2ENVLIST="${DB2ENVLIST} GSK_STRICTCHECK_CBCPADBYTES"
    DB2ENVLIST="${DB2ENVLIST} DB2_BIGSQL_LIBPATH DB2_BIGSQL_CLASSPATH"
    DB2ENVLIST="${DB2ENVLIST} METASTORE_PORT HCAT_PID_DIR HCAT_LOG_DIR HCAT_CONF_DIR DBROOT"
    DB2ENVLIST="${DB2ENVLIST} MALLOC_ARENA_MAX"
    DB2ENVLIST="${DB2ENVLIST} MALLOC_MMAP_THRESHOLD_"

Above, the new lines to set were:

    MALLOC_ARENA_MAX=16
    export MALLOC_ARENA_MAX
    MALLOC_MMAP_THRESHOLD_=1048576
    export MALLOC_MMAP_THRESHOLD_

and

    DB2ENVLIST="${DB2ENVLIST} MALLOC_ARENA_MAX"
    DB2ENVLIST="${DB2ENVLIST} MALLOC_MMAP_THRESHOLD_"

After adding the above, bigsql must be recycled twice for the userprofile to be propagated to all nodes, i.e.

    bigsql stop; bigsql start; bigsql stop; bigsql start

It is strongly advised to upgrade glibc to a level containing a fix for the Linux "cyclic malloc arena selection bug". This bug results in very imbalanced memory usage across the arenas, which causes inefficient and excessive memory usage as well as performance degradation. The fix is contained in the following glibc levels :

    RHEL 6 : glibc 2.12.1.192 ( bug ID 1264189 )
             https://bugzilla.redhat.com/show_bug.cgi?id=1264189
    RHEL 7 : glibc 2.17-157 ( bug ID 1276753 )
             https://bugzilla.redhat.com/show_bug.cgi?id=1276753

as well as glibc 2.23.
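After the two restarts, it can be useful to confirm that the new settings reached a running process. The following is a minimal sketch, not part of the APAR: the function names env_value and check_malloc_env are hypothetical, and it assumes a Linux /proc file system and permission to read the target process's environ file (the process owner or root). Identifying the PID of the DFS read/write process is left to the operator.

```shell
#!/bin/sh
# Print the value of variable $2 from the NUL-separated environ file $1.
# Prints nothing if the variable is not present in that environment.
env_value() {
  tr '\000' '\n' < "$1" | sed -n "s/^$2=//p"
}

# Report both malloc settings from the local fix for one environ file,
# for example /proc/<pid>/environ of a running process.
check_malloc_env() {
  for name in MALLOC_ARENA_MAX MALLOC_MMAP_THRESHOLD_; do
    value=$(env_value "$1" "$name")
    echo "$name=${value:-<not set>}"
  done
}
```

For example, "check_malloc_env /proc/<pid>/environ" reports each variable or "<not set>" on every node checked. On RHEL, "rpm -q glibc" shows the installed glibc level for comparison with the fixed levels listed above.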
Problem summary
See Description
Problem conclusion
The problem is fixed in Version 4.2.0.0 December 2016 Refresh (DB2 level s161128, Bigsql-dist version 5.78.5.310).
Temporary fix
See Description
Comments
APAR Information
APAR number
PI61701
Reported component name
INFO BIGINSIGHT
Reported component ID
5725C0900
Reported release
400
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2016-04-29
Closed date
2017-05-12
Last modified date
2017-05-31
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Modules/Macros
DFSRW
Fix information
Fixed component name
INFO BIGINSIGHT
Fixed component ID
5725C0900
Applicable component levels
R410 PSN
UP
R420 PSN
UP
Document Information
Modified date:
25 August 2020