Big SQL memory error during LOAD

Big SQL generates a memory error during a LOAD operation when the size of the input data exceeds the memory available for the LOAD.

Symptoms

The Big SQL LOAD operation fails with a memory exception: OutOfMemoryError.

Resolving the problem

The amount of memory required by a LOAD depends on several factors: the number of partitions in the target table, the size of the input data set, and the number of map and reduce tasks (either specified in the LOAD statement or derived from the number of splits in the data source). To work around this problem, split the input data set into two or more smaller sets and load each set separately.
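For example, if the original LOAD reads one large directory, the input can be staged as two or more smaller directories and appended to the same table in separate statements. The following is a minimal sketch only; the paths, delimiter, schema, and table name are placeholders and should be replaced with your own:

  -- Sketch: load the same table from two smaller input sets instead of one
  -- large one. Paths, delimiter, and table name are illustrative.
  LOAD HADOOP USING FILE URL '/user/bigsql/staging/sales_part1'
    WITH SOURCE PROPERTIES ('field.delimiter' = ',')
    INTO TABLE myschema.sales APPEND;

  LOAD HADOOP USING FILE URL '/user/bigsql/staging/sales_part2'
    WITH SOURCE PROPERTIES ('field.delimiter' = ',')
    INTO TABLE myschema.sales APPEND;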
You can also inspect the following property values to determine whether memory can be further tuned; a sketch of adjusting some of them for the LOAD session follows the list:
  • YARN containers memory set in Ambari > YARN > Configs
  • Map task memory: mapreduce.map.java.opts
  • Heap allocated for Hadoop: HADOOP_HEAPSIZE
  • Memory allocated for data buffered during read operations: io.file.buffer.size
  • File sort buffer memory: mapreduce.task.io.sort.mb
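
The YARN container memory and HADOOP_HEAPSIZE settings are changed through Ambari (or the cluster's Hadoop environment configuration) and require the affected services to be restarted. The MapReduce and I/O properties can, on many configurations, be overridden for the current session before running the LOAD. The sketch below assumes the session is allowed to override these properties; the values are examples only and should be sized to the memory actually available in the YARN containers on your cluster:

  -- Sketch: override Hadoop properties for the current session before a LOAD.
  -- Values are illustrative; size them to the available YARN container memory.
  SET HADOOP PROPERTY 'mapreduce.map.java.opts' = '-Xmx2048m';
  SET HADOOP PROPERTY 'io.file.buffer.size' = '131072';
  SET HADOOP PROPERTY 'mapreduce.task.io.sort.mb' = '256';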