Memory tuning

This topic describes memory tuning for HDFS Transparency and IBM Storage Scale.

Refer to the following table to plan the system memory for each node:
Table 1. System Memory Allocation

Total memory   Recommended reserved   Recommended reserved   IBM Storage Scale   Transparency NameNode   Transparency DataNode
per node       system memory          HBase memory           pagepool            heap size               heap size
16 GB          2 GB                   2 GB                   4 GB                2 GB+                   2 GB
24 GB          4 GB                   4 GB                   6 GB                2 GB+                   2 GB
48 GB          6 GB                   8 GB                   12 GB               2 GB+                   2 GB
64 GB          8 GB                   8 GB                   16 GB               2 GB+                   2 GB
72 GB          8 GB                   8 GB                   18 GB               2 GB+                   2 GB
96 GB          12 GB                  16 GB                  20 GB               2 GB+                   2 GB
128 GB         20 GB                  24 GB                  20 GB               2 GB+                   2 GB
256 GB         32 GB                  32 GB                  20 GB               2 GB+                   2 GB
512 GB         64 GB                  64 GB                  20 GB               2 GB+                   2 GB

Note: For detailed memory requirements, see Recommended hardware resource configuration.
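
As a quick planning aid, the pagepool column of Table 1 can be expressed as a small lookup. This is only a sketch: the function name recommended_pagepool_gb is hypothetical, and the choice to fall back to the next smaller row for memory sizes that are not listed in the table is an assumption, not guidance from this topic.

```shell
# Print the Table 1 pagepool recommendation (in GB) for a node with the
# given total memory (in GB).
# Assumption: a size between two table rows uses the next smaller row.
recommended_pagepool_gb() {
    total_gb="$1"
    if   [ "$total_gb" -ge 96 ]; then echo 20
    elif [ "$total_gb" -ge 72 ]; then echo 18
    elif [ "$total_gb" -ge 64 ]; then echo 16
    elif [ "$total_gb" -ge 48 ]; then echo 12
    elif [ "$total_gb" -ge 24 ]; then echo 6
    elif [ "$total_gb" -ge 16 ]; then echo 4
    else
        # Table 1 has no row below 16 GB.
        echo "no Table 1 row for ${total_gb} GB" >&2
        return 1
    fi
}
```

For example, recommended_pagepool_gb 64 prints 16, which matches the 64 GB row.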

The HDFS Transparency DataNode service is a lightweight daemon and does not need much memory.

When Ranger support is enabled, the HDFS Transparency NameNode caches inode information. Ranger support is enabled by default; you can turn it off by setting gpfs.ranger.enabled=false in gpfs-site.xml. If the Transparency NameNode heap size is too small, JVM garbage collection runs frequently. The heap size is usually 2 GB, and you can increase it up to 4 GB if your file system holds a large number of files.

Note:
  • From HDFS Transparency 3.1.0-6 and 3.1.1-3, ensure that the gpfs.ranger.enabled field is set to scale. The scale option replaces the original true/false values.
  • Transparency NameNode does not manage an FSImage as native HDFS does, so it does not need as much memory as native HDFS to handle a large number of files.
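
Before changing the NameNode heap size, it can help to confirm how gpfs.ranger.enabled is currently set. The following sketch reads a property value from a Hadoop-style XML configuration file. The function name get_hadoop_property is hypothetical, the path to gpfs-site.xml is left as an argument because it depends on your installation, and the parsing assumes that each <name> and <value> element sits on a single line.

```shell
# Print the <value> of the named property from a Hadoop-style XML file.
# Assumption: each <name> and <value> element is written on a single line.
get_hadoop_property() {
    file="$1"
    name="$2"
    awk -v want="$name" '
        /<name>/ {
            n = $0
            sub(/.*<name>[ \t]*/, "", n)
            sub(/[ \t]*<\/name>.*/, "", n)
            found = (n == want)      # remember whether this is our property
        }
        /<value>/ && found {
            v = $0
            sub(/.*<value>[ \t]*/, "", v)
            sub(/[ \t]*<\/value>.*/, "", v)
            print v
            exit
        }
    ' "$file"
}
```

A call such as get_hadoop_property /path/to/gpfs-site.xml gpfs.ranger.enabled prints the configured value, or nothing if the property is not set in the file.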

The pagepool is the memory cache for IBM Storage Scale. The pagepool sizes listed in Table 1 are recommended for most production cases. However, if you mainly run HBase and want the best HBase performance, follow section 4.6 Tuning HBase/YCSB.

Table 2. How to change the memory size
Configuration Guide
IBM Storage Scale pagepool size

mmchconfig pagepool=XG -N <node1,node2…>

Restart the IBM Storage Scale daemon for the change to take effect.
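
The pagepool change and the daemon restart can be sketched as the following command sequence. The script only prints the commands so that you can review them first. PAGEPOOL_GB and NODES are hypothetical placeholder values, and restarting with mmshutdown followed by mmstartup on the same nodes is one common way to restart the daemon, shown here as an assumption about your procedure.

```shell
# Print (do not run) the commands for changing the pagepool on a set of nodes.
# PAGEPOOL_GB and NODES are placeholders; substitute your own values.
PAGEPOOL_GB=16
NODES="node1,node2"

pagepool_commands() {
    echo "mmchconfig pagepool=${PAGEPOOL_GB}G -N ${NODES}"
    # Restart the daemon on the same nodes so that the new size takes effect.
    echo "mmshutdown -N ${NODES}"
    echo "mmstartup -N ${NODES}"
    # Display the configured pagepool value afterwards.
    echo "mmlsconfig pagepool"
}

pagepool_commands
```

Stopping the daemon makes the file system unavailable on those nodes, so plan the restart for a time when no workloads are using them.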

Transparency NameNode Heap Size

In the Ambari GUI, go to HDFS > Configs and change the value of NameNode Java heap size.

If you use community Hadoop, modify the HADOOP_NAMENODE_OPTS variable in $HADOOP_HOME_DIR/etc/hadoop/hadoop-env.sh. For example, add -Xms2048m -Xmx2048m to set the heap size to 2 GB. If the -Xms and -Xmx options are already present, modify their values (2048 in this example) directly.

Restart HDFS Transparency for the change to take effect.
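
For community Hadoop, the NameNode setting described above might look like the following fragment of hadoop-env.sh. This is a minimal sketch that assumes the variable does not already contain -Xms or -Xmx; if it does, edit the existing values instead of appending new ones.

```shell
# Fragment for hadoop-env.sh: give the Transparency NameNode a fixed 2 GB heap.
# Appending keeps any options that are already set in the variable.
# Assumption: the existing value contains no -Xms or -Xmx options.
export HADOOP_NAMENODE_OPTS="${HADOOP_NAMENODE_OPTS:-} -Xms2048m -Xmx2048m"
```

Setting -Xms and -Xmx to the same value gives the JVM a fixed heap size, as in the example in the text.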

Transparency DataNode Heap Size

In the Ambari GUI, go to HDFS > Configs and change the value of DataNode maximum Java heap size.

If you use community Hadoop, modify the HADOOP_DATANODE_OPTS variable in $HADOOP_HOME_DIR/etc/hadoop/hadoop-env.sh. For example, add -Xms2048m -Xmx2048m to set the heap size to 2 GB. If the -Xms and -Xmx options are already present, modify their values (2048 in this example) directly.

Restart HDFS Transparency for the change to take effect.
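
When -Xms and -Xmx are already present, the values have to be changed in place rather than added again. The following helper is one way to cover both cases. The function name set_heap_mb is hypothetical, and the sketch assumes that the options are separated by single spaces.

```shell
# Set -Xms and -Xmx in a JVM option string to the given size in MB.
# Existing -Xms/-Xmx values are rewritten in place; missing ones are appended.
set_heap_mb() {
    opts="$1"
    mb="$2"
    # Rewrite any existing heap options, whatever unit suffix they use.
    opts=$(printf '%s' "$opts" | sed -E \
        -e "s/-Xms[0-9]+[kKmMgG]?/-Xms${mb}m/g" \
        -e "s/-Xmx[0-9]+[kKmMgG]?/-Xmx${mb}m/g")
    # Append whichever option was not present.
    case "$opts" in *-Xms*) ;; *) opts="$opts -Xms${mb}m" ;; esac
    case "$opts" in *-Xmx*) ;; *) opts="$opts -Xmx${mb}m" ;; esac
    # Drop the leading space that is left when the input was empty.
    printf '%s\n' "${opts# }"
}
```

In hadoop-env.sh, this could be used as export HADOOP_DATANODE_OPTS="$(set_heap_mb "${HADOOP_DATANODE_OPTS:-}" 2048)", with the same pattern applying to HADOOP_NAMENODE_OPTS.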