General HBase tuning
Tuning HBase can improve performance and balance memory usage.
Updating environment variables (hbase-env.sh)
Depending on the availability of memory on the cluster nodes, you can use environment variables to tune the memory that is available to the HBase master server and the HBase region servers. You can also configure the garbage collector. As part of the HBase tuning process, consider the MapReduce workload and the memory that is allocated to the MapReduce JVMs.
After any changes to the variables, save the configuration changes and restart the HBase service.
- Master and region server memory
- Each region server contains regions that contain all of the data in a
The HBASE_HEAPSIZE value is the maximum amount of heap to use, in MB. The default is 1000, which is small for an HBase system that is used regularly in your cluster. To achieve good performance, give HBase as much memory as you can without causing swapping. The example uses a value of 8000, but you should tune the size based on your environment and workloads.
You can increase the HBase master server JVM heap size with the following steps:
- Expand the Advanced hbase-env section.
- In the hbase-env template field, find the current reference to HBASE_HEAPSIZE and set the value, 8000 in this example. If the line starts with a hash sign, remove the hash sign to uncomment the line.
- Then, increase the JVM heap size for the region servers. In the same hbase-env template field, scroll to find the HBASE_REGIONSERVER_OPTS variable, and update the value as follows:
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xms8G -Xmx8g"
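After both steps, the relevant part of the hbase-env template might look like the following fragment. This is only a sketch: 8000 MB and 8 GB are the example values from this topic, and the first line, which defaults the variable to an empty string, is an assumption added so that the fragment can run on its own.

```shell
# Assumption: start from whatever options are already set (empty if none),
# so that this fragment can be run standalone.
HBASE_REGIONSERVER_OPTS="${HBASE_REGIONSERVER_OPTS:-}"

# Maximum heap to use, in MB (example value from this topic).
export HBASE_HEAPSIZE=8000

# Append matching initial and maximum heap sizes for the region server JVMs.
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xms8g -Xmx8g"

# Print the resulting values so that they can be checked before the restart.
echo "HBASE_HEAPSIZE=$HBASE_HEAPSIZE"
echo "HBASE_REGIONSERVER_OPTS=$HBASE_REGIONSERVER_OPTS"
```

Setting -Xms and -Xmx to the same size keeps the heap at a fixed size, which matches the examples in this topic.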
- Garbage collection
- HBase uses the JVM garbage collection subsystem, which reduces some
memory management issues. Garbage collection is an automated system
that handles both the allocation and reclamation of memory for Java
objects. For a JVM that has less than 4 GB of memory, use -Xgcpolicy:gencon. A suggested best practice is the following setting:
-Xms3000m -Xmx3000m -Xgcpolicy:gencon
The -Xms<size> option sets the initial size of the heap. The -Xmx<size> option sets the maximum size of the heap.
For a JVM that has more than 4 GB of memory, use -Xgcpolicy:balanced. With this policy, you do not need to set anything beyond the initial size and the maximum size of the heap.
-Xms8192m -Xmx8192m -Xgcpolicy:balanced
- You can manipulate the garbage collection options in HBASE_OPTS. Search for garbage collection, and then update the setting as follows:
export HBASE_OPTS="$HBASE_OPTS -Xgcthreads2 -Xgcpolicy:gencon -Xalwaysclassgc"
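The choice between the two policies can be sketched as a small shell helper. The function name gc_opts_for_heap_mb is hypothetical. Treating a heap of exactly 4096 MB like a larger heap is an assumption, because this topic specifies only the cases below and above 4 GB.

```shell
# Build JVM heap and garbage collection options from a heap size in MB.
# Assumption: a heap of exactly 4096 MB is treated like a larger heap.
gc_opts_for_heap_mb() {
  heap_mb=$1
  if [ "$heap_mb" -lt 4096 ]; then
    policy=gencon
  else
    policy=balanced
  fi
  # Use the same initial and maximum size, as in the examples in this topic.
  echo "-Xms${heap_mb}m -Xmx${heap_mb}m -Xgcpolicy:${policy}"
}

gc_opts_for_heap_mb 3000   # prints -Xms3000m -Xmx3000m -Xgcpolicy:gencon
gc_opts_for_heap_mb 8192   # prints -Xms8192m -Xmx8192m -Xgcpolicy:balanced
```

The resulting string can be appended to HBASE_OPTS or HBASE_REGIONSERVER_OPTS in the same way as the export lines in this topic.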
Updating configuration values (hbase-site.xml)
- hbase.regionserver.handler.count
- This parameter defines the number of threads that are kept open to answer incoming requests to user tables. The default value is 30.
A rule of thumb is to keep the value low when the payload for each request is large, and to keep the value high when the payload is small. Increase hbase.regionserver.handler.count to a value that is approximately the number of CPUs on the region servers. Go to the Settings tab, find the Number of Handlers per RegionServer field, and move the horizontal bar to the value 64.
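To find a starting point for the handler count, you can check the number of CPUs on a region server node. The following sketch assumes that it runs on the region server itself and that getconf is available there, as it is on Linux and most UNIX systems.

```shell
# Count the CPUs that are currently online on this node.
cpus=$(getconf _NPROCESSORS_ONLN)

# Use the CPU count as a rough starting point, then adjust for the payload size.
echo "Suggested hbase.regionserver.handler.count: about $cpus"
```

Treat the printed number only as a first guess, because the payload rule of thumb can justify a lower or higher value.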
- hbase.hregion.max.filesize
- This parameter is the maximum HStoreFile size. The default value is 10737418240. Decrease the region size: Big SQL determines the number of mappers based on the region size, and there is one mapper for each region. Go to the Settings tab, find the Maximum Region File Size field, and move the horizontal bar to a value between 10 GB and 11 GB.
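The relationship between region size and mapper count can be approximated with simple arithmetic. The following sketch assumes that a table splits into roughly its total size divided by the maximum region file size. Real region counts also depend on pre-splitting and on how evenly the data is distributed. The function name and the table size are hypothetical.

```shell
# Rough estimate: regions (and therefore mappers) = ceil(table size / maximum region file size).
estimate_mappers() {
  table_bytes=$1
  max_region_bytes=$2
  echo $(( (table_bytes + max_region_bytes - 1) / max_region_bytes ))
}

# Hypothetical 100 GiB table with the default maximum of 10737418240 bytes.
estimate_mappers 107374182400 10737418240   # prints 10

# The same table with a hypothetical 5 GiB maximum.
estimate_mappers 107374182400 5368709120    # prints 20
```

Under these assumptions, halving the maximum region file size roughly doubles the number of mappers.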
- hbase.client.write.buffer
- This parameter is the size of the HTable client write buffer in bytes. The default value is 2097152.
- A bigger buffer takes more memory on both the client and the server side, but it reduces the number of remote procedure calls that are made. Increase the hbase.client.write.buffer value. To change these values, click HBase, go to the tab that contains the Custom hbase-site section, and expand that section. Alternatively, you can search for each variable by using the Filter:
hbase.client.write.buffer = 8388608
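The effect of the buffer size on the number of remote procedure calls can be estimated as follows. The sketch assumes that the client sends its buffered writes only when the buffer is full. The actual call count also depends on how the writes are spread across region servers. The function name and the 1 GiB data volume are hypothetical.

```shell
# Rough estimate: buffer flushes = ceil(bytes written / write buffer size).
estimate_flushes() {
  total_bytes=$1
  buffer_bytes=$2
  echo $(( (total_bytes + buffer_bytes - 1) / buffer_bytes ))
}

# Hypothetical 1 GiB of writes with the default 2097152-byte buffer.
estimate_flushes 1073741824 2097152   # prints 512

# The same writes with the 8388608-byte buffer from this topic.
estimate_flushes 1073741824 8388608   # prints 128
```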
- hbase.client.scanner.caching
- This parameter is the number of rows that are fetched when calling next on a scanner, if the call is not served from memory. The default value is 100.
- A higher caching value enables faster scanners, but it uses more memory, and some calls to next can take longer when the cache is empty. Increase the scanner cache size to improve the performance of large reads. To change these values, click HBase, go to the tab that contains the Custom hbase-site section, and expand that section. Alternatively, you can search for each variable by using the Filter:
hbase.client.scanner.caching = 10000
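The trade-off between fewer fetches and more memory can be illustrated with a small calculation. The sketch assumes a full scan of a hypothetical table with 1000000 rows and an average row size of 1024 bytes. The function name scan_cost and both numbers are assumptions, not values from this topic.

```shell
# Estimate the server fetches for a full scan and the approximate size of one cached batch.
scan_cost() {
  total_rows=$1
  caching=$2
  avg_row_bytes=$3
  fetches=$(( (total_rows + caching - 1) / caching ))
  batch_bytes=$(( caching * avg_row_bytes ))
  echo "caching=$caching fetches=$fetches approx_batch_bytes=$batch_bytes"
}

scan_cost 1000000 100 1024     # prints caching=100 fetches=10000 approx_batch_bytes=102400
scan_cost 1000000 10000 1024   # prints caching=10000 fetches=100 approx_batch_bytes=10240000
```

Under these assumptions, raising the value from 100 to 10000 cuts the fetches by a factor of 100 while each batch grows by the same factor, which is why the larger value suits large reads on clients with memory to spare.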