Tuning for YCSB/HBase
HBase configuration
- If you use Hortonworks or IBM® BigInsights®, change hbase-site.xml from Ambari.
If you use open source HBase, you can
modify $HBASE_HOME/conf/hbase-site.xml directly.
Table 1. HBase Configuration Tuning

| Configuration | Default value | Recommended value | Comments |
|---|---|---|---|
| Java™ Heap | N/A | Refer to the Memory tuning section. | HBase Master server Heap Size; HBase Region Server Heap Size |
| hbase.regionserver.handler.count | 30 | 60 | |
| zookeeper.session.timeout | N/A | 180000 | |
| hbase.hregion.max.filesize | 10737418240 | 10737418240 | Check the default value. If it is not 10 GB, change it to 10 GB. |
| hbase.hstore.blockingStoreFiles | 10 | 50 | |
| hbase.hstore.compaction.max | 10 | 10 | |
| hbase.hstore.compaction.max.size | LONG.MAX_VALUE | Variable | If you see a lot of compaction, you could set this to 1 GB to exclude larger HFiles from compaction. |
| hbase.hregion.majorcompaction | 604800000 | 0 | Turn off major compaction when running the benchmark to ensure that the results are stable. In production, this should not be changed. |
| hbase.hstore.compactionThreshold | 3 | 3 | |
| hbase.hstore.compaction.max | 10 | 3 | |

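
For open source HBase, the recommended values in Table 1 end up as `<property>` entries in hbase-site.xml. The following Python sketch renders such a fragment. The helper name `render_hbase_site` is hypothetical, hbase.hstore.compaction.max is left out because Table 1 lists it with two different recommended values, and hbase.hstore.compaction.max.size is left out because its value is workload dependent.

```python
from xml.sax.saxutils import escape

# Recommended benchmark values copied from Table 1 (not production settings).
RECOMMENDED = {
    "hbase.regionserver.handler.count": "60",
    "zookeeper.session.timeout": "180000",
    "hbase.hregion.max.filesize": "10737418240",
    "hbase.hstore.blockingStoreFiles": "50",
    "hbase.hregion.majorcompaction": "0",
    "hbase.hstore.compactionThreshold": "3",
}


def render_hbase_site(props):
    """Return hbase-site.xml text with one <property> element per setting."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines += [
            "  <property>",
            f"    <name>{escape(name)}</name>",
            f"    <value>{escape(value)}</value>",
            "  </property>",
        ]
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_hbase_site(RECOMMENDED))
```

Merge the generated properties into the existing hbase-site.xml rather than overwriting the file, because a real deployment carries other settings as well.
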
| Configuration | Default value | Recommended value | Comments |
|---|---|---|---|
| pagepool | 1GB | 30% of physical memory | 30% of physical memory |

Note: The value of 30% of physical memory applies only when the cluster runs HBase/YCSB alone. In production, you need to consider the
memory allocation for other workloads. If you run Map/Reduce jobs or Hive jobs on the same cluster,
you need to trade off performance among these workloads. If you allocate more memory to
pagepool because of HBase, you will have less memory for Map/Reduce jobs, which degrades
their performance.

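
As a quick illustration of the 30% guideline, the following Python sketch computes a pagepool size from the physical memory of a node and shows how much memory remains for everything else. The function name `pagepool_gib` and the sample memory sizes are assumptions chosen for illustration.

```python
def pagepool_gib(physical_memory_gib, fraction=0.30):
    """Return the pagepool size in whole GiB for a given share of RAM."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    return int(physical_memory_gib * fraction)


if __name__ == "__main__":
    # Hypothetical node sizes; replace with the memory of your own nodes.
    for ram in (64, 128):
        pool = pagepool_gib(ram)
        print(f"{ram} GiB RAM: pagepool {pool} GiB, {ram - pool} GiB left for other workloads")
```

Lowering `fraction` is the simplest way to model a cluster that also runs Map/Reduce or Hive jobs.
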
| Configuration | Default value | Recommended value | Comments |
|---|---|---|---|
| writebuffersize | 12MB | 12MB | |
| clientbuffering | False | True | For benchmark, keep this the same as what you use to run YCSB over native HDFS. |
| recordcount | 1000 | 1000000 | |
| operationcount | N/A | N/A | Depends on the number of operations you want to benchmark. For example, 20M operations |
| threads | N/A | Variable | Depends on the number of threads you want to benchmark. |
| requestdistribution | zipfian | Not changed | |
| recordsize | 100*10 | Not changed | YCSB for HBase takes 100 bytes per field and 10 fields for one record. |
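
The settings in the table above map to YCSB `-p name=value` properties plus the `-threads` option. The following Python sketch assembles those arguments and estimates the raw data volume from the record size. The function names and the sample thread count are hypothetical, and 12 MB is assumed to mean 12 * 1024 * 1024 bytes.

```python
FIELD_COUNT = 10    # fields per record, left unchanged per the table
FIELD_LENGTH = 100  # bytes per field, left unchanged per the table


def ycsb_args(recordcount, operationcount, threads):
    """Build the YCSB property arguments for the benchmark settings."""
    props = {
        "recordcount": recordcount,
        "operationcount": operationcount,
        "clientbuffering": "true",
        "writebuffersize": 12 * 1024 * 1024,  # assumed byte value of 12 MB
        "requestdistribution": "zipfian",
    }
    args = []
    for name, value in props.items():
        args += ["-p", f"{name}={value}"]
    args += ["-threads", str(threads)]
    return args


def dataset_bytes(recordcount):
    """Raw payload size: records * fields * bytes per field."""
    return recordcount * FIELD_COUNT * FIELD_LENGTH


if __name__ == "__main__":
    # 20M operations follows the example in the table; 32 threads is arbitrary.
    print(" ".join(ycsb_args(1_000_000, 20_000_000, 32)))
    print(dataset_bytes(1_000_000), "bytes of raw field data")
```

With the recommended recordcount of 1000000, the raw field data is about 1 GB before HBase storage overhead and replication.
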

Important: When you create the HBase table before running YCSB, you need to pre-split the
table. For example, pre-split the table into 100 partitions for about 10 HBase
Region servers. If you have more than 10 HBase Region servers, increase the number of pre-split
partitions.
If you do not pre-split the table, all requests are handled by only a few HBase Region servers, which degrades YCSB performance.
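
A minimal Python sketch of how pre-split points could be generated, assuming YCSB's default row-key format of `user` followed by digits and the ratio of about 10 partitions per Region server from the example above. The table name `usertable` is the YCSB default; the column family name `family` and the function names are assumptions. Note that N split keys produce N + 1 regions.

```python
def split_keys(n_splits, low=1000, high=9999):
    """Return n_splits evenly spaced row-key boundaries such as 'user1089'."""
    return [f"user{low + i * (high - low) // n_splits}" for i in range(1, n_splits + 1)]


def create_statement(region_servers, splits_per_server=10):
    """Render an HBase shell 'create' statement with explicit split points."""
    keys = split_keys(region_servers * splits_per_server)
    quoted = ", ".join(f"'{key}'" for key in keys)
    # 'family' is a placeholder column family; use the one passed to YCSB.
    return f"create 'usertable', 'family', SPLITS => [{quoted}]"


if __name__ == "__main__":
    print(create_statement(region_servers=10))
```

Paste the printed statement into the HBase shell before loading data, and raise `region_servers` to match larger clusters.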