Running YCSB/HBase
This section lists the steps for running YCSB/HBase.
- Download the YCSB 0.14.0 release.
- Extract the YCSB package into $YCSB_HOME.
- Remove all the libraries shipped by YCSB:
    rm -fr $YCSB_HOME/hbase10-binding/lib/*
- Copy the libraries from your Hadoop distribution. The following is for Hortonworks HDP 2.6:
    cp /usr/hdp/2.6.0.3-8/hbase/lib/* $YCSB_HOME/hbase10-binding/lib/
    cp /usr/hdp/2.6.0.3-8/hadoop/*.jar $YCSB_HOME/hbase10-binding/lib/
- Create the following scripts. The script for loading:
    vim ycsb_workload_load.sh

    #!/bin/bash
    set -x
    # Usage: <script> <thread-number> <result sub dir>
    # stdout goes to the output file; because 2>&1 precedes the > redirection,
    # stderr still goes to the terminal.
    ${YCSB_HOME}/bin/ycsb load hbase10 -P ${YCSB_HOME}/workloads/workloada \
      -p columnfamily=family -p recordcount=${RECORD_COUNT} \
      -s -threads $1 \
      -p measurementtype=timeseries -p timeseries.granularity=2000 \
      2>&1 > ${YCSB_RESULT_HOME}/$2/workload_load.output.thread-$1-.`date "+%y%m%d_%H%M%S"`

In the script above, RECORD_COUNT is the number of records to load into HBase; it should be 1M. Choose <thread-number> to suit your test environment. YCSB_HOME, YCSB_RESULT_HOME, and RECORD_COUNT are read from the environment, and the result sub directory must already exist under YCSB_RESULT_HOME.
If you want to change writebuffersize and clientbuffering, add
    -p writebuffersize=<value> -p clientbuffering=<value>
to the YCSB command above.
The script for YCSB workload A/B/C/D/E/F:

    vim ycsb_workload_a.sh

    #!/bin/bash
    # Usage: <script> <workloadA-recordcount> <thread-number> <result sub dir>
    # stdout and stderr are both shown on the terminal and saved to the output file.
    ${YCSB_HOME}/bin/ycsb run hbase10 -P ${YCSB_HOME}/workloads/workloada \
      -p columnfamily=family -p operationcount=${OPERATION_COUNT} -p recordcount=$1 \
      -s -threads $2 \
      -p measurementtype=timeseries -p timeseries.granularity=2000 \
      2>&1 | tee ${YCSB_RESULT_HOME}/$3/workload_a.output.thread-$2.`date "+%y%m%d_%H%M%S"`

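Similarly, create the other scripts for workloads B/C/D/E/F, pointing -P at the matching workload file (workloadb through workloadf) and changing the output file name accordingly.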
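Rather than copying the script by hand five times, the variants could be generated from a template. The following is a minimal sketch and not part of the original procedure: OUT_DIR and the @W@ placeholder are hypothetical names introduced here, and the generated command line simply mirrors the workload A script with the workload letter swapped in.

```shell
#!/bin/bash
# Sketch: generate ycsb_workload_b.sh ... ycsb_workload_f.sh from one template.
# OUT_DIR is a hypothetical variable; it defaults to the current directory.
OUT_DIR="${OUT_DIR:-.}"

for w in b c d e f; do
  script="${OUT_DIR}/ycsb_workload_${w}.sh"
  # The quoted 'EOF' delimiter keeps $1, ${YCSB_HOME}, the backticks, etc.
  # unexpanded, so they are evaluated when the generated script runs.
  # sed replaces the @W@ placeholder with the current workload letter.
  sed "s/@W@/${w}/g" > "${script}" <<'EOF'
#!/bin/bash
# Usage: <script> <recordcount> <thread-number> <result sub dir>
${YCSB_HOME}/bin/ycsb run hbase10 -P ${YCSB_HOME}/workloads/workload@W@ \
  -p columnfamily=family -p operationcount=${OPERATION_COUNT} -p recordcount=$1 \
  -s -threads $2 \
  -p measurementtype=timeseries -p timeseries.granularity=2000 \
  2>&1 | tee ${YCSB_RESULT_HOME}/$3/workload_@W@.output.thread-$2.`date "+%y%m%d_%H%M%S"`
EOF
  chmod +x "${script}"
done
```

Each generated script takes the same three arguments as ycsb_workload_a.sh. If a workload needs extra -p properties, it is probably simpler to edit that one generated file than to complicate the template.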
Run the ycsb_workload_load.sh script first. After it has loaded the data into HBase, run ycsb_workload_a.sh or any of the other workload scripts.
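The load-then-run order can be captured in a small driver. This is only a sketch under stated assumptions: run_step, run_benchmark, and DRY_RUN are hypothetical names introduced here, the scripts are assumed to sit in the current directory, and YCSB_HOME, YCSB_RESULT_HOME, RECORD_COUNT, and OPERATION_COUNT are assumed to be exported beforehand. It also assumes the record count passed to each workload script should equal the loaded RECORD_COUNT.

```shell
#!/bin/bash
# Hypothetical driver sketch: load first, then run the chosen workloads.

# Print each command; execute it unless DRY_RUN=1.
run_step() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

# Usage: run_benchmark <thread-number> <result sub dir> <workload letter>...
run_benchmark() {
  local threads="$1" subdir="$2"
  shift 2
  # Fail early if a variable the scripts rely on is missing.
  if [ -z "${YCSB_HOME}" ] || [ -z "${YCSB_RESULT_HOME}" ] ||
     [ -z "${RECORD_COUNT}" ] || [ -z "${OPERATION_COUNT}" ]; then
    echo "error: export YCSB_HOME, YCSB_RESULT_HOME, RECORD_COUNT, OPERATION_COUNT" >&2
    return 1
  fi
  # The scripts redirect into this directory, so it has to exist.
  run_step mkdir -p "${YCSB_RESULT_HOME}/${subdir}" || return 1
  # Load the data before any workload runs.
  run_step ./ycsb_workload_load.sh "${threads}" "${subdir}" || return 1
  # The run scripts pipe into tee, so the status checked here is tee's,
  # not necessarily ycsb's.
  for w in "$@"; do
    run_step "./ycsb_workload_${w}.sh" "${RECORD_COUNT}" "${threads}" "${subdir}" || return 1
  done
}
```

For example, after sourcing the file, DRY_RUN=1 run_benchmark 8 run1 a b prints the commands that would be executed without starting YCSB; dropping DRY_RUN=1 runs them for real.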