Workloads/Benchmarks information
This topic lists the workloads and benchmarks and describes where to obtain them.
TeraGen
TeraGen is shipped with the Hadoop release.
TeraSort
TeraSort is shipped with the Hadoop package. It is located at $BI_HOME/IHC/hadoop-example.jar.
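As a rough illustration of how the two tools are typically used together, the following Python sketch builds the command lines for a TeraGen run followed by a TeraSort run. It assumes that the examples JAR at the location given above exposes the standard `teragen` and `terasort` program names and that each TeraGen row is 100 bytes. The HDFS directories and the helper function names are hypothetical.

```python
import os

ROW_BYTES = 100  # TeraGen writes fixed-size 100-byte rows


def rows_for_size(total_bytes):
    """Return the number of TeraGen rows needed for roughly total_bytes of data."""
    return total_bytes // ROW_BYTES


def example_jar(bi_home=None):
    """Build the examples JAR path from the location given in this topic."""
    # Falls back to the literal "$BI_HOME" so the shell can expand it later.
    bi_home = bi_home or os.environ.get("BI_HOME", "$BI_HOME")
    return f"{bi_home}/IHC/hadoop-example.jar"


def teragen_cmd(total_bytes, out_dir, bi_home=None):
    """Command that generates the input data (assumed program name: teragen)."""
    rows = str(rows_for_size(total_bytes))
    return ["hadoop", "jar", example_jar(bi_home), "teragen", rows, out_dir]


def terasort_cmd(in_dir, out_dir, bi_home=None):
    """Command that sorts the generated data (assumed program name: terasort)."""
    return ["hadoop", "jar", example_jar(bi_home), "terasort", in_dir, out_dir]


if __name__ == "__main__":
    # Hypothetical HDFS paths; about 1 GB of generated input.
    print(" ".join(teragen_cmd(10**9, "/tmp/tera-in")))
    print(" ".join(terasort_cmd("/tmp/tera-in", "/tmp/tera-out")))
```

The sketch only prints the commands so that they can be reviewed before they are run on a cluster.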
TPC-H
TPC-H for big data is usually run with Hive. Some users run it with Pig instead, but Hive is more widely used.
Download the TPC-H data generator (DBGEN) from the TPC-H website.
For TPC-H over Hive, see Imperative and Declarative Hadoop: TPC-H in Pig and Hive.
Hive is shipped with BigInsights®. For more information, see Apache Hive™.
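To give a sense of how much data DBGEN produces before it is loaded into Hive, the sketch below estimates per-table row counts for a given scale factor. The cardinalities are the nominal values from the TPC-H specification (the `lineitem` count is approximate). The function name and the example scale factor are illustrative only.

```python
# Nominal rows per table at scale factor 1, per the TPC-H specification.
BASE_ROWS = {
    "supplier": 10_000,
    "part": 200_000,
    "partsupp": 800_000,
    "customer": 150_000,
    "orders": 1_500_000,
    "lineitem": 6_000_000,  # approximate; the exact count varies slightly
}

# nation and region are fixed-size tables and do not scale.
FIXED_ROWS = {"nation": 25, "region": 5}


def estimate_rows(scale_factor):
    """Return a dict of table name -> estimated row count for scale_factor."""
    rows = {table: count * scale_factor for table, count in BASE_ROWS.items()}
    rows.update(FIXED_ROWS)
    return rows


if __name__ == "__main__":
    # Example: scale factor 10 (an arbitrary value chosen for illustration).
    for table, count in sorted(estimate_rows(10).items()):
        print(f"{table:10s} {count:>12,d}")
```

A scale factor of 1 corresponds to roughly 1 GB of raw data, so the scale factor can be chosen to match the target data volume for the cluster.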
YCSB
YCSB (Yahoo! Cloud Serving Benchmark) is widely used to benchmark NoSQL databases, such as HBase. You can download it from the YCSB project site.
HiBench
HiBench is a mixed-workload benchmark suite. It consists of nine different workloads, such as TeraGen, TeraSort, DFSIO, and Hive, and is used to evaluate cluster performance across them.
Hive
Hive is shipped with BigInsights (v2.1). You can download the benchmark for Hive from Running TPC-H queries on Hive.
For more information, see Apache Hive™.