Workloads/Benchmarks information

This topic describes the workloads and benchmarks that are commonly used with BigInsights and where to obtain them.

TeraGen

TeraGen is shipped with the Hadoop release.

TeraSort

TeraSort is shipped with the Hadoop package. It is located at $BI_HOME/IHC/hadoop-example.jar.
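
As an illustration of how the two programs are typically chained, the following sketch builds the command lines for a TeraGen run followed by a TeraSort run. The `teragen` and `terasort` program names are the ones registered in the standard Hadoop examples JAR; the default value for `BI_HOME`, the row count, the HDFS paths, and the helper function names are assumptions made for this example only.

```shell
#!/usr/bin/env bash
# Sketch only: the BI_HOME default, row count, and HDFS paths are assumptions.
BI_HOME="${BI_HOME:-/opt/ibm/biginsights}"      # assumed installation directory
EXAMPLES_JAR="$BI_HOME/IHC/hadoop-example.jar"  # location given in this topic

ROWS=10000000                            # TeraGen rows are 100 bytes, so about 1 GB
INPUT_DIR=/benchmarks/terasort-input     # hypothetical HDFS path
OUTPUT_DIR=/benchmarks/terasort-output   # hypothetical HDFS path

# Print each command line so that it can be reviewed before it is run.
teragen_cmd()  { echo "hadoop jar $EXAMPLES_JAR teragen $1 $2"; }
terasort_cmd() { echo "hadoop jar $EXAMPLES_JAR terasort $1 $2"; }

teragen_cmd "$ROWS" "$INPUT_DIR"
terasort_cmd "$INPUT_DIR" "$OUTPUT_DIR"
```

The script only prints the commands. To run them on a cluster, paste the printed lines into a shell on a node where the `hadoop` command is on the PATH. Both HDFS directories must not exist before the run.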

TPC-H

TPC-H for big data is usually run with Hive. Some users run it with Pig instead, but Hive is the more widely used option.

Download the TPC-H data generator (DBGEN) from the TPC-H website.
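
After DBGEN is built, a typical next step is to generate the tables and copy them into HDFS. The following sketch prints the commands for that flow. The `-s` (scale factor) option and the eight `.tbl` output files are standard DBGEN behavior; the scale factor, the `/tpch` HDFS directory, and the helper function names are assumptions made for this example only.

```shell
#!/usr/bin/env bash
# Sketch only: the scale factor and HDFS base directory are assumptions.
SCALE=1            # TPC-H scale factor; 1 produces roughly 1 GB of raw data
HDFS_BASE=/tpch    # hypothetical HDFS directory, assumed to exist already

# The eight tables that DBGEN writes as <name>.tbl in the current directory.
TABLES="customer lineitem nation orders part partsupp region supplier"

# Print the DBGEN command line for a given scale factor.
dbgen_cmd() { echo "./dbgen -s $1"; }

# Print one mkdir and one put per table, so that each table gets its own
# HDFS directory. On Hadoop 2 and later, -mkdir may need the -p option.
put_cmds() {
  for t in $TABLES; do
    echo "hadoop fs -mkdir $1/$t"
    echo "hadoop fs -put $t.tbl $1/$t/"
  done
}

dbgen_cmd "$SCALE"
put_cmds "$HDFS_BASE"
```

One directory per table is used here because a Hive external table points at a directory rather than at a single file.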

TPC-H over Hive:

Running TPC-H queries on Hive

TPC-H-Hive

Imperative and Declarative Hadoop: TPC-H in Pig and Hive

Hive is shipped with BigInsights®. For more information, see the Apache Hive website.

YCSB

YCSB (Yahoo! Cloud Serving Benchmark) is widely used to benchmark NoSQL databases, such as HBase. You can download it from the YCSB project site.
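
A YCSB benchmark normally has a load phase followed by a run phase. The following sketch prints the two command lines for an HBase target. It assumes a YCSB release that provides the `bin/ycsb` launcher and an HBase binding named `hbase`; the binding name differs between YCSB releases, so check the release that you downloaded. The column family name and the helper function name are also assumptions made for this example.

```shell
#!/usr/bin/env bash
# Sketch only: the binding name and column family are assumptions.
BINDING=hbase                  # HBase binding name; varies by YCSB release
WORKLOAD=workloads/workloada   # core workload file shipped with YCSB
COLUMN_FAMILY=family           # must already exist in the HBase table

# Print the YCSB command line for a phase ("load" or "run").
ycsb_cmd() {
  echo "bin/ycsb $1 $BINDING -P $WORKLOAD -p columnfamily=$COLUMN_FAMILY"
}

ycsb_cmd load   # phase 1: insert the initial records
ycsb_cmd run    # phase 2: execute the workload operations
```

Run the printed commands from the YCSB installation directory. YCSB writes to a table named `usertable` by default, so that table must be created in HBase with the chosen column family before the load phase.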

HiBench

HiBench is a mixed-workload benchmark suite. It consists of nine workloads, such as TeraGen, TeraSort, DFSIO, and Hive, and is used to evaluate cluster performance under a variety of workloads.

For more information, see HiBench.

Hive

Hive is shipped with BigInsights (v2.1). You can download a benchmark for Hive from Running TPC-H queries on Hive.

For more information, see the Apache Hive website.