Spark SQL vs. Big SQL Performance

Technical Blog Post


Abstract

Spark SQL vs. Big SQL Performance

Body

Last month, we provided an update on Big SQL vs. Hive performance tests running the Hadoop-DS benchmark. Hive is based on MapReduce and Java, while Big SQL uses a native C/C++ MPP engine, so it's not surprising that Big SQL was 20X faster on average.

This month, we’ve performed a similar test against Spark SQL. It is commonly said that Spark is 10X to 100X faster than MapReduce. How will Big SQL compare?

Both Spark SQL and Big SQL leverage the Hive metastore and storage model. The major difference between the two (from a SQL perspective) is the optimizer and execution engine.

Preserving the Hive metastore and its storage model is critically important for keeping your data open. Hive is the de facto standard for SQL on Hadoop, as it is included in every commercial Hadoop distribution. Big SQL preserves your data in Hive format, so if you fall out of love with Big SQL, you can always uninstall it and use something else. Your data remains compatible with Hive and with other SQL engines that support the same standard, such as Spark SQL.

For this test, we compared Spark 1.5.0 and Big SQL V4.1, both running on the IBM Open Platform. (We also tested Spark 1.5.1, but it ran 7% slower than 1.5.0, so we’re sharing Spark’s best result.) Two 20-node clusters were set up with the same specifications on SoftLayer, using bare metal servers configured according to IBM’s reference architecture.
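To make the arithmetic behind figures like "7% slower" and "20X faster" concrete, here is a minimal sketch in Python. The function names and the timings are hypothetical, chosen only to illustrate the calculation; they are not measurements from this test.

```python
def percent_slower(baseline_seconds, candidate_seconds):
    """Return how much slower the candidate run is than the baseline, in percent."""
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return (candidate_seconds - baseline_seconds) / baseline_seconds * 100.0


def speedup(slow_seconds, fast_seconds):
    """Return the speedup factor (a value of 20.0 reads as '20X faster')."""
    if fast_seconds <= 0:
        raise ValueError("fast_seconds must be positive")
    return slow_seconds / fast_seconds


# Hypothetical elapsed times in seconds (assumptions, not benchmark results).
run_a_seconds = 1000.0  # stands in for the faster release
run_b_seconds = 1070.0  # stands in for the slower release

print(round(percent_slower(run_a_seconds, run_b_seconds), 1))
print(speedup(2000.0, 100.0))
```

Both helpers take elapsed times, so a lower number is better; reporting the faster of two releases, as done here for Spark, simply means using the smaller elapsed time as that engine's result.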



