Feature spotlights

Deep integration into the Hadoop ecosystem

Query Hive, HBase and Spark using a single database connection. Whether in the cloud, on premises or both, access data across Hadoop and relational databases.

Advanced cost-based optimizer and massive parallel processing (MPP)

Run smarter queries and support more concurrent users with less hardware than other SQL solutions for Hadoop. Run all 99 TPC-DS queries at scales up to 100 TB with numerous concurrent users. An ANSI-compliant SQL parser also lets you run queries on unstructured streaming data using new APIs.

Data Science-ready

Build, train, deploy and manage AI models, and prepare and analyze data for machine learning in a single, integrated environment. Integrate with Spark for easier data delivery and faster processing.

SQL compatible with other vendors, products and dialects

Integrate with Oracle, IBM Db2 and IBM Netezza®, and provide federated access to relational database management system (RDBMS) sources outside of Hadoop with IBM Fluid Query. Connect to HDFS, RDBMS, NoSQL databases, object stores and WebHDFS.

Hybrid Flex

IBM Hybrid Data Management Platform allows you to leverage all available data, no matter the type, source or structure. Simply purchase IBM FlexPoints and allocate them across multiple resources with a single, subscription-based license.

User-friendly, familiar SQL interface and tools

Built on standards-compliant ODBC and JDBC, the tools let an administrator easily start and stop services, set up users and views, and define alerts and notifications.

Enterprise Security

Robust role-based access control (RBAC), row-based dynamic filtering and column-based dynamic masking, combined with Ranger integration, provide centralized security administration and auditing for data lakes. Advanced row and column security empowers self-service data access.

Available on IBM Power

Build on IBM Power Systems to handle the most advanced data applications, from the mission-critical workloads you run today to the next generation of AI.

Use cases

  • New forms of unstructured and semi-structured data that need integration with traditional structured data.

    Federate new data formats: social media and sentiment, streaming media, sensor, log and more. Store massive amounts of data without the initial processing costs.

  • Massive amounts of historical or "cold" data taking up space and driving up costs in your Enterprise Data Warehouse (EDW) built on Netezza, Oracle Exadata or Teradata.

    Free up bandwidth and storage by moving to a Hadoop-based data lake. Big SQL is a superior platform for offloading Oracle data marts and warehouses to Hadoop.

  • Access to data in Hadoop is difficult for data users: data scientists, line-of-business analysts and developers.

    Equip your data users with the right tools, including Db2 Big SQL, so they can run ad hoc and real-time queries to meet the needs of the business.

  • Growing requirements for operational and process improvements.

    Build analytics that span different data types and data sets. Use virtualization and federation to unify data access across the logical data warehouse, cloud and Hadoop.

Next Steps