Feature spotlights

Deep integration into the Hadoop ecosystem

Query Hive, HBase and Spark data through a single database connection. Whether your data lives in the cloud, on premises or both, access it across Hadoop and relational databases.

Advanced cost-based optimizer and massively parallel processing (MPP)

Run smarter queries and support more concurrent users with less hardware than other SQL solutions for Hadoop. Run all 99 TPC-DS queries at scales up to 100 TB with numerous concurrent users. An ANSI-compliant SQL parser lets you run queries against unstructured streaming data using new APIs.

Data science-ready

Build, train, deploy and manage AI models, and prepare and analyze data for machine learning in a single, integrated environment. Integrate with Apache Spark for easier data delivery and faster processing.

SQL compatible with other vendors, products and dialects

Integrate with Oracle, the IBM Db2® product family (including the IBM Db2 AI database) and IBM Netezza®, and provide federated access to relational database management system (RDBMS) sources outside of Hadoop with IBM Fluid Query. Connect to HDFS, RDBMS, NoSQL databases, object stores and WebHDFS.
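
To make the idea of federated access concrete, here is a minimal sketch in Python. It is an analogy only, not the Db2 Big SQL or IBM Fluid Query interface: it uses the standard-library sqlite3 module, and the attached in-memory database named remote, the table names and the sample rows are all hypothetical stand-ins for a source that lives outside the primary store.

```python
import sqlite3


def federated_order_totals():
    """Join a local table with a table in a second, attached database.

    Assumption: the attached database 'remote' merely stands in for an
    external RDBMS source. This is an analogy for federation, not the
    Db2 Big SQL or Fluid Query API.
    """
    conn = sqlite3.connect(":memory:")
    # Attach a second, independent database to the same connection.
    conn.execute("ATTACH DATABASE ':memory:' AS remote")

    conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE remote.orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO main.customers VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    conn.executemany(
        "INSERT INTO remote.orders VALUES (?, ?)",
        [(1, 20.0), (1, 5.0), (2, 7.5)],
    )

    # A single SQL statement spans both databases.
    rows = conn.execute(
        "SELECT c.name, SUM(o.amount) "
        "FROM main.customers AS c "
        "JOIN remote.orders AS o ON o.customer_id = c.id "
        "GROUP BY c.name ORDER BY c.name"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, total in federated_order_totals():
        print(name, total)
```

The point of the sketch is that the application holds one connection and writes one query, while the engine resolves where each table actually lives.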

User-friendly, familiar SQL interface and tools

Connect through standards-compliant Open Database Connectivity (ODBC) and Java Database Connectivity (JDBC) drivers. Administrators can easily start and stop services, set up users and views, and define alerts and notifications.

Enterprise security

Robust role-based access control (RBAC), row-based dynamic filtering and column-based dynamic masking, together with Apache Ranger integration, provide centralized security administration and auditing for data lakes. Advanced row and column security empowers self-service data access.
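
As a rough illustration of what row-based filtering and column-based masking mean for the person running a query, consider the following Python sketch. It is conceptual only: in practice such policies are defined centrally in the database or in Apache Ranger, not in application code. The User class, the role names, the sample rows and the masking rule are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    role: str    # hypothetical roles: "admin" or "analyst"
    region: str


# Hypothetical sample data.
ROWS = [
    {"customer": "Ada", "region": "EU", "ssn": "123-45-6789"},
    {"customer": "Grace", "region": "US", "ssn": "987-65-4321"},
]


def mask_ssn(value):
    """Column mask: reveal only the last four characters."""
    return "XXX-XX-" + value[-4:]


def apply_policies(user, rows):
    """Return the rows a user may see, with sensitive columns masked.

    Assumed policy for this sketch: admins see everything, while other
    roles see only rows from their own region, with the ssn column masked.
    """
    visible = []
    for row in rows:
        # Row-based filter: skip rows outside the user's region.
        if user.role != "admin" and row["region"] != user.region:
            continue
        result = dict(row)  # copy so the source data is never modified
        # Column-based mask: hide most of the sensitive value.
        if user.role != "admin":
            result["ssn"] = mask_ssn(result["ssn"])
        visible.append(result)
    return visible


if __name__ == "__main__":
    analyst = User(name="Bo", role="analyst", region="EU")
    print(apply_policies(analyst, ROWS))
```

Both users run the same request against the same data. The role decides which rows come back and how much of each sensitive column is revealed, which is what makes self-service access workable.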

Available on IBM Power Systems

Build on IBM Power Systems to handle the most demanding data applications, from the mission-critical workloads you run today to the next generation of AI.

Use cases

  • Better data-driven decisions

    Problem

    New forms of unstructured and semi-structured data needing integration with traditional structured data.

    Solution

    Integrate new forms of semi-structured and unstructured data (social media, sentiment, streaming audio and video, logs and more) with your traditional structured data using advanced querying capabilities.

  • Data warehouse modernization to free up bandwidth and storage

    Problem

    Massive amounts of historical or "cold" data taking up space and driving up costs in your enterprise data warehouse (EDW) built on Netezza, Oracle Exadata or Teradata.

    Solution

    Modernize your data warehouse to free up bandwidth and storage. Cloudera Enterprise Data Hub and Db2 Big SQL provide a platform for offloading historical or "cold" data from Oracle data marts and warehouses to Hadoop, porting applications easily and accelerating time to value.

  • Real-time and ad hoc data queries

    Problem

    Access to data in Hadoop is difficult for data users: data scientists, line-of-business analysts and developers.

    Solution

    Equip your data users with the right tools, including Db2 Big SQL integrated with Apache Spark, so they can run ad hoc and real-time queries that meet the needs of the business.

  • Operational and process improvements

    Problem

    Growing requirements for operational and process improvements.

    Solution

    Use virtualization and federation to unify data access across the logical enterprise data warehouse, the cloud and Hadoop for more accurate data-driven decisions.
