Last month, IBM and Hortonworks announced an expansion of their partnership (press release) focused on extending IBM’s data science, machine learning and complex SQL for big data to more developers and across the Apache Hadoop ecosystem. One of the initiatives was “Hortonworks and IBM will create new solution that integrates HDP with IBM Big SQL, IBM’s SQL engine for Hadoop, giving Hortonworks’ legions of clients and users a familiar method of managing their data.”
Today we are excited to announce Big SQL 5.0, delivering on what we announced last month. Big SQL 5.0 will tightly integrate with Hortonworks Data Platform (HDP), extend the capabilities of Hive, and exploit HBase and Spark to provide best-in-class analytics capabilities for big data.
Big SQL offers the following capabilities for big data and modern data warehouse needs:
- Federation and integration
With its federation capabilities using Fluid Query, Big SQL can virtualize data from many different data stores, such as Hive, HBase, Spark, DB2, Oracle, SQL Server, Netezza, Informix, Teradata, WebHDFS, and object stores. These federation capabilities can be managed easily from an intuitive UI.
Big SQL integrates bi-directionally with Apache Spark 2.1 and features efficient synthesis between Spark executors and Big SQL worker nodes. Big SQL uniquely exploits Apache Hive, Apache HBase, and Spark concurrently for best-in-class analytics capabilities.
- SQL compatibility
Big SQL understands SQL dialects from other offerings, such as the IBM DB2 database, IBM Netezza data warehouse appliances, and the Oracle database, which makes it well suited for RDBMS offload and for fast, easy consolidation. For example, it comes with built-in support for Oracle’s SQL and PL/SQL dialects, which enables many applications that were written against Oracle to run in Big SQL virtually unchanged. Being able to develop Big SQL applications immediately is a considerable advantage for customers and developers who have invested in Oracle SQL application and PL/SQL development skills.
- Elastic scalability and unmatched performance
Big SQL now offers YARN integration through Slider for efficient allocation of resources. In addition, Big SQL has added a new technology called “Elastic Boost” that can significantly improve its performance (by up to 50%). It enables allocation of multiple workers per node for more efficient CPU and memory utilization.
Big SQL comes with an ANSI-compliant SQL parser that can run all 99 TPC-DS queries without query modifications, and it supports structured streaming with new APIs. It can scale up to 100 TB with multiple concurrent users on fewer nodes than many other SQL engines, a capability that can’t be easily rivaled.
- Enterprise security
Big SQL comes with built-in row- and column-level security. It is very granular, rule-based, and supports Hive tables, native (DB2) tables, and HBase tables uniformly. It also provides Role-Based Access Control (RBAC) for easier management of access and privileges. In addition, Big SQL supports policy-based authorization through Ranger with its Ranger plugin. Big SQL can also help enhance Hive security by providing selective data masking capabilities without the need for lookup tables.
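To make the idea of rule-based row and column security concrete, here is a toy model in plain Python. It is only a conceptual sketch, not Big SQL syntax and not how Big SQL implements these features: the `User` class, the role names, the `region` and `ssn` columns, and every function name are hypothetical and chosen purely for illustration. A row rule decides which rows a user may see, and a column mask decides how a sensitive value is presented, with no lookup table involved.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

Row = Dict[str, str]


@dataclass(frozen=True)
class User:
    # Hypothetical user model: a name, a set of roles, and a home region.
    name: str
    roles: FrozenSet[str]
    region: str


def can_see_row(user: User, row: Row) -> bool:
    # Row-level rule: managers see every row, others only their own region.
    return "MANAGER" in user.roles or row["region"] == user.region


def mask_ssn(user: User, value: str) -> str:
    # Column-level mask: only the HR role sees the full value;
    # everyone else sees the last four characters.
    if "HR" in user.roles:
        return value
    return "XXX-XX-" + value[-4:]


def secure_select(user: User, rows: List[Row]) -> List[Row]:
    # Apply the row rule first, then mask the sensitive column.
    return [
        {**row, "ssn": mask_ssn(user, row["ssn"])}
        for row in rows
        if can_see_row(user, row)
    ]


ROWS: List[Row] = [
    {"name": "Ana", "region": "EMEA", "ssn": "123-45-6789"},
    {"name": "Bo", "region": "APAC", "ssn": "987-65-4321"},
]

analyst = User("carol", frozenset({"ANALYST"}), "EMEA")
hr_manager = User("dave", frozenset({"HR", "MANAGER"}), "APAC")

print(secure_select(analyst, ROWS))
print(secure_select(hr_manager, ROWS))
```

In this sketch the analyst gets back only the EMEA row with a masked `ssn`, while the user holding both the HR and MANAGER roles gets both rows unmasked. The rules live next to the data rather than in each application, which is the general point of enforcing them in the engine.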
All these features work seamlessly with Hortonworks Data Platform. We are very excited to join forces with Hortonworks to continue driving innovation and leadership in the open source community and to give our clients the opportunity to maximize the value of their data. For more information about the release, please visit the IBM Hadoop page (ibm.com/hadoop).