Analyzing data

Big SQL is the IBM SQL interface for the Apache Hadoop environment. It lets you use familiar standard SQL syntax, along with SQL extensions, against data that is managed by Hadoop-based technologies. You can use Big SQL to query, analyze, and summarize that data.

Big SQL is not a replacement for relational database management systems (RDBMS). Rather, it is designed to complement and use a Hadoop-based infrastructure. You can use Big SQL to create tables with existing data by running the CREATE EXTERNAL HADOOP TABLE statement. Or you can create tables by running the CREATE HADOOP TABLE or the CREATE HBASE TABLE statement and then load data into those tables by running the LOAD HADOOP statement. You can also create a table and simultaneously load data that is returned by a query.
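The three approaches described above can be sketched as follows. This is a minimal illustration, not a reference: the table names, columns, and HDFS paths are hypothetical, and the exact clauses that are available can vary by Big SQL version, so check the statement reference for your release before you run it.

```sql
-- Hypothetical table names, columns, and paths; adjust for your environment.

-- 1. Define a table over delimited files that already exist in HDFS.
--    The data stays where it is; only the table definition is created.
CREATE EXTERNAL HADOOP TABLE sales_ext (
  sale_id INT,
  region  VARCHAR(20),
  amount  DECIMAL(10,2)
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/bigsql/sales_ext';

-- 2. Create an empty Hadoop table, then load a file into it.
CREATE HADOOP TABLE sales (
  sale_id INT,
  region  VARCHAR(20),
  amount  DECIMAL(10,2)
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

LOAD HADOOP USING FILE URL '/tmp/sales.csv'
  WITH SOURCE PROPERTIES ('field.delimiter' = ',')
  INTO TABLE sales APPEND;

-- 3. Create a table and populate it from a query in one statement.
CREATE HADOOP TABLE sales_by_region AS
  SELECT region, SUM(amount) AS total_amount
  FROM sales
  GROUP BY region;
```

The external table suits data that other Hadoop tools already manage, whereas the create-then-load and create-from-query forms suit data that Big SQL itself is expected to own.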

Big SQL supports several underlying storage mechanisms on the Hadoop Distributed File System (HDFS), including the formats that are described in "File formats supported by Big SQL" and in "JSON files". Big SQL also supports HDFS transparent encryption, which means that data is decrypted as it is read, but the files themselves remain encrypted.

Big SQL supports JDBC and ODBC client access from Linux and Windows platforms, which enables you to use your SQL skills, SQL-based business intelligence applications, and query or reporting tools to query data.

For installation information, see "Installing IBM Big SQL".