What's new

IBM® Db2® Big SQL 7.1.0.0 improves performance, usability, serviceability, and consumability.

Highlights

Support for IBM Spectrum Scale: Db2 Big SQL supports GPFS (IBM Spectrum® Scale). For more information about IBM Spectrum Scale requirements, see Requirements for installing Db2 Big SQL.
Db2 Big SQL integration with Apache Ozone Storage: Apache Ozone Storage stores encrypted and dispersed data across multiple geographic locations. Db2 Big SQL can access and run SQL analytics on data that is stored in Apache Ozone.
Upgrade from Db2 Big SQL 5.0.4 and 6.0.0: If you are currently running Db2 Big SQL 5.0.4 on Hortonworks Data Platform (HDP), you can now upgrade to Db2 Big SQL 7.1.0 on Cloudera Data Platform (CDP) Private Cloud Base 7.1.3, 7.1.4, and 7.1.6. If you are currently running Db2 Big SQL 6.0.0 on HDP, you can now upgrade to Db2 Big SQL 7.1.0 on CDP Private Cloud Base 7.1.6. For more information, see Upgrading Db2 Big SQL via CLI.
Note: When you upgrade to Db2 Big SQL 7.1.0, you also get the new features that were added to Db2 Big SQL 6.0.0 and 7.0.0, which are described below.
Improved full transactional table (ACID) support: Db2 Big SQL support for transactional tables is enhanced, and includes support for Db2 Big SQL load operations. For more information, see Transactional tables in Db2 Big SQL.
Note: Although supported, the use of transactional tables is discouraged in Db2 Big SQL. For a list of the restrictions that apply, see Restrictions on transactional tables.
ANALYZE command enhancements: Synopsis data that is related to statistics collection is no longer stored in a table’s directory. When this enhanced version of the ANALYZE command runs against a table for the first time, any existing synopsis data for that table is deleted. If you want to retain this synopsis data (for example, if older instances of Db2 Big SQL are accessing the table), set the bigsql.stats.v2.preserve property in the bigsql-conf.xml file to true. If you subsequently want to delete this synopsis data, you must do so manually. For more information, see ANALYZE command.
Row and column access control (RCAC): Ranger security support for Db2 Big SQL now includes masking and row-level filter policies in the Hadoop SQL and Db2 Big SQL plugins. For more information, see Creating masking policies and Creating row-level filter policies.
Db2 Big SQL integration with Apache Atlas: Apache Atlas is an open source data governance tool that is used for classifying, cataloging, and governing data assets, thereby enabling enterprises to effectively and efficiently meet their compliance requirements within Hadoop and allowing integration with the whole enterprise data ecosystem. For more information, see Db2 Big SQL integration with Apache Atlas.
JSON web tokens (JWT) support: Token authentication is a mechanism for generalizing tokens such that they can be used for authentication to the Db2 Big SQL server in a unified method. For more information, see Enabling JSON web tokens (JWT) support.
New SYSHADOOP table functions: Two new table functions are supported in this release. The GET_CACHE_FILE_INFO and GET_CACHE_TABLE_INFO table functions return information about tables that are cataloged in the Hive metastore, and about the files for those tables, that might be cached in the Db2 Big SQL scheduler cache. For more information, see GET_CACHE_FILE_INFO table function and GET_CACHE_TABLE_INFO table function.

Installation

Improved error handling when installing Db2 Big SQL workers: Four new configuration parameters are available to control how the installation proceeds when problems or timeouts occur while installing Db2 Big SQL workers. For example, you can specify the minimum percentage of workers that are allowed to fail or time out during the installation. For more information, see Db2 Big SQL configuration utility.

For information on known issues with this release, see the Release notes.

What's new in Db2 Big SQL 7.0.0

Highlights

Db2 Big SQL 7.0.0 is the first Db2 Big SQL release that runs on Cloudera Data Platform (CDP). Although there is no migration from previous Db2 Big SQL releases to this release, you have the opportunity to become familiar with CDP and some of the new Db2 Big SQL features.

Full transactional table (ACID) support

Db2 Big SQL 7.0.0 now supports full transactional tables, as defined in Hive 3.0. You can now create full transactional or insert-only transactional tables in Db2 Big SQL.

In version 7.0.0 (but not in 7.1.0), additional Hive configuration changes are needed to fully enable transactional tables. For details about those required changes, see Configuration considerations for transactional tables.

Important: For Db2 Big SQL 7.0.0 (and 7.1.0), it is recommended that you not use the automatic compaction feature for transactional tables, because of the potential issues that might occur. When it is known that compaction will not interfere with other statements, you can trigger a compaction operation manually by running an ALTER TABLE...COMPACT...AND WAIT statement.

For more information about the new transactional table support, see Transactional tables in Db2 Big SQL.
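
As an illustration of the manual compaction that is described above, the following sketch assumes that the statement follows the Hive form of ALTER TABLE, in which the COMPACT clause takes a quoted compaction type. The table name SALES_TXN is hypothetical, and the 'MAJOR' compaction type is an assumption that is not confirmed here. Check Transactional tables in Db2 Big SQL for the supported syntax before you use it.

```sql
-- Hypothetical transactional table SALES_TXN.
-- Assumes the Hive-style COMPACT clause with a quoted compaction type.
-- AND WAIT makes the statement return only after the compaction finishes.
ALTER TABLE SALES_TXN COMPACT 'MAJOR' AND WAIT;
```

Run a statement like this only when you know that no other statements are working against the table.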

Querying real time data from Apache Kafka

Db2 Big SQL now supports queries against real time data from Apache Kafka streams. You can use the wide range of SQL capabilities and functions that are available in Db2 Big SQL to access and process this real time data. With additional support to read Kafka record metadata, you can create queries and views to answer questions about data changes in the stream over time. For more information, see Querying real time data from Apache Kafka streams.

BINARY and VARBINARY support in the ANALYZE command

You can now analyze a table that contains BINARY or VARBINARY columns, and statistics are returned for those columns. For more information, see ANALYZE command.

Ranger integration

When Db2 Big SQL integration with Ranger is enabled, Db2 Big SQL performs table authorization checks against the following plugins:

The Db2 Big SQL plugin for native Db2 tables, views, and nicknames
The Hadoop SQL (Hive) plugin for Hadoop tables
The HBase plugin for HBase tables

The Db2 Big SQL plugin now supports column-level policies and policies against the {OWNER} variable. For more information about Db2 Big SQL integration with Ranger, see Ranger.

Reading files in HDFS subdirectories

Db2 Big SQL now supports the reading of data files in subdirectories by default. For more information, see Reading files in HDFS subdirectories.

Installation

Installation, configuration, and administration are managed through utilities: In Db2 Big SQL on Hortonworks Data Platform (HDP), installation, configuration, and administration are done in Ambari. In Db2 Big SQL on CDP, utilities are used to do these tasks. For more information, see Installing Db2 Big SQL, Db2 Big SQL configuration utility, and Db2 Big SQL cluster administration utility.

Enterprise and performance

Sample scripts for workload management (WLM)

Db2 Big SQL has workload management and the user service class SYSDEFAULTUSERCLASS enabled by default. SYSDEFAULTMANAGEDSUBCLASS is the subclass where heavyweight (resource-intensive) queries are executed, and SYSDEFAULTSUBCLASS is the subclass where lightweight queries are executed. This default configuration works well for many scenarios, but depending on the types of workloads, additional WLM capabilities might be required. In Db2 Big SQL 7.0.0, some sample WLM scripts are available, and you can use these scripts as templates for customized concurrency control on a Db2 Big SQL cluster. You can find these sample scripts in the $BIGSQL_HOME/wlm_concurrency_template directory. For more information, see Workload management.
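
Before you customize concurrency control, it can help to confirm how the default service classes are defined on your system. The following query is a minimal sketch that assumes that the Db2 catalog view SYSCAT.SERVICECLASSES is available to your user ID. The column selection is an illustration only and is not taken from the sample WLM scripts.

```sql
-- List the subclasses that are defined under the default user service class.
-- Assumes read access to the SYSCAT.SERVICECLASSES catalog view.
SELECT SERVICECLASSNAME, PARENTSERVICECLASSNAME, ENABLED
  FROM SYSCAT.SERVICECLASSES
  WHERE PARENTSERVICECLASSNAME = 'SYSDEFAULTUSERCLASS';
```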

Other performance improvements

The performance of reading data from and writing data to tables that use object storage is significantly improved. Db2 Big SQL 7.0.0 introduces new multithreaded and parallel capabilities, as well as improvements around caching.
Reading data from tables that use the PARQUET file format is also improved. The addition of lazy read semantics can result in as much as a 35% improvement in scan performance for very selective predicates.

Usability and serviceability

Improved serviceability and stability

Several key enhancements have been made to Db2 Big SQL in this area.

Enhanced Java™ DFSIO fenced mode process (FMP) handling: The stability of Db2 Big SQL is improved by the allocation of multiple FMPs to handle the Java DFSIO (readers and writers). This leads to better system management when issues such as insufficient memory or unstable FMP handling occur. The proactive restarting of Java DFSIO FMPs is also used to prevent certain problems.
Improved error messages: Improvements were made to several error messages to facilitate problem determination. For example, a reason string token was added to SQL5105N. In addition, some errors that previously returned SQL5105N will now return SQL5104N, with greater clarity around causation.
Improved diagnostic information: Additional diagnostic information is now logged for errors such as SQL5199N to facilitate problem determination.
Improved problem determination for memory issues: The db2pd -dbptnmem command is enhanced with a sub-option detail that shows a breakdown of the DFSRW_PRIVATE memory consumers.

For information on known issues with this release, see the Release notes.

What's new in Db2 Big SQL 6.0.0

Highlights

Db2 Big SQL 6.0.0 is the first Db2 Big SQL release that runs on Hortonworks Data Platform (HDP) 3.1.0.

New Db2 Big SQL management console: This release introduces the Db2 Big SQL console, which you can use for database monitoring, administration, and configuration. The console supersedes the Data Server Manager (DSM), which is not available in this release. For more information, see Db2 Big SQL console. Unlike DSM, the Db2 Big SQL console does not integrate with Knox.

Integration with Apache Atlas: Db2 Big SQL is now integrated with Apache Atlas, which provides a scalable and extensible set of governance services. Integration with Atlas enables the definition of tag-based security policies in Ranger. For more information, see Db2 Big SQL integration with Apache Atlas.

Installation and upgrade

Faster deployment: Db2 Big SQL head and worker node installation is now done in parallel, reducing deployment time.
HA in effect during upgrade: When you upgrade from Db2 Big SQL 5.0.4 or later, you no longer need to disable high availability (HA).
If you are upgrading from versions older than Db2 Big SQL 5.0.4, you must still disable HA prior to the upgrade.

Enterprise and performance

Enhanced data source support for federation: This release adds a number of new data sources and platforms for federation that are bundled in Db2 Big SQL 6.0.0. For details, see Data source matrix for the federation support that is bundled in IBM Db2 Big SQL.
Improved performance for star schema queries: An enhancement enables more filtering of data while tables in a star schema are joined.
Nickname support in the Ranger plugin: The Db2 Big SQL plugin for Ranger now supports access policies on nicknames. When you migrate to 5.0.4 or later, if the Ranger plugin is enabled and nicknames are defined in your Db2 Big SQL database, you must create access policies for those nicknames in the Ranger plugin.

SQL compatibility improvements

Improved Netezza® compatibility

Db2 Big SQL now has enhanced NZPLSQL support, including support for %ROWTYPE, %TYPE, FOUND, ROW_COUNT, REFTABLE, AUTOCOMMIT, and the SELECT INTO record type. There are also many SQL statement enhancements, additional data type support, and improvements to scalar functions that significantly increase Netezza compatibility.

Note: Netezza-style external tables and the CREATE EXTERNAL TABLE statement are not supported in Db2 Big SQL.

For more information, see Compatibility features for Netezza Platform Software (NPS®).

Improved compatibility with Db2 for z/OS®

For improved compatibility with Db2 for z/OS, you can now specify the fetch-clause in a select-statement. For example, the following statement, which failed with SQLSTATE 42601 in previous releases, is now supported:

SELECT * FROM T ORDER BY 1 OPTIMIZE FOR 2 ROWS FETCH FIRST 2 ROWS ONLY

For more information, see fetch-clause.

You can also specify the offset-clause without the fetch-clause, as shown in the following example:

SELECT * FROM T ORDER BY 1 OFFSET 20 ROWS

For more information, see offset-clause.
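
The two clauses can also be combined to page through a result set. The following statement is an illustration only. It reuses table T from the previous examples, and the page size of 10 rows is arbitrary.

```sql
-- Skip the first 20 rows of the ordered result, then return the next 10 rows.
SELECT * FROM T ORDER BY 1 OFFSET 20 ROWS FETCH FIRST 10 ROWS ONLY
```

An ORDER BY clause is what makes the pages repeatable; without it, the rows that are skipped and returned can vary from run to run.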

Enhanced JSON support

Db2 Big SQL now supports a new set of built-in JSON SQL functions for enhanced SQL interaction with JSON data. With these functions, you can store, retrieve, and query JSON and BSON data directly by using SQL. You can also create JSON documents by using SQL. These new functions follow the grammar and semantics that are outlined in the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC) SQL Technical Report TR 19075-6:2017. Part 6 of the report outlines SQL support for storing, querying, and publishing JSON data. The built-in Db2 JSON SQL functions reside in the SYSIBM schema, and no special privileges are required to invoke them. For more information, see JSON scalar functions, JSON_TABLE table function, and JSON_EXISTS predicate.
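
As a brief sketch of how these functions can be used, the following statements extract a value from a JSON document and build a JSON document from SQL values. The JSON content and the column aliases are made up for illustration. The statements select from SYSIBM.SYSDUMMY1 only so that they can run without a user table.

```sql
-- Extract a scalar value from a JSON document and return it as an INTEGER.
SELECT JSON_VALUE('{"id": 7, "name": "Ada"}', '$.id' RETURNING INTEGER) AS ID
  FROM SYSIBM.SYSDUMMY1;

-- Build a JSON document from SQL values.
SELECT JSON_OBJECT(KEY 'id' VALUE 7, KEY 'name' VALUE 'Ada') AS DOC
  FROM SYSIBM.SYSDUMMY1;
```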

New DISTINCT predicate

The DISTINCT predicate compares two expressions and evaluates to TRUE if the values of the two expressions are not identical. For more information, see DISTINCT predicate.
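
The predicate is written as IS DISTINCT FROM, and its negated form is IS NOT DISTINCT FROM. Unlike the <> and = operators, it treats null values as comparable, so the result is always TRUE or FALSE and never unknown. The following sketch assumes a hypothetical table T with nullable columns C1 and C2.

```sql
-- Hypothetical table T with nullable columns C1 and C2.
-- Returns rows where C1 and C2 differ, including rows where exactly one is null.
SELECT C1, C2 FROM T WHERE C1 IS DISTINCT FROM C2;

-- Null-safe equality: also returns rows where both C1 and C2 are null.
SELECT C1, C2 FROM T WHERE C1 IS NOT DISTINCT FROM C2;
```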

Usability and serviceability

4-KB sector support: Support for storage devices that use 4-KB sector sizes is available as a technical preview only (not for production environments until further notice). For more information, see DB2_4K_DEVICE_SUPPORT in Performance variables.

For information on known issues with this release, see the Release notes.