DB2 10: What's new in DB2 10 for Linux, UNIX, and Windows
Increased performance and reliability, lower operational costs, and easier application development
Each new DB2 version builds upon the "pillars of DB2", including low operational costs, ease of development, and reliability - and DB2 10 for Linux, UNIX, and Windows is no exception. With this latest version, you can create time-aware tables and queries in less than an hour, protect data privacy with less duplication, designate optimal storage allocation for data based on its currency, compress data dynamically and incrementally, and lower the costs and risks of creating and maintaining point-in-time queries.
DB2 10 for Linux, UNIX, and Windows will help you to save time and money, deliver high performance, capture business insight, simplify data security, and increase team productivity for a new generation of applications and operational analytics.
Table 1 summarizes the main features and enhancements in DB2 10, classified according to version objectives and DB2 pillars:
Table 1. Main features and enhancements in DB2 10, by version objective and DB2 pillar.
|DB2 10 objectives|Low operational costs|Ease of development|Reliability|Other enhancements|
|---|---|---|---|---|
|Save time and money|Adaptive compression, Multi-temperature data management|SQL compatibility enhancements| | |
|Deliver top performance|Performance improvements| |DB2 pureScale enhancements|Continuous data ingest|
|Increase team productivity| |Time Travel Query| | |
|Other enhancements| | | |Data security enhancements|
Adaptive compression
Adaptive compression is a significant enhancement to DB2 compression technologies. It is an advanced row compression technique that uses two types of compression dictionary (a table-level dictionary and a new page-level dictionary) to improve compression ratios, especially as data changes. A better compression ratio translates into performance improvements for I/O-bound systems and database backup operations, as well as increased storage savings; with adaptive compression, an additional benefit is that you don't need to perform explicit table reorganizations to maintain these high compression ratios. Page-level compression dictionaries are created automatically, and if the content of a page changes significantly, they are recreated automatically. When a page becomes full, page-level compression is applied, and more storage on that page is freed up immediately.
Adaptive compression is the default behavior for new tables that are enabled for row compression. You can enable an existing table for adaptive compression by using the ALTER TABLE statement with either the COMPRESS YES clause or the new COMPRESS YES ADAPTIVE clause. After an existing table has been enabled for adaptive compression, all subsequently updated or newly-added data is subject to compression. To apply adaptive compression to the data that already exists in a table, simply perform a table reorganization.
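For example, the steps above might look like this for a hypothetical SALES table:

```sql
-- Enable adaptive compression on an existing table (SALES is a
-- hypothetical table name); rows that are added or updated from this
-- point on are subject to compression
ALTER TABLE sales COMPRESS YES ADAPTIVE;

-- To compress the rows that already exist in the table, perform a
-- table reorganization
REORG TABLE sales;
```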
Adaptive compression is available as part of the Storage Optimization Feature (SOF) for Enterprise Server Edition, and SOF is included in Advanced Enterprise Server Edition.
Multi-temperature data management
The underlying concept upon which multi-temperature data management is built is known as a storage group. Storage groups map to storage classes in a data warehouse, and you can assign table spaces to storage groups based on the "temperature" of the data that they contain. The major advantage of this feature is that you can physically separate recent and more frequently accessed (hot) data from older and less frequently accessed (warm or cold) data by placing "hot data" on your fastest storage and cold data on your slowest storage.
By segregating your data in this way, you reduce the need for faster and more expensive storage, especially because hot data tends to represent only a small fraction of the total amount of data in a typical data warehouse. As your hot data cools down and is accessed less frequently, you can easily move it to slower storage by moving it to another storage group. The data movement is online, happens in the background, and can be suspended (and later resumed) so that higher priority workloads can be processed without undue delay.
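The workflow above can be sketched as follows; the storage group names, table space name, and storage paths are all hypothetical:

```sql
-- Create storage groups for fast and slow storage, each tagged with a
-- data tag (names and paths are hypothetical)
CREATE STOGROUP hot_sg ON '/ssd/fs1' DATA TAG 1;
CREATE STOGROUP cold_sg ON '/hdd/fs1' DATA TAG 9;

-- Place the table space that holds current, frequently accessed data
-- on the fast storage
CREATE TABLESPACE tbsp_current USING STOGROUP hot_sg;

-- Later, when that data has cooled down, move the table space to the
-- slower storage; the move runs online, in the background
ALTER TABLESPACE tbsp_current USING STOGROUP cold_sg;
```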
This feature fully integrates with DB2 workload management, enabling you to prioritize queries based on the data that they access. You can assign a data tag attribute (a value from 0 to 9) to a storage group or table space, and the data tag is subsequently used to determine how a specific workload is to be handled. You can even prioritize an activity based on the data that the activity accesses, either predictively (before the activity starts) or reactively (while the activity is executing).
Predictive prioritization uses a list that contains the data tags for all table spaces that the compiler believes will be accessed during execution of an activity. You can define work class sets to identify activities that have a particular data tag in their estimated data tag lists. You can then define a work action to map any activities that match a work class set to a specific service class before they begin to execute.
Reactive prioritization (using the new DATATAGINSC threshold) maps an activity to a specific service class at run time, based on the data tag with which the accessed data is associated. Take advantage of reactive prioritization if the compiler cannot accurately determine in advance what table spaces will be accessed.
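A minimal sketch of reactive prioritization follows; the service class, subclass, and threshold names are hypothetical, and the exact clause set depends on your workload management configuration:

```sql
-- If an activity running in subclass WORK_SC touches a table space
-- tagged 8 or 9 (cold data), remap it to the lower-priority subclass
-- LOW_SC (all service class and threshold names are hypothetical)
CREATE THRESHOLD remap_cold_readers
  FOR SERVICE CLASS work_sc UNDER main_sc
  ACTIVITIES ENFORCEMENT DATABASE PARTITION
  WHEN DATATAGINSC IN (8, 9)
  REMAP ACTIVITY TO low_sc;
```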
Multi-temperature data management is available in Enterprise Server Edition and Advanced Enterprise Server Edition.
Performance improvements
DB2 10 delivers a number of enhancements whose immediate effect is faster out-of-the-box query performance with less work by your DBAs, and more consistent, repeatable query performance across normal system and database changes.
The following enhancements focus on improving certain classes of warehouse queries.
- Queries with hash joins. In many cases, a hash join is the most efficient join strategy for queries that include a join, and in DB2 10, a hash join can be used in more situations. The query optimizer now automatically considers a hash join when the join predicates include expressions, or when the two columns in the join are not of the same data type.
- Queries over star schemas. Star schema performance improvements include an improved star schema detection algorithm and a new join method. The improved algorithm enables the query optimizer to detect queries that are based on a star schema design and to use appropriate strategies to improve performance for those queries. You can also use the new zigzag join method to expedite the processing of queries that are based on a star schema design.
The following enhancements focus on improving statistical views, which the query optimizer uses to generate better access plans.
- Predicates containing complex expressions. The query optimizer can now use statistics from expression columns (columns with one or more functions) in statistical views to generate better access plans.
- Fewer statistical views are now required to obtain good statistics for a star join query if referential integrity constraints are defined.
- Column group statistics on statistical views. Combining column group statistics with statistical views can improve access plans in certain situations.
- The DB2 automatic statistics collection function can now automatically gather statistics for statistical views.
In an optimization profile, you can now set certain registry variables that affect the SQL compiler, at either the global level or the statement level. The optimization profile also supports inexact matching, which ignores literals, host variables, and parameter markers when statements are being matched.
You can specify the new INDEXSAMPLE option on the RUNSTATS command to more efficiently collect index statistics by using a sampling method instead of scanning an entire index. The default method for collecting detailed index statistics has also changed to use sampling. Of course, you still have the option to collect detailed index statistics by scanning the entire index, as in previous releases (just specify the UNSAMPLED clause).
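A sampled index statistics collection might look like this (the schema and table names are hypothetical):

```sql
-- Collect table statistics plus index statistics by sampling roughly
-- 10 percent of each index's pages rather than scanning entire indexes
RUNSTATS ON TABLE myschema.sales
  AND INDEXES ALL
  INDEXSAMPLE SYSTEM(10);
```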
SQL compatibility enhancements
DB2 10 contains enhancements that make it easier to work with the DB2 product if you are more familiar with other relational database products. The following enhancements, which build upon a list of SQL compatibility enhancements that were released with previous versions of DB2 for Linux, UNIX, and Windows, reduce the time and complexity of enabling some applications to run in a DB2 environment.
- More flexible and function-rich CREATE TRIGGER support. The trigger-event clause can now contain more than one INSERT, UPDATE, or DELETE operation, which means that a trigger can be activated by any combination of specified events. Although you can specify one, two, or all three trigger events in a single CREATE TRIGGER statement, a trigger event cannot be specified more than once. You can also use the new trigger event predicates (INSERTING, UPDATING, and DELETING) in a triggered action to test the event that activated a trigger. A trigger event predicate is only valid in the triggered action of a compiled trigger definition. And finally, the FOR EACH STATEMENT option is now supported in the CREATE TRIGGER statement for PL/SQL triggers. You can create triggers that fire only once per statement.
- Declared user-defined data types and procedures. You can now declare user-defined data types and procedures that are local to a compound SQL (compiled) statement. Information about these types and procedures is not stored in the system catalog.
- New scalar functions. The INSTRB function returns the starting position, in bytes, of a string within another string. The TO_SINGLE_BYTE function returns a string in which multi-byte characters are converted to the equivalent single-byte character, if possible.
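The multi-event trigger support described above can be sketched as follows; the EMPLOYEE and AUDIT_LOG tables and their columns are hypothetical, and the event predicates require a compiled trigger body:

```sql
-- One trigger that is activated by both INSERT and UPDATE; the
-- INSERTING and UPDATING event predicates distinguish the two cases
CREATE TRIGGER emp_audit
  AFTER INSERT OR UPDATE ON employee
  REFERENCING NEW AS n
  FOR EACH ROW
BEGIN
  IF INSERTING THEN
    INSERT INTO audit_log VALUES (n.empno, 'INSERT', CURRENT TIMESTAMP);
  END IF;
  IF UPDATING THEN
    INSERT INTO audit_log VALUES (n.empno, 'UPDATE', CURRENT TIMESTAMP);
  END IF;
END
```

When running a statement like this from the command line processor, use a non-default statement terminator so that the semicolons inside the trigger body are not interpreted as the end of the statement.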
Time Travel Query
The Time Travel Query function enables you to query data as it was at any point in the past, or as it will be at some point in the future. This is accomplished by using temporal tables to store a history of your data changes (deleted rows or the original values of rows that have been updated) so that you can query the past state of your data. You can associate date and time ranges to specific rows of data, thereby defining business validity periods for the data in your table. Time Travel Query helps you make your existing DB2 tables time aware and compliant with audit requirements. It also helps you to avoid the cost of developing, implementing, and maintaining complex time-aware applications.
You can easily enable the Time Travel Query function for existing tables by using the ALTER TABLE statement. DB2 supports the following three types of temporal tables.
- System-period temporal tables manage data according to system time. DB2 keeps a running history of rows that have been updated or deleted. Each row and its corresponding history are assigned a pair of system timestamps. When a row has been deleted, it is automatically moved to a history table that is transparent to users and applications.
- Application-period temporal tables manage business time. Applications provide dates or timestamps to describe when the data in a given row was or will be valid in a business context.
- Bitemporal tables manage both system time and business time, enabling applications to manage the effective dates of their business data while DB2 keeps a full history of data changes.
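A minimal system-period temporal table setup might look like this (the table and column names are hypothetical):

```sql
-- Create a table whose system period is maintained by DB2
CREATE TABLE policy (
  id        INT NOT NULL PRIMARY KEY,
  coverage  INT,
  sys_start TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW BEGIN,
  sys_end   TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW END,
  trans_id  TIMESTAMP(12) GENERATED ALWAYS AS TRANSACTION START ID,
  PERIOD SYSTEM_TIME (sys_start, sys_end)
);

-- Create the history table and start versioning; from now on, updated
-- and deleted row versions are moved to the history table
CREATE TABLE policy_history LIKE policy;
ALTER TABLE policy ADD VERSIONING USE HISTORY TABLE policy_history;

-- Query the table as it looked at a point in the past
SELECT * FROM policy
  FOR SYSTEM_TIME AS OF TIMESTAMP '2012-01-01 00:00:00';
```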
DB2 10 is the first database product in the industry to provide temporal capabilities based on the ANSI/ISO SQL:2011 standard. The temporal data management capabilities in DB2 are seamlessly integrated with all other database features.
Time Travel Query is available in most DB2 editions, starting with DB2 Express.
Data security enhancements
Row and column access control (RCAC) limits data access to those users who have a business need to know, and makes it easy to set up a rich security policy. In DB2 10, RCAC is implemented through two complementary mechanisms that address the shortcomings of traditional row and column access control methods: row permissions and column masks. No database user is automatically exempt: the ability to manage row permissions and column masks rests with the security administrator (SECADM), and users with DATAACCESS authority can no longer freely access all of the data in the database.
- A row permission is a database object that expresses a row access control rule for a specific table; it is essentially a search condition that describes which rows you can access. For example, a doctor can see only the rows that represent her patients.
- A column mask is a database object that expresses a column access control rule for a specific column in a specific table, using a CASE expression that describes what you see when you access the column. For example, a teller can see only the last 4 digits of a credit card number. Column masks replace the need to create views to implement access control.
Row permissions and column masks require no application changes; row and column access control is based on specific rules that are transparent to existing applications. Moreover, enforcement of your security policy does not depend on how applications or tools access the data. RCAC also facilitates multi-tenancy, which means that several independent customers or business units can share a single database table without being aware of one another.
You can enable row and column access control for new or existing tables by using the ALTER TABLE statement.
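The two examples above can be sketched as follows; the tables, columns, and role names are hypothetical, and VERIFY_ROLE_FOR_USER is a built-in DB2 scalar function:

```sql
-- Row permission: doctors may see only rows for their own patients
CREATE PERMISSION patient_row_access ON patient
  FOR ROWS WHERE VERIFY_ROLE_FOR_USER(SESSION_USER, 'DOCTOR') = 1
             AND doctor_id = SESSION_USER
  ENFORCED FOR ALL ACCESS
  ENABLE;

-- Column mask: tellers see only the last 4 digits of the card number
CREATE MASK ccn_mask ON customer FOR COLUMN ccn
  RETURN CASE
           WHEN VERIFY_ROLE_FOR_USER(SESSION_USER, 'TELLER') = 1
             THEN 'XXXX-XXXX-XXXX-' || SUBSTR(ccn, 13, 4)
           ELSE ccn
         END
  ENABLE;

-- The rules take effect once access control is activated on the tables
ALTER TABLE patient ACTIVATE ROW ACCESS CONTROL;
ALTER TABLE customer ACTIVATE COLUMN ACCESS CONTROL;
```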
How is RCAC different from LBAC? Label-based access control (LBAC) is a security model that is primarily intended for defense applications. LBAC requires that data and users be classified, and implements a fixed set of rules. RCAC is a general-purpose security model that is primarily intended for commercial customers. RCAC allows you to create your own security rules.
Row and column access control is available in most DB2 editions, starting with DB2 Express.
DB2 pureScale enhancements
The DB2 pureScale feature is a clustering technology that reduces the risk and cost of business growth. DB2 pureScale enables you to scale your system with near-linear efficiency and predictable query performance, and without modifying applications or changing how data is distributed.
With the DB2 pureScale feature, multiple database servers, known as members, process incoming database requests. These members operate in a clustered environment and share data. The cluster caching facility (CF) handles centralized lock management and global caching. If a member fails, only the database requests that were being processed by the failed member must be resubmitted, and all other in-flight requests are unaffected. But perhaps best of all, you can add or remove members to adjust to your changing workload, without changing your applications, redistributing data, or tuning performance.
In DB2 10, the DB2 pureScale feature includes the following changes, which were also included in DB2 9.8 FP4.
- Support for split mirror backup and SET WRITE operations
- CF support for multiple low-latency, high-speed cluster interconnects
- Support for network topologies with multiple network switches
The DB2 10 pureScale feature contains additional improvements, including the following enhancements.
- DB2 workload management integration
- Support for range partitioned tables
- Multiple enhancements in the areas of monitoring and performance optimization
- Table space-level backup and recovery
The DB2 pureScale feature is now more tightly integrated with the DB2 for Linux, UNIX, and Windows core engine, and in DB2 10, you can install the feature while installing Enterprise Server Edition, Workgroup Server Edition, or Advanced Enterprise Server Edition. Using DB2 pureScale with pre-tested configurations, as described at "DB2 pureScale", is recommended.
Continuous data ingest
The new high-speed ingest utility streams data from files or pipes, in a variety of data formats, into DB2 tables. Ingest operations are fast because the utility has a multi-threaded architecture, and it can quickly populate even large databases in partitioned database environments.
The ingest utility does not lock the target table; rather, it uses row locking to minimize its impact on concurrent activities against the same table.
Ingest comes with a rich set of data manipulation capabilities, including SQL-like interfaces for INSERT, UPDATE, MERGE, REPLACE, and DELETE statements, and the ability to apply SQL expressions to input data.
You can also specify what is to be done with rows that are rejected by the ingest utility (by using the DUMPFILE clause) or by DB2 (by using the EXCEPTION TABLE clause).
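A minimal invocation might look like this (the input file and target table are hypothetical):

```sql
-- Stream a delimited input file into the SALES table; the target
-- table stays available to concurrent applications
INGEST FROM FILE sales.del
  FORMAT DELIMITED
  INSERT INTO sales;
```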
Continuous data ingest is available in Advanced Enterprise Server Edition.
A rich tool environment
Tools that support database administration and application development can enhance your performance gains and overall productivity.
- IBM Data Studio, which replaces the existing DB2 Control Center, is included in every edition of DB2 10. Data Studio incorporates advanced development, management, and monitoring tools into a single Eclipse-based offering. Multi-temperature data management, temporal data management, and row and column access control all rely on the tools support that is provided by Data Studio.
- DB2 Advanced Enterprise Server Edition now includes IBM InfoSphere Data Architect, which you can use to discover, model, visualize, relate, and standardize diverse and distributed data assets across your enterprise.
- InfoSphere Optim Performance Manager now has a number of enhancements to support DB2 pureScale.
- InfoSphere Optim Query Workload Tuner is included to provide expert recommendations for maximizing application performance.
- InfoSphere Optim Configuration Manager is included to offer centralized management of database and client configuration.
By adhering to the design pillars of low operational costs, ease of development, and reliability, DB2 10 for Linux, UNIX, and Windows delivers a large number of enhancements to help save you time and money. The resulting improvements in productivity from both development and administrative perspectives support a new generation of applications operating in an increasingly complex and demanding business environment.