Accelerating batch processing with IBM DB2 Analytics Accelerator

This article provides an overview of the benefits that a company can achieve by introducing IBM® DB2® Analytics Accelerator in batch processing systems. The example given is based on a real implementation done at Swiss Re, a major re-insurance company based in Zurich, Switzerland.


Philipp Spaeti (philipp.spaeti@ch.ibm.com), Executive IT Architect, IBM

Philipp Spaeti is an IT architect with more than 20 years of experience in the IT business. Currently, he is an executive IT architect with IBM Financial Services Sector in the role of a client technical adviser to major financial services companies. He provides his expertise, conceptual capabilities, and problem-solving capability in major customer projects with particular focus on driving innovation and adoption of new technologies as a catalyst for growth. Spaeti also is a member of the IBM Academy of Technology.



16 April 2013


Introduction

The IBM DB2 Analytics Accelerator is a workload-optimized appliance, which enables companies to integrate business insights into operational processes to drive winning strategies. It combines the System z® quality of service and IBM DB2 Analytics Accelerator hardware-accelerated analytics, enabling it to speed up complex queries and deliver unprecedented response times in a highly secure and available environment.

A key feature of DB2 Analytics Accelerator is its transparent integration into DB2: existing workloads and queries run unchanged and automatically benefit from faster response times and reduced load on the mainframe central processors.

Swiss Re is always looking for ways to make its IT environment more cost-effective and better-performing. In recent years, it has therefore converted its financial reporting system from COBOL batch to Java™ batch on z/OS®. This allowed it to achieve significant cost savings.

In order to make a further quantum leap, the company introduced IBM DB2 Analytics Accelerator to accelerate the execution of batch jobs and move workload to the accelerator appliance, freeing up capacity for the benefit of other applications.


The solution

Swiss Re's reporting system is a batch-oriented system that runs on z/OS. In recent years, data growth has posed challenges that led Swiss Re, in partnership with IBM, to implement innovative solutions that contain costs and increase performance to meet new business demands.

The first step was to convert the batch processes from COBOL to Java by leveraging WebSphere® Compute Grid on z/OS. This allowed Swiss Re to offload a good portion of the workload to System z Application Assist Processors, while continuing to use the same application interfaces as well as the same database and data models.

The second step was then to offload the database processing portion of the workload to an external appliance. The introduction of IBM DB2 Analytics Accelerator allowed the company to reach this goal and to evolve to a new era of optimization.


IBM DB2 Analytics Accelerator

IBM DB2 Analytics Accelerator integrates behind the application layer of a DB2 for z/OS environment. This provides complete transparency to end users and applications that submit queries to the DB2 for z/OS subsystem. No changes need to be made to the connectivity or the application design to take advantage of the IBM DB2 Analytics Accelerator. This avoids many training and integration issues associated with deploying new technology.

The data simply needs to be loaded into the Analytics Accelerator at the table level through a GUI (IBM Data Studio). As of V3, maintaining data currency is automated: changes to the base table can be replicated automatically to the accelerator.

Figure 1. Transparent access to IBM DB2 Analytics Accelerator via DB2
Diagram shows query flow from application, through interface, optimizer, and IDAA DRDA requestor, to the SMP host

With the introduction of the IBM DB2 Analytics Accelerator, you now have a number of options for processing queries in an existing DB2 for z/OS environment.


Query flow options in DB2 for z/OS

There are essentially three query flow options:

  • Today, a subset of DB2 processing is routed to the IBM System z Integrated Information Processors (zIIP), including parallel queries and DRDA processing. These queries continue to pull data from the DB2 database on disk, moving the data into the real memory on the System z processor to be completed by DB2 for z/OS.
  • With the IBM DB2 Analytics Accelerator, a subset of the tables in DB2 is copied and compressed onto the IBM DB2 Analytics Accelerator. DB2 for z/OS recognizes that these tables are also available for special processing. When the DB2 Optimizer determines how best to resolve a query, it evaluates whether the query could be processed more quickly on the IBM DB2 Analytics Accelerator. Typical candidates are queries with OLAP-style characteristics that scan a fact table within a star schema and return aggregated answers. The data is scanned by the IBM DB2 Analytics Accelerator code, and the answer set is passed back to the DB2 environment. The IBM DB2 Analytics Accelerator simply adds another option for processing incoming queries. Just install the IBM DB2 Analytics Accelerator, and DB2 for z/OS will identify and route appropriate queries within your workload to the new environment for faster execution.
  • Any remaining queries that do not qualify for the zIIP engine or for DB2 Analytics Accelerator processing continue to be processed on the general processor within the DB2 for z/OS environment, where data continues to be accessed from disk. Workload Management (WLM) policies can be implemented to ensure that queries are prioritized relative to the rest of the workload, so that the most important tasks are completed first.

DB2 for z/OS leverages its available options to select the most efficient way to process incoming requests. Compared to other solutions, integration with the database management system goes far beyond a simple communication link. Management and administration are controlled through DB2 for z/OS. In this context, IBM DB2 Analytics Accelerator is a virtual resource pool for DB2 for z/OS. Proven characteristics associated with DB2 for z/OS, such as security, reliability, and continuous availability, are not compromised; even in the unlikely case that the IBM DB2 Analytics Accelerator is unavailable, business-critical BI solutions stay online.

The current release of IBM DB2 Analytics Accelerator targets dynamic SQL processing. Control parameters define whether queries are always offloaded to IBM DB2 Analytics Accelerator, offloaded only when the optimizer decides, or never offloaded. A requirement to support static SQL has been formulated, and that capability may be added in the future.
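The three offload settings and the fallback to DB2 can be pictured with a small toy model. This is only a sketch of the behavior described in the text, not DB2 optimizer logic: the names `Acceleration` and `choose_engine` and all parameters are hypothetical, and the fallback to DB2 in the "always" setting when the appliance is offline is an assumption of the sketch (a real system might reject the query instead).

```python
from enum import Enum


class Acceleration(Enum):
    """Hypothetical names for the three settings described in the text."""
    NEVER = "never"
    OPTIMIZER = "optimizer"
    ALWAYS = "always"


def choose_engine(mode, is_dynamic_sql, accelerator_online,
                  tables_on_accelerator, optimizer_prefers_accelerator):
    """Return "accelerator" or "db2" for one query (toy model only)."""
    # Only dynamic SQL against tables loaded to the accelerator can be offloaded.
    eligible = is_dynamic_sql and tables_on_accelerator
    if mode is Acceleration.NEVER or not eligible:
        return "db2"
    # Assumption: if the appliance is unavailable, the query runs classically in DB2.
    if not accelerator_online:
        return "db2"
    if mode is Acceleration.ALWAYS:
        return "accelerator"
    # OPTIMIZER mode: follow the (simulated) optimizer decision.
    return "accelerator" if optimizer_prefers_accelerator else "db2"


# An eligible dynamic query in OPTIMIZER mode that the optimizer favors.
print(choose_engine(Acceleration.OPTIMIZER, True, True, True, True))
```

In this model, static SQL and queries on tables that were never loaded always stay in DB2, which matches the scope of the current release as described above.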

However, data warehousing tools, as well as decentralized applications connecting to DB2 for z/OS, most commonly use dynamic SQL. Therefore, the current solution covers many use cases. Executing most queries on the accelerator makes it possible to eliminate many indexes that DB2 would otherwise need for optimal query performance. This decreases storage needs and helps achieve better insert and update performance. The updates are then propagated automatically to the accelerator and are available for querying.


Solution architecture

Swiss Re runs its z/OS system in active-active mode in two data centers located 10 km apart. The IBM DB2 Analytics Accelerator architecture therefore had to fulfill all requirements imposed by the System z landscape architecture.

The IBM DB2 Analytics Accelerator design is such that a failure of the appliance does not prevent the execution of a query, as the query is simply executed classically within DB2 (i.e., without acceleration). However, initial experience shows that acceleration improves response times by whole factors rather than by percentages. Improvements of this size may in turn lead to changes in business processes, so that IBM DB2 Analytics Accelerator becomes a critical part of the infrastructure.

To accommodate this new requirement, a better high-availability design was needed. This was achieved by attaching an IBM DB2 Analytics Accelerator system on each side and cross-connecting the systems across the two data centers, so that each System z can reach each Analytics Accelerator box.

Because each accelerator appliance is already fully redundant in itself, this high-availability concept leaves the system with no single point of failure: even the failure of a whole data center or of the connection between the centers can be tolerated.

Figure 2. Connectivity between EC12 and IBM DB2 Analytics Accelerator across two data centers
Diagram shows connection from Nexus on data center 1 to Nexus on data center 2

To allow fail-over and maintain the same performance for the end-user application, the tables needed by the applications are copied to both IBM DB2 Analytics Accelerator boxes.
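The "no single point of failure" claim can be checked with a small reachability model of the cross-connected layout. This is an illustrative sketch only: the site labels `dc1` and `dc2`, the function names, and the simplification to one System z and one accelerator per data center are assumptions, not details of the Swiss Re installation.

```python
# Hypothetical labels: one System z and one accelerator per data center.
SITES = ("dc1", "dc2")


def can_accelerate(host_site, failed_sites, failed_accelerators, link_up):
    """True if the System z at host_site can reach a working accelerator."""
    if host_site in failed_sites:
        return False  # the host itself is down
    for acc_site in SITES:
        if acc_site in failed_sites or acc_site in failed_accelerators:
            continue
        # A remote accelerator is reachable only over the inter-site link.
        if acc_site == host_site or link_up:
            return True
    return False


def service_survives(failed_sites=(), failed_accelerators=(), link_up=True):
    """True if at least one surviving System z still has an accelerator."""
    return any(can_accelerate(site, failed_sites, failed_accelerators, link_up)
               for site in SITES)


# Every single failure named in the text: a data center, an appliance, the link.
single_failures = [
    {"failed_sites": ("dc1",)},
    {"failed_sites": ("dc2",)},
    {"failed_accelerators": ("dc1",)},
    {"failed_accelerators": ("dc2",)},
    {"link_up": False},
]
all_survive = all(service_survives(**failure) for failure in single_failures)
print("accelerated service survives every single failure:", all_survive)
```

The model also shows why the tables are copied to both boxes: when the local appliance fails, the surviving remote appliance can only serve the queries if it already holds the same tables.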


Results

The acceleration achieved for individual queries is broadly distributed, from three times faster to 90 times faster or even more. The first finding was therefore that the performance improvement is measured not in percent but in factors, which is significant. Furthermore, the acceleration depends on the size of the result set.
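To see why factors are the more useful unit here, the speedups quoted above can be converted into the share of elapsed time saved. The helper name below is hypothetical, and the calculation is plain arithmetic rather than a measurement from the project.

```python
def elapsed_time_reduction_percent(speedup_factor):
    """Percent of elapsed time saved when a query runs speedup_factor times faster."""
    if speedup_factor <= 0:
        raise ValueError("speedup_factor must be positive")
    return (1.0 - 1.0 / speedup_factor) * 100.0


for factor in (3, 90):
    saved = elapsed_time_reduction_percent(factor)
    print(f"{factor}x faster -> {saved:.1f}% less elapsed time")
```

A 3x speedup already saves about two thirds of the elapsed time, and a 90x speedup saves almost 99%, so the percentage scale barely distinguishes speedups at the upper end of the range.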

The source tables in the data warehouse contain between 500 and 900 million rows, and the result sets range from reports (queries) producing five rows to reports (queries) producing more than 100,000 rows. The reports with the largest result sets usually benefit less from acceleration than those with small result sets.

Interesting observations were also made at the storage level. A table that occupies 400 GB of raw (uncompressed) data at the source is compressed to 123 GB in DB2 and to 40 GB in IDAA, a compression factor of 10x relative to the raw data. The load time for this table was measured at 29 minutes (i.e., roughly 800 GB/hour).
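The storage and load figures above can be reproduced with a few lines of arithmetic. The variable names are illustrative, and the sketch assumes that the quoted load rate refers to the 400 GB of raw data.

```python
raw_gb = 400.0          # uncompressed source table
db2_gb = 123.0          # same table compressed in DB2
accelerator_gb = 40.0   # same table on the accelerator
load_minutes = 29.0     # measured load time

db2_factor = raw_gb / db2_gb                    # compression achieved in DB2
accelerator_factor = raw_gb / accelerator_gb    # compression on the accelerator
# Assumption: throughput is computed against the raw (uncompressed) size.
load_rate_gb_per_hour = raw_gb / (load_minutes / 60.0)

print(f"DB2 compression:         {db2_factor:.1f}x")
print(f"Accelerator compression: {accelerator_factor:.1f}x")
print(f"Load rate:               {load_rate_gb_per_hour:.0f} GB/hour")
```

The computed load rate is about 828 GB/hour, which the article rounds to 800 GB/hour.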


Summary

IBM DB2 Analytics Accelerator brings multiple benefits:

  • Acceleration of the current workload, and thus increased productivity for the people running reports.
  • Offloading of long-running, CPU-intensive queries to IBM DB2 Analytics Accelerator, which frees up the main processors and gives the system more capacity without increasing costs.
  • No changes needed to applications or queries.
    • Investing time to adapt queries to exploit specific accelerator strengths (such as the SUM function) brings additional benefits.

Taken together, these factors significantly improve TCO, because more workload can run at the same cost.

Furthermore, IBM DB2 Analytics Accelerator also benefits the DB2 administrator: common query-tuning activities can be eliminated because the system needs no tuning, no indexes, and no partitioning.

IBM DB2 Analytics Accelerator makes it possible to hold operational data and analytics data on the same platform, bringing the opportunity to decrease the time needed for data transfer and allowing the business to rapidly access the most up-to-date data possible. This can give the company an unprecedented competitive advantage.

