CSI: DB2 - Part Two
Historical Data Forensics On Demand for Audit Defense
In the CSI: DB2 – Part One blog, I discussed the "data forensic" enabling features possible on IBM Z using DB2 for z/OS Temporal Tables. This second entry discusses how the IBM DB2 Analytics Accelerator (The Accelerator) rounds out the story.
IBM DB2 Analytics Accelerator:
The IBM DB2 Analytics Accelerator for z/OS (the Accelerator) is a high-performance appliance for DB2 z/OS that deeply integrates Netezza balanced and highly parallelized asymmetric massively parallel processing technology with IBM Z technology at the database kernel level. The Accelerator allows DB2 to offload data-intensive and complex static and dynamic DB2 queries (e.g. data warehousing, business intelligence, and analytic workloads) to the Accelerator without any application changes. The Accelerator then executes these queries significantly faster than was previously possible, while avoiding expensive general purpose (GP) CPU utilization in DB2 for z/OS. The performance and cost savings of the Accelerator opens up unprecedented opportunities for organizations to make use of their data on the IBM Z platform. It allows users to run workloads that historically were offloaded from IBM Z, thus lowering or eliminating the cost of acquiring HW and SW for data warehousing and analytics as well as lowering or eliminating the cost incurred from data movement, transformation, landing, storage, and maintenance of systems. With the Accelerator, clients can consolidate disparate data to their existing IBM Z platform while benefiting from integrated operational BI. It allows users to run queries that were governed or shunted in DB2 for z/OS such as ad hoc queries whose performance characteristics are typically unknown at runtime. And IT administrators can allow DB2 for z/OS to choose where to run these queries, or they can force these queries to the DB2 Analytics Accelerator to prevent additional DB2 for z/OS consumption.
This blog specifically focuses on two capabilities within the Accelerator that facilitate temporal data analysis: High Performance Storage Saver (HPSS) for online archiving, and Accelerator-only tables (AoT’s) for intra-Accelerator Extract, Load, Transform (ELT) and creating/storing snapshots.
High Performance Storage Saver
With data quantities exploding, archiving has become an even more important component of your data management strategy. While active data that is still changing remains available in source systems, unchanging historical data is generally archived for future use. Driven by cost, historical data is typically archived to less expensive, tape-based solutions. This includes archives that remain online due to auditing requirements and archives stored online for more permanent storage. While it is not being actively updated, the reality is that most archived data must still be analyzed by end users. For online archives, data access experiences a latency that eliminates real-time analysis options, because analysis typically requires nightly batch processes to restore the data. This limits the use of this archived data to trend-type analyses.
To address these challenges, the DB2 Analytics Accelerator includes an online archiving functionality called High Performance Storage Saver (HPSS). Using HPSS to move historical data into the DB2 Analytics Accelerator reduces the data volume in the DB2 for z/OS table; this means smaller indexes and smaller materialized query tables. Just as important, moving this historical data into the DB2 Analytics Accelerator enables active analysis without any restore process. Access to both the historical and active data is managed through the same DB2 for z/OS interface and is protected through RACF security as opposed to interacting with the tape archive or copying the data to a separate DBMS and maintaining a separate data interface. And archive data durability is ensured through existing backup/recovery strategies. By employing HPSS, both active and online historical data are always available to the user for analysis and auditing purposes—with near zero latency—for online historical data queries. The DB2 for z/OS optimizer manages traffic and directs any queries that touch archived data to the DB2 Analytics Accelerator, while queries against strictly non-archived data follow the DB2 Analytics Accelerator’s usual query routing criteria. This means there will be near zero CPU consumption associated with the analysis of the historical data. In addition, you can realize cost savings by removing any batch jobs that take data from tape and load them in online tables. Range partitioning temporal tables on the SYSTEM_END timestamp guarantees that all of its partitions except for the most recent partition are not changing. Thus, those partitions can be archived onto the DB2 Analytics Accelerator. The aforementioned snapshot queries can then be run on the temporal data within the Accelerator.
Please note that the NPS code on the DB2 Analytics Accelerator does not support timestamp (12) columns used for SYSTEM_TIME and BUSINESS_TIME. The DB2 Analytics Accelerator resolves this issue by mapping the precision 12 timestamp column to a precision 6 timestamp column. The impact is that the precision of these timestamps when the data is returned from the DB2 Analytics Accelerator differs from when the data is returned from DB2 for z/OS.
Please refer to this Redbook, which discusses temporal support in DB2 for z/OS and the DB2 Analytics Accelerator in great detail.
As their name suggests, these tables only exist in the Accelerator. This newer capability helps organizations expand the business-critical queries that can be processed on the Accelerator. Accelerator-only tables (AoT) can be used to further simplify data-transformation processes by virtually eliminating the need for data marts and complex ETL processes. This can help your organization reduce IT sprawl associated with analytics. In addition, Accelerator-only tables allow organizations to derive real-time insight from z/OS transactional systems. In the context of temporal data processing, Accelerator-only tables can be used to store the result sets of temporal snapshot queries that are run on the Accelerator. These AoTs can then be queried directly for further temporal/audit analysis.
The DB Analytics Accelerator can help reduce the costs associated with storing temporal data. Additionally, it can greatly improve the time to analyze the current and temporal data without impacting transactional systems. The Accelerator represents an “online archive” with the added benefit that all access is managed by the transactional system. This means that end users don’t have to create separate connections to the data with separate credentials. By maintaining one point of access to the enterprises sensitive data, the liability of data breach is mitigated. For more information on this topic, or to request a free consultative workshop on these two technologies, please contact your IBM sales representative or visit the following website: https://ibm.biz/BdXTz4
About the author
Shantan Kethireddy is an IBM Master Inventor and the NA Solution Architect for analytics solutions on IBM Z products. He is responsible for leading consultative business value validation engagements for big data and analytics. He holds a Masters in Computer Engineering and Electrical Engineering from the University of Iowa and possesses two dozen patents, primarily focused on data-centric technologies.
Thanks to the following people for their valuable contributions:
- James Knisley - IBM Client Technical Specialist
- Ruiping Li - Senior Technical Staff Member in DB2 for z/OS development at IBM Silicon Valley Lab
- Patricia Zakhar - Portfolio Marketing Manager, DB2 Analytics Accelerator and z Analytics
Disclaimer: The comments in this blog are based upon the author’s current knowledge and personal experiences. All comments are the author’s personal view and do not necessarily reflect the positions or opinions of IBM or its affiliates. You should conduct independent tests to verify the validity of any statements made in this blog before basing any decisions upon those statements.