Blog post by: Davendra Paltoo, Offering Manager, IBM Data Replication
In November 2017, IDC’s Data Integration and Integrity (DII) Software Research Group surveyed end users of DII software, asking questions similar to those asked in 2015 to help identify trends. With the need to bring real-time, current data into the enterprise for analytics, respondents indicated that keeping data synchronized among applications is and will remain the most prevalent use case for data integration: 51% of respondents named application data sync as the top use case. IDC also observes that data intelligence will grow from one of the least prevalent use cases today to one of the most prevalent by 2020, in support of data governance, profiling, discovery and knowledge [1].
As organizations continue to face the challenge of delivering real-time data to analytics applications, IBM provides the IBM® Integrated Analytics System (IIAS), which consists of a high-performance hardware platform and optimized database query engine software that work together to support data analysis and business reporting for today’s big data needs. IBM also provides Db2 Warehouse, a data warehousing offering with in-database analytics capabilities.
Enterprises that run IIAS or Db2 Warehouse often need to integrate data from a variety of sources into those deployments.
In addition, for an increasing variety of analytics use cases, only the freshest data is sufficient. Whether it is a customer interacting with a self-service portal or an executive looking for up-to-the-minute financial performance, no organization can afford to serve up stale data. Yet this can happen if organizations depend on periodic bulk movement of data around the enterprise.
IBM Data Replication provides up-to-the-second replicas of changing data where and when needed. Our users are replicating operational data to everything from traditional data warehouses, to data appliances such as the PureData System for Analytics (PDA) or IIAS, to Big Data clusters driven by Apache Kafka and Hadoop, and even to cloud-based OLAP environments such as Db2 Warehouse.
IBM Data Replication’s Change Data Capture (CDC) technology uses log-based capture, which minimizes the impact on source databases, to deliver changes from all supported CDC sources directly into IIAS and Db2 Warehouse (i.e., in one hop).
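To make the idea of log-based capture concrete, here is a minimal Python sketch. It is a toy illustration, not IBM's implementation: the `LogRecord` type, the `read_changes` function and the in-memory list standing in for a database transaction log are all hypothetical. The point it shows is that the capture side reads only the log entries written after a saved bookmark, so it never has to query the source tables themselves.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class LogRecord:
    """One entry in a simplified, append-only transaction log (hypothetical)."""
    position: int            # monotonically increasing log position
    op: str                  # "I" (insert), "U" (update) or "D" (delete)
    key: int                 # primary key of the affected row
    row: Optional[dict]      # row image after the change; None for deletes


def read_changes(log: List[LogRecord], bookmark: int) -> Tuple[List[LogRecord], int]:
    """Return the records written after `bookmark`, plus the new bookmark.

    Only the log is read; the source tables are never queried.
    """
    new_records = [rec for rec in log if rec.position > bookmark]
    new_bookmark = new_records[-1].position if new_records else bookmark
    return new_records, new_bookmark
```

A replication loop would call `read_changes` repeatedly, persist the returned bookmark, and hand each batch of records to the apply side.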
In the recent release, IBM introduced a new Mirror Bulk Apply option that uses Db2 external tables as the apply mechanism. It provides faster ingest than the previously available apply mechanisms into column-organized tables, whether in Db2 Warehouse deployed on the IIAS appliance or in “standalone” Db2 Warehouse databases. External table bulk apply is the same approach CDC already uses to apply changes to the IBM PureData System for Analytics appliance (“Netezza”).
This support is now being extended to Db2 Warehouse. Column-organized tables are useful in analytics databases because they improve query performance.
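The general idea behind a bulk apply can be sketched in a few lines of Python. This is an assumption-laden illustration, not the algorithm CDC actually uses: the function names, the `Change` tuple and the dictionary standing in for a target table are all hypothetical. The sketch collapses a batch of row-level changes into a net set of keys to delete and rows to insert, then applies them in two set-based steps instead of one statement per change, which is the kind of workload column-organized tables tend to handle better.

```python
from typing import Dict, List, Optional, Set, Tuple

# (operation, primary key, row image after the change; None for deletes)
Change = Tuple[str, int, Optional[dict]]


def net_changes(batch: List[Change]) -> Tuple[Set[int], Dict[int, dict]]:
    """Collapse a batch into keys to delete and final row images to insert."""
    deletes: Set[int] = set()
    inserts: Dict[int, dict] = {}
    for op, key, row in batch:
        # Every touched key is deleted first, so the apply is idempotent.
        deletes.add(key)
        if op in ("I", "U"):
            inserts[key] = row          # keep only the latest row image
        elif op == "D":
            inserts.pop(key, None)      # a later delete cancels earlier images
    return deletes, inserts


def bulk_apply(target: Dict[int, dict], batch: List[Change]) -> None:
    """Apply a batch to `target` as one bulk delete plus one bulk insert."""
    deletes, inserts = net_changes(batch)
    for key in deletes:                 # stands in for a single set-based DELETE
        target.pop(key, None)
    target.update(inserts)              # stands in for a single set-based INSERT
```

In a real deployment the staged rows would presumably be written to a file and exposed through an external table, so that the delete and insert each run as one statement against the target; the dictionary operations above only mimic that shape.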
The new CDC apply performance capability gives end users confidence that data from even the highest-volume transactional systems can be replicated with acceptable latency into column-organized tables in IIAS and Db2 Warehouse.
For more information on the Mirror Bulk Apply capability, please see our Knowledge Center.
For more information on IBM Data Replication, please read the IBM Data Replication solution brief.