Exploiting data is critical to business success, and faster data processing improves an organization’s ability to react to business events in real time. As a result, organizations are bringing together new types of data from a variety of internal and external sources for real-time or near-real-time analytics. This can involve building data lakes and information hubs, often on public clouds, fed by real-time streaming technologies, to process and gain value from this variety of data. All these trends drive a growing need for capabilities that can effectively feed data into information hubs, data lakes and data warehouses and then quickly process large data sets. These capabilities enable quick responses to changing business events, better engagement with clients, and more.
As organizations struggled to manage the ingestion of rapidly changing structured operational data, a pattern emerged in which organizations leverage data initially delivered to Kafka-based information hubs.
Kafka was conceived as a distributed streaming platform. It provides a very low-latency pipeline that enables real-time event processing, movement of data between systems and applications, and real-time transformation of data. However, Kafka is more than just a pipeline; it can also store data. Kafka-based information hubs go well beyond feeding a data lake; they also deliver continuously changing data for downstream data integration with everything from the cloud to AI environments and more.
To help organizations deliver transactional data from the OLTP databases that power mission-critical business applications into Kafka-based information hubs, IBM® Data Replication provides a Kafka target engine that applies data with very high throughput into Kafka. The Kafka target engine is fully integrated with all of the IBM Data Replication low-impact, log-based captures from a wide variety of sources, including Db2® for z/OS®; Db2 for iSeries; Db2 for UNIX, Linux® and Windows; Oracle; Microsoft SQL Server; PostgreSQL; MySQL; Sybase; Informix®; and even IBM Virtual Storage Access Method (VSAM) and Information Management System (IMS).
If the requirement does not involve delivery to Kafka, the IBM Data Replication portfolio also provides a comprehensive solution for delivering data to other targets such as databases, Hadoop, files, and message queues.
There is often little room for latency when delivering the data that will optimize decision making or provide better services to your customers. Hence, you need a data replication capability that can incrementally replicate changes captured from database logs in near-real time. In turn, this capability can facilitate streaming analytics, feeding a data lake, and more, using the data that IBM Data Replication lands in Kafka.
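To make the idea of incrementally applying captured changes more concrete, here is a minimal sketch of how a downstream consumer might fold a stream of change records into an up-to-date view. It runs entirely in memory and does not connect to Kafka. The `ChangeEvent` shape, the convention that a missing value means a delete, and the `apply_changes` helper are all assumptions made for illustration; they are not the record format or API of IBM Data Replication or of any Kafka client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change record (an assumption, not a real product format)."""
    key: str                 # primary key of the source row
    value: Optional[dict]    # new row image, or None to signal a delete
    offset: int              # position of the record in its topic partition


def apply_changes(state: dict, events: list, last_offset: int = -1) -> int:
    """Fold change events into `state` in offset order.

    Events at or below `last_offset` are skipped, so replaying the same
    batch after a restart does not change the result.
    Returns the offset of the last event applied.
    """
    for event in sorted(events, key=lambda e: e.offset):
        if event.offset <= last_offset:
            continue  # already applied earlier
        if event.value is None:
            state.pop(event.key, None)      # delete of the source row
        else:
            state[event.key] = event.value  # insert or update
        last_offset = event.offset
    return last_offset


def demo():
    """Apply a small, made-up batch of changes and return the resulting view."""
    events = [
        ChangeEvent("1001", {"name": "Ada", "balance": 100}, offset=0),
        ChangeEvent("1002", {"name": "Bob", "balance": 75}, offset=1),
        ChangeEvent("1001", {"name": "Ada", "balance": 250}, offset=2),
        ChangeEvent("1002", None, offset=3),
    ]
    state = {}
    last = apply_changes(state, events)
    return state, last, events


if __name__ == "__main__":
    view, last_applied, _ = demo()
    print(view, last_applied)
```

Tracking the last applied offset is one simple way to keep such a consumer safe to restart: because each change is applied only once and in order, the view converges to the latest state of the source rows no matter how often a batch is redelivered.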