New innovations from IBM Data Integration: Real-time quality data and analytics with next-gen deployment flexibility

11 December 2024

 

Author

Scott Brokaw

Director, Product Management, Data Integration, IBM

Today’s business leaders face pressure to operationalize artificial intelligence (AI), but the exploding volume, velocity, and variety of data (set to triple global storage by 2032) pose challenges and are forcing a refocus on hybrid cloud data strategy. One such challenge is delivering quality data without delays. To stay competitive and drive growth, enterprises need a scalable solution that can deliver high-quality, real-time data and adapt to their unique deployment requirements.

IBM Data Integration, a crucial component of IBM Data Fabric, addresses this need with new product innovations. IBM’s approach offers a complete and unified set of capabilities, giving users multiple integration patterns for building production-ready pipelines that deliver real-time data for AI and analytics across all data types, hybrid multicloud infrastructure, and use cases.

The IBM Data Integration team continues to evolve to meet clients’ needs while enhancing the user experience with the following new offerings:

  • IBM StreamSets client-managed software
  • IBM DataStage as a Service expansion to IBM Cloud Sydney
  • IBM Data Replication ingestion for watsonx.data
  • IBM Data Integration for Unstructured Data
  • IBM DataStage ELT pushdown to watsonx.data
  • IBM Databand self-hosted deployment option on AWS Marketplace

Let’s take a deeper dive into each of these new features.

IBM StreamSets client-managed software

Enterprises require real-time data for agile business decisions. IBM’s recent acquisition of StreamSets addresses this need by giving data teams the tools to build streaming data pipelines. In the Software as a Service (SaaS) version of StreamSets, the control plane is delivered as a service, while customers independently manage the data plane within their own environments. This approach supports data security and efficiency, enabling users to execute pipelines wherever their data resides.

The team is excited to announce an additional deployment option for StreamSets. The client-managed version of the control plane lets organizations take complete control of their data integration and transformation processes within their own infrastructure. This new offering caters to enterprises that need greater control over their data environments due to regulatory compliance, data sovereignty concerns, or specific security requirements, and that want to run both the control plane and the data plane in their own virtual private cloud (VPC) or data center.

Learn more about this new technology.

IBM DataStage as a Service expands into a new region, IBM Cloud Sydney

Enterprises must navigate increasing regulatory pressure, with an emphasis on data residency compliance and on ensuring that data management aligns with their chosen regions and geographies. To address this need, the team is excited to announce that DataStage is generally available in the IBM Cloud® Sydney region, with additional expansion announcements coming soon. With this expansion, a wide range of clients will be able to use DataStage capabilities without having to manage infrastructure or installation. This release reaffirms the investment made in modernization to meet clients’ data strategies, opening new opportunities for them to accelerate their cloud data integration projects while adhering to data sovereignty regulations.

Learn more about this new technology.

IBM Data Replication ingestion for watsonx.data

For users seeking to synchronize their data in near real time, IBM Data Replication serves as an integration pattern that unlocks data across diverse environments. By capturing source system changes as they occur, with minimal impact on transactional sources, this product makes data available in near real time for downstream consumption and use cases such as analytics and AI, driving greater business agility and actionable, data-driven insights.

IBM Data Replication continues to demonstrate its commitment to enabling modern workloads with new support for watsonx.data as a target system. By enabling Data Replication for watsonx.data, users can unlock the value of transactional data in their distributed source systems and make that data available for analytics and machine learning at scale.
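To make the replication pattern concrete, here is a minimal, product-agnostic sketch of applying captured change events to a target. It is not the IBM Data Replication API: the `ChangeEvent` type, the `apply_changes` function, and the in-memory dictionary standing in for a target table are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ChangeEvent:
    """Hypothetical change record captured from a source system."""
    op: str                                # "insert", "update", or "delete"
    key: int                               # primary key of the affected row
    row: Optional[Dict[str, Any]] = None   # new row image; None for deletes


def apply_changes(target: Dict[int, Dict[str, Any]],
                  events: List[ChangeEvent]) -> Dict[int, Dict[str, Any]]:
    """Apply change events in order to an in-memory stand-in for a target table."""
    for event in events:
        if event.op in ("insert", "update"):
            if event.row is None:
                raise ValueError(f"{event.op} event for key {event.key} has no row image")
            target[event.key] = dict(event.row)   # upsert the latest row image
        elif event.op == "delete":
            target.pop(event.key, None)           # tolerate deletes of unknown keys
        else:
            raise ValueError(f"unknown operation: {event.op}")
    return target


# Illustrative data only: four changes captured in commit order.
events = [
    ChangeEvent("insert", 1, {"name": "Ada", "balance": 100}),
    ChangeEvent("insert", 2, {"name": "Lin", "balance": 50}),
    ChangeEvent("update", 1, {"name": "Ada", "balance": 75}),
    ChangeEvent("delete", 2),
]
replica = apply_changes({}, events)
print(replica)
```

Applying events strictly in the order they were captured is what keeps the target consistent with the source; a real replication product also handles concerns this sketch omits, such as transactions, schema changes, and restart positions.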

Learn more about this new technology.

Unstructured data integration tech preview

Unstructured data is a valuable asset for organizations, offering rich customer insights and serving as a critical resource for driving AI innovation. Unfortunately, converting unstructured data into a usable format for AI is a labor-intensive and time-consuming task. Enter a new capability: IBM Data Integration for Unstructured Data. With this technology, you will be able to build pipelines that ingest, transform, and process unstructured data for retrieval-augmented generation (RAG) use cases. By taking an approach similar to structured data ingestion and promoting reusability, users will be able to operationalize unstructured data through an automated solution, unlocking the full potential of generative AI.
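As a rough illustration of one transform step that such pipelines commonly include, the sketch below splits raw text into overlapping word-based chunks suitable for indexing in a RAG system. It does not reflect how IBM Data Integration for Unstructured Data is implemented; the `chunk_text` function and its `max_words` and `overlap` parameters are assumptions made for this example.

```python
from typing import List


def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> List[str]:
    """Split text into chunks of at most max_words words.

    Consecutive chunks share `overlap` words so that context spanning a
    chunk boundary is not lost at retrieval time.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in the range [0, max_words)")

    words = text.split()          # also normalizes runs of whitespace
    step = max_words - overlap    # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break                 # the last window already reached the end
    return chunks


# Illustrative input: twelve numbered "words".
sample = " ".join(str(i) for i in range(12))
for chunk in chunk_text(sample, max_words=5, overlap=2):
    print(chunk)
```

In a full pipeline, each chunk would typically be embedded and stored alongside metadata about its source document; those steps depend on the chosen embedding model and vector store and are left out here.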

Learn more about the new technology and sign up for the preview waitlist.

IBM DataStage ELT pushdown to watsonx.data

In addition to real-time streaming, IBM offers users batch-style integration with IBM DataStage. DataStage is a leading data integration tool that supports extract, transform, and load (ETL) and extract, load, and transform (ELT) workflows. DataStage is evolving to address modern customer needs with new ELT capabilities, including expanding pushdown optimization to IBM watsonx.data™, IBM’s hybrid, open data lakehouse. This new capability enhances the integration between DataStage and the lakehouse by combining the strengths of both solutions.

With DataStage, users can now load raw data directly and perform transformations within the lakehouse, capitalizing on its near-unlimited compute and storage resources. Because transformations are pushed down and run where the data resides, unnecessary data movement between data centers is eliminated, which improves execution performance and reduces data ingress and egress. This new capability highlights IBM’s commitment to delivering runtime flexibility that aligns with diverse infrastructure and use case needs.
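The ELT pushdown idea can be sketched in a few lines. The example below uses Python’s built-in `sqlite3` module as a stand-in for a lakehouse query engine; it is not DataStage or watsonx.data code, and the table names, column names, and sample rows are invented for illustration. Raw data is loaded first, and the transformation then runs as SQL inside the engine rather than in the client.

```python
import sqlite3

# An in-memory SQLite database stands in for the target engine (assumption).
conn = sqlite3.connect(":memory:")

# Extract + Load: land the raw records untransformed.
conn.execute(
    "CREATE TABLE raw_orders (order_id INTEGER, region TEXT, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, "apac", 1250), (2, "apac", 750), (3, "emea", 3000)],
)

# Transform: pushed down as SQL, so rows never leave the engine.
conn.execute(
    """
    CREATE TABLE revenue_by_region AS
    SELECT UPPER(region) AS region_code,
           SUM(amount_cents) / 100.0 AS revenue
    FROM raw_orders
    GROUP BY UPPER(region)
    """
)

result = conn.execute(
    "SELECT region_code, revenue FROM revenue_by_region ORDER BY region_code"
).fetchall()
print(result)
conn.close()
```

The client only issues statements and reads back the small aggregated result; the row-level work happens next to the data, which is the property pushdown optimization is meant to provide.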

Learn more about this new technology.

IBM Databand self-hosted on AWS Marketplace

In addition to providing clients with multiple integration patterns that are purpose-fit to their use cases, IBM Data Integration strengthens the health of, and visibility into, these integration pipelines with built-in data observability powered by IBM Databand®. This technology provides continuous data observability with issue detection and remediation capabilities, improving data quality for real-time data and analytics while optimizing costs.

With Databand now available as a self-hosted solution on AWS, organizations can deploy in private environments, aligning with where their data resides. This approach provides complete control over data governance and security configurations.

Learn more about this new technology.

An industry-leading solution

IBM was recently named a Leader in the 2024 Gartner® Magic Quadrant™ for Data Integration Tools, a testament to IBM’s investment in enabling customers’ modern data workloads with a complete and performant approach to data integration.

The new features underscore this recognition, further demonstrating how IBM supports the entire lifecycle of data integration and its strong commitment to clients’ existing and new workload requirements. With multiple integration patterns and the flexibility to support hybrid multicloud infrastructure, organizations can unlock value wherever their data resides for all their use cases. IBM Data Integration provides customers with a complete, unified set of data integration capabilities so they can optimize data usage for AI, business intelligence, analytics, and more.

Read more about why IBM was named a Leader in the 2024 Gartner® Magic Quadrant™ for Data Integration Tools for the 19th consecutive year.

Learn about the future of IBM Data Integration directly from the product management team.