IBM BigIntegrate is a big data integration solution that provides superior connectivity, fast transformation and reliable, easy-to-use data delivery features that execute on the data nodes of an Apache Hadoop cluster. IBM BigIntegrate provides a flexible and scalable platform to extract, transform and integrate your Hadoop data.
BigIntegrate and IBM BigQuality are part of the IBM InfoSphere® Information Server product family and are built specifically to run on Hadoop clusters. Together they offer end-to-end integration and governance capabilities for your Hadoop data.
Provides Hadoop big data integration
Provides a massively scalable, shared-nothing, in-memory data integration engine running natively in a Hadoop cluster to help bring enterprise big data analytics capabilities to the data lake.
Empowers data profiling
Delivers a rich set of data profiling capabilities to understand the assets that are moved into Hadoop.
Improves big data navigation
Uses metadata management to help make sense of the enormous quantities of information in the data lake.
Applies big data governance
Delivers big data-related governance features such as impact analysis and data lineage on virtually any integration point, enabling scalable analytics without sacrificing organizational insight.
Uses real-time analytics
Transforms big data projects with real-time analytical processing. Integrates with IBM Streams. Uses standard data integration conventions to gather and pass data to powerful big data analytics.
User interface modernization and consolidation
A key integration tool within the IBM InfoSphere Information Server product family, the IBM InfoSphere DataStage® Flow Designer features automatic schema propagation, highlighted compile errors, type-ahead search and compatibility with any existing DataStage job.
Improved management and runtime on the data lake
Information Server runs more resiliently on Hadoop through measures such as improved high availability, more precise container estimation, and queue management for faster, uninterrupted execution.
Connectors to many data sources
The feature-rich palette includes connectors to a wide range of data sources, including major traditional databases on distributed platforms, IBM z/OS® and file-type systems. BigIntegrate supports common big data enterprise storage applications, including those built upon Oracle, Salesforce, SAP, Hadoop and other distributed big data warehousing frameworks. A simple drag-and-drop user interface makes data sources more easily available.
Easier integration of Hadoop big data
Empower your application developers to more easily manage and seamlessly integrate Hadoop distributed big data sources. Provide a fully featured, scalable Hadoop integration platform for discovering, transforming and integrating big data, no matter where it resides or what type of data it is. Apply governance features such as impact analysis and data lineage faster and with less effort, enabling scalable analytics without sacrificing organizational insight.
Data warehousing transformation
Combine traditional data warehouse tools with current distributed big data storage techniques and technologies. Use the unique capabilities of BigIntegrate to tap the full potential of Hadoop data storage clusters, stream computing technologies, data exploration, advanced analytics and IBM Watson® cognitive computing. Deliver big data insights to your enterprise application users faster and more efficiently.
The power of IBM BigQuality
Fully leverage the scale and capacity of Hadoop big data by using IBM BigIntegrate with IBM BigQuality to achieve information empowerment across your big data ecosystem. Continuously cleanse and monitor data quality to transform your integrated Hadoop big data into consumable trusted information for a wide range of enterprise applications.