Turning hidden data flows into trusted insight: OpenLineage within IBM watsonx.data intelligence

IBM watsonx.data intelligence is expanding its OpenLineage support with Custom Lineage Mappings, an enhancement that brings hidden data connections to light.

Many critical data flows run through custom-built systems, ETL tools and legacy applications that aren’t automatically captured by standard lineage solutions. These invisible connections obscure dependencies, slow audits and weaken both governance and AI readiness.

The challenge: Hidden data flows that undermine trust

Automated scanners cover many modern platforms, but much of your data’s journey still happens in custom scripts, pipelines and legacy systems. These unseen movements fragment the lineage picture, making it hard to trace dependencies or understand downstream impact.

The resulting blind spots slow audits and weaken confidence in analytics and AI initiatives.

The solution: Seeing the full story of your data with Custom Lineage Mappings

Custom Lineage Mappings in IBM watsonx.data intelligence extend lineage beyond automated scanners, illuminating hidden data flows across modern, custom and legacy systems.

Built on the OpenLineage standard, this enhancement lets teams connect every dataset, job and transformation into one unified, governed view, giving them the clarity and confidence to act faster.

The result is a transparent, end-to-end picture of data dependencies and transformations, empowering leaders to strengthen governance, accelerate audits and advance analytics and AI readiness with confidence.

How it works: Connecting automated and custom lineage

Custom Lineage Mappings bring together the best of all worlds: automated lineage from scanners, lineage emitted by existing OpenLineage producers and custom lineage defined through OpenLineage payloads.

IBM watsonx.data intelligence uses OpenLineage, an open standard for describing how data moves at runtime or design time. It allows organizations to define lineage events for any system, including custom and legacy ones, and bring them together in a unified, governed lineage graph that reveals how data moves and transforms across the entire data journey.

Each OpenLineage event describes how data moves: the inputs it reads, the job that transforms it and the outputs it produces. These events can be generated automatically by tools such as dbt or Airflow. For proprietary and legacy environments, customers can either emit them programmatically or document them manually. Once imported, watsonx.data intelligence aligns these events with known assets and visualizes them within the broader lineage view.
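To make the shape of such an event concrete, here is a minimal sketch in Python that assembles an OpenLineage-style run event for a legacy job using only the standard library. The top-level fields (eventType, eventTime, producer, schemaURL, run, job, inputs, outputs) follow the public OpenLineage specification. The job name, dataset names, namespaces and producer URL are hypothetical, and the schema version in schemaURL is an assumption that should be checked against the version you target. How the resulting payload is imported into watsonx.data intelligence is not shown here.

```python
import json
import uuid
from datetime import datetime, timezone


def build_run_event(job_namespace, job_name, inputs, outputs, event_type="COMPLETE"):
    """Build an OpenLineage-style run event as a plain dictionary.

    `inputs` and `outputs` are lists of (namespace, name) pairs that
    identify the datasets the job reads and writes.
    """
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        # Hypothetical URI identifying the system that produced the event.
        "producer": "https://example.com/legacy-etl",
        # Assumed spec version; adjust to the version your tooling expects.
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [{"namespace": ns, "name": name} for ns, name in inputs],
        "outputs": [{"namespace": ns, "name": name} for ns, name in outputs],
    }


# Hypothetical legacy job: reads two source tables, writes one report table.
event = build_run_event(
    job_namespace="legacy-billing",
    job_name="nightly_invoice_rollup",
    inputs=[
        ("postgres://billing-db:5432", "billing.public.invoices"),
        ("postgres://billing-db:5432", "billing.public.customers"),
    ],
    outputs=[("postgres://reporting-db:5432", "reporting.public.invoice_summary")],
)

payload = json.dumps(event, indent=2)
print(payload)
```

The event names the job, the datasets it reads and the dataset it writes, which is the minimum a lineage tool needs to draw an edge from each input through the job to the output.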

To ensure accuracy and consistency, watsonx.data intelligence provides two core mechanisms:

  • Technology templates define how jobs and datasets defined in OpenLineage payloads are interpreted and displayed. watsonx.data intelligence includes pre-defined templates and allows you to build custom ones, ensuring consistent lineage across diverse technologies.
  • Data source mapping automatically links OpenLineage datasets and jobs to existing data assets in watsonx.data intelligence. You can also define custom rules to map specific namespaces to known data sources, stitching custom lineage seamlessly into the governed enterprise view.
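The idea behind mapping namespaces to known data sources can be illustrated with a small Python sketch. It is purely conceptual: the rule format, the function name resolve_data_source and the data source names are hypothetical and do not represent the configuration syntax or API of watsonx.data intelligence. The sketch assumes rules are namespace prefixes and that the longest matching prefix wins.

```python
def resolve_data_source(namespace, rules):
    """Return the data source mapped to `namespace`, or None if no rule matches.

    `rules` maps a namespace prefix to a data source name. When several
    prefixes match, the longest (most specific) one is used.
    """
    matches = [prefix for prefix in rules if namespace.startswith(prefix)]
    if not matches:
        return None
    return rules[max(matches, key=len)]


# Hypothetical rules: OpenLineage namespace prefix -> catalogued data source.
rules = {
    "postgres://billing-db": "Billing PostgreSQL",
    "postgres://": "Unclassified PostgreSQL",
    "s3://finance-lake": "Finance Data Lake",
}

print(resolve_data_source("postgres://billing-db:5432", rules))
print(resolve_data_source("postgres://hr-db:5432", rules))
print(resolve_data_source("kafka://events:9092", rules))
```

Datasets whose namespace matches no rule stay unmapped, which is where a custom rule would be added to stitch them into the governed view.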

As a result, Custom Lineage Mappings provide a governed, unified graph that integrates automated and custom lineage.

4 reasons why Custom Lineage Mappings matter to data leaders

Custom Lineage Mappings close the visibility gap in enterprise data understanding, giving organizations a clear, connected view of every data flow—no matter where it originates—so teams can act with confidence.

The benefits reach every team that relies on accurate, explainable data:

  • End-to-end visibility: See how data truly moves, across every system and transformation.
  • Audit-ready lineage: Provide regulators and stakeholders with a clear, traceable lineage.
  • Faster, informed decisions: Quickly assess dependencies and downstream impact before making changes.
  • Shared data understanding: Align data engineers, architects and governance teams around a single view of truth.

Built on open standards, powered by collaboration

IBM chose OpenLineage, an open standard hosted by the LF AI & Data Foundation (part of the Linux Foundation), to ensure interoperability and transparency across your data ecosystem.

OpenLineage provides the flexibility, extensibility, and freedom from lock-in that enterprises need, while watsonx.data intelligence adds IBM’s governance, security and enterprise integration.

  • Broad adoption: Trusted by a growing ecosystem of lineage-aware tools.
  • Openness and extensibility: Designed to evolve as your data ecosystem grows.
  • Freedom from lock-in: Open by design, ensuring your lineage remains yours.
  • Lower barrier to adoption: Familiar, consistent format that teams can start using immediately.

By building on OpenLineage, IBM removes the need for proprietary formats and future-proofs customers’ investments in lineage documentation through open standards.

From lineage to leadership: Visibility designed for action

With Custom Lineage Mappings, IBM watsonx.data intelligence gives organizations the power to see and trust their entire data story across automated, custom and legacy systems.

The result? Stronger governance, faster impact analysis and confident AI decisions built on a foundation of open standards and enterprise trust.

As data ecosystems continue to evolve, IBM’s commitment remains the same: helping organizations turn hidden complexity into confident, data-driven decisions.

Learn more about Data Lineage within watsonx.data intelligence

Explore the IBM watsonx.data intelligence demo library

Dive into the technical foundations of Custom Lineage Mappings in our community blog

Jakub Moravec

Product Manager, IBM watsonx.data intelligence

Diana Toma

Product Marketing Manager