Custom Lineage Mappings bring together the best of all worlds: automated lineage from scanners, lineage emitted by existing OpenLineage producers and custom lineage defined through OpenLineage payloads.
IBM watsonx.data intelligence uses OpenLineage, an open standard for describing how data moves at run time or design time. It allows organizations to define lineage events for any system, including custom and legacy ones, and bring them together in a unified, governed lineage graph that reveals how data moves and transforms across the entire data journey.
Each OpenLineage event describes one movement of data: the inputs a job reads, the job that transforms them and the outputs it produces. These events can be generated automatically by tools such as dbt or Airflow. For proprietary and legacy environments, customers can choose to emit them programmatically or document them manually. Once imported, watsonx.data intelligence aligns these events with known assets and visualizes them within the broader lineage view.
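To make the shape of such an event concrete, here is a minimal sketch in Python that builds one as a plain dictionary. The field names (`eventType`, `eventTime`, `run`, `job`, `inputs`, `outputs`, `producer`, `schemaURL`) come from the OpenLineage specification. The helper `build_run_event`, the namespaces, the job and dataset names, the producer URL and the spec version in `schemaURL` are illustrative assumptions, not values that watsonx.data intelligence requires.

```python
import json
import uuid
from datetime import datetime, timezone


def build_run_event(job_namespace, job_name, inputs, outputs, event_type="COMPLETE"):
    """Build a minimal OpenLineage run event as a plain dict.

    `inputs` and `outputs` are lists of (namespace, name) tuples.
    """
    def dataset(ref):
        namespace, name = ref
        return {"namespace": namespace, "name": name}

    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        # Hypothetical producer URI identifying whatever wrote this event.
        "producer": "https://example.com/legacy-etl-documenter",
        # Assumed spec version; use the one your producer targets.
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [dataset(ref) for ref in inputs],
        "outputs": [dataset(ref) for ref in outputs],
    }


# Example: documenting a legacy nightly load by hand (all names are made up).
event = build_run_event(
    job_namespace="legacy-etl",
    job_name="nightly_orders_load",
    inputs=[("db2://prod-db2.example.com:50000", "SALES.ORDERS")],
    outputs=[("db2://prod-db2.example.com:50000", "DW.FACT_ORDERS")],
)
print(json.dumps(event, indent=2))
```

The resulting JSON is the kind of payload that could be saved to a file and imported. How the payload is imported into watsonx.data intelligence is not covered by this sketch.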
To ensure accuracy and consistency, watsonx.data intelligence provides two core mechanisms:
- Technology templates define how jobs and datasets defined in OpenLineage payloads are interpreted and displayed. watsonx.data intelligence includes pre-defined templates and allows you to build custom ones, ensuring consistent lineage across diverse technologies.
- Data source mapping automatically links OpenLineage datasets and jobs to existing data assets in watsonx.data intelligence. You can also define custom rules to map specific namespaces to known data sources, stitching custom lineage seamlessly into the governed enterprise view.
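The idea behind namespace mapping rules can be sketched in a few lines of Python. This is a conceptual illustration only: the rule format (a namespace prefix mapped to a data source name), the longest-prefix matching strategy and the function `resolve_data_source` are assumptions made for this example, not the product's actual rule syntax or matching logic.

```python
from typing import Dict, Optional

# Hypothetical rules: OpenLineage namespace prefix -> name of a known data source.
NAMESPACE_RULES: Dict[str, str] = {
    "db2://": "Generic Db2 connection",
    "db2://prod-db2.example.com": "Production Db2 warehouse",
    "s3://finance-landing": "Finance landing bucket",
}


def resolve_data_source(namespace: str, rules: Dict[str, str]) -> Optional[str]:
    """Return the data source for the most specific (longest) matching prefix."""
    matches = [prefix for prefix in rules if namespace.startswith(prefix)]
    if not matches:
        return None  # No rule applies; the dataset stays unmapped.
    return rules[max(matches, key=len)]


# The more specific rule wins over the generic "db2://" rule.
print(resolve_data_source("db2://prod-db2.example.com:50000", NAMESPACE_RULES))
# A namespace with no matching rule is left unmapped.
print(resolve_data_source("kafka://broker:9092", NAMESPACE_RULES))
```

Preferring the longest prefix lets a broad fallback rule coexist with narrower overrides, which is one common way to keep such rule sets predictable.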
As a result, Custom Lineage Mappings provide a single governed lineage graph that integrates automated and custom lineage.