Manta Flow Alation: Integration Architecture
Components
The following diagram shows the components used for the integration and how they relate to one another.
Integration Process
The integration process is divided into three phases.
- Metadata extraction and lineage analysis
- Export
- Upload
Metadata Extraction and Lineage Analysis
During the metadata extraction and lineage analysis phase, only the IBM Automatic Data Lineage components (and source systems) are utilized. At the beginning of this phase, Automatic Data Lineage extracts the metadata necessary for the lineage analysis of the source systems (databases as well as ETL, analytical, and reporting tools) and produces the lineage as a product of the metadata analysis. At the end of this phase, data lineage is available in Automatic Data Lineage.
Export
During the export phase, Automatic Data Lineage exports the lineage from the Automatic Data Lineage Repository to JSON files, which can then be uploaded to Alation. The component utilized most during this phase is therefore Manta Flow Server. This is also the first phase in which Automatic Data Lineage interacts with Alation, provided that analytical tools, reporting tools, or custom databases are being exported.
The reason for this is that the output files containing lineage for analytical/reporting tools have to contain the IDs of the Alation BI servers.
In the case of custom databases, the files have to contain the IDs of the Alation Virtual Datasource (VDS) servers.
Thus, at the beginning of the export of an analytical/reporting tool or a custom database, Automatic Data Lineage connects to Alation and:
- Lists all the BI/VDS servers.
- If an existing BI/VDS server should be used (based on the mappings documented in the Manta Flow Alation: Manta Configuration article), Automatic Data Lineage fetches the ID of the BI/VDS server.
- If a new BI/VDS server should be created, Automatic Data Lineage creates the BI/VDS server and fetches its ID.
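The list, reuse, or create steps above can be sketched roughly as follows. This is only an illustration under assumptions: `FakeAlationClient`, `list_servers`, `create_server`, `resolve_server_id`, and the shape of the `mappings` dictionary are hypothetical names invented for this sketch. They are not the Alation API and not the actual Automatic Data Lineage implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Server:
    """A BI or VDS server as the export sees it (hypothetical shape)."""
    id: int
    title: str


@dataclass
class FakeAlationClient:
    """In-memory stand-in for the catalog; NOT a real Alation client."""
    servers: List[Server] = field(default_factory=list)

    def list_servers(self) -> List[Server]:
        # Step 1: list all the BI/VDS servers.
        return list(self.servers)

    def create_server(self, title: str) -> Server:
        # Assumption: the catalog assigns the next free integer ID.
        next_id = max((s.id for s in self.servers), default=0) + 1
        server = Server(id=next_id, title=title)
        self.servers.append(server)
        return server


def resolve_server_id(client: FakeAlationClient,
                      source_name: str,
                      mappings: Dict[str, str]) -> int:
    """Return the ID of the server to use for source_name."""
    # Assumption: a mapping entry names the existing server to reuse;
    # without one, a server named after the source is looked up or created.
    target_title = mappings.get(source_name, source_name)
    for server in client.list_servers():
        if server.title == target_title:
            # Step 2: an existing server is reused, so only its ID is fetched.
            return server.id
    # Step 3: no matching server exists, so one is created and its ID returned.
    return client.create_server(target_title).id
```

In this sketch, calling `resolve_server_id` twice for the same source returns the same ID, because the second call finds the server created by the first.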
The connection to Alation is not needed for the rest of the export process. As a result of the process, JSON files containing the exported lineage and analytical/reporting assets are created in the Automatic Data Lineage output folder.
Upload
During this phase, all the exported files containing the data lineage and analytical/reporting assets are uploaded to Alation.
The upload is done in batches according to the configuration described in Manta Flow Alation: Manta Configuration.
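As a rough illustration of batched uploading, the sketch below splits a list of exported file names into fixed-size groups. The function name `batches` and the `batch_size` parameter are assumptions made for this example; the actual configuration properties are the ones described in the configuration article.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        # The last group may be smaller than batch_size.
        yield list(items[start:start + batch_size])
```

For example, three files with a batch size of two would be sent as one batch of two files followed by one batch of a single file.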
Once the upload phase is finished, up-to-date lineage is available in Alation. Only update and insert operations are performed during the upload phase; none of the objects previously ingested by Automatic Data Lineage are deleted.
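The update-and-insert behavior can be pictured with a small in-memory model. This is a sketch under assumptions: the catalog is represented as a plain dictionary keyed by a hypothetical object key, and `upsert` is an illustrative name, not part of either product.

```python
from typing import Dict


def upsert(catalog: Dict[str, dict], uploaded: Dict[str, dict]) -> Dict[str, int]:
    """Apply uploaded objects to catalog: update existing keys, insert new ones."""
    counts = {"inserted": 0, "updated": 0}
    for key, obj in uploaded.items():
        if key in catalog:
            counts["updated"] += 1
        else:
            counts["inserted"] += 1
        catalog[key] = obj
    # Keys present in catalog but absent from uploaded are left untouched:
    # nothing is ever deleted.
    return counts
```

An object that was ingested earlier but is missing from the current export therefore remains in the catalog after the upload.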
The following diagram illustrates all the interactions between Automatic Data Lineage and Alation in greater detail.