Manta Flow Collibra User Documentation
Definition of Exported Entities
Manta Flow analyzes SQL scripts (procedures, view definitions, macros, ad-hoc scripts, etc.) and database structures as well as ETL, analytical, reporting, and modeling tools. It then imports to Collibra DGC both database structures and metadata about interesting SQL statements, ETL transformations, and analytical models and reports. The export is based on the latest revision in the Manta Data Lineage repository.
Manta Flow creates/updates:
- Assets for all database dictionary objects (for example, tables, views) as well as ETL, analytical, reporting, and modeling tool objects
- Assets for database transformation objects. If there is a sequence of transformations (for example, a procedure calling a function that calls another function), only the last applied transformation is exported (in this case, the procedure), and the lineage through the functions is contracted without listing the specific intermediate transformation objects
- Hierarchical relations for all assets created/updated
- Relations mapping logical data dictionary assets to physical data dictionary assets
- Relations / complex relations for all data lineage relationships
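The contraction of intermediate transformations described above can be pictured as a small graph operation. The following is only an illustrative sketch, not Manta's actual implementation: the function name contract_lineage, the edge representation, and all object names (SRC_TABLE, fn_inner, fn_outer, proc_import, TGT_TABLE) are hypothetical.

```python
from collections import defaultdict


def contract_lineage(edges, hidden):
    """Return lineage edges between exported nodes only.

    edges  -- iterable of (source, target) pairs (hypothetical representation)
    hidden -- set of intermediate nodes that are not exported
    A path that runs only through hidden nodes collapses into a single edge.
    """
    successors = defaultdict(set)
    for source, target in edges:
        successors[source].add(target)

    all_nodes = {node for edge in edges for node in edge}
    contracted = set()
    for start in all_nodes:
        if start in hidden:
            continue  # hidden nodes never appear as edge endpoints
        stack = list(successors[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            if node in hidden:
                # Walk through the intermediate node without recording it.
                stack.extend(successors[node])
            else:
                contracted.add((start, node))
    return contracted


# Hypothetical lineage: a procedure fed by two nested function calls.
edges = [
    ("SRC_TABLE", "fn_inner"),
    ("fn_inner", "fn_outer"),
    ("fn_outer", "proc_import"),
    ("proc_import", "TGT_TABLE"),
]
hidden = {"fn_inner", "fn_outer"}
result = contract_lineage(edges, hidden)
```

In this sketch, the two functions disappear from the result and the source table is linked directly to the procedure, which mirrors the behavior described for exported transformation assets.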
Export Model
The following diagram illustrates the domain model of Collibra DGC assets, relations, and complex relations used by Manta Data Lineage. The default asset types, relation types, and complex relation types used in this model can be changed in the configuration. For more information, see Collibra DGC Entity Types.
Browsing Assets
All assets are imported to the communities and domains that are configured for export. By default, the whole Manta Data Lineage repository is exported to one DGC community, and the assets are separated into the following five domains:
- Systems & Databases (technology assets)
- Physical Data Assets (physical data dictionary)
- Data Transformation (mapping domain)
- Logical Model (logical data dictionary)
- Reports (report catalog)
The Export Model shows which assets can be exported to which communities.
- To browse the assets, open the domain containing a particular asset (for example, the Physical Data Assets domain in the Controlling community).
- The list of assets in the domain can be either flat or hierarchical. The hierarchy button can be used to set up the hierarchy of assets in the view. The view in the image below has the following hierarchy of assets: Schema contains Table, Table contains Column.
- The asset overview screen can be opened by clicking the name of the asset. Below is an overview of a Mapping Specification asset representing an Oracle stored procedure. The asset's attributes and relations are visible in the overview. In this case, there are Source and Target relations representing the table-level lineage and Field Mapping complex relations representing the column-level lineage (and also containing the extracted transformation logic) for the stored procedure Import. Transformation logic is created by collecting transformations (for example, in ETL tools) or calls of functions and procedures (for example, in SQL).
Lineage Diagram
A Collibra diagram can be shown for any asset in Collibra DGC. For a lineage diagram, the typical starting assets are:
- Table-level data assets (for example, Table, View, Report) or column-level data assets (for example, Column, Report Attribute)
- Table-level logical data assets (Data Entity) or column-level logical data assets (Data Attribute)
- Transformation assets (Mapping Specifications)
The diagrams in Collibra are highly customizable. Manta Data Lineage provides a set of predefined diagrams that are suitable for displaying data lineage.
- The (lineage) diagram can be opened by selecting Diagram on the asset overview screen.
- The type of diagram shown can be changed by using the Diagram Combobox, and the settings of the current diagram can be changed by using the Edit Diagram View Configuration button.
- The Diagram View being used might not contain all the relations that are defined for the displayed assets. In such cases, these relations and their source/target assets can be shown by using the Explore context menu for a particular asset.
- The Diagram (or a different Diagram View) can also be restarted from a particular asset.
Export Statistics
The export to Collibra generates a report on the number of times each rule in collibraExportMantaMapping.csv was used and the number of cases where no matching rule was found for a node in Manta Data Lineage. The report is saved as a JSON file located in mantaflow\cli\output\collibra\exportReport.json.
The following is an example of a report for an ORCL database with HR and a DB2 database with IADB, each using a default mapping to a Collibra resource.
{
"/DB2/IADB":1,
"/ORCL/HR":232,
"No mapping rule found":3
}
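A report in this shape is easy to post-process, for instance to check how many nodes were left without a mapping rule. The following is a minimal sketch that assumes the report is a flat JSON object exactly as in the example above; the helper names load_report and summarize_report are hypothetical and not part of Manta Flow.

```python
import json
from pathlib import Path

# Key used in the example report for nodes that matched no mapping rule.
NO_MATCH_KEY = "No mapping rule found"


def load_report(path):
    """Read the export report from a JSON file (path supplied by the caller)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_report(report):
    """Split the report into per-rule counts and the unmatched count."""
    unmatched = report.get(NO_MATCH_KEY, 0)
    matched = {key: count for key, count in report.items() if key != NO_MATCH_KEY}
    total = unmatched + sum(matched.values())
    return {"matched": matched, "unmatched": unmatched, "total": total}


# The example report from this section, inlined so the sketch is self-contained.
sample = json.loads(
    '{"/DB2/IADB": 1, "/ORCL/HR": 232, "No mapping rule found": 3}'
)
summary = summarize_report(sample)
print(f"{summary['unmatched']} of {summary['total']} nodes had no mapping rule")
```

In practice, load_report would be pointed at the exportReport.json file mentioned above. A nonzero unmatched count suggests that the rules in collibraExportMantaMapping.csv do not cover every node in the repository.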
Deleted/Obsolete Object Handling
Depending on whether the Integration API or the Synchronization API is used, the export handles deleted/non-existent objects and flows differently.
- The Synchronization API automatically removes from Collibra any objects (for example, tables and stored procedures) that were imported from Manta Data Lineage but are no longer available in the source system or in the current revision in Manta Data Lineage.
- The Integration API does not delete any objects. Depending on how your Manta Data Lineage and Collibra administrators configured the integration, it is possible that the deleted objects have a status set to a specific value (for example, Deleted, Deprecated, Obsolete, and so on). However, these assets still exist in Collibra so that they can be governed. As they no longer participate in your data flows, you are free to remove them if that's what your governance practice requires.
Review your export configuration with your Manta Data Lineage and Collibra administrators.
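The difference between the two modes can be summarized in a few lines. The following is a conceptual sketch only and does not call any Manta or Collibra API: the function plan_obsolete_handling, the mode strings, the action tuples, and the asset names are all hypothetical, and the status value depends on how the integration is configured.

```python
def plan_obsolete_handling(previous, current, mode, status="Obsolete"):
    """List the actions for assets that were exported earlier but are now gone.

    previous -- names of assets imported by an earlier export
    current  -- names of assets present in the current revision
    mode     -- "synchronization" or "integration" (hypothetical labels)
    status   -- status value assumed to be configured for the Integration API
    """
    missing = sorted(set(previous) - set(current))
    if mode == "synchronization":
        # Assets that no longer exist are removed from Collibra.
        return [("remove", name) for name in missing]
    if mode == "integration":
        # Nothing is deleted; the asset may only receive a status value.
        return [("set_status", name, status) for name in missing]
    raise ValueError(f"unknown mode: {mode}")


previous = ["HR.EMPLOYEES", "HR.JOBS", "HR.OLD_IMPORT"]
current = ["HR.EMPLOYEES", "HR.JOBS"]

sync_plan = plan_obsolete_handling(previous, current, "synchronization")
integration_plan = plan_obsolete_handling(previous, current, "integration")
```

With the Integration API, any cleanup of the flagged assets remains a manual governance decision, as described above.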