Manta Flow Informatica EDC Integration
Manta Flow integrates with Informatica EDC through a set of CSV files that are produced by a Manta Flow process and loaded into Informatica EDC resources. There are two types of resources in EDC where metadata from Manta Flow is stored.
- Manta Resource — a resource based on Manta Model containing all the transformation objects like scripts, stored procedures, functions, statements, etc.
- Custom Lineage — a resource that contains links between EDC database management resources (like Teradata, Oracle, Microsoft SQL Server, IBM Db2, IBM Netezza) and Manta resources
Manta Model
To be able to load metadata from Manta Flow into Informatica EDC, it is first necessary to import Manta Model. To do this, follow these steps in the Informatica Catalog Administrator.
- Click on the Manage menu and select Models.
- Click on the arrow in the left panel and select New Custom Model.
- Select the model XML file from the <MANTAFLOW_INSTALLATION_DIRECTORY>/cli/scenarios/manta-dataflow-cli/model folder, where the model XML file is:
  - MantaEdcModel_EDC_prior_to_10.4.1.xml for EDC versions prior to 10.4.1
  - MantaEdcModel.xml otherwise
- You should see Manta Model in the list of custom models in the left panel.
If Manta Model already exists, you can update it by selecting the model from the list, clicking on the Update button, and selecting the MantaEdcModel.xml file, or the MantaEdcModel_EDC_prior_to_10.4.1.xml file for EDC versions prior to 10.4.1.
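The choice between the two model files depends only on the EDC version, so it can be written as a small version comparison. The following is a minimal sketch; the helper names `parse_version` and `select_model_file` are hypothetical and not part of Manta Flow or EDC, and the sketch assumes the EDC version is available as a purely numeric dotted string.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '10.4.1' into (10, 4, 1)."""
    return tuple(int(part) for part in version.split("."))


def select_model_file(edc_version: str) -> str:
    """Return the model XML file name that matches the given EDC version."""
    # Versions prior to 10.4.1 need the dedicated legacy model file.
    if parse_version(edc_version) < (10, 4, 1):
        return "MantaEdcModel_EDC_prior_to_10.4.1.xml"
    return "MantaEdcModel.xml"


print(select_model_file("10.4.0"))
print(select_model_file("10.4.1"))
```

The same file name applies whether the model is imported for the first time or updated later.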
Manta Model contains these classes.
- Group — Group for any kind of input; is under a resource or another group object
- Package — SQL package
- Database — Database in which DDL scripts are stored
- Schema — Schema in which DDL scripts are stored
- Directory — Directory in which DDL scripts are stored
- Input — Any kind of input; is under a group object
- Script — SQL script
- Function — SQL Function
- Procedure — SQL stored procedure
- Trigger — SQL trigger
- Operation — Any kind of operation on data; is under an input object
- Statement — SQL statement that works with data
- Leaf — Any kind of data processor; is under an operation object
- Column — SQL column that works with data
As of EDC 10.4.1, Manta Model contains these additional classes.
- Parameter — Formal parameter of a procedure or function
- ProcessInstance — Instance of a procedure or function call
- ProcessInstanceParameter — Instance of a procedure or function call parameter
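The "is under" relationships in the class list above can be read as containment rules. The sketch below encodes only the four relationships that the list states explicitly (Group, Input, Operation, Leaf). The names `ALLOWED_PARENTS`, `can_contain`, and `is_valid_path` are hypothetical, and "Resource" stands for the EDC resource itself rather than a model class. How the more specific classes (Package, Script, Statement, Column, and so on) fit into this hierarchy is not stated in the list and is left out on purpose.

```python
from typing import Dict, Sequence, Set

# Allowed parents for each generic class, taken from the "is under" notes above.
ALLOWED_PARENTS: Dict[str, Set[str]] = {
    "Group": {"Resource", "Group"},  # under a resource or another group
    "Input": {"Group"},              # under a group object
    "Operation": {"Input"},          # under an input object
    "Leaf": {"Operation"},           # under an operation object
}


def can_contain(parent: str, child: str) -> bool:
    """Return True if `child` may sit directly under `parent`."""
    return parent in ALLOWED_PARENTS.get(child, set())


def is_valid_path(path: Sequence[str]) -> bool:
    """Check every parent/child pair along a containment path."""
    return all(can_contain(p, c) for p, c in zip(path, path[1:]))


print(is_valid_path(["Resource", "Group", "Group", "Input", "Operation", "Leaf"]))
print(is_valid_path(["Resource", "Input"]))
```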
Manta Resource Type
Based on Manta Model, it is necessary to create a Manta resource type. To do this, follow these steps in the Informatica Catalog Administrator.
- Click on the Manage menu and select Custom Resource Types.
- Click on the plus button in the left panel.
- Fill in the form like this.
  - Name: Manta Resource (please capitalize it exactly as it is written here)
  - Description: Data transformations provided by Manta
  - Model: com.getmanta.edc
  - Connection types: leave blank
- Click on the OK button and you should see Manta Resource in the list of Custom Resource Types in the left panel.
Database Management Resources
For each database resource analyzed by Manta Flow, you need to first add a corresponding database management resource in Informatica EDC with a similar database/schema selection. To do this, follow these steps in the Informatica Catalog Administrator.
- Click on the New menu and select Resource.
- On the General page, enter the Name exactly as ${< technology >.dictionary.id} configured in Manta Flow for this database resource (e.g., Oracle_ODS) and select the corresponding resource type (like Teradata, Oracle, Microsoft SQL Server, IBM Db2, or IBM Netezza). Fill out any additional attributes you need.
- On the Metadata Load Settings page, check Enable Source Metadata and select all the schemas that are configured to be included in Manta Flow for this database resource. Fill out any additional attributes you need.
- Configure the last pages as necessary and Save the resource. If you want to schedule the load of this resource, always be sure to run it before loading any Manta resources.
- Before loading any Manta resources, load this one by clicking on Run.
Manta Script and Manta Link Resources
Automated Manta Resource Creation in EDC
If enabled in iedcExportCommon.properties (property manta.iedc.autoCreateResource), Automatic Data Lineage can automatically create the Manta script resource and, optionally, the Manta link resource in Informatica EDC as part of the metadata upload into Informatica EDC (see Manta Flow Informatica EDC Export Execution). This happens when Manta Flow Client runs < technology >IedcUploadMetadataMasterScenario.bat (on Windows) or < technology >IedcUploadMetadataMasterScenario.sh (on UNIX-like systems).
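As an illustration of the naming pattern for the upload scenario, the sketch below builds the script file name for a given technology and operating system. The function name `upload_scenario_script` is hypothetical, and the technology value `oracle` in the example is an assumed placeholder value; check the actual file names in your Manta Flow Client installation.

```python
import platform


def upload_scenario_script(technology: str, system: str = "") -> str:
    """Return the upload scenario script name for the given technology.

    `system` follows platform.system() values such as 'Windows' or 'Linux';
    when it is empty, the current operating system is used.
    """
    system = system or platform.system()
    # .bat on Windows, .sh on UNIX-like systems.
    extension = ".bat" if system == "Windows" else ".sh"
    return f"{technology}IedcUploadMetadataMasterScenario{extension}"


# 'oracle' is an assumed example value for the technology placeholder.
print(upload_scenario_script("oracle", "Windows"))
print(upload_scenario_script("oracle", "Linux"))
```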
If detailed lineage is enabled in iedcExportCommon.properties (property manta.iedc.exportDetailedLineage), the ETL Resource checkbox of the script resource created in EDC is checked. If detailed lineage is disabled, the checkbox is unchecked.
Alternatively, both resources can be created manually.
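To check how the two properties interact before running an upload, the settings can be read with a few lines of code. The sketch below is a simplified reader, not Manta Flow's own parser. It assumes both properties take the values `true` or `false` and that a missing property counts as `false`; both are assumptions to verify against your installation. The sample content is a hypothetical excerpt of iedcExportCommon.properties.

```python
from typing import Dict


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def is_enabled(props: Dict[str, str], key: str) -> bool:
    """Treat a missing key as disabled (an assumption, see the note above)."""
    return props.get(key, "false").lower() == "true"


# Hypothetical excerpt of iedcExportCommon.properties.
SAMPLE = """
manta.iedc.autoCreateResource=true
manta.iedc.exportDetailedLineage=false
"""

props = parse_properties(SAMPLE)
auto_create = is_enabled(props, "manta.iedc.autoCreateResource")
# The ETL Resource checkbox only matters for automatically created script resources.
etl_resource_checked = auto_create and is_enabled(props, "manta.iedc.exportDetailedLineage")

print("Resources created automatically:", auto_create)
print("ETL Resource checkbox checked:", etl_resource_checked)
```

The reader handles only `=` as a separator, which is enough for the two properties shown here.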
Manta Script Resource (Manually)
For each database resource analyzed by Manta Flow, you need to create a corresponding script resource based on the Manta Resource type that contains all its transformation objects. To do this, follow these steps in the Informatica Catalog Administrator.
- Click on the New menu and select Resource.
- On the General page, enter the Name exactly as ${manta.resource.iedc.scripts}, which is by default equal to ${< technology >.dictionary.id}Scripts configured in Manta Flow for this database resource (e.g., 'Oracle_ODSScripts', where 'Oracle_ODS' is the dictionary ID and Scripts is appended) and select the Manta Resource type.
- In Connection Properties:
  - If applicable, click Choose and select the objects.zip file from the corresponding output folder; for example, for Db2, the objects.zip file will be located in C:\mantaflow\cli\output\db2\DICTIONARYID\iedc.
  - If it exists in your version of EDC:
    - Check the box ETL Resource to enable detailed lineage.
    - Keep the box ETL Resource unchecked to disable detailed lineage.
- On the Metadata Load Settings page:
  - Check Enable Source Metadata.
  - Set Memory to High based on the expected volume of metadata to be loaded.
  - Fill out any additional attributes you need.
- Configure the last pages as necessary and Save the resource. If you want to schedule the load of this resource, always be sure to run it after loading the corresponding database management resource and before loading the Manta link resource.
- After loading the corresponding database management resource, load this one by clicking on Run.
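The default script resource name and the location of objects.zip both follow from the dictionary ID, which the sketch below makes explicit. The function names are hypothetical. The path layout is generalized from the Db2 example above, so applying it to other technologies is an assumption. If ${manta.resource.iedc.scripts} has been changed from its default, use the configured value instead.

```python
from pathlib import PureWindowsPath


def script_resource_name(dictionary_id: str) -> str:
    """Default script resource name: the dictionary ID with 'Scripts' appended."""
    return f"{dictionary_id}Scripts"


def objects_zip_path(install_dir: str, technology: str, dictionary_id: str) -> PureWindowsPath:
    """Build the objects.zip location, following the layout of the Db2 example."""
    return PureWindowsPath(
        install_dir, "cli", "output", technology, dictionary_id, "iedc", "objects.zip"
    )


print(script_resource_name("Oracle_ODS"))
print(objects_zip_path(r"C:\mantaflow", "db2", "DICTIONARYID"))
```

`PureWindowsPath` is used so that the Windows-style path is built the same way regardless of the system the sketch runs on.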
Manta Link Resource (Manually)
For each database resource analyzed by Manta Flow, you need to create a corresponding link resource based on the custom lineage type that contains links between database tables and columns and transformation objects from the Manta script resource. To do this, follow these steps in the Informatica Catalog Administrator.
- Click on the New menu and select Resource.
- On the General page, enter the Name exactly as ${manta.resource.iedc.links}, which is by default equal to ${< technology >.dictionary.id}Links configured in Manta Flow for this database resource (e.g., 'Oracle_ODSLinks', where 'Oracle_ODS' is the dictionary ID and Links is appended) and select the Custom Lineage type.
- In Connection Properties, click on Choose and select the customLinks.csv file from the corresponding output folder.
  - For example, for Db2, the customLinks.csv file will be located in C:/mantaflow/cli/output/db2/<DICTIONARYID>/iedc.
- On the Metadata Load Settings page:
  - Check Enable Source Metadata.
  - Set Memory to Medium or High based on the expected volume of metadata to be loaded.
  - Fill in any additional attributes you need.
- Configure the last pages as necessary and Save the resource. If you want to schedule the load of this resource, always be sure to run it after loading the corresponding database management resource and Manta script resource.
- After loading the corresponding database management resource and Manta script resource, load this one by clicking on Run.
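The three resources for one database must always be loaded in the same order: the database management resource first, then the Manta script resource, then the Manta link resource. The sketch below lists that order for a given dictionary ID, assuming the default resource names. `ResourceLoad` and `load_order` are hypothetical names used only for illustration; the sketch does not call any EDC API.

```python
from typing import List, NamedTuple


class ResourceLoad(NamedTuple):
    name: str            # resource name as entered on the General page
    resource_type: str   # resource type selected in EDC


def load_order(dictionary_id: str, database_type: str) -> List[ResourceLoad]:
    """Return the resources for one dictionary ID in the order they must be run."""
    return [
        ResourceLoad(dictionary_id, database_type),                   # 1. database management resource
        ResourceLoad(f"{dictionary_id}Scripts", "Manta Resource"),    # 2. Manta script resource
        ResourceLoad(f"{dictionary_id}Links", "Custom Lineage"),      # 3. Manta link resource
    ]


for step, load in enumerate(load_order("Oracle_ODS", "Oracle"), start=1):
    print(f"{step}. Run {load.name} ({load.resource_type})")
```

The same order applies to scheduled loads: schedule each resource so that it starts only after the previous one has finished.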