Manta Flow Informatica EDC Integration

Manta Flow integrates with Informatica EDC through a set of CSV files that are produced by a Manta Flow process and loaded into Informatica EDC resources. There are two types of resources in EDC where metadata from Manta Flow is stored.

Manta Model

To be able to load metadata from Manta Flow into Informatica EDC, it is first necessary to import Manta Model. To do this, follow these steps in the Informatica Catalog Administrator.

  1. Click on the Manage menu and select Models.
  2. Click on the arrow in the left panel and select New Custom Model.
  3. Select the model XML file from the <MANTAFLOW_INSTALLATION_DIRECTORY>/cli/scenarios/manta-dataflow-cli/model folder, where the model XML file is:
    • MantaEdcModel_EDC_prior_to_10.4.1.xml: for EDC versions prior to 10.4.1
    • MantaEdcModel.xml: for EDC 10.4.1 and later
  4. You should see Manta Model in the list of custom models in the left panel.

If Manta Model already exists, you can update it by selecting the model from the list, clicking on the Update button, and selecting the MantaEdcModel.xml file or MantaEdcModel_EDC_prior_to_10.4.1.xml (EDC versions prior to 10.4.1).
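The choice between the two model files can be expressed as a simple version comparison. The following is a minimal sketch, assuming the EDC version is available as a dotted numeric string such as "10.4.1"; the function name select_model_file is hypothetical and is not part of Manta Flow or EDC.

```python
# Hypothetical helper: picks the model XML file name for a given EDC version.
LEGACY_MODEL = "MantaEdcModel_EDC_prior_to_10.4.1.xml"
CURRENT_MODEL = "MantaEdcModel.xml"


def select_model_file(edc_version: str) -> str:
    """Return the model file name for a dotted EDC version string."""
    parts = [int(p) for p in edc_version.split(".")]
    # Pad to three components so that "10.4" is compared as 10.4.0.
    parts += [0] * (3 - len(parts))
    # Only the first three components matter for the 10.4.1 threshold.
    if tuple(parts[:3]) < (10, 4, 1):
        return LEGACY_MODEL
    return CURRENT_MODEL
```

For example, select_model_file("10.2.2") returns the legacy file name, while select_model_file("10.4.1") and select_model_file("10.5") return MantaEdcModel.xml.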

Manta Model contains these classes.

As of EDC 10.4.1, Manta Model contains these additional classes.

Manta Resource Type

Based on Manta Model, it is necessary to create a Manta resource type. To do this, follow these steps in the Informatica Catalog Administrator.

  1. Click on the Manage menu and select Custom Resource Types.
  2. Click on the plus button in the left panel.
  3. Fill in the form like this.
    1. Name: Manta Resource (please capitalize it exactly as it is written here)
    2. Description: Data transformations provided by Manta
    3. Model: com.getmanta.edc
    4. Connection types: leave blank
  4. Click on the OK button and you should see Manta Resource in the list of Custom Resource Types in the left panel.

Database Management Resources

For each database resource analyzed by Manta Flow, you need to first add a corresponding database management resource in Informatica EDC with a similar database/schema selection. To do this, follow these steps in the Informatica Catalog Administrator.

  1. Click on the New menu and select Resource.
  2. On the General page, enter the Name exactly as ${<technology>.dictionary.id} configured in Manta Flow for this database resource (e.g., Oracle_ODS) and select the corresponding resource type (like Teradata, Oracle, Microsoft SQL Server, IBM Db2, or IBM Netezza). Fill out any additional attributes you need.
  3. On the Metadata Load Settings page, check Enable Source Metadata and select all the schemas that are configured to be included in Manta Flow for this database resource. Fill out any additional attributes you need.
  4. Configure the last pages as necessary and Save the resource. If you want to schedule the load of this resource, always be sure to run it before loading any Manta resources.
  5. Before loading any Manta resources, load this one by clicking on Run.

Manta Script and Manta Link Resources

Note: IBM Automatic Data Lineage does not keep the ETL Resource checkbox of the Manta script resource in EDC in sync with the Detailed Lineage export setting in Automatic Data Lineage. Once the script resource has been created (automatically or manually) with the ETL Resource checkbox unchecked, uploading detailed lineage into EDC does not check the checkbox, and the EDC upload fails. The upload also fails if detailed lineage is disabled and the target resource has the ETL Resource checkbox checked. The Manta Admin needs to keep the two settings in sync.
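The rule in the note above amounts to requiring that the two settings are equal. The following is a minimal sketch of such a pre-upload check; the function name and messages are hypothetical, and the two boolean values are assumed to have been looked up by the Manta Admin (the sketch does not query EDC or Automatic Data Lineage).

```python
from typing import Optional


def etl_checkbox_mismatch(export_detailed_lineage: bool,
                          etl_resource_checked: bool) -> Optional[str]:
    """Return a description of the mismatch, or None if the settings agree."""
    if export_detailed_lineage and not etl_resource_checked:
        return ("Detailed lineage export is enabled, but the ETL Resource "
                "checkbox of the script resource is unchecked.")
    if not export_detailed_lineage and etl_resource_checked:
        return ("Detailed lineage export is disabled, but the ETL Resource "
                "checkbox of the script resource is checked.")
    # Both settings agree, so this particular cause of upload failure is ruled out.
    return None
```

A result other than None indicates that one of the two settings should be changed before the upload is run.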

Automated Manta Resource Creation in EDC

If enabled in iedcExportCommon.properties (property manta.iedc.autoCreateResource), Automatic Data Lineage can automatically create Manta script and, optionally, Manta link resources in Informatica EDC as part of the metadata upload into Informatica EDC (see Manta Flow Informatica EDC Export Execution). The resources are created when Manta Flow Client runs <technology>IedcUploadMetadataMasterScenario.bat (on Windows) or <technology>IedcUploadMetadataMasterScenario.sh (on UNIX-like systems).

Note: EDC resource names are case-insensitive. In other words, if a resource with the same name but different case already exists in EDC, the new one won’t be created and the upload will fail.
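A name that differs from an existing resource only by case can be detected before the upload. The following is a minimal sketch, assuming the list of existing EDC resource names has already been obtained by some other means; the function name is hypothetical, and the use of casefold() is an assumption about how EDC compares names.

```python
from typing import Iterable, Optional


def find_case_conflict(new_name: str,
                       existing_names: Iterable[str]) -> Optional[str]:
    """Return an existing name that matches new_name except for case."""
    wanted = new_name.casefold()
    for existing in existing_names:
        # An identical name is the same resource, not a case conflict.
        if existing != new_name and existing.casefold() == wanted:
            return existing
    return None
```

For example, with an existing resource named oracle_odsscripts, checking the new name Oracle_ODSScripts returns oracle_odsscripts, which signals that automatic creation would not succeed.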

If detailed lineage is enabled in iedcExportCommon.properties (property manta.iedc.exportDetailedLineage), the ETL Resource checkbox of the script resource created in EDC is checked. If detailed lineage is disabled, the checkbox is unchecked.
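The two properties mentioned above can be read with a few lines of code to predict how the script resource will be created. The following is a minimal sketch under several assumptions: the property values are written as true or false, an absent property is treated as false, and the parser handles only simple key=value lines (it is not a full Java properties parser). The sample text is a hypothetical excerpt, not the shipped file.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def flag(props: dict, name: str) -> bool:
    """Interpret a property as a boolean (assumed default: false)."""
    return props.get(name, "false").lower() == "true"


# Hypothetical excerpt of iedcExportCommon.properties.
SAMPLE = """
manta.iedc.autoCreateResource=true
manta.iedc.exportDetailedLineage=false
"""

props = parse_properties(SAMPLE)
auto_create = flag(props, "manta.iedc.autoCreateResource")
# With automatic creation, the ETL Resource checkbox follows this setting.
etl_resource_checked = flag(props, "manta.iedc.exportDetailedLineage")
```

With the sample values, resources are created automatically and the script resource is created with the ETL Resource checkbox unchecked.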

Alternatively, both resources can be created manually.

Manta Script Resource (Manually)

For each database resource analyzed by Manta Flow, you need to create a corresponding script resource based on the Manta Resource type that contains all its transformation objects. To do this, follow these steps in the Informatica Catalog Administrator.

  1. Click on the New menu and select Resource.
  2. On the General page, enter the Name exactly as ${manta.resource.iedc.scripts}, which is by default equal to ${<technology>.dictionary.id}Scripts configured in Manta Flow for this database resource (e.g., Oracle_ODSScripts, where Oracle_ODS is the dictionary ID and Scripts is appended), and select the Manta Resource type.
  3. In Connection Properties:
    1. If applicable, click Choose and select the objects.zip file from the corresponding output folder. For example, for Db2, the objects.zip file is located in C:\mantaflow\cli\output\db2\<DICTIONARYID>\iedc.
    2. If the ETL Resource box exists in your version of EDC, set it to match the lineage mode:
      • Check the box ETL Resource to enable detailed lineage.
      • Keep the box ETL Resource unchecked to disable detailed lineage.
  4. On the Metadata Load Settings page:
    1. Check Enable Source Metadata.
    2. Set Memory to Medium or High based on the expected volume of metadata to be loaded.
    3. Fill out any additional attributes you need.
  5. Configure the last pages as necessary and Save the resource. If you want to schedule the load of this resource, always be sure to run it after loading the corresponding database management resource and before loading the Manta link resource.
  6. After loading the corresponding database management resource, load this one by clicking on Run.

Manta Link Resource (Manually)

Note: Manta link resources are only used for standard lineage export to EDC; they are not used for detailed lineage export mode.

For each database resource analyzed by Manta Flow, you need to create a corresponding link resource based on the custom lineage type that contains links between database tables and columns and transformation objects from the Manta script resource. To do this, follow these steps in the Informatica Catalog Administrator.

  1. Click on the New menu and select Resource.
  2. On the General page, enter the Name exactly as ${manta.resource.iedc.links}, which is by default equal to ${<technology>.dictionary.id}Links configured in Manta Flow for this database resource (e.g., Oracle_ODSLinks, where Oracle_ODS is the dictionary ID and Links is appended), and select the Custom Lineage type.
  3. In Connection Properties, click on Choose and select the customLinks.csv file from the corresponding output folder.
    1. For example, for Db2, the customLinks.csv file is located in C:\mantaflow\cli\output\db2\<DICTIONARYID>\iedc.
  4. On the Metadata Load Settings page:
    1. Check Enable Source Metadata.
    2. Set Memory to Medium or High based on the expected volume of metadata to be loaded.
    3. Fill in any additional attributes you need.
  5. Configure the last pages as necessary and Save the resource. If you want to schedule the load of this resource, always be sure to run it after loading the corresponding database management resource and Manta script resource.
  6. After loading the corresponding database management resource and Manta script resource, load this one by clicking on Run.
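
The naming conventions and load order described in the sections above can be summarized in a short sketch. It assumes the default values of ${manta.resource.iedc.scripts} and ${manta.resource.iedc.links} (dictionary ID plus Scripts or Links) and generalizes the Db2 example path to other technologies, which is an assumption. The function name load_plan and the default output_root are illustrative only.

```python
from pathlib import PureWindowsPath


def load_plan(dictionary_id: str, technology: str,
              detailed_lineage: bool = False,
              output_root: str = r"C:\mantaflow\cli\output"):
    """Return (resource name, input file) pairs in the order they are loaded."""
    # Assumed layout: <output_root>\<technology>\<dictionary ID>\iedc
    iedc_dir = PureWindowsPath(output_root) / technology / dictionary_id / "iedc"
    plan = [
        # 1. Database management resource: loaded first, no Manta file needed.
        (dictionary_id, None),
        # 2. Manta script resource: loaded after the database resource.
        (dictionary_id + "Scripts", iedc_dir / "objects.zip"),
    ]
    if not detailed_lineage:
        # 3. Manta link resource: standard lineage export only, loaded last.
        plan.append((dictionary_id + "Links", iedc_dir / "customLinks.csv"))
    return plan


for name, source in load_plan("Oracle_ODS", "oracle"):
    print(name, source if source is not None else "(no file)")
```

With detailed_lineage=True the plan contains only the database management resource and the script resource, because link resources are not used in detailed lineage export mode.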