How to Import Additional Attributes for Existing Objects

Goal

It is often necessary to import additional attributes for existing objects in the IBM Automatic Data Lineage repository. These may include business-friendly names, data quality (DQ) metrics, data classification attributes, and other information that Automatic Data Lineage does not provide. Such attributes can then be displayed as active tags or user-supplied aliases.

Instructions

Importing Custom Attributes to Existing Objects

Using custom metadata, you can create custom objects and attributes in the Automatic Data Lineage repository. Here is an example of how to import three aliases and two custom attributes for objects that already exist in the Automatic Data Lineage repository. You may need to adjust the object paths to reference objects in your own Automatic Data Lineage repository.

  1. Go to the folder mantaflow/cli/input/import/<CONNECTION_ID>. Create it if it does not exist yet. Use the <CONNECTION_ID> that you specified when creating the new connection.

  2. Create the file node_attribute.csv with contents specifying the additional attributes that you want to ingest. The meaning of each column is described in Open Manta Extensions: Files and Formats.

    node_attribute.csv

    "/SAP HANA/b0ccc0384-5e4d-4b52-a3e0-2b34fc9619fe","MANTA_ALIAS","MySapHana Server Alias"
    "/SAP HANA/b0ccc0384-5e4d-4b52-a3e0-2b34fc9619fe/H00","MANTA_ALIAS","MySapHana Database Alias"
    "/SAP HANA/b0ccc0384-5e4d-4b52-a3e0-2b34fc9619fe/H00/SCHEMA_WITH_SYNONYM","MANTA_ALIAS","SchemaWithSynonym Alias"
    "/SAP HANA/b0ccc0384-5e4d-4b52-a3e0-2b34fc9619fe/H00/SCHEMA_WITH_SYNONYM/Customer","Classification","PII"
    "/SAP HANA/b0ccc0384-5e4d-4b52-a3e0-2b34fc9619fe/H00/SCHEMA_WITH_SYNONYM/Customer","DQScore","90"
    

    A few additional notes on formatting:

    • Attribute names can be arbitrary for the custom attributes.

    • Some attribute names have a special meaning; for example, MANTA_ALIAS is used for user-supplied aliases (see Display User-Supplied Aliases).

    • The first column (node_id) is the path to an object that already exists in the repository (typically created by an out-of-the-box (OOTB) scanner or by another import).

    • Ensure proper quoting and escaping of characters with special meaning (commas and double quotes).

Make sure that the file is delimited by commas.
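
The two steps above can also be scripted. The following is a minimal Python sketch, not an official tool: it creates the import folder if needed and writes node_attribute.csv with every field quoted. The install location (mantaflow relative to the current directory), the connection ID my_connection, the function name write_node_attributes, and the UTF-8 encoding are assumptions made for illustration. Python's csv module escapes an embedded double quote by doubling it; check Open Manta Extensions: Files and Formats to confirm that this matches the escaping the importer expects.

```python
import csv
from pathlib import Path

# Rows taken from the example above: (node path, attribute name, attribute value).
SERVER = "/SAP HANA/b0ccc0384-5e4d-4b52-a3e0-2b34fc9619fe"
EXAMPLE_ROWS = [
    (SERVER, "MANTA_ALIAS", "MySapHana Server Alias"),
    (SERVER + "/H00/SCHEMA_WITH_SYNONYM/Customer", "Classification", "PII"),
    (SERVER + "/H00/SCHEMA_WITH_SYNONYM/Customer", "DQScore", "90"),
]


def write_node_attributes(rows, manta_home="mantaflow", connection_id="my_connection"):
    """Write attribute rows to <manta_home>/cli/input/import/<connection_id>/node_attribute.csv.

    manta_home and connection_id are placeholders (assumptions); adjust them
    to your installation and to the connection you created.
    """
    target_dir = Path(manta_home) / "cli" / "input" / "import" / connection_id
    # Create the folder if it does not exist yet (step 1).
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "node_attribute.csv"
    # newline="" lets the csv module control line endings itself.
    # The UTF-8 encoding is an assumption.
    with target.open("w", newline="", encoding="utf-8") as handle:
        # QUOTE_ALL wraps every field in double quotes, so commas inside a
        # value cannot be mistaken for column delimiters.
        writer = csv.writer(handle, quoting=csv.QUOTE_ALL, lineterminator="\n")
        writer.writerows(rows)
    return target


if __name__ == "__main__":
    print(write_node_attributes(EXAMPLE_ROWS))
```

Generating the file this way avoids hand-escaping values that contain commas or double quotes.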

Loading Custom Metadata and Lineage into Automatic Data Lineage

The files created in the previous steps are automatically ingested during the lineage analysis run triggered in Process Manager in Admin UI.

However, for a quick test, it is often more convenient to run only the import. You can do this manually by running the following scenarios one by one, or by creating a workflow in Process Manager that contains them.

  1. New Minor Revision Scenario: to open a new minor revision and add the custom lineage to the last existing revision
  2. Import Dataflow Scenario: to ingest node_attribute.csv
  3. Commit Revision Scenario: to persist data into the repository; if you see any errors from the previous steps, you can run Rollback Revision Scenario instead to revert the changes and start again
  4. Review the custom lineage import logs.

If you are not happy with the result and need to repeat the above steps, run Delete Revision Scenario to remove the newly added minor revision and start again. Note that both the deletion and the rollback may take some time to complete, depending on the repository size.

Review Logs

Any errors reported during lineage import will help you identify any issues with the input file format. The logs are located under the LogViewer tab in Admin UI or directly on the filesystem.
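
Some format problems can be caught before the import runs at all. The following is a minimal Python sketch of such a pre-check, not part of the product: it assumes, based on the example file in this article, that every record has exactly three columns and that node paths start with a slash. The function name check_node_attributes is hypothetical, and the authoritative format rules are those in Open Manta Extensions: Files and Formats.

```python
import csv
import io


def check_node_attributes(text):
    """Return a list of (line_number, message) problems found in the CSV text.

    The rules checked here are assumptions derived from the example file:
    three columns per record, a node path starting with "/", and a
    non-empty attribute name.
    """
    problems = []
    reader = csv.reader(io.StringIO(text))
    for row in reader:
        # line_num is the number of the last source line read for this record.
        line_no = reader.line_num
        if not any(cell.strip() for cell in row):
            continue  # ignore blank or whitespace-only lines
        if len(row) != 3:
            problems.append((line_no, f"expected 3 columns, found {len(row)}"))
            continue
        node_path, name, _value = row
        if not node_path.startswith("/"):
            problems.append((line_no, "node path does not start with '/'"))
        if not name.strip():
            problems.append((line_no, "attribute name is empty"))
    return problems


if __name__ == "__main__":
    with open("node_attribute.csv", newline="", encoding="utf-8") as handle:
        for line_no, message in check_node_attributes(handle.read()):
            print(f"line {line_no}: {message}")
```

An unquoted comma inside a value shows up here as a record with four columns, which is the kind of quoting issue the import logs would otherwise report only after the run.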