How to Import Additional Attributes for Existing Objects
Goal
It is often necessary to import additional attributes for existing objects in the IBM Automatic Data Lineage repository, such as business-friendly names, DQ metrics, or data classification attributes that Automatic Data Lineage does not provide. Such additional attributes can be displayed as active tags or user-supplied aliases.
Instructions
Importing Custom Attributes to Existing Objects
Using custom metadata, you can create custom objects and attributes in the Automatic Data Lineage repository. The following example imports three aliases and two custom attributes for objects that already exist in the repository. You may need to adjust the object paths to reference objects in your own repository.
- Go to the folder mantaflow/cli/input/import/<CONNECTION_ID>. Create it if it does not exist yet. Use the <CONNECTION_ID> that you specified when creating the new connection.
- Create the file node_attribute.csv with contents specifying the additional attributes that you want to ingest. The meaning of each column is described in Open Manta Extensions: Files and Formats.
node_attribute.csv
"/SAP HANA/b0ccc0384-5e4d-4b52-a3e0-2b34fc9619fe","MANTA_ALIAS","MySapHana Server Alias"
"/SAP HANA/b0ccc0384-5e4d-4b52-a3e0-2b34fc9619fe/H00","MANTA_ALIAS","MySapHana Database Alias"
"/SAP HANA/b0ccc0384-5e4d-4b52-a3e0-2b34fc9619fe/H00/SCHEMA_WITH_SYNONYM","MANTA_ALIAS","SchemaWithSysnonym Alias"
"/SAP HANA/b0ccc0384-5e4d-4b52-a3e0-2b34fc9619fe/H00/SCHEMA_WITH_SYNONYM/Customer","Classification","PII"
"/SAP HANA/b0ccc0384-5e4d-4b52-a3e0-2b34fc9619fe/H00/SCHEMA_WITH_SYNONYM/Customer","DQScore","90"
A few additional notes on formatting:
- Attribute names for custom attributes can be arbitrary.
- Some attribute names have a special meaning, such as MANTA_ALIAS in the example above; see Display User-Supplied Aliases.
- The first column (node_id) is the path to an object that already exists in the repository (typically created by an OOTB scanner or by another import).
- Ensure proper quoting and escaping of characters with special meaning (commas and double quotes).
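Quoting mistakes are easy to make when the file is edited by hand, so it can help to generate it with a CSV library. The following is a minimal sketch, not part of Automatic Data Lineage: `write_node_attributes` and `EXAMPLE_ROWS` are hypothetical names, and it assumes the importer accepts UTF-8 text and the standard CSV convention of doubling embedded double quotes. Check Open Manta Extensions: Files and Formats for the authoritative escaping rules.

```python
import csv
from pathlib import Path

# Rows follow the three-column layout of the example above:
# (node_id path, attribute name, attribute value).
SERVER = "/SAP HANA/b0ccc0384-5e4d-4b52-a3e0-2b34fc9619fe"
EXAMPLE_ROWS = [
    (SERVER, "MANTA_ALIAS", "MySapHana Server Alias"),
    (f"{SERVER}/H00/SCHEMA_WITH_SYNONYM/Customer", "Classification", "PII"),
    (f"{SERVER}/H00/SCHEMA_WITH_SYNONYM/Customer", "DQScore", "90"),
]


def write_node_attributes(import_dir, rows):
    """Write rows to <import_dir>/node_attribute.csv and return the file path.

    Hypothetical helper for illustration only.
    """
    target_dir = Path(import_dir)
    # Create the connection folder if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "node_attribute.csv"
    with target.open("w", newline="", encoding="utf-8") as handle:
        # QUOTE_ALL wraps every field in double quotes; embedded double
        # quotes are doubled and commas inside a field are preserved.
        # (Assumption: the importer accepts this standard CSV escaping.)
        writer = csv.writer(handle, quoting=csv.QUOTE_ALL, lineterminator="\n")
        writer.writerows(rows)
    return target
```

Calling `write_node_attributes("mantaflow/cli/input/import/<CONNECTION_ID>", EXAMPLE_ROWS)`, with your own connection ID substituted, would produce a file in the same shape as the example above.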
Loading Custom Metadata and Lineage into Automatic Data Lineage
The files created in the previous steps are automatically ingested during the lineage analysis run triggered in Process Manager in Admin UI.
However, for a quick test, it is often more convenient to run only the import manually, either by running the scenarios below one by one or by creating a workflow from them in Process Manager.
- New Minor Revision Scenario: opens a new minor revision and adds the custom lineage to the last existing revision
- Import Dataflow Scenario: ingests node_attribute.csv
- Commit Revision Scenario: persists data into the repository; if you see any errors from the previous steps, you can also run Rollback Revision Scenario to revert and start again
- Review the custom lineage import logs.
If you are not happy with the result and need to repeat the above steps, run Delete Revision Scenario to remove the newly added minor revision and start again. Note that both deletion and rollback may take some time to complete, depending on the repository size.
Review Logs
Any errors reported during lineage import will help you identify any issues with the input file format. The logs are located under the LogViewer tab in Admin UI or directly on the filesystem.
- mantaflow/cli/logs/importDataflowMasterScenario.log: Review this log for node_attribute.csv parsing errors. The most common issues are related to invalid file structure or typos in the file.
- mantaflow/server/manta-dataflow-server-dir/logs/manta-dataflow.log: Review this log for object reference errors. The most common issues are related to importing attributes for objects that do not exist in the Automatic Data Lineage repository (e.g., a typo in the object path, or objects that no longer exist).
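Some structural problems can be caught before the import runs. The sketch below is a hypothetical pre-check, not a tool shipped with Automatic Data Lineage. It assumes every row has exactly three columns and that node_id paths start with "/", as in the example file; neither rule is confirmed by the format specification.

```python
import csv
import io


def check_node_attribute_rows(text):
    """Return a list of (line_number, message) problems found in CSV text.

    Hypothetical helper; the rules below are inferred from the example file.
    """
    problems = []
    reader = csv.reader(io.StringIO(text))
    for row in reader:
        line_no = reader.line_num
        if not row:
            continue  # ignore blank lines
        if len(row) != 3:
            problems.append((line_no, f"expected 3 columns, found {len(row)}"))
            continue
        node_id, attribute_name, _value = row
        # Assumption: object paths are absolute, as in the example file.
        if not node_id.startswith("/"):
            problems.append((line_no, "node_id does not start with '/'"))
        if not attribute_name.strip():
            problems.append((line_no, "attribute name is empty"))
    return problems
```

A check like this only covers the file structure. It cannot tell whether a path refers to an object that actually exists in the repository, so the server log remains the place to look for object reference errors.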