Open Manta Designer - User's manual for R42

About Open Manta Designer

Open Manta Designer (OMD for short) is a tool for creating metadata that augments the lineage produced automatically by IBM Automatic Data Lineage and that describes technologies its scanners cannot scan automatically. It replaces the previous method of manually creating custom metadata CSV configuration files (Open Manta Extensions). That method is still available, but OMD provides an intuitive tool for rapid prototyping that significantly speeds up the work and makes it less tedious and less error-prone.

Basic OMD principles

OMD consists of two basic screens, the so-called Homepage and Designer. The Homepage is the first screen a user sees after logging in. It contains an overview of all Connections that the user has created in the past or those that the user is currently working on.


Although OMD does not support simultaneous work by multiple users within a single Connection, a user with the appropriate permissions can still see or access other users' Connections, provided the data is saved to the OMD server.

In the properties section of each Connection, the user can see the name, description, status, and the date of the last modification.


Data for each Connection is stored either locally in the browser's storage, in which case the Connection has the status Unsaved, or in the OMD server's storage, in which case it is marked as Saved. Connections can also be deleted; all data for that Connection is then irretrievably deleted, regardless of whether it was stored locally or on the OMD server.

Some actions are performed directly by the OMD back-end, namely importing Nodes and Edges from the Flow Repository into OMD. Such data is automatically saved to the OMD server's storage, so the Unsaved flag is not used in such cases.

Connections can be created either from scratch or by importing existing Open Manta Extension CSV files or JSON Connection data. The CSV files might come from Open Manta Extension projects you have already worked on, while JSON is a means of sharing completed Connection data extracts with other team members. This also means that none of your previous work is lost and you don't have to create anything twice.

To open a Connection, click its name in the tile, or click the name of one of its Canvases, which are also listed in the Connection tile. Alternatively, you can open the Connection from the context menu.

The main Designer window then opens. Its main areas, described in the sections below, include the Repository Tree, the Canvas, and the Properties Editor.

The sections below explain how to create your Node hierarchies, import Nodes from the Flow Repository, create Edges, and take all the other steps you need to create your custom data lineage. Once it is defined, do not forget to save your Connection; otherwise, all the data is stored only in your local browser storage.

Using the Repository Tree for creating Nodes hierarchy

Before you start creating a data lineage that shows the flow of data between individual Nodes and technologies, you must first define which assets you want to work with. In the Repository Tree, you can create an entire hierarchy of assets yourself. You can also import Nodes and entire sub-hierarchies of Nodes from the Manta Flow repository if you want to extend the existing scanned metadata, enrich it with your own attributes, or continue with the manual creation of data lineage.

First, you have to create a Layer, to which you can then add Resources and Nodes. If you decide to import Nodes from the Manta Flow repository, the corresponding Layers and Resources are imported automatically. You can also delete assets from the Repository Tree or change their name and description. Names and descriptions are changed in the Properties Editor in the right part of the main Designer window, where you can rename Layers, Resources, and Nodes. In addition, you can see the attributes of Nodes and Edges that were either created in OMD or imported from the Manta Flow Repository.

After you have created a Layer, you can add Resources that represent particular technologies. When creating a Resource, you must choose its Name and Type. Based on the selected Type, OMD allows you to create only correct Node hierarchies, so you can be sure that the hierarchy is valid. When creating a new Resource, you can also select the Create Child node Hierarchy option. In that case, OMD generates a sample hierarchy that contains all possible Node types. You can then continue to work with this generated hierarchy and change it however you like by adding new Nodes or renaming and removing existing ones.

You can add Nodes to a specified Parent Node from the Repository Tree or the Canvas by using the + icon. Both options yield the same result. When adding Nodes of the same type, you can enter as many as needed, and all of them are created at once. To enter another Node name, click + Add node or press the TAB key.


OMD always checks for Node conflicts and prevents the user from creating two Nodes with the same Node Name and Node Type within the same Parent Node.
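
The conflict rule above can be illustrated with a minimal sketch. The `Node` class and its `add_child` method are hypothetical and are not part of OMD; the sketch also assumes that names are compared exactly (case-sensitively), which the manual does not state.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory node; not OMD's actual data model."""

    name: str
    node_type: str
    children: list[Node] = field(default_factory=list)

    def add_child(self, name: str, node_type: str) -> Node:
        # A conflict is the same (name, type) pair under the same parent.
        # Exact, case-sensitive comparison is an assumption of this sketch.
        for child in self.children:
            if child.name == name and child.node_type == node_type:
                raise ValueError(
                    f"{node_type} '{name}' already exists under '{self.name}'"
                )
        child = Node(name, node_type)
        self.children.append(child)
        return child
```

In this sketch, a second child with the same name but a different type is accepted, while an exact duplicate of both name and type raises an error.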

PRINCIPLES: Creating Nodes hierarchy

Duplicating Nodes

To create a Node hierarchy more easily and quickly, you can also use the Duplicate Node feature, which is available from the Node’s context menu in the Repository Tree. It helps you quickly create new Nodes based on existing ones. In addition to duplicating the selected Node and defining its new name, you can duplicate the Node together with its whole Child Node subhierarchy, defined Edges, and Node / Edge attributes. However, you cannot change the Node Type when duplicating a Node.


Moving Node within Repository Tree

An existing Node in the OMD Repository Tree can easily be moved to another parent Node, provided the Node Types are compatible. OMD does not allow a Node to be moved under an incompatible parent Node, so you can be sure the metadata is consistent and valid at all times. To move a Node, drag and drop it within the Repository Tree area. You can move not only a single Node but also a whole subhierarchy of Nodes under a new parent Node.
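
As a rough illustration of such a compatibility check, the sketch below looks up whether a parent Node Type accepts a given child Node Type. The `ALLOWED_CHILD_TYPES` table and the type names in it are invented for this example; in OMD the valid hierarchies depend on the Resource type, and the actual rules are not described in this manual.

```python
# Hypothetical table: which child Node Types each parent Node Type accepts.
# The real rules depend on the Resource type and are not shown in this manual.
ALLOWED_CHILD_TYPES = {
    "Database": {"Schema"},
    "Schema": {"Table", "View"},
    "Table": {"Column"},
    "View": {"Column"},
}


def can_move(node_type: str, new_parent_type: str) -> bool:
    """Return True if a node of node_type may be placed under new_parent_type."""
    return node_type in ALLOWED_CHILD_TYPES.get(new_parent_type, set())
```

With this table, a Column can be moved from one Table to another Table or View, while a Table cannot be dropped under another Table.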


Defining additional Node attributes in Properties Editor

If you need to define additional attributes for individual Nodes, or if you want to find out the value of attributes imported with a given Node from the Manta Flow Repository, you can do so in the Properties Editor, which is located on the right side of the screen. Assets and their attributes that were imported from the Manta Flow Server have an IMP label in the Repository Tree, on Canvases, and in the Properties Editor.

In the Properties Editor, you can define completely new attributes and change or add values to existing attributes. If you make any changes in the Properties Editor, do not forget to save them using the Confirm changes button; otherwise, all these changes are lost.


As soon as you have created new or imported existing assets that you want to work with further, you can start creating your Data Lineage by using a Canvas.

Creating your own data lineage

To create a Data Lineage, you must first create one or more Canvases. Then, you can place Nodes (assets in the Repository Tree) from the Repository Tree on the Canvas using drag & drop and connect the individual Nodes using various Edges (lineage relationships between different Nodes).

You can work with multiple Canvases within a single Connection, which helps you organize your metadata and makes it more readable. It is completely up to you how to split the data lineage among several Canvases. For example, you might define DIRECT edges on one Canvas and PERSPECTIVE edges on another, or place part of the lineage on one Canvas and complex ETL transformations on another.

Placing Nodes on Canvas

In most situations, you can drag and drop onto the Canvas only the Nodes that contain individual Columns (so-called second level Nodes, for example a Table, View, or ETL Component; the first level Nodes are the Columns themselves). However, in some specific cases, you might need to drop third level Nodes (such as a Database schema) or even higher level Nodes if you want to define Perspectives.

If you want to change the location of a Node after it has been placed on the Canvas, drag it by the top of the Node Layout, that is, the outer rectangle representing the Resource level.

For a Node placed on the Canvas, you can easily change the visibility of its Child Nodes using the eye icon that appears next to the name of the Node. The icon opens a modal window where you can set which Child Nodes you want to display on this Canvas. The setting might be different for other Canvases. If the eye icon is crossed out, some of the Child Nodes are hidden on this particular Canvas. Otherwise, all Child Nodes are visible.

To delete a Node from the Canvas, select its outermost rectangle with a left click and either press DELETE / BACKSPACE or use the Remove From Canvas button in the Properties Editor. You can also select multiple Nodes by holding CTRL (or the ⌘ Command key on Mac) and clicking the individual Nodes.

Nodes that were removed from the Canvas are still available in the Repository Tree. To remove a Node from the Connection, use the Remove action in the Repository Tree context menu.

If you remove a Node from the Canvas and the Node already has edges attached, those edges are removed and thus lost. Be careful, especially when removing Nodes that were imported either from the Manta Flow repository or from Open Manta Extension CSV files, because the removal might lead to the loss of the edges defined in the OMD Connection. The edges remain intact in the Manta Flow Repository.

PRINCIPLES: Working with Canvas

After placing the Nodes on the Canvas, you can start creating Edges representing the flow of data.

Creating Edges

Creating edges is rather straightforward using just drag and drop, but OMD simplifies it even more.

First, you can start creating edges only from Nodes where the |-> icon appears. If no available target Node is placed on the Canvas, this icon does not appear at all. For example, you cannot start creating an edge from a Database Node in an Oracle Resource until some Container Node is placed on the Canvas, because PERSPECTIVE is the only Edge Type available for a Database Node, and PERSPECTIVE edges can be attached only to Container Nodes from the Perspective Layer.


Second, when you drag the edge, OMD outlines the Nodes that could be used as a target for such an edge. The candidate target Nodes are marked with a dashed outline.

Third, once the edge is attached to the target, OMD tries to determine its Edge Type automatically based on the information from Resource Templates. If more than one Edge Type is available, OMD preselects the first one, and the user can change it if a different one should be used.


In addition, edges can be created in two different ways:

If you want to change the Edge Type or define custom edge attributes, navigate to the Properties Editor, where you can access all existing Edge attributes or define your own.

To delete an edge, select it with a left click and press DELETE / BACKSPACE, or use the Delete from canvas button in the Properties Editor. You can select multiple edges at once using either of the following methods:

Edges are not stored in the OMD Repository Tree, so be aware that once they are deleted from the Canvas, they are lost together with any Edge Attributes defined for them. This applies even to Edges that were imported from the Manta Flow repository or from Open Manta Extension CSV files. Such Edges remain intact in the Flow Repository; they are removed only from the OMD Connection.

PRINCIPLES: Creating Edges

Importing metadata from Manta Flow Repository

If your Open Manta Designer is properly set up and the Manta Flow Server is accessible, you can import Nodes from the Manta Flow Server and continue with the data lineage definition, or you can enhance the imported Nodes with custom Attributes.

When importing Nodes from the Flow Repository, you first have to select the revision and the Layer you want to pull the Nodes from. You can then immediately see all Resources in the given Layer, and as you browse the Node hierarchy, the Nodes are loaded dynamically from the Flow Repository.

After you select the checkboxes for the Nodes you want to import, confirm the action and wait for the import status information, which shows the total number of Nodes and Edges imported or updated. If any errors occur, they are shown in the Import dialog as well. Although OMD considers the Manta Flow Repository content the source of truth and always imports all Nodes as they are, the imported metadata is validated against the available Resource Template for the given technology. If an appropriate Resource Template is not found or discrepancies are found, the metadata is still imported into OMD, but it is not possible to create any Child Nodes or Edges.

All import errors and warnings can be accessed in Log Viewer / Admin UI as well.

PRINCIPLES: Import from Manta Flow Repository

To prevent unnecessary data transfer between the Flow Repository and Open Manta Designer, it is highly recommended to import second level Nodes, preferably only those you need for creating your data lineage. Importing highest level Nodes that have a large number of Child Nodes may even cause a timeout in some cases.

Check for Update of Imported Nodes

For Nodes imported into OMD from the Manta Flow Server, you can easily check whether they have changed on the Manta Flow Repository side since the import. Use the "Check for Update" feature, which is available from the context menu in the Repository Tree for a given imported Node, or run this action for all imported Nodes at once.


In the dialog box that appears, first select the revision against which you want to compare the given Node. By default, OMD Settings specify that the latest revision is used automatically.

OMD then tries to find the given Node and its entire sub-hierarchy in the Manta Flow Repository and checks whether there have been any changes. A Node that was imported into OMD could have been deleted in one of the subsequent revisions, the number of its Child Nodes could have changed, and the definitions of Node attributes or edges could have changed. OMD can detect all these differences and display them to the user. The user can then decide whether to synchronize these changes to OMD, which changes the metadata on the OMD side, or to ignore the differences.
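
The kinds of differences listed above can be pictured with a small comparison sketch. It is only an illustration: the snapshot structure (a dictionary with `children` and `attributes` keys) and the `diff_node` function are assumptions made for this example, not OMD's actual data model, and edge differences are left out for brevity.

```python
from typing import Optional


def diff_node(imported: dict, current: Optional[dict]) -> dict:
    """Compare an imported node snapshot with the same node in a chosen revision.

    Both snapshots are hypothetical dictionaries of the form
    {"children": [names...], "attributes": {name: value}}.
    Pass None as `current` when the node no longer exists in that revision.
    """
    if current is None:
        return {"deleted": True}

    old_children = set(imported["children"])
    new_children = set(current["children"])
    old_attrs = imported["attributes"]
    new_attrs = current["attributes"]

    # Collect (old value, new value) pairs for attributes that differ.
    # A missing attribute is reported with the value None.
    changed = {
        key: (old_attrs.get(key), new_attrs.get(key))
        for key in old_attrs.keys() | new_attrs.keys()
        if old_attrs.get(key) != new_attrs.get(key)
    }
    return {
        "deleted": False,
        "added_children": sorted(new_children - old_children),
        "removed_children": sorted(old_children - new_children),
        "changed_attributes": changed,
    }
```

The result only describes the differences; whether they are synchronized or ignored remains a separate decision, as in the dialog described above.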


Importing Open Manta Extension CSV files into OMD

If you have your own Connections defined in CSV files (the Open Manta Extensions format), you can simply import them into OMD and use OMD for all further changes from then on. What used to be complicated and error-prone becomes simple and intuitive.

The set of Open Manta Extensions CSV files (layer.csv, resource.csv, node.csv, edge.csv, node_attribute.csv, edge_attribute.csv) has to be placed in a ZIP archive, which is then uploaded into OMD when the Import Connection from file action is selected on the OMD Homepage. The name of the ZIP archive is used as the name of a brand new Connection, which is shown on the OMD Homepage after the CSV files are successfully imported.

You can easily change the name of this Connection later on.
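
If you prepare the archive with a script, a minimal sketch using Python's standard `zipfile` module might look as follows. The `build_archive` and `missing_files` helpers are hypothetical names for this example. The sketch assumes that the CSV files sit at the root of the archive and does not assume that all six files are mandatory; neither point is stated in this manual.

```python
import io
import zipfile

# File names taken from the list above.
EXPECTED_FILES = (
    "layer.csv",
    "resource.csv",
    "node.csv",
    "edge.csv",
    "node_attribute.csv",
    "edge_attribute.csv",
)


def missing_files(files: dict) -> list:
    """Return the expected file names that are absent from `files`."""
    return [name for name in EXPECTED_FILES if name not in files]


def build_archive(files: dict) -> bytes:
    """Pack CSV texts, keyed by file name, into an in-memory ZIP archive."""
    unknown = sorted(set(files) - set(EXPECTED_FILES))
    if unknown:
        raise ValueError(f"Unexpected file names: {unknown}")
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in EXPECTED_FILES:
            if name in files:
                # Store each file at the archive root (an assumption of this sketch).
                archive.writestr(name, files[name])
    return buffer.getvalue()
```

The returned bytes can be written to a file whose name you want to use for the new Connection, since the archive name becomes the Connection name.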


PRINCIPLES: CSV Import into OMD

Export from OMD into CSV and JSON

When the custom metadata is ready to be published to the Manta Flow repository, you can export the Connection from OMD into a set of CSV files. The import into the Manta Flow Repository can then be done via the Manta Admin UI in the same way as with plain Open Manta Extensions.


PRINCIPLES: Exporting CSV files

OMD can also export all data about a specific Connection in JSON format, including information about individual Canvases, their layout, Node positions, and so on. Such exported data can easily be imported into OMD again later. This is useful if you want to share your Connection designs between different people or different devices.

Advanced OMD features

Using Edge Wizard (a.k.a. Macroedges)

In situations where you want to connect two Parent Nodes that have a large number of Child Nodes and the mapping of individual edges between them is rather straightforward (for example, a simple copy between two database tables, or 1:1 loading of the contents of a File via an ETL Transformation into a DB Table), you can use the Edge Wizard.

Instead of mechanically creating individual edges between 1st level leaf nodes, you can directly create an edge between given parent nodes. As soon as you create this edge, a dialog opens where you can define how individual edges should be created at the lowest level.

You can choose from the following strategies that determine how to map individual nodes:

In the case of name matching, the user can define various input parameters that influence the mapping algorithm. For example, prefixes or suffixes can first be removed from the Source Node names, the Target Node names, or both, and the user can define whether case sensitivity is taken into account and whether special characters are removed before the matching algorithm is applied.
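
A minimal sketch of how such name matching could work is shown below. It is not OMD's actual algorithm: the function names, the parameter names, and the definition of "special characters" (anything other than ASCII letters and digits) are assumptions made for this example.

```python
import re


def normalize(name, prefix="", suffix="", case_sensitive=True, strip_special=False):
    """Reduce a node name to the key used for comparison."""
    # Prefix and suffix removal is an exact, case-sensitive match in this sketch.
    if prefix and name.startswith(prefix):
        name = name[len(prefix):]
    if suffix and name.endswith(suffix):
        name = name[: -len(suffix)]
    if strip_special:
        # Assumption: "special" means anything that is not an ASCII letter or digit.
        name = re.sub(r"[^A-Za-z0-9]", "", name)
    if not case_sensitive:
        name = name.lower()
    return name


def match_by_name(sources, targets, source_opts=None, target_opts=None):
    """Return (source, target) pairs whose normalized names are equal."""
    source_opts = source_opts or {}
    target_opts = target_opts or {}
    by_key = {}
    for target in targets:
        # If several targets normalize to the same key, the first one wins.
        by_key.setdefault(normalize(target, **target_opts), target)
    pairs = []
    for source in sources:
        key = normalize(source, **source_opts)
        if key in by_key:
            pairs.append((source, by_key[key]))
    return pairs
```

For example, removing the prefix `STG_` on the source side and ignoring case and special characters on both sides pairs `STG_created-at` with `createdat`.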

Similarly, in wise mapping, OMD tries to match the leaf Nodes by name (taking all the input parameters into account as well). However, for those Nodes that could not be matched, 1:N edges are created and attached to all the unmatched Leaf Nodes on the other side of the mapping.
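
One possible reading of wise mapping is sketched below. The `wise_mapping` function is hypothetical, and plain name equality stands in for the configurable name matching step. The sketch interprets the 1:N rule as connecting every unmatched source Leaf Node to every unmatched target Leaf Node; the manual does not spell out the exact behavior.

```python
def wise_mapping(sources, targets):
    """Match leaf nodes by name, then connect the leftovers to each other."""
    target_names = set(targets)

    # Step 1: name matching (exact equality is used here for simplicity).
    matched = [(name, name) for name in sources if name in target_names]
    matched_names = {source for source, _ in matched}

    # Step 2: every unmatched source is connected to every unmatched target.
    unmatched_sources = [name for name in sources if name not in matched_names]
    unmatched_targets = [name for name in targets if name not in matched_names]
    fallback = [(s, t) for s in unmatched_sources for t in unmatched_targets]

    return matched + fallback
```

With sources `ID`, `FIRST`, `LAST` and targets `ID`, `FULL_NAME`, this yields one matched edge for `ID` and two fallback edges into `FULL_NAME`.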


Mapping by node order simply creates edges between individual Leaf Nodes based on their order within the parent Nodes.
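
In code, mapping by order amounts to pairing the two lists position by position. The `map_by_order` function below is a hypothetical illustration; the manual does not say what happens when the parents have a different number of Leaf Nodes, so the sketch simply stops at the shorter list.

```python
def map_by_order(sources, targets):
    """Pair leaf nodes by position: first with first, second with second, and so on."""
    # zip() stops at the shorter list; how OMD treats the remaining
    # nodes is not described in the manual.
    return list(zip(sources, targets))
```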

The Edge Wizard can also be used in scenarios where the user needs not only to create individual edges but also to create the Leaf Nodes in the target. The by copy mapping strategy serves this purpose; the user can enter additional parameters such as removing prefixes/suffixes from source Nodes or adding prefixes/suffixes to the target.
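
The by copy strategy can be sketched as deriving a target Leaf Node name from each source Leaf Node name and returning the resulting edges. The function and parameter names below are hypothetical, and the order of operations (remove from the source name first, then add to the target name) is an assumption.

```python
def copy_leaf_names(source_names, remove_prefix="", remove_suffix="",
                    add_prefix="", add_suffix=""):
    """Return (source, new_target) pairs for the by copy strategy."""
    edges = []
    for source in source_names:
        base = source
        # Strip the configured prefix and suffix from the source name, if present.
        if remove_prefix and base.startswith(remove_prefix):
            base = base[len(remove_prefix):]
        if remove_suffix and base.endswith(remove_suffix):
            base = base[: -len(remove_suffix)]
        # Build the name of the Leaf Node to be created in the target.
        target = f"{add_prefix}{base}{add_suffix}"
        edges.append((source, target))
    return edges
```

For instance, removing the prefix `SRC_` and adding the suffix `_TGT` turns `SRC_ID` into a new target Leaf Node named `ID_TGT`.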

PRINCIPLES: Edge Wizard a.k.a. Macroedges