Open Manta Designer Guide
About Open Manta Designer
Open Manta Designer (OMD for short) is intended for creating metadata that augments Manta automated lineage and that covers technologies which cannot be scanned automatically by IBM Automatic Data Lineage scanners. It replaces the previous method of manually creating custom metadata CSV configuration files (known as Open MANTA Extensions). That method is still available, but OMD provides an intuitive tool for rapid prototyping that significantly speeds up the work and makes it less tedious and less error-prone.
Basic OMD principles
OMD consists of two basic screens, the so-called Homepage and Designer. The Homepage is the first screen a user sees after logging in. It contains an overview of all Connections that the user has created in the past or those that the user is currently working on.
Although OMD does not support simultaneous work by multiple users within a single Connection, a user with the appropriate permissions can still see or access other users' Connections (provided the data is saved to the OMD server).
In the properties section of each Connection, the user can see the name, description, status and the date of the last modification.
Data from each Connection can be stored either locally in the browser's storage, in which case the Connection has the status "Unsaved", or in the OMD server's storage, in which case it is marked as "Saved". Be aware that Connections can also be deleted, in which case all data for that Connection is irretrievably deleted, regardless of whether it was stored locally or on the OMD server.
Connections can be created either from scratch or by importing existing Open Manta Extension CSV files or JSON Connection data. The imported data might be Open Manta Extension projects you have already worked on, and JSON import is also a means of sharing completed Connection data extracts with other team members. This also means that none of your previous work is lost and you don't have to create anything twice.
If you want to open a given Connection, just click on its name in the tile or directly on the Canvas name, the list of which can also be seen in the Connection tile (more about Canvases later in this topic). Alternatively, you can open the Connection from within the context menu.
Then the main Designer window opens. It consists of:
- Resizable Repository Tree area on the left
- Canvas(es) area in the center
- Resizable Properties Editor on the right
The following sections show you how to create your Node hierarchies, import Nodes from the Flow Repository, create edges, and take all the other steps needed to create your custom data lineage. Once it is defined, do not forget to save your Connection; otherwise all the data will be stored only in your local browser storage.
Using the Repository Tree for creating Nodes hierarchy
Before you start creating a data lineage showing the flow of data between individual nodes and technologies, you must first define which assets you want to work with. In the Repository Tree, you can create an entire hierarchy of assets yourself. You can also import Nodes and entire sub-hierarchies of Nodes from the Manta Flow repository if you want to extend the existing scanned metadata, enrich it with your own attributes, or continue with the manual creation of data lineage.
First you have to create a Layer, to which you will then add Resources and Nodes. If you decide to import Nodes from the Manta Flow repository, the corresponding Layers and Resources are imported automatically. You can also delete assets from the Repository Tree or change their name and description. This is done using the Properties Editor in the right part of the main Designer window, where you can change the names of Layers, Resources and Nodes. In addition, you can see there the attributes of Nodes / Edges that were either created in OMD or imported from the Manta Flow Repository.
Once you have created a Layer, you can add Resources that represent particular technologies. When creating a Resource, you must choose its Name and Type. Based on the type selection, OMD will then allow you to create only correct Node hierarchies, and you are thus sure that the hierarchy will always be valid. When creating a new Resource, you can also select the Create Child node Hierarchy option. In that case, OMD will generate a sample hierarchy containing all possible Node types. You can then continue to work with this generated hierarchy and change it however you like, of course adding new or renaming/removing existing Nodes.
New Nodes can be added below a given Parent Node either in the Repository Tree using the + icon or directly on the Canvas using the same icon. Both options give the same result. You can add as many nodes of the same type as needed in one step, and all of them will be created simultaneously.
In the Create New Node dialog, you can define new nodes in several ways:
- By clicking + Add node
- By pressing the TAB key to enter a new node
- By copying and pasting a list of nodes from the clipboard with CTRL+C / CTRL+V. Both newline and Tab characters are automatically recognized as delimiters of the Node Names. This can be handy if you have a list of Nodes in some other data source, e.g. an Excel sheet, and you want to create all of them at once by copying Excel rows or columns (or both).
Paste node names from clipboard - simply press CTRL+V and the nodes will be automatically placed into the dialog.
OMD always checks for node conflicts and prevents the user from creating two nodes with the same Node Name and Node Type within the same Parent Node.
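The paste behaviour and the conflict check described above can be sketched in a few lines of Python. This is only an illustration, not OMD's actual implementation: the function names `parse_node_names` and `find_conflicts` are hypothetical, and trimming whitespace, skipping empty cells and comparing names case-sensitively are assumptions the guide does not state.

```python
import re


def parse_node_names(clipboard_text: str) -> list[str]:
    """Split pasted text into node names on Tab and newline characters.

    Assumption: empty cells and surrounding whitespace are ignored.
    """
    parts = re.split(r"[\t\r\n]+", clipboard_text)
    return [part.strip() for part in parts if part.strip()]


def find_conflicts(
    existing: set[tuple[str, str]], new_names: list[str], node_type: str
) -> list[str]:
    """Return names that would duplicate a (Node Name, Node Type) pair
    under the same Parent Node, including repeats inside the pasted batch."""
    seen = set(existing)
    conflicts = []
    for name in new_names:
        key = (name, node_type)
        if key in seen:
            conflicts.append(name)
        else:
            seen.add(key)
    return conflicts


# Two Excel cells in one row (Tab), then further rows (newlines).
pasted = "CUSTOMER_ID\tFIRST_NAME\r\nLAST_NAME\n\nEMAIL"
names = parse_node_names(pasted)

# Children that already exist under the hypothetical parent Table.
existing_children = {("EMAIL", "Column")}
conflicts = find_conflicts(existing_children, names, "Column")
print(names)
print(conflicts)
```

Copying a rectangular block of Excel cells produces exactly this mix of Tab and newline delimiters, which is why both are treated as equivalent separators here.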
PRINCIPLES: Creating Nodes hierarchy
- You can define as many Layers as needed in one Connection, providing the Layers have different types and names.
- You cannot have two Resources with the same Resource Name and Resource Type within one Connection. For example, you can create two Oracle Resources in the same Connection, but they need to have different names.
- Some Resources can be created only in specific Layer Types. For example Oracle or MSSQL Resources can be created only in Physical layer, PowerDesigner or Erwin Resources only in Logical/Conceptual layers and Perspectives only in Perspective Layer. Such rules can be redefined in Resource Templates.
- Depending on the hierarchy level, only valid Node Types are offered to user when new Node is being created.
- Within one Parent Node, there cannot be two Child Nodes with the same Node Name and Node Type.
- You can have as many custom attributes as needed for each Node/Edge. However, each attribute needs to have a unique name. An attribute can have multiple values.
- Be aware that removing a metadata element, be it a Layer, a Resource or a Node, also removes all the children in its nested hierarchy, including any edges that are already defined.
Duplicating Nodes
For easier and faster creation of Node hierarchy, you can also use Duplicate Node feature, which can be accessed via Node’s context menu in Repository Tree. This will help you quickly create new Nodes based on existing ones. In addition to duplicating the selected node and defining its new name, you can duplicate the Node including whole Child Node subhierarchy, defined Edges and Node / Edge attributes. However, when duplicating Node you cannot change its Node Type.
Duplicating single Node or whole subhierarchy
Moving Node within Repository Tree
An existing Node in the OMD Repository Tree can be easily moved into another parent Node, provided the Node Types are compatible. OMD will not allow a node to be moved under an incompatible parent node, so you can be sure the metadata are consistent and valid all the time. A Node is moved by simple drag & drop within the Repository Tree area. You can move not only a single Node, but a whole subhierarchy of Nodes under a new parent node.
Sorting Nodes within OMD Repository Tree
By default, the nodes in the OMD Repository Tree are sorted by their internal Node ID, which means the Nodes created last are at the end of all child nodes. However, this can be changed simply by clicking the AZ button, which re-arranges the nodes in alphabetical order, much like in Manta Viewer.
Searching for Nodes within OMD Repository Tree
To search for a particular node in the Repository Tree, just type its name (or part of it) into the search text field above the repository tree and press ENTER or click the search icon. The search is done only in the subtree of the selected Node, Resource or Layer. If you want to search within the whole Connection, simply select the Connection name in the Tree (the highest level).
If one or more occurrences of the search string were found, you can cycle through the search results by pressing the search icon again. The Node is automatically focused in the Repository Tree, so you can immediately see its details in Property Editor area or you can place the Node into Canvas by drag & drop (see later in this topic).
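The search scope and the cycling through results can be pictured with the following sketch. It is a simplified model rather than OMD's code: the `TreeNode` class and `search_subtree` function are hypothetical, and case-insensitive substring matching, depth-first result order and including the selected node itself in the search are assumptions.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class TreeNode:
    name: str
    children: list["TreeNode"] = field(default_factory=list)


def search_subtree(selected: TreeNode, text: str) -> list[str]:
    """Return paths of all nodes in the subtree of `selected` whose
    name contains `text` (assumed case-insensitive), depth-first."""
    hits: list[str] = []

    def walk(node: TreeNode, path: list[str]) -> None:
        path = path + [node.name]
        if text.lower() in node.name.lower():
            hits.append("/".join(path))
        for child in node.children:
            walk(child, path)

    walk(selected, [])
    return hits


orders = TreeNode("ORDERS", [TreeNode("ORDER_ID"), TreeNode("CUSTOMER_ID")])
customers = TreeNode("CUSTOMERS", [TreeNode("CUSTOMER_ID")])
connection = TreeNode(
    "Sales", [TreeNode("Physical", [TreeNode("Oracle", [orders, customers])])]
)

# Selecting the Connection searches everything; selecting ORDERS narrows the scope.
all_hits = search_subtree(connection, "customer")
orders_hits = search_subtree(orders, "customer")

# Pressing the search icon repeatedly walks through the hits and wraps around.
results = cycle(all_hits)
first, second, third, wrapped = (next(results) for _ in range(4))
```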
Defining Additional Node Attributes in Properties Editor
If you need to define some additional Node attributes for individual Nodes, or if you want to find out the value of attributes imported with a given node from the Manta Flow Repository, you can do so via the Properties Editor, which is located on the right side of the screen. Assets and their attributes that were imported from Manta Flow Server have an IMP label in the Repository Tree, on Canvases and in the Properties Editor.
In the Properties Editor, you can define completely new attributes, and change or add values to existing attributes. If you make any changes in the Properties Editor, do not forget to save them using the Confirm changes button, otherwise all these changes will be lost.
As soon as you have created new or imported existing assets that you want to work with further, you can start creating your Data Lineage using a Canvas.
Creating your own data lineage
In order to create a Data Lineage, you must first create one or more Canvases. Then you can place Nodes (assets in the Repository Tree) from the Repository Tree on the Canvas using drag & drop and connect the individual Nodes using various Edges (lineage relationships between different Nodes).
You can work with multiple Canvases within one single Connection, which helps to better organise your metadata and make them more readable. It is completely up to you how to split the data lineage among several Canvases. Perhaps you want to define DIRECT edges on one Canvas and PERSPECTIVE edges on another, or part of the lineage on one Canvas and complex ETL transformations on another.
Placing Nodes on Canvas
In most situations, you will drag & drop onto the Canvas only the Nodes containing individual Columns (so-called 2nd level Nodes, for example a Table, View or ETL Component; 1st level Nodes are the Columns themselves). However, in some specific cases, you will need to drop 3rd level Nodes (such as a Database schema) or even higher-level Nodes if you want to define Perspectives.
If you want to change the location of the Node after it was placed on Canvas, just drag it by the top of the Node Layout (i.e. the outer rectangle representing the Resource level).
For the Node placed on Canvas, you can easily change the visibility of its Child Nodes using the "eye" icon which appears next to the name of the Node. The icon opens a modal window where you can set which Child Nodes you want to display at this Canvas (the setting can be different for other Canvases!). If the “eye” icon is crossed-out, it means some of the Child nodes are hidden at this particular Canvas, otherwise all Child nodes are made visible.
To delete a Node from the Canvas, just select its outermost rectangle by left click and either press DELETE / BACKSPACE or use the Remove From Canvas button in the Properties Editor. You can even use multi-select by holding CTRL (or the Mac command ⌘ key) and clicking individual Nodes.
Be aware that Nodes removed from a Canvas are still available in the Repository Tree. If you want to remove the Node from the Connection, use the Remove action in the Repository Tree context menu. If you remove a Node from a Canvas and the Node already has some edges attached, those edges are removed and thus lost. Be especially careful when removing nodes imported either from the Manta Flow repository or from Open Manta Extension CSV files, as the removal can lead to loss of the defined edges in the OMD Connection. The edges remain intact in the Manta Flow Repository.
PRINCIPLES: Working with Canvas
- A Node cannot be placed on the same Canvas by drag & drop more than once. However, you can place the Node on as many Canvases as needed.
- For each Node Type, it is defined in Resource Template whether it can be placed to Canvas by drag & drop or not, so be aware that not all Nodes can be placed to the Canvas. Usually just level 2+ nodes can be placed, individual 1st level nodes (such as Columns) cannot be placed to Canvas directly, these appear at Canvas once their parent 2nd level Node is placed.
- When a Node is placed on a Canvas, the so-called Child Node Visibility Limit applies. It specifies whether the Child Nodes are automatically placed on the Canvas along with the main Node. If the number of immediate Child Nodes is lower than or equal to this limit, the Child Nodes are placed as well. Otherwise the Child Nodes are not visible on the Canvas by default when the main Node is placed (however, you can make them visible later on). For example, let’s say that Node Type "Table" has the limit set to 30. If a particular Table has fewer than 31 Columns, all the Columns are displayed on the Canvas along with the Table. If the Table had more Columns, no Columns would be visible at first.
- For most of the 2nd level Nodes (Tables, Views, …), the ChildNodeVisibilityLimit is set to 30.
- For most of the 3rd+ level Nodes (Schema, Database, Server, …), the ChildNodeVisibilityLimit is set to 0, so no Child Nodes are shown by default.
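The Child Node Visibility Limit rule reduces to a single comparison. The sketch below is illustrative only; the function name `children_shown_by_default` is hypothetical, and the limits 30 and 0 are the typical values quoted above.

```python
def children_shown_by_default(child_count: int, visibility_limit: int) -> bool:
    """Child Nodes are placed with their parent only when their number
    is lower than or equal to the Child Node Visibility Limit."""
    return child_count <= visibility_limit


TABLE_LIMIT = 30   # typical limit for 2nd level Nodes (Tables, Views, ...)
SCHEMA_LIMIT = 0   # typical limit for 3rd+ level Nodes (Schema, Database, ...)

print(children_shown_by_default(30, TABLE_LIMIT))   # Table with 30 Columns
print(children_shown_by_default(31, TABLE_LIMIT))   # Table with 31 Columns
print(children_shown_by_default(12, SCHEMA_LIMIT))  # Schema with 12 Tables
```

Note that the comparison is inclusive: a Table with exactly 30 Columns still shows all of them.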
- If the Node being placed to Canvas has some edges to Nodes already placed on Canvas, the edges are automatically rendered as well.
- If the Node placed on Canvas has some Child Nodes with incoming or outgoing edges, then such Child Nodes cannot be made hidden on the Canvas once they are shown. It helps the users to have better overview about the metadata being defined and no edges can be forgotten about.
After placing the Nodes on the Canvas, you can start creating Edges representing the flow of data.
Creating Edges
OMD allows the users to create only those edge types, which are allowed for given Node Type in Resource Templates definitions.
First, from the pull down menu at the top left part of the Canvas area, you can select Edge Type for all edges which can be subsequently created (DIRECT being the default value):
Application Direct and Application Filter edges can be created only between highest-level Nodes. They depict flows and dependencies between Systems/Resources rather than data flows between individual Columns. Although OMD does not restrict their usage in other Layers, such edges should be used only between nodes in Application Level layers, and only among nodes in Resources with the “APP” postfix, e.g. Oracle APP, MSSQL APP etc.
When an edge type is selected, OMD then automatically highlights all nodes at Canvas from where you can start dragging the edge using the drag&drop (see dashed line in the preceding screenshot). Similarly, when you drag the edge, OMD will outline only the Nodes which could be used as a target for such edge. Again, this is depicted as a dashed line around the target node candidates:
In general, edges can be created in two different ways:
- By drag & drop from the Source to the Target Node, as described earlier
- By left-clicking the Source Node and then holding SHIFT while left-clicking the destination nodes. This can be advantageous when there are several edges from the same originating node.
In case you want to change the Edge type or define some custom edge attributes, just navigate to Properties Editor where you can access all existing Edge attributes or define your own.
To delete an edge, just select it by left click and press DELETE / BACKSPACE or use the Delete from canvas button in the Properties Editor. You can select multiple edges at once using either of the following methods:
- Hold CTRL (or the Mac command ⌘ key) and click individual Edges.
- Hold CTRL (or the Mac command ⌘ key) and draw a selection rectangle using drag & drop.
Edges are not stored in OMD Repository Tree, so be aware that once deleted from Canvas, these are lost including all Edge Attributes which might be defined. This applies even for Edges which were imported from Manta Flow repository or from Open Manta Extension CSV files (of course Edges remain intact in Flow Repository, they are removed only from OMD Connection).
PRINCIPLES: Creating Edges
- There cannot be two edges of the same type between two nodes, however you can define more edges of various Edge Types between identical Nodes.
- Open Manta Designer always checks the metadata validity and synchronises the changes among all Canvases. If you create an Edge between two nodes, this edge is rendered on all other Canvases where these Nodes were placed. Such edges on other Canvases are displayed as informative and "read-only".
- For example: if Table A and Table B are placed on two Canvases within one Connection and you create edges between these tables on one Canvas, the edges are automatically propagated to all Canvases where Table A and Table B are placed. The same applies to edge removal or definition of edge attributes, which can be accessed from all Canvases as well.
- The only Canvas where the edge can be removed is the one it was created at.
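The edge rules above can be modelled as a small data structure. This is a hedged sketch, not OMD's internal model: the `EdgeKey` and `Connection` classes and their methods are hypothetical, and treating edges as directed (source, target, type) triples is an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EdgeKey:
    source: str
    target: str
    edge_type: str


@dataclass
class Connection:
    # canvas name -> names of Nodes placed on that Canvas
    placements: dict[str, set[str]] = field(default_factory=dict)
    # edge -> name of the Canvas on which the edge was created
    edges: dict[EdgeKey, str] = field(default_factory=dict)

    def add_edge(self, canvas: str, source: str, target: str, edge_type: str) -> EdgeKey:
        key = EdgeKey(source, target, edge_type)
        if key in self.edges:
            raise ValueError(f"{edge_type} edge {source} -> {target} already exists")
        self.edges[key] = canvas
        return key

    def edges_on(self, canvas: str) -> list[tuple[EdgeKey, bool]]:
        """Edges rendered on `canvas`, each with a read-only flag.
        An edge is rendered wherever both of its Nodes are placed."""
        nodes = self.placements.get(canvas, set())
        return [
            (key, owner != canvas)
            for key, owner in self.edges.items()
            if key.source in nodes and key.target in nodes
        ]

    def remove_edge(self, canvas: str, key: EdgeKey) -> None:
        if key not in self.edges:
            raise KeyError(key)
        if self.edges[key] != canvas:
            raise PermissionError("edge can be removed only on the Canvas it was created on")
        del self.edges[key]


conn = Connection(placements={
    "Canvas 1": {"Table A", "Table B"},
    "Canvas 2": {"Table A", "Table B"},
})
direct = conn.add_edge("Canvas 1", "Table A", "Table B", "DIRECT")
conn.add_edge("Canvas 1", "Table A", "Table B", "FILTER")  # different type: allowed

for key, read_only in conn.edges_on("Canvas 2"):
    print(key.edge_type, "read-only" if read_only else "editable")
```

Keeping a single edge store per Connection, with Canvases acting only as views, is one simple way to get the automatic propagation described above.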
Importing metadata from Manta Flow Repository
If your Open Manta Designer is properly set up and Manta Flow Server is accessible, you can import Nodes from Manta Flow Server and continue with data lineage definition, or you can enhance imported nodes with custom Attributes.
When importing nodes from Flow Repository, you have to first select the revision and the Layer you want to pull the nodes from. Immediately you can see all Resources in the given Layer and as you browse the Nodes hierarchy, the nodes are dynamically loaded from Flow Repository.
After selecting the checkboxes for the Nodes you want to import, just confirm the action and wait for the import status information showing the total number of Nodes and Edges imported or updated. If any errors occurred, these are shown in the Import dialog as well. Although OMD considers Manta Flow Repository content the source of truth and all Nodes are always imported as they are, the imported metadata are validated against the available Resource Template for the given technology. If an appropriate Resource Template is not found or some discrepancies are found, the metadata are still imported into OMD, but it is not possible to create any Child Nodes or Edges.
All import errors and warnings can be accessed in Log Viewer / Admin UI as well.
PRINCIPLES: Import from Manta Flow Repository
- If Node is imported into OMD, all the parent Nodes including Resource and Layer are automatically imported as well.
- If you import further Nodes, they are merged into the existing OMD repository hierarchy if their parent nodes were imported before. The Nodes are merged according to Node name, Node Type and full path.
- Certain rules determine whether the Child Nodes of the node being imported are pulled into OMD as well. For each Node Type, OMD defines a so-called Child Node Import Limit (usually set to 999). If the Node being imported has fewer immediate Child Nodes than this limit, all its Child Nodes are imported as well. Otherwise, no Child Nodes are imported along with the selected Parent Node. This “Limit” can be seen directly in the Import Dialog when the user hovers over the particular Node.
- For example, if you select whole DB schema for import, all its content will be imported in case the number of Child Nodes is less than the Child Node Import Limit.
- Edges are also imported into OMD along with the Nodes, if the edges are defined between the Nodes which are being imported.
- When you import high-level nodes from Application Level layer, these nodes are imported into OMD including the edges between the nodes. However, all DIRECT edges are automatically converted into APPLICATION_DIRECT edges and similarly all FILTER edges are converted to APPLICATION_FILTER edges. Both these APPLICATION edges are used only within OMD, so when such metadata are exported into Open Manta Extension CSV files (for importing into Manta Flow Server later on), these edges are converted back: APPLICATION_DIRECT → DIRECT, APPLICATION_FILTER → FILTER
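Two of the import rules above are mechanical enough to express as code. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the default of 999 is the "usual" limit mentioned above, and edge types other than DIRECT and FILTER are assumed to pass through unchanged.

```python
# Edge type conversion applied to Application Level layer imports,
# and its inverse applied on export to Open Manta Extension CSV files.
TO_OMD = {"DIRECT": "APPLICATION_DIRECT", "FILTER": "APPLICATION_FILTER"}
TO_CSV = {omd_type: csv_type for csv_type, omd_type in TO_OMD.items()}


def children_imported(child_count: int, import_limit: int = 999) -> bool:
    """Child Nodes are imported only when the Node has fewer immediate
    Child Nodes than the Child Node Import Limit (strict comparison)."""
    return child_count < import_limit


def edge_type_on_import(edge_type: str, application_layer: bool) -> str:
    """Convert edge types when importing from an Application Level layer."""
    if application_layer:
        return TO_OMD.get(edge_type, edge_type)
    return edge_type


def edge_type_on_export(edge_type: str) -> str:
    """Convert OMD-only APPLICATION edge types back for CSV export."""
    return TO_CSV.get(edge_type, edge_type)


print(children_imported(250))    # schema with 250 tables
print(children_imported(1500))   # schema with 1500 tables
print(edge_type_on_import("DIRECT", application_layer=True))
print(edge_type_on_export("APPLICATION_FILTER"))
```

Unlike the Child Node Visibility Limit, the import rule as described uses a strict "fewer than" comparison, so a Node with exactly 999 children would not bring its children along under this reading.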
Check for Update of Imported Nodes
For nodes imported into OMD from Manta Flow Server, you can easily check if they have changed since the time of import on the Manta Flow Repository side. For this, the "Check for Update" feature is used, which you can simply call up from the context menu in the Repository Tree for a given imported Node, or you can run this action for all imported Nodes at once.
Check for update of particular node and its subhierarchy
Check for update of all imported nodes
In the dialog box that appears, first select the revision against which you want to compare the given Node (by default, in OMD Settings it is specified that latest revision is used automatically).
OMD will then try to find the given Node and its entire sub-hierarchy in the Manta Flow Repository and compare whether there have been any changes.
A Node that was imported into OMD could be deleted in one of the subsequent revisions, the number of its Child Nodes could change, or there could be changes in the definition of Node attributes or edges. All these differences can be detected by OMD and displayed to the user. Users can subsequently decide whether they want to synchronise these changes to OMD, which will of course change the metadata on the OMD side, or whether they will ignore these differences.
Check for Update of imported nodes
Importing Open Manta Extension CSV files into OMD
If you have your own Connections defined in CSV files (Open Manta Extensions format), then you can simply import them into OMD and from then onward you can use OMD for all additional changes. What used to be rather complicated, complex and error-prone now becomes very simple and intuitive.
The set of Open Manta Extensions CSV files (layer.csv, resource.csv, node.csv, edge.csv, node_attribute.csv, edge_attribute.csv) has to be within a ZIP archive, which is then uploaded into OMD when the Import Connection from file action is selected on the OMD Homepage. The name of the ZIP archive is then used for creating a brand new Connection, which is shown on the OMD Homepage after the CSV files are successfully imported. Of course, you can easily change the name of this Connection later on.
Importing Open Manta Extension CSV files or JSON Connection data
PRINCIPLES: CSV Import into OMD
- Open Manta Designer can import CSV files both with and without headers. This is detected automatically and no additional settings are needed.
- The layer.csv, resource.csv and node.csv files are mandatory, other files can be optional in case no edges or node/edge attributes were defined in the Connection.
- When the Connection is being imported from CSV files, OMD tries to recognise and associate the relevant Resource Templates describing the hierarchical structure of Nodes in the given technologies. All the Nodes are imported even if the Resource Templates cannot be found. However, in such a case no Child Nodes or Edges can be created under the imported Resource.
- Similarly to import from Manta Flow Server, all import warnings or errors are shown in Admin UI/Log Viewer as well.
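Before uploading an archive, it can be convenient to check locally that the mandatory files are present. The sketch below uses only Python's standard `zipfile` module; the helper names `build_archive` and `check_archive` are hypothetical, and the assumption that the CSV files sit at the root of the ZIP archive is not stated in this guide. The CSV contents are left empty here because the column layout is defined by the Open Manta Extensions format, not by this example.

```python
import io
import zipfile

MANDATORY = {"layer.csv", "resource.csv", "node.csv"}
OPTIONAL = {"edge.csv", "node_attribute.csv", "edge_attribute.csv"}


def build_archive(files: dict[str, str]) -> bytes:
    """Pack the given file name -> content mapping into an in-memory ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


def check_archive(zip_bytes: bytes) -> dict[str, list[str]]:
    """Report which expected CSV files are missing, present or unexpected.
    Assumption: files are stored at the root of the archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = set(archive.namelist())
    return {
        "missing": sorted(MANDATORY - names),
        "optional_present": sorted(OPTIONAL & names),
        "unexpected": sorted(names - MANDATORY - OPTIONAL),
    }


# Placeholder contents only; a real export would contain CSV rows.
incomplete = build_archive({"layer.csv": "", "resource.csv": "", "edge.csv": ""})
report = check_archive(incomplete)
print(report)
```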
Export from OMD into CSV and JSON
When the custom metadata are ready to be published into Manta Flow repository, you can just export the Connection from OMD into set of CSV files. Then the import into Manta Flow Repository can be done via Manta Admin UI in the same way it is done using plain Open Manta Extensions.
PRINCIPLES: Exporting CSV files
- When exporting Connection from Open Manta Designer, all Nodes from all Layers and Resources are processed along with all node attributes.
- Similarly, all edges defined on all available Canvases are exported as well, even if some Canvases are closed.
- By default, the CSV files generated from OMD do not contain headers, although this can be changed in application properties if you want to include headers. However, be sure that your settings in Manta Admin UI are set accordingly.
OMD also offers the possibility to export all data about a specific Connection in JSON format, including information about individual Canvases, their layout, node positions etc. Such exported data can be easily imported into OMD again later on. This can be useful if you want to share your Connection designs between different people and/or different computers or devices.
Advanced OMD features
Using Edge Wizard (a.k.a. Macroedges)
In situations where you want to connect two Parent Nodes having a large number of Child Nodes and the mapping of individual edges between them is rather straightforward (for example, if it is a simple duplication of two database tables, 1:1 loading of the contents of a File via ETL Transformation into a DB Table, etc.), you can use the Edge Wizard.
Instead of mechanically creating individual edges between 1st level leaf nodes, you can directly create an edge between given parent nodes. As soon as you create this edge, a dialog opens where you can define how individual edges should be created at the lowest level.
You can choose from the following strategies that determine how to map individual nodes:
- Mapping based on name matching (by name)
- Mapping based on name matching with wise mapping algorithm
- Mapping based on node order (by order)
- Mapping based on copying of nodes and creating individual edges (by copy)
In the case of name matching, the user can define various input parameters which influence the mapping algorithm. For example, it is possible to first remove prefixes/suffixes from the Source Node, the Target Node or both sides, and the user can define whether case sensitivity should be taken into account or special characters removed before the matching algorithm is used.
Similarly, in wise mapping, OMD tries to match the leaf nodes by name (taking all the input parameters into account as well). However, for Nodes that could not be matched, 1:N edges are created and attached to all the “unmatched” Leaf Nodes on the other side of the mapping.
Mapping by node order is simply creating edges between individual Leaf nodes based on the order within the parent Nodes.
Edge Wizard can be used even in scenarios where user needs to not only create individual edges, but even create the Leaf Nodes in the target. Mapping strategy “By Copy” can be used for this where user can enter additional parameters such as removing prefixes/suffixes from source nodes or adding prefixes/suffixes to target.
PRINCIPLES: Edge Wizard a.k.a. Macroedges
- Once the individual edges are created by the Edge Wizard, they behave like usual edges created manually, so they can be modified/removed as necessary or enhanced with additional Edge Attributes.
- The Edge Wizard takes into account all Child Nodes of the given Parent Nodes, not only the Nodes which are visible on the Canvas.
- Before the individual Edges are created, the user can see lists of both the matching Nodes found by the selected mapping algorithm and the unmatched nodes from both the source and the target. The lists can be filtered using RegEx or plain string match.
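The by name, wise and by order strategies can be illustrated with a short sketch. This is not the Edge Wizard's actual algorithm: all function and parameter names are hypothetical, only prefix stripping and case sensitivity are modelled, the wise mapping is read here as "every unmatched source leaf is connected to every unmatched target leaf", and truncating to the shorter list in by order mapping is an assumption.

```python
def normalise(name: str, prefix: str = "", case_sensitive: bool = False) -> str:
    """Strip an optional prefix and optionally fold case before matching."""
    if prefix and name.startswith(prefix):
        name = name[len(prefix):]
    return name if case_sensitive else name.lower()


def map_by_name(sources, targets, source_prefix="", target_prefix="", case_sensitive=False):
    """Return (matched pairs, unmatched sources, unmatched targets)."""
    index = {normalise(t, target_prefix, case_sensitive): t for t in targets}
    matched, unmatched_sources = [], []
    for source in sources:
        target = index.get(normalise(source, source_prefix, case_sensitive))
        if target is not None:
            matched.append((source, target))
        else:
            unmatched_sources.append(source)
    used = {target for _, target in matched}
    unmatched_targets = [t for t in targets if t not in used]
    return matched, unmatched_sources, unmatched_targets


def map_wise(sources, targets, **options):
    """Name matching first, then 1:N edges between the unmatched leaves."""
    matched, left_over, free_targets = map_by_name(sources, targets, **options)
    return matched + [(s, t) for s in left_over for t in free_targets]


def map_by_order(sources, targets):
    """Pair leaf nodes by their position within the parent Nodes."""
    return list(zip(sources, targets))


stage_columns = ["STG_ID", "STG_NAME", "STG_CREATED"]
target_columns = ["id", "name", "LOAD_TS"]

by_name = map_by_name(stage_columns, target_columns, source_prefix="STG_")
wise = map_wise(stage_columns, target_columns, source_prefix="STG_")
by_order = map_by_order(stage_columns, target_columns)
```

Returning the unmatched nodes alongside the matches mirrors the preview lists the wizard shows before any edges are created.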