Dataflow Analysis
Once you have selected the elements, tuned the visualization parameters, and clicked the Visualize button, or just clicked a link in your documentation, the initial dataflow graph is shown. This section introduces our visual representation of a data flow together with all the actions that can be performed on it, such as browsing, filtering, and exporting.
Introducing Data Flows
If you select your favorite table or file that is filled from a source system and then transformed into a target system, and you set a high level of detail, both directions, visualization of indirect flows, no filter, and a steps displayed parameter of three, you can get a dataflow graph similar to the following one.
You can see several elements, some of which are connected by arrows, and make the following observations.
- Some elements contain other elements, so they create hierarchies similar to those in the object catalog.
  - For example, you can see a database that contains a table containing several columns.
  - In some cases, not all elements in a hierarchy are shown to preserve clarity, such as none but the last directory in a path to a file. See note.
  - Levels of hierarchy are also distinguished by color: leaves are white, their parents are saturated colors, and other elements are pastel colors. See note.
- Some element hierarchies have the same color; others have different colors.
  - Colors distinguish between the various technologies present in the environment.
  - For example, files are light blue while Teradata tables and views are ochre. See note.
  - In some cases, one element hierarchy can have more than one color, since objects from one technology can logically contain objects from another technology; for example, a TPT script contains a Teradata SQL script, or an Oracle procedure contains its DDL body in its database schema.
- All elements have a label and an icon.
  - The label corresponds to the element's simple name; labels of imaginary elements often contain their location in their parent script or their order in their statement.
  - The icon corresponds to the element type, such as a database, table, column, file, script, or workflow.
- One or more elements are black. See note.
  - Black elements are the elements selected on the object catalog screen; all the elements shown either influence or are influenced by these elements.
- Some elements have a plus or minus sign.
  - An element that contains other elements can be collapsed or expanded so that you can see more or less detail for each element.
- Some elements are connected by arrows, either solid black or dashed blue. See note.
  - A solid black arrow represents a direct data flow from a source element to a target element.
  - A dashed blue arrow represents an indirect data flow from a source element to a target element.
  - Arrows connect only the most deeply nested elements.
- In some cases, wider variants of both solid black and dashed blue arrows connect elements.
  - These wider arrows signify that there are one or more paths of (in)direct data flows between the source and target elements, but the elements on those paths are not currently visible. This can happen when an element's technology is filtered out (filtered paths) or when an element is not within the maximum distance from the selected elements (aggregated paths).
  - To see which elements lie between the source and target elements, right-click the edge and select the Show hidden lineage option from the context menu.
- Some elements have a flow icon on the right.
  - This icon indicates that the data flow from the selected elements does not end in this element but continues to other elements that are not visible yet.
  - This happens when the steps displayed parameter is set lower than the maximum distance between the selected elements and some affected elements.
  - The icon is also present on elements that have both incoming and outgoing arrows when there are other arrows that have not been discovered yet.
  - You can discover all the data flows leading from or to this element by clicking on this icon.
Note: This is a default configuration that can be changed by your application administrator.
Starting from version R42.4, a more informative column description is displayed instead of column names (which are, in most cases, autogenerated numerical ordinal values) for Files under Filesystem assets. The column descriptions are also stored as an attribute of the column and usually come from the tool that reads or writes the file (most often a data integration tool). This substitution of a more user-friendly name is applied to the Object catalog, Lineage listing, and Data flow, as illustrated in the following screenshot. When a Manta Alias is specified for the column, it takes precedence and is displayed instead. However, note that these substituted names derived from the description attributes cannot be used as keywords for search purposes.

This substitution is not applied to any exports. In the exports, the description columns are exported the same way as regular attributes.
Browsing Data Flows
Once you know what all the objects on a dataflow graph represent, you can try interacting with them.
Moving and Zooming
When a dataflow graph is shown, it is possible that you will see only a few full elements and flows and that the others will be only partially visible. Now, you can do two things:

- Move the view to the other elements you want to see: hold the left mouse button somewhere on the white background outside the elements and arrows, and move the mouse. When you are happy with the position, or your mouse goes off the screen, just release the button.
- Zoom out to see more elements at once: either click on the magnifier icon with a minus sign in the lower-left part of the screen or roll the mouse wheel down. Similarly, to zoom in, either click on the magnifier icon with a plus sign or roll the mouse wheel up.
Selecting Elements and Flows
When you need to get more information about an element from a dataflow graph or when you need to see how an element is connected to others by direct data flows, click on it and you will see a change like in the following image.
First, you can see that the selected element turns yellow. Then, all the elements connected to this element by direct data flows (even transitively) change color together with all the affected arrows. The other arrows are now lighter so the colored ones stand out. This feature is really helpful when you select a whole table for dataflow analysis and then want to see column data flows separately. Blue represents data flow to the selected element, red represents data flow from the selected element, and violet represents data flow both to and from the selected element.
Moreover, the name of the element appears in the upper-left part of the screen under the search bar. You can click on the down arrow there to see other element attributes, as you can see in the following image. You can hide these attributes by clicking on the same icon, or you can select another element to see its attributes.
When you move the mouse to an arrow, it turns red to make it easily distinguishable from the others. When you click on it, all the elements and arrows on the dataflow paths going through this arrow turn red like when you select an element. You can see the selected arrow in the following image.
To deselect elements and/or arrows, just click on the white background.
Finding Elements
Sometimes, especially on larger dataflow graphs, you might want to find a visible element. It is easy. Just type its name into the search bar in the upper-left part of the screen. A list of possible elements will appear from which you can choose the desired element by name, full path, and type. Click on the desired element so that it is centered on your screen and the element is highlighted. If it is not what you were searching for, you can click on another element. When you are satisfied, just click on the white background to cancel the highlighting and hide the list.
Expanding and Collapsing Elements
In this example, you can see all elements at their highest level of detail. However, this might not be necessary for some less interesting elements. In such a case, you can collapse an element containing other elements by clicking on its minus sign. The elements it contains will no longer be visible, as you can see in the following illustration.
On the other hand, if you start your initial data flow at a medium level of detail, all the elements will look like those collapsed in the previous figure. In this case, you can expand the elements to see their children by clicking their plus sign.
Discover Further Data Flow
When the steps displayed parameter is set lower than the maximum distance between the selected elements and some influenced elements, some elements will have flow icons on the right as you can see in the following image.
That means that there are other elements that are transitively connected by data flow to the selected elements through the element with the icon. To see more elements on the path from the selected elements to those other connected elements, just click on the icon. The icon will disappear, and the new elements will appear, as in the following illustration.
It will look the same as if you had set the steps displayed parameter to four for only this particular path; the other paths, with elements and flows that are not important to your analysis, are not shown. By clicking on these icons, you can follow only those paths that are important.
When no icon is present on an element, you can be sure that it is connected only to elements that are already visible. (Only connections that start at the originally selected elements are considered.)
Restarting Data Flows
If you find an element that you want to do a dataflow analysis on, just right-click it and select Restart Visualization from the element options in the context menu. You will get an initial dataflow graph for this element with the same parameters as for the previous one.
Switching Data Flows to Another Layer
In the case of two or more layers, just one layer is active at a time and only the elements and flows belonging to the active layer are displayed. Typically, the active layer is selected on the object catalog page.
If an element maps or is mapped by another element in a different layer, it is possible to switch to that layer. Right-click the element and select Switch to ... Layer from the context menu, as shown in the following image.
After that, the visualization will restart from the mapping or mapped element in the selected layer.
Center the Camera on the Start Node
It is possible to center the camera on the start node by simply clicking on the button in the lower-left part of the visualization.
Filtering Data Flows
Manta Flow has two types of filters.
- Resource filters: auto-generated for every currently existing resource
- Custom filters: configured by administrators; the following filter unit types are supported:
  - By resource: hides elements of the listed resources
  - By name: hides elements with names matching the defined regular expression
  - By type: hides elements of the listed types
  - By attribute: hides elements where the given attribute has any of the listed values

Both types of filters can be bound to a particular layer that the filter is valid for. If not bound, the filter is valid for all layers.

- Resource filters: the layer binding is determined by the layer the resource belongs to
- Custom filters: the layer binding is configured by the administrators
Both of them can be grouped into filter groups, and each group may contain filters of both types. The groups are configured by the administrators. The layer validity of the filter group is determined by the layer validity of the contained filters. The filter group is valid for a layer if and only if at least one filter in that group is valid for that layer.
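The group-validity rule above (a group is valid for a layer if and only if at least one of its filters is valid for that layer) can be sketched in a few lines of Python. The class and attribute names here are illustrative only and do not reflect Manta's internal API.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Filter:
    """A filter optionally bound to one layer; layer=None means valid for all layers."""
    name: str
    layer: Optional[str] = None

    def is_valid_for(self, layer: str) -> bool:
        return self.layer is None or self.layer == layer

@dataclass
class FilterGroup:
    name: str
    filters: list = field(default_factory=list)

    def is_valid_for(self, layer: str) -> bool:
        # Valid for a layer iff at least one contained filter is valid for it.
        return any(f.is_valid_for(layer) for f in self.filters)

group = FilterGroup("DBs, Files & Reports", [
    Filter("Oracle resources", layer="Physical"),
    Filter("Hide temp tables"),  # unbound, so valid for all layers
])
print(group.is_valid_for("Physical"))  # True
print(group.is_valid_for("Logical"))   # True (the unbound filter applies)
```

A group containing only layer-bound filters would, by the same rule, be hidden on any other layer.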
You can select a filter group as a visualization parameter on the catalog screen by choosing the appropriate item from the Filters options. Only groups valid for the selected layer are displayed. If you select the /Oracle/orcl/infasuper schema and choose medium detail, both directions, no visualization of indirect flows, the DBs, Files & Reports filter, and a steps displayed parameter of one, you can get a dataflow graph similar to the following one.
Now you can see the direct data flow between the source elements and the selected elements. There are no elements from transformation technologies. The transformation elements have been replaced by wider arrows.
Note that the distance is counted without the filtered elements. That is why the displayed source elements fulfill the maximum distance of one, although other elements exist between the source element and the selected one.
There are several ways to modify the filtering on the visualization screen. You can display the filtered elements behind a particular arrow by right-clicking on that arrow and selecting the Show hidden lineage option. The arrow will be replaced by the hidden elements.
Other filtering options are available in the Options box in the upper-right part of the screen. Click on it to unpack it.
- Under the Resources tab, check the resources that you want to display and uncheck the ones that you want filtered out. Then confirm your selection using the Apply button.
- Under the Filters tab, check the custom filters that you want to apply and uncheck the ones that you want to turn off. Then confirm your selection using the Apply button.
In both cases, only filters valid for the active layer are displayed.
To filter by the technology (resource) of a particular element, right-click on that element and select Filter ... Technology.
To filter only one particular element (and all its descendants) manually, right-click on that element and select Hide element.
To cancel manual filtering and hidden lineage (see the Show hidden lineage command mentioned earlier), unpack the Options box in the upper right, go to the Filters tab, and click the Clean Manual Visibility button.
Undo Your Last Action
Each action that changes the visualized data flow (such as expand, discover, or filter) is recorded and can be reversed. In the default settings, the application stores 10 recorded states that can be browsed in both directions (undo/redo). This can be done using the buttons with the left and right arrows located in the lower-left part of the screen (see the following illustration) or by using the well-known keyboard shortcuts Ctrl+Z for Undo and Ctrl+Y for Redo. Beware that when you go back a few actions using Undo and then change the data flow (for example, using Expand), all Redo states will be forgotten, as is standard undo/redo behavior in most applications.
Viewing the Source Code
While browsing data flows, it can be interesting to see the original scripts that are represented by the flow. You can do this by choosing a node that represents a procedure or command inside a transformation element and selecting Show Context for This Element, as shown in the following illustration. You can also double-click on the transformation element.
This will open a pop-up window with the entire formatted script, and the statement represented by the clicked element will be highlighted, as shown in the following illustration.
To do a full-text search, use the shortcut Ctrl+F. Type the string you are searching for in the Search field and press Enter. The first occurrence of the string after the cursor will be focused, and all occurrences will be highlighted. The appropriate parts of the vertical scrollbar will also be highlighted, as shown in the following illustration.
Details of Nodes and Edges
You can view the details of nodes and edges by simply selecting them. When a node or an edge is selected, it is possible to expand its detail by clicking on the expand arrow located next to the element name in the upper-left part of the screen just under the search box, as shown in the following illustration.
The expanded dialog box contains detailed element information including its attributes. Strings that are too long have a link next to them that opens an additional dialog box showing the whole value. It is also possible to collapse the dialog box, returning it to its previous state. An example is shown in the following illustration.
You can find two special attributes in the edge detail.
- Selected edges: present when the selected edge is between non-leaf elements, such as tables; the value represents the number of represented edges between the leaves
- Filtered nodes: present when the selected edge is aggregated through several filtered nodes, for example, when you filter script resources and select an edge between two tables; the value represents the number of these filtered nodes
Bulk Settings for Level of Detail
It is possible to change the level of detail for all elements of a specific resource. This is done under the Detail tab in the Options dialog box as shown in the following image.
The GUI settings are listed in a table. The resource list is on the left side and the level of detail is on the right side. In the case of two or more layers, only resources belonging to the active layer are visible.
There are four options for each resource.
- L (Low detail): shows only databases, folders, directories, etc.
- M (Medium detail): also shows tables, procedures, transformations, scripts, etc.
- H (High detail): shows all elements, including columns, ports, and attributes
- X (Custom detail): preserves the element settings that have been manually selected by the user
Confirm the changes by clicking on Apply.
Graphically Comparing Two Revisions
Start the Revision Comparison
You can use Manta Visualization to view the differences between two selected revisions by doing the following.
- Select the main revision.
- Click on the Compare Revisions button.
- Select the older revision that it should be compared to.
- Select the starting element and visualization parameters as you would for a normal visualization.
- Display the visualization by clicking on the Visualize button.
Comparison Layout
The layout for revision comparison is based on the normal visualization with a few additions. The highlighting follows the standard style for showing differences used by diff tools of version control systems such as Git or SVN: new objects are green, and old ones are red.
Detailed description:

- The merge statement in the procedure IMPORT_CRM was only in the older revision.
- The merge statement in the procedure IMPORT_LOAN was only in the newer revision.
- The merge insert was updated in the new revision by removing two columns, SRC_ID and SHORT_NAME.
- The rest of the columns are the same in both revisions.
- The header shows the dates of the revisions being compared. You can see the details by hovering over them with the cursor.
It is possible to change the comparison color layout to the normal visualization in the color layer settings.
Exporting Data Flows
When you find the information you are looking for, there are three different ways you can export it using the Export menu in the upper-left part of the screen.
- Export all visible elements as a PNG image at 1:1 zoom. This export is ideal as an attachment to a change requirement specification document.
- Export the initial dataflow graph as a permalink. Note that only the initial dataflow graph is linked, not any changes you have made by expanding/collapsing, following, or filtering. This export is ideal as environment documentation because it is always up to date.
- Export all elements that influence or are influenced by the selected elements to a ZIP archive containing two CSV files (relations.csv and vertices.csv). Note that all affected elements are exported, not only the visible ones. This export is ideal for further processing. It is also possible to perform this export directly from the repository screen, without dataflow visualization, by clicking on the arrow on the Visualize button and selecting the Export option.
Structure of Exported CSV Files
The first file, relations.csv, describes all data flows (one row for each data flow). As an example, imagine that we have a direct data transfer from column t1c2 of table1 to column t2c1 of table2, and that both tables belong to a database called "db" under the database system Teradata.
Field | Example value | Description |
---|---|---|
TYPE | DIRECT | Either DIRECT or FILTER, depending on the type of data flow |
SourcePath | Teradata.db.table1.t1c2 | The full address of the source |
TargetPath | Teradata.db.table2.t2c1 | The full address of the target |
SourceColumnName | t1c2 | The name of the source column |
SourceColumnType | Column | |
TargetColumnName | t2c1 | The name of the target column |
TargetColumnType | Column | |
SourceObjectName | table1 | The name of the source object (table) |
SourceObjectType | Table | The object type is usually TABLE or VIEW |
TargetObjectName | table2 | The name of the target object (table) |
TargetObjectType | Table | The object type is usually TABLE or VIEW |
SourceGroupName | db | The name of the source group (database) |
SourceGroupType | Database | The group type is usually Database |
TargetGroupName | db | The name of the target group (database) |
TargetGroupType | Database | The group type is usually Database |
SourceResourceName | Teradata | Resource name of the source (Teradata, Oracle, etc.) |
SourceResourceType | Teradata | |
TargetResourceName | Teradata | Resource name of the target (Teradata, Oracle, etc.) |
TargetResourceType | Teradata | |
RevisionState | STABLE | State of the object between two compared revisions (NEW, DELETED, STABLE, INNER) |
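Because relations.csv is plain CSV inside the exported ZIP archive, it can be processed with standard tooling. Below is a minimal Python sketch, assuming the header names shown in the table above; the tiny in-memory archive stands in for a real export file.

```python
import csv
import io
import zipfile

def read_relations(zip_file):
    """Yield one dict per data flow from relations.csv in the export archive."""
    with zipfile.ZipFile(zip_file) as zf:
        with zf.open("relations.csv") as raw:
            yield from csv.DictReader(io.TextIOWrapper(raw, encoding="utf-8"))

# Build a tiny in-memory archive mimicking the export, then read it back.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("relations.csv",
                "TYPE,SourcePath,TargetPath\n"
                "DIRECT,Teradata.db.table1.t1c2,Teradata.db.table2.t2c1\n")
buf.seek(0)

# List direct flows as "source -> target".
for row in read_relations(buf):
    if row["TYPE"] == "DIRECT":
        print(f'{row["SourcePath"]} -> {row["TargetPath"]}')
# prints: Teradata.db.table1.t1c2 -> Teradata.db.table2.t2c1
```

With a real export you would pass the archive path instead of the in-memory buffer.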
The second file, vertices.csv, contains information about all the elements in the reference view (i.e., databases, tables, columns, etc.). If an element has attributes, the file contains one row for each attribute (the rows differ in their AttributeName and AttributeValue). If an element doesn't have any attributes, the file contains one row for that element (in this case, AttributeName and AttributeValue remain empty).

As an example, imagine that the reference view contains the column Teradata.db.table1.t1c1. Then the file contains records about this column, about the table Teradata.db.table1, about the database Teradata.db, and about the resource Teradata.
Field | Example — Column | Example — Table | Description |
---|---|---|---|
FullNodeName | Teradata.db.table1.t1c1 | Teradata.db.table1 | The full address of the node |
ColumnName | t1c1 | - | The name of the column |
ColumnType | Column | - | |
ObjectName | table1 | table1 | The name of the object |
ObjectType | Table | Table | The object type is usually TABLE or VIEW |
GroupName | db | db | The name of the group (database) |
GroupType | Database | Database | The group type is usually Database |
ResourceName | Teradata | Teradata | Teradata, Oracle, etc. |
ResourceType | Teradata | Teradata | Teradata, Oracle, etc. |
RevisionState | STABLE | STABLE | State of the object when comparing two revisions (NEW, DELETED, STABLE, INNER) |
AttributeName | datatype | - | The name of one of the attributes |
AttributeValue | integer | - | The value of this attribute |
NodeName | t1c1 | table1 | The name of the node |
NodeType | Column | Table | The type of the node |
The order of elements in the exported files is not exactly defined. For example, it is not guaranteed that the CreateView element will precede the Delete element (which deletes a record from the view), or vice versa.
The columns NodeName and NodeType were added because, in previous versions, not all objects in the file vertices.csv had an object type. This made it difficult for users to sort through the list of nodes exported from an Automatic Data Lineage visualization. With these two columns, it is easier to sort exported nodes by type; therefore, these two columns are always populated in each row.
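Since vertices.csv holds one row per attribute, a typical first step after the export is to fold the rows back into one record per node. The following Python sketch assumes the header names from the table above.

```python
def group_vertices(rows):
    """Fold the one-row-per-attribute layout of vertices.csv into one dict per node."""
    nodes = {}
    for row in rows:
        node = nodes.setdefault(row["FullNodeName"], {
            "NodeName": row["NodeName"],
            "NodeType": row["NodeType"],
            "attributes": {},
        })
        if row["AttributeName"]:  # empty when the node has no attributes
            node["attributes"][row["AttributeName"]] = row["AttributeValue"]
    return nodes

# Rows as they would come from csv.DictReader over vertices.csv:
rows = [
    {"FullNodeName": "Teradata.db.table1.t1c1", "NodeName": "t1c1",
     "NodeType": "Column", "AttributeName": "datatype", "AttributeValue": "integer"},
    {"FullNodeName": "Teradata.db.table1", "NodeName": "table1",
     "NodeType": "Table", "AttributeName": "", "AttributeValue": ""},
]
nodes = group_vertices(rows)
print(nodes["Teradata.db.table1.t1c1"]["attributes"])  # {'datatype': 'integer'}
```

Because NodeName and NodeType are populated in every row, they can safely seed the per-node record even for attribute rows.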
Indirect Flow Edge Categorization
Indirect flow edges contain the attribute EDGE_FILTER_TYPE, which describes the type of operation the condition comes from.

Possible EDGE_FILTER_TYPE Values Grouped into Three Logical Categories
- Represents a situation where the source affects the value of the target expression
  - Expression: expression-level indirect edges such as built-in functions (DECODE, SUBSTRING indices, etc.) and CASE expressions
  - AnalyticWindow: sources participating in the definition of the analytic window for ordered analytical functions
  - Index: array index; key or field for map, JSON, XML
- Affects the whole dataset (typically affects which rows will be present or in what order)
  - Where: limits the rows included in the target based on a condition
  - JoinCondition: defines how a join will be performed
  - GroupBy: row grouping for aggregate functions
  - Having: limits the aggregated rows included in the target
  - OrderBy: affects row order
  - Limit: limits the rows included in the target based on output row count or percentage
  - Pivot: pivoted columns
  - Transpose: transposition of a dataset
  - Distinct: removal of duplicate rows based on key columns
  - Temporal: a temporal condition
- Affects whether the given operation will be executed at all
  - If: a condition (if, case/switch) affecting whether the operation will be executed
  - Loop: a condition affecting loop execution (while, for, exit, continue)
  - TriggerCondition: a trigger condition
  - ExceptionHandling: connects inputs to exception handling operations
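When post-processing exported edges, the three categories above can be captured as a simple lookup table. The short category labels below are invented for this sketch; only the EDGE_FILTER_TYPE values themselves come from the list above.

```python
# Maps each EDGE_FILTER_TYPE value to a shorthand category label
# (labels are invented for this sketch, not Manta terminology).
EDGE_FILTER_CATEGORY = {
    # Affects the value of the target expression
    "Expression": "value", "AnalyticWindow": "value", "Index": "value",
    # Affects the whole dataset (row presence or order)
    "Where": "dataset", "JoinCondition": "dataset", "GroupBy": "dataset",
    "Having": "dataset", "OrderBy": "dataset", "Limit": "dataset",
    "Pivot": "dataset", "Transpose": "dataset", "Distinct": "dataset",
    "Temporal": "dataset",
    # Affects whether the operation executes at all
    "If": "execution", "Loop": "execution",
    "TriggerCondition": "execution", "ExceptionHandling": "execution",
}

def categorize(edge_filter_type):
    """Return the logical category for an EDGE_FILTER_TYPE value."""
    return EDGE_FILTER_CATEGORY.get(edge_filter_type, "unknown")

print(categorize("GroupBy"))           # dataset
print(categorize("TriggerCondition"))  # execution
```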
Transformation Logic
Purpose
This functionality generates transformation descriptions, showing exactly how the value for a target (e.g., column, routine return value, etc.) is computed. The descriptions are stored on transformation column nodes immediately upstream of the target. The description does not include conditions under which the target is affected by the transformation. (That kind of information is provided in the form of filter edges in the lineage graph.)
This results in three types of transformation descriptions.
- Text version for visualization
  - Used in visualization
  - Built without the context of other scripts, which means that user-defined routine definitions are not incorporated into the transformation description
- Tag with a transformation classification
  - Visible in the visualization
  - Generated when generating the text version
  - Possible values:
    - NON_TRANSFORMING: the value of one column is directly inserted into another column
    - TRANSFORMING: the data flow contains some kind of logic
- Object version for export
  - Used for exports to third-party applications
  - Merges user-defined routine definitions from other scripts into the final transformation description
Supported Source Systems
- Microsoft SQL Server
- Netezza
- Oracle
- PostgreSQL
- Snowflake
- Teradata
- Embedded SQL queries
  - DataStage
  - IPC
  - Tableau
  - SSIS (limited to queries in database source components and SQL Task)
Where the Transformation Description and Classification Are Shown
Text Version for Visualization
- Manta Server → Viewer → Visualize → Node Properties → Transformation Logic
- It is generated on nodes leading to a database persistent entity (like a column) and on output points of routines.
  - The transformation description is not directly on the target node so that one can distinguish between multiple writes to the same target.
- As the value is typically long, it is necessary to open the Attribute Value dialog box to see the whole thing.
- An example of the resulting transformation description:
  - A: Selected node
  - B: Tag with the transformation classification
  - C: Location of the text version of the transformation description
  - D: Value of the text version of the transformation description

Tag with the Transformation Classification

- Manta Server → Viewer → Visualize → Node Properties → Transformation Type
- It is generated on the same nodes as the text version.
Object Version for Export
- In Collibra DGC, the transformation logic is available as the Transformation Logic attribute of field mapping complex relations. It can be shown on the Mapping Specification asset overview page, on the overview pages of the source and target attributes (e.g., columns), and in data lineage diagrams.
- In Informatica EDC, the transformation logic is available as the Expression system attribute of attribute-level transformation assets (e.g., columns). It is shown on the overview page of the particular transformation asset.
- In IBM InfoSphere Information Governance Catalog, the transformation logic is available as the EXPRESSION attribute of attribute-level transformation assets (e.g., columns). It is shown on the overview page of the particular transformation asset.
How to Read Expressions
Each line of transformation description consists of:
TARGET := EXPRESSION_ELEMENT_1 EXPRESSION_ELEMENT_2 .. EXPRESSION_ELEMENT_n
TARGET is typically the column for which the transformation description is generated, or the definition of an EXPRESSION_ELEMENT from the line for which it is a sub-line.

EXPRESSION_ELEMENT is typically the qualified name of a source column, an operator, a keyword, or a variable defined in its sub-line.

This is followed by lines explaining the elements that can be drilled down into (like variables). These sub-lines are indented and are valid only for their direct parent line.
Details:
- Access to an item from an array is represented as "MY_ARR[]".
Exceptions:
- In the definition of the OUTPUT parameter of the called routine, the TARGET (left side) is missing.
- The CASE statement spans multiple lines for better readability.
Example 1: From an Oracle Import Script
PARTY.BUS_NAME := CASE
WHEN VAR_16 THEN VAR_17
ELSE VAR_19
END
  VAR_16 := LOAN_CUSTOMER.CUSTOMER_TYPE <> IMPORT_LOAN.C_LOAN_CUSTOMER_PERSON
    IMPORT_LOAN.C_LOAN_CUSTOMER_PERSON := 'P'
  VAR_17 := COALESCE(LOAN_CUSTOMER_CORPORATE.CORPORATE_NAME, COALESCE.2, IMPORT_LOAN.C_NA)
    COALESCE.2 := LOAN_CUSTOMER.CUSTOMER_NAME
    IMPORT_LOAN.C_NA := 'N/A'
  VAR_19 := IMPORT_LOAN.C_NA
    IMPORT_LOAN.C_NA := 'N/A'
Here is some information about each line of the text version of the transformation description in the preceding example.
1. The first line tells us that this transformation description is for the column BUS_NAME from the table PARTY and that its value is created by a CASE statement.
2. The conditional value for the CASE statement, which uses temporary variables (VAR_16 and VAR_17) generated only for this transformation description. They are used when they represent a sub-expression that is too long to inline or that would hurt the readability of the transformation description. They have the format VAR_xx (which can be parametrized in an XML file).
3. Part of the CASE statement, as on line 2.
4. End line of the CASE statement.
5. Sub-line of line 1 (visible by its indentation) that defines VAR_16.
6. Sub-line of line 5 that defines the literal value of the constant C_LOAN_CUSTOMER_PERSON.
7. Sub-line of line 1 that defines the value of VAR_17 as the result of calling the built-in function COALESCE. The first parameter is directly in this line, but the second and third parameters are defined in the following sub-lines.
8. Sub-line of line 7 that defines the value of the second parameter of the function COALESCE as the column CUSTOMER_NAME.
9. Sub-line of line 7 that defines the value of the third parameter of the function COALESCE as the literal 'N/A'.
10. Sub-line of line 1 that defines the value of VAR_19. Here the value IMPORT_LOAN.C_NA is not inlined because it is detected as a variable that can have a meaningful name, which helps the readability of the resulting transformation description.
11. Sub-line of line 10 that defines the value of IMPORT_LOAN.C_NA as the literal 'N/A'.
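If you need to process these descriptions programmatically, the indented TARGET := EXPRESSION line format can be parsed into a tree of definitions. The following Python sketch is purely illustrative, assuming consistent space indentation; it is not an official parser.

```python
def parse_description(text):
    """Parse 'TARGET := EXPR' lines into a tree using indentation depth.

    Returns a list of root nodes; each node is a dict with the line's
    target, expression, and its indented sub-lines as children.
    Lines without ' := ' (e.g. CASE branches) get an empty expression.
    """
    roots, stack = [], []  # stack holds (indent, node) of open parents
    for line in text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        target, _, expr = line.strip().partition(" := ")
        node = {"target": target, "expr": expr, "children": []}
        # Pop parents that are not shallower than the current line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        (stack[-1][1]["children"] if stack else roots).append(node)
        stack.append((indent, node))
    return roots

demo = """\
VAR_17 := COALESCE(A, COALESCE.2, IMPORT_LOAN.C_NA)
  COALESCE.2 := LOAN_CUSTOMER.CUSTOMER_NAME
  IMPORT_LOAN.C_NA := 'N/A'
"""
tree = parse_description(demo)
print([c["target"] for c in tree[0]["children"]])
# ['COALESCE.2', 'IMPORT_LOAN.C_NA']
```

A sub-line is attached to the nearest preceding line with smaller indentation, matching the "valid only for their direct parent line" rule above.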
Example 2: From an Oracle Function Definition Script
TEST_TL_FNC1.RETVAL := TEST_TL_FNC1.IN1 + 2
  DWH.TEST_TL_FNC1(TEST_TL_FNC1.IN1)
Here is some information about each line of the text version of the transformation description in the preceding example.
1. The first line tells us that this transformation description is for the return value of the user-defined function TEST_TL_FNC1 and that its value is computed as the value of the input parameter IN1 plus 2.
2. This line defines the input parameter inside the user-defined routine, since it is a variable and it could otherwise be unclear where it originates. (The same line format is used for sub-lines that define output parameters of routines, but it should be clear from context which type it is, as this line only occurs in the transformation description of the output point of the routine, so the routine name matches the qualified name of the final target.)
Limitations
- Some less common constructs are not supported yet.
- The level of support varies across technologies.
- The format of the generated transformation description may change in future releases.
- It is not guaranteed that new versions of Automatic Data Lineage will be able to read descriptions (object version) created by an older version; that is, these descriptions may be lost in old repository revisions after an update.
- Most conditions (such as WHERE clauses and IF blocks) are intentionally omitted from the descriptions because they would make them too long, which would work against the intended usage of this information. If they are needed, they can be displayed in Automatic Data Lineage using the Visualize Indirect Flows option.
- The command execution order is not considered. This means that for a SET1; USE; SET2; sequence of commands, the description for USE will show values from both SET1 and SET2.