Detailed Information About Connection Mappings
Connection Object
The connection used internally by IBM Manta Data Lineage, typically to reference a database, has the following attributes.
-
Type — specifies the target technology; for example, Oracle, Db2, or a generic ODBC or JDBC connection type
-
Connection_String — unique identification of the connection as available in the job that uses it, which can be a connection name (e.g., in the case of an ETL tool or ODBC connection), JDBC connection string, Oracle connect descriptor, etc.
-
Server_Name
-
Database_Name
-
Schema_Name
-
User_Name
The type
is one of the following. (This field is case-sensitive.)
-
Teradata
-
Oracle
-
MSSQL
-
Sybase
-
Netezza
-
DB2
-
PostgreSQL
-
MySQL
-
Snowflake
-
BigQuery
-
SAPHana
-
Hive
-
SSAS
-
ODBC
-
JDBC
-
Kafka
-
Unknown
Connections File
Many (ETL, reporting tool) scanners extract connection details as part of the extraction process done by Manta Data Lineage. For some technologies, it is necessary to define connections manually.
For example, actual database connections of
Modeling scanners (such as PowerDesigner, ER/Studio, or erwin) are not part of the model and must be defined to determine the mapping of the extracted physical model to the real physical model previously extracted from
another technology. This is done using a connections.ini
file. (See Resource Configuration for the respective scanner
Modeling.)
The following structure is used to define a single connection.
[{physical_model_name}]
Connection_String={connection_string}
Type={connection_type}
Server_Name={server_name}
Database_Name={database_name}
Schema_Name={schema_name}
User_Name={user_name}
If a field is not set, such a line entry should not be present.
The only mandatory fields are connection name (physical_model_name)
,
Connection_String
, and Type
. Connection_String
can either refer to a dictionary identifier (each defined connection to a technology must contain a unique dictionary identifier) or it can be interpreted as
a connection string to a specific technology (JDBC), which is then parsed and processed into different parts, depending on the technology.
Field processing is further described below.
Specific Mapping Algorithms
Teradata
-
The
Connection_String
is matched to theConnection ID
. -
Parts of the
Connection_String
are interpreted as a JDBC connection string and the relevant parts (the hostname and database) are matched to theHost name
andIncluded / Excluded databases
or, alternatively, the hostname is matched to theHost name
orConnection ID
orDictionary ID
. -
The
Server_Name
is interpreted as a JDBC connection string and the relevant parts (the hostname and database) are matched to theHost name
andIncluded / Excluded databases
or, alternatively, the hostname is matched to theHost name
orConnection ID
orDictionary ID
. -
The
Server_Name
is matched to theHost name
. -
The
Connection_String
is matched to theHost name
. -
The
Connection_String
is matched to theDictionary ID
.
Oracle
-
The
Connection_String
is matched to theConnection ID
. -
Parts of the
Connection_String
are interpreted as an Oracle Connect Descriptor and matched to theHost name
,Port
, andInstance name
. -
Parts of the
Connection_String
are interpreted as a JDBC connection string and matched up in one of the ways listed below.-
SID to the
Instance name
-
Service name to the
Instance name
-
Hostname to the
Host name
,Connection ID
,Instance name
,Global Database Name
, orDictionary ID
-
-
Parts of the
Connection_String
are interpreted as an Oracle Easy Connect String and the service name is matched to theInstance name
or, alternatively, the hostname is matched to theHost name
orConnection ID
. -
The
Connection_String
is matched to theInstance name
. -
The
Connection_String
is matched to theGlobal Database Name
. -
The
Connection_String
is matched to theDictionary ID
.
MS SQL
-
The
Connection_String
is matched to theConnection ID
. -
The
Server_Name
is matched to theConnection ID
. -
Parts of the
Connection_String
are interpreted as a JDBC connection string and matched to theServer name
,IP
,Port
, andDatabase instance name
or theConnection ID
orDictionary ID
as follows.-
Hostname / IP, port, instance name to the
Server name
,IP
,Port
, andDatabase instance name
-
"<Hostname>\<Instance name>"
to theConnection ID
orDictionary ID
-
Hostname to the
Connection ID
orDictionary ID
-
-
Parts of the
Server_Name
are interpreted as a JDBC connection string — the same as above. -
The
Connection_String
is matched to theServer name
,IP
,Port
, andDatabase instance name
. -
The
Server_Name
is matched to theServer name
,IP
,Port
, andDatabase instance name
. -
The
Connection_String
is matched to theDictionary ID
. -
The
Server_Name
is matched to theDictionary ID
.
Hive
-
The
Connection_String
is matched to theConnection ID
. -
Parts of the
Connection_String
are interpreted as a JDBC connection string and the relevant parts (the hostname and database) are matched to theHost name
andIncluded / Excluded databases
or, alternatively, the hostname is matched to theHost name
orConnection ID
orDictionary ID
. -
Parts of the
Server_Name
are interpreted as a JDBC connection string and the relevant parts (the hostname and database) are matched to theHost name
andIncluded / Excluded databases
or, alternatively, the hostname is matched to theHost name
orConnection ID
orDictionary ID
. -
The
Server_Name
is matched to theHost name
. -
The
Connection_String
is matched to theHost name
. -
The
Connection_String
is matched to theDictionary ID
.
Netezza
-
The
Connection_String
is matched to theConnection ID
. -
Parts of the
Connection_String
are interpreted as a JDBC connection string and the relevant parts (the hostname and port) are matched to theHost name
andPort
or, alternatively, the hostname is matched to theConnection ID
orHost name
orDictionary ID
. -
Parts of the
Server_Name
are interpreted as a JDBC connection string — the same as above. -
The
Connection_String
is matched to theHost name
. -
The
Server_Name
is matched to theHost name
. -
The
Connection_String
is matched to theDictionary ID
.
Db2
-
The
Connection_String
is matched to theConnection ID
. -
Parts of the
Connection_String
are interpreted as a JDBC connection string and the relevant parts (the hostname and database name) are matched to theHost Name
andDatabase name
or, alternatively, the hostname is matched to theConnection ID
or theHost name
andInstance name
or theDictionary ID
. -
Parts of the
Server_Name
are interpreted as a JDBC connection string — the same as above. -
The
Server_Name
,Database_Name
, andSchema_Name
are matched to theHost name
,Instance name
,Database name
, andIncluded / Excluded schemas
. -
The
Connection_String
,Database_Name
, andSchema_Name
are matched to theHost name
,Instance name
,Database name
, andIncluded / Excluded schemas
. -
The
Connection_String
is matched to theDictionary ID
.
PostgreSQL
-
The
Connection_String
is matched to theConnection ID
. -
Parts of the
Connection_String
are interpreted as a JDBC connection string and matched up in one of the ways listed below.-
Hostname and port to the
Hostname / Endpoint
andPort
-
Hostname and database name to the
Hostname / Endpoint
andIncluded / excluded databases
-
Hostname to the
Connection ID
-
Hostname to the
Hostname / Endpoint
-
Hostname to the
Dictionary ID
-
-
Parts of the
Server_Name
are interpreted as a JDBC connection string — the same as above. -
The
Server_Name
is matched to theHostname / Endpoint
andPort
. -
The
Connection_String
is matched to theHostname / Endpoint
. -
The
Connection_String
is matched to theDictionary ID
.
SAP HANA
-
The
Connection_String
is matched to theConnection ID
. -
Parts of the
Connection_String
are interpreted as a JDBC connection string and matched up in one of the ways listed below.-
Hostname and port to the
Hostname / Endpoint
andPort
-
Hostname and database name to the
Hostname / Endpoint
andIncluded / excluded databases
-
Hostname to the
Connection ID
-
Hostname to the
Hostname / Endpoint
-
Hostname to the
Dictionary ID
-
-
Parts of the
Server_Name
are interpreted as a JDBC connection string — the same as above. -
The
Server_Name
is matched to theHostname / Endpoint
andPort
. -
The
Connection_String
is matched to theHostname / Endpoint
. -
The
Connection_String
is matched to theDictionary ID
.
SSAS
-
The
Connection_String
is matched to theConnection ID
. -
The
Server_Name
is matched to theConnection ID
. -
Parts of the
Connection_String
are interpreted as a connection string and the data source part is matched to theData Source
andDatabase instance name
or, alternatively, the initial catalog is matched to theInitial Catalog
. -
Parts of the
Server_Name
are interpreted as a connection string — the same as above. -
The
Connection_String
is matched to theData Source
andDatabase instance name
. -
The
Server_Name
is matched to theData Source
andDatabase instance name
. -
The
Connection_String
is matched to theDictionary ID
. -
The
Server_Name
is matched to theDictionary ID
. -
The
Database_Name
is matched to theInitial Catalog
.
BigQuery
Default Hostname = https://www.googleapis.com/bigquery/v2
-
The
Connection_String
is matched to theConnection ID
. -
The
Database_Name
is matched to theConnection ID
. -
Parts of the
Connection_String
are interpreted as a JDBC connection string and the hostname or default hostname (if the hostname part isn’t found in the connection string) and the project ID are matched to theHost name
andIncluded/Excluded projects/datasets
. -
The
Server_Name
or default hostname (if theServer_Name
isn’t found in the connection object),Database_Name
, andSchema_Name
are matched to theHost name
andIncluded/Excluded projects/datasets
. -
The project ID part of the
Connection_String
is interpreted as a JDBC connection string and matched to theConnection ID
. -
The project ID part of the
Connection_String
is interpreted as a JDBC connection string and matched to theDictionary ID
. -
The
Connection_String
is matched to theDictionary ID
. -
The
Database_Name
is matched to theDictionary ID
.
Snowflake
In Snowflake, the server name is account_name.region
.
-
The
Connection_String
is matched to theConnection ID
. -
Parts of the
Connection_String
are interpreted as a JDBC connection string and matched to theAccount Name
,Region
, andIncluded / excluded databases
in one of the ways listed below.-
Account name, region, and database name to the
Account Name
,Region
, andIncluded / excluded databases
-
Account name and region to the
Account Name
andRegion
-
Account name to the
Account Name
-
-
The
Server_Name
is matched to theAccount Name
andRegion
. -
The
User_Name
is matched to theAccount Name
. -
The
Connection_String
is matched to theDictionary ID
.
Kafka
-
The
Connection_String
is matched to theConnection ID
. -
The
Connection_String
is matched to theBroker URL
.
MySQL
-
The
Connection_String
is matched to theConnection ID
. -
Parts of the
Connection_String
are interpreted as a JDBC connection string and matched up in one of the ways listed below.-
Hostname and port to the
Hostname / Endpoint
andPort
-
Hostname and database/schema name to the
Hostname / Endpoint
andIncluded / excluded databases/schemas
-
Hostname to the
Connection ID
-
Hostname to the
Hostname / Endpoint
-
Hostname to the
Dictionary ID
-
-
Parts of the
Server_Name
are interpreted as a JDBC connection string — the same as above. -
The
Server_Name
is matched to theHostname / Endpoint
andPort
. -
The
Connection_String
is matched to theHostname / Endpoint
. -
The
Connection_String
is matched to theDictionary ID
.