Detailed Information About Connection Mappings

Connection Object

The connection used internally by IBM Manta Data Lineage, typically to reference a database, has the following attributes.

The type is one of the following. (This field is case-sensitive.)

Connections File

Many scanners (for example, those for ETL and reporting tools) extract connection details as part of the extraction process performed by Manta Data Lineage. For some technologies, connections must be defined manually.
For example, the actual database connections of modeling scanners (such as PowerDesigner, ER/Studio, or erwin) are not part of the model and must be defined so that the extracted physical model can be mapped to the real physical model previously extracted from another technology. This is done using a connections.ini file. (See Resource Configuration for the respective modeling scanner.)

The following structure is used to define a single connection.

[{physical_model_name}]
Connection_String={connection_string}
Type={connection_type}
Server_Name={server_name}
Database_Name={database_name}
Schema_Name={schema_name}
User_Name={user_name}

If a field is not set, omit its line from the entry entirely.

The only mandatory fields are the connection name (physical_model_name), Connection_String, and Type. Connection_String can either refer to a dictionary identifier (each defined connection to a technology must have a unique dictionary identifier) or be interpreted as a connection string for a specific technology (JDBC), which is then parsed into parts that depend on the technology.
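As a rough illustration of the structure and the mandatory-field rule, the following Python sketch reads a connections.ini entry with the standard configparser module. The section name, the JDBC URL, and the Type value are placeholders chosen for the example, not confirmed values, and load_connections is a hypothetical helper rather than part of Manta Data Lineage.

```python
import configparser

# Hypothetical sample entry; all names and values are illustrative only.
SAMPLE = """\
[SalesModel]
Connection_String=jdbc:postgresql://db.example.com:5432/sales
Type=POSTGRESQL
Schema_Name=public
"""

# Field names that every entry must define (the section name is the third).
MANDATORY = ("Connection_String", "Type")


def load_connections(text):
    """Parse connections.ini text into {physical_model_name: {field: value}}."""
    parser = configparser.ConfigParser(interpolation=None)
    # configparser lowercases keys by default; keep the names as written.
    parser.optionxform = str
    parser.read_string(text)
    connections = {}
    for name in parser.sections():
        fields = dict(parser[name])
        missing = [key for key in MANDATORY if not fields.get(key)]
        if missing:
            raise ValueError(f"[{name}] is missing mandatory fields: {missing}")
        connections[name] = fields
    return connections


connections = load_connections(SAMPLE)
print(connections["SalesModel"]["Type"])
```

Fields that are not set (here Server_Name, Database_Name, and User_Name) are simply left out of the entry, so they do not appear in the parsed dictionary.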

Field processing is further described below.

Specific Mapping Algorithms

Teradata
  1. The Connection_String is matched to the Connection ID.

  2. Parts of the Connection_String are interpreted as a JDBC connection string and the relevant parts (the hostname and database) are matched to the Host name and Included / Excluded databases or, alternatively, the hostname is matched to the Host name or Connection ID or Dictionary ID.

  3. The Server_Name is interpreted as a JDBC connection string and the relevant parts (the hostname and database) are matched to the Host name and Included / Excluded databases or, alternatively, the hostname is matched to the Host name or Connection ID or Dictionary ID.

  4. The Server_Name is matched to the Host name.

  5. The Connection_String is matched to the Host name.

  6. The Connection_String is matched to the Dictionary ID.
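The Teradata steps above can be read as an ordered fallback chain. The sketch below assumes that the steps are tried in the listed order and that the first hit wins, which the text does not state explicitly. The scanned-connection dictionaries and their keys (connection_id, host_name, dictionary_id, databases) are hypothetical, the Included / Excluded databases setting is simplified to a plain list, and the URL shape jdbc:teradata://host/DATABASE=name is the common Teradata JDBC form.

```python
import re


def parse_teradata_jdbc(text):
    """Return (hostname, database) from a Teradata-style JDBC URL, else None."""
    match = re.match(r"jdbc:teradata://([^/]+)(?:/(.*))?$", text or "", re.IGNORECASE)
    if not match:
        return None
    database = None
    # URL parameters are comma-separated KEY=value pairs.
    for pair in (match.group(2) or "").split(","):
        key, _, value = pair.partition("=")
        if key.strip().upper() == "DATABASE":
            database = value.strip()
    return match.group(1), database


def match_teradata(conn, scanned):
    """Return (step_number, scanned_entry) for the first matching step, else None."""
    cs, server = conn.get("Connection_String"), conn.get("Server_Name")

    def by(key, value):
        # First scanned entry whose `key` equals a non-empty `value`.
        return next((s for s in scanned if value and s.get(key) == value), None)

    def by_jdbc(text):
        parsed = parse_teradata_jdbc(text)
        if not parsed:
            return None
        host, database = parsed
        for s in scanned:
            if s.get("host_name") == host and database in s.get("databases", ()):
                return s
        # Fall back to the hostname alone.
        return by("host_name", host) or by("connection_id", host) or by("dictionary_id", host)

    steps = [
        lambda: by("connection_id", cs),   # step 1
        lambda: by_jdbc(cs),               # step 2
        lambda: by_jdbc(server),           # step 3
        lambda: by("host_name", server),   # step 4
        lambda: by("host_name", cs),       # step 5
        lambda: by("dictionary_id", cs),   # step 6
    ]
    for number, step in enumerate(steps, start=1):
        hit = step()
        if hit is not None:
            return number, hit
    return None
```

Returning the step number alongside the matched entry makes it easier to see which rule produced a mapping when a connection resolves to an unexpected target.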

Oracle
  1. The Connection_String is matched to the Connection ID.

  2. Parts of the Connection_String are interpreted as an Oracle Connect Descriptor and matched to the Host name, Port, and Instance name.

  3. Parts of the Connection_String are interpreted as a JDBC connection string and matched up in one of the ways listed below.

    • SID to the Instance name

    • Service name to the Instance name

    • Hostname to the Host name, Connection ID, Instance name, Global Database Name, or Dictionary ID

  4. Parts of the Connection_String are interpreted as an Oracle Easy Connect String and the service name is matched to the Instance name or, alternatively, the hostname is matched to the Host name or Connection ID.

  5. The Connection_String is matched to the Instance name.

  6. The Connection_String is matched to the Global Database Name.

  7. The Connection_String is matched to the Dictionary ID.
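The Oracle steps distinguish three string formats before falling back to plain comparisons. The following sketch shows one way to tell them apart; parse_oracle and its result keys are hypothetical, the regular expressions cover only the common shapes of each format, and a JDBC URL that embeds a connect descriptor is not handled.

```python
import re


def _descriptor_value(text, key):
    """Return the value of (KEY=value) inside a connect descriptor, else None."""
    pattern = rf"\(\s*{re.escape(key)}\s*=\s*([^()\s]+)\s*\)"
    match = re.search(pattern, text, re.IGNORECASE)
    return match.group(1) if match else None


def parse_oracle(text):
    """Classify an Oracle connection string and extract its parts."""
    text = text.strip()
    # Connect descriptor: (DESCRIPTION=(ADDRESS=...)(CONNECT_DATA=...))
    if text.startswith("("):
        return {
            "format": "connect_descriptor",
            "host": _descriptor_value(text, "HOST"),
            "port": _descriptor_value(text, "PORT"),
            "instance": _descriptor_value(text, "SERVICE_NAME")
            or _descriptor_value(text, "SID"),
        }
    # JDBC thin URL: @host:port:SID or @//host:port/service_name
    jdbc = re.match(
        r"jdbc:oracle:thin:@(?://)?([^:/]+)(?::(\d+))?([:/])(.+)$", text, re.IGNORECASE
    )
    if jdbc:
        host, port, separator, name = jdbc.groups()
        kind = "sid" if separator == ":" else "service_name"
        return {"format": "jdbc", "host": host, "port": port, kind: name}
    # Easy Connect: host[:port]/service_name
    easy = re.match(r"(?://)?([^:/]+)(?::(\d+))?/([^:/?]+)$", text)
    if easy:
        host, port, service = easy.groups()
        return {"format": "easy_connect", "host": host, "port": port, "service_name": service}
    # Anything else is compared as-is (Instance name, Global Database Name, Dictionary ID).
    return {"format": "plain", "value": text}
```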

MS SQL
  1. The Connection_String is matched to the Connection ID.

  2. The Server_Name is matched to the Connection ID.

  3. Parts of the Connection_String are interpreted as a JDBC connection string and matched to the Server name, IP, Port, and Database instance name or the Connection ID or Dictionary ID as follows.

    • Hostname / IP, port, instance name to the Server name, IP, Port, and Database instance name

    • "<Hostname>\<Instance name>" to the Connection ID or Dictionary ID

    • Hostname to the Connection ID or Dictionary ID

  4. Parts of the Server_Name are interpreted as a JDBC connection string and matched in the same way as in step 3.

  5. The Connection_String is matched to the Server name, IP, Port, and Database instance name.

  6. The Server_Name is matched to the Server name, IP, Port, and Database instance name.

  7. The Connection_String is matched to the Dictionary ID.

  8. The Server_Name is matched to the Dictionary ID.
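To make the JDBC steps for MS SQL more concrete, the sketch below splits a URL of the usual form jdbc:sqlserver://host\instance:port;property=value into its parts and derives the two values that the bullets compare with a Connection ID or Dictionary ID. The function names and result keys are hypothetical, and IPv6 addresses are not handled.

```python
import re


def parse_sqlserver_jdbc(url):
    """Split a SQL Server JDBC URL into host, instance, port, and database."""
    match = re.match(r"jdbc:sqlserver://([^;]*)(.*)$", url, re.IGNORECASE)
    if not match:
        return None
    address, raw_props = match.groups()
    # Properties follow the address as ;name=value pairs.
    props = {}
    for pair in raw_props.split(";"):
        key, sep, value = pair.partition("=")
        if sep:
            props[key.strip().lower()] = value.strip()
    host_part, _, port = address.partition(":")
    host, _, instance = host_part.partition("\\")
    return {
        "host": host or props.get("servername"),
        "instance": instance or props.get("instancename"),
        "port": port or props.get("portnumber"),
        "database": props.get("databasename") or props.get("database"),
    }


def lookup_keys(parsed):
    """Values to compare with a Connection ID or Dictionary ID, most specific first."""
    keys = []
    if parsed["host"] and parsed["instance"]:
        keys.append(parsed["host"] + "\\" + parsed["instance"])
    if parsed["host"]:
        keys.append(parsed["host"])
    return keys
```

For a URL such as jdbc:sqlserver://sql01\PROD:1433;databaseName=Sales, lookup_keys yields sql01\PROD followed by sql01, mirroring the second and third bullets of step 3.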

Hive
  1. The Connection_String is matched to the Connection ID.

  2. Parts of the Connection_String are interpreted as a JDBC connection string and the relevant parts (the hostname and database) are matched to the Host name and Included / Excluded databases or, alternatively, the hostname is matched to the Host name or Connection ID or Dictionary ID.

  3. Parts of the Server_Name are interpreted as a JDBC connection string and the relevant parts (the hostname and database) are matched to the Host name and Included / Excluded databases or, alternatively, the hostname is matched to the Host name or Connection ID or Dictionary ID.

  4. The Server_Name is matched to the Host name.

  5. The Connection_String is matched to the Host name.

  6. The Connection_String is matched to the Dictionary ID.

Netezza
  1. The Connection_String is matched to the Connection ID.

  2. Parts of the Connection_String are interpreted as a JDBC connection string and the relevant parts (the hostname and port) are matched to the Host name and Port or, alternatively, the hostname is matched to the Connection ID or Host name or Dictionary ID.

  3. Parts of the Server_Name are interpreted as a JDBC connection string and matched in the same way as in step 2.

  4. The Connection_String is matched to the Host name.

  5. The Server_Name is matched to the Host name.

  6. The Connection_String is matched to the Dictionary ID.

Db2
  1. The Connection_String is matched to the Connection ID.

  2. Parts of the Connection_String are interpreted as a JDBC connection string and the relevant parts (the hostname and database name) are matched to the Host name and Database name or, alternatively, the hostname is matched to the Connection ID or the Host name and Instance name or the Dictionary ID.

  3. Parts of the Server_Name are interpreted as a JDBC connection string and matched in the same way as in step 2.

  4. The Server_Name, Database_Name, and Schema_Name are matched to the Host name, Instance name, Database name, and Included / Excluded schemas.

  5. The Connection_String, Database_Name, and Schema_Name are matched to the Host name, Instance name, Database name, and Included / Excluded schemas.

  6. The Connection_String is matched to the Dictionary ID.

PostgreSQL
  1. The Connection_String is matched to the Connection ID.

  2. Parts of the Connection_String are interpreted as a JDBC connection string and matched up in one of the ways listed below.

    • Hostname and port to the Hostname / Endpoint and Port

    • Hostname and database name to the Hostname / Endpoint and Included / excluded databases

    • Hostname to the Connection ID

    • Hostname to the Hostname / Endpoint

    • Hostname to the Dictionary ID

  3. Parts of the Server_Name are interpreted as a JDBC connection string and matched in the same way as in step 2.

  4. The Server_Name is matched to the Hostname / Endpoint and Port.

  5. The Connection_String is matched to the Hostname / Endpoint.

  6. The Connection_String is matched to the Dictionary ID.
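The JDBC step for PostgreSQL can be sketched with the standard urllib.parse module, because the URL has the regular form jdbc:postgresql://host:port/database. The helper names are hypothetical, and the order in which the comparisons are produced simply follows the bullet order above, which is an assumption about precedence.

```python
from urllib.parse import urlsplit


def parse_postgresql_jdbc(url):
    """Return host, port, and database from a PostgreSQL JDBC URL, else None."""
    if not url.lower().startswith("jdbc:postgresql://"):
        return None
    # Drop the "jdbc:" prefix so the remainder parses like an ordinary URL.
    parts = urlsplit(url[len("jdbc:"):])
    return {
        "host": parts.hostname,  # urlsplit lowercases the hostname
        "port": parts.port,
        "database": parts.path.lstrip("/") or None,
    }


def match_pairs(parsed):
    """List (scanned fields, values) comparisons in the order of the bullets."""
    host, port, database = parsed["host"], parsed["port"], parsed["database"]
    pairs = []
    if port is not None:
        pairs.append((("Hostname / Endpoint", "Port"), (host, port)))
    if database:
        pairs.append((("Hostname / Endpoint", "Included / excluded databases"), (host, database)))
    pairs.append((("Connection ID",), (host,)))
    pairs.append((("Hostname / Endpoint",), (host,)))
    pairs.append((("Dictionary ID",), (host,)))
    return pairs
```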

SAP HANA
  1. The Connection_String is matched to the Connection ID.

  2. Parts of the Connection_String are interpreted as a JDBC connection string and matched up in one of the ways listed below.

    • Hostname and port to the Hostname / Endpoint and Port

    • Hostname and database name to the Hostname / Endpoint and Included / excluded databases

    • Hostname to the Connection ID

    • Hostname to the Hostname / Endpoint

    • Hostname to the Dictionary ID

  3. Parts of the Server_Name are interpreted as a JDBC connection string and matched in the same way as in step 2.

  4. The Server_Name is matched to the Hostname / Endpoint and Port.

  5. The Connection_String is matched to the Hostname / Endpoint.

  6. The Connection_String is matched to the Dictionary ID.

SSAS
  1. The Connection_String is matched to the Connection ID.

  2. The Server_Name is matched to the Connection ID.

  3. Parts of the Connection_String are interpreted as a connection string and the data source part is matched to the Data Source and Database instance name or, alternatively, the initial catalog is matched to the Initial Catalog.

  4. Parts of the Server_Name are interpreted as a connection string and matched in the same way as in step 3.

  5. The Connection_String is matched to the Data Source and Database instance name.

  6. The Server_Name is matched to the Data Source and Database instance name.

  7. The Connection_String is matched to the Dictionary ID.

  8. The Server_Name is matched to the Dictionary ID.

  9. The Database_Name is matched to the Initial Catalog.
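SSAS connection strings are semicolon-separated name=value pairs rather than JDBC URLs. The sketch below extracts the Data Source and Initial Catalog properties. Splitting the data source on a backslash to obtain the instance name is an assumption made for this example, and parse_ssas_connection_string is a hypothetical helper.

```python
def parse_ssas_connection_string(text):
    """Extract data source, instance, and initial catalog, else None."""
    props = {}
    for pair in text.split(";"):
        key, sep, value = pair.partition("=")
        if sep:
            # Property names are compared case-insensitively.
            props[key.strip().lower()] = value.strip()
    data_source = props.get("data source")
    if data_source is None:
        return None  # not a connection string; fall through to the later steps
    # Assumption: a named instance is written as server\instance.
    server, _, instance = data_source.partition("\\")
    return {
        "data_source": server,
        "instance": instance or None,
        "initial_catalog": props.get("initial catalog"),
    }
```

A value without a Data Source property returns None, which corresponds to falling through to the plain comparisons in steps 5 to 9.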

BigQuery

Default Hostname = https://www.googleapis.com/bigquery/v2

  1. The Connection_String is matched to the Connection ID.

  2. The Database_Name is matched to the Connection ID.

  3. Parts of the Connection_String are interpreted as a JDBC connection string and the hostname or default hostname (if the hostname part isn’t found in the connection string) and the project ID are matched to the Host name and Included/Excluded projects/datasets.

  4. The Server_Name or default hostname (if the Server_Name isn’t found in the connection object), Database_Name, and Schema_Name are matched to the Host name and Included/Excluded projects/datasets.

  5. The project ID part of the Connection_String is interpreted as a JDBC connection string and matched to the Connection ID.

  6. The project ID part of the Connection_String is interpreted as a JDBC connection string and matched to the Dictionary ID.

  7. The Connection_String is matched to the Dictionary ID.

  8. The Database_Name is matched to the Dictionary ID.
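The default-hostname fallback in steps 3 and 4 can be sketched as follows. The URL layout jdbc:bigquery://host:port;ProjectId=... and the ProjectId property name follow the commonly used BigQuery JDBC driver and are assumptions here, and both helper functions are hypothetical.

```python
import re

DEFAULT_HOSTNAME = "https://www.googleapis.com/bigquery/v2"


def parse_bigquery_jdbc(url):
    """Return host and project ID from a BigQuery JDBC URL, else None."""
    match = re.match(r"jdbc:bigquery://([^;]*)(.*)$", url, re.IGNORECASE)
    if not match:
        return None
    address, raw_props = match.groups()
    props = {}
    for pair in raw_props.split(";"):
        key, sep, value = pair.partition("=")
        if sep:
            props[key.strip().lower()] = value.strip()
    # Drop a trailing :port, then fall back to the default if nothing is left.
    hostname = re.sub(r":\d+$", "", address.strip())
    return {"host": hostname or DEFAULT_HOSTNAME, "project_id": props.get("projectid")}


def effective_host(conn):
    """Step 4: use Server_Name when present, otherwise the default hostname."""
    return conn.get("Server_Name") or DEFAULT_HOSTNAME
```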

Snowflake

In Snowflake, the server name is account_name.region.

  1. The Connection_String is matched to the Connection ID.

  2. Parts of the Connection_String are interpreted as a JDBC connection string and matched to the Account Name, Region, and Included / excluded databases in one of the ways listed below.

    • Account name, region, and database name to the Account Name, Region, and Included / excluded databases

    • Account name and region to the Account Name and Region

    • Account name to the Account Name

  3. The Server_Name is matched to the Account Name and Region.

  4. The User_Name is matched to the Account Name.

  5. The Connection_String is matched to the Dictionary ID.
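Because the Snowflake server name has the form account_name.region, both the JDBC step and the Server_Name step reduce to splitting that value. The sketch below assumes the common URL shape jdbc:snowflake://account.region.snowflakecomputing.com/?db=NAME; the helper names and the db query parameter as the source of the database name are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlsplit

SUFFIX = ".snowflakecomputing.com"


def split_server_name(server_name):
    """Split account_name.region into (account, region); region may be None."""
    account, _, region = server_name.partition(".")
    return account, region or None


def parse_snowflake_jdbc(url):
    """Return account, region, and database from a Snowflake JDBC URL, else None."""
    if not url.lower().startswith("jdbc:snowflake://"):
        return None
    parts = urlsplit(url[len("jdbc:"):])
    host = parts.hostname or ""  # urlsplit lowercases the hostname
    if host.endswith(SUFFIX):
        host = host[: -len(SUFFIX)]
    account, region = split_server_name(host)
    database = parse_qs(parts.query).get("db", [None])[0]
    return {"account": account, "region": region, "database": database}
```

The three bullets of step 2 then correspond to comparing progressively fewer of the returned values: all three, then account and region, then the account alone.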

Kafka
  1. The Connection_String is matched to the Connection ID.

  2. The Connection_String is matched to the Broker URL.

MySQL
  1. The Connection_String is matched to the Connection ID.

  2. Parts of the Connection_String are interpreted as a JDBC connection string and matched up in one of the ways listed below.

    • Hostname and port to the Hostname / Endpoint and Port

    • Hostname and database/schema name to the Hostname / Endpoint and Included / excluded databases/schemas

    • Hostname to the Connection ID

    • Hostname to the Hostname / Endpoint

    • Hostname to the Dictionary ID

  3. Parts of the Server_Name are interpreted as a JDBC connection string and matched in the same way as in step 2.

  4. The Server_Name is matched to the Hostname / Endpoint and Port.

  5. The Connection_String is matched to the Hostname / Endpoint.

  6. The Connection_String is matched to the Dictionary ID.