Matillion Scanner Guide
Matillion is an ETL/ELT tool for data loading and transformation on cloud database platforms such as Snowflake, Amazon Redshift, Google BigQuery, Microsoft Azure Synapse, and Delta Lake. Manta is a data lineage platform that supports the automated extraction of Matillion ETL/ELT.
Follow these steps to configure a connection to Matillion.
Step 1: Create a Certificate for the Connection
-
In the Admin UI, navigate to Configuration → CLI → Common → Common Config.
-
Select the blue Edit button in the upper-right corner of the page, and navigate to Manta Flow CLI System Connectors Settings.
-
Select the blue Edit button beside Manta Flow CLI System Connectors Settings.
-
To add the certificate, do the following: Add entry → Provide a link from the page → enter the URL of your Matillion instance (see the example in the screenshot below) → Load → click the checkmark → Confirm → Close → select the blue Save button in the upper-right corner of the page.
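Before loading the certificate in the Admin UI, it can be useful to see which certificate the Matillion server actually presents. The following is a minimal sketch using only the Python standard library; it is independent of Manta, and the function names and the example URL are illustrative assumptions.

```python
import ssl
from typing import Tuple
from urllib.parse import urlsplit


def host_and_port(url: str) -> Tuple[str, int]:
    """Split an instance URL into (host, port).

    Falls back to 443 for HTTPS and 80 for HTTP when no port is given.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not an http(s) URL: {url!r}")
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def fetch_certificate_pem(url: str) -> str:
    """Return the PEM-encoded certificate the server presents.

    No validation is performed here; this only retrieves the certificate
    so that it can be inspected.
    """
    return ssl.get_server_certificate(host_and_port(url))


# Example call (placeholder URL, requires network access):
# print(fetch_certificate_pem("https://matillion.example.com"))
```

The printed PEM text can be compared with the certificate that the Admin UI loads for the same URL.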
Step 2: Configure the Connection
Create a new connection in the Admin UI at http(s)://”hostname”:”port”/manta-admin-gui/app/index.html?#/platform/connections/ to enable automated extraction of Matillion by Manta. The connection requirements are listed in Matillion Integration Requirements.
Properties That Must Be Configured
-
Connection name — User-defined field to identify the connection
-
Matillion instance platform type — Cloud data warehouse type of the Matillion instance, as supported by Manta (for example, Snowflake)
-
Matillion instance name — User-defined field to identify the Matillion instance. The name will be used in the viewer and might impact the paths of extracted and input files for this connection. Do not use URLs or dots, as they are not allowed in this name.
-
Matillion instance address — Server host name of the Matillion instance. This is the URL used to log in to the Matillion instance.
-
Matillion instance schema — HTTP or HTTPS, depending on your instance. A certificate must be configured only when HTTPS is selected.
-
Matillion instance port — Port of the Matillion server
-
Matillion instance username — Username for the connection to the Matillion server
-
Matillion instance password — Password for the connection to the Matillion server
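The constraints on these properties can be checked before the values are entered. The sketch below is only an illustration: `MatillionConnection` and `validate` are hypothetical names, not part of Manta, and the rule used to detect a URL in the instance name (a `/` character) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class MatillionConnection:
    """Hypothetical holder for the required properties (not a Manta class)."""
    connection_name: str
    platform_type: str
    instance_name: str
    address: str
    scheme: str
    port: int
    username: str
    password: str


def validate(conn: MatillionConnection) -> list:
    """Return a list of problems; an empty list means the values look plausible."""
    problems = []
    # Every required property must be filled in.
    for field, value in vars(conn).items():
        if value in ("", None):
            problems.append(f"{field} is required")
    # The instance name must not contain dots or URLs (assumed: no '/').
    if "." in conn.instance_name or "/" in conn.instance_name:
        problems.append("instance_name must not contain dots or URLs")
    if conn.scheme.lower() not in ("http", "https"):
        problems.append("scheme must be HTTP or HTTPS")
    if not 1 <= conn.port <= 65535:
        problems.append("port must be between 1 and 65535")
    return problems
```

For example, an instance name such as `matillion_prod` passes, while `matillion.example.com` is reported as a problem because it contains dots.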
Optional Fields
-
Jobs to be extracted — Comma-separated list of regular expressions describing the jobs that should be extracted. The full path of the job includes the names of the parent project group, the parent project and its version, and the job itself. Example:
group_name/project_name/.*/^transformation.+,.project_name/default/\D*
-
Jobs not to be extracted — Comma-separated list of regular expressions describing the jobs that should not be extracted. The full path of the job includes the names of the parent project group, the parent project and its version, and the job itself. Example:
group_name/project_name/.*/^transformation.+,.project_name/default/\D*
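To preview which jobs a pair of include and exclude lists would select, the filters can be tried out offline. This is a rough sketch under stated assumptions: how Manta evaluates the expressions internally is not described here, so the code assumes that each comma-separated expression must match the entire job path, that an empty include list means "all jobs", and that exclusions are applied after inclusions. The job paths and function names are made up for illustration.

```python
import re


def split_patterns(value: str) -> list:
    """Compile a comma-separated list of regular expressions.

    Note: an expression that itself contains a comma (such as {1,3})
    would be split incorrectly by this simple approach.
    """
    return [re.compile(p.strip()) for p in value.split(",") if p.strip()]


def select_jobs(job_paths, include: str = "", exclude: str = "") -> list:
    """Keep paths that match an include pattern and no exclude pattern."""
    inc = split_patterns(include)
    exc = split_patterns(exclude)
    selected = []
    for path in job_paths:
        # Assumption: an empty include list selects every job.
        if inc and not any(p.fullmatch(path) for p in inc):
            continue
        if any(p.fullmatch(path) for p in exc):
            continue
        selected.append(path)
    return selected


# Hypothetical full paths: group/project/version/job
jobs = [
    "analytics/sales/default/transformation_orders",
    "analytics/sales/default/orchestration_main",
    "analytics/hr/v2/transformation_people",
]
```

With these paths, `select_jobs(jobs, include=r"analytics/sales/.*/transformation.+")` keeps only the first job, and `select_jobs(jobs, exclude=r".*/orchestration.*")` keeps the first and third.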
Step 3: Advanced Configuration Settings
-
Navigate to the Advanced tab on the same screen as the connection configuration, and under the overridden common properties, set Verify Certification Hostname to True. If the hostname that you want to connect to does not match the hostname in the certificate, set this property to False.
-
Select Save to validate the connection.
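The effect of hostname verification can be illustrated with Python's standard `ssl` module. This is an analogy rather than Manta's implementation: `build_tls_context` is a hypothetical helper, and its `verify_hostname` flag only mirrors the idea behind the Verify Certification Hostname property.

```python
import ssl


def build_tls_context(verify_hostname: bool = True, ca_file=None) -> ssl.SSLContext:
    """Create a client TLS context that always validates the certificate chain.

    verify_hostname=True also requires the certificate to be issued for the
    hostname being connected to. Setting it to False skips only that check;
    the certificate itself must still be trusted.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    ctx.check_hostname = verify_hostname
    return ctx
```

Turning the check off is a workaround for a certificate whose hostname does not match the address in use; keeping it on is the safer default.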