Matillion Resource Configuration
Before you configure your scanner, make sure you meet the prerequisites. Read our guide on Matillion integration requirements to double-check.
Source System Properties
This configuration can be set up by creating a new connection on the Admin UI > Connections tab or by editing an existing connection in Admin UI / Connections / Data Integration Tools / Matillion / specific connection. A new connection can also be created via the Manta Orchestration API.
One IBM Automatic Data Lineage connection for Matillion corresponds to one Matillion server that will be analyzed.
| Property name | Description | Example(s) |
|---|---|---|
| | An arbitrary name that identifies the Matillion instance for you. The name will be used in the viewer and might affect the paths of extracted and input files for the Automatic Data Lineage connection. Do not use URLs; dots are not allowed in the name. | snowflake_instance |
| | Matillion instance cloud data warehouse type. | Snowflake |
| | (Optional) A list of regular expressions, separated by commas, which describe jobs that should be extracted. The full path to the job includes the names of the parent project group, the parent project and its version, and the job itself. Leave blank to extract all jobs that don’t match the | |
| | (Optional) A list of regular expressions, separated by commas, which describe jobs that should not be extracted. The full path to the job includes the names of the parent project group, the parent project and its version, and the job itself. Leave blank to extract all jobs that match the | |
| | (Optional) A list of regular expressions, separated by commas, which describe environments that should be extracted. The full path to the environment includes the names of the parent project group, the parent project and its version, and the environment itself. Leave blank to extract all environments that do not match the | |
| | (Optional) A list of regular expressions, separated by commas, which describe environments that should not be extracted. The full path to the environment includes the names of the parent project group, the parent project and its version, and the environment itself. Leave blank to extract all environments that match the | |
| | Matillion server address used for the connection | matillion.getmanta.com, 192.168.0.16 |
| | Matillion server scheme type used for the connection | HTTP, HTTPS |
| | Matillion server port used for the connection | 443 |
| | Username for the connection to the Matillion server | guest |
| | Password for the connection to the Matillion server | password |
| matillion.extraction.method | Set to Agent:default to use the default Manta Extractor Agent, to Agent:{remote_agent_name} to use a remote Agent, or to Git:{git.dictionary.id} to use the Git ingest method. For more information on setting up a remote extractor Agent, refer to the Manta Flow Agent Configuration for Extraction documentation. For details on configuring the Git ingest method, refer to the Manta Flow Agent Configuration for Extraction:Git Source documentation. | default Git agent |
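The include and exclude filters in the table above can be pictured with a small sketch. It is not the scanner's implementation: the function names, the `/` separator between path segments, the use of full-string matching, and the rule that an exclude match takes precedence are all assumptions made for illustration.

```python
import re


def split_patterns(value: str) -> list[re.Pattern]:
    """Split a comma-separated property value into compiled regexes.

    Assumption: a naive split on commas, so a regex that itself contains
    a comma (for example a{1,2}) would not survive this sketch.
    """
    return [re.compile(p.strip()) for p in value.split(",") if p.strip()]


def is_extracted(path: str, include: str = "", exclude: str = "") -> bool:
    """Decide whether a job or environment path passes the filters.

    Assumptions: patterns must match the whole path, and an exclude
    match wins over an include match.
    """
    includes = split_patterns(include)
    excludes = split_patterns(exclude)
    if any(p.fullmatch(path) for p in excludes):
        return False
    # A blank include list means "everything that was not excluded".
    return not includes or any(p.fullmatch(path) for p in includes)


# Hypothetical paths: project group / project / version / job.
print(is_extracted("Sales/ETL/default/load_orders", include="Sales/.*"))
print(is_extracted("Sales/ETL/default/tmp_job", include="Sales/.*", exclude=".*/tmp_.*"))
```

Testing patterns against a handful of known paths in this way before saving the connection can help catch a filter that is broader or narrower than intended.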
Common Scanner Properties
This configuration is common to all Matillion source systems and all Matillion scenarios, and is configured in Admin UI / Configuration / CLI / Matillion / Matillion Common. It can be overridden at the individual connection level.
| Property name | Description | Example(s) |
|---|---|---|
| | Path to the directory with manual input files extracted from the Matillion instance | ${manta.dir.input}/matillion/${matillion.extractor.server} |
| | Path to directory with extracted output files | ${manta.dir.temp}/matillion/${matillion.extractor.server} |
| | Path to the optional TXT file with configurations of user-defined shared job usage | ${manta.dir.input}/matillion/${matillion.extractor.server}/sharedJobsConfig.txt |
| | When using HTTPS, whether the hostname of the server's certificate should be validated to match the hostname of the server | true |
| | Whether paths to files should be lowercased (false for case-sensitive file systems, true otherwise) | true |
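The example values above use `${...}` placeholders. The following sketch shows one way such a template could expand into a concrete path. It is an illustration only: the `resolve` helper, the property values, and applying the lowercase option to the whole resolved path are assumptions, not a description of how the scanner resolves properties.

```python
import re

# Matches ${property.name} placeholders.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve(template: str, props: dict[str, str], lowercase: bool = False) -> str:
    """Expand ${...} placeholders from props; optionally lowercase the result."""

    def lookup(match: re.Match) -> str:
        key = match.group(1)
        if key not in props:
            raise KeyError(f"undefined property: {key}")
        return props[key]

    path = PLACEHOLDER.sub(lookup, template)
    return path.lower() if lowercase else path


# Hypothetical values chosen for illustration only.
props = {
    "manta.dir.input": "/opt/manta/input",
    "matillion.extractor.server": "Snowflake_Instance",
}
template = "${manta.dir.input}/matillion/${matillion.extractor.server}/sharedJobsConfig.txt"
print(resolve(template, props))
print(resolve(template, props, lowercase=True))
```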
User-Defined Shared Jobs Configuration File
The file is optional and is expected in the path defined by the common property matillion.shared.jobs.config.file. This file must be created if user-defined shared jobs were used in any of the analyzed orchestration jobs.
[<shared.job.package.name>/<shared_job_name>/<shared_job_revision_number>]
<project_group_name>/<project_name>/<project_version_name>/<orchestration_job_name>/<shared_job_component_name>
<shared.job.package.name>/<shared_job_name>/<shared_job_revision_number>/<orchestration_job_name>/<nested_shared_job_component_name>
[...]
...
The scope definition (in square brackets) represents the full path to a particular user-defined shared job. It must be followed by one or more record entries which represent the full path to the component running the shared job declared in the scope, as shown above. Note that the usage of nested shared jobs also needs to be configured (as shown on line 3).
Use the correct file format. Spaces and tabs are allowed, but each scope definition and record entry must be on a separate line.
UNKNOWN_COMPONENTS_WERE_FOUND
keyword. If the prepared record entry does not represent a user-defined shared job, delete the entry.
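A quick way to sanity-check a shared jobs configuration file before running the scanner is to parse it into scopes and record entries. The sketch below follows the format described above (a scope definition in square brackets followed by one or more record entries, each on its own line). The function name, the sample names, and the decision to treat a scope without records as an error are assumptions made for illustration; the scanner's own parser may behave differently.

```python
def parse_shared_jobs_config(text: str) -> dict[str, list[str]]:
    """Map each scope definition to the record entries that follow it."""
    scopes: dict[str, list[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()  # surrounding spaces and tabs are ignored
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()
            scopes.setdefault(current, [])
        elif current is None:
            raise ValueError(f"record entry before any scope definition: {line!r}")
        else:
            scopes[current].append(line)
    empty = [scope for scope, records in scopes.items() if not records]
    if empty:
        # Assumption: every scope needs at least one record entry.
        raise ValueError(f"scope definitions without record entries: {empty}")
    return scopes


# Hypothetical package, project, and component names.
sample = """
[com.example/LoadDim/1]
    Group A/Project X/default/Nightly Load/Run LoadDim
    com.example/Wrapper/2/Wrapper Job/Run LoadDim
"""
for scope, records in parse_shared_jobs_config(sample).items():
    print(scope, records)
```

The second record in the sample mirrors the nested case: it points at a component inside another shared job rather than inside a project's orchestration job.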