Matillion Resource Configuration

Before you configure your scanner, make sure you meet the prerequisites. Read our guide on Matillion integration requirements to double-check.

Source System Properties

This configuration can be set up by creating a new connection in Admin UI / Connections or by editing an existing connection in Admin UI / Connections / Data Integration Tools / Matillion / specific connection. A new connection can also be created via the Manta Orchestration API.

One IBM Automatic Data Lineage connection for Matillion corresponds to one Matillion server that will be analyzed.

Property name

Description

Example(s)

matillion.extractor.server

An arbitrary name that identifies the Matillion instance for you. The name will be used in the viewer and might affect the paths of extracted and input files for the Automatic Data Lineage connection. Do not use URLs; dots are not allowed in the name.

snowflake_instance

matillion.platform

The type of cloud data warehouse used by the Matillion instance.

Snowflake

matillion.extractor.jobs.include

(Optional)

A list of regular expressions, separated by commas, which describe jobs that should be extracted. The full path to the job includes the names of the parent project group, the parent project and its version, and the job itself. Leave blank to extract all jobs that don’t match the matillion.extractor.jobs.exclude property.

group_name/project_name/.+/^transformation.*,.+/project_name/default/\\D+

matillion.extractor.jobs.exclude

(Optional)

A list of regular expressions, separated by commas, which describe jobs that should not be extracted. The full path to the job includes the names of the parent project group, the parent project and its version, and the job itself. Leave blank to extract all jobs that match the matillion.extractor.jobs.include property.

matillion.extractor.environments.include

(Optional)

A list of regular expressions, separated by commas, which describe environments that should be extracted. The full path to the environment includes the names of the parent project group, the parent project and its version, and the environment itself. Leave blank to extract all environments that do not match the matillion.extractor.environments.exclude property.

group_name/project_name/.+/^env_.*,.+/project_name/default/\\d+

matillion.extractor.environments.exclude

(Optional)

A list of regular expressions, separated by commas, which describe environments that should not be extracted. The full path to the environment includes the names of the parent project group, the parent project and its version, and the environment itself. Leave blank to extract all environments that match the matillion.extractor.environments.include property.

matillion.extractor.address

Matillion server address used for the connection

matillion.getmanta.com

192.168.0.16

matillion.extractor.scheme

Matillion server scheme type used for the connection

HTTP

HTTPS

matillion.extractor.port

Matillion server port used for the connection

443

matillion.extractor.user

Username for the connection to the Matillion server

guest

matillion.extractor.password

Password for the connection to the Matillion server

password

matillion.extraction.method

The extraction method to use. Set to Agent:default to use the default Manta Extractor Agent, to Agent:{remote_agent_name} to use a remote Agent, or to Git:{git.dictionary.id} to use the Git ingest method. For more information on setting up a remote extractor Agent, refer to the Manta Flow Agent Configuration for Extraction documentation. For details on configuring the Git ingest method, refer to the Manta Flow Agent Configuration for Extraction:Git Source documentation.

default

Git

agent
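
The interplay of the matillion.extractor.jobs.include and matillion.extractor.jobs.exclude properties described above can be sketched in Python. This is only an illustration and not the scanner's implementation. It assumes that each regular expression must match the whole job path, that an exclude match takes precedence over an include match, and that the property value is split on plain commas. The function names, job paths, and patterns are hypothetical.

```python
import re


def split_patterns(value):
    """Split a comma-separated property value into compiled regexes."""
    # Assumption: a plain comma split. A regex that itself contains a comma
    # (for example a{1,3}) would be broken apart by this sketch.
    return [re.compile(p.strip()) for p in value.split(",") if p.strip()]


def is_extracted(path, include="", exclude=""):
    """Return True if the path passes the include list and is not excluded."""
    includes = split_patterns(include)
    excludes = split_patterns(exclude)
    # Assumption: every regex is matched against the full path, and an
    # exclude match wins over an include match.
    if any(rx.fullmatch(path) for rx in excludes):
        return False
    # A blank include list means everything that is not excluded is extracted.
    return not includes or any(rx.fullmatch(path) for rx in includes)


# Hypothetical job paths: group/project/version/job
jobs = [
    "sales/orders/default/transformation_load",
    "sales/orders/default/orchestration_main",
    "hr/payroll/v2/transformation_sync",
]
include = r"sales/orders/.+/transformation.*,hr/.+"
exclude = r".+/v2/.+"

selected = [job for job in jobs if is_extracted(job, include, exclude)]
print(selected)
```

The same logic would apply to the environments.include and environments.exclude properties, with environment paths in place of job paths.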

Common Scanner Properties

This configuration is common to all Matillion source systems and all Matillion scenarios, and is configured in Admin UI / Configuration / CLI / Matillion / Matillion Common. It can be overridden at the individual connection level.

Property name

Description

Example(s)

matillion.input.dir

Path to the directory with manual input files extracted from the Matillion instance

${manta.dir.input}/matillion/${matillion.extractor.server}

matillion.output.dir

Path to directory with extracted output files

${manta.dir.temp}/matillion/${matillion.extractor.server}

matillion.shared.jobs.config.file

Path to the optional TXT file with configurations of user-defined shared job usage

${manta.dir.input}/matillion/${matillion.extractor.server}/sharedJobsConfig.txt

matillion.extractor.verifyHostname

When using HTTPS, whether the hostname of the server's certificate should be validated to match the hostname of the server

true
false

filepath.lowercase

Whether paths to files should be lowercased (false for case-sensitive file systems, true otherwise)

true
false
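
The example values above use ${...} placeholders that refer to other properties. The following Python sketch shows how such placeholders could be resolved into a concrete path. It is an illustration only: the expand function and the value given for manta.dir.input are hypothetical, and the product's actual substitution rules may differ.

```python
import re

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def expand(value, props, depth=10):
    """Substitute ${name} placeholders with values from props."""
    for _ in range(depth):
        # Unknown placeholders are left untouched rather than raising an error.
        new = PLACEHOLDER.sub(lambda m: props.get(m.group(1), m.group(0)), value)
        if new == value:
            break
        value = new
    return value


props = {
    "manta.dir.input": "/opt/manta/input",  # hypothetical install location
    "matillion.extractor.server": "snowflake_instance",
}

path = expand(
    "${manta.dir.input}/matillion/${matillion.extractor.server}/sharedJobsConfig.txt",
    props,
)
print(path)
```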

User-Defined Shared Jobs Configuration File

The file is expected at the path defined by the common property matillion.shared.jobs.config.file. It is optional unless user-defined shared jobs are used in any of the analyzed orchestration jobs, in which case it must be created.

[<shared.job.package.name>/<shared_job_name>/<shared_job_revision_number>]
<project_group_name>/<project_name>/<project_version_name>/<orchestration_job_name>/<shared_job_component_name>
<shared.job.package.name>/<shared_job_name>/<shared_job_revision_number>/<orchestration_job_name>/<nested_shared_job_component_name>

[...]
...

The scope definition (in square brackets) represents the full path to a particular user-defined shared job. It must be followed by one or more record entries which represent the full path to the component running the shared job declared in the scope, as shown above. Note that the usage of nested shared jobs also needs to be configured (as shown on line 3).

Use the correct file format. Spaces and tabs are allowed, but each scope definition and record entry must be on a separate line.
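
As a way to check a file against the format described above before running the scanner, the following Python sketch parses scope definitions and record entries into a dictionary. It is not the scanner's own parser. It assumes that leading and trailing whitespace is ignored and that blank lines are skipped. The function name and all sample paths are hypothetical.

```python
def parse_shared_jobs_config(text):
    """Map each scope (shared job path) to its list of record entries."""
    scopes = {}
    current = None
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()  # assumption: surrounding spaces and tabs are ignored
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            # Scope definition: full path to a user-defined shared job.
            current = line[1:-1].strip()
            scopes.setdefault(current, [])
        elif current is None:
            raise ValueError(f"line {number}: record entry before any scope definition")
        else:
            # Record entry: full path to a component that runs the shared job.
            scopes[current].append(line)
    empty = [scope for scope, records in scopes.items() if not records]
    if empty:
        raise ValueError(f"scopes without record entries: {empty}")
    return scopes


# Hypothetical sample content, including a nested shared job usage.
sample = """
[com.example.pkg/load_dim/1]
  group_a/project_x/default/nightly_run/Load Dim 0
\tcom.example.pkg/wrapper/2/wrapper_job/Load Dim 1

[com.example.pkg/wrapper/2]
group_a/project_x/default/nightly_run/Wrapper 0
"""

config = parse_shared_jobs_config(sample)
for scope, records in config.items():
    print(scope, len(records))
```

Raising an error for a scope with no record entries reflects the rule that each scope definition must be followed by one or more record entries.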

If unknown components are found during analysis, they may be user-defined shared job components. In that case, the corresponding analysis log file contains prepared content that you can copy into a new or existing configuration file and complete by filling in the missing shared job paths. To find this content, search the log for the UNKNOWN_COMPONENTS_WERE_FOUND keyword. If a prepared record entry does not represent a user-defined shared job, delete it.