Talend Manual Inputs

This page explains the specific structure and files accepted by this scanner. See Manta Flow Usage: Preparing Scanner Inputs for additional ways to provide these inputs via Process Manager or the Orchestration API. See Ingest Source Support for how to use Manta Agent to obtain input files from a remote machine or Git.

IBM Automatic Data Lineage supports analysis of both Talend export formats (zip and item), but there can be differences in the results.

Please note that the content and structure of a Talend archive export must match the directory and file structure generated by Talend Studio.

Exporting Jobs

Jobs have to be exported and provided manually for dataflow analysis. See https://community.qlik.com/t5/Official-Support-Articles/Exporting-items-from-Talend-Studio/ta-p/2150920 or https://help.qlik.com/talend/en-US/studio-user-guide/8.0-R2024-08/exporting-items.

By default, assuming that <proj_name> is the name of the project, the exported files related to jobs are located in /../<proj_name>/process. Inside the process folder, jobs can be further organized into subfolders. All files related to one job are always located in the same (sub)directory and share the same filename, differing only in extension. This is mandatory; otherwise, Talend Studio is unable to import the job back into the workspace.
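The layout rule above (one folder and one base filename per job, differing only in extension) can be checked before handing the export to the scanner. The following is a minimal sketch, not part of the product: the function names group_job_files and jobs_missing_item are hypothetical, and it assumes that a job's definition is the file with the .item extension.

```python
from collections import defaultdict
from pathlib import Path


def group_job_files(process_dir):
    """Group files under process_dir by (relative folder, base filename).

    Hypothetical helper: returns a dict that maps each
    (folder, base filename) pair to the sorted list of extensions found.
    """
    root = Path(process_dir)
    groups = defaultdict(set)
    for path in root.rglob("*"):
        if path.is_file():
            # Files of one job share the same folder and the same stem.
            key = (path.parent.relative_to(root).as_posix(), path.stem)
            groups[key].add(path.suffix)
    return {key: sorted(exts) for key, exts in groups.items()}


def jobs_missing_item(process_dir):
    """Return the groups that have no .item file alongside their other files."""
    groups = group_job_files(process_dir)
    return sorted(key for key, exts in groups.items() if ".item" not in exts)
```

Calling jobs_missing_item on the process folder returns an empty list when every group of files includes an .item file. Any entry it does return points at a folder and base filename worth re-exporting from Talend Studio.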

The whole folder <proj_name> can also be exported as a .zip archive.

Building Jobs

Talend Studio allows you to build jobs as standalone units containing scripts for their execution or scheduling (for example, a .bat file for Windows). See https://help.qlik.com/talend/en-US/studio-user-guide/8.0-R2024-08/building-job-as-standalone-job. The structure is very similar to that of exported jobs. By default, the built job is zipped into <job_name>.zip. Among other things, the archive contains the folder <proj_name>, where job definitions are stored as described in the previous section. Note, however, that when setting the build properties, the user can choose not to include the .item files; in that case, the job definitions are not stored in the build, which is not desirable.
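Because a build can be configured to omit the .item files, it may be worth confirming that a built archive still contains them before using it as scanner input. The sketch below uses only the Python standard library; the function name item_files_in_archive is hypothetical, and the check simply looks for archive entries whose names end in .item.

```python
import zipfile


def item_files_in_archive(archive_path):
    """List the .item entries in a zip archive (hypothetical helper).

    An empty result suggests the job was built without its .item files,
    so the archive holds no job definitions.
    """
    with zipfile.ZipFile(archive_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".item")
        )
```

If the returned list is empty, rebuild the job in Talend Studio with the option to include the .item files enabled.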

If the built job references (depends on) other jobs within the same project (i.e., it calls child jobs as part of its ETL operations), then all necessary definitions of those referenced jobs are also included in the build automatically.

Input Folder Structure

To analyze Talend project files, create a directory structure as follows.

The folder above can be provided to Automatic Data Lineage for execution as per Manta Flow Usage: Preparing Scanner Inputs.

Export File Structure

The zip export structure generated by Talend can be used directly; however, Manta only uses the following files: