Annotated Script Scanner: Quick Start Guide / Configuration Checklist

This checklist covers how to configure the Manta Annotated Script Scanner and gives an overview of how to run the associated scans.

For more information about individual annotations, their arguments, and how to implement them, see Open Manta Annotated Script Scanner Usage. See also Open Manta Annotated Script Scanner Resource Configuration.

File to Configure the Annotated Script Scanner

The file used to configure the Annotated Script Scanner is called annotatedscriptResourceTypesConfiguration.csv and is located in mantaflow\cli\scenarios\manta-dataflow-cli\etc.

Content and format of the annotatedscriptResourceTypesConfiguration.csv file

Entity Type;"Resource Name";"Resource Type";"Hierarchy"
PostgreSQL;"PostgreSQL";"PostgreSQL";"Server/Database/Schema/Table/Column"
Hive;"Hive";"Hive";"Server/Database/Schema/Table/Column"
Filesystem;"Filesystem";"Filesystem";"(Directory)*/File/Column"
S3;"Filesystem";"Filesystem";"(Directory)*/File/Column"
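
The layout above is a semicolon-delimited file with quoted fields, where the Hierarchy column lists levels separated by slashes. As a minimal sketch of how such a file could be read for a quick sanity check (the function name `read_resource_types` is hypothetical and not part of Manta, and the sample text is copied from the first rows above):

```python
import csv
import io

# Sample rows copied from the configuration shown above.
SAMPLE = '''Entity Type;"Resource Name";"Resource Type";"Hierarchy"
PostgreSQL;"PostgreSQL";"PostgreSQL";"Server/Database/Schema/Table/Column"
Hive;"Hive";"Hive";"Server/Database/Schema/Table/Column"
'''


def read_resource_types(text):
    """Map each entity type to the list of levels in its hierarchy."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";", quotechar='"')
    return {row["Entity Type"]: row["Hierarchy"].split("/") for row in reader}


resource_types = read_resource_types(SAMPLE)
print(resource_types["Hive"])
```

This only illustrates the file format; it makes no claim about how the scanner itself parses the file.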
If you prefer to skip the process outlined here (creating the necessary connection properties file manually on the host), you can also create an Annotated Script Scanner connection in the Manta Admin UI.

Place the necessary connection_name.properties file in mantaflow\cli\scenarios\manta-dataflow-cli\etc\annotatedscript.

Alternatively, rename or copy the template file in that directory (named template.properties) to create the properties file for your connection. Ensure that the annotatedscript.connection.id property value is set accordingly.
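
Copying the template and setting the connection ID can be sketched as follows. This is an illustration only: the helper `create_connection_file` is hypothetical, the stand-in template content is assumed, and the demo runs against a temporary directory rather than a real Manta installation.

```python
import tempfile
from pathlib import Path

ID_KEY = "annotatedscript.connection.id"


def create_connection_file(etc_dir, connection_name, connection_id):
    """Copy template.properties to <connection_name>.properties and set the ID."""
    template = Path(etc_dir) / "template.properties"
    lines = []
    for line in template.read_text(encoding="utf-8").splitlines():
        # Replace only the connection ID line; keep everything else as-is.
        if line.strip().startswith(ID_KEY + "="):
            line = f"{ID_KEY}={connection_id}"
        lines.append(line)
    target = Path(etc_dir) / f"{connection_name}.properties"
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


# Demo in a throwaway directory with a stand-in template (contents assumed).
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "template.properties").write_text(
    "# stand-in template\n"
    "annotatedscript.connection.id=\n"
    "annotatedscript.script.encoding=UTF-8\n",
    encoding="utf-8",
)
created = create_connection_file(demo_dir, "my_connection", "annotations_connection_id")
print(created.read_text(encoding="utf-8"))
```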

Example: CONNECTION_NAME.properties file

# Enter a name of the directory containing scenario files for Annotated Script Scanner.
# It will be used as subdirectory name for input files.
annotatedscript.connection.id=annotations_connection_id
# Resource name for the analyzed technology.
annotatedscript.resource.name=Python
# Path to JSON file containing configuration of Manta annotations format for the analyzed technology.
# e.g.: Comment delimiters, query RegExp, and postprocessing steps
annotatedscript.annotationsFormat.path=${manta.dir.scenario}/etc/annotatedscriptAnnotationsFormatPython.json
# Enter an encoding of input scripts.
annotatedscript.script.encoding=UTF-8
# Enter an encoding of input scripts referenced by @MANTAInclude annotation.
annotatedscript.includes.encoding=UTF-8
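
Before running a scan, it can help to confirm that the properties file defines each of the keys shown in the example and that the encodings are recognized. The sketch below is one hedged way to do that. The names `parse_properties` and `missing_keys` are hypothetical, and treating all five keys as required is an assumption rather than documented Manta behavior.

```python
import codecs

# Sample content mirroring the example file above.
SAMPLE = """\
# Resource name for the analyzed technology.
annotatedscript.connection.id=annotations_connection_id
annotatedscript.resource.name=Python
annotatedscript.annotationsFormat.path=${manta.dir.scenario}/etc/annotatedscriptAnnotationsFormatPython.json
annotatedscript.script.encoding=UTF-8
annotatedscript.includes.encoding=UTF-8
"""

# Assumption: every key from the example is expected to be present.
EXPECTED_KEYS = [
    "annotatedscript.connection.id",
    "annotatedscript.resource.name",
    "annotatedscript.annotationsFormat.path",
    "annotatedscript.script.encoding",
    "annotatedscript.includes.encoding",
]


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_keys(props):
    """Return the expected keys that are absent or empty."""
    return [key for key in EXPECTED_KEYS if not props.get(key)]


props = parse_properties(SAMPLE)
print("Missing keys:", missing_keys(props))

# codecs.lookup raises LookupError for an unknown encoding name.
for key in ("annotatedscript.script.encoding", "annotatedscript.includes.encoding"):
    codecs.lookup(props[key])
```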

Where to Put the Input Script Files to Be Scanned

  1. The annotatedscript and CONNECTION_NAME directories need to be created in the IBM Automatic Data Lineage input script directory.
  2. Place your annotated scripts and connectionsConfiguration.prm in mantaflow\cli\input\annotatedscript\CONNECTION_NAME.
  3. Place connectionsConfiguration.prm in mantaflow\cli\input\annotatedscript.
  4. Place all annotated scripts in mantaflow\cli\input\annotatedscript\annotations_connection_id.

This example uses annotations_connection_id as the connection ID name, as outlined in the sample file snippet.
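
Creating the directory layout from step 1 can be sketched as follows. The helper `prepare_input_dir` is hypothetical, and the demo uses a temporary directory as a stand-in for mantaflow\cli\input so that it can be run anywhere.

```python
import tempfile
from pathlib import Path


def prepare_input_dir(input_root, connection_id):
    """Create <input_root>/annotatedscript/<connection_id> if it does not exist."""
    script_dir = Path(input_root) / "annotatedscript" / connection_id
    script_dir.mkdir(parents=True, exist_ok=True)
    return script_dir


# Stand-in for the real input script directory (assumption for the demo).
demo_root = Path(tempfile.mkdtemp())
script_dir = prepare_input_dir(demo_root, "annotations_connection_id")
print(script_dir.relative_to(demo_root))
```

The connection ID passed in should match the annotatedscript.connection.id value in the connection's properties file, since that value is used as the subdirectory name for input files.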

To Run the Scan

Go to Admin UI > Process Manager and create a workflow that runs "annotated Script Dataflow Scenario".