Open Manta Annotated Script Scanner Resource Configuration
Source System Properties
One connection or source system in Annotated Script Scanner corresponds to one project or package that is analyzed under one connection object.
New connections can be created in Manta Admin UI under the Connections tab. The connection configuration for Open Manta Annotated Script Scanner has the following structure.
| Property name | Description | Example |
|---|---|---|
| annotatedscript.connection.id | Name of the source system containing the scenario files for Annotated Script Scanner | template |
| annotatedscript.resource.name | Resource name for the analyzed technology | Python |
| annotatedscript.annotationsFormat.path | Path to the JSON file containing the configuration of the Manta annotations format for the analyzed technology | ${manta.dir.scenario}/etc/annotatedscriptAnnotationsFormatPython.json |
| annotatedscript.script.encoding | Input script encoding | UTF-8 |
| annotatedscript.includes.encoding | Encoding of input scripts referenced by the @MANTAInclude annotation | UTF-8 |
Common Properties
This configuration is common to all Annotated Script Scanner source systems and scenarios. It is available on the Manta Admin UI screen under Configurations > CLI > Annotated Script > Annotated Script Common. Only the properties that you can safely edit are listed.
| Property name | Description | Example |
|---|---|---|
| annotatedscript.input.dir | Directory with the Annotated Script Scanner project/package. | ${manta.dir.input}/annotatedscript/${annotatedscript.connection.id} |
| annotatedscript.includes.folder | Name of the includes directory for script files referenced by the @MANTAInclude annotation. If it is in ${annotatedscript.input.dir}, it is ignored as a source sub-directory of the project/package. | includes |
| annotatedscript.includes.path | Directory that serves as a root for searching scripts included by the @MANTAInclude annotation. | ${annotatedscript.input.dir}/${annotatedscript.includes.folder} |
| annotatedscript.connections.file | Name of the file that contains the connection ID configuration. | connectionsConfiguration.prm |
| annotatedscript.connections.path | Path to the file that contains the connection ID configuration. | ${manta.dir.input}/annotatedscript/${annotatedscript.connections.file} |
| annotatedscript.resourceTypes.file | File that contains resource type definitions. | ${manta.dir.scenario}/etc/annotatedscriptResourceTypesConfiguration.csv |
| filepath.lowercase | Whether paths to files need to be lowercase (false for case-sensitive file systems, true otherwise). | true or false |
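Several of the example values above reference other properties through `${...}` placeholders. The following is a minimal sketch of how such references expand, not Manta's own resolver. The value of `manta.dir.input` is a hypothetical location chosen for illustration; the other values are copied from the Example columns.

```python
import re

# Values from the Example columns; manta.dir.input is an assumed location.
properties = {
    "manta.dir.input": "/opt/manta/cli/input",  # assumption, not a documented default
    "annotatedscript.connection.id": "template",
    "annotatedscript.input.dir": "${manta.dir.input}/annotatedscript/${annotatedscript.connection.id}",
    "annotatedscript.includes.folder": "includes",
    "annotatedscript.includes.path": "${annotatedscript.input.dir}/${annotatedscript.includes.folder}",
    "annotatedscript.connections.file": "connectionsConfiguration.prm",
    "annotatedscript.connections.path": "${manta.dir.input}/annotatedscript/${annotatedscript.connections.file}",
}

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve(name, props):
    """Expand ${...} references in props[name] recursively (no cycle detection)."""
    return PLACEHOLDER.sub(lambda m: resolve(m.group(1), props), props[name])


resolved = {name: resolve(name, properties) for name in properties}
for name, value in resolved.items():
    print(f"{name} = {value}")
```

With these assumed values, `annotatedscript.includes.path` expands to a directory nested inside `annotatedscript.input.dir`, which is why the includes folder has to be excluded from the regular source sub-directories.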
Preparing Data for the Provided Packages/Projects
Packages/projects have to be stored in the file system in the proper location.
- Create a new Annotated Script Scanner connection.
- Create the `<annotatedscript.input.dir>` directory (typically located at `manta/cli/input/annotatedscript/<annotatedscript.connection.id>/`) and place all the script files there (for Python, typically `.py` files). Any required structure of custom sub-directories can be used.
- Create the `<annotatedscript.includes.path>` directory (typically located at `manta/cli/input/annotatedscript/<annotatedscript.connection.id>/includes/`) and place all the input scripts referenced by the `@MANTAInclude` annotation there. Any required structure of custom sub-directories can be used.
- Create the `<annotatedscript.connections.path>` file (typically located at `manta/cli/input/annotatedscript/connectionsConfiguration.prm`) that contains the connection ID configuration as described in Open Manta Annotated Script Scanner Usage.
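The layout described in the steps above can be sketched as a small helper. This is an illustration under stated assumptions, not a Manta tool: `prepare_layout` is a hypothetical name, a temporary directory stands in for `manta/cli/input`, and the connections file is created empty because its content is described in the Usage documentation.

```python
from pathlib import Path
import tempfile


def prepare_layout(manta_input_dir, connection_id,
                   includes_folder="includes",
                   connections_file="connectionsConfiguration.prm"):
    """Create the input and includes directories plus an empty connections file."""
    base = Path(manta_input_dir) / "annotatedscript"
    input_dir = base / connection_id
    includes_dir = input_dir / includes_folder
    connections_path = base / connections_file

    # Creating the includes directory also creates the input directory above it.
    includes_dir.mkdir(parents=True, exist_ok=True)
    # Empty placeholder only; the real content must follow the Usage documentation.
    connections_path.touch(exist_ok=True)
    return input_dir, includes_dir, connections_path


# A temporary directory stands in for manta/cli/input in this sketch.
root = Path(tempfile.mkdtemp())
input_dir, includes_dir, connections_path = prepare_layout(root, "template")
print(input_dir)
print(includes_dir)
print(connections_path)
```

Script files then go under `input_dir`, and scripts referenced by `@MANTAInclude` go under `includes_dir`.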
Language Format Configuration JSON File Structure
This configuration file defines how Automatic Data Lineage finds annotations and queries in the input script. Typically, a separate configuration file is needed for each host language analyzed.
Example of a language format configuration JSON file containing all allowed constructs:
{
  "commentLocators": [
    {"locatorType": "LINE", "commentDelimiter": "#"},
    {"locatorType": "BLOCK", "startDelimiter": "/*", "innerLinePrefix": "*", "endDelimiter": "*/"}
  ],
  "queryLocators": [
    { "locatorType": "REGEXP", "queryRegExp": "\"\"\".*\"\"\"",
      "queryTranslators": [
        { "translatorType": "LOOKUP",
          "lookup": [
            ["\"\"\"", ""]
          ]
        },
        { "translatorType": "REGEXP_REPLACEALL", "regex": "^\\\\x([A-Fa-f0-9]{2})", "replacement": "\\\\u00$1"},
        { "translatorType": "AGGREGATE",
          "translators": [
            {"translatorType": "OCTAL_UNESCAPER"},
            {"translatorType": "UNICODE_UNESCAPER"}
          ]
        },
        {"translatorType": "JAVA_STRING_UNESCAPER"}
      ]
    },
    { ... }
  ]
}
Where:
- `commentLocators` (see the example on line 2)
  - Contains the configuration for recognizing comments in the source script file that could contain Annotated Script specific annotations.
  - All defined `locatorType` entries are searched for in the input script file.
- `locatorType LINE` (see the example on line 3)
  - Configures a single-line comment.
  - `commentDelimiter`: prefix delimiter that starts the comment.
- `locatorType BLOCK` (see the example on line 4)
  - Configures a block comment.
  - `startDelimiter`: starting delimiter (e.g., for Java `/*`).
  - `innerLinePrefix`: inner-line prefix (e.g., for Java `*`).
  - `endDelimiter`: ending delimiter (e.g., for Java `*/`).
- `queryLocators` (see the example on line 6)
  - Contains the configuration for recognizing and post-processing the SQL query used in the `@MANTASQL` annotation.
  - All defined `locatorType` entries are tested until a query is found. (The query can be multi-line.)
- `locatorType REGEXP` (see the example on line 7)
  - Searches for a query using a regular expression.
  - Only `queryLocator` is allowed.
  - `queryRegExp`: pattern identifying the searched query.
  - `queryTranslators`: list of translators used to post-process the found query string.
- `queryTranslators` (see the example on line 8)
  - List of `translatorType` entries used to post-process the found query string.
  - Translators are applied in order, each one operating on the result of the previous translator.
- `translatorType LOOKUP` (see the example on line 9)
  - Translates a value using a lookup table.
  - `lookup`: list of replacements applied together in one pass through the input query (in the example, every `"""` is removed).
- `translatorType REGEXP_REPLACEALL` (see the example on line 14)
  - Replaces each sub-string of the query string that matches the given regular expression with the given replacement.
  - Requires the whole query string to process, so it cannot be used inside `translatorType AGGREGATE`.
  - `regex`: searched regular expression.
  - `replacement`: target replacement string.
- `translatorType AGGREGATE` (see the example on line 15)
  - Executes a sequence of translators one after the other in one pass through the input query.
  - `translators`: ordered list of applied `translatorType` entries.
- `translatorType OCTAL_UNESCAPER` (see the example on line 17)
  - Translates escaped octal strings back to the characters they encode. For example, `\45` becomes the character with that octal value (a `%`).
  - Note that this currently supports only the viable range of octal for Java, namely 1 to 377, because parsing Java is the main use case.
- `translatorType UNICODE_UNESCAPER` (see the example on line 18)
  - Translates the escaped Unicode values in the format `\\u+\d\d\d\d` back to Unicode. It supports multiple 'u' characters and works with or without the +.
- `translatorType JAVA_STRING_UNESCAPER` (see the example on line 21)
  - Unescapes any Java literals found in the query.
  - Requires the whole query string to process, so it cannot be used inside `translatorType AGGREGATE`.
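To make the ordering of translators concrete, the following Python sketch emulates the first three translators of the example configuration. It illustrates the behavior described above under simplifying assumptions and is not Manta's implementation: all function names are hypothetical, the octal pattern is simplified, and the regular expression replacement uses Python syntax (`\1`) where the configuration uses `$1`.

```python
import re

# Positional translators are (pattern, converter) pairs tried at one position.
# Simplified octal pattern covering escapes up to \377.
OCTAL_UNESCAPER = (re.compile(r"\\([0-3][0-7]{0,2}|[4-7][0-7]?)"),
                   lambda m: chr(int(m.group(1), 8)))
# One or more 'u' characters, an optional '+', then four hex digits.
UNICODE_UNESCAPER = (re.compile(r"\\u+\+?([0-9A-Fa-f]{4})"),
                     lambda m: chr(int(m.group(1), 16)))


def lookup(table):
    """LOOKUP: replace every key of the table in a single pass."""
    mapping = dict(table)
    keys = sorted(mapping, key=len, reverse=True)  # prefer the longest key
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return lambda text: pattern.sub(lambda m: mapping[m.group(0)], text)


def regexp_replace_all(regex, replacement):
    """REGEXP_REPLACEALL: works on the whole string, so it is not positional."""
    return lambda text: re.sub(regex, replacement, text)


def aggregate(*positional):
    """AGGREGATE: one left-to-right pass; the first matching translator wins."""
    def run(text):
        out, pos = [], 0
        while pos < len(text):
            for pattern, convert in positional:
                m = pattern.match(text, pos)
                if m:
                    out.append(convert(m))
                    pos = m.end()
                    break
            else:
                out.append(text[pos])  # no translator matched: copy the character
                pos += 1
        return "".join(out)
    return run


def translate(query, translators):
    """Apply translators in order, each on the result of the previous one."""
    for translator in translators:
        query = translator(query)
    return query


chain = [
    lookup([['"""', ""]]),
    regexp_replace_all(r"^\\x([A-Fa-f0-9]{2})", r"\\u00\1"),
    aggregate(OCTAL_UNESCAPER, UNICODE_UNESCAPER),
]

# Hypothetical text matched by the REGEXP query locator.
found = '"""\\x53ELECT id FROM t WHERE name LIKE \'a\\45\'"""'
result = translate(found, chain)
print(result)
```

In this sketch the order matters: the regular expression step rewrites the leading `\x53` into `\u0053`, which only the later Unicode unescaper can turn into `S`. Swapping those two steps would leave the hexadecimal escape untranslated.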
Example Configuration for Use in Java Source Code
Provided files:
- MantaJDBCExample.java — An example of annotated user source code, connecting results from a database to an export file.
- FilesFlowTest_Connections.prm — A minimalistic configuration of the connection used.
- javaAnnotationsFormat.json — A configuration for the use of Annotated Script Scanner for Java.
Requirements:
- Manta annotations have to be placed inside line comments after `//`.
- Queries have to be complete SQL in one literal, allowing multiple lines and comments, as shown in the example. (Comments cannot contain `"`.)
  - If the code creates queries dynamically, they have to be generated for Manta before scanning (possibly into a separate file referenced by `@MANTAInclude`).
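The first requirement can be checked before scanning with a small script. The sketch below is a hypothetical pre-check, not part of Manta: it reports whether each line mentioning an `@MANTA` annotation has a `//` before the annotation. It is deliberately naive and would misjudge a `//` that appears inside a string literal.

```python
def check_annotation_placement(source):
    """Return (line number, ok) for each line that mentions an @MANTA annotation.

    ok is True when a // line-comment marker precedes the annotation.
    """
    results = []
    for number, line in enumerate(source.splitlines(), start=1):
        at = line.find("@MANTA")
        if at == -1:
            continue  # no annotation on this line
        slashes = line.find("//")
        results.append((number, slashes != -1 and slashes < at))
    return results


# Hypothetical Java snippet; only the annotation names come from this document.
java_source = "\n".join([
    "// @MANTASQL",
    'String sql = "SELECT id FROM customers";',
    "/* @MANTAInclude */",
])

report = check_annotation_placement(java_source)
print(report)
```

Here the annotation on the third line is flagged because it sits in a block comment rather than after `//`.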