Snowflake Resource Configuration

Before you configure your scanner, make sure you meet the prerequisites. Read our guide on Snowflake integration requirements to double-check.

Source System Properties

This configuration can be setup by creating a new connection on Admin UI / Connections tab or editing an existing connection in Admin UI / Connections / Databases / Snowflake / specific connection. New connection can also be created via Manta Orchestration API.

Granularity of the IBM Automatic Data Lineage connection for Snowflake is one Snowflake server. Use filter on databases and schemas to limit the scope of analysis as needed. Use of multiple connections against a single Snowflake server may lead to within-system lineage not be connected properly.

Property name	Description	Example
snowflake.dictionary.id	Name of a resource representing this Snowflake server known as the dictionary ID, used as an output subdirectory name for extracted DDL files and the database dictionary	snowflake
snowflake.url	JDBC connection string for the database server with the warehouse and role specified; the role has to have sufficient privileges as per Snowflake Integration Requirements	jdbc:snowflake://manta.us-east-1.snowflakecomputing.com/?warehouse=DEMO_WH&role=SELECT_CATALOG_ROLE
snowflake.extraction.type	Extraction type used for extracting metadata from the Snowflake server	INFORMATION_SCHEMA or ACCOUNT_USAGE
snowflake.extraction.method	Set to Agent:default when the desired extraction method is the default Manta Extractor Agent, set to Agent:{remote_agent_name} when a remote Agent is the desired extraction method, set to Git:{git.dictionary.id} when the Git ingest method is the desired extraction method. For more information on setting up a remote extractor Agent please refer to the Manta Flow Agent Configuration for Extraction documentation. For additional details on configuring a Git ingest method, please refer to the Manta Flow Agent Configuration for Extraction:Git Source documentation.	default Git agent
snowflake.username	User name for connection to the database server	manta_user
snowflake.password (supported up to and including R42.13)	User password for connection to the database server Leave empty if you use key pair authentication for the Snowflake connection	manta_password
snowflake.privatekeyfile	File path containing the private key for key pair authentication Complete this field to use key pair authentication for the Snowflake connection	C:\Users\TestUser\Desktop\authentication\keypairfile
snowflake.privatekeyfilepwd	Password to the file containing the private key for key pair authentication Only complete if you use key pair authentication for the Snowflake connection	private_key_file_password
snowflake.tablestage	Regular expression of all table stages from which staged files should be extracted The format of the regular expression is the fully qualified name of the table stage where each segment is encapsulated in double quotes Leave empty if no staged files should be extracted from any table stages	`\"mydb\"\\.\"schema1\"\\.\".\"\|\"mydb\"\\.\"myschema\"\\.\"abc.\"`
snowflake.extractedDbsSchemas	List of databases and schemas to extract, separated by commas, which are to be provided in the format database/schema; each part is evaluated as a regular expression	database1/sch1,database2/sch2,database3
snowflake.excludedDbsSchemas	List of databases and schemas to exclude from extraction, separated by commas	database3/sch3,database3/sch4
snowflake.ddl.encoding	Encoding of automatically extracted DDL scripts. See Encodings for applicable values.	utf8
snowflake.script.encoding	Encoding of manually provided SQL scripts performed on this database server. See Encodings for applicable values.	utf8

Common Scanner Properties

This configuration is common for all Snowflake source systems and for all Snowflake scenarios, and is configure in Admin UI / Configuration / CLI / Snowflake / Snowflake Common. It can be overridden on individual connection level.

Property name	Description	Example
snowflake.dictionary.dir	Directory with data dictionaries extracted from Snowflake	${manta.dir.temp}/snowflake
filepath.lowercase	Whether paths to files should be lowercase (false for case-sensitive file systems, true otherwise)	false true
snowflake.dll.output	Directory for automatically extracted Snowflake DDL scripts (for the extraction phase)	${manta.dir.temp}/snowflake/${snowflake.dictionary.id}/ddl
snowflake.ddl.input	Directory with automatically extracted Snowflake DDL scripts (for the analysis phase)	${snowflake.dll.output}
snowflake.script.input	Directory with manually provided SQL scripts which are performed on a given database server (for the analysis phase)	${manta.dir.input}/snowflake/${snowflake.dictionary.id}/
snowflake.script.replace	Path to the CSV file with the replacements to be applied to the provided SQL scripts; see Placeholder Replacement in Input Scripts for details about the replacement file format	${manta.dir.input}/snowflake/${snowflake.dictionary.id}/replace.csv
snowflake.script.replace.regex	Flag specifying whether replacements for SQL scripts in the provided CSV file specified in `snowflake.script.replace` should be interpreted as regular expressions (true) or simple text (false)	false true
snowflake.analyze.parallelCount	Number of parallel threads which will analyze DDL and SQL scripts	4
snowflake.dictionary.mappingFile	Path to automatically generated mappings for Snowflake servers	${manta.dir.temp}/snowflake/snowflakeDictionaryMantaMapping.csv
snowflake.dictionary.mappingManualFile	Path to mappings provided manually for Snowflake servers	${manta.dir.scenario}/conf/snowflakeDictionaryMantaMappingManual.csv
snowflake.searchPath	Schema lookup path used by the Snowflake server; default value: $current,$public	$current,$public, db1.sch1
snowflake.extractor.numThreads	Number of threads performing queries to Snowflake databases	4
snowflake.extractor.persistence.pipelineSize	Maximum number of deserialized objects stored in the persistence buffer waiting to be written into the dictionary; restricts memory footprint	1000
snowflake.analyze.retainUnusedResultSetColumns	Flag specifying whether the data lineage should include sub-query resultset columns that do not have any downstream lineage By default, set to false	false true
snowflake.extractor.liststages	Enables or disables extraction (listing of stage contents) for extracted stages; By default, set to `true`, meaning stage contents (files) will be listed and will participate in lineage. Setting to `false` means stages will not be listed, and individual files under stages will not participate in lineage. As of R42.4: The maximum number of files extracted per stage is set to 100. If a stage contains more files, the stage is extracted but the files that this stage contains are not extracted. This limit is applicable to external stages, and selected table stages.	false true
snowflake.expressionDescriptions.enabled.ddl	Enables or disables building transformation logic descriptions from SQL code in DDL scripts	false true
snowflake.expressionDescriptions.enabled.script	Enables or disables building transformation logic descriptions from SQL code in SQL scripts	false true
snowflake.expressionDescriptions.enabled.queryService	Enables or disables building transformation logic descriptions from Snowflake SQL code defined in other technologies	false true