Snowflake Resource Configuration
Before you configure your scanner, make sure you meet the prerequisites. Read our guide on Snowflake integration requirements to double-check.
Source System Properties
This configuration can be setup by creating a new connection on Admin UI / Connections tab or editing an existing connection in Admin UI / Connections / Databases / Snowflake / specific connection. New connection can also be created via Manta Orchestration API.
Granularity of the IBM Automatic Data Lineage connection for Snowflake is one Snowflake server. Use filter on databases and schemas to limit the scope of analysis as needed. Use of multiple connections against a single Snowflake server may lead to within-system lineage not be connected properly.
Property name |
Description |
Example |
---|---|---|
snowflake.dictionary.id |
Name of a resource representing this Snowflake server known as the dictionary ID, used as an output subdirectory name for extracted DDL files and the database dictionary |
snowflake |
snowflake.url |
JDBC connection string for the database server with the warehouse and role specified; the role has to have sufficient privileges as per Snowflake Integration Requirements |
jdbc:snowflake://manta.us-east-1.snowflakecomputing.com/?warehouse=DEMO_WH&role=SELECT_CATALOG_ROLE |
snowflake.extraction.type |
Extraction type used for extracting metadata from the Snowflake server |
INFORMATION_SCHEMA or ACCOUNT_USAGE |
snowflake.extraction.method |
Set to Agent:default when the desired extraction method is the default Manta Extractor Agent, set to Agent:{remote_agent_name} when a remote Agent is the desired extraction method, set to Git:{git.dictionary.id} when the Git ingest method is the desired extraction method. For more information on setting up a remote extractor Agent please refer to the Manta Flow Agent Configuration for Extraction documentation. For additional details on configuring a Git ingest method, please refer to the Manta Flow Agent Configuration for Extraction:Git Source documentation. |
default Git agent |
snowflake.username |
User name for connection to the database server |
manta_user |
snowflake.password |
User password for connection to the database server Leave empty if you use key pair authentication for the Snowflake connection |
manta_password |
snowflake.privatekeyfile |
File path containing the private key for key pair authentication Complete this field to use key pair authentication for the Snowflake connection |
C:\Users\TestUser\Desktop\authentication\keypairfile |
snowflake.privatekeyfilepwd |
Password to the file containing the private key for key pair authentication Only complete if you use key pair authentication for the Snowflake connection |
private_key_file_password |
snowflake.tablestage |
Regular expression of all table stages from which staged files should be extracted The format of the regular expression is the fully qualified name of the table stage where each segment is encapsulated in double quotes Leave empty if no staged files should be extracted from any table stages |
|
snowflake.extractedDbsSchemas |
List of databases and schemas to extract, separated by commas, which are to be provided in the format database/schema; each part is evaluated as a regular expression |
database1/sch1,database2/sch2,database3 |
snowflake.excludedDbsSchemas |
List of databases and schemas to exclude from extraction, separated by commas |
database3/sch3,database3/sch4 |
snowflake.ddl.encoding |
Encoding of automatically extracted DDL scripts. See Encodings for applicable values. |
utf8 |
snowflake.script.encoding |
Encoding of manually provided SQL scripts performed on this database server. See Encodings for applicable values. |
utf8 |
Common Scanner Properties
This configuration is common for all Snowflake source systems and for all Snowflake scenarios, and is configure in Admin UI / Configuration / CLI / Snowflake / Snowflake Common. It can be overridden on individual connection level.
Property name |
Description |
Example |
---|---|---|
snowflake.dictionary.dir |
Directory with data dictionaries extracted from Snowflake |
${manta.dir.temp}/snowflake |
filepath.lowercase |
Whether paths to files should be lowercase (false for case-sensitive file systems, true otherwise) |
false |
snowflake.dll.output |
Directory for automatically extracted Snowflake DDL scripts (for the extraction phase) |
${manta.dir.temp}/snowflake/${snowflake.dictionary.id}/ddl |
snowflake.ddl.input |
Directory with automatically extracted Snowflake DDL scripts (for the analysis phase) |
${snowflake.dll.output} |
snowflake.script.input |
Directory with manually provided SQL scripts which are performed on a given database server (for the analysis phase) |
${manta.dir.input}/snowflake/${snowflake.dictionary.id}/ |
snowflake.script.replace |
Path to the CSV file with the replacements to be applied to the provided SQL scripts; see Placeholder Replacement in Input Scripts for details about the replacement file format |
${manta.dir.input}/snowflake/${snowflake.dictionary.id}/replace.csv |
snowflake.script.replace.regex |
Flag specifying whether replacements for SQL scripts in the provided CSV file specified in |
false |
snowflake.analyze.parallelCount |
Number of parallel threads which will analyze DDL and SQL scripts |
4 |
snowflake.dictionary.mappingFile |
Path to automatically generated mappings for Snowflake servers |
${manta.dir.temp}/snowflake/snowflakeDictionaryMantaMapping.csv |
snowflake.dictionary.mappingManualFile |
Path to mappings provided manually for Snowflake servers |
${manta.dir.scenario}/conf/snowflakeDictionaryMantaMappingManual.csv |
snowflake.searchPath |
Schema lookup path used by the Snowflake server; default value: $current,$public |
$current,$public, db1.sch1 |
snowflake.extractor.numThreads |
Number of threads performing queries to Snowflake databases |
4 |
snowflake.extractor.persistence.pipelineSize |
Maximum number of deserialized objects stored in the persistence buffer waiting to be written into the dictionary; restricts memory footprint |
1000 |
snowflake.analyze.retainUnusedResultSetColumns |
Flag specifying whether the data lineage should include sub-query resultset columns that do not have any downstream lineage By default, set to false |
false |
snowflake.extractor.liststages |
Enables or disables extraction (listing of stage contents) for extracted stages; By default, set to Setting to As of R42.4: The maximum number of files extracted per stage is set to 100. If a stage contains more files, the stage is extracted but the files that this stage contains are not extracted. This limit is applicable to external stages, and selected table stages. |
false true |
snowflake.expressionDescriptions.enabled.ddl |
Enables or disables building transformation logic descriptions from SQL code in DDL scripts |
false |
snowflake.expressionDescriptions.enabled.script |
Enables or disables building transformation logic descriptions from SQL code in SQL scripts |
false |
snowflake.expressionDescriptions.enabled.queryService |
Enables or disables building transformation logic descriptions from Snowflake SQL code defined in other technologies |
false |