Snowflake
The Snowflake destination writes data to Snowflake. You can use the Snowflake destination to write to any accessible Snowflake database, including those hosted on Amazon Web Services, Microsoft Azure, and private Snowflake installations.
The Snowflake destination stages data in an internal stage before writing it to Snowflake.
When you configure the destination, you specify the Snowflake region, database, table, and schema to use. You also specify the user account and password. You can optionally specify a custom role that overrides the default role for the user account. The user account or the custom role must have the required Snowflake privileges.
You can also use a connection to configure the destination.
You specify the write mode to use: overwrite, append, or merge. When appending data, the destination creates the table if it does not exist. To merge data, you configure additional merge properties.
You specify how to map record fields to table columns and select the behavior for data type mismatches. You can configure the destination to preserve existing capitalization for column names. You can also specify the number of connections to use and configure additional Snowflake properties.
Required Privileges and Custom Roles
The Snowflake destination requires a Snowflake role that grants the following privileges:
Object | Privilege |
---|---|
Schema | CREATE TABLE |
Table | SELECT, INSERT |
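As a sketch, the privileges in the table above could be granted to a custom role with statements like the following. The role, schema, and table names (`TRANSFORMER_ROLE`, `MYDB.MYSCHEMA`, `MYDB.MYSCHEMA.MYTABLE`) are illustrative assumptions, not values from this document:

```python
# Build the GRANT statements that match the privilege table above.
# All object names here are hypothetical examples.

def grant_statements(role: str, schema: str, table: str) -> list[str]:
    """Return GRANT statements for the required Snowflake privileges."""
    return [
        f"GRANT CREATE TABLE ON SCHEMA {schema} TO ROLE {role}",
        f"GRANT SELECT, INSERT ON TABLE {table} TO ROLE {role}",
    ]

for stmt in grant_statements("TRANSFORMER_ROLE", "MYDB.MYSCHEMA",
                             "MYDB.MYSCHEMA.MYTABLE"):
    print(stmt)
```

You would run the generated statements in Snowflake as a user with sufficient privileges, then assign the role as described below.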
To use a custom role, choose one of the following methods:
- Assign the custom role as the default role
- In Snowflake, assign the custom role as the default role for the Snowflake user account specified in the stage. A Snowflake user account is associated with a single default role.
- Override the default role with the custom role
- In the stage, use the Role property to specify the name of the custom role. The custom role overrides the role assigned to the Snowflake user account specified in the stage.
- Use a Snowflake connection
- When working with Control Hub, you can configure a Snowflake connection to provide connection details for Snowflake stages.
Write Mode
- Overwrite existing table
- If the table exists, the destination drops and recreates the table. Then, the destination inserts all data into the table.
- Append rows to existing table or create table if none exists
- If the table exists, the destination appends data to the table. If the table does not exist, the destination creates the table.
- Merge rows to existing table
- The destination merges data with data in an existing table. The destination performs inserts, updates, and deletes based on the specified merge properties.
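The overwrite and append behavior described above can be modeled against an in-memory "table" (a list of row dictionaries). This is a minimal sketch of the semantics, not the destination's implementation:

```python
# Model overwrite and append write modes. A table is a list of row
# dicts; None represents a table that does not yet exist.

def write(table, rows, mode):
    if mode == "overwrite":
        # Drop and recreate the table, then insert all new data.
        return list(rows)
    if mode == "append":
        # Create the table if none exists, otherwise add rows to it.
        return (table or []) + list(rows)
    raise ValueError("merge requires join keys; see Merge Properties")

existing = [{"id": 1, "name": "a"}]
print(write(existing, [{"id": 2, "name": "b"}], "append"))
```

In append mode the existing row is kept and the new row is added; in overwrite mode only the new rows remain.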
Merge Properties
- Join Key
- One or more key columns in the table. Used to perform updates and deletes and to ensure that duplicate rows do not exist for inserts. Pipeline records must include fields with names that match the key columns.
- When Clause
- Action that the destination performs when a record meets the specified conditions. You can specify multiple merge configurations for the destination to perform. Important: The destination performs the writes in the specified order. As a best practice, list merge configurations from the smallest number of affected records to the largest. When defining multiple merge configurations of the same type, carefully consider the order that you use.
Snowflake Data Types
When writing to Snowflake, the Snowflake destination converts the Spark data types used in the pipeline to Snowflake data types. The following table describes how this conversion occurs.
Spark Data Type | Snowflake Data Type |
---|---|
ArrayType | VARIANT |
BooleanType | BOOLEAN |
ByteType | INTEGER |
DateType | DATE |
DecimalType | DECIMAL |
DoubleType | DOUBLE |
FloatType | FLOAT |
IntegerType | INTEGER |
LongType | INTEGER |
MapType | VARIANT |
ShortType | INTEGER |
StringType | VARCHAR(N) for specified lengths, otherwise VARCHAR |
StructType | VARIANT |
TimestampType | TIMESTAMP |
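The conversion table above can be expressed as a simple lookup, with StringType handled specially (VARCHAR(N) only when a length is specified). This is a sketch of the mapping, not the destination's actual converter:

```python
# Spark-to-Snowflake type mapping from the table above.
SPARK_TO_SNOWFLAKE = {
    "ArrayType": "VARIANT", "BooleanType": "BOOLEAN", "ByteType": "INTEGER",
    "DateType": "DATE", "DecimalType": "DECIMAL", "DoubleType": "DOUBLE",
    "FloatType": "FLOAT", "IntegerType": "INTEGER", "LongType": "INTEGER",
    "MapType": "VARIANT", "ShortType": "INTEGER", "StructType": "VARIANT",
    "TimestampType": "TIMESTAMP",
}

def snowflake_type(spark_type, length=None):
    """Return the Snowflake type for a Spark type name.

    StringType maps to VARCHAR(N) when a length is specified,
    otherwise to VARCHAR.
    """
    if spark_type == "StringType":
        return f"VARCHAR({length})" if length else "VARCHAR"
    return SPARK_TO_SNOWFLAKE[spark_type]

print(snowflake_type("LongType"))           # INTEGER
print(snowflake_type("StringType", 255))    # VARCHAR(255)
```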
Configuring a Snowflake Destination
Configure the Snowflake destination to write data to Snowflake.