Comparing Transformer for Snowflake and Other Engines
For users already familiar with Data Collector or Transformer, here's how working with Transformer for Snowflake is similar... and different.
Transformer for Snowflake pipelines are configured on the pipeline canvas, just like Data Collector and Transformer pipelines. The difference lies in the functionality available within the pipelines and in how the pipelines run.
As described in How It Works, Transformer for Snowflake does not perform pipeline processing itself, as Data Collector does. Instead, it follows the Transformer model. Just as Transformer passes pipeline configuration to Spark for processing, Transformer for Snowflake generates a SQL query based on the pipeline configuration and passes the query to Snowflake for processing. This structural similarity explains how Transformer for Snowflake got its name.
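As a rough illustration of this push-down model (the database, table, and column names below are hypothetical, not taken from the product documentation), a simple pipeline that reads from one Snowflake table, filters rows, and writes to another might be translated into a single query along these lines:

```sql
-- Hypothetical translation of an origin -> Filter -> destination pipeline.
-- Snowflake executes the entire query; the data never leaves Snowflake.
INSERT INTO ANALYTICS.PUBLIC.LARGE_ORDERS (ORDER_ID, CUSTOMER_ID, AMOUNT)
SELECT ORDER_ID, CUSTOMER_ID, AMOUNT
FROM ANALYTICS.PUBLIC.ORDERS
WHERE AMOUNT > 1000;
```

The exact SQL that a pipeline generates depends on its stages and their configuration.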
With Data Collector and Transformer, you can use heterogeneous origins and destinations to read from and write to a wide range of systems. Transformer for Snowflake pipelines process Snowflake data only – all origins and destinations read from and write to Snowflake.
However, many concepts and behaviors remain exactly the same. For example, you use origin, processor, and destination stages to define processing in all pipeline types. You create jobs to run pipelines. You can use runtime parameters in all pipelines.
- Similarities
- Since you design and run Transformer for Snowflake pipelines in Control Hub, some basic concepts remain the same:
- Create pipelines in the pipeline canvas.
- Preview pipelines to help with pipeline development. For more information, see the documentation.
- Use origin, processor, destination, and executor stages to design the pipeline data flow.
- Processors with the same names as Data Collector and Transformer stages generally do what you expect at a high level, but they might have subtle differences or additional features because the processing occurs in Snowflake. For example, Transformer for Snowflake supports using the Snowflake SQL query language for data processing. See the stage documentation for details, such as the Filter processor.
- Like Transformer, you use the expression language in properties that are evaluated only once, before pipeline processing begins, such as runtime parameters in pipeline properties.
- Create jobs to run pipelines.
- Use Control Hub team-based features, such as version control and user management.
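One of the subtle differences noted above is that processor logic is expressed in Snowflake SQL. For instance, a Filter processor condition is written as a Snowflake SQL predicate rather than in the expression language (the column names below are hypothetical):

```sql
-- Hypothetical Filter processor condition, written as a Snowflake SQL
-- predicate; only rows matching the condition continue through the pipeline.
STATUS = 'ACTIVE' AND AMOUNT > 0
```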
- Differences
- Unlike Data Collector and Transformer, Transformer for Snowflake:
- Provides a hosted engine that most organizations use to avoid installing and maintaining engines. You can deploy engines based on the account agreement for your organization. For more information, see Hosted or Deployed Engines.
- Includes stages based on Snowflake functionality, such as the Cube processor, which applies the Snowflake GROUP BY CUBE clause.
- Uses the terms "column" and "row" to align with Snowflake terminology. Data Collector and Transformer use the terms "field" and "record" to refer to the same concepts.
- Like Data Collector, Transformer for Snowflake includes executor stages that perform tasks, such as sending an email notification.
  Transformer for Snowflake executors perform tasks using Snowflake integrations after all pipeline writes complete, when triggered by the data. These executors can be placed anywhere in the data flow.
  In contrast, Data Collector executors expect to be triggered by event records, which only certain stages generate, so those executors must be placed downstream of event-generating stages.
- You can monitor Snowflake jobs as you would any other job. However, the Snowflake job summary displays the following information, which differs from other job summaries:
  - Input and output row count
  - Log messages
  - Snowflake queries run for the job
  You cannot view an error row count, row throughput, or runtime statistics as you can for other jobs.