Comparing Transformer for Snowflake and Other Engines

For users already familiar with Data Collector and Transformer, here's how working with Transformer for Snowflake is similar... and different.

Transformer for Snowflake pipelines are configured on the canvas, just like Data Collector and Transformer pipelines. The difference lies in the functionality available within the pipelines and in how the pipelines run.

As described in How It Works, Transformer for Snowflake does not perform actual pipeline processing like Data Collector. Instead, it follows the Transformer model. Just as Transformer passes pipeline configuration to Spark for processing, Transformer for Snowflake generates a SQL query based on the pipeline configuration and passes the query to Snowflake for processing. This structural similarity explains how Transformer for Snowflake got its name.
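As a rough illustration of this model, a pipeline with a table origin, a filter processor, and a table destination might collapse into a single query that Snowflake executes. The table and column names below are hypothetical, and the actual SQL the engine generates may differ:

```sql
-- Hypothetical pipeline: orders (origin) -> filter on status
-- (processor) -> shipped_orders (destination), expressed as one
-- query pushed down to Snowflake for processing.
INSERT INTO shipped_orders (order_id, customer_id, amount)
SELECT order_id, customer_id, amount
FROM orders
WHERE status = 'SHIPPED';
```

The point is that the pipeline describes the processing, while Snowflake performs it.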

With Data Collector and Transformer, you can use heterogeneous origins and destinations to read from and write to a wide range of systems. Transformer for Snowflake pipelines process Snowflake data only – all origins and destinations read from and write to Snowflake.

However, many concepts and behaviors remain exactly the same. For example, you use origin, processor, and destination stages to define processing in all pipeline types. You create jobs to run pipelines. You can use runtime parameters in all pipelines.

Here are some highlights of the similarities and differences between Transformer for Snowflake, Data Collector, and Transformer:
Similarities
Since you design and run Transformer for Snowflake pipelines in Control Hub, some basic concepts remain the same.
Differences
Unlike Data Collector and Transformer:
  • Transformer for Snowflake provides a hosted engine that most organizations use, avoiding the need to install and maintain engines. You can also deploy engines, depending on the account agreement for your organization.

    For more information, see Hosted or Deployed Engines.

  • Transformer for Snowflake includes stages based on Snowflake functionality, such as the Cube processor, which applies the GROUP BY CUBE command.
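    For reference, GROUP BY CUBE is standard Snowflake SQL that aggregates across every combination of the listed columns. The table and column names in this sketch are hypothetical:

    ```sql
    -- Aggregates sales by (region, product), by region alone,
    -- by product alone, and overall; NULL marks each rolled-up
    -- grouping column in the result.
    SELECT region, product, SUM(amount) AS total_sales
    FROM sales
    GROUP BY CUBE (region, product);
    ```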
  • Transformer for Snowflake uses the terms "column" and "row" to align with Snowflake terminology. Data Collector and Transformer use the terms "field" and "record" for the same concepts.
  • Like Data Collector, Transformer for Snowflake includes executor stages to perform tasks, such as sending an email notification.

    Transformer for Snowflake executors perform tasks using Snowflake integrations after all pipeline writes complete, rather than being triggered by event records. These executors can be placed anywhere in the data flow.

    Data Collector executors expect to be triggered by event records, which are generated only by certain stages. These executors must be placed downstream from event-generating stages.

  • You can monitor Transformer for Snowflake jobs as you would any other job. However, the job summary for a Transformer for Snowflake job displays different information:
    • Input and output row count

      You cannot view an error row count, row throughput, or runtime statistics as you can for other job types.

    • Log messages
    • Snowflake queries run for the job
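Independently of the job summary, Snowflake itself records the queries it executes, so recently run queries can also be inspected directly in Snowflake with the QUERY_HISTORY table function. The filter and limit below are illustrative:

```sql
-- List the most recent queries visible to the current session,
-- newest first, with their run time in milliseconds.
SELECT query_text, start_time, total_elapsed_time
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY())
ORDER BY start_time DESC
LIMIT 20;
```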