Preview
Overview
You can preview data to help build or fine-tune a pipeline. You can preview complete or incomplete pipelines.
When you preview data, source data from the origin passes through the pipeline, allowing you to review how the data passes and changes through each stage. You can edit stage properties and run the preview again to see how your changes affect the data.
You can choose to preview using Spark libraries embedded in the Transformer installation or using the Spark cluster configured for the pipeline.
You can preview all stages in the pipeline, or you can perform a partial preview when one of the pipeline stages encounters an error. You can preview data for one stage at a time or for a group of stages. You can also view the preview data in list or table view.
When previewing data for a processor, you can choose how to display the order of output records. You can display output records in the order that matches the input records or in the order produced by the processor.
After running preview, you can view the input and output schema for each stage on the Schema tab in the pipeline properties panel.
Preview Availability
You can preview complete and incomplete pipelines.
The Preview icon () becomes
active when preview is available. You can preview data under the following
conditions:
- The authoring Transformer is an available registered Transformer.
- All stages in the pipeline are connected.
- All required properties are defined.
Spark for Preview
- Embedded Spark libraries
- Transformer includes embedded Spark libraries that you can use to preview a local or cluster pipeline.
- Spark cluster configured for the pipeline
- Transformer previews the pipeline using the Spark cluster configured for the pipeline.
In most cases, you'll want to preview using the configured Spark cluster so that Transformer uses the same processing as when you run the pipeline. However, if you want to quickly test pipeline logic that doesn't require using the source data from the origin, you can use development origins and then preview the pipeline using the embedded Spark libraries.
Writing to Destinations
As a tool for development, preview does not write data to destinations by default.
If you like, you can configure the preview to write data to destinations. We advise against writing preview data to production destinations.
Preview Data Types
MMM d, y h:mm:ss
a
.Preview Codes
In Preview mode, Transformer displays different colors for different types of data. Transformer uses other codes and formatting to highlight changed fields.
Preview Code | Description |
---|---|
Black values | Date data |
Blue values | Numeric data |
Green values | String data |
Red values | Boolean data |
Light red background | Fields removed by a stage |
Green stage | First stage in a multiple-stage preview |
Red stage | Last stage in a multiple-stage preview |
Processor Output Order
When previewing data for a processor, you can preview both the input and the output data. You can display the output records in the order that matches the input records or in the order produced by the processor.
In most cases when you preview data for a processor, you'll want to compare matching input and output records side by side because the processor produces updated records. For example, when you preview data for a Field Renamer processor, Transformer by default displays the output records in matching order with the input records. The Preview panel highlights the changed field in each record, as follows:
However, some processors such as the Aggregate or Profile processor don’t update records; they create new records. And other processors such as the Sort processor reorder the records. In these cases, comparing matching input and output records isn’t relevant. It's more helpful to display the output records in the order produced by the processor.
For example, when you preview data for an Aggregate processor, Transformer displays the output records in the output order by default. The Preview panel displays the input records under Input Data and the output records under Output Data without attempting to match the records, as follows:
If you display the output records in matching order with the input records for the same Aggregate processor, Transformer attempts to match the input and output records. The Preview panel displays the input records first, noting under Output Data that no matching records exist. The Preview panel then displays the new output records created by the processor, as follows:
Input and Output Schema for Stages
After running preview for a pipeline, you can view the input and output schema for each stage on the Schema tab in the pipeline properties panel. The schema includes each field path and data type.
Control Hub uses the schema extracted from the last data preview to list available field paths when you invoke expression completion for a stage property.
If you change the schema for a pipeline, for example if you remove a field, rename a field, or change the data type of a field, then you must run preview again so that the schema reflects the change.
In most cases as you configure stage properties, you can use expression completion to specify a field path. However, in some cases, you might use the Schema tab to copy a field path.
For example, let’s say you are configuring a Spark SQL Query processor and you need to
reference fields in the query. After running preview, you select the processor in the
pipeline canvas, and then click the Schema tab in the pipeline properties panel. You
click the Copy Field Path to Clipboard icon () to
copy the field path from the Schema tab, and then paste the field path into the Spark
SQL query to run.
The following image displays a sample Schema tab with the time of the last data preview:

Previewing a Pipeline
Preview a pipeline to review the values for each record to determine if the pipeline transforms data as expected. You can preview data for a single stage or for a group of linked stages.
Editing Properties
When running preview, you can edit stage properties to see how the changes affect preview data. For example, you might edit the condition in a Stream Selector processor to see how the condition alters which records pass to the different output streams.
When you edit properties, you can test the change by refreshing the preview data.
-
To edit stage properties while running preview, select the stage you want to
edit and click the Stage Configuration icon:
.
- Change properties as needed.
-
To test the changed properties, click Reload
Preview.
This refreshes the preview data.
- To revert your change, manually change the property back.