Designing the data flow

You can branch and merge streams in the flow.

Branching streams

When you connect a stage to multiple downstream stages, every record passes to every connected stage. You can configure required fields on a stage to discard records before they enter that stage, but by default no records are discarded.

For example, in the following flow, all of the data from the Directory source passes to both branches of the flow for different types of processing. You can optionally configure required fields on the Field Splitter or Field Replacer stage to discard records that those branches do not need.

A single Directory source passes to two stages, creating two branches of the flow
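The following sketch is a conceptual model only, not Data Collector configuration or API; the records, stage names, and required-field choices are invented. It illustrates the branching behavior described above: every record reaches every connected branch unless a branch's required fields discard it first.

```python
# Conceptual sketch only -- models branching, not Data Collector itself.
# Every record from the source reaches every connected branch; a branch's
# required fields drop records before they enter that stage.

records = [
    {"name": "a.txt", "size": 120},
    {"name": "b.txt"},                 # missing the "size" field
]

def branch(stage_name, required_fields, stream):
    """Keep only records that contain every required field."""
    kept = [r for r in stream if all(f in r for f in required_fields)]
    print(f"{stage_name} receives {len(kept)} of {len(stream)} records")
    return kept

# Both branches see the full stream; each applies its own required fields.
field_splitter_input = branch("Field Splitter", ["name"], records)   # keeps both
field_replacer_input = branch("Field Replacer", ["size"], records)   # drops b.txt
```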

To route data based on more complex conditions, use a Stream Selector.
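As a rough mental model (illustrative Python rather than an actual Stream Selector configuration, with invented condition labels and records), a Stream Selector checks each record against a list of conditions and routes it to the first matching output stream, or to the default stream when no condition matches:

```python
# Conceptual sketch only -- condition-based routing in the style of a
# Stream Selector. Condition labels and records are made up.

records = [{"id": 1, "value": None}, {"id": 2, "value": 7}]

def stream_selector(stream, conditions):
    """conditions: list of (stream_label, predicate) pairs.
    A record goes to the first matching stream, otherwise to 'default'."""
    routed = {label: [] for label, _ in conditions}
    routed["default"] = []
    for record in stream:
        for label, predicate in conditions:
            if predicate(record):
                routed[label].append(record)
                break
        else:
            routed["default"].append(record)
    return routed

routed = stream_selector(records, [("null values", lambda r: r["value"] is None)])
print(routed["null values"])   # [{'id': 1, 'value': None}]
print(routed["default"])       # [{'id': 2, 'value': 7}]
```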

Some stages generate events that pass to event streams. Event streams originate from an event-generating stage, such as a target or source, and pass from the stage through an event stream output, as follows:

A flow with an event stream branching from the event-generating stage
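As a loose illustration only (the stage behavior and event name below are assumptions for the sketch, not the actual event framework API), an event-generating stage produces data records on its data output and event records on a separate event output:

```python
# Conceptual sketch only -- an event-generating stage emits data records on
# its data output and event records on a separate event output.

def directory_source(file_names):
    """Emit one data record per file, plus one event record when done."""
    data_stream = [{"file": name} for name in file_names]
    event_stream = [{"event": "no-more-data"}]   # illustrative event name
    return data_stream, event_stream

data_records, event_records = directory_source(["a.txt", "b.txt"])
# data_records flow to downstream processors; event_records flow only to
# targets or executors, never back into the data stream.
```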

For more information about the event framework and event streams, see Dataflow triggers overview.

Merging streams

You can merge streams of data in a flow by connecting two or more stages to the same downstream stage. When you merge streams of data, Data Collector channels the data from all streams to the same stage, but does not perform a join of records in the stream.

For example, in the following flow, the Stream Selector stage sends data with null values to the Field Replacer stage:

A flow canvas shows the Stream Selector stage sending data to both the Field Replacer stage and the Expression Evaluator stage. The Field Replacer stage also sends data to the Expression Evaluator stage, merging the streams of the flow.

The data from the Stream Selector default stream and all data from the Field Replacer stage pass to the Expression Evaluator stage for further processing, but in no particular order and with no merging of records.
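The following sketch models this merge behavior with invented records and a stand-in function for the Expression Evaluator stage: records from both streams reach the same stage, but no join is performed and no ordering is guaranteed.

```python
# Conceptual sketch only -- merging streams channels every record from each
# upstream stream into the same downstream stage without joining records.

default_stream = [{"id": 2, "value": 7}]    # Stream Selector default stream
replaced_stream = [{"id": 1, "value": 0}]   # output of the Field Replacer

def expression_evaluator(*streams):
    """Process every record that arrives, from whichever upstream stream."""
    for stream in streams:
        for record in stream:
            yield {**record, "processed": True}

merged = list(expression_evaluator(default_stream, replaced_stream))
print(merged)   # two independent records -- no join was performed
```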

Important: Flow validation does not prevent duplicate data. To avoid writing duplicate data to targets, configure the flow logic to remove duplicate data or to prevent the generation of duplicate data.
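One possible approach, sketched in Python with a made-up key field rather than any specific Data Collector stage, is to track the keys already seen and drop any record whose key repeats before it reaches the target:

```python
# Conceptual sketch only -- drop duplicate records before they reach a target,
# keyed on a field that identifies a record (here, an invented "id" field).

def remove_duplicates(stream, key="id"):
    """Pass each record downstream only the first time its key is seen."""
    seen = set()
    for record in stream:
        if record[key] in seen:
            continue                     # duplicate: drop it
        seen.add(record[key])
        yield record

records = [{"id": 1}, {"id": 2}, {"id": 1}]
print(list(remove_duplicates(records)))  # [{'id': 1}, {'id': 2}]
```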

Note that you cannot merge event streams with data streams. Event records must stream from the event-generating stage to targets or executors without joining a data stream. For more information about the event framework and event streams, see Dataflow triggers overview.