Configuring a flow

About this task

Configure a flow to define how data moves and transforms from a source system to one or more target systems.

You can include the following stages in a flow:
  • A single source stage
  • Multiple processor stages
  • Multiple target stages
  • Multiple executor stages
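
The composition rule above can be pictured as a small data model. The following is a minimal Python sketch for illustration only, not StreamSets code or an API of the product: the names Stage, FlowSketch, and check_composition are hypothetical. It checks only what the list states, namely that a flow has no more than one source stage and that every stage is one of the four listed types.

```python
from dataclasses import dataclass, field
from typing import List

# Stage types named in the list above.
STAGE_TYPES = ("source", "processor", "target", "executor")


@dataclass
class Stage:
    """Hypothetical stand-in for a stage on the canvas."""
    name: str
    stage_type: str  # expected to be one of STAGE_TYPES


@dataclass
class FlowSketch:
    """Hypothetical stand-in for a flow; not a product class."""
    name: str
    stages: List[Stage] = field(default_factory=list)

    def check_composition(self) -> List[str]:
        """Return a list of composition issues (empty if none found)."""
        issues = []
        for stage in self.stages:
            if stage.stage_type not in STAGE_TYPES:
                issues.append(
                    f"unknown stage type for {stage.name}: {stage.stage_type}"
                )
        source_count = sum(1 for s in self.stages if s.stage_type == "source")
        if source_count > 1:
            issues.append(
                f"a flow can include a single source stage, found {source_count}"
            )
        return issues


flow = FlowSketch(
    name="demo",
    stages=[
        Stage("read_input", "source"),
        Stage("mask_fields", "processor"),
        Stage("write_output", "target"),
    ],
)
print(flow.check_composition())
```

This sketch does not model whether a source stage is required or how stages are linked. In the product, the Validate action reports the actual validation issues for a flow.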

Procedure

  1. From within a project, on the Assets page, click New Asset > Create a real-time streaming data flow.
  2. On the Create a StreamSets flow page, configure the following properties:
    Flow Details Property    Description
    Name                     Name for the flow.
    Description              Optional description.
    Environment              StreamSets environment to use. Determines the stages and functionality that you can configure for the flow. Select an available environment.
  3. Click Create.
    An empty canvas appears.
  4. Use the Node palette to add a source stage. In the Properties panel, configure the stage properties and click Save.
    For configuration details about source stages, see Sources.
  5. Use the Node palette to add the next stage that you want to use. Configure the stage properties and click Save.
    Tip: The canvas connects each new stage to the last-selected stage. To remove a stage or link from the canvas, select it and delete it.
    • For configuration details about processors, see Processors.
    • For configuration details about targets, see Targets.
    • For configuration details about executors, see Executors.
  6. Add additional stages as necessary.
  7. Click the Save icon to save your changes.
  8. Optionally, configure flow properties, such as error record handling and the environment used to execute the flow.
  9. When appropriate, click Validate to view a list of validation issues in your flow.
    For more information, see Flow validation.
  10. After you finish configuring the flow, you can create and run a job to execute the flow.