Creating a pipeline

Create a pipeline to run an end-to-end scenario to automate all or part of the AI lifecycle. For example, create a pipeline that creates and trains an asset, promotes it to a space, creates a deployment, then scores the model.

Beta notice: The Watson Studio Pipelines service is provided as a Beta, solely for purposes of testing and providing feedback to IBM prior to general availability. It is not intended for production use. You can download this service from the Watson Studio Pipelines Early Access Site.

Adding a pipeline to a project

  1. Open a project.
  2. Click New asset and choose Pipelines.
  3. Enter a name and optional description.
  4. (Optional) Select DataStage functions to enable DataStage functions in the pipeline's Expression Builder.
  5. Click Save to open the canvas.

Pipeline access

When you use a pipeline to automate a flow, you must have access to all of the elements in the pipeline. Make sure that you create and run pipelines with the proper access to all assets, projects, and spaces used in the pipeline.

Overview: Building a pipeline

Follow these high-level steps to build and run a pipeline.

  1. Drag node objects onto the canvas. For example, drag a Run notebook node onto the canvas.
  2. Use the action menu for each node to view and select options.
  3. Configure a node as required. You are prompted to supply required input options. For some nodes, you can view or configure output options as well. For examples of configuring nodes, see Configuring pipeline components.
  4. Drag from one node to another to connect and order the pipeline.
  5. Optional: Click the Global objects icon in the toolbar to configure runtime options for the pipeline.
  6. When the pipeline is complete, click the Run icon on the toolbar to run the pipeline. You can start a trial run to test the pipeline, or schedule a job when you are confident in the pipeline.

Configuring nodes

As you add nodes to a pipeline, you must configure them to provide all of the required details. For example, if you add a node to run an AutoAI experiment, you must configure the node to specify the experiment, load the training data, and specify the output file.

[Figure: AutoAI node parameters]

Connecting nodes

When you build a complete pipeline, connect the nodes in the order in which they must run. To connect nodes, hover over a node and drag a connection to the target node. Disconnected nodes run in parallel.

[Figure: Connecting nodes]
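
The ordering rule can be illustrated outside the product. The following is a minimal Python sketch, not the Pipelines engine: it models nodes and connections as a dependency graph with the standard library graphlib module and groups the nodes into stages. The node names and the execution_stages helper are hypothetical. Nodes that have no connection path between them land in the same stage, which is the sense in which disconnected nodes can run in parallel.

```python
from graphlib import TopologicalSorter


def execution_stages(nodes, connections):
    """Group nodes into stages; nodes in one stage have no ordering between them."""
    # Map each node to the set of nodes that must finish before it starts.
    predecessors = {node: set() for node in nodes}
    for source, target in connections:
        predecessors[target].add(source)

    sorter = TopologicalSorter(predecessors)
    sorter.prepare()  # raises graphlib.CycleError if the connections form a loop

    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every node whose predecessors are done
        stages.append(ready)
        sorter.done(*ready)
    return stages


# Hypothetical node names, for illustration only.
nodes = ["run_autoai", "create_deployment", "run_notebook"]
connections = [("run_autoai", "create_deployment")]

stages = execution_stages(nodes, connections)
for number, stage in enumerate(stages, start=1):
    print(f"stage {number}: {', '.join(stage)}")
```

In this sketch, run_notebook is not connected to anything, so it shares the first stage with run_autoai, while create_deployment waits for run_autoai to finish.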

Defining pipeline parameters

A pipeline parameter defines a global variable for the whole pipeline. Use pipeline parameters to specify data from one of these categories:

Parameter type   Can specify
Basic            JSON types such as string, integer, or a JSON object
CPDPath          Resources, such as assets, asset containers, or jobs
InstanceCRN      Storage, machine learning instances, and so on
Other            Various configuration types, such as status, timeout length, estimator, error policies, and so on

To specify a pipeline parameter:

  1. Click the Parameter icon in the toolbar to configure options.
  2. Assign a name and optional description. Select a type and provide any required information. Click Add when the definition is complete.
  3. Repeat until you finish defining parameters.
  4. Click Save to make the parameters available to the pipeline.
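
As a way to picture what a set of parameter definitions contains, here is a minimal Python sketch. It is not a Pipelines API: the ParameterKind and PipelineParameter classes, the define_parameters helper, and the example values are all hypothetical, and the rule that names must be unique is an assumption that follows from treating each parameter as a global variable for the pipeline.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class ParameterKind(Enum):
    """The four parameter categories listed in the table above."""
    BASIC = "Basic"
    CPD_PATH = "CPDPath"
    INSTANCE_CRN = "InstanceCRN"
    OTHER = "Other"


@dataclass(frozen=True)
class PipelineParameter:
    name: str
    kind: ParameterKind
    value: Any = None       # hypothetical field for the "required information"
    description: str = ""   # optional, as in the dialog


def define_parameters(definitions):
    """Collect definitions by name; blank and duplicate names are rejected (assumption)."""
    parameters = {}
    for parameter in definitions:
        if not parameter.name.strip():
            raise ValueError("a parameter name is required")
        if parameter.name in parameters:
            raise ValueError(f"duplicate parameter name: {parameter.name}")
        parameters[parameter.name] = parameter
    return parameters


# Placeholder values; real CPDPath values are chosen in the user interface.
params = define_parameters([
    PipelineParameter("training_data", ParameterKind.CPD_PATH, "<path to a data asset>"),
    PipelineParameter("timeout_minutes", ParameterKind.BASIC, 30, "Trial run timeout"),
])
```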

Saving a version of a pipeline

You can save a version of a pipeline and revert to it later. For example, save a version to preserve a particular configuration before you make changes. When you share a pipeline, the latest version is used.

To save a version:

  1. Click the version icon on the toolbar.
  2. In the version pane, assign a name to the version and save.

When you run the pipeline, you can choose from available versions.

Note: You cannot delete a saved version.
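
The versioning behavior described in this section (save a named snapshot, revert to it later, no deletion) can be pictured with a small Python sketch. This is an illustration under stated assumptions, not how the service stores versions: the PipelineVersions class, its method names, and the timeout_minutes setting are hypothetical.

```python
import copy


class PipelineVersions:
    """Append-only history of named configuration snapshots."""

    def __init__(self, config):
        self.config = config    # the current working configuration
        self._versions = []     # (name, snapshot) pairs; never removed

    def save_version(self, name):
        # Deep-copy so later edits to the working config do not alter the snapshot.
        self._versions.append((name, copy.deepcopy(self.config)))

    def revert_to(self, name):
        # Search newest first so a reused name resolves to the most recent save.
        for saved_name, snapshot in reversed(self._versions):
            if saved_name == name:
                self.config = copy.deepcopy(snapshot)
                return
        raise KeyError(f"no saved version named {name!r}")

    def version_names(self):
        return [saved_name for saved_name, _ in self._versions]

    # There is deliberately no delete method: saved versions cannot be deleted.


pipeline = PipelineVersions({"timeout_minutes": 30})
pipeline.save_version("before-tuning")
pipeline.config["timeout_minutes"] = 90   # make a change after saving
pipeline.revert_to("before-tuning")       # restore the saved configuration
```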

Running a pipeline

The Run option gives you several options, including starting a trial run to test the pipeline and scheduling a job.

After you run a pipeline as a trial run or a job, click the node output to view the results of a successful run. If the run fails, error messages and logs are provided to help you correct the issue.

Next steps

Configure pipeline components