Creating a pipeline

Create a pipeline to run an end-to-end scenario to automate all or part of the AI lifecycle. For example, create a pipeline that creates and trains an asset, promotes it to a space, creates a deployment, then scores the model.

Watch this video to see how to create and run a sample pipeline.

This video provides a visual method to learn the concepts and tasks in this documentation.

Overview: Adding a pipeline to a project

Follow these steps to add a pipeline to a project:

  1. Open a project.

  2. Click New asset > Automate model lifecycle.

  3. Enter a name and an optional description.

  4. Optional: Select DataStage functions to make DataStage functions available in the pipeline's Expression Builder.

  5. Click Create to open the canvas.

Pipeline access

When you use a pipeline to automate a flow, you must have access to all of the elements in the pipeline. Make sure that you create and run pipelines with the proper access to all assets, projects, and spaces used in the pipeline.

Overview: Building a pipeline

Follow these high-level steps to build and run a pipeline.

  1. Drag node objects onto the canvas. For example, drag a Run notebook job node onto the canvas.
  2. Use the action menu for each node to view and select options.
  3. Configure a node as required. You are prompted to supply the required input options. For some nodes, you can view or configure output options as well. For examples of configuring nodes, see Configuring pipeline components.
  4. Drag from one node to another to connect and order the pipeline.
  5. Optional: Click the global objects icon in the toolbar to configure runtime options for the pipeline.
  6. When the pipeline is complete, click the Run icon on the toolbar to run the pipeline. You can run a trial to test the pipeline, or you can schedule a job when you are confident in the pipeline.

Configuring nodes

As you add nodes to a pipeline, you must configure them to provide all of the required details. For example, if you add a node to run an AutoAI experiment, you must configure the node to specify the experiment, load the training data, and specify the output file:

AutoAI node parameters

Connecting nodes

When you build a complete pipeline, the nodes must be connected in the order in which they run in the pipeline. To connect nodes, hover over a node and drag a connection to the target node. Disconnected nodes are run in parallel.
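Connection order determines execution order: a node runs only after every node that links into it completes, and nodes with no connecting path between them can run at the same time. The scheduling model can be sketched in plain Python with the standard library's topological sorter (the node names are illustrative examples, not a pipeline API):

```python
# Illustrative model of pipeline execution order: each node waits for its
# upstream nodes, and disconnected nodes run in the same parallel "wave".
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Edges mirror the example flow train -> promote -> deploy -> score,
# plus a disconnected notebook job node (all names are hypothetical).
graph = {
    "promote_to_space": {"train_model"},
    "create_deployment": {"promote_to_space"},
    "score_model": {"create_deployment"},
    "run_notebook_job": set(),  # no links: runs in parallel with the rest
}

ts = TopologicalSorter(graph)
ts.prepare()
waves = []
while ts.is_active():
    ready = list(ts.get_ready())  # nodes whose dependencies have finished
    waves.append(sorted(ready))
    ts.done(*ready)

print(waves)
```

In the first wave, the disconnected notebook node runs alongside the training node; the rest of the flow proceeds in link order.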


You can relink nodes by dragging a link on the canvas from one node to another. Conditions attached to the link, including conditions that refer to node status, then apply to the new node.

Changing the connection between nodes can result in errors. Error messages notify you if you must update the node configuration. For example, referring to a node that is no longer accessible results in an error.

Defining pipeline parameters

A pipeline parameter defines a global variable for the whole pipeline. Use pipeline parameters to specify data from one of these categories:

Parameter type   Can specify
Basic            JSON types such as string, integer, or a JSON object
CPDPath          Resources available within the platform, such as assets, asset containers, connections, notebooks, hardware specifications, projects, spaces, or jobs
InstanceCRN      Storage instances, machine learning instances, and other services
Other            Various configuration types, such as status, timeout length, estimator, or error policy
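To make the distinction concrete: a Basic parameter holds an ordinary JSON value, while a CPDPath parameter holds a path-style reference to a platform resource. The following sketch illustrates this with plain Python; the parameter names and the CPD path string are made-up examples of the general shape, not real assets or a pipeline API:

```python
import json

# Basic parameters: plain JSON values (string, integer, JSON object).
# All names and values here are hypothetical.
basic_params = {
    "model_name": "churn-classifier",        # string
    "max_training_rounds": 10,               # integer
    "training_options": {"holdout": 0.2},    # JSON object
}

# CPDPath parameters: path-style references to platform resources.
# The placeholders stand in for real project and asset identifiers.
cpd_path_param = "cpd:///projects/<project_id>/assets/<asset_id>"

# Either kind of value serializes cleanly to JSON for use as a global
# variable across the whole pipeline.
payload = json.dumps({"basic": basic_params, "training_data": cpd_path_param})
print(payload)
```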

To specify a pipeline parameter:

  1. Click the global objects icon in the toolbar to open the Manage global objects window.
  2. Select the Pipeline parameters tab to configure parameters.
  3. Click Add pipeline parameter.
  4. Specify a name and an optional description.
  5. Select a type and provide any required information.
  6. Click Add when the definition is complete, and repeat the previous steps until you finish defining the parameters.
  7. Close the Manage global objects dialog.

The parameters are now available to the pipeline.

Next steps

Configure pipeline components

Parent topic: IBM Orchestration Pipelines