Important: IBM Cloud Pak® for Data Version 4.8 will reach end of support (EOS) on 31 July 2025. For more information, see the Discontinuance of service announcement for IBM Cloud Pak for Data Version 4.X.
Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.8 reaches end of support. For more information, see Upgrading from IBM Cloud Pak for Data Version 4.8 to IBM Software Hub Version 5.1.

Planning a pipeline (Watson Pipelines)

Review these considerations as you plan how you will connect to resources, add assets to your pipeline, and manage pipeline resources.

Accessing the components in your pipeline

When you use a pipeline to automate a flow, you must have access to all of the elements in the pipeline. Make sure that you create and run pipelines with the proper access to all assets, projects, and spaces used in the pipeline. Collaborators who run the pipeline must also be able to access the pipeline components.

Managing pipeline credentials

To run a job, the pipeline must have access to Cloud Pak for Data credentials. Typically, a pipeline uses your personal API key to execute long-running operations in the pipeline without disruption. If credentials are not available when you create the job, you are prompted to supply an existing API key or to create a new one.

To generate an API key from your IBM Cloud Pak for Data user account:

  1. Go to your user profile.
  2. Click API keys > Generate new token.
  3. Create or select an API key for your user account.
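
If you want to confirm that an API key works before you use it for pipeline jobs, you can exchange it for a bearer token. The following is a minimal sketch that assumes the standard platform authorization route; replace <cpd-route> with the route for your instance and admin-user with your user name:

curl -k -X POST https://<cpd-route>/icp4d-api/v1/authorize \
  -H 'Content-Type: application/json' \
  -d '{"username": "admin-user", "api_key": "<your-api-key>"}'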

Adding assets to a pipeline

When you create a pipeline, you add assets, such as data, notebooks, deployment jobs, or Data Refinery jobs, to the pipeline to orchestrate a sequential process. The strongly recommended method for adding assets to a pipeline is to collect the assets in the project that contains the pipeline and use the asset browser to select project assets for the pipeline.

Attention: Although you can include assets from other projects, doing so can introduce complexities and potential problems in your pipeline and could be prohibited in a future release. The recommended practice is to use assets from the current project.

Manage resources by setting memory limits

Set the Redis memory size limit for your Cloud Pak for Data instance to avoid memory overconsumption. The recommended memory size is the product of the maximum number of parallel runs and the user variable size limit. For example, if you accommodate 1000 parallel pipeline runs and your user variable size limit is 256Ki, consider setting your memory limit to 256Mi.
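
How you apply the limit depends on how Redis is deployed in your instance. The following is a minimal sketch that assumes Redis runs as a StatefulSet in the cpd-instance namespace; the workload name pipelines-redis is a placeholder, so check your instance for the actual name. If an operator manages the workload, set the limit through the operator's configuration instead so that the change is not reconciled away.

# List the Redis workloads for the instance (the name is an assumption; verify it first)
oc -n cpd-instance get statefulsets | grep -i redis

# Set the memory limit on the Redis workload to the calculated size
oc -n cpd-instance set resources statefulset/pipelines-redis --limits=memory=256Mi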

Update default runtime type

You can update the default runtime type for nodes by updating your ConfigMap.

Open the watson-pipelines-config ConfigMap and update the value of default_runtime_type to one of the following:

  • shared: nodes use shared runtimes by default.
  • standalone: nodes use standalone runtimes by default.

An example is as follows:

oc -n cpd-instance get cm watson-pipelines-config -o yaml
apiVersion: v1
data:
  default_runtime_type: shared
  shutdown: "false"
  user_variables_size_limit: 64Ki
kind: ConfigMap
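
To change the value, you can edit the ConfigMap directly or apply a merge patch, which is equivalent to editing the value in place:

oc -n cpd-instance patch cm watson-pipelines-config --type merge -p '{"data":{"default_runtime_type":"standalone"}}'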

Updates to the ConfigMap affect new nodes only. Existing nodes are unaffected.

Parent topic: Getting started with Watson Pipelines