Samples for programming a pipeline

See samples for programming a pipeline with Bash scripts, including utility scripts and the installation of third-party packages.

Using CPDCTL

Refer to samples to get started with some end-to-end example code, including how to deploy and monitor pipelines.

Programming with Bash

There are various development tools you can use by running Bash scripts in pipelines.

Using Bash scripts for utility functions

You can run the following scripts in the Run Bash script node. You must have DataStage installed and a persistent storage volume created for your pipelines, because the scripts are installed to the ds-storage persistent volume. See Storage and data access for IBM Orchestration Pipelines.

DSAddEnvVar.sh

Add an environment key-value pair to a runtime environment in your current project.

Example

DSAddEnvVar.sh -n <env_name> --env=key1 --value=value1
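When several variables are needed, the call above can be generated once per pair. The following is a minimal sketch, not part of the shipped scripts: the helper name `print_add_env_commands` is hypothetical, and it only prints commands in the syntax of the example above so that you can review them before running them. It assumes that keys and values contain no spaces.

```shell
# Hypothetical helper: print one DSAddEnvVar.sh command per KEY=VALUE argument.
# Nothing is executed; the output is meant for review first.
print_add_env_commands() {
  env_name="$1"
  shift
  for pair in "$@"; do
    key="${pair%%=*}"     # text before the first '='
    value="${pair#*=}"    # text after the first '='
    printf 'DSAddEnvVar.sh -n %s --env=%s --value=%s\n' "$env_name" "$key" "$value"
  done
}

# Usage: print_add_env_commands <env_name> key1=value1 key2=value2
```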

DSDeleteEnvVar.sh

Delete an environment variable from a runtime environment.

Example

DSDeleteEnvVar.sh -n <env_name> --env key1

DSGetIdForJob.sh

Retrieve the job id for a job.

Example

DSGetIdForJob.sh <job_name>

DSGetJobInfo.sh

Retrieve the job information for a job run.

Example

DSGetJobInfo.sh <DSJ.JOBSTATUS|DSJ.JOBNAME|DSJ.PARAMLIST|DSJ.STAGELIST|DSJ.USERSTATUS|DSJ.JOBDESC|DSJ.JOBELAPSED> --job-id=<id of job> --run-id=<id of job run>
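Because the first argument must be one of a fixed set of keys, it can help to check it before calling the script. The sketch below assumes a hypothetical helper named `is_valid_job_info_key`. It only compares its argument with the keys listed in the example and does not call DSGetJobInfo.sh itself.

```shell
# Hypothetical helper: succeeds only for the keys shown in the DSGetJobInfo.sh example.
is_valid_job_info_key() {
  case "$1" in
    DSJ.JOBSTATUS|DSJ.JOBNAME|DSJ.PARAMLIST|DSJ.STAGELIST|DSJ.USERSTATUS|DSJ.JOBDESC|DSJ.JOBELAPSED)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Usage in a Run Bash script node (JOB_ID and RUN_ID are placeholders that you supply):
# if is_valid_job_info_key "DSJ.JOBSTATUS"; then
#   DSGetJobInfo.sh DSJ.JOBSTATUS --job-id="${JOB_ID}" --run-id="${RUN_ID}"
# fi
```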

DSGetLinkInfo.sh

Retrieve information about a link in a flow, such as the link name or its row count.

Example

DSGetLinkInfo.sh <link_name> <link_metric_name> --job-id=<id of job> --run-id=<id of job run>

link_metric_name is "DSJ.LINKNAME" or "DSJ.LINKROWCOUNT".
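A common use of the row count is to fail a pipeline step when a link carried too few rows. The sketch below assumes that DSGetLinkInfo.sh prints the DSJ.LINKROWCOUNT value as a plain integer on standard output, which this page does not confirm. The helper name `rows_at_least` is hypothetical.

```shell
# Hypothetical helper: succeed when a reported row count is a whole number
# that is greater than or equal to the given minimum.
rows_at_least() {
  count="$1"
  minimum="$2"
  case "$count" in
    ''|*[!0-9]*) return 2 ;;   # not a plain non-negative integer
  esac
  [ "$count" -ge "$minimum" ]
}

# Usage (link name, JOB_ID, and RUN_ID are placeholders):
# count=$(DSGetLinkInfo.sh "Link_1" "DSJ.LINKROWCOUNT" --job-id="${JOB_ID}" --run-id="${RUN_ID}")
# rows_at_least "${count}" 1 || echo "Link_1 carried no rows" >&2
```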

DSGetParamInfo.sh

To get the current value of a parameter, get the project id, job id, and run id. Write a wrapper function to retrieve the job run metadata.

The script accepts either job-id and run-id, or job-name and run-name, as inputs. Either job-name or job-id is required, while run-name and run-id are optional. You can provide the job name with the suffix only, or the job name with the invocation name, for example Flow1.DataStage job or Flow1.DataStage job.invocationName. You can also provide the job name and the run name separately. If you do not provide the run name or the run ID, the latest available run is used by default.

To call the function in the before/after job subroutines, add the parameter $ENABLE_CPDCTL=1 at a flow level. For more information on before/after subroutines, see Setting up before-job and after-job subroutines in DataStage.

Example

DSGetParamInfo.sh <param_name> --job-id=<job id> --run-id=<id of job run>

DSGetParamInfo.sh <param_name> --job-name=<job name> --run-name=<invocation id or run name>
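The two call forms can be wrapped in one function that picks the right options and leaves out the run option when no run is given. This is a minimal sketch under stated assumptions: the function name `get_param_info` and the `DS_GET_PARAM_INFO` override variable are hypothetical and exist only to make the wrapper easy to try outside a pipeline. Quoting each option keeps job names that contain spaces intact.

```shell
# Hypothetical wrapper around DSGetParamInfo.sh.
# Arguments: <param_name> <id|name> <job id or job name> [run id or run name]
# DS_GET_PARAM_INFO (assumption) overrides the command that is called.
get_param_info() {
  param="$1"
  mode="$2"
  job="$3"
  run="${4:-}"
  cmd="${DS_GET_PARAM_INFO:-DSGetParamInfo.sh}"
  case "$mode" in
    id)
      set -- "--job-id=${job}"
      if [ -n "$run" ]; then set -- "$@" "--run-id=${run}"; fi ;;
    name)
      set -- "--job-name=${job}"
      if [ -n "$run" ]; then set -- "$@" "--run-name=${run}"; fi ;;
    *)
      echo "mode must be 'id' or 'name'" >&2
      return 1 ;;
  esac
  "$cmd" "$param" "$@"
}

# Usage: get_param_info <param_name> name "Flow1.DataStage job"
```

When the run argument is omitted, no run option is passed, so the script falls back to the latest available run as described above.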

DSGetStageInfo.sh

Get the stage info of a DataStage job run.

Example

rows=$(DSGetStageInfo.sh "DSJ.STAGEINROWNUM" "Peek_1" --job-id=<id of job> --run-id=<id of job run>)
links=$(DSGetStageInfo.sh "DSJ.LINKLIST" "Peek_1" --job-id=<id of job> --run-id=<id of job run>)

DSGetUserStatus.sh

Get the user status of a DataStage job run.

Example

DSGetUserStatus.sh --job-id=<id of job> --run-id=<id of job run>

DSGetVersionInfo.sh

Get the DataStage service version information.

Example

DSGetVersionInfo.sh DSJ.VER.DS

DSJobNameFromJobId.sh

Get the name of a job from its job ID.

Example

DSJobNameFromJobId.sh --job-id=<id of job>

DSListEnvVars.sh

List all environment variables for a runtime environment.

Example

DSListEnvVars.sh -n <env_name>

DSRunJob.sh

Run a job by specifying its job ID.

Example

DSRunJob.sh --job-id=<id of job>

DSSetParam.sh

Update the value of a job parameter. The script accepts either a job ID or a job name as input.

Example

DSSetParam.sh <job_id|job_name> <param_name> <param_value>

DSStopJob.sh

Stop a job run by specifying the job ID and the run ID.

Example

DSStopJob.sh --job-id=<id of job> --run-id=<id of job run>

DSTranslateCode.sh

Translate a numeric status code to its status message string.

Example

DSTranslateCode.sh <numeric_code>

DSWaitForJob.sh

Wait for one or more job runs to complete, with a timeout.

Example

DSWaitForJob.sh "${jobid1},${jobid2}" "${runid1},${runid2}" "600"
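The example passes the job IDs and run IDs as comma-separated lists. When the IDs are collected one at a time, a small join helper avoids building those strings by hand. This sketch assumes a hypothetical helper name, `join_with_commas`. The function body runs in a subshell so that the changed IFS does not leak into the rest of the script.

```shell
# Hypothetical helper: join all arguments with commas.
# The body is a subshell, so the IFS change stays local to this function.
join_with_commas() (
  IFS=','
  printf '%s\n' "$*"
)

# Usage (the ID variables are placeholders, as in the example above):
# job_ids=$(join_with_commas "${jobid1}" "${jobid2}")
# run_ids=$(join_with_commas "${runid1}" "${runid2}")
# DSWaitForJob.sh "${job_ids}" "${run_ids}" "600"
```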

UtilityRunJob.sh

Run a DataStage job with run options.

Example

UtilityRunJob.sh <param1=value1|param2=value2>

UtilityAbortToLog.sh

Log an error message and exit the execution with return value 1.

Example

UtilityAbortToLog.sh "error happened"

UtilityMessageToLog.sh

Log an informational message and exit the execution with return value 0.

Example

UtilityMessageToLog.sh "info message"

UtilityWarningToLog.sh

Log a warning message and exit the execution with return value 0.

Example

UtilityWarningToLog.sh "warning message"
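The logging utilities are typically called from a guard that checks a precondition. The following is a minimal sketch of such a guard, assuming a hypothetical helper named `require_value` and a hypothetical `ABORT_CMD` override variable that lets you swap UtilityAbortToLog.sh for another command during a dry run.

```shell
# Hypothetical guard: log an error through UtilityAbortToLog.sh when a required value is empty.
# ABORT_CMD (assumption) overrides the command, for example ABORT_CMD=echo for a dry run.
require_value() {
  name="$1"
  value="$2"
  if [ -z "$value" ]; then
    "${ABORT_CMD:-UtilityAbortToLog.sh}" "required value '${name}' is empty"
    return 1
  fi
  return 0
}

# Usage: require_value "HOST_IP" "${HOST_IP:-}" || exit 1
```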

Installing packages with Bash

Download the required (psql) packages on any Linux machine outside the runtime pod, then copy the libraries into the volume mounted by PXRuntime.

Note:

Pipelines 4.8.3 or later uses the RH UBI 9 image, while Pipelines 4.8.2 or earlier uses RH UBI 8. You need to extract the psql package from its Red Hat Package Manager (RPM) files.

  1. Download PostgreSQL RPM packages on a separate machine:
    yum install --downloadonly --downloaddir=/tmp/psql postgresql
    
  2. Extract and reconstruct RPM packages:
    cd /tmp/psql
    mkdir postgresql
    cd postgresql
    # rpm2cpio reads one package at a time, so loop over every downloaded RPM
    for pkg in ../*.rpm; do
      rpm2cpio "$pkg" | cpio -idmv
    done
    
  3. Create tarball:
    cd /tmp/psql
    tar -zcvf psql.tar.gz ./postgresql
    
  4. Copy to storage volume:
    # Copy files to px-runtime pod
    oc cp psql.tar.gz <px-runtime-pod>:/px-storage/
    
  5. Extract in runtime pod:
    # Remote shell into px-runtime pod
    oc -n <cp4d-namespace> rsh <px-runtime-pod>
    cd /px-storage
    tar -xzf psql.tar.gz
    

For more information, see: Installing third-party libraries and creating custom images in DataStage.
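Before you copy the archive to the pod, it can be worth confirming that the extraction step actually produced a psql binary. The sketch below assumes a hypothetical helper named `tarball_has_psql`. It only lists the archive contents and looks for an entry that ends in usr/bin/psql, which matches the PATH entry used later on this page.

```shell
# Hypothetical check: succeed only if the archive lists a psql binary under usr/bin.
tarball_has_psql() {
  tar -tzf "$1" | grep 'usr/bin/psql$' >/dev/null
}

# Usage on the machine where the tarball was created:
# tarball_has_psql /tmp/psql/psql.tar.gz || echo "psql is missing from the archive" >&2
```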

Environment configuration

  1. Create extra_config.sh script:
    # Create /px-storage/extra_config.sh with:
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/px-storage/postgresql/usr/lib64:/px-storage/postgresql/usr/lib:/px-storage/postgresql/usr/share
    export PATH=$PATH:/px-storage/postgresql/usr/bin
    
  2. Apply configuration to deployment:
    oc set env deployment/<instance-name>-ibm-datastage-px-runtime DS_EXTRA_CONFIG_SH=/px-storage/extra_config.sh
    oc set env sts/<instance-name>-ibm-datastage-px-compute DS_EXTRA_CONFIG_SH=/px-storage/extra_config.sh
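The configuration file from step 1 can also be generated instead of typed by hand, which keeps the install prefix in one place. This is a sketch only: the helper name `write_extra_config` is hypothetical, and it writes the same two export lines shown above for whatever prefix you pass in.

```shell
# Hypothetical helper: write the two export lines for a given install prefix.
# $1 = install prefix, for example /px-storage/postgresql
# $2 = file to write, for example /px-storage/extra_config.sh
write_extra_config() {
  prefix="$1"
  target="$2"
  # \$ keeps the variable reference literal in the generated file.
  cat > "$target" <<EOF
export LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:${prefix}/usr/lib64:${prefix}/usr/lib:${prefix}/usr/share
export PATH=\$PATH:${prefix}/usr/bin
EOF
}

# Usage: write_extra_config /px-storage/postgresql /px-storage/extra_config.sh
```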
    

Testing and verification

  1. Test in a Run Bash script node:
    # Verify installation
    psql --version
    pg_dump --version
    
    # Test database connection
    psql -U ${USER} -h ${HOST_IP} -p ${PORT} -d ${DB} -c "\l"
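The version checks above stop at the first missing command. A small loop can instead report every missing tool in one pass. The sketch below assumes a hypothetical helper named `check_tools`. It relies only on the standard `command -v` lookup and does not connect to any database.

```shell
# Hypothetical helper: report every named command that is missing from PATH.
# Returns 0 only when all named commands are found.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "not found on PATH: ${tool}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: check_tools psql pg_dump || exit 1
```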
    

Using API samples

See samples that use the Pipelines API for cases such as:

  • List all pipelines in a project.
  • Find a specific pipeline by ID.
  • Delete a pipeline and its pipeline versions.
  • List all pipeline versions of a given pipeline.
  • Get a pipeline version by pipeline version ID.
  • Delete a pipeline version by pipeline version ID.
  • Upload a pipeline file and create a new pipeline.
  • Upload a pipeline file and create a new pipeline version.
  • Commit a pre-existing volatile default version to the finished state.
  • Create a new pipeline from an existing pipeline.