
Installing the DataStage Enterprise service

You can install the DataStage® Enterprise service on IBM® Cloud Pak for Data System.

Before you begin

Required role: To complete this task, you must be an administrator of the project (namespace) where you will deploy the service.

Important: Install the Cloud Pak for Data control plane before you install the service. For details, see Installing Cloud Pak for Data.

In addition, ensure that you meet the minimum requirements for installing the service. For details, see System requirements for services.

Attention: You must uninstall any existing version of DataStage Enterprise or DataStage Enterprise Plus before installation.

Before you can install the service, a cluster administrator must complete the steps in Setting up the cluster for the DataStage Enterprise Plus service.

If you are running the installation on an air-gapped cluster, ensure that a Red Hat® OpenShift® administrator has completed the steps in Preparing for air-gapped installations to download the required files for the service.

Ensure that the Mac OS or Linux machine where you will run the commands meets the appropriate requirements for your environment:

Requirements for the machine, by cluster type:

  • Whether the cluster is connected to the internet or air-gapped, the machine:
    • Can connect to the cluster.
    • Has the oc command-line interface. You can download the appropriate client tools for your operating system from Red Hat OpenShift. Ensure that the version is compatible with the version of Red Hat OpenShift on your cluster.
    • Has the Cloud Pak for Data command-line interface. See Obtaining the installation files. Use the same version of the command-line interface each time you run the commands.
  • If the cluster is connected to the internet, the machine also:
    • Is connected to the internet.
    • Has the updated repo.yaml file in the same directory as the Cloud Pak for Data command-line interface. See Obtaining the installation files.
  • If the cluster is air-gapped, the machine also:
    • Has the cpd-Operating_System-workspace directory, which contains the required files. See Preparing for air-gapped installations.
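
The file requirements above can be checked before you start. The following is a minimal sketch, not part of the product: the function name check_cpd_workstation and its arguments are hypothetical, and it checks only for the Cloud Pak for Data command-line interface and the repo.yaml file, which the internet-connected procedure expects in the same directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an IBM tool): report which files needed for an
# internet-connected installation are missing from a directory.
#   $1 = directory that should hold the CLI and repo.yaml
#   $2 = Operating_System value (linux or darwin)
# Returns 0 when both files exist, 1 otherwise.
check_cpd_workstation() {
  local dir="$1" os="$2" missing=0 f
  for f in "cpd-${os}" "repo.yaml"; do
    if [ ! -e "${dir}/${f}" ]; then
      echo "missing: ${dir}/${f}"
      missing=1
    fi
  done
  return "${missing}"
}
```

For example, check_cpd_workstation . linux checks the current directory on a Linux machine. Whether the oc client is on your PATH can be checked separately with command -v oc.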

 

Ensure that you have the following information from your Red Hat OpenShift cluster administrator:

Required information Description
OpenShift_URL:port The URL and port number to use when logging in to your Red Hat OpenShift cluster.

Ensure that you have the appropriate credentials to log in to the cluster using oc login.

Value:

Your cluster administrator should tell you whether your cluster is connected to the internet or is air-gapped.

Assembly_version

Needed for air-gapped installations only.

The version of the assembly to install.

Value:

Storage_class_name The name of the storage class to use to provision storage for the service.

For information about the types of storage that the service supports, see System requirements for services.

If your cluster is not set up to use dynamic storage provisioning, work with an IBM Support representative to determine how you can specify persistent volume claims when you install the service.

Value:

Registry_location The location to store the images in the registry server.

If you are installing the service when you are connected to the internet, ensure that you have the appropriate credentials to push images to the registry server.

You can run the following command to find the route to the registry:
oc get routes --all-namespaces

Value:

Guidance for Red Hat OpenShift registry users:
  • To determine the external route to the registry, run the appropriate command for your environment:
    • OpenShift 3.11:
      oc get route/docker-registry -n default --template='{{ .spec.host }}'

      The command returns a route similar to docker-registry-default.apps.my_cluster_address.

      Append the project name to the route. For example:
      docker-registry-default.apps.my_cluster_address/project
    • OpenShift 4.5:
      oc get route/default-route -n openshift-image-registry --template='{{ .spec.host }}'

      The command returns a route similar to default-route-openshift-image-registry.apps.my_cluster_address.

      Append the project name to the route. For example:
      default-route-openshift-image-registry.apps.my_cluster_address/project
  • When you specify a value for the Registry_location variable, ensure that you include the project name.
Registry_from_cluster The location from which pods on the cluster can pull images.

Value:

Guidance for Red Hat OpenShift registry users:
  • This is the internal name of the registry service. The default service name is:
    • OpenShift 3.11:
      docker-registry.default.svc:5000/project
    • OpenShift 4.5:
      image-registry.openshift-image-registry.svc:5000/project
  • When you specify a value for the Registry_from_cluster variable, ensure that you include the project name.
You can run the following command to get the internal name of the Red Hat OpenShift registry service:
oc registry info --internal=true
Project The project (namespace) where the IBM Cloud Pak for Data control plane is installed.

Value:

Important: If you plan to install both DataStage Enterprise and Watson™ Knowledge Catalog, install the services in the same namespace.
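
The registry guidance in the table above asks you to append the project name to both registry values. A minimal sketch of that composition follows. The helper name with_project and the project name my-project are hypothetical, and the registry host names are the OpenShift 4.5 examples from the table, so replace them with the values from your cluster administrator.

```shell
#!/usr/bin/env bash
# Placeholder values only; replace with the values for your cluster.
PROJECT="my-project"   # hypothetical project (namespace) name
REGISTRY_ROUTE="default-route-openshift-image-registry.apps.my_cluster_address"
REGISTRY_SERVICE="image-registry.openshift-image-registry.svc:5000"

# Hypothetical helper: append the project name to a registry host,
# dropping one trailing slash from the host if present.
with_project() {
  printf '%s/%s\n' "${1%/}" "$2"
}

REGISTRY_LOCATION="$(with_project "${REGISTRY_ROUTE}" "${PROJECT}")"
REGISTRY_FROM_CLUSTER="$(with_project "${REGISTRY_SERVICE}" "${PROJECT}")"

echo "Registry_location:     ${REGISTRY_LOCATION}"
echo "Registry_from_cluster: ${REGISTRY_FROM_CLUSTER}"
```

Keeping the two values in variables makes it harder to omit the project name from one of them when you type the installation command.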

Determine whether you need to create an override YAML file based on your installation environment. For details, see Installation override files for DataStage Enterprise.

About this task

If you are installing multiple services on your cluster, you must run the installations one at a time. Wait until each installation completes before you start the next one. You cannot run the installations in parallel.

Procedure

To install the service:

  1. Run the appropriate cpd command for your environment:
    Tip: For a list of all available options, enter the command: ./cpd-Operating_System --help.
    • To install the service on a cluster that can connect to the internet:
      1. Change to the directory where you placed the Cloud Pak for Data command-line interface and the repo.yaml file.
      2. Log in to your Red Hat OpenShift cluster as a project administrator:
        oc login OpenShift_URL:port
      3. Run the following command to see a preview of what will be installed when you install the service. After you review the preview, rerun the command without the --dry-run flag to install the service.
        ./cpd-Operating_System --repo ./repo.yaml \
        --assembly ds-ent \
        --namespace Project \
        --arch Processor_architecture \
        --storageclass Storage_class_name \
        --transfer-image-to Registry_location \
        --cluster-pull-prefix Registry_from_cluster \
        --ask-push-registry-credentials \
        --dry-run
        If you need to use an override YAML file, specify the --override flag with the fully qualified location of the override YAML file for the service. Add the following line to your installation command after the --storageclass flag:
        --override Override_file_path \

        If you are installing with Portworx storage, specify the default storage class and specify the --override flag with the fully qualified location of the override YAML file for the service:

        --storageclass portworx-shared-gp3 \
        --override ds-ent-pwx-x86.YAML \

        Replace the following values:

        Variable Replace with
        Operating_System For Linux, specify linux. For Mac OS, specify darwin.
        Processor_architecture Specify the processor architecture of your cluster hardware.
        Project Use the value provided by your cluster administrator.
        Storage_class_name Use the value provided by your cluster administrator.
        Registry_location Use the value provided by your cluster administrator.
        Registry_from_cluster Use the value provided by your cluster administrator.
    • To install the service on an air-gapped cluster:
      1. Change to the directory where you placed the Cloud Pak for Data command-line interface.
      2. Log in to your Red Hat OpenShift cluster as a project administrator:
        oc login OpenShift_URL:port
      3. Run the following command to see a preview of what will be installed when you install the service.
        Important: If you are using the internal Red Hat OpenShift registry:
        • Do not specify the --ask-pull-registry-credentials parameter.
        • If you are using the default self-signed certificate, specify the --insecure-skip-tls-verify flag to prevent x509 errors.
        ./cpd-Operating_System \
        --assembly ds-ent \
        --arch Processor_architecture \
        --version Assembly_version \
        --namespace Project \
        --storageclass Storage_class_name \
        --cluster-pull-prefix Registry_from_cluster \
        --ask-pull-registry-credentials \
        --load-from Image_directory_location \
        --dry-run
        If you need to use an override YAML file, specify the --override flag with the fully qualified location of the override YAML file for the service. Add the following line to your installation command after the --storageclass flag:
        --override Override_file_path \

        If you are installing with Portworx storage, specify the default storage class and specify the --override flag with the fully qualified location of the override YAML file for the service:

        --storageclass portworx-shared-gp3 \
        --override ds-ent-pwx-x86.YAML \

        Replace the following values:

        Variable Replace with
        Operating_System For Linux, specify linux. For Mac OS, specify darwin.
        Processor_architecture Specify the processor architecture of your cluster hardware.
        Assembly_version Use the value provided by your cluster administrator.
        Project Use the value provided by your cluster administrator.
        Storage_class_name Use the value provided by your cluster administrator.
        Registry_from_cluster Use the value provided by your cluster administrator.
        Image_directory_location The location of the cpd-Operating_System-workspace directory.
      4. Rerun the previous command without the --dry-run flag to install the service.
  2. To verify that the installation completed successfully, run the following command:
    ./cpd-Operating_System status \
    --assembly ds-ent \
    --namespace Project

    Replace Project with the value you used in the preceding commands.

    If the installation completed successfully, the status of the assembly and the modules in the assembly is Ready.

    If the installation failed, see Manually resuming the installation from a specific module.
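
In the air-gapped procedure, steps 3 and 4 run the same command twice, first with --dry-run and then without it. The sketch below shows one way to avoid retyping the command. It is an illustration under assumptions, not an IBM-provided script: the function name airgap_install_cmd and the shell variable names are hypothetical, the values are the placeholders used in this topic, and the function only prints the command so that you can review it before you run it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the air-gapped install command for ds-ent.
# Nothing is executed; the flags are the ones shown in the procedure.
# Usage: airgap_install_cmd <preview|install> [override_file]
airgap_install_cmd() {
  local mode="$1" override="${2:-}"
  set -- "./cpd-${OPERATING_SYSTEM}" \
    --assembly ds-ent \
    --arch "${PROCESSOR_ARCHITECTURE}" \
    --version "${ASSEMBLY_VERSION}" \
    --namespace "${PROJECT}" \
    --storageclass "${STORAGE_CLASS_NAME}"
  # The optional override flag follows --storageclass.
  if [ -n "${override}" ]; then
    set -- "$@" --override "${override}"
  fi
  set -- "$@" \
    --cluster-pull-prefix "${REGISTRY_FROM_CLUSTER}" \
    --ask-pull-registry-credentials \
    --load-from "${IMAGE_DIRECTORY_LOCATION}"
  # Step 3 (preview) adds --dry-run; step 4 (install) omits it.
  if [ "${mode}" = "preview" ]; then
    set -- "$@" --dry-run
  fi
  printf '%s\n' "$*"
}

# Placeholder values; replace each with the value for your environment.
OPERATING_SYSTEM="linux"
PROCESSOR_ARCHITECTURE="Processor_architecture"
ASSEMBLY_VERSION="Assembly_version"
PROJECT="Project"
STORAGE_CLASS_NAME="Storage_class_name"
REGISTRY_FROM_CLUSTER="Registry_from_cluster"
IMAGE_DIRECTORY_LOCATION="./cpd-linux-workspace"

airgap_install_cmd preview   # command for step 3
airgap_install_cmd install   # command for step 4
```

If you use the internal Red Hat OpenShift registry, remove --ask-pull-registry-credentials from the printed command, as the Important note in step 3 describes.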

What to do next

Complete the following tasks in order before users can access the service:

  1. Determine whether there are any patches available for your installation:
    • To check for patches on a cluster that can connect to the internet:
      Run the following command to check for patches:
      ./cpd-Operating_System --repo ./repo.yaml status \
      --namespace Project \
      --assembly ds-ent \
      --patches \
      --available-updates
    • To check for patches on an air-gapped cluster:

      See the list of available patches for DataStage Enterprise.

    If you need to apply patches to the service, follow the guidance in Applying patches.