Installing DataStage
A project administrator can install DataStage on IBM® Cloud Pak for Data.
The installation process is the same for both DataStage® Enterprise and DataStage Enterprise Plus. The edition that is installed is determined by the catalog source that you created.
- Permissions you need for this task
- You must be an administrator of the OpenShift® project (Kubernetes namespace) where you will deploy DataStage.
- Information you need to complete this task
- DataStage needs only the restricted security context constraint (SCC).
- DataStage must be installed in the same project as Cloud Pak for Data.
- DataStage requires the Cloud Pak for Data common core services. If the common core services are not installed in the project where you plan to install DataStage, they are installed automatically when you install DataStage, which increases the time the installation takes to complete.
- DataStage uses the following storage classes. If you don't use these storage classes on your cluster, ensure that you have a storage class with an equivalent definition:
  - OpenShift Container Storage: ocs-storagecluster-cephfs
  - IBM Spectrum®: ibm-spectrum-scale-sc
  - NFS: managed-nfs-storage
  - Portworx: portworx-shared-gp3
  - IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
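To see which of the storage classes listed above exist on your cluster, a small helper along the following lines might be useful. This is a minimal sketch, not part of the product: the function names has_storage_class and list_available_storage_classes are hypothetical, and it assumes you are already logged in with oc and allowed to read storage classes.

```shell
# Sketch only: helper names are illustrative, not part of DataStage or oc.

# Succeeds when the named storage class exists on the cluster.
has_storage_class() {
  # `oc get storageclass NAME` exits non-zero when the class is not found.
  oc get storageclass "$1" -o name >/dev/null 2>&1
}

# Prints each storage class from the list above that exists on the cluster.
list_available_storage_classes() (
  for sc in ocs-storagecluster-cephfs ibm-spectrum-scale-sc \
            managed-nfs-storage portworx-shared-gp3 \
            ibmc-file-gold-gid ibm-file-custom-gold-gid; do
    if has_storage_class "$sc"; then
      echo "$sc"
    fi
  done
)

# Example usage (requires a logged-in oc session):
#   list_available_storage_classes
```

If the helper prints nothing, none of the listed classes exist, and you need a storage class with an equivalent definition.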
Before you begin
Ensure that the cluster meets the minimum requirements for installing DataStage. For details, see System requirements.
Additionally, ensure that a cluster administrator completed the required Pre-installation tasks for your environment. Specifically, verify that a cluster administrator completed the following tasks:
- Cloud Pak for Data is installed. For details, see Installing Cloud Pak for Data.
- For environments that use a private container registry, such as air-gapped environments, the DataStage software images are mirrored to the private container registry. For details, see Mirroring images to your container registry.
- The cluster is configured to pull the DataStage software images. For details, see Configuring your cluster to pull images.
- The DataStage catalog source exists. For details, see Creating catalog sources.
- The DataStage operator subscription exists. For details, see Creating operator subscriptions.
If these tasks are not complete, the DataStage installation will fail.
Procedure
Complete the following tasks to install DataStage:
Installing the service
To install DataStage:
- Log in to Red Hat® OpenShift Container Platform as a user with sufficient permissions to complete the task:
oc login OpenShift_URL:port
- Create a DataStage custom resource to install DataStage. Follow the appropriate guidance for your environment.
  Important: By creating a DataStage custom resource with spec.license.accept: true, you are accepting the license terms for DataStage. You can find links to the relevant licenses in IBM Cloud Pak for Data License Information.
cat <<EOF | oc apply -f -
apiVersion: ds.cpd.ibm.com/v1alpha1
kind: DataStage
metadata:
  name: datastage     # This is the recommended name, but you can change it
  namespace: project-name     # Replace with the project where you will install DataStage
spec:
  license:
    accept: true
    license: Enterprise | Standard     # Specify the license you purchased
  version: 4.0.9
  storageClass: storage-class-name     # See the guidance in "Information you need to complete this task"
EOF
When you create the custom resource, the DataStage operator installs DataStage.
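If you prefer not to edit the placeholders by hand, the custom resource from the previous step could be generated from arguments instead. The following is a minimal sketch under stated assumptions: render_datastage_cr is a hypothetical helper, not an IBM-provided tool, and it hard-codes the name datastage and version 4.0.9 from the example above. Applying its output sets spec.license.accept: true, so it accepts the license terms in the same way as the original command.

```shell
# Sketch only: render_datastage_cr is an illustrative helper name.
# Arguments: $1 = project, $2 = storage class, $3 = license (Enterprise or Standard)
render_datastage_cr() {
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "usage: render_datastage_cr PROJECT STORAGE_CLASS LICENSE" >&2
    return 1
  fi
  case "$3" in
    Enterprise|Standard) ;;
    *) echo "license must be Enterprise or Standard" >&2; return 1 ;;
  esac
  # Print the manifest to stdout; nothing is sent to the cluster here.
  cat <<EOF
apiVersion: ds.cpd.ibm.com/v1alpha1
kind: DataStage
metadata:
  name: datastage
  namespace: $1
spec:
  license:
    accept: true
    license: $3
  version: 4.0.9
  storageClass: $2
EOF
}

# Example usage (requires a logged-in oc session):
#   render_datastage_cr my-project managed-nfs-storage Enterprise | oc apply -f -
```

Printing the manifest before piping it to oc apply lets you review the values first. The project and storage class in the usage comment are example values only.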
Verifying the installation
When you create the custom resource, the DataStage operator processes the contents of the custom resource and starts the microservices that make up DataStage. (The DataStage microservice is defined by the datastage custom resource.) DataStage is installed when the DataStage status is Completed.
To check the status of the installation:
- Change to the project where you installed DataStage:
oc project project-name
- Get the status of DataStage (datastage):
oc get DataStage datastage -o jsonpath='{.status.dsStatus} {"\n"}'
  DataStage is ready when the command returns Completed.
- Get the status of the PXRuntime instance (ds-px-default):
oc get pxruntime ds-px-default -o jsonpath='{.status.dsStatus} {"\n"}'
  The PXRuntime instance is ready when the command returns Completed.
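Because the installation can take a while, you might want to poll the status instead of rerunning the commands by hand. The following sketch wraps the same oc get query in a retry loop. The function name wait_for_completed and the default retry count and interval are assumptions chosen for illustration, and the function reads the resource from the current project.

```shell
# Sketch only: wait_for_completed is an illustrative helper name.
# Arguments: $1 = resource kind, $2 = resource name,
#            $3 = maximum attempts (default 60), $4 = seconds between attempts (default 30)
wait_for_completed() (
  kind=$1
  name=$2
  attempts=${3:-60}
  interval=${4:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Same field as the documented command, without the trailing space and newline.
    status=$(oc get "$kind" "$name" -o jsonpath='{.status.dsStatus}' 2>/dev/null)
    if [ "$status" = "Completed" ]; then
      echo "$kind/$name: Completed"
      exit 0
    fi
    echo "$kind/$name: ${status:-unknown}; retrying in ${interval}s" >&2
    i=$((i + 1))
    sleep "$interval"
  done
  # Gave up before the status reached Completed.
  exit 1
)

# Example usage (run after `oc project project-name`):
#   wait_for_completed DataStage datastage && wait_for_completed pxruntime ds-px-default
```

The function returns a non-zero exit code if the status never reaches Completed, so it can gate later steps in a script.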
Choosing a service upgrade plan
You can choose how DataStage is upgraded when you install a newer version of the DataStage operator on the cluster.
- Automatic upgrade (recommended)
- If you want DataStage to be automatically upgraded when you install a newer version of the DataStage operator on the cluster, remove the version entry from the DataStage custom resource.
  To remove the version entry, run the following command. You must update the command with the appropriate project name before you run the command.
oc patch DataStage datastage \
  --namespace project-name \
  --type=json \
  --patch '[{ "op": "remove", "path": "/spec/version" }]'
- Manual upgrade
- If you want to manually upgrade DataStage after you install a newer version of the DataStage operator, you can pin the installation at a specific version in the DataStage custom resource.
  By default, when you create the DataStage custom resource, it includes the version entry, so no additional action is required.
  If you removed the version entry from the DataStage custom resource, run the following command to pin the installation at Version 4.0.9. You must update the command with the appropriate project name before you run the command.
oc patch DataStage datastage \
  --namespace project-name \
  --type=merge \
  --patch '{"spec": {"version":"4.0.9"}}'
  For a list of operand versions supported by the DataStage operator, see Operator and operand versions.
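To confirm which upgrade plan is currently in effect, you can read spec.version back from the custom resource: an empty value corresponds to automatic upgrade, and any other value means the installation is pinned. The helper below is a minimal sketch. The name datastage_upgrade_plan is hypothetical, and it assumes the custom resource uses the recommended name datastage.

```shell
# Sketch only: datastage_upgrade_plan is an illustrative helper name.
# Argument: $1 = project where DataStage is installed
datastage_upgrade_plan() (
  project=$1
  # An absent /spec/version field yields an empty string with jsonpath output.
  version=$(oc get DataStage datastage --namespace "$project" \
    -o jsonpath='{.spec.version}') || exit 1
  if [ -n "$version" ]; then
    echo "manual (pinned at $version)"
  else
    echo "automatic"
  fi
)

# Example usage (requires a logged-in oc session):
#   datastage_upgrade_plan project-name
```

Running the helper after either oc patch command shows whether the change took effect.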