Installing the Watson Machine Learning Accelerator service

A project administrator can install Watson Machine Learning Accelerator on IBM® Cloud Pak for Data.

Permissions you need for this task: You must be an administrator of the OpenShift® project (Kubernetes namespace) where you will deploy Watson Machine Learning Accelerator.

Information you need to complete this task

Watson Machine Learning Accelerator needs only the restricted security context constraint (SCC).
Watson Machine Learning Accelerator must be installed in a project that is tethered to the project where Cloud Pak for Data is installed.
Watson Machine Learning Accelerator requires the Cloud Pak for Data common core services.
Watson Machine Learning Accelerator uses the following storage classes. If you don't use these storage classes on your cluster, ensure that you have a storage class with an equivalent definition:
- OpenShift Container Storage: ocs-storagecluster-cephfs
- IBM Spectrum®: ibm-spectrum-scale-sc
- NFS: managed-nfs-storage
- Portworx: portworx-shared-gp3
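
To see which of these storage classes is present on a cluster, a check along the following lines may help. This is a minimal sketch that assumes you are already logged in with `oc`; the function name `find_wmla_storage_class` is hypothetical and not part of the product.

```shell
# Hypothetical helper: print the first storage class from the list above
# that exists on the cluster, or return non-zero if none is found.
find_wmla_storage_class() {
  for sc in ocs-storagecluster-cephfs ibm-spectrum-scale-sc \
            managed-nfs-storage portworx-shared-gp3; do
    # "oc get storageclass NAME" exits non-zero when the class does not exist.
    if oc get storageclass "$sc" >/dev/null 2>&1; then
      echo "$sc"
      return 0
    fi
  done
  echo "No listed storage class found; use one with an equivalent definition." >&2
  return 1
}
# Example: find_wmla_storage_class
```

If the function prints nothing, the cluster needs a storage class with an equivalent definition, as noted above.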

Before you begin

Ensure that the cluster meets the minimum requirements for installing Watson Machine Learning Accelerator. For details, see System requirements.

Additionally, ensure that a cluster administrator completed the required Pre-installation tasks for your environment. Specifically, verify that a cluster administrator completed the following tasks:

Cloud Pak for Data is installed. For details, see Installing Cloud Pak for Data.
The project where you plan to install Watson Machine Learning Accelerator exists.
The prerequisite services are installed. See Prerequisite services.
The prerequisite operators are installed. See Prerequisite operators.
For environments that use a private container registry, such as air-gapped environments, the Watson Machine Learning Accelerator software images are mirrored to the private container registry.
For information about mirroring the software images to the private container registry, see Mirroring images to your container registry. After you set up the private container registry, you can set additional container registry rules for Watson Machine Learning Accelerator. For details, see Container mirror registry rules for Watson Machine Learning Accelerator.
The cluster is configured to pull the Watson Machine Learning Accelerator software images. For details, see Configuring your cluster to pull images.
The Watson Machine Learning Accelerator catalog source exists. For details, see Creating catalog sources.
The Watson Machine Learning Accelerator operator subscription exists. For details, see Creating operator subscriptions.
The node settings are adjusted for Watson Machine Learning Accelerator. For details, see Changing required node settings.

If these tasks are not complete, the Watson Machine Learning Accelerator installation will fail.
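
One of the items above, that the target project exists, can be checked from the command line. The following is a minimal sketch only; the function name `check_projects_exist` is hypothetical, and the project names in the example are the sample names used later in this topic.

```shell
# Hypothetical preflight: confirm that each named project exists and is
# visible to the current user. Returns non-zero if any project is missing.
check_projects_exist() {
  missing=0
  for ns in "$@"; do
    if oc get project "$ns" >/dev/null 2>&1; then
      echo "ok: $ns"
    else
      echo "missing: $ns"
      missing=1
    fi
  done
  return "$missing"
}
# Example: check_projects_exist cpd-instance wmla-ns
```

This does not verify the catalog source, operator subscription, or node settings; those still need to be confirmed with the cluster administrator.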

Prerequisite services

Before you install Watson Machine Learning Accelerator, ensure that the following service is installed and running:

The scheduling service. See Installing the scheduling service.

Prerequisite operators

After you install the required services, you must install the GPU Operator:

x86-64:

For a cluster connected to the internet, complete the Installing the NVIDIA GPU Operator on OpenShift instructions.
For an air-gapped cluster, complete the Installing the NVIDIA GPU Operator instructions.

POWER:

For POWER hardware with Red Hat OpenShift 4.8, see GPU Operator for POWER.
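
After the GPU Operator is installed, it can be useful to confirm that nodes advertise GPUs. The sketch below assumes that GPUs are exposed under the `nvidia.com/gpu` resource name and that your user is allowed to list nodes; both are assumptions, and the function names are hypothetical.

```shell
# Hypothetical check: list each node with its allocatable NVIDIA GPU count.
# Nodes without the nvidia.com/gpu resource show <none> in the GPU column.
list_gpu_nodes() {
  oc get nodes \
    -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
}

# Count the nodes that report at least one allocatable GPU.
count_gpu_nodes() {
  list_gpu_nodes \
    | awk 'NR > 1 && $2 ~ /^[0-9]+$/ && $2 > 0 { n++ } END { print n + 0 }'
}
# Example: count_gpu_nodes
```

A count of 0 suggests that the GPU Operator is not yet ready or that no GPU nodes are present.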

Procedure

Complete the following tasks to install Watson Machine Learning Accelerator:

Creating the Watson Machine Learning Accelerator add-on custom resource
Setting up a tethered project
Installing the service
Verifying the installation
What to do next

Creating the Watson Machine Learning Accelerator add-on custom resource

Log in to Red Hat® OpenShift Container Platform as a user with sufficient permissions to complete the task:
```
oc login OpenShift_URL:port
```
Create a Watson Machine Learning Accelerator add-on (Wmla-add-on) custom resource in the Cloud Pak for Data namespace:
```
cat << EOF | oc apply -f -
apiVersion: spectrumcomputing.ibm.com/v1
kind: Wmla-add-on
metadata:
  name: wmla
  namespace: cpd-instance    #Specify the name of your Cloud Pak for Data namespace
spec:
  version: "2.3.9"
  wmlaNamespace: wmla-ns      #Specify the name of your Watson Machine Learning Accelerator namespace
EOF
```
When you create the Wmla-add-on custom resource, the following configmap objects are created in the Cloud Pak for Data namespace:
- wml-accelerator
- wml-accelerator-connection-info-extension-xxxid
- wmla-add-on-instance-name-wml-accelerator-instance-cm
where wmla-add-on-instance-name is the Watson Machine Learning Accelerator add-on instance name.
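
To confirm that these configmap objects were created, you can list them by name. This is a minimal sketch; the function name `list_wmla_configmaps` is hypothetical, and `cpd-instance` is the sample namespace from the example above.

```shell
# Hypothetical check: list configmaps in the Cloud Pak for Data namespace
# whose names contain "wml-accelerator". Returns non-zero if none match.
list_wmla_configmaps() {
  cpd_ns="${1:-cpd-instance}"   # sample namespace; pass your own as $1
  oc get configmap -n "$cpd_ns" -o name | grep 'wml-accelerator'
}
# Example: list_wmla_configmaps cpd-instance
```

All three names listed above should appear in the output.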

Setting up a tethered project

You must update the IBM Cloud Pak for Data NamespaceScope Operator and the Cloud Pak for Data control plane (ZenService) to watch the project where you plan to install Watson Machine Learning Accelerator.

Permissions you need for this task

You must be either:

A cluster administrator
An administrator of the following projects:
- The project where the IBM Cloud Pak® for Data platform operator is installed, either ibm-common-services or cpd-operators
- The project where Cloud Pak for Data is installed
- The project where you plan to install Watson Machine Learning Accelerator

When you need to complete this task

You must complete this task if you plan to install Watson Machine Learning Accelerator in a tethered project, but the project is not yet tethered to the project where Cloud Pak for Data is installed.

To tether the project:

Log in to Red Hat OpenShift Container Platform as a user with sufficient permissions to complete the task:
```
oc login OpenShift_URL:port
```

Enable the IBM Cloud Pak for Data platform operator and the IBM Cloud Pak foundational services operator to watch the project where you will install Watson Machine Learning Accelerator.

Update the IBM NamespaceScope Operator in the IBM Cloud Pak for Data platform operator project to watch the project where you plan to install Watson Machine Learning Accelerator.

Edit the namespaceMembers list to add the project where you plan to install Watson Machine Learning Accelerator.

```
cat <<EOF | oc apply -f -
apiVersion: operator.ibm.com/v1
kind: NamespaceScope
metadata:
  name: cpd-operators
  namespace: cpd-operators        # (Default) Replace with the Cloud Pak for Data platform operator project name
spec:
  namespaceMembers:
  - cpd-operators                 # (Default) Replace with the Cloud Pak for Data platform operator project name
  - cpd-instance                  # Replace with the project where Cloud Pak for Data is installed
  - wmla-ns                       # Replace with the project where Watson Machine Learning Accelerator is installed
EOF
```

The above change adds the Watson Machine Learning Accelerator namespace wmla-ns to the namespace-scope configmap object in the cpd-operators namespace.
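
To verify that the change took effect, you can search the namespace-scope configmap for the project name. The following is a minimal sketch; the function name `namespace_is_watched` is hypothetical, and the whole-word text match is a simple heuristic rather than a parse of the configmap.

```shell
# Hypothetical check: succeed if the namespace-scope configmap in the
# operator project mentions the target project as a whole word.
namespace_is_watched() {
  target_ns="$1"                      # for example, wmla-ns
  operator_ns="${2:-cpd-operators}"   # sample operator project; pass your own as $2
  oc get configmap namespace-scope -n "$operator_ns" -o yaml \
    | grep -qw -- "$target_ns"
}
# Example: namespace_is_watched wmla-ns && echo "wmla-ns is watched"
```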

Modify the ZenService custom resource to add a tetheredNamespaces entry that lists the project that you want to tether to the project where Cloud Pak for Data is installed:
1. Run the following command to get the name of the ZenService:
```
oc get ZenService -n cpd-instance
```
  By default, the custom resource name is lite-cr.
2. Run the following command to edit the ZenService custom resource:
```
oc edit ZenService custom-resource-name -n cpd-instance
```
3. Add the tetheredNamespaces entry to the custom resource:
```
apiVersion: zen.cpd.ibm.com/v1
kind: ZenService
metadata:
  name: lite-cr
  namespace: cpd-instance                             # The project where Cloud Pak for Data is installed
spec:
  csNamespace: ibm-common-services
  version: 4.3.1
  storageClass: RWX-storage-class                     # The RWX storage class you specified during installation
  zenCoreMetaDbStorageClass: RWO-storage-class        # The RWO storage class you specified during installation
  tetheredNamespaces:                                 # Add this entry
  - "wmla-ns"                         # Add the name of the Watson Machine Learning Accelerator project to tether
  cloudpakfordata: true 
```
4. Save your changes to the ZenService custom resource. For example, if you are using vi, press Esc, type :wq, and press Enter.
When you save the changes to the ZenService custom resource, the following actions occur:
- The required secret and configmap objects are copied from the Cloud Pak for Data project to the tethered project. (Only objects that include the icpdata_tether_resource: true label are copied.)
- The cpd-admin-role and cpd-viewer-role roles are copied to the tethered project.
- The required rolebinding objects are created. The rolebinding objects grant the zen-admin-sa and zen-editor-sa service accounts in the Cloud Pak for Data project access to the cpd-admin-role in the tethered project.
Adding the Watson Machine Learning Accelerator namespace as a tethered namespace enables Cloud Pak for Data functions such as audit logging, monitoring, and diagnostics.
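
If you prefer a non-interactive alternative to `oc edit`, the same entry could be set with `oc patch`. This is a sketch under stated assumptions, not the documented procedure: the function names are hypothetical, and `cpd-instance` and `lite-cr` are the sample and default names used above.

```shell
# Build a JSON merge patch that sets spec.tetheredNamespaces to one project.
# Caution: a merge patch replaces the whole list, so this sketch is only
# suitable when no other projects are already tethered.
tether_patch() {
  printf '{"spec":{"tetheredNamespaces":["%s"]}}' "$1"
}

# Hypothetical wrapper: apply the patch to the ZenService custom resource.
tether_project() {
  wmla_ns="$1"                  # project to tether, for example wmla-ns
  cpd_ns="${2:-cpd-instance}"   # sample Cloud Pak for Data project
  zen_cr="${3:-lite-cr}"        # default custom resource name
  oc patch ZenService "$zen_cr" -n "$cpd_ns" --type=merge \
    -p "$(tether_patch "$wmla_ns")"
}
# Example: tether_project wmla-ns cpd-instance lite-cr
```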

Installing the service

To install Watson Machine Learning Accelerator:

Log in to Red Hat OpenShift Container Platform as a user with sufficient permissions to complete the task:
```
oc login OpenShift_URL:port
```

Create a Wmla custom resource to install Watson Machine Learning Accelerator. Make sure to specify the storage class name.

The recommended storage class names are described in Setting up shared persistent storage.

Important:

By creating a Wmla custom resource with spec.license.accept: true, you are accepting the license terms for Watson Machine Learning Accelerator. You can find links to the relevant licenses in IBM Cloud Pak for Data License Information.
Use a storage class with attributes similar to the storage class described in the Service persistent storage requirements section of Storage requirements.

```
cat <<EOF | oc apply -f -
apiVersion: spectrumcomputing.ibm.com/v1
kind: Wmla
metadata:
  name: wmla # IMPORTANT: The instance name (for example, wmla) and the Cloud Pak for Data namespace must match those used by the add-on (Wmla-add-on) custom resource that you created in the previous steps
  namespace: wmla-ns
spec:
  license: # Specify the license you purchased
    accept: true
    license: Standard # Required. Specify Standard or Enterprise
  version: 2.3.9
  acceptRollback: true
  cpdServiceInstance: "" # Specify the Wmla-add-on custom resource name here only if the Wmla custom resource name is different from the Wmla-add-on custom resource name
  cpdNamespace: cpd-instance # Specify the namespace where Cloud Pak for Data is installed
  podSchedulerNamespace: ibm-cpd-scheduling # Specify the namespace where the scheduling service is installed
  storageClass: storage-class-name # Specify the storage class name: portworx-shared-gp3, managed-nfs-storage, or ocs-storagecluster-cephfs. See the guidance in "Information you need to complete this task"
EOF
```
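
Because the license type and storage class vary between installations, you might generate the custom resource from parameters instead of editing it by hand. The following is a minimal sketch; the function name `render_wmla_cr` is hypothetical, and the namespaces and version are the sample values from the example above. Note that the rendered resource sets spec.license.accept to true, which means accepting the license terms.

```shell
# Hypothetical helper: print a Wmla custom resource for the given license
# type and storage class, rejecting values other than those described above.
render_wmla_cr() {
  license="$1"
  storage_class="$2"
  case "$license" in
    Standard|Enterprise) ;;
    *) echo "license must be Standard or Enterprise" >&2; return 1 ;;
  esac
  if [ -z "$storage_class" ]; then
    echo "storage class name is required" >&2
    return 1
  fi
  cat <<EOF
apiVersion: spectrumcomputing.ibm.com/v1
kind: Wmla
metadata:
  name: wmla
  namespace: wmla-ns
spec:
  license:
    accept: true
    license: ${license}
  version: 2.3.9
  acceptRollback: true
  cpdNamespace: cpd-instance
  podSchedulerNamespace: ibm-cpd-scheduling
  storageClass: ${storage_class}
EOF
}
# Example: render_wmla_cr Enterprise managed-nfs-storage | oc apply -f -
```

Reviewing the rendered output before you pipe it to `oc apply` gives you a chance to catch a wrong namespace or storage class.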

Additional installation options

You can also configure the following installation options:

Configure the serviceReplicas setting in the service CR by setting the replica value to 1 or greater. This controls the number of pods used by core services in Watson Machine Learning Accelerator.
- To disable multiple service replicas, set this value to 1.
- To enable multiple service replicas, set this value to 2. Setting this value greater than 2 increases the number of replicas.
```
spec:
   serviceReplicas: 1
```
Configure the scaleConfig setting in the service CR. By default, the service uses a small deployment. You can set this value to small, medium, or large. For details, see Scaling services.
```
spec:
   scaleConfig: small
```
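
The two settings can also be combined into one patch payload with basic validation. This is a sketch only: the function name `scale_patch` is hypothetical, and whether the operator applies these settings when they are changed after installation is an assumption to verify. The same values can instead be added under spec before you create the custom resource.

```shell
# Build a JSON merge patch for the two optional settings, validating them
# against the values described above.
scale_patch() {
  replicas="$1"   # integer, 1 or greater
  size="$2"       # small, medium, or large
  case "$replicas" in
    ''|*[!0-9]*|0*) echo "serviceReplicas must be an integer of 1 or greater" >&2; return 1 ;;
  esac
  case "$size" in
    small|medium|large) ;;
    *) echo "scaleConfig must be small, medium, or large" >&2; return 1 ;;
  esac
  printf '{"spec":{"serviceReplicas":%s,"scaleConfig":"%s"}}' "$replicas" "$size"
}
# Example (assumes the Wmla resource wmla in project wmla-ns):
#   oc patch Wmla wmla -n wmla-ns --type=merge -p "$(scale_patch 2 medium)"
```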

Verifying the installation

When you create the custom resource, the Watson Machine Learning Accelerator operator processes the contents of the custom resource and starts up the microservices that comprise Watson Machine Learning Accelerator, including Wmla. (The Wmla microservice is defined by the wmla custom resource.) Watson Machine Learning Accelerator is installed when the Wmla status is Completed.

To check the status of the installation:

Change to the project where you installed Watson Machine Learning Accelerator:
```
oc project wmla-ns
```
Get the status of Watson Machine Learning Accelerator (wmla):
```
oc get Wmla wmla -o jsonpath='{.status.wmlaStatus} {"\n"}'
```
Watson Machine Learning Accelerator is ready when the command returns Completed.
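
Because installation can take a while, you may prefer to poll the status instead of rerunning the command by hand. The following is a minimal sketch; the function name `wait_for_wmla` and the default timeout and polling interval are assumptions, not documented values.

```shell
# Hypothetical helper: poll the Wmla status until it reports Completed
# or the timeout (in seconds) expires.
wait_for_wmla() {
  name="${1:-wmla}"
  ns="${2:-wmla-ns}"
  timeout="${3:-3600}"   # assumed limit; installation time varies
  interval="${4:-30}"    # seconds between checks; must be greater than 0
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    # Treat a failed query (for example, resource not yet created) as no status.
    status=$(oc get Wmla "$name" -n "$ns" \
      -o jsonpath='{.status.wmlaStatus}' 2>/dev/null) || status=""
    echo "wmlaStatus: ${status:-<none>}"
    if [ "$status" = "Completed" ]; then
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "Timed out waiting for $name in $ns" >&2
  return 1
}
# Example: wait_for_wmla wmla wmla-ns 3600 30
```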

What to do next

To connect your Watson Machine Learning Accelerator service to the Watson Machine Learning service, see Connecting Watson Machine Learning Accelerator to Watson Machine Learning.
To add users to the Watson Machine Learning Accelerator instance and give them access to the service console, see Add users to the Watson Machine Learning Accelerator instance.
To configure resource usage monitoring for Watson Machine Learning Accelerator, see Resource usage.