Installing on IBM Cloud Private

WML Accelerator can be deployed in Docker containers in a Kubernetes-managed cluster of IBM® Power® Systems servers with GPUs. To deploy the containers, use IBM Cloud Private version 3.1 or later to install WML Accelerator from an archive that you download from IBM Passport Advantage after you procure the necessary licenses.

If you instead want to deploy WML Accelerator directly to a cluster of IBM Power Systems servers with GPUs, without IBM Cloud Private, use the Installing WML Accelerator stand-alone process.

  • Do not install Anaconda Distribution in the Docker containers that are part of a deployment on IBM Cloud Private, because it is already installed as part of PowerAI.
  • Do not change the Spark instance group configuration from the IBM Spectrum Conductor user interface; for example, do not create consumers or resource groups, change plan configurations, add packages from a repository, or change the Jupyter notebook configuration. These settings are preconfigured through Helm charts when the product is deployed on IBM Cloud Private.
  • The Notebook user interface is not supported in ingress proxy mode.
  • The IBM Spectrum Conductor user interface is not supported in ingress proxy mode if authentication and authorization are enabled for the Spark instance group.

To install WML Accelerator with IBM Cloud Private, complete the following steps:

  1. Install Docker version 1.13.1 or later, published by Red Hat Enterprise Linux (RHEL). Configure the storage driver for the Docker engine to be devicemapper in direct-lvm mode. For more information, see Use the Device Mapper storage driver.
  2. Prepare GPU worker nodes for deployments. See the Configuring a GPU worker node topic.
  3. Install IBM Cloud Private. For more information, see the Installing IBM Cloud Private topic.
  4. Disable NFS version 4 on all the nodes by editing the /etc/sysconfig/nfs file and adding this line: RPCNFSDARGS="--no-nfs-version 4". Restart the NFS service.

    To verify that NFS version 3 is active on all the nodes, run this command on each node: rpcinfo -u localhost nfs.

  5. Download the WML Accelerator PPA file from IBM Passport Advantage.
  6. To make the WML Accelerator Docker image available in IBM Cloud Private, load the archive with this command: cloudctl catalog load-archive --archive TGZ_ARCHIVE [--registry REGISTRY] [--repo HELM_REPO_NAME].
  7. The WML Accelerator Docker image is now visible in the IBM Cloud Private administration console. Go to Menu > Manage > Images and change the scope of the image to global.
  8. In the IBM Cloud Private administration console, go to Menu > Manage > Helm Repositories and click Sync Repositories.
  9. The WML Accelerator Helm chart is now listed in the IBM Cloud Private administration console. Go to Catalog and search for ibm-powerai-enterprise-prod.
  10. Review the WML Accelerator Helm chart readme file carefully. It documents the prerequisites, requirements, and limitations of WML Accelerator in IBM Cloud Private.
  11. Create persistent volumes as instructed in the Helm chart readme file.

    On the NFS server, ensure that the NFS volumes have the no_root_squash flag set in /etc/exports; for example, /var/nfs *(rw,no_root_squash,no_subtree_check).

  12. Optionally create a custom pod security policy. For instructions, review the "PodSecurityPolicy Requirements" section in the WML Accelerator Helm chart readme file.
  13. Create the required secrets and service accounts as instructed in the Helm chart readme file.
  14. Click Configure and enter information for the Helm Release name and the Namespace fields. For more information, see Namespaces. The default user name is admin and the default password is admin. Optionally, specify a custom SSL certificate.

    By default, a self-signed certificate is generated; however, you can provide your own certificate that is signed by an appropriate authority. To use your own certificate, follow these steps:

    1. Ensure that you have your certificate private key and PEM encoded certificate as two separate files. These should be provided by your certificate authority; however, you can generate these yourself. For example:
      openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /tmp/tls.key -out /tmp/tls.crt -subj "/CN=test.com"
      Note: /tmp/tls.key and /tmp/tls.crt are used in the remaining examples.
    2. Create a Kubernetes secret using these files. For example:
      kubectl -n <NAMESPACE> create secret tls tls-secret --key=/tmp/tls.key --cert=/tmp/tls.crt
    3. Set cluster.tlsCertificateSecretName in your values.yaml (or in the user interface under Custom TLS Certificate) to match the name that you used for the secret (in this case, tls-secret).
      Important: The names must match exactly.
    4. Deploy the Helm chart.

    If you later want to change the certificate, update the secret (kubectl edit secret tls-secret) with the new certificate, and restart the -ingress pod.

  15. Click Install.

    By default, the proxy for accessing the cluster management console is set to IngressProxy and the base port is set to 30745.

    The IngressProxy base port number and the ASCD debug port are unique to each deployment.

    To check the deployment status, go to Menu > Workload > Helm Release or Deployment, search by Helm release name, and verify that its current status is Deployed.

    For information about accessing the WML Accelerator user interface, go to Menu > Workload > Helm Releases, search for the Helm release, and click it for details.
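
The NFS settings from steps 4 and 11 can be checked before you click Install. The following is a minimal sketch, not part of the product: the function names nfs_v4_disabled and export_has_no_root_squash are hypothetical, and the default file paths are the ones named in the steps above.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with WML Accelerator or IBM Cloud Private).

# Step 4: succeeds if the sysconfig file contains an RPCNFSDARGS line that
# passes "--no-nfs-version 4". Defaults to /etc/sysconfig/nfs.
nfs_v4_disabled() {
  grep -Eq '^[[:space:]]*RPCNFSDARGS=.*--no-nfs-version[[:space:]]+4' \
    "${1:-/etc/sysconfig/nfs}"
}

# Step 11: succeeds if the exports file has an entry for the given path
# that includes the no_root_squash option. Defaults to /etc/exports.
#   $1 = exported path, for example /var/nfs
#   $2 = exports file (optional)
export_has_no_root_squash() {
  awk -v path="$1" \
    '$1 == path && /no_root_squash/ { found = 1 } END { exit !found }' \
    "${2:-/etc/exports}"
}
```

For example, on the NFS server, export_has_no_root_squash /var/nfs returns success only if the /var/nfs entry in /etc/exports carries the flag. These checks read the configuration files only; they do not confirm that the NFS service was restarted.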

To access the WML Accelerator user interface, complete these postinstallation configuration steps:

  1. Configure your client DNS server to resolve the deployment name to any public IP of the Kubernetes cluster. For example, add the host mapping to the /etc/hosts file on a UNIX client host, or to its counterpart on a Windows client host, replacing x.x.x.x with any public IP of the Kubernetes cluster:
    $ cat /etc/hosts
    x.x.x.x deployment name
    For example, for a Helm release named powerai-enterprise with the base port left at its default of 30745 (in step 14), the hosts file has an entry similar to: x.x.x.x powerai-enterprise-paiemaster
  2. In a browser, access the cluster at the URL https://deployment name:base-port/platform, where deployment name matches the Helm release deployment name and base-port matches the base port that is provided during the deployment.

    For example, for a Helm release named powerai-enterprise with the default base port of 30745 (in step 14), the URL is https://powerai-enterprise-paiemaster:30745/platform

  3. If you are using the default self-signed certificate, follow step 3 onward in Locating the cluster management console.
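
The hosts entry and console URL in the steps above follow one pattern, which can be sketched as a few shell functions. This is an illustration only: the function names are hypothetical, and the -paiemaster suffix is assumed from the powerai-enterprise example rather than confirmed as a general rule, so check the deployment name shown in the Helm release details.

```shell
#!/bin/sh
# Hypothetical helpers; the "-paiemaster" suffix is an assumption taken from
# the powerai-enterprise example in this topic.

# Prints the deployment name for a Helm release name.
deployment_name() {
  printf '%s-paiemaster\n' "$1"
}

# Prints the cluster management console URL.
#   $1 = Helm release name
#   $2 = base port (optional, defaults to 30745)
console_url() {
  printf 'https://%s:%s/platform\n' "$(deployment_name "$1")" "${2:-30745}"
}

# Prints a line suitable for the client hosts file.
#   $1 = any public IP of the Kubernetes cluster
#   $2 = Helm release name
hosts_entry() {
  printf '%s %s\n' "$1" "$(deployment_name "$2")"
}
```

With these definitions, console_url powerai-enterprise prints https://powerai-enterprise-paiemaster:30745/platform, and hosts_entry x.x.x.x powerai-enterprise prints the matching hosts line once x.x.x.x is replaced with a public IP of the cluster.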