
Installing the Watson Assistant service

An administrator can install the Watson™ Assistant service on IBM® Cloud Pak for Data.

Before you begin

Required role: To complete this task, you must be an administrator of the project (namespace) where you will deploy the service.

You must have installed the Cloud Pak for Data control plane and have a storage solution configured.

The systems that host the service must meet these additional requirements:
  • Run on Intel architecture nodes only.
  • CPUs must have 2.4 GHz or higher clock speed.
  • CPUs must support Linux SSE 4.2.
  • CPUs must support the AVX2 instruction set extension. Most CPUs released since 2013 support this extension. The service cannot function properly without AVX2 support.
The components that are necessary to process some natural languages require more resources. The following languages require that additional CPU and memory be available for the installation.
Language                                      Pods for production  Pods for development  Additional memory requirements per pod
Chinese (Simplified or Traditional or both)   2                    1                     8 GB
German                                        2                    1                     2 GB
Japanese                                      2                    1                     2 GB
Korean                                        2                    1                     4 GB
Each of these languages requires one additional virtual processor core (VPC) for a production deployment and an additional 1/2 VPC for a development deployment.
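
As a rough worked example of the sizing above, the following shell sketch looks up the table values for one language and multiplies the per-pod memory by the pod count. The function name extra_language_resources and its output format are illustrative and not part of the product. The sketch assumes that the per-pod memory figure applies to every additional pod.

```shell
#!/bin/sh
# Sketch: additional resources for one language, using the table values above.
# Usage: extra_language_resources <language> <production|development>
extra_language_resources() {
  lang=$1
  deploy=$2
  case $lang in
    chinese)  mem_per_pod=8 ;;
    german)   mem_per_pod=2 ;;
    japanese) mem_per_pod=2 ;;
    korean)   mem_per_pod=4 ;;
    *) echo "no additional resources listed for: $lang" >&2; return 1 ;;
  esac
  case $deploy in
    production)  pods=2; vpc=1 ;;
    development) pods=1; vpc=0.5 ;;
    *) echo "unknown deployment type: $deploy" >&2; return 1 ;;
  esac
  # Total additional memory = pods x memory per pod (assumption, see above).
  echo "pods=$pods memory_gb=$((pods * mem_per_pod)) vpc=$vpc"
}

extra_language_resources chinese production   # prints pods=2 memory_gb=16 vpc=1
```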

The service uses data stores and other resources to support its functions. These resources are typically stored in Portworx storage.

Red Hat OpenShift 4.5: Alternatively, you can use VMware vSphere volumes, Microsoft Azure Disk volumes, or Amazon Web Services Elastic Block Store (EBS) as a storage solution. For more information, see Understanding persistent storage.

The following table lists the storage resources that are required to support the persistent volumes that are used by the service.
Component  Number of replicas  Space per pod
Postgres   3                   10 GB
etcd       5                   10 GB
Minio      4                   5 GB
MongoDB    3                   75 GB
Backup     1                   1 GB
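
To get a feel for the total capacity that the table implies, the following sketch multiplies replicas by space per pod and sums the results. It assumes that every replica gets its own volume of the listed size. The function name total_storage_gb is illustrative.

```shell
#!/bin/sh
# Sketch: total persistent storage implied by the table above
# (replicas x space per pod, in GB).
total_storage_gb() {
  total=0
  # Each row is component:replicas:gb_per_pod, copied from the table.
  for row in postgres:3:10 etcd:5:10 minio:4:5 mongodb:3:75 backup:1:1; do
    replicas=${row#*:}          # strip the component name
    replicas=${replicas%%:*}    # keep the replica count
    gb=${row##*:}               # keep the space per pod
    total=$((total + replicas * gb))
  done
  echo "$total"
}

total_storage_gb   # prints 326
```

The result is 30 + 50 + 20 + 225 + 1 = 326 GB, before any overhead that the storage solution itself adds.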
Before you can use the service, you must complete the following steps:
  1. Create a wa-repo.yaml file.
    Add the following content to the file:
    registry:
      - url: cp.icr.io/cp/cpd
        username: "cp"
        apikey: {entitlement-key}
        namespace: ""
        name: base-registry
      - url: cp.icr.io
        username: "cp"
        apikey: {entitlement-key}
        namespace: "cp/watson-assistant"
        name: wa-registry
    fileservers:
      - url: https://raw.github.com/IBM/cloud-pak/master/repo/cpd3
  2. If you don't have an entitlement key or don't know it, get the key from myibm.com.
  3. Replace the {entitlement-key} references in the YAML file with your entitlement key value, and then save and close the wa-repo.yaml file.
  4. If you're using Portworx storage, create a Portworx storage class named portworx-assistant. For more information, see Creating Portworx storage classes.
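
The placeholder replacement in step 3 can also be scripted. The following is a minimal sketch, not an official tool: the function name fill_entitlement_key and the ENTITLEMENT_KEY variable in the usage comment are hypothetical. It writes to a temporary file instead of using sed -i, because that option behaves differently on Mac OS and Linux.

```shell
#!/bin/sh
# Sketch: replace every {entitlement-key} placeholder in a repo file.
# Usage: fill_entitlement_key <file> <key>
fill_entitlement_key() {
  file=$1
  key=$2
  # Escape characters that are special in a sed replacement (&, |, \).
  escaped=$(printf '%s' "$key" | sed 's/[&|\\]/\\&/g')
  tmp=$(mktemp "${TMPDIR:-/tmp}/wa-repo.XXXXXX") || return 1
  # Write the result to a temporary file, then move it over the original.
  sed "s|{entitlement-key}|$escaped|g" "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example (hypothetical environment variable that holds the key):
# fill_entitlement_key wa-repo.yaml "$ENTITLEMENT_KEY"
```

Because the file now contains a credential, it is worth keeping it out of version control and restricting its permissions.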

Ensure that the Mac OS or Linux machine where you will run the commands meets the appropriate requirements for your environment:

Requirements for the machine                                                                  Cluster is connected to the internet  Cluster is air-gapped
Can connect to the cluster.                                                                   Yes                                   Yes
Is connected to the internet.                                                                 Yes                                   Not required
Has the oc command-line interface.                                                            Yes                                   Yes
Has the Cloud Pak for Data command-line interface.                                            Yes                                   Yes
Has the wa-repo.yaml in the same directory as the Cloud Pak for Data command-line interface.  Yes                                   Not required
Has the cpd-Operating_System-workspace directory, which contains the required files.          Not required                          Yes

You can download the appropriate oc client tools for your operating system from Red Hat® OpenShift®.
For the cpd-Operating_System-workspace directory, see Preparing for air-gapped installations.
Important: When you follow the instructions, use these values for the parameters:
  • Replace repo.yaml with wa-repo.yaml
  • Use ibm-watson-assistant as the assembly name.
  • Use 1.4.2 as the assembly version.
  • For Registry_location, specify a route to the registry followed by the namespace (project name). The route must be accessible from the machine where you run the install command. If the cluster does not have a route to the registry, you can temporarily enable external access to the registry. To find the route to use with the --transfer-image-to parameter, run the command for your OpenShift version:
    • OpenShift 4.5:
      oc get route/default-route -n openshift-image-registry --template='{{ .spec.host }}'

      The command returns a route similar to default-route-openshift-image-registry.apps.my_cluster_address.

      Append the namespace to the route. For example:
      default-route-openshift-image-registry.apps.my_cluster_address/zen
    • OpenShift 3.11:
      oc get route/docker-registry -n default --template='{{ .spec.host }}'

      The command returns a route similar to docker-registry-default.apps.my_cluster_address.

      Append the namespace to the route. For example:
      docker-registry-default.apps.my_cluster_address/zen
  • When asked for credentials, specify the appropriate OpenShift administrator user name, such as kubeadmin or ocadmin. For the password, you can use the token that oc whoami -t returns.
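
The "append the namespace to the route" step above can be sketched as a small helper. The function name registry_location is illustrative. The route in the example is the placeholder used in this topic, not a real address.

```shell
#!/bin/sh
# Sketch: build the Registry_location value from a registry route and a namespace.
# Usage: registry_location <route> <namespace>
registry_location() {
  route=${1%/}     # drop a trailing slash, if any
  printf '%s/%s\n' "$route" "$2"
}

registry_location default-route-openshift-image-registry.apps.my_cluster_address zen
# prints default-route-openshift-image-registry.apps.my_cluster_address/zen

# On OpenShift 4.5, the route could be looked up with the command shown above:
# registry_location "$(oc get route/default-route -n openshift-image-registry --template='{{ .spec.host }}')" zen
```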
 

About this task

If you are installing multiple services on your cluster, you must run the installations one at a time and wait until the installation completes before installing another service. You cannot run the installations in parallel.

Procedure

  1. This service requires the restricted SecurityContextConstraints to be bound to the target namespace prior to installation. If this SCC is already applied to the control plane, skip this step.
    Run the following command to bind the restricted SecurityContextConstraint to the Cloud Pak for Data namespace in which you will install the service:
    oc adm policy add-scc-to-group restricted system:serviceaccounts:{namespace}

    where {namespace} is the namespace in which Cloud Pak for Data is installed.

  2. Add the cluster namespace label to your service namespace.
    Important: This step is required. The label is needed to permit communication between your application's namespace and the Cloud Pak for Data namespace by using a network policy. If you don't label your namespace, your installation will succeed, but you won't be able to provision a service instance afterward.
    1. Log in to OpenShift.
      oc login
    2. Add the label.
      oc label --overwrite namespace {namespace} ns={namespace}
      where {namespace} is the namespace in which Cloud Pak for Data is installed.
      For example:
      oc label --overwrite namespace zen ns=zen
      If you get a message that says namespace/zen not labeled, it means the namespace was already labeled. No action is required.
    3. Make sure you are pointing to the correct project.
      oc project {namespace}
      where {namespace} is the namespace in which Cloud Pak for Data is installed.
  3. From the namespace where the Cloud Pak for Data cluster is installed, get the name of the secret for pulling images from the internal Docker registry.
    oc get secrets | grep default-dockercfg
    Make a note of the secret. You will add it as the value for the global.image.pullSecret setting in the override file that you create in the next step. For example:
    global:
      image:
        pullSecret: "default-dockercfg-gqfb4"
  4. Create an override file where you can change configuration settings for your deployment. You can use the sample overrides.yaml file that is provided with the service installation files as a starting point.
    For more information about the configuration changes that you can make, see Override values for Watson Assistant installation.
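
Steps 3 and 4 can be tied together with a small helper that prints the override fragment for a given secret name. This is only a sketch: the function name pull_secret_override is illustrative, and merging the fragment into your overrides.yaml file is left to you.

```shell
#!/bin/sh
# Sketch: print the global.image.pullSecret override fragment for a secret name.
# Usage: pull_secret_override <secret-name>
pull_secret_override() {
  printf 'global:\n  image:\n    pullSecret: "%s"\n' "$1"
}

pull_secret_override default-dockercfg-gqfb4

# With cluster access, the name could be taken from the command in step 3
# (assumes exactly one matching secret in the current project):
# pull_secret_override "$(oc get secrets -o name | grep default-dockercfg | sed 's|^secret/||')"
```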

To install the service:

  1. Run the appropriate cpd command for your environment:
    Tip: For a list of all available options, enter the command: ./cpd-Operating_System --help.
    • To install the service on a cluster that can connect to the internet:
      1. Change to the directory where you stored the Cloud Pak for Data command-line interface and the wa-repo.yaml file.
      2. Log in to your Cloud Pak for Data cluster as a project administrator:
        oc login OpenShift_URL:port
      3. Run the following command to see a preview of what will be installed when you install the service.
        ./cpd-{Operating_System} --repo ./wa-repo.yaml \
                --assembly ibm-watson-assistant \
                --version assembly_version \
                --namespace Project \
                --transfer-image-to Registry_location \
                --target-registry-username OpenShift_Username \
                --target-registry-password OpenShift_Password \
                --insecure-skip-tls-verify \
                --cluster-pull-prefix Registry_from_cluster \
                --storageclass Storage_class_name \
                --override Filepath_to_override.yaml \
                --dry-run
        • Replace the {Operating_System} in the cpd-{Operating_System} command:
          • Linux: linux
          • Mac OS: darwin
        • The wa-repo.yaml file is the file you created earlier.
        • For assembly_version, specify 1.4.2.
        • For Registry_location, specify a route to the registry followed by the namespace (project name). The route must be accessible from the machine where you run the install command. If the cluster does not have a route to the registry, you can temporarily enable external access to the registry. To find the route to use with the --transfer-image-to parameter, run the command for your OpenShift version:
          • OpenShift 4.5:
            oc get route/default-route -n openshift-image-registry --template='{{ .spec.host }}'

            The command returns a route similar to default-route-openshift-image-registry.apps.my_cluster_address.

            Append the namespace to the route. For example:
            default-route-openshift-image-registry.apps.my_cluster_address/zen
          • OpenShift 3.11:
            oc get route/docker-registry -n default --template='{{ .spec.host }}'

            The command returns a route similar to docker-registry-default.apps.my_cluster_address.

            Append the namespace to the route. For example:
            docker-registry-default.apps.my_cluster_address/zen
        • For Registry_from_cluster, specify the address of the internal OpenShift docker registry and add /{namespace} to it. The values are typically:
            • OpenShift 4.x: image-registry.openshift-image-registry.svc:5000
            • OpenShift 3.x: docker-registry.default.svc:5000
        • {namespace} is the namespace that Cloud Pak for Data was installed into, which is typically zen.
        • Provide the username and password for a user with access to the registry in the target-registry-username and target-registry-password parameters. The default username is typically:
            • OpenShift 4.x: kubeadmin
            • OpenShift 3.x: ocadmin
            If you specify $(oc whoami -t) as the password, the corresponding password is populated for you.
        • If you are using the internal Red Hat OpenShift registry and you are using the default self-signed certificate, specify the --insecure-skip-tls-verify flag to prevent x509 errors.
        • Specify the --storageclass parameter only if you're using a storage class other than portworx-assistant.
        • Specify overrides.yaml as the Filepath_to_override.yaml.
        For example:
        ./cpd-linux --repo wa-repo.yaml --assembly ibm-watson-assistant \
        --version 1.4.2 --namespace zen --transfer-image-to $(oc registry info)/zen \
        --target-registry-username kubeadmin --target-registry-password $(oc whoami -t) \
        --insecure-skip-tls-verify --cluster-pull-prefix image-registry.openshift-image-registry.svc:5000/zen \
        --override overrides.yaml --dry-run
      4. If the dry-run is successful, then you are ready to install the service. Remove the --dry-run parameter from the command and enter the command again. Otherwise, fix any problems that exist before you try to install the service.
    • To install the service on an air-gapped cluster:
      Important: You should have already performed the steps in Preparing for air-gapped installations to prepare for an air-gapped installation and used the following values for the parameters:
      • Replace repo.yaml with wa-repo.yaml
      • Use ibm-watson-assistant as the assembly name.
      • Use 1.4.2 as the assembly version.
      • For Registry_location, specify a route to the registry followed by the namespace (project name). The route must be accessible from the machine where you run the install command. If the cluster does not have a route to the registry, you can temporarily enable external access to the registry. To find the route to use with the --transfer-image-to parameter, run the command for your OpenShift version:
        • OpenShift 4.5:
          oc get route/default-route -n openshift-image-registry --template='{{ .spec.host }}'

          The command returns a route similar to default-route-openshift-image-registry.apps.my_cluster_address.

          Append the namespace to the route. For example:
          default-route-openshift-image-registry.apps.my_cluster_address/zen
        • OpenShift 3.11:
          oc get route/docker-registry -n default --template='{{ .spec.host }}'

          The command returns a route similar to docker-registry-default.apps.my_cluster_address.

          Append the namespace to the route. For example:
          docker-registry-default.apps.my_cluster_address/zen
      • When asked for credentials, specify the appropriate OpenShift administrator user name, such as kubeadmin or ocadmin. For the password, you can use the token that oc whoami -t returns.
      1. Change to the directory where you placed the Cloud Pak for Data command-line interface.
      2. Log in to your Red Hat OpenShift cluster as a project administrator:
        oc login OpenShift_URL:port
      3. Run the following command to install the service.
        ./cpd-{Operating_System} \
                --load-from Image_directory_location \
                --assembly ibm-watson-assistant \
                --version Assembly_version \
                --namespace Project \
                --storageclass Storage_class_name \
                --override Filepath_to_override.yaml \
                --cluster-pull-prefix Registry_from_cluster
                
        • Replace the {Operating_System} in the cpd-{Operating_System} command:
          • Linux: linux
          • Mac OS: darwin
        • For Image_directory_location, specify the location of the cpd-Operating_System-workspace directory.
        • For Assembly_version, specify 1.4.2.
        • For Registry_from_cluster, specify the address of the internal OpenShift docker registry and add /{namespace} to it. The values are typically:
            • OpenShift 4.x: image-registry.openshift-image-registry.svc:5000
            • OpenShift 3.x: docker-registry.default.svc:5000
        • Specify overrides.yaml as the Filepath_to_override.yaml.
        • Specify the --storageclass parameter only if you're using a storage class other than portworx-assistant.
        • If you are using the internal Red Hat OpenShift registry, do not specify the --ask-pull-registry-credentials parameter.
        For example:
        ./cpd-linux --load-from ./cpd-linux-workspace \
        --assembly ibm-watson-assistant --version 1.4.2 --namespace zen \
        --cluster-pull-prefix image-registry.openshift-image-registry.svc:5000/zen \
        --override overrides.yaml
  2. For deployments that don't use Portworx: This step is required if you are using a storage solution other than Portworx, such as VMware vSphere volumes, Microsoft Azure Disk volumes, or Amazon Web Services Elastic Block Store (EBS). You must run the backup cron job as part of the installation process.


    The installation process waits until all persistent volume claims (PVC) have been bound before it completes. However, if you use a storage solution other than Portworx, the PVC for the Postgres backup is not bound until after the Postgres backup cron job runs for the first time. To prevent the installation from having to wait for the job or timing out, you can start the cron job manually.

    1. Check the status of the installation. Do not run the cron job until after the store pod is running. You can check the status of the store pod by using the following command:
      oc get pods -l release=watson-assistant,component=store
    2. Run the cron job:
      oc create job --from=cronjob/watson-assistant-backup-cronjob \
            watson-assistant-backup-cronjob-first-run -n {namespace-name}
    3. After the job creates a pod, the PVC becomes bound and the installation can finish. After the installation is complete, you can delete the job you created.

    For more information about the backup cron job, see Backing up and restoring data.
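
The "wait until the store pod is running" check in the backup cron job step can be sketched as a filter over the pod listing. The function name store_pod_running is illustrative. It only looks at the STATUS column, so a pod that is Running but not yet ready also passes; treat it as a starting point rather than a complete readiness check.

```shell
#!/bin/sh
# Sketch: succeed if any pod in `oc get pods --no-headers` output (read from
# stdin) reports a STATUS of Running. Columns: NAME READY STATUS RESTARTS AGE.
store_pod_running() {
  awk '$3 == "Running" { found = 1 } END { exit !found }'
}

# Possible use before starting the cron job manually (polls every 30 seconds):
# until oc get pods -l release=watson-assistant,component=store --no-headers | store_pod_running; do
#   sleep 30
# done
```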

What to do next

Verify that the installation was successful
After the installation completes, run some tests to verify that the service is working as expected.
  1. Check the status of the assembly and modules.
    ./cpd-linux status --namespace {namespace} --assembly ibm-watson-assistant
  2. If you haven't yet, set up the Helm environment.
    export TILLER_NAMESPACE={namespace}
    oc get secret helm-secret -n $TILLER_NAMESPACE -o yaml \
      | grep -A3 '^data:' | tail -3 \
      | awk -F: '{system("echo "$2" | base64 --decode > "$1)}'
    export HELM_TLS_CA_CERT=$PWD/ca.cert.pem
    export HELM_TLS_CERT=$PWD/helm.cert.pem
    export HELM_TLS_KEY=$PWD/helm.key.pem
    helm version --tls
    
    The output should look like this:
    Client: &version.Version{SemVer:"v2.14.3", \
    GitCommit:"0e7f3b6637f7af8fcfddb3d2941fcc7cbebb0085", GitTreeState:"clean"}
    Server: &version.Version{SemVer:"v2.14.3", \
    GitCommit:"0e7f3b6637f7af8fcfddb3d2941fcc7cbebb0085", GitTreeState:"clean"}
    
  3. Check the status of the resource.
    helm status watson-assistant --tls
  4. Run the Helm tests.
    helm test watson-assistant --tls --timeout=18000 [--cleanup]
    • --timeout={time} sets how long, in seconds, to wait for the tests to finish.
    • --cleanup deletes test pods upon completion.
Check for available patches
From a cluster that can connect to the internet, run the following command to check for available patches:
./cpd-Operating_System --repo ./wa-repo.yaml status \
--namespace Project \
--assembly ibm-watson-assistant \
--patches
For example:
./cpd-linux --repo wa-repo.yaml status --namespace zen \
--assembly ibm-watson-assistant --patches

To check for patches on an air-gapped cluster, see the list of available patches for Watson Assistant.

Next steps

Provision the service, set access levels, and create required connections. See Administering Watson Assistant.