Installing Watson Discovery

Before you can use Watson™ Discovery, you must prepare the cluster to work with Watson Discovery and then install the Watson Discovery service on the cluster.

Before you begin

Configure nodes for Elasticsearch

Watson Discovery uses Elasticsearch, a distributed, RESTful search and analytics engine. Before you install Watson Discovery, you must ensure that each node in the cluster is configured for Elasticsearch by setting the vm.max_map_count kernel parameter to 262144. The commands to use depend on your OpenShift version.

For more information, see Virtual memory in the Elasticsearch reference.

OpenShift 3

For OpenShift 3 clusters, you need SSH access to each node in the cluster. Run the following commands on each node:

echo vm.max_map_count=262144 >> /etc/sysctl.conf
sysctl -w vm.max_map_count=262144

The change takes effect immediately and persists across reboots.
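
Running the echo command above more than once appends a duplicate line to /etc/sysctl.conf each time. The following is a minimal sketch of a repeat-safe variant, not part of the official procedure; the helper name ensure_sysctl_line is hypothetical, and the commented usage assumes you are running with root privileges on the node.

```shell
# Hypothetical helper: append LINE to FILE only if that exact line
# is not already present, so the step can be repeated safely.
ensure_sysctl_line() {
  file=$1
  line=$2
  touch "$file"
  # -x matches the whole line, -F treats the pattern as a fixed string
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage on a node (requires root); commented out so the sketch has no side effects:
# ensure_sysctl_line /etc/sysctl.conf vm.max_map_count=262144
# sysctl -w vm.max_map_count=262144
# sysctl -n vm.max_map_count   # read back the live value
```

Reading the value back with sysctl -n is a quick way to confirm that the running kernel picked up the setting.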

OpenShift 4

For OpenShift 4 clusters, you can configure the nodes for Elasticsearch by creating and deploying a MachineConfig object. After you log in to the cluster, run the following command:

cat << EOF | oc create -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-sysctl-elastic
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          # vm.max_map_count=262144
          source: data:text/plain;charset=utf-8;base64,dm0ubWF4X21hcF9jb3VudD0yNjIxNDQ=
        filesystem: root
        mode: 0644
        path: /etc/sysctl.d/99-elasticsearch.conf
EOF
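
The source field in the MachineConfig above carries the file contents as a base64 data URL. As a small illustration of where that string comes from, the following sketch encodes the setting and decodes it again. It assumes a base64 utility that accepts the --decode option, as the GNU coreutils version does.

```shell
# The exact file contents, with no trailing newline
payload='vm.max_map_count=262144'

# Encode the payload the same way the data URL in the MachineConfig does
encoded=$(printf '%s' "$payload" | base64)
echo "$encoded"

# Decode it again to confirm what will be written to the node
decoded=$(printf '%s' "$encoded" | base64 --decode)
echo "$decoded"
```

The same approach can be used to inspect any other data URL before you deploy a MachineConfig that contains it.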

After the object is deployed, run the following command to wait until the worker machine config pool reports that the update is complete:

oc wait mcp/worker --for condition=updated --timeout=25m
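
The oc wait command confirms that the machine config pool finished updating, but it does not show the resulting kernel setting. The following is a rough sketch of a per-node check, not part of the official procedure. It assumes that worker nodes carry the default node-role.kubernetes.io/worker label, that your user is allowed to run oc debug against nodes, and that the debug session writes its informational messages to stderr. The function name check_max_map_count is hypothetical.

```shell
# Hypothetical helper: report vm.max_map_count for every worker node.
# Returns non-zero if any node does not report the expected value.
check_max_map_count() {
  expected=262144
  failed=0
  for node in $(oc get nodes -l node-role.kubernetes.io/worker -o name); do
    # Run sysctl inside the host filesystem of the node; discard debug chatter
    actual=$(oc debug "$node" -- chroot /host sysctl -n vm.max_map_count 2>/dev/null)
    if [ "$actual" = "$expected" ]; then
      echo "$node: ok ($actual)"
    else
      echo "$node: expected $expected, got '${actual}'"
      failed=1
    fi
  done
  return $failed
}

# Usage, after logging in to the cluster:
# check_max_map_count
```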

For more information about the OpenShift 4 Machine Config Operator, see the OpenShift 4 machine-config-operator repository.

Decide what you want to install
  • You can deploy multiple Watson Discovery service instances on a single Red Hat® OpenShift® cluster, as long as each instance is installed in a separate Cloud Pak for Data installation in a different Red Hat OpenShift project (namespace). If you choose this option, follow the installation instructions.
  • If you purchased Watson Discovery for Content Intelligence, you must create an override file to activate the option. For more information, see Override values for Watson Discovery installation.

Ensure that you have proper permissions on the cluster and that you have already installed IBM Cloud Pak for Data.

Note: Because Watson Discovery uses Elasticsearch and the Elasticsearch data nodes are backed up through MinIO, MinIO requires additional storage.

For example, if each Elasticsearch data node is 30 GB and you have 2 data nodes, MinIO needs an additional 120 GB of storage in total. If you have 4 MinIO nodes, you must increase the storage of each MinIO node by 30 GB. You can calculate the additional storage required per MinIO node with the following formula:

<size of a data node> * <number of data nodes> * 2 / <number of MinIO nodes> =
                            <amount of additional storage per MinIO node>

In the example above, the result is: 30 * 2 * 2 / 4 = 30 GB of additional storage per MinIO node.
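
The formula can be expressed as a small shell function for quick sizing checks. This is an illustrative sketch only. The function name minio_extra_per_node is hypothetical, all sizes are whole gigabytes, and rounding up to the next whole gigabyte when the division is not exact is an assumption rather than something the formula specifies.

```shell
# Hypothetical helper: additional storage (GB) needed per MinIO node.
# Arguments: <size of a data node in GB> <number of data nodes> <number of MinIO nodes>
minio_extra_per_node() {
  data_node_gb=$1
  data_nodes=$2
  minio_nodes=$3
  total=$(( data_node_gb * data_nodes * 2 ))
  # Integer ceiling division, so a fractional result is rounded up (assumption)
  echo $(( (total + minio_nodes - 1) / minio_nodes ))
}

# The example above: 30 GB data nodes, 2 data nodes, 4 MinIO nodes
minio_extra_per_node 30 2 4
```

For the values in the example, the function prints 30, which matches the manual calculation.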