Creating a service instance for Watson Query with the cpd-cli service-instance create command

After you install Watson Query, you must create at least one Watson Query service instance. Each service instance must be in a different Red Hat® OpenShift® Container Platform project. You can create a service instance in the operands project or in a project that is tethered to the operands project. If you are a Cloud Pak for Data user, you can use the cpd-cli service-instance create command to script the process of creating service instances.

Who needs to complete this task?
To create a service instance by using the cpd-cli, you must have the Create service instances (can_provision) permission in Cloud Pak for Data.
When do you need to complete this task?
Complete this task only if you want to create a service instance from the cpd-cli by using the cpd-cli service-instance create command.

Information you need to complete this task

Review the following information before you create a service instance for Watson Query:

Version requirements

All of the components that are associated with an instance of Cloud Pak for Data must be installed or created at the same release. For example, if Watson Query is installed at Version 4.8.7, you must create the service instance at Version 4.8.7.

Important: Watson Query uses a different version number from Cloud Pak for Data. This topic includes a table that shows the Watson Query version for each refresh of Cloud Pak for Data. Use this table to find the correct version based on the version of Cloud Pak for Data that is installed.
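The version mapping from the table in this topic (shown in the INSTANCE_VERSION step of the procedure) can be captured in a small lookup function. This is a minimal sketch: the function name wq_instance_version is hypothetical, and it covers only the versions that are listed in this topic.

```shell
# Hypothetical helper: maps a Cloud Pak for Data version to the Watson Query
# service instance version, using only the values from the table in this topic.
wq_instance_version() {
  case "$1" in
    4.8.5|4.8.6|4.8.7) echo "2.2.5" ;;
    4.8.4)             echo "2.2.4" ;;
    4.8.2|4.8.3)       echo "2.2.2" ;;
    4.8.0|4.8.1)       echo "2.2.0" ;;
    *)
      # Versions outside the table are rejected rather than guessed.
      echo "No mapping for Cloud Pak for Data version '$1'" >&2
      return 1
      ;;
  esac
}

# Example usage:
#   export INSTANCE_VERSION="$(wq_instance_version 4.8.7)"
```

The function fails for any version that is not in the table, so a script that uses it stops instead of creating an instance at the wrong release.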
Environment variables

The commands in this task use environment variables so that you can run the commands exactly as written.

  • If you don't have the script that defines the environment variables, see Setting up installation environment variables.
  • To use the environment variables from the script, you must source the environment variables before you run the commands in this task. For example, run:
    source ./cpd_vars.sh
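After you source the script, you might want to confirm that the variables this task relies on are actually set. The following is a minimal sketch, not part of the product tooling: the function name require_vars is hypothetical, and the variable names in the usage comment are the ones that the commands in this task reference.

```shell
# Hypothetical helper: reports every named variable that is unset or empty.
# Returns 0 when all variables are set, 1 otherwise.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "Missing environment variable: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# Example usage:
#   require_vars PROJECT_CPD_INST_OPERANDS CPD_PROFILE_NAME STG_CLASS_BLOCK STG_CLASS_FILE
```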

Before you begin

This task assumes that the following prerequisites are met:

  • Watson Query is installed. If this task is not complete, see Installing Watson Query.
  • The cpd-cli command-line interface is installed on the workstation from which you will create the service instance. If this task is not complete, see Setting up a client workstation.
  • You created a Cloud Pak for Data user profile on the workstation from which you will create the service instance. The profile must be associated with a user who has the Create service instances (can_provision) permission in Cloud Pak for Data. If this task is not complete, see Creating a profile to use the cpd-cli management commands.

Procedure

Complete the following tasks to create a service instance:

  1. Creating a service instance
  2. Validating that the service instance was created
  3. What to do next

Creating a service instance

To create a service instance:

  1. Change to the directory on your workstation where you want to create the JSON file that defines the service instance payload.
  2. Set the environment variables that are used to populate the JSON payload for the service instance:
    1. Set the INSTANCE_SHORT_NAME environment variable to the unique name that you want to use to identify the service instance:
      export INSTANCE_SHORT_NAME="<display-name>"

      The short name is a string and can contain alphanumeric characters (a-z, A-Z, 0-9), dashes (-), and underscores (_).

    2. Set the INSTANCE_PROJECT environment variable to the project where you want to create the service instance:
      Create the service instance in the operands project
      export INSTANCE_PROJECT=${PROJECT_CPD_INST_OPERANDS}

      The command uses the PROJECT_CPD_INST_OPERANDS variable, which is already defined in your installation environment variables script.


      Create the service instance in a tethered project
      Important: If multiple tethered projects are associated with this instance of Cloud Pak for Data, make sure that the ${PROJECT_CPD_INSTANCE_TETHERED} environment variable is set to the correct project name before you run the export command:
      echo $PROJECT_CPD_INSTANCE_TETHERED
      export INSTANCE_PROJECT=${PROJECT_CPD_INSTANCE_TETHERED}

      Remember: You can create only one service instance in each project.
    3. Set the INSTANCE_NAME environment variable:
      export INSTANCE_NAME="watson-query-${INSTANCE_PROJECT}-${INSTANCE_SHORT_NAME}"
    4. Set the INSTANCE_DESCRIPTION environment variable to the description that you want to use for the service instance:
      export INSTANCE_DESCRIPTION="<description>"

      This description is displayed on the Instances page of the Cloud Pak for Data web client.

      The description is a string and can contain alphanumeric characters, spaces, dashes, underscores, and periods. Make sure that you surround the description with quotation marks, as shown in the preceding export command.

    5. Set the INSTANCE_VERSION environment variable to the version that corresponds to the version of Cloud Pak for Data on your cluster:
      export INSTANCE_VERSION=<version>

      Use the following table to determine the appropriate value:

      Cloud Pak for Data version    Service instance version
      4.8.7                         2.2.5
      4.8.6                         2.2.5
      4.8.5                         2.2.5
      4.8.4                         2.2.4
      4.8.3                         2.2.2
      4.8.2                         2.2.2
      4.8.1                         2.2.0
      4.8.0                         2.2.0
    6. Set the INSTANCE_CPU environment variable to the amount of CPU to allocate to the service instance:
      export INSTANCE_CPU=<integer>

      Specify a value between 4 and 64.

      Size the instance based on your workload. For more information about the number of CPUs to allocate to the service instance, see the component scaling guidance PDF, which you can download from the IBM® Entitled Registry.

    7. Set the INSTANCE_MEMORY environment variable to the amount of memory to allocate to the service instance:
      export INSTANCE_MEMORY=<integer>

      Specify a value between 10 Gi and 512 Gi. Specify the value as an integer. Omit the unit of measurement.

      Size the instance based on your workload. For more information about the amount of memory to allocate to the service instance, see the component scaling guidance PDF, which you can download from the IBM Entitled Registry.

    8. Set the INSTANCE_WORKERS environment variable to the number of worker nodes to run the service instance on:
      export INSTANCE_WORKERS=<integer>
      The maximum number of workers that you can specify depends on whether Db2U is configured to run with elevated privileges:
      • If Db2U is configured to run with limited privileges, you can specify a value between 1 and the total number of worker nodes on the cluster.
      • If Db2U is configured to run with elevated privileges, you can specify a value between 1 and 999.

      Most workloads can run on 1 to 3 nodes. For more information about the number of nodes recommended based on your workload, see the component scaling guidance PDF, which you can download from the IBM Entitled Registry.

    9. Set the PV_SIZE environment variable to the amount of storage that you want to allocate to the service instance:
      export PV_SIZE=<integer>

      Specify a value between 10 Gi and 10240 Gi. The default recommendation is 50 Gi. Specify the value as an integer. Omit the unit of measurement.

      Size the volume based on the size of the queries that you plan to run. For guidance, see the component scaling guidance PDF, which you can download from the IBM Entitled Registry.

    10. Set the PV_SIZE_CACHE environment variable to the amount of storage that you want to allocate to caching for the service instance:
      export PV_SIZE_CACHE=<integer>

      Specify a value between 1 Gi and 10240 Gi. The default recommendation is 100 Gi. Specify the value as an integer. Omit the unit of measurement.

      Size the volume based on the size of the data cache. For guidance, see the component scaling guidance PDF, which you can download from the IBM Entitled Registry.

    11. Set the PV_SIZE_AUDITING environment variable to the amount of storage that you want to allocate to audit logs for the service instance:
      export PV_SIZE_AUDITING=<integer>

      Specify a value between 1 Gi and 10240 Gi. The default recommendation is 30 Gi. Specify the value as an integer. Omit the unit of measurement.

      Size the volume based on the number of auditable events that are logged. For guidance, see the component scaling guidance PDF, which you can download from the IBM Entitled Registry.
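Before you generate the payload, you can check the values that you exported against the ranges stated in the preceding steps. The following is a minimal sketch: the function names (in_range, valid_short_name, check_range, check_instance_settings) are hypothetical. For INSTANCE_WORKERS it checks only the 1 to 999 range; on clusters where Db2U runs with limited privileges, the real upper bound is the number of worker nodes, which this sketch does not verify.

```shell
# Hypothetical helper: succeeds only when $1 is an integer between $2 and $3.
in_range() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit (for example "16Gi")
  esac
  [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Hypothetical helper: succeeds only when $1 contains nothing but the characters
# allowed for the short name (a-z, A-Z, 0-9, dashes, underscores).
valid_short_name() {
  case "$1" in
    ''|*[!A-Za-z0-9_-]*) return 1 ;;
  esac
  return 0
}

# Hypothetical helper: checks the variable named $1 against the range $2..$3.
check_range() {
  eval "value=\${$1:-}"
  if ! in_range "${value}" "$2" "$3"; then
    echo "$1 must be an integer between $2 and $3 (got '${value}')" >&2
    return 1
  fi
}

# Checks every setting and reports all problems before returning.
check_instance_settings() {
  status=0
  if ! valid_short_name "${INSTANCE_SHORT_NAME:-}"; then
    echo "INSTANCE_SHORT_NAME is empty or contains unsupported characters" >&2
    status=1
  fi
  check_range INSTANCE_CPU 4 64         || status=1
  check_range INSTANCE_MEMORY 10 512    || status=1
  check_range INSTANCE_WORKERS 1 999    || status=1   # tighter bound applies with limited privileges
  check_range PV_SIZE 10 10240          || status=1
  check_range PV_SIZE_CACHE 1 10240     || status=1
  check_range PV_SIZE_AUDITING 1 10240  || status=1
  return "${status}"
}

# Example usage:
#   check_instance_settings && echo "Settings are within the documented ranges."
```

Because the memory and storage values must omit the unit of measurement, a value such as 16Gi is rejected as a non-integer.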

  3. Create the watson-query-instance.json payload file.

    The command that you need to run depends on the type of storage on your cluster.


    Portworx storage
    cat << EOF > ./watson-query-instance.json
    {
        "addon_type": "dv",
        "display_name": "${INSTANCE_NAME}",
        "addon_version": "${INSTANCE_VERSION}",
        "namespace": "${INSTANCE_PROJECT}",
        "create_arguments": {
            "description": "${INSTANCE_DESCRIPTION}",
            "metaData": {},
            "parameters" : {
                "resources.dv.requests.cpu": "${INSTANCE_CPU}",
                "resources.dv.requests.memory": "${INSTANCE_MEMORY}Gi",
                "image.pullPolicy": "IfNotPresent",
                "workerCount": "${INSTANCE_WORKERS}",
                "persistence.storageClass": "portworx-db2-rwo-sc",
                "persistence.size": "${PV_SIZE}Gi",
                "persistence.cachingpv.storageClass": "portworx-db2-rwx-sc",
                "persistence.cachingpv.size": "${PV_SIZE_CACHE}Gi",
                "persistence.auditpv.storageClass": "portworx-db2-rwx-sc",
                "persistence.auditpv.size": "${PV_SIZE_AUDITING}Gi"
            },
            "resources": {
                "cpu": "$(( (${INSTANCE_WORKERS} + 1) * ${INSTANCE_CPU} ))",
                "memory": "$(( (${INSTANCE_WORKERS} + 1) * ${INSTANCE_MEMORY} ))"
            },
            "transientFields": {}
        }
    }
    EOF

    Amazon Elastic storage
    cat << EOF > ./watson-query-instance.json
    {
        "addon_type": "dv",
        "display_name": "${INSTANCE_NAME}",
        "addon_version": "${INSTANCE_VERSION}",
        "namespace": "${INSTANCE_PROJECT}",
        "create_arguments": {
            "description": "${INSTANCE_DESCRIPTION}",
            "metaData": {},
            "parameters" : {
                "resources.dv.requests.cpu": "${INSTANCE_CPU}",
                "resources.dv.requests.memory": "${INSTANCE_MEMORY}Gi",
                "image.pullPolicy": "IfNotPresent",
                "workerCount": "${INSTANCE_WORKERS}",
                "persistence.storageClass": "${STG_CLASS_FILE}",
                "persistence.size": "${PV_SIZE}Gi",
                "persistence.cachingpv.storageClass": "${STG_CLASS_FILE}",
                "persistence.cachingpv.size": "${PV_SIZE_CACHE}Gi",
                "persistence.auditpv.storageClass": "${STG_CLASS_FILE}",
                "persistence.auditpv.size": "${PV_SIZE_AUDITING}Gi"
            },
            "resources": {
                "cpu": "$(( (${INSTANCE_WORKERS} + 1) * ${INSTANCE_CPU} ))",
                "memory": "$(( (${INSTANCE_WORKERS} + 1) * ${INSTANCE_MEMORY} ))"
            },
            "transientFields": {}
        }
    }
    EOF
    The following environment variables use the values that are already defined in your installation environment variables script:
    • ${STG_CLASS_FILE}

    All other storage
    cat << EOF > ./watson-query-instance.json
    {
        "addon_type": "dv",
        "display_name": "${INSTANCE_NAME}",
        "addon_version": "${INSTANCE_VERSION}",
        "namespace": "${INSTANCE_PROJECT}",
        "create_arguments": {
            "description": "${INSTANCE_DESCRIPTION}",
            "metaData": {},
            "parameters" : {
                "resources.dv.requests.cpu": "${INSTANCE_CPU}",
                "resources.dv.requests.memory": "${INSTANCE_MEMORY}Gi",
                "image.pullPolicy": "IfNotPresent",
                "workerCount": "${INSTANCE_WORKERS}",
                "persistence.storageClass": "${STG_CLASS_BLOCK}",
                "persistence.size": "${PV_SIZE}Gi",
                "persistence.cachingpv.storageClass": "${STG_CLASS_BLOCK}",
                "persistence.cachingpv.size": "${PV_SIZE_CACHE}Gi",
                "persistence.auditpv.storageClass": "${STG_CLASS_FILE}",
                "persistence.auditpv.size": "${PV_SIZE_AUDITING}Gi"
            },
            "resources": {
                "cpu": "$(( (${INSTANCE_WORKERS} + 1) * ${INSTANCE_CPU} ))",
                "memory": "$(( (${INSTANCE_WORKERS} + 1) * ${INSTANCE_MEMORY} ))"
            },
            "transientFields": {}
        }
    }
    EOF
    The following environment variables use the values that are already defined in your installation environment variables script:
    • ${STG_CLASS_BLOCK}
    • ${STG_CLASS_FILE}
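Each payload variant fills in the resources section with shell arithmetic: the per-node CPU or memory value is multiplied by the number of workers plus one. The following worked example reproduces that calculation so that you can preview the totals before you write the file. The function name total_resource and the sample values are illustrative assumptions, not sizing recommendations.

```shell
# Hypothetical helper: mirrors the arithmetic in the "resources" section of the payload.
#   $1 = number of workers (INSTANCE_WORKERS)
#   $2 = per-node value (INSTANCE_CPU, or INSTANCE_MEMORY in Gi)
total_resource() {
  echo "$(( ($1 + 1) * $2 ))"
}

# Sample values: 2 workers, 8 CPU and 32 Gi of memory per node.
echo "cpu: $(total_resource 2 8)"       # (2 + 1) * 8  -> prints "cpu: 24"
echo "memory: $(total_resource 2 32)"   # (2 + 1) * 32 -> prints "memory: 96"
```

Because the here document is unquoted, the shell evaluates the arithmetic when it writes watson-query-instance.json, so the file contains the computed numbers rather than the expressions.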

  4. Set the PAYLOAD_FILE environment variable to the fully qualified name of the JSON payload file on your workstation:
    export PAYLOAD_FILE=<fully-qualified-JSON-file-name>
  5. Create the service instance from the payload file:
    cpd-cli service-instance create \
    --profile=${CPD_PROFILE_NAME} \
    --from-source=${PAYLOAD_FILE}
Note: The c-db2u-dv-dvcaching pod remains in the "0/1 Init" state during the entire Watson Query instance-provisioning process. The pod switches to the "1/1 Running" state after the process is complete.

Validating that the service instance was created

To validate that the service instance was created, run the following command:

cpd-cli service-instance status ${INSTANCE_NAME} \
--profile=${CPD_PROFILE_NAME} \
--output=json
  • If the command returns PROVISIONED, the service instance was successfully created.
  • If the command returns PROVISION_IN_PROGRESS, wait a few minutes and run the command again.
  • If the command returns FAILED, review the pod logs for the zen-core-api and zen-watcher pods for possible causes.
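If you script the validation, you can repeat the status command until provisioning finishes. The following is a minimal sketch under stated assumptions: the function names classify_status and wait_for_instance are hypothetical, the attempt count and delay are arbitrary defaults, and the sketch assumes that the status string (PROVISIONED, PROVISION_IN_PROGRESS, or FAILED) appears verbatim somewhere in the command output.

```shell
# Hypothetical helper: classifies the text returned by the status command.
# Assumes the status string appears verbatim in the output.
classify_status() {
  case "$1" in
    *PROVISION_IN_PROGRESS*) echo "in_progress" ;;
    *FAILED*)                echo "failed" ;;
    *PROVISIONED*)           echo "provisioned" ;;
    *)                       echo "unknown" ;;
  esac
}

# Hypothetical helper: polls the status command from this topic.
#   $1 = maximum number of attempts (default 30, an assumption)
#   $2 = seconds to wait between attempts (default 60, an assumption)
wait_for_instance() {
  attempts="${1:-30}"
  delay="${2:-60}"
  i=1
  while [ "${i}" -le "${attempts}" ]; do
    output="$(cpd-cli service-instance status "${INSTANCE_NAME}" \
      --profile="${CPD_PROFILE_NAME}" \
      --output=json 2>&1)" || true
    case "$(classify_status "${output}")" in
      provisioned)
        echo "Service instance ${INSTANCE_NAME} is provisioned."
        return 0
        ;;
      failed)
        echo "Provisioning failed. Review the zen-core-api and zen-watcher pod logs." >&2
        return 1
        ;;
    esac
    sleep "${delay}"
    i=$((i + 1))
  done
  echo "Gave up after ${attempts} attempts." >&2
  return 2
}

# Example usage:
#   wait_for_instance 30 60
```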

What to do next

  1. When you provision the Watson Query service, you are automatically assigned the Watson Query Admin role. After you provision the service, you must give other users access to the service. For more information, see Managing users in Watson Query.
  2. To connect to the Watson Query service, use the JDBC URL that is provided on the Configure connection page for the service. Additionally, if you have a load balancer, you must open the port in your load balancer and your firewall. For more information, see Configuring network requirements for Watson Query.
Now you can use the Watson Query service. For more information, see Virtualizing data.