Creating a service instance for Db2 Big SQL with the cpd-cli service-instance create command

After you install Db2 Big SQL, you must create at least one Db2 Big SQL service instance in the operands project. If you are a Cloud Pak for Data user, you can use the cpd-cli service-instance create command to script the process of creating service instances.

Who needs to complete this task?
To create a service instance by using the cpd-cli, you must have the Create service instances (can_provision) permission in Cloud Pak for Data.
When do you need to complete this task?
Complete this task only if you want to create a service instance from the cpd-cli by using the cpd-cli service-instance create command.
Alternative methods for creating a service instance

Information you need to complete this task

Review the following information before you create a service instance for Db2 Big SQL:

Version requirements

All of the components that are associated with an instance of Cloud Pak for Data must be installed or created at the same release. For example, if Db2 Big SQL is installed at Version 5.0.3, you must create the service instance at Version 5.0.3.

Important: Db2 Big SQL uses a different version number from Cloud Pak for Data. This topic includes a table that shows the Db2 Big SQL version for each refresh of Cloud Pak for Data. Use this table to find the correct version based on the version of Cloud Pak for Data that is installed.
Environment variables

The commands in this task use environment variables so that you can run the commands exactly as written.

  • If you don't have the script that defines the environment variables, see Setting up installation environment variables.
  • To use the environment variables from the script, you must source the environment variables before you run the commands in this task. For example, run:
    source ./cpd_vars.sh
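If you are assembling your own variables script, the following is a minimal sketch of the variables this task relies on. The values are placeholders, and whether your cpd_vars.sh also defines CPD_PROFILE_NAME and INSTANCE_SECRET_KEY is an assumption based on how they are used later in this task.

```shell
# Sketch of a cpd_vars.sh fragment with the variables this task expects.
# All values below are placeholders; replace them with the values for your cluster.
export PROJECT_CPD_INST_OPERANDS=cpd-instance        # the operands project (namespace)
export STG_CLASS_BLOCK=ocs-storagecluster-ceph-rbd   # block storage class name
export CPD_PROFILE_NAME=admin-profile                # cpd-cli profile to use
export INSTANCE_SECRET_KEY=""                        # instance secret key, if used
```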

Before you begin

This task assumes that the following prerequisites are met:

  • Db2 Big SQL is installed. If this task is not complete, see Installing Db2 Big SQL.
  • The cpd-cli command-line interface is installed on the workstation from which you will create the service instance. If this task is not complete, see Setting up a client workstation.
  • You created a Cloud Pak for Data user profile on the workstation from which you will create the service instance. The profile must be associated with a user who has the Create service instances (can_provision) permission in Cloud Pak for Data. If this task is not complete, see Creating a profile to use the cpd-cli management commands.

Procedure

Complete the following tasks to create a service instance:

  1. Creating a service instance
  2. Validating that the service instance was created
  3. What to do next

Creating a service instance

To create a service instance:

  1. Change to the directory on your workstation where you want to create the JSON file that defines the service instance payload.
  2. Set the environment variables that are used to populate the JSON payload for the service instance:
    1. Set the INSTANCE_NAME environment variable to the unique name that you want to use as the display name for the service instance:
      export INSTANCE_NAME=<display-name>

      This name is displayed on the Instances page of the Cloud Pak for Data web client.

      The display name is a string and can contain only alphanumeric characters (a-z, A-Z, 0-9) and dashes (-). The name must start and end with alphanumeric characters.
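The naming rules above can be checked before you run the export command. The following function is a convenience sketch; the regular expression is derived from the stated rules (alphanumerics and dashes, starting and ending with an alphanumeric), not an official validator.

```shell
# Validate a candidate display name against the documented rules:
# only a-z, A-Z, 0-9, and dashes; must start and end with an alphanumeric character.
is_valid_instance_name() {
  [[ "$1" =~ ^[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?$ ]]
}

is_valid_instance_name "bigsql-prod-01" && echo "valid"
is_valid_instance_name "-bad-name" || echo "invalid: must start with an alphanumeric"
```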

    2. Set the INSTANCE_DESCRIPTION environment variable to the description that you want to use for the service instance:
      export INSTANCE_DESCRIPTION="<description>"

      This description is displayed on the Instances page of the Cloud Pak for Data web client.

      The description is a string and can contain alphanumeric characters, spaces, dashes, underscores, and periods. Make sure that you surround the description with quotation marks, as shown in the preceding export command.

    3. Set the INSTANCE_VERSION environment variable to the version that corresponds to the version of Cloud Pak for Data on your cluster:
      export INSTANCE_VERSION=<version>

      Use the following table to determine the appropriate value:

      Cloud Pak for Data version    Service instance version
      5.0.3                         7.7.3
      5.0.2                         7.7.0
      5.0.1                         7.7.0
      5.0.0                         7.7.0
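If you script instance creation for multiple clusters, the lookup in the preceding table can be expressed as a case statement. This is a convenience sketch that mirrors the table rows; it must be kept in sync with the table when new refreshes are released.

```shell
# Map the installed Cloud Pak for Data version to the Db2 Big SQL
# service instance version, per the preceding table.
instance_version_for_cpd() {
  case "$1" in
    5.0.3) echo "7.7.3" ;;
    5.0.0|5.0.1|5.0.2) echo "7.7.0" ;;
    *) echo "unknown Cloud Pak for Data version: $1" >&2; return 1 ;;
  esac
}

export INSTANCE_VERSION=$(instance_version_for_cpd 5.0.3)   # 7.7.3
```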
    4. Set the INSTANCE_CPU environment variable to the amount of CPU to allocate to the service instance:
      export INSTANCE_CPU=<integer>

      Specify a value between 4 and 64.

      Size the instance based on your workload. For more information about the amount of CPU to allocate to the service instance, see the component scaling guidance PDF, which you can download from the IBM Entitled Registry.

    5. Set the INSTANCE_MEMORY environment variable to the amount of memory to allocate to the service instance:
      export INSTANCE_MEMORY=<integer>

      Specify a value between 16 and 512. The value is measured in Gi. Specify the value as an integer and omit the unit of measurement.

      Size the instance based on your workload. For more information about the amount of memory to allocate to the service instance, see the component scaling guidance PDF, which you can download from the IBM Entitled Registry.

    6. Set the INSTANCE_WORKERS environment variable to the number of worker nodes to run the service instance on:
      export INSTANCE_WORKERS=<integer>
      The maximum number of workers that you can specify depends on whether Db2U is configured to run with elevated privileges:
      • If Db2U is configured to run with limited privileges, you can specify a value between 1 and the total number of worker nodes on the cluster.
      • If Db2U is configured to run with elevated privileges, you can specify a value between 1 and 999.

      Most workloads can run on 1 to 3 nodes. For more information about the number of nodes that are recommended based on your workload, see the component scaling guidance PDF, which you can download from the IBM Entitled Registry.

    7. Set the PV_SIZE environment variable to the amount of storage that you want to allocate to the service instance:
      export PV_SIZE=<integer>

      Specify a value between 200 and 10240. The value is measured in GB. The default recommendation is 200 GB. Specify the value as an integer and omit the unit of measurement.

      Size the persistent volume based on the size of your data set, the number of queries you plan to run, and the complexity of the queries that you plan to run. For more information about the amount of storage to allocate to the service instance, see the component scaling guidance PDF, which you can download from the IBM Entitled Registry.
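Before you build the payload, you can sanity-check the sizing values against the documented ranges. The following sketch simply restates the bounds from the preceding steps; the function name is a convenience, not part of the cpd-cli.

```shell
# Check each sizing value against the ranges documented above.
check_range() {  # usage: check_range NAME VALUE MIN MAX
  local name=$1 value=$2 min=$3 max=$4
  if [ "$value" -lt "$min" ] || [ "$value" -gt "$max" ]; then
    echo "ERROR: $name=$value is outside the range $min-$max" >&2
    return 1
  fi
}

check_range INSTANCE_CPU    "${INSTANCE_CPU:-4}"     4   64
check_range INSTANCE_MEMORY "${INSTANCE_MEMORY:-16}" 16  512
check_range PV_SIZE         "${PV_SIZE:-200}"        200 10240
```

The valid range for INSTANCE_WORKERS depends on your Db2U privilege configuration, so it is not checked here.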

  3. Set the environment variables for the data source that you want to connect to from the service instance:
    Hadoop cluster
    1. Set the REMOTE_CLUSTER_PROTOCOL environment variable to the protocol to use when connecting to the Hadoop cluster manager, for example HTTP or HTTPS:
      export REMOTE_CLUSTER_PROTOCOL=<protocol>
    2. Set the REMOTE_CLUSTER_HOST environment variable to the host name or IP address of the Hadoop cluster manager:
      export REMOTE_CLUSTER_HOST=<host>
    3. Set the REMOTE_CLUSTER_PORT environment variable to the port number to connect to on the Hadoop cluster manager:
      export REMOTE_CLUSTER_PORT=<port-number>
    4. Set the CM_ADMIN_USER environment variable to the username of the Hadoop cluster manager administrator:
      export CM_ADMIN_USER=<username>
    5. Set the CM_ADMIN_PASSWORD environment variable to the password of the Hadoop cluster manager administrator:
      export CM_ADMIN_PASSWORD=<password>
    6. Set the KERBEROS_TYPE environment variable based on the Kerberos configuration on the Hadoop cluster.
      • If the cluster does not use Kerberos, set the environment variable to 0.
        export KERBEROS_TYPE=0
      • If the cluster uses MIT KDC Kerberos, set the environment variable to 1.
        export KERBEROS_TYPE=1
      • If the cluster uses a custom Kerberos keytab file, set the environment variable to 2.
        export KERBEROS_TYPE=2
    7. MIT KDC Kerberos configurations only. Set the KERBEROS_PRINCIPAL environment variable to the identity of a Kerberos administrator who has permission to create Kerberos identities for Db2 Big SQL users:
      export KERBEROS_PRINCIPAL=<admin-ID>
    8. MIT KDC Kerberos configurations only. Set the KERBEROS_PASSWORD environment variable to the password for the Kerberos administrator ID:
      export KERBEROS_PASSWORD=<password>
    9. Kerberos custom keytab configurations only. Set the KERBEROS_CUSTOM_KEYTAB environment variable to the fully qualified name of your keytab file:
      export KERBEROS_CUSTOM_KEYTAB=<fully-qualified-file-name>
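For the custom keytab case, the payload that you create later base64-encodes the keytab file. You can verify that the encoding round-trips cleanly before provisioning. The file path and content below are placeholders for the sketch; point the variable at your real keytab instead.

```shell
# Round-trip check: encode a keytab file and confirm decoding reproduces it.
KEYTAB=/tmp/example.keytab
printf 'example-keytab-bytes' > "$KEYTAB"   # placeholder content for this sketch

# -w0 disables line wrapping (GNU coreutils); BSD base64 does not wrap by default.
ENCODED=$(base64 -w0 "$KEYTAB")
printf '%s' "$ENCODED" | base64 -d > /tmp/decoded.keytab
cmp -s "$KEYTAB" /tmp/decoded.keytab && echo "round-trip OK"
```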

    Object store
    1. Set the OS_ENDPOINT environment variable to the service endpoint for the object store:
      export OS_ENDPOINT=<endpoint>
    2. Set the HMAC_ACCESS_KEY environment variable to the HMAC access key that is associated with your object store service account:
      export HMAC_ACCESS_KEY=<your-HMAC-access-key>
    3. Set the HMAC_SECRET_KEY environment variable to the secret key associated with your access key:
      export HMAC_SECRET_KEY=<your-HMAC-secret-key>
    4. SSL environments only. Set the OS_SSL_CERT environment variable based on how you want to provide the SSL certificate:
      • If you want to use an SSL certificate that is stored as a secret in a vault, use the following format:
        export OS_SSL_CERT=vault:<vault-secret-name>:<vault-secret-key>
      • If you want to use a manually supplied SSL certificate:
        export OS_SSL_CERT="<my-certificate-contents>"

        Ensure that you surround the contents of the certificate with double quotation marks ("").

    5. Set the OS_BUCKET environment variable based on whether you want to limit access to a specific bucket:
      • If you want to limit access to a specific bucket, set the environment variable to the name of the bucket:
        export OS_BUCKET=<bucket-name>
      • If you don't want to limit access to a specific bucket, set the environment variable to "":
        export OS_BUCKET=""
    6. Set the OS_PATH_STYLE_ACCESS environment variable based on whether the object store service is configured for path-style access:
      • If the object store is configured for path-style access, set the environment variable to true:
        export OS_PATH_STYLE_ACCESS=true
      • If the object store is configured for virtual-hosted-style access, set the environment variable to false:
        export OS_PATH_STYLE_ACCESS=false
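The two access styles change where the bucket name appears in request URLs, which is why the setting must match your object store's configuration. As a rough illustration (the endpoint, bucket, and object names here are made up):

```shell
# Illustration of the two S3 addressing styles for the same object.
ENDPOINT_EXAMPLE="s3.example.com"
BUCKET_EXAMPLE="my-bucket"
KEY_EXAMPLE="data/file.parquet"

# Path-style: the bucket is part of the URL path.
echo "path-style:           https://${ENDPOINT_EXAMPLE}/${BUCKET_EXAMPLE}/${KEY_EXAMPLE}"
# Virtual-hosted-style: the bucket is part of the host name.
echo "virtual-hosted-style: https://${BUCKET_EXAMPLE}.${ENDPOINT_EXAMPLE}/${KEY_EXAMPLE}"
```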

  4. Create the big-sql-instance.json payload file:
    Instance points to a remote Hadoop cluster without Kerberos
    cat << EOF > ./big-sql-instance.json
    {
        "addon_type": "bigsql",
        "display_name": "${INSTANCE_NAME}",
        "addon_version": "${INSTANCE_VERSION}",
        "namespace": "${PROJECT_CPD_INST_OPERANDS}",
        "create_arguments": {
            "resources": {
                "cpu": "$(( (${INSTANCE_WORKERS} + 1) * ${INSTANCE_CPU} ))",
                "memory": "$(( (${INSTANCE_WORKERS} + 1) * ${INSTANCE_MEMORY} ))"
            },
            "parameters": {
                "global.persistence.storageClassName": "${STG_CLASS_BLOCK}",
                "resources.engine.requests.cpu": "${INSTANCE_CPU}",
                "resources.engine.requests.memory": "${INSTANCE_MEMORY}Gi",
                "workerCount": "${INSTANCE_WORKERS}",
                "persistence.headPvSize": "${PV_SIZE}Gi",
                "remoteCluster.cmAdminUserEncoded": "$([[ ! -z ${CM_ADMIN_USER} ]] && echo ${CM_ADMIN_USER} | base64)",
                "remoteCluster.cmAdminPasswordEncoded": "$([[ ! -z ${CM_ADMIN_PASSWORD} ]] && echo ${CM_ADMIN_PASSWORD} | base64)",
                "remoteCluster.cmProtocol": "${REMOTE_CLUSTER_PROTOCOL}",
                "remoteCluster.cmHost": "${REMOTE_CLUSTER_HOST}",
                "remoteCluster.cmPort": "${REMOTE_CLUSTER_PORT}",
                "remoteCluster.useCloudObjectStore": "false",
                "secretsInstanceKey": "${INSTANCE_SECRET_KEY}",
                "persistence.auditPvSize": "30Gi"
            },
            "description": "",
            "metadata": {
                "engine-service": "c-bigsql-{INSTANCEID}-db2u-0.c-bigsql-{INSTANCEID}-db2u-internal.zen",
                "database-name": "bigsql",
                "jdbc-port": 50000,
                "secure-jdbc-port": 50001,
                "sslConnection": true,
                "credentials": {
                    "user": "dmcuser",
                    "securityMechanism": "9"
                },
                "ssl": {
                    "certKey": "ca.crt",
                    "certLabel": "CN=zen-ca-cert",
                     "secretName": "bigsql-{INSTANCEID}-internal-tls"
                }
                    },
        "deployment_id": ""
      },
      "transient_fields": {}
    }
    EOF
    The following environment variables use the values that are already defined in your installation environment variables script:
    • ${PROJECT_CPD_INST_OPERANDS}
    • ${STG_CLASS_BLOCK}
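Two details of this payload are worth noting. The resources block requests totals for the head node plus the workers, which is why cpu and memory are multiplied by (INSTANCE_WORKERS + 1), and the *Encoded parameters resolve to an empty string when the corresponding variable is unset. A quick check of both behaviors:

```shell
# Total resources = (workers + 1 head node) * per-node allocation.
INSTANCE_WORKERS=2 INSTANCE_CPU=8 INSTANCE_MEMORY=32
echo "total cpu:    $(( (INSTANCE_WORKERS + 1) * INSTANCE_CPU ))"     # 24
echo "total memory: $(( (INSTANCE_WORKERS + 1) * INSTANCE_MEMORY ))"  # 96

# The encoded fields are empty when the variable is unset, base64 otherwise.
unset CM_ADMIN_USER
echo "unset: '$([[ ! -z ${CM_ADMIN_USER} ]] && echo ${CM_ADMIN_USER} | base64)'"
CM_ADMIN_USER=admin
echo "set:   '$([[ ! -z ${CM_ADMIN_USER} ]] && echo ${CM_ADMIN_USER} | base64)'"
```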

    Instance points to a remote Hadoop cluster that uses MIT KDC Kerberos
    cat << EOF > ./big-sql-instance.json
    {
        "addon_type": "bigsql",
        "display_name": "${INSTANCE_NAME}",
        "addon_version": "${INSTANCE_VERSION}",
        "namespace": "${PROJECT_CPD_INST_OPERANDS}",
        "create_arguments": {
            "resources": {
                "cpu": "$(( (${INSTANCE_WORKERS} + 1) * ${INSTANCE_CPU} ))",
                "memory": "$(( (${INSTANCE_WORKERS} + 1) * ${INSTANCE_MEMORY} ))"
            },
            "parameters": {
                "global.persistence.storageClassName": "${STG_CLASS_BLOCK}",
                "resources.engine.requests.cpu": "${INSTANCE_CPU}",
                "resources.engine.requests.memory": "${INSTANCE_MEMORY}Gi",
                "workerCount": "${INSTANCE_WORKERS}",
                "persistence.headPvSize": "${PV_SIZE}Gi",  
                "remoteCluster.cmAdminUserEncoded": "$([[ ! -z ${CM_ADMIN_USER} ]] && echo ${CM_ADMIN_USER} | base64)",
                "remoteCluster.cmAdminPasswordEncoded": "$([[ ! -z ${CM_ADMIN_PASSWORD} ]] && echo ${CM_ADMIN_PASSWORD} | base64)",
                "remoteCluster.cmProtocol": "${REMOTE_CLUSTER_PROTOCOL}",
                "remoteCluster.cmHost": "${REMOTE_CLUSTER_HOST}",
                "remoteCluster.cmPort": "${REMOTE_CLUSTER_PORT}",
                "remoteCluster.kerberosPrincipalEncoded": "$([[ ! -z ${KERBEROS_PRINCIPAL} ]] && echo ${KERBEROS_PRINCIPAL} | base64)",
                "remoteCluster.kerberosPasswordEncoded": "$([[ ! -z ${KERBEROS_PASSWORD} ]] && echo ${KERBEROS_PASSWORD} | base64)",
                "remoteCluster.useCloudObjectStore": "false",
                "secretsInstanceKey": "${INSTANCE_SECRET_KEY}",
                "persistence.auditPvSize": "30Gi"
            },
            "description": "",
            "metadata": {
                "engine-service": "c-bigsql-{INSTANCEID}-db2u-0.c-bigsql-{INSTANCEID}-db2u-internal.zen",
                "database-name": "bigsql",
                "jdbc-port": 50000,
                "secure-jdbc-port": 50001,
                "sslConnection": true,
                "credentials": {
                    "user": "dmcuser",
                    "securityMechanism": "9"
                },
                "ssl": {
                    "certKey": "ca.crt",
                    "certLabel": "CN=zen-ca-cert",
                     "secretName": "bigsql-{INSTANCEID}-internal-tls"
                }
                    },
        "deployment_id": ""
      },
      "transient_fields": {}
    }
    EOF
    The following environment variables use the values that are already defined in your installation environment variables script:
    • ${PROJECT_CPD_INST_OPERANDS}
    • ${STG_CLASS_BLOCK}

    Instance points to a remote Hadoop cluster that uses a custom Kerberos keytab file
    cat << EOF > ./big-sql-instance.json
    {
        "addon_type": "bigsql",
        "display_name": "${INSTANCE_NAME}",
        "addon_version": "${INSTANCE_VERSION}",
        "namespace": "${PROJECT_CPD_INST_OPERANDS}",
        "create_arguments": {
            "resources": {
                "cpu": "$(( (${INSTANCE_WORKERS} + 1) * ${INSTANCE_CPU} ))",
                "memory": "$(( (${INSTANCE_WORKERS} + 1) * ${INSTANCE_MEMORY} ))"
            },
            "parameters": {
                "global.persistence.storageClassName": "${STG_CLASS_BLOCK}",
                "resources.engine.requests.cpu": "${INSTANCE_CPU}",
                "resources.engine.requests.memory": "${INSTANCE_MEMORY}Gi",
                "workerCount": "${INSTANCE_WORKERS}",
                "persistence.headPvSize": "${PV_SIZE}Gi",
                "remoteCluster.cmAdminUserEncoded": "$([[ ! -z ${CM_ADMIN_USER} ]] && echo ${CM_ADMIN_USER} | base64)",
                "remoteCluster.cmAdminPasswordEncoded": "$([[ ! -z ${CM_ADMIN_PASSWORD} ]] && echo ${CM_ADMIN_PASSWORD} | base64)",
                "remoteCluster.cmProtocol": "${REMOTE_CLUSTER_PROTOCOL}",
                "remoteCluster.cmHost": "${REMOTE_CLUSTER_HOST}",
                "remoteCluster.cmPort": "${REMOTE_CLUSTER_PORT}",
                "remoteCluster.kerberosCustomKeytab": "$(base64 -w0 ${KERBEROS_CUSTOM_KEYTAB_FILENAME})",
                "remoteCluster.useCloudObjectStore": "false",
                "secretsInstanceKey": "${INSTANCE_SECRET_KEY}",
                "persistence.auditPvSize": "30Gi"
            },
            "description": "",
            "metadata": {
                "engine-service": "c-bigsql-{INSTANCEID}-db2u-0.c-bigsql-{INSTANCEID}-db2u-internal.zen",
                "database-name": "bigsql",
                "jdbc-port": 50000,
                "secure-jdbc-port": 50001,
                "sslConnection": true,
                "credentials": {
                    "user": "dmcuser",
                    "securityMechanism": "9"
                },
                "ssl": {
                    "certKey": "ca.crt",
                    "certLabel": "CN=zen-ca-cert",
                     "secretName": "bigsql-{INSTANCEID}-internal-tls"
                }
                    },
        "deployment_id": ""
      },
      "transient_fields": {}
    }
    EOF
    The following environment variables use the values that are already defined in your installation environment variables script:
    • ${PROJECT_CPD_INST_OPERANDS}
    • ${STG_CLASS_BLOCK}

    Instance points to an object store that is not configured for SSL
    cat << EOF > ./big-sql-instance.json
    {
        "addon_type": "bigsql",
        "display_name": "${INSTANCE_NAME}",
        "addon_version": "${INSTANCE_VERSION}",
        "namespace": "${PROJECT_CPD_INST_OPERANDS}",
        "create_arguments": {
            "resources": {
                "cpu": "$(( (${INSTANCE_WORKERS} + 1) * ${INSTANCE_CPU} ))",
                "memory": "$(( (${INSTANCE_WORKERS} + 1) * ${INSTANCE_MEMORY} ))"
            },
            "parameters": {
                "global.persistence.storageClassName": "${STG_CLASS_BLOCK}",
                "resources.engine.requests.cpu": "${INSTANCE_CPU}",
                "resources.engine.requests.memory": "${INSTANCE_MEMORY}Gi",
                "workerCount": "${INSTANCE_WORKERS}",
                "persistence.headPvSize": "${PV_SIZE}Gi",
                "objectStore.endpoint": "${OS_ENDPOINT}",
                "objectStore.hmacAccess": "${HMAC_ACCESS_KEY}",
                "objectStore.hmacSecret": "${HMAC_SECRET_KEY}",
                "objectStore.sslEnabled": "false",
                "objectStore.bucketName": "${OS_BUCKET}",
                "objectStore.pathStyleAccess": "${OS_PATH_STYLE_ACCESS}",
                "remoteCluster.useCloudObjectStore": "true",
                "persistence.auditPvSize": "30Gi"
            },
            "description": "",
            "metadata": {
                "engine-service": "c-bigsql-{INSTANCEID}-db2u-0.c-bigsql-{INSTANCEID}-db2u-internal.zen",
                "database-name": "bigsql",
                "jdbc-port": 50000,
                "secure-jdbc-port": 50001,
                "sslConnection": true,
                "credentials": {
                    "user": "dmcuser",
                    "securityMechanism": "9"
                },
                "ssl": {
                    "certKey": "ca.crt",
                    "certLabel": "CN=zen-ca-cert",
                     "secretName": "bigsql-{INSTANCEID}-internal-tls"
                }
                    },
        "deployment_id": ""
      },
      "transient_fields": {}
    }
    EOF
    The following environment variables use the values that are already defined in your installation environment variables script:
    • ${PROJECT_CPD_INST_OPERANDS}
    • ${STG_CLASS_BLOCK}

    Instance points to object storage with SSL
    cat << EOF > ./big-sql-instance.json
    {
        "addon_type": "bigsql",
        "display_name": "${INSTANCE_NAME}",
        "addon_version": "${INSTANCE_VERSION}",
        "namespace": "${PROJECT_CPD_INST_OPERANDS}",
        "create_arguments": {
            "resources": {
                "cpu": "$(( (${INSTANCE_WORKERS} + 1) * ${INSTANCE_CPU} ))",
                "memory": "$(( (${INSTANCE_WORKERS} + 1) * ${INSTANCE_MEMORY} ))"
            },
            "parameters": {
                "global.persistence.storageClassName": "${STG_CLASS_BLOCK}",
                "resources.engine.requests.cpu": "${INSTANCE_CPU}",
                "resources.engine.requests.memory": "${INSTANCE_MEMORY}Gi",
                "workerCount": "${INSTANCE_WORKERS}",
                "persistence.headPvSize": "${PV_SIZE}Gi",
                "objectStore.endpoint": "${OS_ENDPOINT}",
                "objectStore.hmacAccess": "${HMAC_ACCESS_KEY}",
                "objectStore.hmacSecret": "${HMAC_SECRET_KEY}",
                "objectStore.sslEnabled": "true",
                "objectStore.sslCertificate": "${OS_SSL_CERT}",
                "objectStore.bucketName": "${OS_BUCKET}",
                "objectStore.pathStyleAccess": "${OS_PATH_STYLE_ACCESS}",
                "remoteCluster.useCloudObjectStore": "true",
                "persistence.auditPvSize": "30Gi"
            },
            "description": "",
            "metadata": {
                "engine-service": "c-bigsql-{INSTANCEID}-db2u-0.c-bigsql-{INSTANCEID}-db2u-internal.zen",
                "database-name": "bigsql",
                "jdbc-port": 50000,
                "secure-jdbc-port": 50001,
                "sslConnection": true,
                "credentials": {
                    "user": "dmcuser",
                    "securityMechanism": "9"
                },
                "ssl": {
                    "certKey": "ca.crt",
                    "certLabel": "CN=zen-ca-cert",
                     "secretName": "bigsql-{INSTANCEID}-internal-tls"
                }
                    },
        "deployment_id": ""
      },
      "transient_fields": {}
    }
    EOF
    The following environment variables use the values that are already defined in your installation environment variables script:
    • ${PROJECT_CPD_INST_OPERANDS}
    • ${STG_CLASS_BLOCK}

  5. Set the PAYLOAD_FILE environment variable to the fully qualified name of the JSON payload file on your workstation:
    export PAYLOAD_FILE=<fully-qualified-JSON-file-name>
  6. Create the service instance from the payload file:
    cpd-cli service-instance create \
    --profile=${CPD_PROFILE_NAME} \
    --from-source=${PAYLOAD_FILE}
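Because the heredoc interpolates shell variables and command substitutions, a typo can produce invalid JSON. It is worth validating the payload before you submit it. This sketch uses Python's json module, assuming python3 is available on the workstation; the function name is a convenience, not part of the cpd-cli.

```shell
# Fail fast if a generated payload file is not valid JSON.
validate_payload() {
  python3 -m json.tool "$1" > /dev/null 2>&1
}

validate_payload ./big-sql-instance.json && echo "payload is valid JSON" \
  || echo "payload is NOT valid JSON; inspect ./big-sql-instance.json" >&2
```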

Validating that the service instance was created

To validate that the service instance was created, run the following command:

cpd-cli service-instance status ${INSTANCE_NAME} \
--profile=${CPD_PROFILE_NAME} \
--output=json
  • If the command returns PROVISIONED, the service instance was successfully created.
  • If the command returns PROVISION_IN_PROGRESS, wait a few minutes and run the command again.
  • If the command returns FAILED, review the logs for the zen-core-api and zen-watcher pods for possible causes.
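If you script this validation step, a simple polling loop can wait for the PROVISIONED state. This is a sketch: wait_for_instance takes any command that prints the current status (so you would wrap the cpd-cli status command above in a small helper that extracts the state field), and the retry count and interval are arbitrary choices, not documented defaults.

```shell
# Poll a status command until it reports PROVISIONED, FAILED, or we give up.
wait_for_instance() {  # usage: wait_for_instance <status-command...>
  local attempts=30 interval=60 status i
  for ((i = 1; i <= attempts; i++)); do
    status=$("$@")
    case "$status" in
      PROVISIONED) echo "instance is ready"; return 0 ;;
      FAILED)      echo "provisioning failed" >&2; return 1 ;;
      *)           sleep "$interval" ;;   # PROVISION_IN_PROGRESS or transient states
    esac
  done
  echo "timed out waiting for instance" >&2
  return 1
}
```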

What to do next

Depending on your environment, you might have to do some post-provisioning tasks. For more information, see Db2 Big SQL post-provisioning tasks.

After you provision an instance, you must add one or more users to the instance. You (the instance owner) are not automatically added as a user. For more information, see Managing access to Db2 Big SQL instances.