Setting up a connection from Db2 Big SQL to a remote data source

After you specify the resources that you want to use for persistent storage, set up a connection to one or more remote data sources.

Before you begin

To connect Db2® Big SQL to a remote Hadoop cluster or object store, you must meet some requirements. For more information, see Remote Hadoop cluster or public or private object store.

About this task

You can connect to a Hadoop cluster on Cloudera Data Platform (CDP) Private Cloud Base 7.1.7, to an object store, or to both a Hadoop cluster and an object store. In the third scenario, you must configure the remote cluster to connect to the object store because Db2 Big SQL uses the Hive metastore as its catalog. For more information about each scenario, see Db2 Big SQL architecture.

Note: If you update the remote Hadoop cluster's core-site.xml configuration file to set up cluster access to an object store service, you must restart the cluster's Hive service, even if the cluster manager doesn't indicate that a restart is needed.

Using vault-stored credentials and certificates

To connect to object store services, you can use credentials and certificates that are stored as secrets in a vault. To connect to an object store service by using secrets from a vault, obtain the following information:

  • The name of the vault secret that contains the credentials or certificate that you want to use.
  • The name of the key under which the credential or certificate value is stored in the secret.

    For example, secrets that contain credentials have two keys, username and password, that store a user name and a password.

To obtain this information, in Cloud Pak for Data, go to Administration > Configurations > Vaults and secrets > Secrets. If you cannot access this information, ask your Cloud Pak for Data administrator to provide it.
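The procedure in this topic accepts vault references in the form vault:<vault-secret-name>:<vault-secret-key>. As a minimal sketch, the following shell function assembles such a reference from the two values described above. The function name vault_ref and the secret name my-objstore-creds are hypothetical and are used for illustration only; they are not part of Db2 Big SQL or Cloud Pak for Data.

```shell
# Hypothetical helper (not a product command): build a vault reference of the
# form vault:<vault-secret-name>:<vault-secret-key> from the two values.
vault_ref() {
  secret_name=$1
  secret_key=$2
  # Both the secret name and the key are needed to form a reference.
  if [ -z "$secret_name" ] || [ -z "$secret_key" ]; then
    echo "usage: vault_ref <vault-secret-name> <vault-secret-key>" >&2
    return 1
  fi
  printf 'vault:%s:%s\n' "$secret_name" "$secret_key"
}

# Example with a placeholder secret name (an assumption, not a real secret):
vault_ref my-objstore-creds username   # prints vault:my-objstore-creds:username
vault_ref my-objstore-creds password   # prints vault:my-objstore-creds:password
```

The printed strings are what you would type into the corresponding fields, assuming that a secret with that name and those keys exists in your vault.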

Procedure

  1. To connect to a Hadoop cluster, select the Configure Hadoop cluster checkbox, and specify the following information:
    • The Hadoop cluster manager (Cloudera cluster manager server) host URL
      Note: If you are connecting to a Cloudera cluster manager server over SSL, you must configure SSL. For more information, see Connecting to an TLS (SSL) enabled Hadoop cluster.
    • The Hadoop cluster manager username
    • The Hadoop cluster manager user password
  2. Under Configure Kerberos, select one of the following options.
    • If Kerberos is not enabled on the cluster, select No Kerberos.
    • To allow Db2 Big SQL to automate the creation of principals and keytabs when MIT Kerberos security is enabled on the Hadoop cluster, select the Using MIT KDC Kerberos checkbox, and specify the following information for the user that can create principals for Db2 Big SQL in the Kerberos Key Distribution Center (KDC):
      • The Kerberos admin principal
      • The Kerberos admin password
    • If Active Directory is used for Kerberos, select Using a custom keytab, and then click Upload to upload a keytab file.

      The contents of the keytab file must be base64 encoded. For more information, see Enabling Active Directory as the Kerberos server.

  3. To connect to an object store, select the Configure object store checkbox, and specify the following information:
    • The object store service endpoint
    • The object store service access ID key

      To use a secret from a vault, type vault:<vault-secret-name>:<vault-secret-key>.

    • The object store service secret key

      To use a secret from a vault, type vault:<vault-secret-name>:<vault-secret-key>.

    Remember: If the Db2 Big SQL instance is also set up to connect to a Hadoop cluster, the cluster must be configured to access the same object store service.
  4. If you are accessing a public object store, select the Setup SSL for object store configuration checkbox.
    Note: Most public cloud object stores use a well-known certificate authority, so it is generally not necessary to provide a Secure Sockets Layer (SSL) certificate in the SSL Certificate box. However, you must still select the Setup SSL for object store configuration checkbox so that the connection to the object store is secure.
  5. If you are accessing an on-premises, self-hosted object store that is accessed by using SSL, provide the SSL certificate.
    Note: Skip this step if you are using a CA certificate to connect to internal servers from the platform, and the object store service that you are connecting the Db2 Big SQL instance to is secured by a certificate that is generated by using that CA. In this case, you do not need to provide the certificate again.
    1. Select the Setup SSL for object store configuration checkbox.
    2. If you are using a secret from a vault for the certificate, in the SSL Certificate box, type vault:<vault-secret-name>:<vault-secret-key>.
    3. If you are not using a secret from a vault for the certificate, in the SSL Certificate box, paste the SSL certificate in base64 encoded Privacy Enhanced Mail (PEM) format, including the PEM header and footer.
      For example:
      -----BEGIN CERTIFICATE-----
      MIIFVzCCAz+gAwIBAgIJAM+JlcdkA2RaMA0GCSqGSIb3DQEBCwUAMEIxCzAJBgNV
      BAYTAlhYMRUwEwYDVQQHDAxEZWZhdWx0IENpdHkxHDAaBgNVBAoME0RlZmF1bHQg
      ...
      ZzV0pevZpJWFCt2QYEprZppj0KyiGHKQcEXAn/953YPTOmmdzGVOu5eLoTncICte
      oBzKDOdxT6CTenizfaiP5LlWH1LfPvw/+0Nz
      -----END CERTIFICATE-----
  6. If you are connecting to an object store or to both a Hadoop cluster and an object store, and you want to limit access to a single object store bucket, select the Specify the object store bucket name checkbox, and enter the bucket name.
  7. If the object store service is configured for path style access, select the Use path style access checkbox.

    If the object store service uses virtual hosted style, do not select the checkbox.

  8. Click Next.
  9. On the Summary page, review the information that you specified.
    1. If you want to make changes, click Previous and go to the page where you want to edit information.
    2. When all configuration information is correct, click Create.
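Two of the values in this procedure have format requirements: the custom keytab must be base64 encoded, and a pasted SSL certificate must include the PEM header and footer. The following sketch shows one way to prepare and check those values on a workstation before you enter them. The function names encode_keytab and is_pem_certificate and the file names in the comments are hypothetical, and the sketch assumes that a plain base64 encoding of the keytab file is what is required; see Enabling Active Directory as the Kerberos server for the authoritative steps.

```shell
# Hypothetical helper: print the base64 encoding of a keytab file on one line.
# tr removes line wraps so that GNU and BSD base64 produce the same output.
encode_keytab() {
  base64 < "$1" | tr -d '\n'
  echo
}

# Hypothetical helper: return 0 if the first non-blank line of the file is the
# PEM certificate header and the last non-blank line is the PEM footer.
is_pem_certificate() {
  first=$(grep -v '^[[:space:]]*$' "$1" | head -n 1)
  last=$(grep -v '^[[:space:]]*$' "$1" | tail -n 1)
  [ "$first" = "-----BEGIN CERTIFICATE-----" ] &&
    [ "$last" = "-----END CERTIFICATE-----" ]
}

# Example usage (file names are placeholders):
#   encode_keytab bigsql.keytab > bigsql.keytab.b64
#   is_pem_certificate objstore-ca.pem && echo "header and footer present"
```

The certificate check looks only at the header and footer lines. It does not verify that the certificate is valid or that it matches the object store endpoint.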

Results

A Db2 Big SQL instance is created. To check that the instance is ready to use, run the following command:

oc get bigsql -l app.kubernetes.io/name=db2-bigsql

When the instance is ready, the command returns the following output:

NAME                   DB2UCLUSTER            STATE   AGE
bigsql-<instance_id>   bigsql-<instance_id>   Ready   19m
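If you want to script this check, the following sketch reads the command output and succeeds only when every listed instance reports Ready in the STATE column. The function name all_instances_ready and the 30-second polling interval in the comment are assumptions for illustration; the sketch also assumes that STATE is the third column, as in the output shown above.

```shell
# Hypothetical helper: read `oc get bigsql` output on stdin. Succeed only if
# there is at least one instance row and every row has STATE equal to Ready.
all_instances_ready() {
  awk 'NR > 1 && NF { rows++; if ($3 != "Ready") bad++ }
       END { exit !(rows > 0 && bad == 0) }'
}

# Example usage (requires a logged-in oc session; the interval is arbitrary):
#   until oc get bigsql -l app.kubernetes.io/name=db2-bigsql | all_instances_ready; do
#     sleep 30
#   done
```

Because the function only parses text, you can try it against saved command output before you use it in a polling loop.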

What to do next

Depending on your environment, you might have to do some post-provisioning tasks. For more information, see Db2 Big SQL post-provisioning tasks.

After you provision an instance, you must add one or more users to the instance. As the instance owner, you are not automatically added as a user. For more information, see the section on specifying which users can access Db2 Big SQL instances in Configuring, monitoring, and managing access to Db2 Big SQL instances.