Configuring Db2 Big SQL to use DataNode hostnames when connecting to the DataNodes on the Hadoop cluster

By default, Db2® Big SQL connects to DataNodes by using the IP address provided by the NameNode. Depending on the network configuration, this IP address might be unreachable from Db2 Big SQL.
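One way to confirm the symptom is to test whether a DataNode address reported by the NameNode accepts TCP connections from the Db2 Big SQL side. The following is a minimal sketch; `tcp_reachable` is a hypothetical helper, and port 9866 (the default DataNode data-transfer port in Hadoop 3; 50010 in Hadoop 2) is an assumption about your cluster's configuration.

```python
import socket

def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False

# Example (hypothetical DataNode IP, default Hadoop 3 data-transfer port):
# tcp_reachable("10.0.0.12", 9866)
```

If the probe fails for the IP address but succeeds for the DataNode hostname, the configuration change in this task applies.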

About this task

To resolve this problem, configure Db2 Big SQL to connect to DataNodes by their hostnames, which are resolved through DNS, instead of by the IP addresses that the NameNode provides.

Note: If the Hadoop configuration files are resynchronized from the Hadoop cluster as described in Updating the Db2 Big SQL and Hadoop configuration, then the following steps must be reapplied.
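For reference, the procedure below sets the standard HDFS client property dfs.client.use.datanode.hostname. After step 5 runs, /etc/hadoop/conf/hdfs-site.xml contains a stanza like the following:

```xml
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
```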

Procedure

  1. Log in to your OpenShift® cluster as a project administrator:
    oc login <OpenShift_URL>:<port>
  2. Change to the project where the Cloud Pak for Data control plane is installed:
    oc project ${PROJECT_CPD_INSTANCE}
    Note: This command uses an environment variable so that you can run the command exactly as written. For information about sourcing environment variables, see Setting up installation environment variables.
  3. Identify the Db2 Big SQL instance ID:
    oc get cm -l component=db2bigsql -o custom-columns="Instance Id:{.data.instance_id},Instance Name:{.data.instance_name},Created:{.metadata.creationTimestamp}"
  4. Get the name of the Db2 Big SQL head pod. In the following command, replace <instance_id> with the instance ID that you identified in the previous step:
    head_pod=$(oc get pod -l app=bigsql-<instance_id>,name=dashmpp-head-0 --no-headers=true -o=custom-columns=NAME:.metadata.name)
  5. Set the client configuration:
    oc exec -i $head_pod -- sudo su - db2inst1 -c "/usr/ibmpacks/current/bigsql/bigsql/bigsql-cli/python/bigsql_config.py set-property --xml-path /etc/hadoop/conf/hdfs-site.xml --name dfs.client.use.datanode.hostname --value true"
  6. Restart Db2 Big SQL:
    oc exec -i $head_pod -- sudo su - db2inst1 -c "bigsql stop; bigsql start"
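In effect, the bigsql_config.py command in step 5 sets a single property in a Hadoop-style configuration XML file. The sketch below illustrates that effect on a sample document using only the Python standard library; it is an illustration of the Hadoop configuration file format, not the actual implementation of bigsql_config.py.

```python
import xml.etree.ElementTree as ET

def set_hadoop_property(xml_text: str, name: str, value: str) -> str:
    """Set (or add) a <property> in a Hadoop-style configuration XML document."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            # Property already present: overwrite its value.
            prop.find("value").text = value
            break
    else:
        # Property absent: append a new <property> stanza.
        new = ET.SubElement(root, "property")
        ET.SubElement(new, "name").text = name
        ET.SubElement(new, "value").text = value
    return ET.tostring(root, encoding="unicode")

sample = ("<configuration><property><name>dfs.replication</name>"
          "<value>3</value></property></configuration>")
updated = set_hadoop_property(sample, "dfs.client.use.datanode.hostname", "true")
```

Running the function on a configuration that already contains the property simply updates the existing value, which is why resynchronizing the Hadoop configuration files (see the note above) requires reapplying the steps: resynchronization replaces the file, discarding the property.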