Configuring Db2 Big SQL to use DataNode hostnames when connecting to the DataNodes on the Hadoop cluster

By default, Db2 Big SQL connects to DataNodes by using the IP address provided by the NameNode. Depending on the network configuration, this IP address might be unreachable from Db2 Big SQL.

About this task

To resolve this problem, configure Db2 Big SQL to connect to DataNodes by their hostnames, which are resolved through DNS, rather than by the IP addresses that the NameNode returns.
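
This task sets the dfs.client.use.datanode.hostname client property in the HDFS configuration that Db2 Big SQL uses. For reference, after step 5 completes, the /etc/hadoop/conf/hdfs-site.xml file on the head pod contains an entry similar to the following one; the surrounding properties vary by cluster:

    <property>
      <name>dfs.client.use.datanode.hostname</name>
      <value>true</value>
    </property>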

Note: If the Hadoop configuration files are resynchronized from the Hadoop cluster, as described in Updating and refreshing the Hadoop configuration files, you must reapply the steps in this task.
Best practice: You can run the commands in this task exactly as written if you set up environment variables. For instructions, see Setting up installation environment variables.

Ensure that you source the environment variables before you run the commands in this task.
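
For example, if you saved the variables in a script file (the file name here is only an example), source it in your shell session before you begin:

    source ./cpd_vars.sh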

Procedure

  1. Log in to Red Hat® OpenShift® Container Platform as an instance administrator.
    ${OC_LOGIN}
    Remember: OC_LOGIN is an alias for the oc login command.
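    If you have not defined the OC_LOGIN alias, you can log in directly with the oc CLI. The server URL and credentials in the following command are placeholders:
    oc login https://<openshift-api-server>:6443 -u <username> -p <password>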
  2. Change to the project where the IBM® Software Hub control plane is installed:
    oc project ${PROJECT_CPD_INST_OPERANDS}
  3. Identify the Db2 Big SQL instance ID:
    oc get cm -l component=db2bigsql -o custom-columns="Instance Id:{.data.instance_id},Instance Name:{.data.instance_name},Created:{.metadata.creationTimestamp}"
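    Tip: If the output lists exactly one instance, you can capture its ID in a shell variable for use in the next step. This convenience step is not part of the documented procedure:
    instance_id=$(oc get cm -l component=db2bigsql -o jsonpath='{.items[0].data.instance_id}')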
  4. Get the name of the Db2 Big SQL head pod. Replace <instance_id> with the instance ID that you identified in step 3:
    head_pod=$(oc get pod -l app=bigsql-<instance_id>,name=dashmpp-head-0 --no-headers=true -o=custom-columns=NAME:.metadata.name)
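    Optionally, confirm that the variable contains a single pod name before you continue:
    echo "$head_pod"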
  5. Set the client configuration:
    oc exec -i $head_pod -- sudo su - db2inst1 -c "/usr/ibmpacks/current/bigsql/bigsql/bigsql-cli/python/bigsql_config.py set-property --xml-path /etc/hadoop/conf/hdfs-site.xml --name dfs.client.use.datanode.hostname --value true"
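    Optionally, verify that the property was written. This check is a simple grep and assumes that the property name and value appear on consecutive lines in the generated XML:
    oc exec -i $head_pod -- sudo su - db2inst1 -c "grep -A1 dfs.client.use.datanode.hostname /etc/hadoop/conf/hdfs-site.xml"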
  6. Restart Db2 Big SQL:
    oc exec -i $head_pod -- sudo su - db2inst1 -c "bigsql stop; bigsql start"
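    To confirm that Db2 Big SQL restarted cleanly, you can check the service status:
    oc exec -i $head_pod -- sudo su - db2inst1 -c "bigsql status"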