Configuring Hive with Kerberos for quality tasks (Watson Knowledge Catalog)

You can use Hive with Kerberos for metadata import, automated discovery, and data analysis tasks. Before you can use it, you must configure the connection.

About this task

To configure Hive with Kerberos, you must copy keytab and krb5.conf files to the conductor pod from your Hive Ambari cluster, which has Kerberos enabled.

For more information about quality-related tasks and automated discovery, see Curating data.

If you want to use Hive with Kerberos for quick scan, see Configuring quick scan for Hive with Kerberos.


  1. Connect to your load balancer by running this command:
    ssh <your load balancer node>
    Replace <your load balancer node> with the host name or IP address of your load balancer.
  2. Copy the following files from the Hive server to the load balancer:
    scp root@<hive-server-node>:<path_to_kerberosConfFile> /tmp
    scp root@<hive-server-node>:<path_to_userKeytabFile> /tmp
    Replace <hive-server-node> with the host name or IP address of your Hive server, and the path placeholders with the locations of your krb5.conf and keytab files.
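Before you continue, it can help to confirm that both files actually reached /tmp. This is an optional hedged sketch; check_copied and the CHECK_DIR override are hypothetical helpers, not part of the product:

```shell
#!/bin/sh
# Hypothetical helper: verify that each named Kerberos file exists in the
# target directory (default /tmp) and is non-empty before it is copied
# into the conductor pod.
check_copied() {
  dir="${CHECK_DIR:-/tmp}"    # override the directory for testing
  for f in "$@"; do
    if [ ! -s "${dir}/${f}" ]; then
      echo "missing or empty: ${dir}/${f}" >&2
      return 1
    fi
  done
  echo "all files present"
}
```

For example, check_copied user1.keytab krb5.conf prints "all files present" only when both files exist and are non-empty.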
  3. Find the name of the conductor pod. Run this command:
    oc get pods | grep conductor
  4. Log in to the pod by using the name from the previous step. For example:
    oc exec -it is-en-conductor-0 bash
  5. Create the directory:
    mkdir /mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles
  6. Copy these files from the load balancer tmp directory to the conductor pod:
    oc cp /tmp/<userKeytabFile> <namespace>/is-en-conductor-0:/mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles/<userKeytabFile>
    oc cp /tmp/krb5.conf <namespace>/is-en-conductor-0:/mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles/krb5.conf
    Provide the names of your keytab file and your namespace in these commands, for example:
    oc cp /tmp/user1.keytab zen/is-en-conductor-0:/mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles/user1.keytab
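The two copy commands in this step differ only in the file name, so they can be generated from one place. A minimal sketch, assuming the pod name and mount path used in this procedure; gen_cp_cmds is a hypothetical helper that prints the commands so you can review them (pipe its output to sh to run them):

```shell
#!/bin/sh
# Sketch: build the two `oc cp` commands from the namespace and keytab
# file name so they only need to be edited in one place.
POD="is-en-conductor-0"
DEST="/mnt/dedicated_vol/Engine/${POD}/EngineClients/KrbFiles"

# Print one `oc cp` command per file.
gen_cp_cmds() {
  ns="$1"; keytab="$2"
  for f in "${keytab}" krb5.conf; do
    echo "oc cp /tmp/${f} ${ns}/${POD}:${DEST}/${f}"
  done
}

# Example values from this procedure: namespace "zen", keytab "user1.keytab".
gen_cp_cmds zen user1.keytab
```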
  7. Edit the /opt/IBM/InformationServer/ASBNode/lib/java/JDBCDriverLogin.conf file in the following way:
    JDBC_DRIVER_keytab {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    keyTab="/mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles/<userKeytabFile>"
    principal="<your principal>";
    };
    Provide the name of your keytab file and your principal, for example principal="user1@IBM.COM".
  8. Edit the /opt/IBM/InformationServer/ASBNode/bin/ file and add the -Djava.security.auth.login.config and -Djava.security.krb5.conf parameters before -classpath.
    The updated command looks like this:
    eval exec '"${JAVA_HOME}/bin/java"' '$PLATFORM_OPTIONS' '-Xbootclasspath/a:conf:eclipse/plugins/' -Xss2M -Xmso2M '$LANGUAGE_OPTIONS' '-Djava.ext.dirs=$JAVA_HOME/lib/ext:lib/java:eclipse/plugins:eclipse/plugins/' '-Djava.util.logging.config.file=${NODE_DIR}/conf/' '-Djava.security.auth.login.config=/opt/IBM/InformationServer/ASBNode/lib/java/JDBCDriverLogin.conf' '-Djava.security.krb5.conf=/mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles/krb5.conf' -classpath 'conf:eclipse/plugins/' ${J2EE_OPTS}
  9. Edit the /opt/IBM/InformationServer/Server/DSEngine/dsenv file. Add the following line:
  10. Run these commands to load the updated engine environment:
    cd /opt/IBM/InformationServer/Server/DSEngine
    . ./dsenv
  11. Go to /opt/IBM/InformationServer/Server/DSEngine/bin and restart the DataStage engine. Use these commands:
    cd /opt/IBM/InformationServer/Server/DSEngine/bin
    ./uv -admin -stop
    ./uv -admin -start
  12. Restart the agents by running these commands:
    service ISFAgents stop
    service ISFAgents start
    After the agents are started, verify that the -Djava.security.auth.login.config and -Djava.security.krb5.conf parameters were properly added. Run the following command:
    ps -aef | grep Agent

    You should see the newly added parameters.


The configured Hive connection URL is:
jdbc:ibm:hive://<your load balancer node>:10000;MaxStringSize=256;AuthenticationMethod=kerberos;ServicePrincipalName=hive/<host>@<EXAMPLE.COM>;loginConfigName=JDBC_DRIVER_keytab
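Because this URL concatenates values from several earlier steps, assembling it from variables makes a typo in one segment easier to spot. A hedged sketch; build_hive_url and the example host and realm values are assumptions, and JDBC_DRIVER_keytab must match the entry name in JDBCDriverLogin.conf:

```shell
#!/bin/sh
# Sketch: assemble the Hive JDBC URL from its parts. The arguments are the
# load balancer host, the Hive service host, and the Kerberos realm.
build_hive_url() {
  host="$1"; hive_host="$2"; realm="$3"
  echo "jdbc:ibm:hive://${host}:10000;MaxStringSize=256;AuthenticationMethod=kerberos;ServicePrincipalName=hive/${hive_host}@${realm};loginConfigName=JDBC_DRIVER_keytab"
}

# Placeholder values for illustration only.
build_hive_url lb.example.com hive-master.example.com EXAMPLE.COM
```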