Configuring Hive with Kerberos for quality tasks (Watson Knowledge Catalog)

You can use Hive with Kerberos for legacy metadata import, automated discovery, and data analysis tasks.

About this task

Note: This task is only for users who have the base configuration installed, which means that legacy features are enabled.
Hive connections that use Kerberos must be created through legacy metadata import. When you create the new import area, select IBM > JDBC connector as the connector. For more information, see Creating connections for use in automated discovery jobs.

To configure Hive with Kerberos, you must copy the keytab and krb5.conf files from your Kerberos-enabled Hive Ambari cluster to the conductor pod.

For more information about quality-related tasks and automated discovery, see Curating data.

Procedure

  1. Connect to your load balancer by running this command:
    ssh <your load balancer node>
    Replace <your load balancer node> with the host name or IP address of your load balancer.
  2. Copy the following files from the Hive server to the load balancer:
    scp root@<hive-server-node>:<path_to_kerberosConfFile> /tmp
    scp root@<hive-server-node>:<path_to_userKeytabFile> /tmp
    
    Example:
    scp root@192.0.2.24:/etc/krb5.conf /tmp
    scp root@192.0.2.24:/etc/security/keytabs/user1.keytab /tmp
  3. Find the name of the conductor pod. Run this command:
    oc get pods | grep conductor
  4. Log in to the pod by using the name from the previous step. For example:
    oc exec -it is-en-conductor-0 -- bash
  5. Create the directory:
    mkdir -p /mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles
  6. Copy these files from the /tmp directory on the load balancer to the conductor pod. Run the oc cp commands on the load balancer, not inside the pod:
    oc cp /tmp/<userKeytabFile> <namespace>/is-en-conductor-0:/mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles/<userKeytabFile>
    oc cp /tmp/krb5.conf <namespace>/is-en-conductor-0:/mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles/krb5.conf
    Replace <userKeytabFile> and <namespace> with the name of your keytab file and your namespace, for example:
    oc cp /tmp/user1.keytab zen/is-en-conductor-0:/mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles/user1.keytab
  7. Edit the /mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles/JDBCDriverLogin.conf file in the following way:
    JDBC_DRIVER_keytab{
    com.ibm.security.auth.module.Krb5LoginModule required
    credsType=both
    principal="<your principal name>"
    useKeytab="FILE:/mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles/user1.keytab";
    };
    Provide the name of your principal, for example principal="user1@IBM.COM", and make sure that useKeytab points to the keytab file that you copied in step 6.
  8. Edit the /opt/IBM/InformationServer/ASBNode/bin/Agent.sh file and add the java.security.auth.login.config and java.security.krb5.conf parameters before -classpath:
    '-Djava.security.auth.login.config=/mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles/JDBCDriverLogin.conf'
    '-Djava.security.krb5.conf=/mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles/krb5.conf'
    The updated command looks like this:
    eval exec '"${JAVA_HOME}/bin/java"' '$PLATFORM_OPTIONS' '-Xbootclasspath/a:conf:eclipse/plugins/com.ibm.iis.client' -Xss2M -Xmso2M '$LANGUAGE_OPTIONS' '-Djava.ext.dirs=$JAVA_HOME/lib/ext:lib/java:eclipse/plugins:eclipse/plugins/com.ibm.iis.client' '-Djava.util.logging.config.file=${NODE_DIR}/conf/asbagent-logging.properties' '-Djava.security.auth.login.config=/mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles/JDBCDriverLogin.conf' '-Djava.security.krb5.conf=/mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles/krb5.conf' -classpath 'conf:eclipse/plugins/com.ibm.iis.client' ${J2EE_OPTS} com.ibm.iis.isf.agent.impl.AgentImpl
  9. Edit the /opt/IBM/InformationServer/Server/DSEngine/dsenv file. Add the following line:
    CC_JVM_OPTIONS="-Djava.security.auth.login.config=/mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles/JDBCDriverLogin.conf -Djava.security.krb5.conf=/mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles/krb5.conf"; export CC_JVM_OPTIONS
  10. Source the dsenv file to load the updated environment settings:
    cd /opt/IBM/InformationServer/Server/DSEngine
    . ./dsenv
  11. Go to /opt/IBM/InformationServer/Server/DSEngine/bin and restart the DataStage engine. Use these commands:
    cd /opt/IBM/InformationServer/Server/DSEngine/bin
    ./uv -admin -stop
    ./uv -admin -start
  12. Go to /opt/IBM/InformationServer/ASBNode/bin and restart the agents by running these commands:
    cd /opt/IBM/InformationServer/ASBNode/bin
    ./NodeAgents.sh stop
    ./NodeAgents.sh start
    After the agents are started, verify that the java.security.auth.login.config and java.security.krb5.conf parameters were added correctly. Run the following command:
    ps -aef | grep Agent

    You should see the newly added parameters.
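
The login configuration from step 7 and the check from step 12 can also be scripted. The following is a minimal sketch, not part of the product: the function names write_jaas_conf and has_krb_options are hypothetical, and the sketch assumes a POSIX shell inside the conductor pod. The first function writes a JDBCDriverLogin.conf file with the same entry that step 7 shows. The second function reads a process listing on standard input and succeeds only if both JVM parameters appear in it.

```shell
#!/bin/sh
# Hypothetical helper (not an IBM tool): writes a login configuration
# like the one in step 7.
# Arguments: 1 = target directory, 2 = Kerberos principal, 3 = keytab file name
write_jaas_conf() {
    krb_dir=$1
    krb_principal=$2
    krb_keytab=$3
    mkdir -p "$krb_dir" || return 1
    # Unquoted EOF so that the shell variables are expanded in the file body.
    cat > "$krb_dir/JDBCDriverLogin.conf" <<EOF
JDBC_DRIVER_keytab{
com.ibm.security.auth.module.Krb5LoginModule required
credsType=both
principal="$krb_principal"
useKeytab="FILE:$krb_dir/$krb_keytab";
};
EOF
}

# Hypothetical check for step 12: reads process listing text on stdin and
# succeeds only if both Kerberos-related JVM parameters appear in it.
has_krb_options() {
    listing=$(cat)
    printf '%s\n' "$listing" | grep -q -e '-Djava.security.auth.login.config=' &&
    printf '%s\n' "$listing" | grep -q -e '-Djava.security.krb5.conf='
}
```

For example, with the values used in this procedure:

    write_jaas_conf /mnt/dedicated_vol/Engine/is-en-conductor-0/EngineClients/KrbFiles user1@IBM.COM user1.keytab
    ps -aef | grep Agent | has_krb_options && echo "Kerberos parameters found"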

Results

The configured Hive connection uses a JDBC URL in the following format:
jdbc:ibm:hive://<your load balancer node>:10000;MaxStringSize=256;AuthenticationMethod=kerberos;ServicePrincipalName=hive/<host>@<EXAMPLE.COM>;loginConfigName=JDBC_DRIVER_keytab
Example:
jdbc:ibm:hive://load.balancer.node:10000;MaxStringSize=256;AuthenticationMethod=kerberos;ServicePrincipalName=hive/_HOST@EXAMPLE.COM;loginConfigName=JDBC_DRIVER_keytab
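
To avoid typing errors in the long connection string, the URL can be assembled from its variable parts. This is a minimal sketch under stated assumptions: hive_jdbc_url is a hypothetical function name, the fixed properties are copied from the format shown above, and the port defaults to 10000 as in the example.

```shell
#!/bin/sh
# Hypothetical helper: assembles the Hive JDBC URL in the format shown above.
# Arguments: 1 = host, 2 = service principal name, 3 = port (optional, default 10000)
hive_jdbc_url() {
    hive_host=$1
    hive_spn=$2
    hive_port=${3:-10000}
    printf 'jdbc:ibm:hive://%s:%s;MaxStringSize=256;AuthenticationMethod=kerberos;ServicePrincipalName=%s;loginConfigName=JDBC_DRIVER_keytab\n' \
        "$hive_host" "$hive_port" "$hive_spn"
}
```

For example, the following call prints the example URL shown above:

    hive_jdbc_url load.balancer.node hive/_HOST@EXAMPLE.COM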